GB2306868A - Apparatus for decoding coded data - Google Patents

Info

Publication number
GB2306868A
Authority
GB
United Kingdom
Prior art keywords
memory
codeword
data
bit
context
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
GB9624754A
Other versions
GB9624754D0 (en
GB2306868B (en
Inventor
Edward L Schwartz
Michael Gormish
James D Allen
Martin Boliek
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ricoh Co Ltd
Original Assignee
Ricoh Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority to US31611694A priority Critical
Application filed by Ricoh Co Ltd filed Critical Ricoh Co Ltd
Priority to GB9518375A priority patent/GB2293735B/en
Publication of GB9624754D0 publication Critical patent/GB9624754D0/en
Publication of GB2306868A publication Critical patent/GB2306868A/en
Application granted granted Critical
Publication of GB2306868B publication Critical patent/GB2306868B/en
Anticipated expiration legal-status Critical
Application status is Expired - Fee Related legal-status Critical

Classifications

    • HELECTRICITY
    • H03BASIC ELECTRONIC CIRCUITRY
    • H03MCODING; DECODING; CODE CONVERSION IN GENERAL
    • H03M7/00Conversion of a code where information is represented by a given sequence or number of digits to a code where the same, similar or subset of information is represented by a different sequence or number of digits
    • H03M7/30Compression; Expansion; Suppression of unnecessary data, e.g. redundancy reduction
    • H03M7/40Conversion to or from variable length codes, e.g. Shannon-Fano code, Huffman code, Morse code
    • H03M7/4006Conversion to or from arithmetic code
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/42Methods or arrangements for coding, decoding, compressing or decompressing digital video signals characterised by implementation details or hardware specially adapted for video compression or decompression, e.g. dedicated software implementation
    • H04N19/436Methods or arrangements for coding, decoding, compressing or decompressing digital video signals characterised by implementation details or hardware specially adapted for video compression or decompression, e.g. dedicated software implementation using parallelised computational arrangements
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/102Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
    • H04N19/13Adaptive entropy coding, e.g. adaptive variable length coding [AVLC] or context adaptive binary arithmetic coding [CABAC]
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/60Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using transform coding
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/90Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using coding techniques not provided for in groups H04N19/10-H04N19/85, e.g. fractals
    • H04N19/91Entropy coding, e.g. variable length coding [VLC] or arithmetic coding

Description

SPECIFICATION

APPARATUS FOR DECODING DATA

FIELD OF THE INVENTION

The present invention relates to the field of data compression and decompression systems; particularly, the present invention relates to an apparatus for parallel decoding of data in compression/decompression systems.

BACKGROUND OF THE INVENTION

Today, data compression is widely used, particularly for storing and transmitting large amounts of data. Many different data compression techniques exist in the prior art. Compression techniques can be divided into two broad categories, lossy coding and lossless coding. Lossy coding involves coding that results in the loss of information, such that there is no guarantee of perfect reconstruction of the original data. In lossless compression, all the information is retained and the data is compressed in a manner which allows for perfect reconstruction.

In lossless compression, input symbols are converted to output codewords. If the compression is successful, the codewords are represented in fewer bits than the number of input symbols. Lossless coding methods include dictionary methods of coding (e.g., Lempel-Ziv), run length encoding, enumerative coding and entropy coding.

Entropy coding consists of any method of lossless coding which attempts to compress data close to the entropy limit using known or estimated symbol probabilities. Entropy codes include Huffman codes, arithmetic codes and binary entropy codes. Binary entropy coders are lossless coders which act only on binary (yes/no) decisions, often expressed as the most probable symbol (MPS) and the least probable symbol (LPS). Examples of binary entropy coders include IBM's Q-coder and a coder referred to as the B-coder.
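As a hedged illustration of the entropy limit mentioned above, the following Python sketch computes the entropy of a binary decision from an assumed MPS probability. The function name binary_entropy is hypothetical and does not come from the specification.

```python
import math


def binary_entropy(p_mps: float) -> float:
    """Entropy, in bits per binary decision, for an MPS probability p_mps."""
    if not 0.0 < p_mps < 1.0:
        return 0.0  # a certain outcome carries no information
    p_lps = 1.0 - p_mps
    return -(p_mps * math.log2(p_mps) + p_lps * math.log2(p_lps))
```

An ideal binary entropy coder would spend, on average, about binary_entropy(p) bits per decision, so a highly skewed decision costs well under one bit.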

For more information on the B-coder, see U.S. Patent No. 5,272,478, entitled "Method and Apparatus for Entropy Coding" (J.D. Allen), issued December 21, 1993, and assigned to the corporate assignee of the present invention.

See also M.J. Gormish and J.D. Allen, "Finite State Machine Binary Entropy Coding," abstract in Proc. Data Compression Conference, 30 March 1993, Snowbird, UT, pg. 449. The B-coder is a binary entropy coder which uses a finite state machine for compression.

Figure 1 shows a block diagram of a prior art compression and decompression system using a binary entropy coder. For coding, data is input into context model (CM) 101. CM 101 translates the input data into a set or sequence of binary decisions and provides the context bin for each decision. Both the sequence of binary decisions and their associated context bins are output from CM 101 to the probability estimation module (PEM) 102.

PEM 102 receives each context bin and generates a probability estimate for each binary decision. The actual probability estimate is typically represented by a class, referred to as PClass. Each PClass is used for a range of probabilities. PEM 102 also determines whether the binary decision (result) is or is not in its more probable state (i.e., whether the decision corresponds to the MPS). The bit-stream generator (BG) module 103 receives the probability estimate (i.e., the PClass) and the determination of whether or not the binary decision was likely as inputs. In response, BG module 103 produces a compressed data stream, outputting zero or more bits, to represent the original input data.

For decoding, CM 104 provides a context bin to PEM 105, and PEM 105 provides the probability class (PClass) to BG module 106 based on the context bin. BG module 106 is coupled to receive the probability class. In response to the probability class and the compressed data, BG module 106 returns a bit representing whether the binary decision (i.e., the event) is in its most probable state. PEM 105 receives the bit, updates the probability estimate based on the received bit, and returns the result to CM 104. CM 104 receives the returned bit and uses the returned bit to generate the original data and update the context bin for the next binary decision.

One problem with decoders using binary entropy codes, such as IBM's Q-coder and the B-coder, is that they are slow, even when implemented in hardware. Their operation requires a single large, slow feedback loop. To restate the decoding process, the context model uses past decoded data to produce a context. The probability estimation module uses the context to produce a probability class. The bit-stream generator uses the probability class and the compressed data to determine if the next bit is the likely or unlikely result. The probability estimation module uses the likely/unlikely result to produce a result bit (and to update the probability estimate for the context). The result bit is used by the context model to update its history of past data. All of these steps are required for decoding a single bit. Because the context model must wait for the result bit to update its history before it can provide the next context, the decoding of the next bit must wait. It is desirable to avoid having to wait for the feedback loop to be completed before decoding the next bit. In other words, it is desirable to decode more than one bit or codeword at a time in order to increase the speed at which compressed data is decoded.
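The serial dependency in this feedback loop can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the prior art decoder itself: the context is assumed to be just the previously decoded bit, the bit-stream generator is replaced by a list of likely/unlikely flags, and the probability estimate is never updated. The names decode_serial, likely_flags and mps_by_context are hypothetical.

```python
def decode_serial(likely_flags, mps_by_context):
    """Decode one bit per loop iteration, mimicking the single feedback loop.

    likely_flags: booleans standing in for the bit-stream generator output
        (True means the decision was the most probable symbol).
    mps_by_context: dict mapping a context to its MPS value (0 or 1).
    """
    decoded = []
    prev_bit = 0  # assumed initial history
    for likely in likely_flags:
        context = prev_bit                 # context model needs past data
        mps = mps_by_context[context]      # probability estimation lookup
        bit = mps if likely else 1 - mps   # flag converted to a result bit
        decoded.append(bit)
        prev_bit = bit                     # feedback: history must be updated
    return decoded
```

Each iteration reads prev_bit, which is only written at the end of the previous iteration, so no two iterations can overlap. That is the bottleneck the paragraph above describes.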

Another problem with decoders using binary entropy codes is that variable length data must be processed. In most systems, the codewords to be decoded have variable lengths. Alternatively, other systems encode variable length symbols (uncoded data). When processing the variable length data, it is necessary to shift the data at the bit level in order to provide the correct next data for the decoding or encoding operation. These bit level manipulations on the data stream can require costly and/or slow hardware and/or software. Furthermore, prior art systems require this shifting to be done in time critical feedback loops that limit the performance of the decoder. It would also be advantageous to remove the bit level manipulation of the data stream from time critical feedback loops, so that parallelization could be used to increase speed.

The present invention provides a decoder for decoding coded data, said decoder comprising:

a context modelling mechanism for providing contexts, wherein the context modelling mechanism comprises a plurality of integrated circuits; a memory coupled to the context model for storing state information, wherein the memory provides state information in response to each context provided by the context model; and a plurality of decoders coupled to the memory for decoding codewords using the state information from the memory, wherein the plurality of decoders decode codewords using a plurality of R-codes, wherein the plurality of R-codes include at least one non-maximum length run of most probable symbols that is not followed by a least probable symbol.

The invention also provides a system for decoding a code stream having a plurality of codewords, said system comprising:

a context modelling mechanism for providing contexts, wherein the context modelling mechanism comprises a plurality of integrated circuits; a memory coupled to the context model for storing state information, wherein the memory provides state information in response to each context provided by the context model; and a plurality of decoders coupled to the memory for decoding codewords using the state information from the memory.

The invention further provides a system for decoding a code stream having a plurality of codewords, said system comprising:

a context modelling mechanism for providing contexts; a memory coupled to the context model for storing state information, wherein the memory provides state information in response to each context provided by the context model; and a plurality of decoders coupled to the memory for decoding codewords using the state information from the memory, wherein at least one of the plurality of decoders comprises a delay tolerant decoder.

The present invention will now be described, by way of example only, with reference to the accompanying drawings. To assist in more fully understanding the present invention, it will be described and illustrated in the context of examples of coding and decoding methods and apparatus. The following is a brief description of the accompanying drawings.

Figure 1 is a block diagram of a prior art binary entropy encoder and decoder.

Figure 2A is a block diagram of an example of a decoding system.

Figure 2B is a block diagram of an example of an encoding system.

Figure 2C is a block diagram of an example of a decoding system which processes context bins in parallel.

Figure 2D is a block diagram of an example of a decoding system which processes probability classes in parallel.

Figure 3 illustrates a non-interleaved code stream.

Figure 4 illustrates an example of the interleaved code stream as derived from an exemplary set of data.

Figure 5 is one example of a probability estimation table and bit-stream generator for an R-coder.

Figure 6 is a block diagram of one example of an encoder.

Figure 7 is a block diagram of one example of a bit generator.

Figure 8 is a block diagram of one example of a reorder unit.

Figure 9 is a block diagram of one example of a run count reorder unit.

Figure 10 is a block diagram of another example of a run count reorder unit.

Figure 11 is a block diagram of one example of a bit packing unit.

Figure 12 is a block diagram of one example of the packing logic.

Figure 13 is a block diagram of the encoder bit generator.

Figure 14A is a block diagram of an example of a decoding system.

Figure 14B is a block diagram of an example of a decoder.

Figure 14C is a block diagram of an example of a FIFO structure.

Figure 15A illustrates one example of a decoding pipeline.

Figure 15B illustrates an example of a decoder

Figure 16A is a block diagram of one example of a shifter.

Figure 16B is a block diagram of another example of a shifter.

Figure 17 is a block diagram of a system having an external context model according to the present invention.

Figure 18 is a block diagram of another system having an external context model according to the present invention.

Figure 19 is a block diagram of one example of a decoder.

Figure 20 is a block diagram of one example of a decoder with separate bit generators.

Figure 21 is a block diagram of one example of a bit generator.

Figure 22 is a block diagram of one example of a long run unit.

Figure 23 is a block diagram of one example of a short run unit.

Figure 24 is a block diagram of one example of an initialization and control logic.

Figure 25 is a block diagram of one example of reordering data for a snooper decoder.

Figure 26 is a block diagram of another example of a reordering unit.

Figure 27 is a block diagram of another example of a reordering unit using a merged queue.

Figure 28 is a block diagram of a high bandwidth system using the present invention.

Figure 29 is a block diagram of a bandwidth matching system using the present invention.

Figure 30 is a block diagram of a real-time video system using the present invention.

Figure 31 illustrates one example of the coded data memory.

Figure 32 is a timing diagram of a decoding system.

Figure 33 is a graph of coding efficiency versus MPS probability for different R-codes.

A method and apparatus for parallel encoding and decoding of data is described. In the following description, numerous specific details are set forth, such as specific numbers of bits, numbers of coders, specific probabilities, types of data, etc., in order to provide a thorough understanding of the preferred embodiments of the present invention. It will be understood by one skilled in the art that the present invention may be practiced without these specific details. Also, well-known circuits have been shown in block diagram form rather than in detail in order to avoid unnecessarily obscuring the present invention.

Some portions of the detailed descriptions which follow are presented in terms of algorithms and symbolic representations of operations on data bits within a computer memory. These algorithmic descriptions and representations are the means used by those skilled in the data processing arts to most effectively convey the substance of their work to others skilled in the art. An algorithm is here, and generally, conceived to be a self-consistent sequence of steps leading to a desired result. The steps are those requiring physical manipulations of physical quantities. Usually, though not necessarily, these quantities take the form of electrical or magnetic signals capable of being stored, transferred, combined, compared, and otherwise manipulated. It has proven convenient at times, principally for reasons of common usage, to refer to these signals as bits, values, elements, symbols, characters, terms, numbers, or the like.

It should be borne in mind, however, that all of these and similar terms are to be associated with the appropriate physical quantities and are merely convenient labels applied to these quantities. Unless specifically stated otherwise as apparent from the following discussions, it is appreciated that throughout the present invention, discussions utilizing terms such as "processing" or "computing" or "calculating" or "determining" or "displaying" or the like, refer to the action and processes of a computer system, or similar electronic computing device, that manipulates and transforms data represented as physical (electronic) quantities within the computer system's registers and memories into other data similarly represented as physical quantities within the computer system memories or registers or other such information storage, transmission or display devices.

The present examples also relate to apparatus for performing the operations herein. This apparatus may be specially constructed for the required purposes, or it may comprise a general purpose computer selectively activated or reconfigured by a computer program stored in the computer. The algorithms and displays presented herein are not inherently related to any particular computer or other apparatus. Various general purpose machines may be used with programs in accordance with the teachings herein, or it may prove convenient to construct more specialized apparatus to perform the required method steps. The required structure for a variety of these machines will appear from the description below. In addition, the present invention is not described with reference to any particular programming language. It will be appreciated that a variety of programming languages may be used to implement the teachings of the invention as described herein.

Parallel Entropy Coding

The present invention provides a parallel entropy coding system. The system includes an encoder and a decoder. In one example, the encoder performs encoding on data in real-time. Similarly, in one example, the decoder of the present invention performs decoding on data in real-time. Together, the real-time encoder and real-time decoder form a balanced coding system.

The present invention provides a system that decodes losslessly encoded data in parallel. The data is decoded in parallel by using multiple decoding resources. Each of the multiple decoding resources is assigned data (e.g., codewords) from the data stream to decode. The assignment of the data stream occurs on the fly wherein the decoding resources decode data concurrently, thereby decoding the data stream in parallel. In order to enable the assignment of data in a manner which makes efficient use of the decoding resources, the data stream is ordered. This is referred to as parallelizing the data stream. The ordering of data allows each decoding resource to decode any or all of the coded data without waiting for feedback from the context model.

Figure 2A illustrates a decoding system without the slow feedback loop of the prior art. An input buffer 204 receives coded data (i.e., codewords) and a feedback signal from decoder 205 and supplies coded data in a predetermined order (e.g., context bin order) to decoder 205 of the present invention, which decodes the coded data. Decoder 205 includes multiple decoders (e.g., 205A, 205B, 205C, etc.).

In one example, each of the decoders 205A, 205B, 205C, etc. is supplied data for a group of contexts. Each of the decoders in decoder 205 is supplied coded data for every context bin in its group of contexts from input buffer 204. Using this data, each decoder 205A, 205B, 205C, etc. produces the decoded data for its group of context bins. The context model is not required to associate coded data with a particular group of context bins.

The decoded data is sent by decoder 205 to decoded data storage 207 (e.g., 207A, 207B, 207C, etc.). Note that decoded data storage 207 may store intermediate data that is neither coded nor uncoded, such as run counts. In this case, decoded data storage 207 stores the data in a compact, but not entropy coded, form.

Operating independently, context model 206 is coupled to receive the previously decoded data from decoded data storage 207 (i.e., 207A, 207B, 207C, etc.) in response to a feedback signal it sends to decoded data storage 207. Therefore, two independent feedback loops exist, one between decoder 205 and input buffer 204 and a second between context model 206 and decoded data storage 207. Since the large feedback loop is eliminated, the decoders in decoder 205 (e.g., 205A, 205B, 205C, etc.) are able to decode their associated codewords as soon as they are received from input buffer 204.

The context model provides the memory portion of the coding system and divides a set of data (e.g., an image) into different categories (e.g., context bins) based on the memory. In the present examples, the context bins are considered independent ordered sets of data. In one example, each group of context bins has its own probability estimation model and each context bin has its own state (where probability estimation models are shared). Therefore, each context bin could use a different probability estimation model and/or bit-stream generator.

Thus, the data is ordered, or parallelized, and data from the data stream is assigned to individual coders for decoding.

Adding Parallelism to the Classic Entropy Coding Model

To parallelize the data stream, the data may be divided according to either context, probability, tiling, codeword sequence (based on codewords), etc. The reordering of the coded data stream is independent of the parallelism, a method used to parallelize data or the probability at any other point. A parallel encoder portion of an encoding system of the present example, fed by data differentiated by context model (CM), is shown in Figure 2B.

Referring to Figure 2B, the context dependent parallel encoder portion comprises context model (CM) 214, probability estimation modules (PEMs) 215-217, and bit-stream generators (BGs) 218-220. CM 214 is coupled to receive coded input data. CM 214 is also coupled to PEMs 215-217. PEMs 215-217 are also coupled to BGs 218-220, respectively, which output code streams 1, 2 and 3 respectively. Each PEM and BG pair comprises a coder. Therefore, the parallel encoder is shown with three coders. Although only three parallel coders are shown, any number of coders may be used.

CM 214 divides the data stream into different contexts in the same way as a conventional CM and sends the multiple streams to the parallel hardware encoding resources. Individual contexts, or groups of contexts, are directed to separate probability estimators (PEMs) 215-217 and bit generators (BGs) 218-220. Each of BGs 218-220 outputs a coded data stream.

Figure 2C is a block diagram of one example of the decoder portion of the decoding system. Referring to Figure 2C, a context dependent parallel decoder is shown having BGs 221-223, PEMs 224-226 and CM 227. Code streams 1-3 are coupled to BGs 221-223 respectively. BGs 221-223 are also coupled to PEMs 224-226 respectively.

PEMs 224-226 are coupled to CM 227, which outputs the reconstructed input data. The input comes from several code streams, shown as code streams 1-3. One code stream is assigned to each PEM and BG. Each of BGs 221-223 returns a bit representing whether the binary decision is in its more probable state, which the PEMs 224-226 use to return decoded bits (e.g., the binary decision). Each of PEMs 224-226 is associated with one of BGs 221-223, indicating which code is to be used to produce a data stream from its input code stream. CM 227 produces a decoded data stream by selecting the decoded bits from the bit-stream generators in the proper sequence, thereby recreating the original data. Thus, the CM 227 obtains the decompressed data bit from the appropriate PEM and BG, in effect reordering the data into the original order. Note that the control for this design flows in the reverse direction of the data stream. The BG and PEM may decode data before the CM 227 needs it, staying one or more bits ahead. Alternatively, the CM 227 may request (but not receive) a bit from one BG and PEM and then request one or more bits from other BGs and PEMs before using the initially requested bit.
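The reordering role of the context model in this arrangement can be illustrated with a small Python sketch. It is a toy model under loud assumptions: the context is taken to be the previously decoded bit (so there are two context bins), and each per-context stream holds raw bits in place of a real PEM and BG pair. The names split_by_context and merge_by_context are hypothetical.

```python
from collections import deque


def split_by_context(bits, num_contexts=2):
    """Encoder side: route each bit to the stream of its context."""
    streams = [[] for _ in range(num_contexts)]
    prev = 0  # assumed initial history
    for b in bits:
        streams[prev].append(b)  # context = previous bit (assumed model)
        prev = b
    return streams


def merge_by_context(streams):
    """Decoder side: pull each bit from the stream matching the current
    context, which restores the original order of the data."""
    queues = [deque(s) for s in streams]
    total = sum(len(s) for s in streams)
    out, prev = [], 0
    for _ in range(total):
        b = queues[prev].popleft()  # the context model selects the source
        out.append(b)
        prev = b
    return out
```

Because each queue can be filled ahead of time by its own decoder, the selection step is the only part that remains serial, which matches the remark that the BG and PEM may stay one or more bits ahead of the CM.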

The configuration shown in Figure 2C is designed to couple the PEM and BG tightly. The IBM Q-Coder is a good example of a coder having a tightly coupled PEM and BG. Local feedback loops between these two are not a fundamental limit to system performance.

In a different design, the PEM could differentiate the data and send it to parallel BG units. Thus, there would be only one CM and PEM and the BG is replicated. Adaptive Huffman coding and finite state machine coding could be used in this way.

A similar decoding system that uses the PEM to differentiate the data and send it to parallel BGs is shown in Figure 2D. In this case, probability classes are handled in parallel and each bit-stream generator is assigned to a specific probability class and receives knowledge of the result. Referring to Figure 2D, the coded data streams 1-3 are coupled to one of multiple bit-stream generators (e.g., BG 232, BG 233, BG 234, etc.), which are coupled to receive it. Each of the bit-stream generators is coupled to PEM 235. PEM 235 is also coupled to CM 236. In this configuration, each of the bit-stream generators decodes coded data and the results of the decoding are selected by PEM 235 (instead of by CM 236). Each of the bit-stream generators receives coded data from a source associated with one probability class (i.e., where the coded data could come from any context bin). PEM 235 selects the bit-stream generators using a probability class. The probability class is dictated by the context bin provided to it by CM 236. In this manner, decoded data is produced by processing probability classes in parallel.

Numerous implementations exist for the parallel decoding systems.

In one example, the coded data streams corresponding to the multiple context bins can be interleaved into one stream ordered by the demands of the various coders. In one example, the coded data is ordered such that each coder is constantly supplied with data even though the coded data is delivered to the decoder in one stream. Note that the present examples operate with all types of data, including image data.

By using small simple coders that can be cheaply replicated in integrated circuits, coded data can be decoded quickly in parallel. In one example, the coders are implemented in hardware using field programmable gate array (FPGA) chips or a standard cell application specific integrated circuit (ASIC) chip. The combination of parallelism and simple bit-stream generators allows the decoding of coded data to occur at speeds in excess of the prior art decoders, while maintaining or exceeding the compression efficiency of prior decoding systems.

Channel Ordering of Multiple Data Streams

There are many different design issues and problems that affect system performance. A few of these will be mentioned below. However, the examples shown in Figure 2B and 2C (and 2D) use the multiple code streams. Systems with parallel channels that could accommodate this embodiment are imaginable: multiple telephone lines, multiple heads on a disk drive, etc. In some applications, only one channel is available, or convenient. Indeed, if multiple channels are required there may be poor utilization of the bandwidth because of the bursty nature of the individual code streams.

In one example, the code streams are concatenated and sent contiguously to the decoder. A preface header contains pointers to the beginning bit location of each stream. Figure 3 illustrates one example of the arrangement of this data. Referring to Figure 3, three pointers 301-303 indicate the starting location in the concatenated code of code streams 1, 2 and 3 respectively. The complete compressed data file is available in a buffer to the decoder. As needed, the codewords are retrieved from the proper location via the proper pointer. The pointer is then updated to the next codeword in that code stream.
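The pointer-based arrangement can be sketched as follows. This is an illustrative Python model only: it indexes whole codewords in a list rather than bit locations, it performs no check that a read stays inside its own stream, and the names concatenate_streams and PointerReader are hypothetical.

```python
def concatenate_streams(streams):
    """Return (pointers, data), where pointers[i] is the start of stream i."""
    pointers, data = [], []
    for stream in streams:
        pointers.append(len(data))  # header entry for this stream
        data.extend(stream)
    return pointers, data


class PointerReader:
    """Decoder-side view: one moving pointer per code stream."""

    def __init__(self, pointers, data):
        self.pos = list(pointers)  # copy so the header is left untouched
        self.data = data

    def next_codeword(self, stream_index):
        codeword = self.data[self.pos[stream_index]]
        self.pos[stream_index] += 1  # advance to the next codeword
        return codeword
```

The whole concatenated buffer must be present before the first read, which is why this approach needs a full frame of storage at the decoder.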

Note that this method requires an entire coded frame to be stored at the decoder and, for practical purposes, at the encoder. If a real-time system, or less bursty data flow, is required then two frame buffers may be used for banking at both the encoder and the decoder.

Data Order to Codeword Order

Notice that a decoder decodes codewords in a given deterministic order. With parallel coding, the order of the requests to the code stream is deterministic. Thus, if the codewords from parallel code streams can be interleaved in the right order at the encoder, then a single code stream will suffice. The codewords are delivered to the decoder in the same order on a just-in-time basis. At the encoder, a model of the decoder determines the codeword order and packs the codewords into a single stream. This model might be an actual decoder.

A problem with delivering data to the parallel decoding elements arises when data is variable length. Unpacking a stream of variable length codewords requires using a bit shifter to align the codewords. Bit shifters are often costly and/or slow when implemented in hardware. The control of the bit shifter depends on the size of the particular codeword. This control feedback loop prevents variable length shifting from being performed quickly.

The virtues of feeding multiple decoders with a single stream cannot be realized if the process of unpacking the stream is performed in a single bit shifter that is not fast enough to keep up with the multiple decoders.
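The dependence of the shift amount on the codeword just decoded can be seen in a short Python sketch. It assumes a prefix-free code given as a set of '0'/'1' strings, which is a stand-in chosen for illustration and not the R-codes of the specification; the name unpack_codewords is hypothetical.

```python
def unpack_codewords(bitstring, code_table):
    """Split a packed bit string into codewords of a prefix-free code.

    code_table: set of valid codewords written as '0'/'1' strings.
    """
    max_len = max(len(c) for c in code_table)
    pos, out = 0, []
    while pos < len(bitstring):
        for length in range(1, max_len + 1):
            candidate = bitstring[pos:pos + length]
            if candidate in code_table:
                out.append(candidate)
                pos += length  # the shift depends on this codeword's length
                break
        else:
            raise ValueError("no valid codeword at bit %d" % pos)
    return out
```

The start of each codeword is unknown until the previous one has been recognised, so a single shifter of this kind processes codewords strictly one after another.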

The solution described herein separates the problem of distributing the coded data to the parallel coders from the alignment of the variable-length codewords for decoding. The codewords in each independent code stream are packed into fixed-length words, called interleaved words. At the decoder end of the channel these interleaved words can be distributed to the parallel decoder units with fast hardwired data lines and a simple control circuit.

It is convenient to have the interleaved word length larger than the maximum codeword length so that at least enough bits to complete one codeword are contained in each interleaved word. The interleaved words can contain many codewords and parts of codewords. Figure 4 illustrates the interleaving of an example set of parallel code streams.

These words are interleaved according to the demand at the decoder.

Each independent decoder receives an entire interleaved word. The bit shifting operation is now done locally at each decoder, maintaining the parallelism of the system. Note in Figure 4 that the first codeword in each interleaved word is the lowest remaining codeword in the set. For instance, the first interleaved words come from code stream 1, starting with the lowest codeword (i.e., #1). This is followed by the first interleaved word in code stream 2 and then by the first interleaved word in code stream 3. However, the next lowest codeword not contained completely in an already ordered interleaved word is #7. Therefore, the next word in the stream is the second interleaved word of code stream 2.
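The demand-driven ordering described above can be sketched in software. The following Python fragment is a minimal illustration only, not the circuit of the present examples: the function name `interleave`, the representation of codewords as bit strings, and the zero padding of the final word are all assumptions made for the sketch. Each stream is packed into fixed-length words, and the words are ordered by the number of the first codeword that is not completely contained in an earlier word of the same stream.

```python
def interleave(streams, word_bits):
    """Pack each code stream into fixed-length words and order them by demand.

    streams: dict mapping a stream id to a list of (codeword_number, bits),
             where bits is a string of '0'/'1' characters (an assumption of
             this sketch) and codeword_number gives the decoding order.
    """
    words = []
    for stream_id, codewords in streams.items():
        bits = ""
        owner = []  # owner[i] is the number of the codeword holding bit i
        for number, codeword in codewords:
            bits += codeword
            owner.extend([number] * len(codeword))
        for start in range(0, len(bits), word_bits):
            # Pad the last word with zeros (hypothetical choice).
            chunk = bits[start:start + word_bits].ljust(word_bits, "0")
            # The decoder asks for this word when it reaches the codeword
            # that owns the first bit of the word.
            words.append((owner[start], stream_id, chunk))
    words.sort(key=lambda word: word[0])
    return [(stream_id, chunk) for _, stream_id, chunk in words]


example = {
    1: [(1, "10"), (3, "110"), (8, "0111")],
    2: [(2, "111"), (5, "0"), (7, "10")],
}
ordered = interleave(example, word_bits=4)
```

In this small example the second word of stream 1 is requested when codeword #3 is decoded, because codeword #3 straddles the word boundary, which is why it precedes the second word of stream 2.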

In another example, the order in which the subsequent set of interleaved words (e.g., the interleaved word starting with codeword #8 in stream 1, the interleaved word starting with codeword #7 in stream 2, the interleaved word starting with codeword #11 in stream 3) are inserted into the interleaved code stream is based on the first codeword of the previous set of interleaved words (e.g., the interleaved word starting with codeword #1 in stream 1, the interleaved word starting with codeword #2 in stream 2, the interleaved word starting with codeword #4 in stream 3), and the words are ordered from the interleaved word with the lowest numbered first codeword to the interleaved word with the highest numbered first codeword.

Therefore, in this case, since the interleaved word starting with codeword #1 was first, then the next interleaved word in stream 1 is the first of the second group of interleaved words to be inserted into the interleaved stream, followed by the next interleaved word in stream 2 and then the next interleaved word in stream 3. Note that after the second group of interleaved words is inserted into the interleaved stream, the next interleaved word in stream 2 would be the next interleaved word inserted into the stream because codeword #7 is the lowest codeword of the second set of interleaved words (followed by codeword #8 in stream 1 and then codeword #11 in stream 3).

Using the actual decoder as the modeler for the data stream accounts for all design choices and delays to create the interleaved stream. This is not a great cost for duplex systems that have both encoders and decoders anyway. Note that this can be generalized to any parallel set of variable-length (or different sized) data words that are consumed in a deterministic order.

Types of Codes and Bit-Stream Generators For Parallel Decoding

The present systems could employ existing coders, such as Q-coders or B-coders, as the bit-stream generation elements which are replicated in parallel. However, other codes and coders may be used. The coders and their associated codes employed by the present example are simple coders.

Using a bit-stream generator with a simple code instead of a complex code, such as the arithmetic code used by the Q-coder or the multi-state codes used by the B-coder, offers advantages. A simple code is advantageous in that the hardware implementation is much faster and simpler and requires less silicon than a complex code.

Another advantage is that coding efficiency can be improved. A code that uses a finite amount of state information cannot perfectly meet the Shannon entropy limit for every probability. Hardware implemented codes known in the art that allow a single bit-stream generator to handle multiple probabilities or contexts have constraints that reduce coding efficiency. Removing the constraints needed for multiple contexts or probability classes allows the use of codes that come closer to meeting the Shannon entropy limit.

R-codes

The code (and coder) employed by one exemplary system is referred to as an R-code. R-codes are adaptive codes that convert a variable number of identical input symbols into a codeword. In an embodiment, the R-codes are parameterized so that many different probabilities can be handled by a single decoder design. Moreover, the R-codes of the present invention can be decoded by simple, high-speed hardware.

In the present examples, R-codes are used by an R-coder to perform encoding or decoding. In one example, an R-coder is a combined bit-stream generator and probability estimation module. For instance, in Figure 1, an R-coder could include the combination of probability estimation module 102 and bit-stream generator 103 and the combination of probability estimation module 105 with bit-stream generator 106.

Codewords represent runs of the most probable symbol (MPS). An MPS represents the outcome of a binary decision with more than 50% probability. On the other hand, the least probable symbol (LPS) represents the outcome in a binary decision with less than 50% probability. Note that when two outcomes are equally probable, it is not important which is designated MPS or LPS as long as both the encoder and decoder make the same designation. The resulting bit sequence in the compressed file is shown in Table 1, for a given parameter referred to as MAXRUN.

Table 1 - Bit-generation Encoding

    Codeword    Meaning
    0           MAXRUN consecutive MPSs
    1N          N consecutive MPSs followed by LPS, N < MAXRUN

To encode, the number of MPSs in a run is counted by a simple counter. If that count equals the MAXRUN count value, a 0 codeword is emitted into the code stream and the counter is reset. If an LPS is encountered, then a 1 followed by the bits N, which uniquely describe the number of MPS symbols before the LPS, is emitted into the code stream. (Note that there are many ways to assign the N bits to describe the run length.) Again the counter is reset. Note that the number of bits needed for N is dependent on the value of MAXRUN. Also note that the 1's complement of the codewords could be used.

To decode, if the first bit in the code stream is 0, then the value of MAXRUN is put in the MPS counter and the LPS indication is cleared. Then the 0 bit is discarded. If the first bit is a 1, then the following bits are examined to extract the bits N and the appropriate count (N) is put in the MPS counter and the LPS indicator is set. Then the code stream bits containing the 1N codeword are discarded.
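The counter-based encoding and decoding rules of Table 1 can be sketched as follows. This is a minimal Python illustration for an R2(k) code (MAXRUN equal to 2 to the power k), not the hardware of the present examples. The function names are hypothetical, symbols are represented as 0 for MPS and 1 for LPS, and N is assigned as the plain k-bit binary run count, which is only one of the many possible assignments and differs from the assignments used in the tables that follow.

```python
def r2_encode(symbols, k):
    """Encode a list of symbols (0 = MPS, 1 = LPS) with an R2(k) code.

    Returns the code string and the length of any unfinished run of MPSs
    left in the counter at the end of the input.
    """
    maxrun = 1 << k
    out = []
    run = 0
    for symbol in symbols:
        if symbol == 0:
            run += 1
            if run == maxrun:
                out.append("0")          # MAXRUN consecutive MPSs
                run = 0
        else:
            n_bits = format(run, "0{}b".format(k)) if k > 0 else ""
            out.append("1" + n_bits)     # N MPSs followed by an LPS
            run = 0
    return "".join(out), run


def r2_decode(code, k):
    """Decode a string produced by r2_encode back into symbols."""
    maxrun = 1 << k
    symbols = []
    position = 0
    while position < len(code):
        if code[position] == "0":
            symbols.extend([0] * maxrun)
            position += 1
        else:
            n = int(code[position + 1:position + 1 + k], 2) if k > 0 else 0
            symbols.extend([0] * n + [1])
            position += 1 + k
    return symbols


data = [0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 1]
code, pending = r2_encode(data, k=2)
```

A real coder would also need a rule for flushing a run that is still open at the end of the data; the sketch simply reports its length.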

R-codes are generated by the rules in Table 1. Note that a given R-code Rx(k) is defined by its MAXRUN. For instance:

    MAXRUN for Rx(k) = x * 2^(k-1),

thus

    MAXRUN for R2(k) = 2 * 2^(k-1),
    MAXRUN for R3(k) = 3 * 2^(k-1),
    etc.
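As a quick check of the formula, the following Python sketch evaluates MAXRUN for a few codes. The function name `maxrun` is hypothetical; the integer shift is simply a way of computing x * 2^(k-1) without fractions when k is 0.

```python
def maxrun(x, k):
    """MAXRUN for the code Rx(k), i.e. x * 2**(k - 1)."""
    if k == 0 and x % 2 != 0:
        # x * 2**(-1) is not an integer for odd x, so no such code exists.
        raise ValueError("Rx(0) is only defined for even x")
    return (x << k) >> 1


values = [maxrun(2, 0), maxrun(2, 1), maxrun(3, 1), maxrun(2, 2), maxrun(3, 2)]
```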

Note that R-codes are a subset of Golomb codes. Also note that Rice codes use R2(.) codes only. The R-codes of the present invention allow the use of both R2(k) and R3(k) codes, and other Rn(k) codes if desired. In one embodiment, R2(k) and R3(k) codes are used. Note that Rn exists for n=2 and n equals any odd number (e.g., R2, R3, R5, R7, R9, R11, R13, R15). In one embodiment, for an R2(k) code, the run count, r, is encoded in N; the run count, r, is described in k bits, such that 1N is represented with k+1 bits. Also in one embodiment, for an R3(k) code, the bits N can contain 1 bit to indicate if n < 2^(k-1) or n >= 2^(k-1) and either k-1 or k bits to indicate the run count, r, such that the variable N is represented by a total of k or k+1 bits respectively. In other embodiments, the 1's complement of N could be used in the codeword. In this case, the MPS tends to produce code streams with many 0s and LPS tends to produce code streams with many 1s.

Tables 2, 3, 4 and 5 depict some efficient R-codes utilized for one embodiment of the present invention. It should be noted that other run length codes may also be used in the present invention. An example of an alternative run length code for R2(2) is shown in Table 6. Tables 7 and 8 show examples of the codes used in an embodiment.

Table 2

    uncoded data    codeword
    0               0
    1               1

Table 3

    uncoded data    codeword
    00              0
    01              10
    1               11

Table 4

    uncoded data    codeword
    000             0
    001             100
    01              101
    1               11

Table 5

    uncoded data    codeword
    0000            0
    0001            100
    001             101
    01              110
    1               111

Table 6 - Alternative R2(2)

    uncoded data    Alternative R2(2)
    0000            0
    0001            111
    001             101
    01              110
    1               100

Table 7 - Preferred R3(2) Code

    uncoded data    Preferred R3(2)
    000000          0
    000001          1000
    00001           1010
    0001            1001
    001             1011
    01              110
    1               111

Table 8 - Preferred R2(2) Code

    uncoded data    Preferred R2(2)
    0000            0
    0001            100
    001             110
    01              101
    1               111

Probability Estimation Model for R-Codes

In one example, the R2(0) code performs no coding: an input of 0 is encoded into a 0 and an input of 1 is encoded into a 1 (or vice versa) and is optimal for probabilities equal to 50%. The R2(1) code of the currently preferred embodiment is optimal for probabilities close to 0.707 (i.e., 70.7%) and the R3(1) is optimal for the 0.794 probability (79.4%). The R2(2) code is optimal for the 0.841 probability (84.1%). Table 9 below depicts the near-optimal run-length code, where the probability skew is defined by the following equation:

    Probability skew = -log2(LPS),

where LPS denotes the probability of the least probable symbol.

Table 9

    probability    probability skew    Best Golomb Code
    .500           1.00                R2(0)
    .707           1.77                R2(1)
    .841           2.65                R2(2)
    .917           3.59                R2(3)
    .958           4.56                R2(4)
    .979           5.54                R2(5)
    .989           6.54                R2(6)
    .995           7.53                R2(7)
    .997           8.53                R2(8)
    .999           9.53                R2(9)

Note that the codes are near-optimal in that the probability range, as indicated by the probability skew, is covering the space relatively evenly even though the optimal probabilities do not differentiate as much in the higher k values as in the lower k values.
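The probability skew column of Table 9 can be reproduced with a few lines of Python. The function `probability_skew` follows the equation above directly. The helper `optimal_probability` is an assumption of this sketch rather than a statement from the text: it uses the observation that the listed optimal probabilities (0.707, 0.794, 0.841, ...) agree with 2^(-1/MAXRUN), the point at which the 0 codeword is emitted half of the time.

```python
import math


def probability_skew(mps_probability):
    """Probability skew = -log2(probability of the LPS)."""
    return -math.log2(1.0 - mps_probability)


def optimal_probability(maxrun):
    """MPS probability at which a run of MAXRUN MPSs occurs half the time.

    Hypothetical helper: an observation that matches the tabulated values.
    """
    return 2.0 ** (-1.0 / maxrun)


skews = [round(probability_skew(p), 2) for p in (0.5, 0.707, 0.841)]
```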

Reference is made to the probability at which an R-code is optimal. In fact, only R2(2) meets the entropy curve. The real consideration is for what range of probabilities a particular R-code is better than all other R-codes in a given class. The following tables provide the probability ranges for the class of R2 codes and the class of R2 and R3 codes.

For the class of R2 codes from 0 to 12 the ranges are in the Table 10 below. For example, when only R2 codes are used, R2(0) is best when 0.50 <= probability <= 0.6180. Similarly, R2(1) is best when 0.6180 <= probability <= 0.7862.

Table 10 - R2 Codes from 0 to 12

    Code      Probabilities (lower bound - upper bound)
    R2(0)     0.5000 - 0.6180
    R2(1)     0.6180 - 0.7862
    R2(2)     0.7862 - 0.8867
    R2(3)     0.8867 - 0.9416
    R2(4)     0.9416 - 0.9704
    R2(5)     0.9704 - 0.9851
    R2(6)     0.9851 - 0.9925
    R2(7)     0.9925 - 0.9962
    R2(8)     0.9962 - 0.9981
    R2(9)     0.9981 - 0.9991
    R2(10)    0.9991 - 0.9995
    R2(11)    0.9995 - 0.9998
    R2(12)    0.9998 and above

For the class of R2 and R3 codes the solutions are in the Table 11 below. For example, when R2 and R3 codes are used, R2(1) is best when 0.6180 <= probability <= 0.7549.

Table 11 - R2 and R3 codes with lengths less than or equal to 13 bits

    Code      Probabilities (lower bound - upper bound)
    R2(0)     0.5000 - 0.6180
    R2(1)     0.6180 - 0.7549
    R3(1)     0.7549 - 0.8192
    R2(2)     0.8192 - 0.8688
    R3(2)     0.8688 - 0.9051
    R2(3)     0.9051 - 0.9321
    R3(3)     0.9321 - 0.9514
    R2(4)     0.9514 - 0.9655
    R3(4)     0.9655 - 0.9754
    R2(5)     0.9754 - 0.9826
    R3(5)     0.9826 - 0.9876
    R2(6)     0.9876 - 0.9913
    R3(6)     0.9913 - 0.9938
    R2(7)     0.9938 - 0.9956
    R3(7)     0.9956 - 0.9969
    R2(8)     0.9969 - 0.9978
    R3(8)     0.9978 - 0.9984
    R2(9)     0.9984 - 0.9989
    R3(9)     0.9989 - 0.9992
    R2(10)    0.9992 - 0.9995
    R3(10)    0.9995 - 0.9996
    R2(11)    0.9996 - 0.9997
    R3(11)    0.9997 - 0.9998
    R2(12)    0.9998 and above

An R2(k) for a fixed k is called a run-length code. However, a fixed k is only best for a range of probabilities. It is noted that when coding near an optimal probability, an R-code according to the present example uses 0 and 1N codewords with roughly equal frequency. In other words, half the time, the R-coder of the present example outputs one code and the other half of the time, the R-coder outputs the other. By examining the number of 0 and 1N codewords, a determination can be made as to whether the best code is being used. That is, if too many 1N codewords are being output, then the run-length is too long; on the other hand, if too many 0 codewords are being output, then the run length is too short.
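The probability ranges for the R2 class can be checked numerically. The Python sketch below is an illustration under stated assumptions: it models an R2(k) code as emitting a 1-bit codeword for MAXRUN consecutive MPSs and a (k+1)-bit codeword otherwise, computes the expected number of code bits per input symbol for an independent source, and picks the k with the lowest rate. The names `r2_rate` and `best_r2_code` are hypothetical.

```python
def r2_rate(k, p):
    """Expected code bits per input symbol for R2(k) at MPS probability p < 1."""
    maxrun = 1 << k
    p_full = p ** maxrun                      # probability of a 0 codeword
    bits = p_full * 1 + (1.0 - p_full) * (k + 1)
    symbols = (1.0 - p_full) / (1.0 - p)      # mean symbols per codeword
    return bits / symbols


def best_r2_code(p, max_k=12):
    """Return the k for which R2(k) gives the lowest rate at probability p."""
    return min(range(max_k + 1), key=lambda k: r2_rate(k, p))


choices = [best_r2_code(p) for p in (0.55, 0.70, 0.80, 0.90)]
```

The rates of R2(0) and R2(1) coincide near 0.6180 and those of R2(1) and R2(2) near 0.7862, in agreement with the first two boundaries of the R2 table.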

The probability estimation model used by Langdon examines the first bit of each codeword to determine whether the source probability is above or below the current estimate. See G.G. Langdon, "An Adaptive Run-Length Coding Algorithm", IBM Technical Disclosure Bulletin, Vol. 26, No. 7B, Dec. 1983. Based on this determination, k is increased or decreased. For example, if a codeword indicating MPS is seen, the probability estimate is too low. Therefore, according to Langdon, k is increased by 1 for each 0 codeword. If a codeword indicating less than MAXRUN MPS followed by an LPS (e.g., 1N codeword) is seen, the probability estimate is too high. Therefore, according to Langdon, k is decreased by 1 for each 1N codeword.
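The adaptation rule attributed to Langdon above amounts to a one-line update per codeword. The Python sketch below is illustrative only: the function name and the clamping limits (0 and a hypothetical `k_max` of 12) are assumptions, since the text does not say what happens at the ends of the range.

```python
def langdon_adapt(k, codeword, k_max=12):
    """Return the next k after seeing one codeword (a '0'/'1' string)."""
    if codeword[0] == "0":
        return min(k + 1, k_max)   # run completed: estimate was too low
    return max(k - 1, 0)           # LPS ended the run: estimate was too high


k = 3
for cw in ["0", "0", "101", "0"]:
    k = langdon_adapt(k, cw)
```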

The present examples allow more complex probability estimation than the simple increase or decrease of k by 1 every codeword. The present examples include a probability estimation module state that determines the code to use. Many states may use the same code. Codes are assigned to states using a state table or state machine.

In one example, the probability estimate changes state every codeword output. Thus, the probability estimation module increases or decreases the probability estimate depending on whether a codeword begins with a 0 or a 1. For instance, if a 0 codeword is output, an increase of the estimate of the MPS probability occurs. On the other hand, if a 1N codeword is output, the estimate of MPS probability is decreased.

The Langdon coder of the prior art only used R2(k) codes and increased or decreased k for each codeword. The present example, alternatively, uses R2(k) and R3(k) codes, in conjunction with the state table or state machine, to allow the adaptation rate to be tuned to the application.

That is, if there is a small amount of stationary data, adaptation must be quicker to result in more optimal coding, and where there is a larger amount of stationary data, the adaptation time can be longer so that the coding can be chosen to achieve better compression on the remainder of the data. Note that where variable numbers of state changes can occur, application specific characteristics may also influence the adaptation rate. Because of the nature of the R-codes, the estimation for R-codes is simple and requires little hardware, while being very powerful. Figure 33 is a graph of coding efficiency (codelength normalized with respect to entropy) versus MPS probability. Figure 33 shows how some of the R-codes cover the probability space. As an example, Figure 33 shows that for an MPS probability of approximately 0.55, the efficiency of the R2(0) code is 1.01 (or 1% worse than) the entropy limit. In contrast, the R2(1) code has an efficiency of 1.09 (or 9% worse than) the entropy limit. This example shows that using the wrong code for this particular low probability case causes an 8% loss in coding efficiency.

The incorporation of the R3(k) codes allows more probability space to be covered with a greater efficiency. An example probability estimation state table is shown in Figure 5. Referring to Figure 5, the probability estimation state table shows both a state counter and the code associated with each of the separate states in the table. Note that the table includes both positive and negative states. The table is shown having 37 positive states and 37 negative states, including the zero states. The negative states signify a different MPS than the positive states. In one example, the negative states can be used when the MPS is 1 and the positive states can be used when the MPS is 0, or vice versa. Note that the table shown in Figure 5 is an example only and that other tables might have more or fewer states and a different state allocation.

Initially, the coder is in state 0 which is the R2(0) code (i.e., no code) for probability estimate equal to 0.50. After each codeword is processed, the state counter is incremented or decremented depending on the first bit of the codeword. In one example, a codeword of 0 increases the magnitude of a state counter; a codeword starting with 1 decreases the magnitude of the state counter. Therefore, every codeword causes a change to be made in the state by the state counter. In other words, the probability estimation module changes state. However, consecutive states could be associated with the same code. In this case, the probability estimation is accomplished without changing codes every codeword. In other words, the state is changed for every codeword; however, the state is mapped into the same probabilities at certain times. For instance, states 5 to -5 all use the R2(0) code, while states 6 through 11 and -6 through -11 use the R2(1) code. Using the state table of the present example, probability estimation is allowed to stay with the same coder in a non-linear manner.
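The state-counter behaviour described above can be sketched as a small Python class. This is a hedged illustration, not the table of Figure 5: only the two state ranges quoted in the text (magnitudes 0 to 5 using R2(0) and 6 to 11 using R2(1)) are modelled, the cap at magnitude 11 is hypothetical, and the rule that a 1N codeword at magnitude 0 swaps the MPS is an assumption of this sketch.

```python
def code_for_magnitude(magnitude):
    """Map a state magnitude to a code name (fragment quoted in the text)."""
    if magnitude <= 5:
        return "R2(0)"
    if magnitude <= 11:
        return "R2(1)"
    raise ValueError("state not covered by this sketch")


class ProbabilityEstimator:
    MAX_MAGNITUDE = 11  # hypothetical limit of the sketched table

    def __init__(self):
        self.magnitude = 0  # state 0: R2(0), probability estimate 0.50
        self.mps = 0

    def code(self):
        return code_for_magnitude(self.magnitude)

    def update(self, codeword):
        """Change state once per codeword, based on its first bit."""
        if codeword[0] == "0":
            self.magnitude = min(self.magnitude + 1, self.MAX_MAGNITUDE)
        elif self.magnitude > 0:
            self.magnitude -= 1
        else:
            self.mps ^= 1   # assumption: crossing zero swaps the MPS


estimator = ProbabilityEstimator()
history = []
for cw in ["0"] * 6 + ["10"]:
    estimator.update(cw)
    history.append(estimator.code())
```

The code changes only when the state crosses the boundary between magnitudes 5 and 6, even though the state itself changes on every codeword.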

It should be noted that more states with the same R-code are included for the lower probabilities. This is done because the loss of efficiency when using the wrong code at low probabilities is great. The nature of the run length codes state table is to transfer between states after each codeword. In a state table designed to change codes with every change in state, when toggling between states at the lower probabilities, the code toggles between a code which is very close to the entropy efficiency limit and a code which is far from the entropy efficiency limit. Thus, a penalty (in terms of the number of coded data bits) can result in the transition between states. Prior art probability estimation modules, such as Langdon's probability estimation module, lose performance because of this penalty.

In the higher probability run length codes, the penalty for being in the wrong code is not as great. Therefore, in the present examples, additional states are added at the lower probabilities, so that the chances of toggling between the two correct states are increased, thereby reducing the coding inefficiency.

Note that in certain examples, the coder may have an initial probability estimate state. In other words, the coder could start in a predetermined one of the states, such as state 18. In one example, a different state table could be used so that some states would be used for the first few symbols to allow for quick adaptation, and a second state table could be used for the remaining symbols for slow adaptation to allow fine-tuning of the probability estimate. In this manner, the coder may be able to use a more efficient code sooner in the coding process. In another example, the code stream could specify an initial probability estimate for each context. In one example, the increments and decrements are not made according to a fixed number (e.g., 1). Instead, the probability estimate state can be incremented by a variable number according to the amount of data already encountered or the amount of change in the data (stability). Examples of such tables are Tables 21-25 described below.

If the state table is symmetric, as the example table of Figure 5 shows, only half of it (including the zero state) needs to be stored or implemented in hardware. In one example, the state number is stored in sign magnitude (1's) complement form to take advantage of the symmetry. In this manner, the table can be utilized by taking the absolute value of the ones complement number to determine the state and examining the sign to determine whether the MPS is a 1 or 0. This allows the hardware needed for incrementing and decrementing the state to be reduced because the absolute value of the state is used to index the table and the computation of the absolute value of a ones complement number is trivial. In another example, for greater hardware efficiency, a state table can be replaced by a hardwired or programmable state machine. A hardwired state to code converter is one implementation of the state table.
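The reason the absolute value of a ones complement number is trivial is that negation is a plain bitwise inversion, with no carry chain. The Python sketch below illustrates this; the function name `split_state` and the 7-bit register width are hypothetical choices for the example and are not taken from the text.

```python
def split_state(raw, width=7):
    """Split a ones complement state register into (magnitude, negative)."""
    mask = (1 << width) - 1
    raw &= mask
    negative = bool(raw >> (width - 1))        # top bit is the sign
    magnitude = (~raw & mask) if negative else raw
    return magnitude, negative


plus_five = 0b0000101
minus_five = ~plus_five & 0x7F                 # bitwise inversion negates
minus_zero = 0x7F                              # ones complement has a -0
```

Only the magnitude is used to index the half table, while the sign selects which symbol is treated as the MPS.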

Overview of the Balanced Parallel Entropy Coding System

The present invention provides a balanced parallel entropy coding system. The parallel entropy coding system includes both real-time encoding and real-time decoding performed in high speed/low cost hardware. The present invention may be used in numerous lossless coding applications, including, but not limited to, real-time compression/decompression of writeable optical disk or magnetic disk data, real-time compression/decompression of computer network data, real-time compression/decompression of image data in a compressed framestore in a multi-function (e.g., copier, facsimile, scanner, printer, etc.) machine, and real-time compression/decompression of audio data.

Specifying the performance of the encoder requires some attention. It is straightforward to design an encoder that achieves a certain rate for the original data given a sufficiently fast coded data channel. In many applications, however, the goal is for the encoder to utilize the coded data channel efficiently. Coded data channel utilization is impacted by the maximum burst rate of the original data interface, the encoder speed, and the compression achieved on the data. The impact of these effects must be considered over some local amount of data which is dependent on the amount of buffering in the encoder. It is desirable to have an encoder that utilizes the coded data channel efficiently while maintaining encoder speed and high compression and still accommodating the maximum burst rate.

The following description describes an example of such an encoder. A decoder that may be used with the encoder is also described.

Real-time Encoding

Figure 6 is a block diagram of the encoding system.

In one example the encoder performs real-time encoding. Referring to Figure 6, the encoding system 600 includes an encoder 602 coupled to a context model (CM) & state memory 603 for generating coded information in the form of codeword information 604 in response to original data 601. Codeword information 604 is received by a reorder unit 606, which is coupled to a reorder memory 607. In response to codeword information 604, reorder unit 606 in cooperation with reorder memory 607 generates coded data stream 608. It should be noted that the encoding system 600 is not limited to operating on codewords, and may, in other examples, operate on discrete analog waveforms, variable length bit patterns, channel symbols, alphabets, events, etc.

Encoder 602 includes a context model (CM), a probability estimation machine (PEM) and a bitstream generator (BG). The context model and PEM (probability estimation machine) in encoder 602 are essentially identical to those in the decoder (except the direction of data flow). The bit generator of encoder 602 is similar to the decoder bit generator, and is described below.

The result of the coding by encoder 602 is the output of zero or more bits that represent the original data. In one embodiment, the output of the bitstream generator also includes one or more control signals. These control signals provide a control path to the data in the bit stream. In one example, the codeword information may comprise a start of run indication, an end of run indication, a codeword and an index identifying the run count (whether it be by context or probability class) for the codeword. One example of the bit generator is described below.

Reorder unit 606 receives the bits and control signals generated by the bit stream generator (if any) of encoder 602 and generates coded data. In one example, the coded data output by reorder unit 606 comprises a stream of interleaved words.

In one example, reorder unit 606 performs two functions. Reorder unit 606 moves codewords from the end of runs as created by the encoder to the beginning of runs as needed by the decoder, and combines variable length codewords into fixed length interleaved words and outputs them in the proper order required by the decoder.
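The first of these two functions can be sketched with a simple queue. The Python fragment below is a minimal illustration under assumptions, not the reorder unit itself: the class name, the use of a dictionary keyed by the run index, and the absence of any memory limit are choices made for the sketch. A slot is reserved when a run starts, filled when the run ends, and codewords leave the queue only in start-of-run order.

```python
from collections import deque


class RunReorderQueue:
    """Move codewords from end-of-run order to start-of-run order."""

    def __init__(self):
        self.queue = deque()   # slots in the order the runs started
        self.open_runs = {}    # run index -> slot still awaiting a codeword

    def event(self, start, end, index, codeword=None):
        """Process one bit-generator event and return codewords now ready."""
        if start:
            slot = {"codeword": None}
            self.queue.append(slot)
            self.open_runs[index] = slot
        if end:
            self.open_runs.pop(index)["codeword"] = codeword
        ready = []
        while self.queue and self.queue[0]["codeword"] is not None:
            ready.append(self.queue.popleft()["codeword"])
        return ready


q = RunReorderQueue()
first = q.event(True, False, "A")             # run A starts
second = q.event(True, True, "B", "cw_B")     # run B starts and ends at once
third = q.event(False, True, "A", "cw_A")     # run A finally ends
```

Because run A started first, the codeword for run B is held back until run A completes; with a long run this queue can grow without bound, which is the memory problem a fixed-size reorder memory has to deal with.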

The reorder unit 606 uses a temporary reordering memory 607. In one example, where encoding is performed on a workstation, temporary reordering memory 607 can be over 100 Megabytes in size. In the balanced system of the present examples, the temporary reordering memory 607 is much smaller (e.g., approximately 1 Kbyte) and fixed. Thus, in one example real-time encoding is performed using a fixed amount of memory, even if this increases the memory required by the decoder or the bitrate (such as when an output is made prior to the completion of a run). The decoder is able to determine the effects of the reorder unit's limited memory using, for instance, implicit, explicit or instream signaling (as described below). Reorder unit 606 has finite memory available for reordering, but the memory "needed" is unbounded. Both the effect of limited memory for the end of run to beginning of run queue and for interleaved word reordering must be considered.

In one example, the encoding system (and corresponding decoding system) of the present invention performs the encoding (or decoding) using a single integrated circuit chip. In another example, a single integrated circuit contains the encoding system, including its encoder and decoder, and memory. A separate external memory may be added to aid in encoding. A multi-chip module or integrated circuit may contain both the encoding/decoding hardware and the memory.

The encoding system may attempt to increase the effective bandwidth by up to a factor of N. If the compression achieved is less than N:1, then the coded data channel will be fully utilized but the effective bandwidth increase achieved is only equal to the compression rate. If the compression achieved is greater than N:1, then the effective bandwidth is achieved with extra bandwidth being available. In both cases, the compression achieved must be over a local region of the data defined by the amount of buffering present in the encoding system.
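The two cases can be written as a pair of small functions. This Python sketch is only an illustration of the arithmetic: the function names are hypothetical, and the utilization formula is an assumption derived from the statement above (when compression exceeds N:1 the channel carries N/compression of its capacity).

```python
def effective_bandwidth_gain(compression_ratio, n):
    """Effective bandwidth increase, limited to the target factor N."""
    return min(compression_ratio, n)


def channel_utilization(compression_ratio, n):
    """Fraction of the coded data channel in use (assumed model)."""
    return min(1.0, n / compression_ratio)


low = (effective_bandwidth_gain(1.5, 2), channel_utilization(1.5, 2))
high = (effective_bandwidth_gain(4.0, 2), channel_utilization(4.0, 2))
```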

Bit Generator for the Encoder

Figure 7 shows one example of the encoder bit generator.

Bit generator 701 is coupled to receive a probability class and an uncoded bit (e.g., an MPS or LPS indication) as inputs. In response to the inputs, bit generator 701 outputs multiple signals. Two of the outputs are control signals that indicate the start of a run and the end of a run (each codeword represents a run), start signal 711 and end signal 712 respectively. It is possible for a run to start and end at the same time. When a run starts or ends, index output 713 comprises an indication of the probability class (or context) for the uncoded bit. In one example, index output 713 represents a combination of the probability class for the bit and a bank identification for systems in which each probability class is replicated in several banks of memory. Codeword output 714 is used to output a codeword from bit generator 701 when a run ends.

A memory 702 is coupled to bit generator 701 and contains the run count for a given probability class. During bit generation, bit generator 701 reads from memory 702 using the index (e.g., probability class). After reading from memory 702, bit generator 701 performs bit generation as follows. First, if the run count equals zero, then start signal 711 is asserted indicating the start of a run. Then, if the uncoded bit is equal to the LPS, then end signal 712 is asserted indicating the end of the run. Also if the uncoded bit equals an LPS, codeword output 714 is set to indicate that the codeword is a 1N codeword and the run count is cleared, e.g., set to zero (since it is the end of the run). If the uncoded bit does not equal the LPS, then the run count is incremented and a test determines if the incremented run count equals the maximum run count for the code. If so, then end signal 712 is asserted, codeword output 714 is set to zero and the run count is cleared (e.g., run count is set to zero). If the test determines that the run count does not equal the maximum for the code, then the incremented run count is written back. Note that index signal 713 represents the probability class received as an input.
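The bit generation steps above can be modelled in software as follows. This Python sketch is illustrative only: the class and method names are hypothetical, the run count memory is modelled as a dictionary keyed by probability class, and the codeword is reported symbolically as a pair rather than as the bit pattern the hardware produces.

```python
class BitGenerator:
    """Software model of the run-count update for one uncoded bit."""

    def __init__(self, maxrun_by_class):
        self.maxrun = dict(maxrun_by_class)
        self.run_count = {c: 0 for c in self.maxrun}   # models memory 702

    def step(self, prob_class, is_lps):
        """Return (start, end, index, codeword) for one uncoded bit."""
        count = self.run_count[prob_class]
        start = (count == 0)            # a zero count means a run begins
        end = False
        codeword = None
        if is_lps:
            end = True
            codeword = ("1N", count)    # N MPSs followed by an LPS
            count = 0
        else:
            count += 1
            if count == self.maxrun[prob_class]:
                end = True
                codeword = ("0", None)  # MAXRUN consecutive MPSs
                count = 0
        self.run_count[prob_class] = count
        return start, end, prob_class, codeword


bg = BitGenerator({0: 4})
events = [bg.step(0, lps) for lps in (False, False, True, True)]
```

The last event in the example shows a run that starts and ends on the same bit, which happens whenever an LPS arrives while the stored run count is zero.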

In the present examples, the generation of 1N codewords is performed such that their length can be determined without any additional information. Table 12 illustrates 1N codeword representations of R3(2) codewords for the decoder and encoder. The decoder expects that the "1" bit in a "1N" codeword be the LSB and that the "N" count portion is in the proper MSB...LSB order. In decoder order, the variable length codeword cannot be distinguished from zero padding without knowing which particular code is used. In encoder order, the codeword is reversed and the position of the most significant "1" bit indicates the length of "1N" codewords. To generate codewords in encoder order, the complement of the count value must be reversed. This can be accomplished by reversing the 13-bit count and then shifting it so that it is aligned to the LSB. As described in detail below, the bit pack unit reverses the codewords back into decoder order. However, this reversal of codewords causes no increased complexity of the bit pack unit 606 since it must perform shifting anyway.

Table 12 - "1N" Codeword Representations for R3(2) Codewords

    uncoded data    codeword    reverse of count value    decoder order    encoder order
    000000          0                                     0000000000000    0000000000000
    000001          1000        00                        0000000000001    0000000001000
    00001           1010        01                        0000000000101    0000000001010
    0001            1001        10                        0000000001001    0000000001001
    001             1011        11                        0000000001101    0000000001011
    01              110         0                         0000000000011    0000000000110
    1               111         1                         0000000000111    0000000000111

For R3 codes, generating "1N" codewords also requires that the bit following the 1 indicate whether a short or long count is present.
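The complement, reverse and align steps can be tried out in software. The Python sketch below is a hedged model rather than the circuit: the function names are hypothetical, the helper `expand` computes the maximum run length, mask and short/long split from k using the Rx(k) formula, and the result is returned as a bit string in encoder order (reversing the string gives decoder order).

```python
def expand(k, r3):
    """Return (maxrun, mask, split) for R2(k) or, if r3 is true, R3(k)."""
    mask = (1 << k) - 1
    if r3:
        return 3 << (k - 1), mask, 1 << (k - 1)
    return 1 << k, mask, None


def reverse_bits(value, width):
    """Reverse the low `width` bits of value."""
    if width == 0:
        return 0
    return int(format(value, "0{}b".format(width))[::-1], 2)


def lps_codeword(k, r3, count):
    """Encoder-order 1N codeword for `count` MPSs followed by an LPS."""
    maxrun, mask, split = expand(k, r3)
    if not r3:                                   # R2: 1 + k count bits
        n = reverse_bits(count ^ mask, k)
        value, length = maxrun | n, k + 1
    elif count < split:                          # R3 short: k + 1 bits
        n = reverse_bits(count ^ mask, k) >> 1
        value, length = maxrun | n, k + 1
    else:                                        # R3 long: k + 2 bits
        n = reverse_bits((count - split) ^ mask, k)
        value, length = ((maxrun & ~mask) << 1) | n, k + 2
    return format(value, "0{}b".format(length))


r3_2 = [lps_codeword(2, True, count) for count in range(6)]
decoder_order = [cw[::-1] for cw in r3_2]
```

For R3(2) the model reproduces the codeword column of the table above for run counts 0 through 5, and for R2(2) it reproduces the preferred R2(2) code.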

By using multiple banks of memory, the present example allows pipelining. For instance, in the case of a multi-ported memory, a read operation occurs to memory for an uncoded bit while a write operation occurs to the memory for the previous uncoded bit.

Sample Design

One example of the encoder bit generator comprises an FPGA. The design handles all R2 and R3 codes up to R2(12). The AHDL (Altera Hardware Description Language) source code is listed below.

The design comprises multiple parts, as shown in Figure 13. First, "ENCBG" 1301 is the main part of the design which has the logic to handle the start, end and continuation of runs. Second, "KEXPAND" 1302 is used to expand the probability class into the maximum run length, a variable length mask, and the length of the first long codeword for R3 codes. KEXPAND 1302 is identical to the decoder function with the same name. Third, the "LPSCW" 1303 part takes a count value and information about the probability class as inputs and generates the proper "1N" codeword.

The design uses two pipeline stages. During the first pipeline stage, the count is incremented, the probability class is expanded, and a subtraction and comparison for long R3 codewords is performed. All of the other operations are performed during the second pipeline stage.

encbg.tdf

TITLE "Bit Generator for the encoder";
INCLUDE "kexpand.inc";
INCLUDE "lpscw.inc";

SUBDESIGN encbg
(
    k[3..0], r3, bit, count_in[12..0], clk : INPUT;
    start_run, end_run, index[4..0], count_out[12..0], codeword[12..0] : OUTPUT;
)
VARIABLE
    k_q[3..0], r3_q, k_qq[3..0], r3_qq, bit_q, bit_qq,
    count_in_q[12..0], start_run, end_run, start_run_q,
    index[4..0], count_out[12..0], count_plus[12..0],
    max_rl[12..0], codeword[12..0] : DFF;
    kexpand_ : kexpand;
    lpscw_ : lpscw;
BEGIN
    lpscw_.clk = clk;
    k_q[].clk = clk;
    r3_q.clk = clk;
    k_qq[].clk = clk;
    r3_qq.clk = clk;
    bit_q.clk = clk;
    bit_qq.clk = clk;
    count_in_q[].clk = clk;
    start_run.clk = clk;
    end_run.clk = clk;
    start_run_q.clk = clk;
    index[].clk = clk;
    count_out[].clk = clk;
    count_plus[].clk = clk;
    max_rl[].clk = clk;
    codeword[].clk = clk;

    k_q[] = k[];
    r3_q = r3;
    k_qq[] = k_q[];
    r3_qq = r3_q;
    bit_q = bit;
    bit_qq = bit_q;
    count_in_q[] = count_in[];
    count_plus[] = count_in_q[] + 1;
    start_run = start_run_q;
    start_run_q = (count_in_q[] == 0);
    index[0] = r3_qq;
    index[4..1] = k_qq[];

    kexpand_.k_reg[] = k_q[];
    kexpand_.r3_reg = r3_q;
    lpscw_.r3 = r3_q;
    lpscw_.k_q[] = k_q[];
    lpscw_.r3_q = r3_qq;
    lpscw_.count[] = count_in_q[];
    lpscw_.mask[] = kexpand_.mask[];
    lpscw_.r3_split[] = kexpand_.r3split[];
    lpscw_.maxrl_q[] = max_rl[];
    max_rl[] = kexpand_.maxrl[];

    IF (bit_qq) THEN
        end_run = VCC;
        count_out[] = 0;
        codeword[] = lpscw_.cw[];
    ELSIF (count_plus[] == max_rl[]) THEN
        end_run = VCC;
        count_out[] = 0;
        codeword[] = 0;
    ELSE
        end_run = GND;
        count_out[] = count_plus[];
        codeword[] = 0;
    END IF;
END;

lpscw.tdf

SUBDESIGN lpscw    % LPS %
(
    r3, k_q[3..0], r3_q, count[12..0], mask[11..0],
    r3_split[10..0], maxrl_q[12..0], clk : INPUT;
    cw[12..0] : OUTPUT;
)
VARIABLE
    temp[12..0] : NODE;
    temp_rev[12..0] : NODE;
    temp_sh[12..0] : NODE;
    split[11..0] : NODE;
    r3_long : DFF;
    count_minus[11..0] : DFF;
    mask_q[11..0] : DFF;
    count_q[12..0] : DFF;
BEGIN
    r3_long.clk = clk;
    count_minus[].clk = clk;
    mask_q[].clk = clk;
    count_q[].clk = clk;

    split[10..0] = r3_split[];
    split[11] = GND;

    r3_long = (r3) AND (count[11..0] >= split[]);
    count_minus[] = count[11..0] - split[];
    mask_q[] = mask[];
    count_q[] = count[];

    % pipeline stage %

    IF (r3_long) THEN
        temp[11..0] = (count_minus[]) XOR mask_q[];
    ELSE
        temp[11..0] = count_q[11..0] XOR mask_q[];
    END IF;
    temp[12] = GND;

    temp_rev[0] = temp[12];
    temp_rev[1] = temp[11];
    temp_rev[2] = temp[10];
    temp_rev[3] = temp[9];
    temp_rev[4] = temp[8];
    temp_rev[5] = temp[7];
    temp_rev[6] = temp[6];
    temp_rev[7] = temp[5];
    temp_rev[8] = temp[4];
    temp_rev[9] = temp[3];
    temp_rev[10] = temp[2];
    temp_rev[11] = temp[1];
    temp_rev[12] = temp[0];

    CASE k_q[] IS
        WHEN 0 => temp_sh[] = 0;
        WHEN 1 => temp_sh[0] = temp_rev[12]; temp_sh[12..1] = 0;
        WHEN 2 => temp_sh[1..0] = temp_rev[12..11]; temp_sh[12..2] = 0;
        WHEN 3 => temp_sh[2..0] = temp_rev[12..10]; temp_sh[12..3] = 0;
        WHEN 4 => temp_sh[3..0] = temp_rev[12..9]; temp_sh[12..4] = 0;
        WHEN 5 => temp_sh[4..0] = temp_rev[12..8]; temp_sh[12..5] = 0;
        WHEN 6 => temp_sh[5..0] = temp_rev[12..7]; temp_sh[12..6] = 0;
        WHEN 7 => temp_sh[6..0] = temp_rev[12..6]; temp_sh[12..7] = 0;
        WHEN 8 => temp_sh[7..0] = temp_rev[12..5]; temp_sh[12..8] = 0;
        WHEN 9 => temp_sh[8..0] = temp_rev[12..4]; temp_sh[12..9] = 0;
        WHEN 10 => temp_sh[9..0] = temp_rev[12..3]; temp_sh[12..10] = 0;
        WHEN 11 => temp_sh[10..0] = temp_rev[12..2]; temp_sh[12..11] = 0;
        WHEN 12 => temp_sh[11..0] = temp_rev[12..1]; temp_sh[12] = GND;
    END CASE;

    IF (NOT r3_q) THEN             % R2 %
        cw[] = temp_sh[] OR maxrl_q[];
    ELSIF (NOT r3_long) THEN       % R3 SHORT %
        cw[11..0] = temp_sh[12..1] OR maxrl_q[11..0];
        cw[12] = GND;
    ELSE                           % R3 LONG %
        cw[12..1] = temp_sh[12..1] OR (maxrl_q[11..0] AND NOT mask_q[11..0]);
        cw[0] = temp_sh[0];
    END IF;
END;

kexpand.tdf

TITLE "decoder, k expand logic";

SUBDESIGN kexpand
(
    k_reg[3..0], r3_reg : INPUT;
    maxrl[12..0], mask[11..0], r3split[10..0] : OUTPUT;
)
BEGIN
    TABLE
        k_reg[], r3_reg => maxrl[], mask[], r3split[];
         0, 0 =>    1,    0,    X;
         1, 0 =>    2,    1,    X;
         1, 1 =>    3,    1,    1;
         2, 0 =>    4,    3,    X;
         2, 1 =>    6,    3,    2;
         3, 0 =>    8,    7,    X;
         3, 1 =>   12,    7,    4;
         4, 0 =>   16,   15,    X;
         4, 1 =>   24,   15,    8;
         5, 0 =>   32,   31,    X;
         5, 1 =>   48,   31,   16;
         6, 0 =>   64,   63,    X;
         6, 1 =>   96,   63,   32;
         7, 0 =>  128,  127,    X;
         7, 1 =>  192,  127,   64;
         8, 0 =>  256,  255,    X;
         8, 1 =>  384,  255,  128;
         9, 0 =>  512,  511,    X;
         9, 1 =>  768,  511,  256;
        10, 0 => 1024, 1023,    X;
        10, 1 => 1536, 1023,  512;
        11, 0 => 2048, 2047,    X;
        11, 1 => 3072, 2047, 1024;
        12, 0 => 4096, 4095,    X;
    END TABLE;
END;

The Reorder Unit of the Present Invention

Figure 8 is a block diagram of one example of the reorder unit. Referring to Figure 8, reorder unit 606 comprises a run count reorder unit 801 and a bit packing unit 802. Run count reorder unit 801 moves codewords from the end of runs, as created by the encoder, to the beginning of runs, as needed by the decoder, while bit packing unit 802 combines variable length codewords into fixed length interleaved words and outputs them in the proper order required by the decoder.

A "snooper" decoder can be used to reorder for any decoder, in which a decoder is included in the encoder and provides requests for data in the order in which the codewords will be needed by the real decoder. To support a snooper decoder, reordering of run counts might have to be done independently for each stream. For decoders that can be modeled easily, multiple time stamped queues or a single merged queue may be used to allow reordering. In one example, reordering each codeword can be accomplished using a queue-like data structure and is independent of the use of multiple coded data streams. A description of how the reordering may be performed is given below.

The first reordering operation that is performed in the encoder is to reorder each of the run counts so that the run count is specified at the beginning of the run (as the decoder requires for decoding). This reordering is required because the encoder does not determine what a run count (and codeword) is until the end of a run. Thus, the resulting run count produced from coding the data is reordered so that the decoder is able to properly decode the run counts back into the data stream.

Referring back to Figure 8, reorder unit 606 comprises run count reorder unit 801 and bit pack unit 802. Run count reorder unit 801 is coupled to receive multiple inputs that include start signal 711, end signal 712, index signal 713 and codeword 714. These signals will be described in more detail in conjunction with the run count reorder unit of Figure 9. In response to the inputs, the run count reorder unit 801 generates codeword 803 and signal 804. Signal 804 indicates when to reset the run count. Codeword 803 is received by bit pack unit 802. In response to codeword 803, bit pack unit 802 generates interleaved words 805.

Run count reorder unit 801 and bit pack unit 802 are described in further detail below.

Run Count Reorder Unit

As described above, the decoder receives codewords at the time the beginning of the data coded by the codeword is needed. However, the encoder does not know the identity of the codeword until the end of the data coded by the codeword.

A block diagram of one example of the run count reorder unit 801 is described in Figure 9. The described embodiment accommodates four streams, where each interleaved word is 16 bits, and the codewords vary in length from one to thirteen bits. In such a case, the reorder unit 606 may be pipelined to handle all streams. Furthermore, an encoder that associates run counts with probability classes is used such that the maximum number of run counts that can be active at any time is small, and is assumed to be 25 for this embodiment. Note that the present example is not limited to four interleaved streams, interleaved words of 16 bits or codeword lengths of 1 to 13 bits, and may be used for more or fewer streams with interleaved words of more or less than 16 bits and codeword lengths that extend from 1 bit to over 13 bits.

Referring to Figure 9, a pointer memory 901 is coupled to receive index input 713 and produces an address output that is coupled to one input of multiplexer (MUX) 902. Two other inputs of MUX 902 are coupled to receive an address in the form of a head pointer from head counter 903 and an address in the form of a tail pointer from tail counter 904. The output of MUX 902 is an address coupled to and used to access a codeword memory 908.

Index input 713 is also coupled as an input to MUX 905. Another input of MUX 905 is coupled to the codeword input 714. The output of MUX 905 is coupled to an input of valid detection module 906 and to a data bus 907. Data bus 907 is coupled to codeword memory 908 and an input of MUX 905. Also coupled to data bus 907 is an output of control module 909. Start input 711 and end input 712 are coupled to separate inputs of control module 909. The outputs of valid detection module 906 comprise the codeword output 803 and the signal 804 (Figure 8). Run count reorder unit 801 also comprises controller logic (not shown to avoid obscuring the present invention) to coordinate the operations of the various components of run count reorder unit 801.

To reiterate, index input 713 identifies a run. In one example, the index indicates one of 25 probability classes. In such a case, five bits are needed to represent the index. Note that if multiple banks of probability classes are used, then extra bits might be required to specify the particular bank. In one example, the index input identifies the probability class for the run count. Codeword input 714 is the codeword when the end of a run occurs and is a "don't care" otherwise. Start input 711 and end input 712 are control signals that indicate whether a run is beginning, ending, or both. A run begins and ends at the same time when the run consists of a single uncoded bit.

Run count reorder unit 801 reorders the run counts generated by the bit generator in response to its input signals. Codeword memory 908 stores codewords during reordering. In one example, codeword memory 908 is larger than the number of run counts that can be active at one time. This leads to better compression. If the codeword memory is smaller than the number of run counts that can be active at one time, this would actually limit the number of active run counts to the number that could be held in memory. In a system that provides good compression, it often occurs that while data for one codeword with a long run count is being accumulated, many codewords with short run counts will start (and perhaps end also). This requires having a large memory to avoid forcing out the long run before it is completed.

Pointer memory 901 stores addresses for codeword memory locations for probability classes that are in the middle of a run and addresses codeword memory 908 in a random access fashion. Pointer memory 901 has a storage location for the address in codeword memory 908 for each probability class that may be in the middle of a run. Once a run has completed for a particular probability class, the address stored in pointer memory 901 for that probability class is used to access codeword memory 908 and the completed codeword is written into codeword memory 908 at that location. Until that time, that location in codeword memory 908 contained an invalid entry. Thus, pointer memory 901 stores the location of the invalid codeword for each run count.

Head counter 903 and tail counter 904 also provide addresses to access codeword memory 908. Using head counter 903 and tail counter 904 allows codeword memory 908 to be addressed as a queue or circular buffer (e.g., a first in, first out [FIFO] memory). Tail pointer 904 contains the address of the next available location in codeword memory 908 to permit the insertion of a codeword into codeword memory 908. Head counter 903 contains the address in codeword memory 908 of the next codeword to be output. In other words, head counter 903 contains the codeword memory address of the next codeword to be deleted from codeword memory 908. A location for each possible index (e.g., probability class) in pointer memory 901 is used to remember where tail pointer 904 was when a run was started so that the proper codeword can be placed in that location of codeword memory 908 when the run ends.

Control module 909 generates a valid signal as part of the data stored in codeword memory 908 to indicate whether or not an entry stores valid codeword data. For instance, if the valid bit is at a logical 1, then the codeword memory location contains valid data. However, if the valid bit is at a logical 0, then the codeword memory location contains invalid data. Valid detection module 906 determines if a memory location contains a valid codeword each time a codeword is read out from codeword memory 908. In one example, the valid detection module 906 detects whether the memory location has a valid codeword or a special invalid code.

When starting a new run, an invalid data entry is put in codeword memory 908. The invalid data entry acts as a space holder in the stream of data stored in codeword memory 908, such that the codeword for the run may be stored in the memory in the correct location (to ensure proper ordering to model the decoder) when the run has completed. In one example, the invalid data entry includes the index via MUX 905 and an invalid indication (e.g., an invalid bit) from control module 909. The address in codeword memory 908 at which the invalid entry is stored is given by tail pointer 904, and is subsequently stored in pointer memory 901 as a reminder of the location for the run count in codeword memory 908. The remainder of the data that appears between head pointer 903 and tail pointer 904 in codeword memory 908 is completed run counts (e.g., reordered run counts). The number of invalid memory locations ranges from 0 to I-1, where I is the number of run counts. When a codeword is complete at the end of a run, the run count is filled in codeword memory 908 using the address stored in pointer memory 901.

When a run starts, the index for the run is stored in codeword memory 908, so that if codeword memory 908 is full but the run is not yet complete, the index is used in conjunction with signal 804 to reset the corresponding run counter. In addition to storing codewords or indices in codeword memory 908, one bit, referred to herein as the "valid" bit, is used to indicate which of these two types of data is stored.

If not starting or ending a run, the run count reorder unit is idle. If starting a run and not ending a run and if the memory is full, then a codeword is output from codeword memory 908. The codeword that is output is the codeword stored at the address contained in head pointer 903 for that probability class. Then, if starting a run and not ending a run (irrespective of whether the memory is full), index input 713 is written into codeword memory 908 via MUX 905 at the address designated by tail pointer 904. Tail pointer 904 is then written into pointer memory 901 at an address designated by the data on index input 713 (e.g., at the location in pointer memory 901 for the probability class). After writing tail pointer 904, tail pointer 904 is incremented.

If ending a run and not starting a run, then the address stored in the pointer memory 901 corresponding to the index (probability class) is read out and used as the location in the codeword memory to store the completed codeword on codeword input 714.

If starting a run and ending a run (i.e., a run both begins and ends at the same time), and the memory is full, then a codeword is output from codeword memory 908. Then, if starting a run and ending a run (irrespective of whether the memory is full), codeword input 714 is written into codeword memory 908 at the address specified by tail pointer 904. Tail pointer 904 is then incremented to contain the next available location (e.g., incremented by 1).
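The four cases above (idle, start only, end only, start and end together) can be sketched in software as follows. This is a minimal illustration in Python and not the described hardware: the names Entry, RunCountReorder and step are hypothetical, an occupancy count stands in for the head/tail comparison, and clearing the pointer of a forced-out run is an assumption about how the reset of the run counter would be reflected.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Entry:
    valid: bool      # True: payload is a codeword; False: payload is a run index
    payload: object


class RunCountReorder:
    def __init__(self, depth: int, num_indices: int):
        self.mem: List[Optional[Entry]] = [None] * depth       # codeword memory
        self.ptr: List[Optional[int]] = [None] * num_indices   # pointer memory
        self.head = 0
        self.tail = 0
        self.count = 0   # occupancy (assumption: replaces the head == tail test)

    def _pop_head(self) -> Entry:
        entry = self.mem[self.head]
        self.mem[self.head] = None
        self.head = (self.head + 1) % len(self.mem)
        self.count -= 1
        return entry

    def _push_tail(self, entry: Entry) -> int:
        addr = self.tail
        self.mem[addr] = entry
        self.tail = (self.tail + 1) % len(self.mem)
        self.count += 1
        return addr

    def step(self, start: bool, end: bool, index: int, codeword=None):
        """Process one input; return the entry forced out by a full memory, if any."""
        forced = None
        if start and self.count == len(self.mem):
            forced = self._pop_head()
            if not forced.valid:
                self.ptr[forced.payload] = None   # that run is cut short
        if start and not end:
            # Placeholder holding the index; remember where it was written.
            self.ptr[index] = self._push_tail(Entry(False, index))
        elif end and not start:
            # Fill in the placeholder; an update never forces an output.
            self.mem[self.ptr[index]] = Entry(True, codeword)
            self.ptr[index] = None
        elif start and end:
            self._push_tail(Entry(True, codeword))
        return forced
```

With a two-entry memory, starting runs for indices 0 and 1 fills the memory; a later start then forces out the placeholder for index 0 even though its run has not ended.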

In the present examples, run count reorder unit 801 may output codewords at different times. In one example, codewords may be output when they are valid or invalid. Codewords may be output when invalid if a memory full condition exists and a run has not completed. Invalid codewords may also be output to maintain a minimum rate (i.e., for rate control). Also, invalid codewords may be output to flush codeword memory 908 when all of the data has undergone run count reordering or when the run count reorder unit jumps to the middle of codeword memory 908 as a result of a reset operation. Note that in such a case, the decoder must be aware that the encoder is operating in this way.

As described above, a codeword is output whenever the codeword memory 908 is full. Once the memory is full, whenever an input (i.e., starting a new codeword) to the codeword memory 908 is made, an output from the codeword memory 908 is made. Note that an update to an entry does not cause an output from the codeword memory 908 when a memory full condition exists. That is, the completion of a run followed by the writing of the resulting codeword into its previously assigned memory location does not cause a memory full output to occur. Similarly, when a run ends and the corresponding address in pointer memory 901 and the address in the head counter 903 are the same, the codeword can be output immediately and the head counter 903 can then be incremented without accessing the codeword memory 908. In one example, a memory full condition occurs when the tail pointer 904 is equal to the head pointer 903 after the tail pointer has been incremented. Therefore, once the tail pointer 904 has been incremented, the controller logic in the run count reorder unit 801 compares the tail pointer 904 and the head pointer 903 and, if they are the same, the controller logic determines that the codeword memory 908 is full and that a codeword should be output. In another example, codewords may be output prior to the memory being full. For instance, if the portion of the queue addressed by the head contains valid codewords, they may be output. This requires that the beginning of the queue be repeatedly examined to determine the status of the codewords therein. Note that the codeword memory 908 is emptied at the end of coding of a file.

Using run count reorder unit 801, a codeword is output by first reading a value (e.g., data) from codeword memory 908 at an address specified by head pointer 903. The outputting of codewords is controlled and coordinated using controller logic. Valid detection module 906 performs a test to determine if the value is a codeword. In other words, valid detection module 906 determines if the codeword is valid. In one example, valid detection module 906 determines the validity of any entry by checking the validity bit stored with each entry. If the value is a codeword (i.e., the codeword is valid), then the value is output as a codeword. On the other hand, if the value is not a codeword (i.e., the codeword is invalid), then any codeword may be output which has a run of MPSs at least as long as the current run count. The "0" codeword is one codeword that correctly represents the current run thus far, and may be output. After the output has been made, head pointer 903 is incremented to the next location in codeword memory 908. Alternatively, using the "1N" codeword with the shortest allowable run length allows the decoder to check only whether a codeword has been forced out before emitting an LPS.
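The valid test on the value read at the head pointer can be illustrated with a short sketch. The word layout is an assumption made for illustration only: a 14-bit word whose top bit is the valid bit and whose low 13 bits hold either the codeword or the run index. The names CODEWORD_BITS, VALID_BIT and read_head are hypothetical. A stored value of zero stands for the one bit "0" codeword.

```python
CODEWORD_BITS = 13                 # widest codeword in the described embodiment
VALID_BIT = 1 << CODEWORD_BITS     # assumed position of the valid bit


def read_head(word: int):
    """Return (value to send on, index of the run counter to reset or None)."""
    payload = word & (VALID_BIT - 1)
    if word & VALID_BIT:
        return payload, None       # completed codeword, nothing to reset
    # Invalid entry: send the "0" codeword (stored as 0) and report the run index.
    return 0, payload
```

Here read_head(VALID_BIT | 0b101) passes the stored codeword through, while read_head(7) forces out a "0" codeword for the run with index 7.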

In one example, run count reorder unit 801 operates with a two clock cycle time. In the first clock cycle, inputs are received into run count reorder unit 801. In the second clock cycle, an output from codeword memory 908 occurs.

While codewords may be output whenever head pointer 903 addresses a valid codeword, it may be desirable in some implementations to only output a codeword when the buffer is full. This causes the system to have a fixed delay in terms of a number of codewords, instead of a variable delay. If memory 908 is able to hold a predetermined number of codewords, then the delay between the time a run is started and input and the time its codeword is output is that number of codewords, since an output is not made until the memory is full. Thus, there is a constant delay in codewords. Note that the reordering delay is still variable in other measures, for example, the amount of coded or original data. By allowing memory 908 to fill up prior to producing an output, the output generates a codeword per cycle.

Note that if a codeword memory location is marked as invalid, the unused bits may be used to store an identification of what run count it is for (i.e., the context bin or probability class that must fill the location is stored therein). This information is useful for handling the case where the memory is full. Specifically, the information may be used to indicate to the bit generator that a codeword for this particular run length was not finished and that it must be finished now. In such a case, a decision has been made to output an invalid codeword, which may have occurred due to a memory full condition.

Thus, when the system resets the run counter, the information indicates when, in terms of bit generators and run counts, the system is to begin again.

With respect to the index input, for pipelining reasons when banks of probability classes are used, the index may include a bank identifier. That is, there may be multiple run counts for a particular probability class. For instance, two run counts may be used for the 80 percent code, where one is used and then the other.

Since the codewords are variable length, they must be stored in codeword memory 908 in a manner that allows their length to be determined.

While it would be possible to store the size explicitly, this would not minimize memory usage. For R-codes, storing a value of zero in memory can indicate a one bit "0" codeword and the "1N" codewords can be stored such that a priority encoder can be used to determine the length from the first "1" bit.
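As an illustration of this storage convention, the length of a stored codeword can be recovered without a separate length field. The sketch below is in Python and assumes, for illustration, that a "1N" codeword is stored right-justified so that its leading "1" is the most significant set bit; the function names are hypothetical.

```python
def stored_codeword_length(stored: int) -> int:
    """Length in bits of the codeword held in a memory word."""
    if stored == 0:
        return 1                  # zero stands for the one bit "0" codeword
    # Position of the first "1" bit, as a priority encoder would report it.
    return stored.bit_length()


def stored_codeword_bits(stored: int) -> str:
    """The codeword as a bit string, most significant bit first."""
    return format(stored, "b")
```

A stored value of 0b100 is thus read back as the three bit codeword 100, and a stored 0 as the single bit 0.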

If codeword memory 908 is multi-ported (e.g., dual ported), this design can be pipelined to handle one codeword per clock cycle. Because any location in codeword memory 908 could be accessed from multiple ports, a location in codeword memory 908 may be written, such as when an invalid entry or a codeword is being stored, while another location may be read, such as when a codeword is being output. Note that in such a case, the multiplexers may have to be modified to support the multiple data and address buses.

Whenever the encoder outputs a "0" codeword and resets a run counter because the codeword memory is full, the decoder must do the same.

This requires the decoder to model the encoder's codeword memory queue.

How this is accomplished will be discussed below.

Note that to save power in CMOS implementations, counters can be disabled for "1N" codewords when "0" codewords are output for invalid runs.

This is because a "1N" codeword being decoded is valid, while only a "0" codeword may be invalid.


Alternative Example Based on Context

Figure 10 is a block diagram for another example of a run count reorder unit that reorders data received according to context (as opposed to probability class). The run count reorder unit 1000 performs reordering using the R-codes. Referring to Figure 10, the reorder unit 1000 includes a pointer memory 1001, a head counter 1002, a tail counter 1003, a data multiplexer (MUX) 1004, an address MUX 1005, a compute length block 1006, a valid detect block 1007, and a codeword memory 1008. Codeword memory 1008 stores codewords during reordering. Pointer memory 1001 stores addresses for codeword memory locations for context bins that are in the middle of a run. Head counter 1002 and tail counter 1003 allow codeword memory 1008 to be addressed as a queue or circular buffer in addition to being addressed in random access fashion by the pointer memory 1001. For R-codes, storing a value of zero in memory can indicate a one bit "0" codeword and the "1N" codewords can be stored such that a priority encoder can be used to determine the length from the first "1" bit. Compute length module 1006 operates like a priority encoder. (If other variable length codes were used, it would be more memory efficient to add a "1" bit to mark the start of the codeword than to add log2 bits to explicitly store the length.) Run count reorder unit 1000 also includes backstage controller logic to coordinate and control the operation of the components 1001-1008.

The operation of the run count reorder unit 1000 is very similar to the run count reorder unit that is based on probability estimates. If starting a new run, then an invalid entry including the context bin is written into codeword memory 1008 at the address indicated by tail pointer 1003. The tail pointer 1003 address is then stored in pointer memory 1001 at the address of the context bin of the current run count. Tail pointer 1003 is then incremented. When completing a run, the pointer in pointer memory 1001 corresponding to the run count is read from pointer memory 1001 and the codeword is written in parallel into codeword memory 1008 at a location designated by that pointer. If neither starting nor ending a run, and if a location in codeword memory 1008 designated by the address of head pointer 1002 does not contain invalid data, then the codeword addressed by the head is read and output. Head pointer 1002 is then incremented. For the case when a run both begins and ends at the same time, the codeword is written into codeword memory 1008 at the address designated by tail pointer 1003 and then tail pointer 1003 is incremented.

Similarly, when a run ends and the corresponding address in pointer memory 1001 and the address in head counter 1002 are the same, the codeword can be output immediately and the value in head counter 1002 can be incremented without accessing codeword memory 1008.

For run count "by context" systems, every context requires a memory location in pointer memory 1001, so the width of the BG and PEM state memory can be extended to implement this memory. The width of pointer memory 1001 is equal to the size needed for a codeword memory address.

The number of locations in codeword memory 1008 can be chosen by the designer in a particular implementation. The limited size of this memory reduces compression efficiency, so there is a cost/compression trade-off. The width of the codeword memory is equal to the size of the largest codeword plus one bit for a valid/invalid indication.
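The memory widths stated above follow directly from the chosen sizes. The following small Python sketch only restates that arithmetic; the function names are hypothetical, and rounding the address width up to a whole number of bits is an assumption.

```python
def pointer_width(codeword_locations: int) -> int:
    """Bits needed for one codeword memory address (one pointer memory entry)."""
    return max(1, (codeword_locations - 1).bit_length())


def codeword_memory_width(max_codeword_bits: int) -> int:
    """Largest codeword plus one bit for the valid/invalid indication."""
    return max_codeword_bits + 1
```

For the four-location codeword memory used in the example that follows, each pointer memory entry would need two bits; with codewords of up to thirteen bits, each codeword memory word would be fourteen bits wide.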

An example using the R2(2) code, shown in Table 13 below, will be used to illustrate reordering. Table 14 shows the data to be reordered (0=MPS, more probable symbol; 1=LPS, less probable symbol), labeled by context. There are only two contexts. The uncoded bit number indicates time in uncoded bit clock cycles. Start and end of runs are indicated, and codewords are shown at the end of runs.

Table 13 - R2(2) Code

Original    Codeword
0000        0
0001        100
001         110
01          101
1           111

Table 14 - Example Data to be Encoded

Uncoded                        Start/End
bit number   Data   Context    of Run      Codeword
 1           0      0          S
 2           0      1          S
 3           0      0
 4           1      1          E           101
 5           0      0
 6           0      1          S
 7           0      0          E           0
 8           1      1          E           101
 9           0      0          S
10           0      1          S
11           0      0
12           0      1
13           0      0
14           0      1
15           1      0          E           100
16           0      1          E           0

The reordering operation for the example data is shown in Table 15. A codeword memory with four locations, 0-3, is used, which is large enough to not overflow in this example. Each row shows the state of the system after an operation which is either the start or end of run for a certain context or the output of a codeword. An "x" is used to indicate memory locations that are "don't care". For some uncoded bits, a run neither starts nor ends, so the run count reorder unit is idle. For uncoded bits that end runs, one or more codewords can be output, which may cause several changes to the system state.

Table 15 - Example of Reordering Operations

Uncoded                   Pointers     Pointer memory   Codeword memory
bit number   Input        head  tail   0     1          0        1        2        3        Output
 1           start 0      0     1      0     x          invalid  x        x        x
 2           start 1      0     2      0     1          invalid  invalid  x        x
 3           (reordering unit idle)
 4           end 1, 101   0     2      0     x          invalid  101      x        x
 5           (reordering unit idle)
 6           start 1      0     3      0     2          invalid  101      invalid  x
 7           end 0, 0     0     3      x     2          0        101      invalid  x
                          1     3      x     2          x        101      invalid  x        0
                          2     3      x     2          x        x        invalid  x        101
 8           end 1, 101   2     3      x     x          x        x        101      x
                          3     3      x     x          x        x        x        x        101
 9           start 0      3     0      3     x          x        x        x        invalid
10           start 1      3     1      3     0          invalid  x        x        invalid
11-14        (reordering unit idle)
15           end 0, 100   3     1      x     0          invalid  x        x        100
                          0     1      x     0          invalid  x        x        x        100
16           end 1, 0     0     1      x     x          0        x        x        x
                          1     1      x     x          x        x        x        x        0

Referring to Table 15, the head and tail pointers are initialized to zero, indicating that nothing is contained in the codeword memory (e.g., queue).

The pointer memory is shown having two storage locations, one for each context. Each location has "don't care" values prior to bit number one. The codeword memory is shown with a depth of four codewords, all initially "don't care" values.

In response to the data received for bit number 1, the head pointer remains pointing to codeword memory location 0. Since the decoder will expect data, the next available codeword memory location, 0, is assigned to the codeword and an invalid value is written into the memory location 0. Because the context is zero, the address of the codeword memory location assigned to the codeword is stored in the pointer memory location for the zero context (pointer memory location 0). Thus, a "0" is stored in pointer memory location 0. The tail pointer is incremented to the next codeword memory location 1.

In response to the data corresponding to bit number 2, the head counter remains pointing to the first memory location (since there has not been an output causing it to increment). Since the data corresponds to the second context, context 1, the next codeword memory location is assigned to the codeword as codeword memory location 1 as indicated by the tail pointer and an invalid value is written into the location. The address, codeword location 1, is written into the pointer memory location corresponding to context 1. That is, the address of the second codeword memory location is written into the pointer memory location 1. The tail pointer is then incremented.

In response to the data corresponding to bit number 3, the reorder unit is idle since a run is not starting or ending.

In response to the data corresponding to bit number 4, an end of a run is indicated for context 1. Therefore, the codeword 101 is written into the codeword memory location assigned to context 1 (codeword memory location 1) as indicated by the pointer memory location for context 1. The head and tail pointers remain the same, and the value in the pointer memory location for context 1 will not be used again, so it is "don't care".

In response to the data corresponding to bit number 5, the reorder unit is idle since a run is not starting or ending.

In response to the data corresponding to bit number 6, the same type of operations as described above for bit 2 occur.

In response to the data corresponding to bit number 7, the end of the run for the codeword for context 0 occurs. In this case, the codeword "0" is written into the codeword memory location (codeword memory location 0) as indicated by the pointer memory location for context 0 (pointer memory location 0). The value in the pointer memory location will not be used again, such that it is a "don't care". Also, the codeword memory location designated by the head pointer contains valid data. Therefore, the valid data is output and the head pointer is incremented. Incrementing the head pointer causes it to point at another codeword memory location containing a valid codeword. Therefore, this codeword is output and the head pointer is incremented again. Note that in this example, codewords are output when they are able, as opposed to when the codeword memory is completely full.

Processing through the uncoded bits continues to occur according to the description above. Note that the codeword memory locations are not dedicated for use with particular contexts, such that codewords from any of the contexts may be stored in a particular codeword memory location throughout the coding of a data file.
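The walk-through above can be reproduced with a short simulation. The following Python sketch is illustrative only: it uses the R2(2) code of Table 13 and the data of Table 14, outputs codewords as soon as the head addresses a valid entry (as in this example), and does not model a memory full condition. The names R2_2, EXAMPLE and reorder are hypothetical.

```python
from collections import defaultdict

# R2(2) code of Table 13: uncoded run -> codeword
R2_2 = {"0000": "0", "0001": "100", "001": "110", "01": "101", "1": "111"}

# (data bit, context) for uncoded bits 1..16, copied from Table 14
EXAMPLE = [(0, 0), (0, 1), (0, 0), (1, 1), (0, 0), (0, 1), (0, 0), (1, 1),
           (0, 0), (0, 1), (0, 0), (0, 1), (0, 0), (0, 1), (1, 0), (0, 1)]


def reorder(bits, depth=4):
    """Return the codewords in the order the decoder will need them."""
    mem = [None] * depth       # None = don't care, ("inv", ctx) or ("cw", bits)
    pointer = {}               # context -> address of its open run's placeholder
    run = defaultdict(str)     # context -> uncoded bits seen in the open run
    head = tail = 0
    out = []
    for bit, ctx in bits:
        if ctx not in pointer:                 # start of run: write placeholder
            mem[tail] = ("inv", ctx)
            pointer[ctx] = tail
            tail = (tail + 1) % depth
        run[ctx] += str(bit)
        if run[ctx] in R2_2:                   # end of run: fill placeholder
            mem[pointer.pop(ctx)] = ("cw", R2_2[run.pop(ctx)])
        while mem[head] is not None and mem[head][0] == "cw":
            out.append(mem[head][1])           # output valid data at the head
            mem[head] = None
            head = (head + 1) % depth
    return out
```

Calling reorder(EXAMPLE) yields the codewords in the order 0, 101, 101, 100, 0, which is the order of the runs' starting points (bits 1, 2, 6, 9 and 10) rather than of their ending points.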

The Bit Pack Unit

Bit packing is illustrated in Figure 4, where data processed by the reorder unit before and after bit packing is shown. Referring back to Figure 4, sixteen variable length codewords are shown, numbered 1 through 16 to indicate the order of use by the decoder. Every codeword is assigned to one of three coded streams. The data in each coded stream is broken into fixed length words called interleaved words. (Note that a single variable length codeword may be broken into two interleaved words.) In this example, the interleaved words are ordered in a single interleaved stream such that the order of the first variable length codeword (or partial codeword) in a particular interleaved word determines the order of the interleaved word. Other types of ordering criteria may be used. The advantage of interleaving the multiple coded streams is that a single coded data channel can be used to transfer data and that variable length shifting can be performed for each stream, in parallel or in a pipeline.

The bit pack unit 802 receives the variable length codewords from the run count reorder unit 801 and packs them into interleaved words. The bit pack unit 802 comprises logic to perform the handling of variable length codewords and a merged queue type reordering unit to output fixed length interleaved words in the correct order. In one example, the codewords are received from the run count reorder unit at a rate of up to one codeword per clock cycle. A block diagram of one example of the bit pack unit 802 is shown in Figure 11. In the following example, four interleaved streams are used, each interleaved word is 16 bits, and codewords vary in length from one to thirteen bits. In one example, a single bit pack unit is pipelined to handle all streams. If the bit pack unit 802 uses a dual-ported memory (or register file), it can output one interleaved word per clock cycle. This may be faster than required to keep up with the rest of the encoder.

Referring again to Figure 11, the bit pack unit 802 includes packing logic 1101, a stream counter 1102, memory 1103, tail pointers 1104 and a head counter 1105. Packing logic 1101 is coupled to receive the codewords and is coupled to stream counter 1102. Stream counter 1102 is also coupled to the memory 1103. Also coupled to memory 1103 are tail pointers 1104 and head counter 1105.

Stream counter 1102 keeps track of the interleaved stream with which the input codeword is associated. In one example, stream counter 1102 repeatedly counts the streams from 0 to N-1, where N is the number of streams. Once stream counter 1102 reaches N-1, it begins counting from 0 again. In one example, stream counter 1102 is a two-bit counter and counts from 0 to 3 (for four interleaved streams). In an example, stream counter 1102 is initialized to zero (e.g., through global reset).

Packing logic 1101 merges the current input codeword with previously input codewords to form interleaved codewords. The length of each of the codewords may vary. Therefore, packing logic 1101 packs these variable length codewords into fixed length words. The interleaved codewords created by packing logic 1101 are output to memory 1103 in order and are stored in memory 1103 until the proper time to output them. In one example, memory 1103 is a static random access memory (SRAM) or a register file with sixty-four 16-bit words.

The interleaved words are stored in memory 1103. In the present example, the size of memory 1103 is large enough to handle two cases. One case is the normal operation case where one interleaved stream has minimum length codewords and the other interleaved streams have maximum length codewords. This first case requires 3x13=39 memory locations. The other case is the initialization case where again one stream has minimum length, or short, codewords and the others have maximum length, or long, codewords. For the second case, while 2x3x13=78 memory locations are sufficient, the operation of the PEM allows a tighter bound of 56.
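The two products above can be checked with a few lines of Python. This is only an arithmetic sketch, assuming the four-stream, 13-bit example (one stream of short codewords and three streams of long codewords); the variable names are illustrative. The tighter bound of 56 depends on the operation of the PEM and is taken from the text, not derived here.

```python
MAX_CODEWORD_BITS = 13  # longest codeword in this example
NUM_STREAMS = 4         # four interleaved streams in this example
MEMORY_WORDS = 64       # sixty-four 16-bit words in memory 1103

long_streams = NUM_STREAMS - 1  # streams carrying maximum length codewords

normal_case = long_streams * MAX_CODEWORD_BITS    # 3 x 13
init_case = 2 * long_streams * MAX_CODEWORD_BITS  # 2 x 3 x 13
pem_bound = 56                                    # tighter bound stated in the text

# The 64-word memory covers the normal case and the tighter initialization bound.
fits = normal_case <= MEMORY_WORDS and pem_bound <= MEMORY_WORDS
```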

Memory 1103 in cooperation with stream counter 1102 and the tail pointers 1104 performs reordering. Stream counter 1102 indicates the current stream of a codeword being received by memory 1103. Each interleaved stream is associated with at least one tail pointer. Tail pointers 1104 and head counter 1105 perform a reordering of the codewords. The reason for having two tail pointers per stream follows from interleaved word N being requested by the decoder when data in interleaved word N-1 contains the start of the next codeword. One tail pointer determines the location in the memory 1103 to store the next interleaved word from a given interleaved stream. The other tail pointer determines the location in memory to store the interleaved word after the next one. This allows the location of interleaved word N to be specified when the decoder request time of interleaved word N-1 is known. In one example, the pointers are eight 6-bit registers (two tail pointers per stream).

In one example, at the start of encoding, the tail pointers 1104 are set such that the first eight interleaved words (two from each stream) are stored in the memory 1103 in sequence, one from each stream. After initialization, whenever the packing logic 1101 begins a new interleaved word for a particular code stream, the "next" tail pointer is set to the value of the "after next" tail pointer, and the "after next" tail pointer for the code stream is set to the next available memory location. Thus, there are two tail pointers for each stream. In another example, only one tail pointer is used for each stream and indicates where the next interleaved word is to be stored in the memory 1103.
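The pointer update can be modeled in a few lines of Python. This is a hedged sketch and not the register-level design: the class and method names are hypothetical, and wrapping the "next available" location modulo the memory size is an assumption suggested by the 6-bit pointers and the 64-word memory.

```python
class TailPointers:
    """Model of two tail pointers per stream ("next" and "after next")."""

    def __init__(self, num_streams=4, mem_words=64):
        self.mem_words = mem_words
        # First 2 * num_streams words are laid out in sequence, one from each stream.
        self.next = list(range(num_streams))
        self.after_next = list(range(num_streams, 2 * num_streams))
        self.next_free = (2 * num_streams) % mem_words  # next available location

    def begin_new_word(self, stream):
        """Update the pointers when packing begins a new word for `stream`.

        Returns the memory location where that new word will be stored.
        """
        self.next[stream] = self.after_next[stream]
        self.after_next[stream] = self.next_free
        self.next_free = (self.next_free + 1) % self.mem_words  # assumed wrap-around
        return self.next[stream]


tp = TailPointers()
first = tp.begin_new_word(2)   # stream 2 starts its second word
second = tp.begin_new_word(0)  # stream 0 starts its second word
third = tp.begin_new_word(2)   # stream 2 starts its third word
```

With four streams, the first eight locations are pre-assigned, so locations are allocated on demand only from the third word of each stream onward.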

The head counter 1105 is used to determine the memory location of the next interleaved word to output from the bit pack unit 802. In the described example, the head counter 1105 comprises a 6-bit counter that is incremented to output an entire interleaved word at a time.

The memory 1103, in addition to being used for reordering, can also be used as a FIFO buffer between the encoder and the channel. It may be desirable to have this memory bigger than what is required for reordering, so a FIFO-almost-full signal can be used to stall the encoder when the channel cannot keep up with the encoder. A one-bit-per-cycle encoder cannot generate one interleaved word per cycle. When an encoder is well matched to a channel, the channel will not accept an interleaved word every cycle, and some FIFO buffering is necessary. For example, a channel that can accept a 16-bit interleaved word every 32 clock cycles would be a well matched design for 2:1 effective bandwidth expansion when compression was 2:1 or greater.
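The rate matching in this example can be worked through numerically. The Python sketch below assumes that a one-bit-per-cycle encoder consumes one uncompressed bit per clock cycle, so that its average coded output rate is the reciprocal of the compression ratio; the function name and parameters are illustrative only.

```python
def channel_keeps_up(word_bits, cycles_per_word, compression_ratio,
                     uncompressed_bits_per_cycle=1.0):
    """Return True if the channel's average rate covers the coded data rate."""
    coded_bits_per_cycle = uncompressed_bits_per_cycle / compression_ratio
    channel_bits_per_cycle = word_bits / cycles_per_word
    return coded_bits_per_cycle <= channel_bits_per_cycle


# A 16-bit word every 32 cycles is 0.5 coded bits per cycle.
ok_at_2_to_1 = channel_keeps_up(16, 32, 2.0)    # exactly matched
ok_at_4_to_1 = channel_keeps_up(16, 32, 4.0)    # channel has headroom
too_slow = not channel_keeps_up(16, 32, 1.5)    # encoder would have to stall
```

Only the average rate is modeled here. Short bursts of poorly compressed data are what the FIFO buffering and the almost-full stall signal are meant to absorb.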

The Packing Logic of the Present Examples

A block diagram of the packing logic is shown in Figure 12. Referring to Figure 12, the packing logic 1101 comprises a size unit 1201, a set of accumulators 1202, a shifter 1203, a MUX 1204, a set of registers 1205, and OR gate logic 1206. Size unit 1201 is coupled to receive codewords and is coupled to accumulators 1202. The accumulators as well as the codewords are coupled to shifter 1203. Shifter 1203 is coupled to MUX 1204 and OR gate logic 1206. MUX 1204 is also coupled to registers 1205 and an output of OR gate logic 1206. The registers are also coupled to OR gate logic 1206.

In one example, codewords are input on a 13-bit bus with unused bits zeroed. These zeroed unused bits are adjacent to the "1" in "1N" codewords so a priority encoder in size unit 1201 can be used to determine the length of the "1N" codewords and generate a size for "0" codewords.

Accumulators 1202 comprise multiple accumulators, one for each interleaved stream. The accumulator for each interleaved stream maintains a record of the number of bits already in the current interleaved word. In one embodiment, each accumulator comprises a 4-bit adder (with carry out) and a 4-bit register used for each stream. In one example, the output of the adder is the output of the accumulator. In another embodiment, the output of the register is the output of the accumulator. Using the size of the codewords as received from size unit 1201, the accumulators determine the number of bits to shift to concatenate the current codeword into the register containing the current interleaved word for that stream.

Based on the current value of the accumulator, the shifter 1203 aligns the current codeword so it properly follows any previous codewords in that interleaved word. Thus, data in the encoder is shifted into decoder order.

The output of shifter 1203 is 28 bits, which handles the case where a 13-bit codeword must be appended to 15 bits in the current interleaved word, such that bits from the current codeword end up in the higher 12 bits of the 28 bits being output. Note that shifter 1203 operates without feedback and, thus, can be pipelined. In one example, shifter 1203 comprises a barrel shifter.

Registers 1205 store bits in the current interleaved words. In one example, a 16-bit register for each interleaved stream holds previous bits in the current interleaved word.

Initially, a codeword of a stream is received by shifter 1203, while size unit 1201 indicates the size of the codeword to the accumulator corresponding to the stream. The accumulator has an initial value of zero set through a global reset. Since the accumulator value is zero, the codeword is not shifted and is then ORed using OR logic 1206 with the contents of the register corresponding to the stream. However, in some examples, "1N" codewords must be shifted to be properly aligned even at the start of an interleaved word. This register has been initialized to zero and, therefore, the result of the ORing operation is to put the codeword into the right-most bit positions of the output of OR logic 1206, which is fed back through MUX 1204 to the register for storage until the next codeword from the stream. Thus, initially shifter 1203 operates as a pass through. Note that the number of bits in the first codeword is now stored in the accumulator. Upon receiving the next codeword for that stream, the value in the accumulator is sent to the shifter 1203 and the codeword is shifted to the left that number of bits for combining with any previously input bits in the interleaved word. Zeros are placed in the other bit positions in the shifted word. Bits from the register corresponding to the stream are combined with bits from shifter 1203 using OR logic 1206. If the accumulator does not produce a carry out indication (e.g., signal), then more bits are required to complete the current interleaved word and the data resulting from the ORing operation is saved back into the register through MUX 1204. In one example, MUX 1204 comprises a 2:1 multiplexer. When the accumulator generates a carry out, the 16 bits of ORed data from OR logic 1206 are a complete interleaved word and are then output. MUX 1204 causes the register to be loaded with any additional bits (e.g., the upper 12 bits of the 28 bits output from the shifter 1203) after the first 16 and fills the rest with zeros.
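The shift, OR and carry out sequence can be followed in a small behavioral model. The Python sketch below is a simplified illustration, not the hardware of Figure 12: the class and method names are hypothetical, each codeword is assumed to arrive right-aligned with its size already known, and the special alignment of "1N" codewords at the start of a word is not modeled.

```python
WORD_BITS = 16  # interleaved word size in this example


class PackingLogic:
    """Behavioral model of per-stream accumulators, shifter, OR logic and registers."""

    def __init__(self, num_streams=4):
        self.acc = [0] * num_streams  # 4-bit accumulators: bits already in the word
        self.reg = [0] * num_streams  # 16-bit registers: partial interleaved words

    def push(self, stream, codeword, size):
        """Append `size` bits of `codeword`; return a completed word or None."""
        shifted = codeword << self.acc[stream]   # shifter: up to 28 bits wide
        combined = self.reg[stream] | shifted    # OR with previously stored bits
        total = self.acc[stream] + size
        carry_out = total >= WORD_BITS           # carry out of the 4-bit adder
        self.acc[stream] = total % WORD_BITS     # the register keeps the low 4 bits
        if carry_out:
            # Leftover upper bits start the next word; the rest is zero filled.
            self.reg[stream] = combined >> WORD_BITS
            return combined & 0xFFFF
        self.reg[stream] = combined
        return None


p = PackingLogic()
r1 = p.push(0, 0b101, 3)         # word not yet complete
r2 = p.push(0, 0x1FFF, 13)       # 3 + 13 = 16 bits: exactly one full word
r3 = p.push(1, 0x1FFF, 13)       # stream 1 holds 13 bits
r4 = p.push(1, 0b1010101, 7)     # 13 + 7 = 20 bits: one word plus 4 left over
```

In the last call, the low three bits of the 7-bit codeword complete the word and the remaining four bits are kept in the register for the next interleaved word of stream 1.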

The control for both MUX 1204 and the outputting of the interleaved word comprises the carry out signal from the accumulator. In one example, the multiplexer 1204 comprises sixteen 2:1 multiplexers with 4 of these having one input that is always zero.

Reordering Options

Multiple options are possible for performing reordering on the data. For instance, in a system with multiple code streams, the code streams must be reordered into interleaved words as shown in Figure 4. There are numerous ways to accomplish reordering into interleaved words.

One method for reordering data into interleaved words is to use a snooper decoder as shown in Figure 25. Referring to Figure 25, multiple run count reorder units 2501A-n are coupled to receive codeword information along with the codeword stream. Each generates a codeword output and a size output. A separate bit packing logic (1101) unit, such as bit packing units 2502A-n, is coupled to receive the codeword and size outputs from one of the run count reorder units 2501A-n. Bit packing logic units 2502A-n output interleaved words that are coupled to both MUX 2503 and snooper decoder 2504. Decoder 2504 provides a select control signal that is received by MUX 2503 and indicates to MUX 2503 which interleaved word to output into the code stream.

Each coded data stream has a run count reorder unit, comprising run count reorder unit 801 in Figure 8. Each bit pack unit combines variable length codewords into fixed size interleaved words, perhaps 8, 16 or 32 bits per word. Each bit pack unit contains registers and shifting circuitry, as described above. Decoder 2504 comprises a fully operational decoder (including BG, PEM and CM) that has access to interleaved words from all bit pack units (either on separate buses as shown in Figure 25 or via a common bus).

Whenever decoder 2504 selects an interleaved word from one of the bit pack units, that word is transmitted in the code stream. Since the decoder at the receiving end will request the data in the same order as the identical snooper decoder, the interleaved words are transmitted in the proper order.

An encoder with a snooper decoder may be attractive in a half duplex system, since the snooper decoder can also be used as a normal decoder. An advantage of the snooper decoder approach is its applicability for any deterministic decoder. Alternative solutions, discussed below, without dependence on a snooper decoder, use simpler models of the decoder in order to reduce hardware cost. For decoders that decode multiple codewords in the same clock cycle, modeling the decoder with less hardware than a decoder itself may not be possible, necessitating the use of a snooper decoder. As will be described below, for decoders that decode at most one codeword per cycle, simpler models exist.

Another technique for reordering data for pipelined decoder systems that decode at most one code word per clock cycle is based on the fact that the only information needed to model the decoder's requests for coded data is to know the order of the codewords (considering all codewords, not the codewords for each coded data stream independently). If a time stamp is associated with each codeword when it enters the run count reorder unit, whichever bit packed interleaved word has the oldest time stamp associated with it is the next interleaved codeword to be output.

An exemplary encoder reordering unit is shown in block diagram form in Figure 26. Referring to Figure 26, the encoding system is the same as described in Figure 25, except time stamp information is received by each run count reorder unit 2501A-n as well. This time stamp information is also forwarded to bit pack units 2502A-n. Bit pack units 2502A-n provide interleaved words to MUX 2503 and their associated time stamps to logic 2601. Logic 2601 provides a control signal to MUX 2503 to select the interleaved word to be output to the code stream. In this example, the snooper decoder is replaced by a simple comparison which determines which of bit pack units 2502A-n has a codeword (or part of a codeword) with the oldest time stamp. Such a system appears to MUX 2503 as multiple queues with time stamps. Logic 2601 simply selects between various queues. The logic of each of run count reorder units 2501A-n only changes slightly (from run count reorder unit 801) to write a time stamp when a run is started. Each of run count reorder units 2501A-n is equipped to store the time stamp in the codeword memory.

Storing time stamps with enough bits to enumerate every codeword in the coded data stream is sufficient, but in some examples, fewer bits may be used.
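The comparison performed by the selection logic can be sketched in Python as a choice among queue heads. This is an illustrative model only: the function names and the (time stamp, word) tuple layout are assumptions, time stamps are taken to be wide enough never to wrap, and every queued entry is treated as a complete interleaved word.

```python
from collections import deque


def select_oldest(queues):
    """Return the index of the non-empty queue whose head has the oldest time stamp."""
    best = None
    for i, q in enumerate(queues):
        if q and (best is None or q[0][0] < queues[best][0][0]):
            best = i
    return best


def drain_in_decoder_order(queues):
    """Repeatedly output the head word with the oldest time stamp."""
    out = []
    i = select_oldest(queues)
    while i is not None:
        out.append(queues[i].popleft()[1])
        i = select_oldest(queues)
    return out


# Hypothetical contents of three per-stream queues: (time stamp, interleaved word).
queues = [
    deque([(0, "A0"), (5, "A1")]),
    deque([(1, "B0"), (2, "B1")]),
    deque([(3, "C0")]),
]
order = drain_in_decoder_order(queues)
```

If fewer time stamp bits are stored, the comparison would need to account for wrap-around, which this sketch does not attempt.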

A short description of the steps used with multiple queues with time stamps appears below. The description is discernible to one skilled in the art. These are the encoder operations. No simplification has been done for the cases where a run is both started and ended by the same codeword. The operations can be checked for each symbol encoded (although in practice not all checks need to be made). Interleaved words are assumed to be 32 bits in size.

    if (no current codeword for context) {
        place time in Queue (used to determine next Queue)
        place context pointer in Queue
        place invalid data in Queue
        point context to Queue entry
        increment Queue tail
    }
    if (already a codeword and MPS) {
        increment context runcount
    }
    if (MAXRUN or LPS) {
        place correct data in Queue (context pointer unneeded)
        zero pointer & runcount in context memory
        update probability estimate in context memory
    }
    if (valid data at head of next queue) {
        place 32 bits of data on output
        clear Queue entry
        increment Queue head
    }
    while (any queue is almost full) {
        find the next Queue which must place data on the output
        while (less than 32 bits of valid data) {
            use context pointer to find context
            zero pointer & run count in context memory
            place MAXRUN code word in Queue data
        }
    }

The decoder operations are similar although the codewords need not be saved in the queue. It is still necessary to save the time stamp of the codewords in the queue.

The function of the time stamps discussed above is to store the order information of the codewords. An equivalent manner of expressing the same concept is through the use of a single queue for all codewords, i.e., a merged queue. In a merged queue system, as shown in Figure 27, a single run count reorder unit 2701 is used for all interleaved streams. Run count reorder unit 2701 generates codeword, size and stream outputs to bit pack units 2502A-n, which output interleaved words to MUX 2503 and position information to logic 2702, which signals MUX 2503 to output interleaved words as part of the code stream.

For arbitrary streams, the run count reorder memory stores an interleaved stream ID for each codeword. Each interleaved stream has its own head pointer. When a bit pack unit needs more data, the corresponding head pointer is used to fetch as many codewords as are needed to form a new interleaved word. This may involve looking at many codeword memory locations to determine which ones are part of the proper stream. Alternatively, this may involve looking to the codeword memory for additional fields to implement a linked list.

Another method of interleaving streams uses a merged queue with fixed stream assignment. This method uses a single tail pointer as in the merged queue case, so no time stamps are required. Also, multiple head pointers are used as in the previous case, so there is no overhead in outputting the data from a particular stream. To accomplish this, the assignment of codewords to interleaved streams is performed according to the following rule, for N streams: codeword M is assigned to stream M mod N. Note that interleaved streams can have codewords from any context bin or probability class according to this method. If the number of streams is a power of two, M mod N can be computed by discarding some of the more significant bits. For example, assume that the codeword reorder memory is addressed with 12 bits and that four interleaved streams are used. The tail pointer is 12 bits long, and the two least significant bits identify the coded stream for the next codeword. Four head pointers with 10 bits each are implicitly assigned to each of the four possible combinations of the two least significant bits. Both the tail and head pointers are incremented as normal binary counters.
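The pointer arithmetic in this example can be shown with bit masks. The Python sketch below assumes the 12-bit address and four-stream figures from the example; the function names are illustrative, and placing the head pointer in the upper ten address bits is the natural reading of the text rather than a stated detail.

```python
ADDR_BITS = 12    # codeword reorder memory address width in this example
STREAM_BITS = 2   # four interleaved streams
STREAM_MASK = (1 << STREAM_BITS) - 1


def stream_of(tail_pointer):
    """Stream for the next codeword: the two least significant tail pointer bits."""
    return tail_pointer & STREAM_MASK


def full_address(head_pointer, stream):
    """Combine a 10-bit head pointer with its implicit 2-bit stream number."""
    return (head_pointer << STREAM_BITS) | stream


# Keeping only the low bits agrees with M mod N for every 12-bit codeword index M.
mod_matches = all(stream_of(m) == m % 4 for m in range(1 << ADDR_BITS))
addr = full_address(3, 2)  # fourth entry of stream 2
```

Incrementing a head pointer by one advances the full address by four, which steps to the next codeword of the same stream.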

In the decoder, the shifter has registers to store interleaved words. The shifter presents properly aligned coded data to the bit generator. When the bit generator uses some coded data, it informs the shifter. The shifter presents properly aligned data from the next interleaved stream. If the number of coded data streams is N, the shifter has N-1 clock cycles to shift out the used data and perhaps request another interleaved codeword before that particular interleaved stream will be used again.

The Decoder

The present examples include a decoder that supports the real-time encoder with limited reorder memory. In one example, the decoder also includes reduced memory requirements and complexity by maintaining a run count for each probability class instead of each context bin.

One Example of the Decoder System

Figure 14A illustrates a block diagram of one example of a decoder hardware system. Referring to Figure 14A, the decoder system 1400 includes first-in/first-out (FIFO) structure 1401, decoders 1402, memory 1403, and context model 1404. Decoders 1402 include multiple decoders. Coded data 1410 is coupled to be received by FIFO structure 1401. FIFO structure 1401 is coupled to supply the coded data to decoders 1402. Decoders 1402 are coupled to memory 1403 and context model 1404. Context model 1404 is also coupled to memory 1403.

One output of context model 1404 comprises the decoded data 1411.

In system 1400, the coded data 1410 input into FIFO structure 1401 is ordered and interleaved. FIFO structure 1401 contains data in proper order.

The streams are delivered to decoders 1402. Decoders 1402 require data from these streams in a serial and deterministic order. Although the order in which decoders 1402 require the coded data is non-trivial, it is not random.

By ordering the codewords in this order at the encoder instead of the decoder, the coded data can be interleaved into a single stream. In another example, coded data 1410 may comprise a single stream of non-interleaved data, where data for each context bin, context class or probability class is appended onto the data stream. In this case, FIFO 1401 is replaced by a storage area to receive all of the coded data prior to forwarding the data to decoders 1402 so that the data may be segmented properly.

As the coded data 1410 is received by FIFO 1401, context model 1404 determines the current context bin. In one example, context model 1404 determines the current context bin based on previous pixels and/or bits. Although not shown, line buffering may be included for context model 1404. The line buffering provides the necessary data, or template, by which context model 1404 determines the current context bin. For example, where the context is based on pixel values in the vicinity of the current pixel, line buffering may be used to store the pixel values of those pixels in the vicinity that are used to provide the specific context.

In response to the context bin, the decoder system 1400 fetches the decoder state from memory 1403 for the current context bin. In one example, the decoder state includes the probability estimation module (PEM) state and the bit generator state. The PEM state determines which code to use to decode new codewords. The bit generator state maintains a record of the bits in the current run. The state is provided to decoders 1402 from memory 1403 in response to an address provided by context model 1404. The address accesses a location in memory 1403 that stores the information corresponding to the context bin.

Once the decoder state for the current context bin has been fetched from memory 1403, system 1400 determines the next uncompressed bit and processes the decoder state. Decoders 1402 then decode the new codeword, if needed, and/or update the run count. The PEM state is updated, if needed, as well as the bit generation state. Decoders 1402 then write the new coder state into memory 1403.

Figure 14B illustrates one example of a decoder.

Referring to Figure 14B, the decoder includes shifting logic 1431, bit generator logic 1432, "new-k" logic 1433, PEM update logic 1434, new codeword logic 1435, PEM state-to-code logic 1436, code-to-mask logic 1437, code-to-MaxRL, Mask, and R3Split expansion logic 1438, decode logic 1439, multiplexer 1440, and run count update logic 1441. Shifting logic 1431 is coupled to receive the coded data input 1443, as well as the state input 1442 (from memory). The output of shifting logic 1431 is also coupled as an input to bit generation logic 1432, "new-k" generation logic 1433 and PEM update logic 1434. Bit generation logic 1432 is also coupled to receive the state input 1442 and generates the decoded data output to the context model.

New-k logic 1433 generates an output that is coupled to an input of code-to-mask logic 1437. PEM update logic 1434 is also coupled to state input 1442 and generates the state output (to memory). State input 1442 is also coupled to inputs of new codeword logic 1435 and PEM state-to-code logic 1436. The output of PEM state-to-code logic 1436 is coupled to be received by expansion logic 1438. The output of expansion logic 1438 is coupled to decode logic 1439 and run count update logic 1441. Another input to decode logic 1439 is coupled to the output of code-to-mask logic 1437. The output of decode logic 1439 is coupled to one input of MUX 1440. The other input of MUX 1440 is coupled to state input 1442. The selection input of MUX 1440 is coupled to the output of new codeword logic 1435. The outputs of MUX 1440 and expansion logic 1438 are coupled to two inputs of run count update logic 1441, along with the output of code-to-mask logic 1437. The output of run count update logic 1441 is included in the state output to memory.

Shifting logic 1431 shifts in data from the coded data stream. Based on the coded data input and state input, bit generation logic 1432 generates decoded data to the context model. New-k logic 1433 also uses the shifted in data and the state input to generate a new value of k. In one example, new-k logic 1433 uses the PEM state and the first bit of coded data to generate the new value of k. Based on the new k value, code-to-mask logic 1437 generates an RLZ mask for the next codeword. The RLZ mask for the next codeword is sent to decode logic 1439 and the run count update logic 1441.

The PEM update logic 1434 updates the PEM state. In one example, the PEM state is updated using the present state. The updated state is sent to memory. New codeword logic 1435 determines if a new codeword is needed. PEM state-to-code logic 1436 determines the code for decoding using the state input 1442. The code is input to expansion logic 1438 to generate the maximum run length, the current mask and an R3 split value. Decode logic 1439 decodes the codeword to produce a run count output. MUX 1440 selects either the output from decode logic 1439 or the state input 1442 to the run count update logic 1441. Run count update logic 1441 updates the run count.

The decoding system 1400, including decoders 1402, operates in a pipelined manner. In one example, the decoding system 1400 of the present invention determines context bins, estimates probabilities, decodes codewords, and generates bits from run counts all in a pipelined manner. One example of the pipeline structure of the decoding system is depicted in Figure 15A. Referring to Figure 15A, an example of the pipelined decoding process of the present invention is shown in six stages, numbered 1-6.

In the first stage, the current context bin is determined (1501). In the second stage, after the context bin has been determined, a memory read occurs (1502) in which the current decoder state for the context bin is fetched from memory. As stated above, the decoder state includes the PEM state and the bit generator state.

In the third stage of the pipelined decoding process a decompressed bit is generated (1503). This allows for a bit to be available to the context model. Two other operations occur during the third stage. The PEM state is converted into a code type (1504) and a determination is made as to whether a new codeword must be decoded (1505).

During the fourth stage, the decoding system processes a codeword and/or updates the run count (1506). Several sub-operations are involved in processing a codeword and updating the run count. For instance, a codeword is decoded to determine the next run count or the run count is updated for the current codeword (1506). If needed when decoding new codewords, more coded data is fetched from the input FIFO. Another sub-operation that occurs in the fourth stage is the updating of the PEM state (1507). Lastly, in the fourth stage of the decoding pipeline, the new PEM state is used to determine what the run length zero codeword (described later) is for the next code if the run count of the current codeword is zero (1508).

During the fifth stage of the decoding pipeline the decoder state with an updated PEM state is written into memory (1509) and the shifting begins for the next codeword (1510). In the sixth stage, the shifting to the next codeword is completed (1510).

The pipelined decoding actually begins with a decision as to whether to start the decoding process. This determination is based on whether there is enough data to present to the decoder. If there is not enough data from the FIFO, the decoding system is stalled. In another case, the decoding system may be stalled when outputting decoded data to a peripheral device that is not capable of receiving all of the data output from the decoder as it is being generated. For instance, when the decoder is providing output to a video display interface and its associated video circuitry, the video may be too slow, such that the decoder needs to be stalled to allow the video to catch up.

Once the decision has been made to start the decoding process, the current context bin is determined by the context model. In the present invention, the current context bin is ascertained by examining previous data. Such previous data may be stored in line buffers and may include data from the current line and/or previous lines. For instance, in a context template, for a given bit, bits from line buffer(s) may be designed using a template with respect to the previous data, such that the context bin for the current data is selected according to whether the previous data being examined matches the template. These line buffers may include bit shift registers. A template may be used for each bit plane of an n-bit image.

In one example, the context bin is selected by outputting an address to memory during the next pipeline stage. The address may include a predetermined number of bits, such as three bits, to identify the bit plane. By using three bits, the bit position in pixel data may be identified. The template used to determine the context may also be represented as a portion of the address. The bits used to identify the bit plane and the bits identifying the template may be combined to create an address for a specific location in memory that contains the state information for the context bin defined by those bits. For example, by utilizing three bits to determine the bit position in a particular pixel and the ten previous bits in the same position in each of the previous pixels in the template, a 13-bit context address may be generated.
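The 13-bit address in this example can be formed by concatenating the two fields. The Python sketch below is an illustration under stated assumptions: the function name is hypothetical, and placing the bit plane in the three most significant bits with the template bits below it is one possible layout, which the text does not specify.

```python
def context_address(bit_plane, template_bits):
    """Build a 13-bit context address from a 3-bit plane and ten template bits.

    bit_plane: integer 0..7 identifying the bit position within the pixel.
    template_bits: sequence of ten 0/1 values taken from the previous pixels.
    """
    if not 0 <= bit_plane < 8:
        raise ValueError("bit_plane must fit in three bits")
    if len(template_bits) != 10:
        raise ValueError("template must supply ten bits")
    addr = bit_plane                  # assumed layout: bit plane in the top 3 bits
    for b in template_bits:
        addr = (addr << 1) | (b & 1)  # append each template bit below it
    return addr


lowest = context_address(0, [0] * 10)
highest = context_address(7, [1] * 10)
sample = context_address(1, [0] * 9 + [1])
```

The resulting value would serve as the address used to read the state information for the context bin from memory.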

Using the address created by the context model, the memory (e.g., RAM) is accessed to obtain the state information. The state includes the PEM state. The PEM state includes the current probability estimate.

Because more than one state uses the same code, the PEM state does not include a probability class or code designation, but rather an index into a table, such as the table shown in Figure 5. Also, when using a table such as that shown in Figure 5, the PEM state provides the most probable symbol (MPS) as a means for identifying whether the current PEM state is located on the positive or negative side of the table. The bit generation state may include the count value and an indication of whether an LPS is present.

In one embodiment, the MPS value for the current run is also included for decoding the next codeword. In the present invention, the bit generator state is stored in memory to reduce the space required for run counters. If the cost of space in the system for counters for each context is low, the bit generation state does not have to be stored in memory.

Once the fourth stage has been completed, the new bit generator state and PEM state are written to memory in the fifth stage. Also in the fifth stage, the coded data stream is shifted to the next codeword. The shifting operation is completed in the sixth stage.

Figure 14C is a block diagram of one example of FIFO structure 1401 illustrating interleave word buffering for two decoders. Note that any number of decoders may be supported using the teachings of the present invention. As shown, the input data and the FIFO are wide enough to hold two interleave words. FIFO 1401 comprises FIFO 1460, registers 1461-1462, MUXs 1463-1464 and control block 1465. The two input codewords are coupled as the input interleaved words. The outputs of FIFO 1460 are coupled to inputs to registers 1461-1462. Inputs to MUX 1463 are coupled to the outputs of registers 1461 and 1462. Control block 1465 is coupled to provide control signals to FIFO 1460, registers 1461 and 1462 and MUXs 1463 and 1464. Interleave words are the output data (output data 1 and 2) provided to two decoders. Each decoder uses a request signal to indicate that the current word has been used and a new word will be needed next. The request signals from the decoders are coupled to inputs of control block 1465. Control block 1465 also outputs a FIFO request signal to request more data from memory.

Initially, the FIFO and registers 1461 and 1462 are filled with data and a valid flip flop in the control unit 1465 is set. Whenever a request occurs, the control block 1465 provides the data according to the logic shown in Table 16.

Table 16

Both    Request 1   Request 2   Multiplexer 1   Multiplexer 2   Next Both   FIFO and
Valid                                                           Valid       Register Enable
0       0           0           X               X               0           0
0       0           1           X               REG 1462        1           1
0       1           0           REG 1462        X               1           1
0       1           1           REG 1462        FIFO            0           1
1       0           0           X               X               1           0
1       0           1           X               REG 1461        0           0
1       1           0           REG 1461        X               0           0
1       1           1           REG 1461        REG 1462        1           1

X means don't care.
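The control behavior of Table 16 can be expressed as a lookup keyed on the three inputs. The Python sketch below is an illustrative reading of the table rather than a description of control block 1465 itself: the names are hypothetical, don't care entries are represented as None, and the input values of rows that are incomplete in the source are inferred from the binary enumeration order of the rows.

```python
# (both_valid, request_1, request_2) -> (mux_1, mux_2, next_both_valid, enable)
TABLE_16 = {
    (0, 0, 0): (None, None, 0, 0),
    (0, 0, 1): (None, "REG 1462", 1, 1),
    (0, 1, 0): ("REG 1462", None, 1, 1),
    (0, 1, 1): ("REG 1462", "FIFO", 0, 1),
    (1, 0, 0): (None, None, 1, 0),
    (1, 0, 1): (None, "REG 1461", 0, 0),
    (1, 1, 0): ("REG 1461", None, 0, 0),
    (1, 1, 1): ("REG 1461", "REG 1462", 1, 1),
}


def control(both_valid, request_1, request_2):
    """Look up the multiplexer selections, next valid flag and enable signal."""
    return TABLE_16[(both_valid, request_1, request_2)]


# Follow the valid flag through a short, hypothetical request sequence.
state = 1  # both registers filled initially
trace = []
for req_1, req_2 in [(1, 0), (0, 1), (1, 1)]:
    mux_1, mux_2, state, enable = control(state, req_1, req_2)
    trace.append((mux_1, mux_2, state, enable))
```

In this reading, a single request while both registers are valid consumes register 1461 without a refill, and the next single request consumes register 1462 and enables the FIFO and registers so both become valid again.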

Figure 15B illustrates a different conceptual view of the decoder.

Referring to Figure 15B, variable length (coded) data is input into a decoder. The decoder outputs fixed length (decoded) data. The output is also fed back, after a delay, as an input into the decoder. In the decoder of the present invention, variable length shifting used in decoding is based on decoded data that is available after some delay. The feedback delay does not reduce the throughput in the delay tolerant decoders.

The input variable length data is divided into fixed length interleaved words such as described in conjunction with Figure 4. The decoder uses the fixed length words as described in Figure 16A below. The decoder and delay model a pipelined decoder as described in conjunction with Figures 15 and 32, or multiple parallel decoders such as described in conjunction with Figures 2A-2D. Thus, the present example provides a delay tolerant decoder. The delay tolerant decoders of the present invention allow handling of variable length data in parallel.

Prior art decoders (e.g., Huffman decoders) are not delay tolerant. Information determined from decoding all previous codewords is required to perform the variable length shifting needed to decode the next codeword. In contrast, the present examples are delay tolerant decoders.

Shifting in the Decoding System

The decoder has shifting logic to shift the interleaved words to the proper bit generator for decoding. The shifter does not require any particular type of "by context" or "by probability" parallelism. An encoder which assigns codeword M to stream M mod N (M%N in the C language), where N is the number of streams, is assumed. In the present example, coded data from the current stream is presented until a codeword is requested. Only then is the data switched to the next stream.
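The stream assignment assumed above can be sketched in a few lines of Python. This is only an illustration of the M mod N rule; the function names `stream_for_codeword` and `split_into_streams` are hypothetical and do not appear in the specification.

```python
def stream_for_codeword(m: int, n_streams: int) -> int:
    """Return the index of the stream that carries codeword number m (M mod N)."""
    return m % n_streams


def split_into_streams(codewords, n_streams):
    """Distribute codewords over n_streams in round-robin order (hypothetical helper)."""
    streams = [[] for _ in range(n_streams)]
    for m, codeword in enumerate(codewords):
        streams[stream_for_codeword(m, n_streams)].append(codeword)
    return streams
```

With four streams, codewords 0, 4, 8, ... land in stream 0, codewords 1, 5, 9, ... in stream 1, and so on, which is why the shifter can simply advance to the next stream each time a codeword is requested.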

Figure 16A illustrates one example of the shifter for the decoder. Shifter 1600 is designed for four data streams. This allows four clock cycles for each shifting operation. The interleaved words are 16 bits and the longest codeword is 13 bits. Referring to Figure 16A, shifter 1600 comprises four registers 1601-1604 coupled to receive inputs from the interleaved coded data. The outputs of each of registers 1601-1604 are coupled as inputs to MUX 1605. The output of MUX 1605 is coupled to the input of a barrel shifter 1606. The output of barrel shifter 1606 is coupled as inputs to a register 1607, MUX & registers 1608-1610, and a size unit 1611.

The output of size unit 1611 is coupled to an accumulator 1612. An output of accumulator 1612 is fed back and coupled to barrel shifter 1606. An output of register 1607 is coupled as an input to MUX & register 1608. An output of MUX & register 1608 is coupled as an input to MUX & register 1609. An output of MUX & register 1609 is coupled as an input to MUX & register 1610.

The output of MUX & register 1610 is the aligned coded data. In one embodiment, registers 1601-1604 are 16-bit registers, barrel shifter 1606 is a 32-bit to 13-bit barrel shifter and accumulator 1612 is a 4-bit accumulator.

Registers 1601-1604 accept 16-bit words from the FIFO and input them into barrel shifter 1606. At least 32 bits of the undecoded data is provided to barrel shifter 1606 at all times. The four registers 1601-1604 are initialized with two 16-bit words of coded data to begin. This ensures that at least one new codeword is always available for each stream.

For R-codes, codeword size unit 1611 determines if a "0" or "1N" codeword is present and, if it is a "1N" codeword, how many bits after the "1" are part of the current codeword. The size unit, providing the same function, was described in conjunction with Figure 12. For other codes, determining the size of a codeword is well-known in the art.

Shifter 1600 comprises a FIFO consisting of four registers, three of which have multiplexed inputs. Each register of registers 1607-1610 holds at least one codeword, so the width of the registers and the multiplexers is 13 bits to accommodate the longest possible codeword. Each register also has one control flip-flop associated with it (not shown) that indicates if a particular register contains a codeword or if it is waiting for barrel shifter 1606 to provide a codeword.

The FIFO will never empty. Only one codeword can be used per clock cycle and one codeword can be shifted per clock cycle. The delay to perform the shifting is compensated for since the system starts out four codewords ahead. As each codeword is shifted into being the aligned coded data output, the other codewords in registers 1607-1610 shift down. When the codeword left in the FIFO is stored in register 1610, the barrel shifter 1606 causes codewords to be read out from registers 1601-1604 through MUX 1605 in order to fill registers 1607-1609. Note that the FIFO may be designed to refill register 1607 with the next codeword as soon as its codeword is shifted into register 1608.

Barrel shifter 1606, codeword size calculator 1611 and accumulator 1612 handle the variable length shifting. Accumulator 1612 has four registers, one for each coded data stream, that contain the alignment of the current codeword for each data stream. Accumulator 1612 is a four bit accumulator used to control barrel shifter 1606. Accumulator 1612 increases its value by the value input from the codeword size unit 1611. When accumulator 1612 overflows (e.g., every time the shift count is 16 or greater), registers 1601-1604 are clocked to shift. Every other 16 bit shift causes a new 32 bit word to be requested from the FIFO. The input to accumulator 1612 is the size of the codeword, which is determined by the current code and the first one or two bits of the current codeword. Note that in some examples, registers 1601-1604 must be initialized with coded data before the decoding can begin.
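The accumulator behaviour described above can be modelled in software as follows. This is a minimal sketch, assuming one 4-bit alignment value per stream and treating an overflow as the event that clocks the input registers; the class name `ShiftAccumulator` and its methods are hypothetical and are not taken from the figures.

```python
class ShiftAccumulator:
    """Software model of a 4-bit alignment accumulator, one value per stream."""

    def __init__(self, n_streams: int = 4, bits: int = 4):
        self.modulus = 1 << bits          # 16 for a 4-bit accumulator
        self.align = [0] * n_streams      # current bit alignment of each stream

    def advance(self, stream: int, codeword_size: int) -> bool:
        """Add the codeword size; return True if the accumulator overflowed.

        An overflow means the shift count reached 16 or more, so the
        16-bit input registers for this stream would be clocked to shift.
        """
        total = self.align[stream] + codeword_size
        self.align[stream] = total % self.modulus
        return total >= self.modulus
```

For example, a 13-bit codeword followed by a 5-bit codeword on the same stream gives a shift count of 18: the accumulator overflows and the new alignment is 2.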

When a codeword is requested by the system, the registers in the FIFO are clocked so that codewords are moved towards the output. When the barrel shifter 1606 is ready to deliver a new codeword, it is multiplexed into the first empty register in the FIFO.

In this example, a next codeword signal from the bit generator is received before the decision to switch streams is made.

If the next codeword signal from the bit generator cannot be guaranteed to be received before the decision to switch streams, a look-ahead system such as the one shown in Figure 16B can be used. Referring to Figure 16B, a shifter 1620 using look ahead is shown in block diagram form. Shifter 1620 includes a shifter 1600 that produces outputs of the current coded data and the next coded data. The current coded data is coupled to an input of codeword preprocessing logic unit 1621 and an input of a codeword processing unit 1624. The next coded data is coupled to an input of codeword preprocessing logic unit 1622. Outputs from both preprocessing logic units 1621 and 1622 are coupled to inputs of a MUX 1623. The output of MUX 1623 is coupled to another input of codeword processing logic 1624.

The logic that uses the codeword is divided into two parts, codeword preprocessing logic and codeword processing logic. Two identical pipelined preprocessing units 1621-1622 operate before the interleaved stream can be shifted. One of preprocessing units 1621-1622 generates the proper information if the stream is switched and the other generates the information if the stream is not switched. When the stream is switched, the output of the proper codeword preprocessing is multiplexed by MUX 1623 to codeword processing logic 1624, which completes the operation with the proper codeword.

Off Chip Memory and Context Models

According to the present invention, multiple chips may be used for external memory or external context models. In these embodiments, it is desirable to reduce the delay between generating a bit and having the bit be available to the context model where multiple integrated circuits are used.

Figure 17 illustrates a block diagram of one embodiment of a system with both an external context model chip 1701 and a coder chip 1702 with memory for each context. Note that only the units relevant to the context model in the coder chip are shown; it is apparent to those skilled in the art that the coder chip 1702 contains bit generation, probability estimation, etc.

Referring to Figure 17, the coder chip 1702 comprises a zero order context model 1703, context models 1704 and 1705, a select logic 1706, a memory control 1707 and a memory 1708. Zero order context model 1703 and context models 1704-1705 generate outputs that are coupled to inputs of the select logic 1706. Another input of select logic 1706 is coupled to an output of external context model chip 1701. The output of select logic 1706 is coupled to an input of memory 1708. Also coupled to an input of memory 1708 is an output of memory control 1707.

Select logic 1706 allows either an external context model or an internal context model (e.g., zero order context model 1703, context model 1704, context model 1705) to be used. Select logic 1706 allows the internal zero order portion of context model 1703 to be used even when the external context model 1701 is used. Zero order context model 1703 provides one bit or more while the external context model chip 1701 provides the remainder.

For instance, the immediately previous bits may be fed back and retrieved from zero order context model 1703, while earlier bits go to the external context model 1701. In this manner, the time critical information remains on-chip. This eliminates the off-chip communication delay for recently generated bits.

Figure 18 is a block diagram of one system with an external context model 1801, an external memory 1803 and a coder chip 1802. Referring to Figure 18, some memory address lines are driven by the external context model 1801, while others are driven by the "zero order" context model in the decoder chip 1802. That is, the context from the immediately past decoding cycle is driven by the zero order context model. This allows the decoder chip to provide the context information from the immediate past with minimum communication delay. The context model chip 1801 prepares the rest of the context information using only bits decoded further in the past, therefore allowing for communication delay. In many cases, the context information from the immediate past is zero order Markov state, and the context information from further in the past is higher order Markov state. The embodiment shown in Figure 18 eliminates the communication delay inherent in implementing the zero order model in the external context model chip 1801.
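The split of memory address lines between the two chips can be illustrated with a small sketch. The bit widths and the choice to place the zero order bits in the low-order address lines are assumptions made for illustration only; the function name `context_address` is hypothetical.

```python
def context_address(external_bits: int, zero_order_bits: int,
                    zero_order_width: int = 1) -> int:
    """Form a context memory address from two sources.

    external_bits   : address lines driven by the external context model,
                      computed from bits decoded further in the past.
    zero_order_bits : address lines driven on-chip from the most recent
                      decoded bit(s), available with minimal delay.
    The low-order placement of the zero order bits is an assumption.
    """
    mask = (1 << zero_order_width) - 1
    return (external_bits << zero_order_width) | (zero_order_bits & mask)
```

Because the external part depends only on older bits, it can be prepared while the most recent bit is still being decoded; only the on-chip part has to wait for that bit.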

However, there may still be a delay between context bin determination and bit generation due to the decoder chip 1802 and the memory 1803.

It should be noted that other memory architectures could be used. For instance, a system with the context model and memory in one chip and the coder in another chip may be used. Also, a system may include a coder chip with an internal memory that is used for some contexts and an external memory that is used with other contexts.

Bit Generators Using a Memory

Figure 19 shows a decoder with a pipelined bit generator using memory. Referring to Figure 19, the decoder 1900 comprises a context model 1901, memory 1902, PEM state-to-code block 1903, pipelined bit generator 1905, memory 1904 and shifter 1906. The input of context model 1901 comprises the decoded data from pipelined bit generator 1905.

The inputs of shifter 1906 are coupled to receive the coded data. The output of context model 1901 is coupled to an input to memory 1902. The output of memory 1902 is coupled to PEM state-to-code 1903. The output of PEM state-to-code 1903 and the aligned coded data output from shifter 1906 are coupled to inputs of bit generator 1905. Memory 1904 is also coupled to bit generator 1905 using a bi-directional bus. The output of bit generator 1905 is the decoded data.

Context model 1901 outputs a context bin in response to decoded data on its inputs. The context bin is used as an address to access memory 1902 to obtain a probability state. The probability state is received by PEM state-to-code module 1903, which generates the probability class in response to the probability state. The probability class is then used as an address to access memory 1904 to obtain the run count. The run count is then used by bit generator 1905 to produce the decoded data.

In one example, memory 1902 comprises a 1024x7 bit memory (where 1024 is the number of different contexts), while memory 1904 comprises a 25x14 bit memory (where 25 is the number of different run counts).

Since bit generator states (run counts, etc.) are associated with probability classes, not context bins, there is additional pipeline delay before a bit is available to the context model. Because updating a bit generator state takes multiple clock cycles (the bit generator state memory revisit delay), multiple bit generator states will be used for each probability class. For example, if the pipeline is six clock cycles, then the bit generator state memory will have six entries per probability class. A counter is used to select the proper memory location. Even with multiple entries per probability class, the size of the memory will typically be less than the number of contexts. The memory can be implemented with either multiple banks of SRAM or a multiported register file.
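One way to picture the multiple entries per probability class is as a small two-dimensional memory indexed by class and by a cycle counter. The sketch below assumes a six-cycle pipeline and the 25 probability classes of the example above; the linear address layout and the name `state_address` are assumptions for illustration, not the layout used in any figure.

```python
PIPELINE_DEPTH = 6        # entries per probability class (six-cycle pipeline)
NUM_CLASSES = 25          # probability classes, as in the 25x14 bit example
NUM_CONTEXTS = 1024       # context bins, as in the 1024x7 bit example


def state_address(prob_class: int, cycle_counter: int) -> int:
    """Select the bit generator state entry for a class on a given cycle.

    The counter rotates through the PIPELINE_DEPTH entries so that an entry
    is not revisited until its previous update has completed.
    """
    return prob_class * PIPELINE_DEPTH + (cycle_counter % PIPELINE_DEPTH)


# Total number of bit generator state entries under these assumptions.
TOTAL_ENTRIES = NUM_CLASSES * PIPELINE_DEPTH
```

Under these assumptions the state memory holds 150 entries, which is still well below the 1024 entries that a per-context memory would need.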

Since one run count may be associated with multiple contexts, a system must update the probability estimation state of one or more contexts. In one example, the PEM state of the context which causes a run to end is updated.

Instead of requiring a read, modify and write of a run count before it can be read again, a run count can be used again as soon as the modify is complete.

Figure 32 illustrates a timing diagram of a decode operation in one example of a decoder. Referring to Figure 32, a three cycle decode operation is depicted. Signal names are listed in the left hand column of the timing diagram. The validity of a signal during any one cycle is indicated with a bar during the cycle (or portion thereof). In certain cases, the unit or logic responsible for generating the signal or supplying the valid signal is shown adjacent to the valid signal indication in a dotted-lined box. At times, examples of specific elements and units disclosed herein are provided as well. Note that any portion of the signal that extends into another cycle indicates the validity of the signal only for that period of time in which the signal is shown extending into the other cycle. Also, certain signals are shown as being separately valid for more than one cycle. An example of such is the temp run count signal, which is valid at one point at the end of the second cycle and then again during the third cycle. Note that this indicates that the signal is merely being registered at the end of the cycle. A list of dependencies is also shown in Table 17 below, setting forth the dependencies from the same or previous clock cycle to the current time at which the signal is specified to be valid.

Table 17

Name | Unit | Dependencies
register file 1 | CM | (previous bit, CM shift register)
state to code | CM | register file 1
barrel shift | SH | (accumulator register, unaligned coded data registers)
size | SH | barrel shifter output (aligned coded data), (K, R3 registered)
acc (accumulator) | SH | size, (previous accumulator register value)
register file 2 | BG | (K, R3 registered)
cw (codeword needed) | BG | register file 2
code to (mask, maxRL, R3split) | BG | (K, R3 registered)
gen bit (generator bit) | BG | register file 2, barrel shifter output (aligned coded data), code to (mask, maxRL, R3split), (register file 1, registered MPS)
decode | BG | barrel shifter output (aligned coded data), code to (mask, maxRL, R3split)
PEM table | PEM | (K, R3 registered)
PEM update †† | PEM | (registered: PEM table output, LPS present, continue)
run count update | BG | (registered: codeword needed, run count, LPS present, continue)
LPS present update | BG | (registered: codeword needed, run count, LPS present, continue)

CM = context model, SH = shifter, BG = bit generator, PEM = probability estimation machine. (italics) means dependencies from previous clock cycle.

†† In one embodiment, most combinational logic for updating the PEM state is performed in the "PEM table" step; "PEM update" is simply a multiplex operation.

Implicit Signaling

In some examples, the decoder must model the finite reordering buffer of the encoder. In one example, this modeling is accomplished with implicit signaling.

As explained previously with regard to the encoder, when a codeword is started in the encoder, space is reserved in the appropriate buffer for the codeword in the order the codewords should be placed on the channel. When the last space in a buffer is reserved for a new codeword, then some codewords are placed in the compressed bit stream whether or not they have been completely determined.

When a partial codeword must be completed, a codeword may be chosen which is short and correctly specifies the symbols received so far. For example, in an R-coder system, if it is necessary to prematurely complete a codeword for a series of 100 MPSs in a run code with 128 maximum run-length, then the codeword for 128 MPSs can be used, since this correctly specifies the first 100 symbols.

Alternatively, a codeword that specifies 100 MPSs followed by an LPS can be used. When the codeword has been completed, it can be removed from the reordering buffer and added to the code stream. This may allow previously completed codewords to be placed in the code stream as well. If forcing the completion of one partial codeword results in the removal of a codeword from the full buffer, then encoding can continue. If one buffer is still full, then the next codeword must again be completed and added to the code stream. This process continues until the buffer which was full is no longer full. The decoder may model the encoder for implicit signaling using a counter for each bit generator state information memory location.

In one example, each run counter (probability class in this example) has a counter which is the same size as the head or tail counters in the encoders (e.g., 6 bits). Every time a new run is started (a new codeword is fetched), the corresponding counter is loaded with the size of the codeword memory. Every time a run is started (a new codeword is fetched), all counters are decremented. Any counter that reaches zero causes the corresponding run count to be cleared (and the LPS present flag is cleared).

Options for Signaling for Finite Memory

Real-time encoding in the present examples requires that the decoder handle runs of MPSs that are not followed by an LPS and are not the maximum run length. This occurs when the encoder begins a run of MPSs, but does not have enough limited reordering memory to wait until the run is complete. This condition requires a new codeword to be decoded the next time this context bin is used, and this condition must be signaled to the decoder. Three potential ways of modifying the decoder are described below.

When the buffer is full, the run count for the context bin or probability class that is forced out must be reset. To implement this efficiently, storing the context bin or probability class in the codeword memory is useful. Since this is only needed for runs that do not yet have an associated codeword, the memory used to store the codeword can be shared. Note that in some systems, instead of forcing out an incomplete codeword, bits can be forced into the context/probability class of the (or any) codeword that is pending in the buffer when the buffer is full. The decoder detects this and uses the corresponding (wrong) context bin or probability class.

Instream signaling uses codewords to signal the decoder. In one example, the R2(k) and R3(k) code definitions are changed to include non-maximum length runs of MPS that are not followed by an LPS. This can be implemented by adding one bit to the codeword that should occur with the lowest probability. This allows a uniquely decodable prefix for the non-maximum length run counts. Table 18 shows a replacement for R2(2) codes that allows instream signaling. The disadvantages of this method are that the R-code decoding logic must be changed and that there is a compression cost every time the codeword with the lowest probability occurs.
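The replacement R2(2) code of Table 18 can be exercised with a small prefix decoder. The code table below is this sketch's reading of Table 18 (in the original data, 0 denotes an MPS and 1 an LPS); the names `TABLE_18`, `is_prefix_free` and `decode` are hypothetical, and the bit strings are handled as Python strings purely for readability.

```python
# Codeword -> original data. The last three entries are the added
# non-maximum length runs of MPS that are not followed by an LPS.
TABLE_18 = {
    "0": "0000",
    "1000": "0001",
    "101": "001",
    "110": "01",
    "111": "1",
    "100100": "000",
    "100101": "00",
    "10011": "0",
}


def is_prefix_free(codes) -> bool:
    """Return True if no codeword is a prefix of another (unique decodability)."""
    ordered = sorted(codes)
    return all(not b.startswith(a) for a, b in zip(ordered, ordered[1:]))


def decode(bits: str):
    """Split a bit string into codewords and return the decoded runs."""
    runs, current = [], ""
    for bit in bits:
        current += bit
        if current in TABLE_18:
            runs.append(TABLE_18[current])
            current = ""
    if current:
        raise ValueError("bit string ends in the middle of a codeword")
    return runs
```

Note how the codeword for 0001 grows from three bits to four: the freed prefix 1001 is what carries the partial runs, which is the compression cost mentioned above.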

Table 18

Original Data | Codeword
0000 | 0
0001 | 1000
001 | 101
01 | 110
1 | 111
000 | 100100
00 | 100101
0 | 10011

In some examples, the decoder performs implicit signaling using time stamps. A counter keeps track of the current "time" by incrementing every time a codeword is requested. Also, whenever a codeword is started, the current "time" is saved in memory associated with the codeword. Anytime after the first time a codeword is used, the corresponding stored "time" value plus the size of the encoder's reordering buffer is compared with the current "time". If the current "time" is greater, an implicit signal is generated so that a new codeword is requested. Thus, the limited reorder memory in the encoder has been simulated. In one example, enough bits for "time" values are used to allow all codewords to be enumerated.
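The time stamp comparison can be sketched as follows. This minimal model assumes unbounded "time" values (the case in which all codewords can be enumerated) and assumes that the counter is incremented before the time stamp is saved; the class name `TimeStampModel` and the use of a dictionary keyed by context bin or probability class are illustrative choices only.

```python
class TimeStampModel:
    """Decoder-side model of the encoder's limited reordering buffer."""

    def __init__(self, reorder_buffer_size: int):
        self.reorder_buffer_size = reorder_buffer_size
        self.time = 0            # incremented each time a codeword is requested
        self.started = {}        # key -> "time" at which its codeword was started

    def request_codeword(self, key) -> None:
        """Record that a new codeword was requested (started) for this key."""
        self.time += 1
        self.started[key] = self.time

    def needs_new_codeword(self, key) -> bool:
        """Implicit signal: True once the codeword is older than the buffer allows."""
        return self.time > self.started[key] + self.reorder_buffer_size
```

With a reordering buffer of size 2, a codeword started at time 1 is still usable at times 2 and 3 and is signaled as expired at time 4.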

To reduce the memory required, the number of bits used for the time stamps is kept to a minimum. If the time stamps use a small number of bits, such that time values are reused, care must be taken that all old time stamps are noted before the counter starts reusing times. Let N be the greater of the number of address bits for the queue or the bit generator state memory. Time stamps with N+1 bits can be used. The bit generator state memory must support multiple accesses, perhaps two reads and two writes per decoded bit.

A counter is used to cycle through the bit generator state memory, incrementing once for each bit decoded. Any memory location that is too old is cleared so a new codeword is fetched when it is used in the future. This guarantees all time stamps are checked before any time value is reused.

If the bit generator state memory is smaller than the queue, the rate of counting (the time stamp counter) and the memory bandwidth required can be reduced. This is because each time stamp (one per bit generator state memory location) must be checked only once per the number of cycles required to use the entire queue. Also, storing the time stamps in a different memory might reduce the memory bandwidth required. In a system that uses "0" codewords for partial runs, time stamps do not have to be checked for "1N" codewords. In a system that uses "1N" codewords for partial runs, the time stamp only has to be checked before generating an LPS.

In some examples, implicit signaling is implemented with a queue during decoding. This method might be useful in a half duplex system where the hardware for encoding is available during decoding. The operation of the queue is almost the same as during encoding. When a new codeword is requested, its index is placed in the queue and marked as invalid. When the data from a codeword is completed, its queue entry is marked as valid. As data is taken out of the queue to make room for new codewords, if the data taken out is marked as invalid, the bit generator state information from that index is cleared. This clearing operation may require that the bit generator state memory be able to support an additional write operation.

Explicit signaling, in contrast, communicates buffer overflow as compressed data. One example is to have an auxiliary context bin that is used once for every normal context bin decode or once for every codeword that is decoded. Bits decoded from the auxiliary context bin indicate if the new-codeword-needed condition occurs and a new codeword must be decoded for the corresponding normal context bin. In this case, the codewords for this special context must be reordered properly. Since the utilization of this context is a function of something known to the reorder unit (typically, it is used once for each codeword), the memory required to reorder the auxiliary context can be bounded or modeled implicitly. Also, the possible codes allowed for this auxiliary context can be limited.

Implicit signaling models the encoder's limited buffer when decoding to generate a signal that indicates that a new codeword must be decoded. In one example, a time stamp is maintained for each context. In one example, the encoder's finite size reordering buffer is modeled directly. In a half duplex system, since the encoder's reordering circuitry is available during decoding, it might be used to generate the signals for the decoder.

Exactly how implicit signaling is accomplished depends on the details of how the encoder recognizes and handles the full buffer condition. For a system using a merged queue with fixed allocation, the use of multiple head pointers allows choices of what "buffer full" means. Given a design for the encoder, an appropriate model can be designed.

The following provides encoder operation and a model for use by the decoder for a merged queue with fixed stream assignment, parallel by probability system. For this example, assume that the reordering buffer has 256 locations, 4 interleaved streams are used, and each interleaved word is 16 bits. When the buffer contains 256 entries, an entry must be sent out to a bit packer (e.g., bit pack unit) before the entry for the 257th codeword can be placed in the queue. Entries can be forced out earlier if necessary.

In some systems, removing the first entry in the buffer requires removing enough bits to complete an entire interleaved word. Therefore, if 1-bit codewords are possible, removing codeword 0 might require also removing codewords 4, 8, 12, ..., 52, 56, 60 for 16-bit interleaved words. To ensure that all of these buffer entries are valid, forcing an entry to be filled because the memory is full can be performed at address 64, which is 192 locations from the location where a new codeword is entered (256 - 16 x 4 = 192).

In the decoder there is a counter for each probability. When a new codeword is used to start a run, the counter is loaded with 192. Any time a new codeword is used by any probability, all counters are decremented. If any counter reaches zero, the run length for that probability is set to zero (and the LPS present flag is cleared).
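The counter scheme for this example can be sketched as follows. The reload value follows the arithmetic given above (256 - 16 x 4 = 192). Two details are assumptions of the sketch rather than statements of the specification: the counters are decremented before the counter of the starting run is loaded, and a counter at zero is treated as idle. The class name `RunCounterModel` is hypothetical, and the run length is stored only so that its clearing can be observed.

```python
REORDER_BUFFER_ENTRIES = 256
INTERLEAVED_WORD_BITS = 16
STREAMS = 4
RELOAD = REORDER_BUFFER_ENTRIES - INTERLEAVED_WORD_BITS * STREAMS  # 192


class RunCounterModel:
    """One countdown counter per probability, modelling the encoder's buffer."""

    def __init__(self, n_probabilities: int, reload: int = RELOAD):
        self.reload = reload
        self.counter = [0] * n_probabilities
        self.run_length = [0] * n_probabilities
        self.lps_present = [False] * n_probabilities

    def start_run(self, prob: int, run_length: int, lps_present: bool) -> None:
        """A new codeword starts a run for `prob`; all active counters count down."""
        for p in range(len(self.counter)):
            if self.counter[p] > 0:
                self.counter[p] -= 1
                if self.counter[p] == 0:
                    # The encoder would have forced this codeword out.
                    self.run_length[p] = 0
                    self.lps_present[p] = False
        self.counter[prob] = self.reload
        self.run_length[prob] = run_length
        self.lps_present[prob] = lps_present
```

A small reload value makes the behaviour easy to observe: with a reload of 3, a run started for probability 0 is cleared after three further codewords are used by other probabilities.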

It may be convenient to use multiple RAM banks (multi-ported memory, simulation with fast memory, etc.), one bank for each coded data stream. This permits all bit pack units to receive data simultaneously, so reading multiple codewords for a particular stream does not prohibit reading by other streams.

In other systems, multiple bit pack units must arbitrate for a single memory based on the codeword order as stored in the buffer. In these systems, removing an entry from a buffer may not complete an interleaved word. Each bit pack unit typically receives some fraction of an interleaved word in sequence. Each bit pack unit receives at least a number of bits equal to the shortest codeword length (e.g., 1 bit) and at most a number of bits equal to the longest codeword length (e.g., 13 bits). Interleave words cannot be emitted until they are complete, and must be emitted in the order of initialization. In this example, a bit pack unit might have to buffer 13 interleave words. This is the maximum number of interleave words that can be completed with maximal length codewords while another stream has an interleaved word pending that is receiving minimal length codewords.
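The figure of 13 interleave words follows from simple arithmetic, shown here as a short sketch. It assumes that the two streams being compared receive the same number of codewords over the interval in question (as with round-robin assignment of codewords to streams); the function name `max_buffered_words` is hypothetical.

```python
def max_buffered_words(word_bits: int, shortest: int, longest: int) -> int:
    """Worst-case number of completed interleave words a bit pack unit must hold.

    While one stream needs word_bits // shortest codewords to finish its
    pending word, another stream may receive the same number of codewords
    at the longest length.
    """
    codewords_to_fill_pending_word = word_bits // shortest
    bits_on_fast_stream = codewords_to_fill_pending_word * longest
    return bits_on_fast_stream // word_bits
```

For 16-bit interleave words with codewords of 1 to 13 bits, the slow stream needs 16 codewords to complete its word, during which the fast stream can receive 16 x 13 = 208 bits, i.e. 13 complete words.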

A system where every codeword requires two writes and one read of memory may be less desirable for hardware implementation than a system that performs two writes and two reads. If this was desired for the example system with four streams, bit pack units 1 and 2 could share one memory read cycle and bit pack units 3 and 4 could share the other read cycle (or any other arbitrary combination). While this would not reduce the size of the buffering needed, it would allow a higher transfer rate into the bit pack unit.

This may allow the bit pack units to better utilize the capacity of the coded data channel.

Systems with Fixed Size Memory

One advantage of a system that has multiple bit generator states per probability class is that the system can support lossy coding when a fixed size memory overflows. This might be useful for image compression for a frame buffer and other applications that can only store a limited amount of coded data.

For systems with fixed size memory, the multiple bit generator states for each probability are each assigned to a part of the data. For example, each of eight states could be assigned to a particular bitplane for eight bit data. In this case, a shifter is also assigned to each part of the data, in contrast to shifters sequentially providing the next codeword. It should be noted that the data need not be divided by bitplane. Also, in the encoder, no interleaving is performed; each part of the data is simply bitpacked. Memory is allocated to each part of the data.

Memory management for coded data is presented for systems that store all of the data in a fixed size memory and for systems that transmit data in a channel with a maximum allowable bandwidth. In both of these systems, graceful degradation to a lossy system is desired. Different streams of data are used for data with different importance so that less important streams need not be stored or transmitted when sufficient storage or bandwidth is not available.

When using memory, the coded data must be stored so that it can be accessed such that less important data streams can be discarded without losing the ability to decode important data streams. Since coded data is variable length, dynamic memory allocation can be used. Figure 31 shows an example dynamic memory allocation unit for three coded data streams. A register file 3100 (or other storage) holds a pointer for each stream plus another pointer for indicating the next free memory location. Memory 3101 is divided into fixed size pages.

Initially, each pointer assigned to a stream points to the start of a page of memory and the free pointer to the next available page of memory. Coded data from a particular stream is stored at the memory location addressed by the corresponding pointer. The pointer is then incremented to the next memory location.

When the pointer reaches the maximum for the current page, the following occurs. The address of the start of the next free page (stored in the free pointer) is stored with the current page. (Either part of the coded data memory or a separate memory or register file could be used.) The current pointer is set to the next free page. The free pointer is incremented. These actions allocate a new page of memory to a particular stream and provide links so that the order of allocation can be determined during decoding.
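The pointer and page-link scheme described above can be sketched in software. This is a minimal model, assuming that a new page is allocated at the moment a write finds the current page full and that the link to the next page is kept in a separate table; the class name `PagedStreamMemory` and its methods are hypothetical.

```python
class PagedStreamMemory:
    """Fixed size pages shared by several coded data streams."""

    def __init__(self, n_pages: int, page_size: int, n_streams: int):
        self.n_pages = n_pages
        self.page_size = page_size
        self.pages = [[] for _ in range(n_pages)]
        self.link = [None] * n_pages              # page -> next page of the same stream
        self.first_page = list(range(n_streams))  # initial page of each stream
        self.current = list(range(n_streams))     # page each stream pointer is in
        self.free_pointer = n_streams             # next available page

    def write(self, stream: int, word) -> None:
        page = self.current[stream]
        if len(self.pages[page]) == self.page_size:
            if self.free_pointer == self.n_pages:
                raise MemoryError("all pages are in use")
            self.link[page] = self.free_pointer   # store link with the full page
            page = self.current[stream] = self.free_pointer
            self.free_pointer += 1
        self.pages[page].append(word)

    def read_stream(self, stream: int):
        """Follow the links to recover a stream's data in order of allocation."""
        data, page = [], self.first_page[stream]
        while page is not None:
            data.extend(self.pages[page])
            page = self.link[page]
        return data
```

Reading a stream back only requires its first page and the stored links, which is what allows the order of allocation to be determined during decoding.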

When all pages in the memory are in use and there is more data from a stream that is more important than the least important data in memory, one of three things may be done. In all three cases, memory assigned to the least important data stream is reassigned to a more important data stream and no more data from the least important data stream is stored.

First, the page currently being used by the least important stream is simply assigned to the more important data. Since most typical entropy coders use internal state information, all of the least important data stored previously in that page is lost.

Second, the page currently being used by the least important stream is simply assigned to the more important data stream. Unlike the previous case, the pointer is set to the end of the page and as more important data is written to the page, the corresponding pointer is decremented. This has the advantage of preserving the least important data at the start of the page if the more important data stream does not require the entire page.
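The second alternative can be pictured as a page that is written from both ends. In the sketch below, the least important data grows upward from the start of the page and, once the page is reassigned, the more important data grows downward from the end, overwriting least important data only if it reaches it. The class name `TwoEndedPage` and its bookkeeping fields are hypothetical.

```python
class TwoEndedPage:
    """A page holding least important data at the start and reassigned data at the end."""

    def __init__(self, size: int):
        self.cells = [None] * size
        self.low_count = 0          # words of least important data still intact
        self.high_next = size - 1   # next cell for more important data, moving down

    def write_low(self, word) -> None:
        """Store least important data (used before the page is reassigned)."""
        if self.low_count > self.high_next:
            raise MemoryError("page full")
        self.cells[self.low_count] = word
        self.low_count += 1

    def write_high(self, word) -> None:
        """Store more important data from the end of the page downward."""
        if self.high_next < 0:
            raise MemoryError("page full")
        self.cells[self.high_next] = word
        self.high_next -= 1
        # Any least important data that was overwritten is no longer available.
        self.low_count = min(self.low_count, self.high_next + 1)

    def surviving_low(self):
        """Return the least important data that is still preserved."""
        return self.cells[:self.low_count]
```

If the more important stream needs only part of the page, the data at the start of the page survives and remains decodable, which is the advantage noted above.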

Third, instead of the current page of least important data being reassigned, any page of least important data may be reassigned. This requires that the coded data for all pages be coded independently, which may reduce the compression achieved. It also requires that the uncoded data corresponding to the start of all pages be identified. Since any page of least important data can be discarded, greater flexibility in graceful degradation to lossy coding is available.

The third alternative might be especially attractive in a system that achieves a fixed rate of compression over regions of the image. A specified number of memory pages can be allocated to a region of the image. Whether less important data is retained or not can depend on the compression achieved in a particular region. (The memory assigned to a region might not be fully utilized if lossless compression required less than the amount of memory assigned.) Achieving a fixed rate of compression on a region of the image can support random access to the image regions.

The ability to write data into each page from both ends can be used to better utilize the total amount of memory available in the system. When all pages are allocated, any page that has sufficient free space at the end can be allocated for use from the end. The ability to use both ends of a page must be balanced against the cost of keeping track of the location where the two types of data meet. (This is different from the case where one of the data types was not important and could simply be overwritten.)

Now consider a system where data is transmitted in a channel instead of being stored in a memory. Fixed size pages of memory are used, but only one page per stream is needed. (Or perhaps two if ping-ponging is needed to provide buffering for the channel, such that while writing to one, the other may be read for output.) When a page of memory is full, it is transmitted in the channel, and the memory location can be reused as soon as the page is transmitted. In some applications, the page size of the memory can be the size of data packets used in the channel or a multiple of the packet size.

In some communications systems, for example ATM (Asynchronous Transfer Mode), priorities can be assigned to packets. ATM has two priority levels, priority and secondary. Secondary packets are only transmitted if sufficient bandwidth is available. A threshold can be used to determine which streams are priority and which are secondary. Another method would be to use a threshold at the encoder to not transmit streams that were less important than a threshold.
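The two thresholds mentioned above can be sketched as follows. The function name, the numeric importance scale (larger meaning more important) and the string labels are assumptions made for illustration only.

```python
def classify_streams(importance, priority_threshold, drop_threshold=None):
    """Map each stream name to "priority", "secondary", or "dropped"."""
    result = {}
    for stream, value in importance.items():
        if drop_threshold is not None and value < drop_threshold:
            result[stream] = "dropped"      # never transmitted by the encoder
        elif value >= priority_threshold:
            result[stream] = "priority"
        else:
            result[stream] = "secondary"    # sent only if bandwidth allows
    return result
```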

Separate Bit Generators for Each Code

Figure 20 is a block diagram of a system with separate bit generators for each code. Referring to Figure 20, decoding system 2000 comprises context model 2001, memory 2002, PEM state-to-code block 2003, decoder 2004, bit generators 2005A-n, and shifter 2006. The output of context model 2001 is coupled to an input of memory 2002. The output of memory 2002 is coupled to an input of PEM state-to-code block 2003. The output of PEM state-to-code block 2003 is coupled to an input of decoder 2004. The output of decoder 2004 is coupled as an enable for bit generators 2005A-n. Bit generators 2005A-n are also coupled to receive coded data output from shifter 2006.

Context model 2001, memory 2002, and PEM state-to-code block 2003 operate like their counterparts in Figure 19. Context model 2001 generates a context bin. Memory 2002 outputs a probability state based on the context bin. The probability state is received by the PEM state-to-code block 2003, which generates a probability class for each probability state. Decoder 2004 enables one of the bit generators 2005A-n upon decoding the probability class. (Note that decoder 2004 is an M to 2^M decoder circuit similar to the well-known 74x138 3:8 decoder; it is not an entropy coding decoder.) Note that since each code has a separate bit generator, some bit generators may use codes other than R-codes. Particularly, a code for probabilities near 60% might be used to better tile the probability space between R2(0) and R2(1). For instance, Table 19 depicts such a code.
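As a small illustration of the M to 2^M decoding performed by decoder 2004, the following sketch turns a probability class into a one-hot list of enable lines. The function name `one_hot_enable` is hypothetical, and a list of integers stands in for the hardware enable signals.

```python
def one_hot_enable(prob_class, m):
    """Return 2**m enable lines with exactly one asserted."""
    if not 0 <= prob_class < (1 << m):
        raise ValueError("probability class out of range")
    return [int(i == prob_class) for i in range(1 << m)]
```

With `m = 3` this behaves like a 3:8 decoder: class 5 asserts only the sixth enable line.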

Table 19

uncoded data codeword
0 0 0 0 0 0 0 1 0 1 0 1 1 0 1 1 1

If needed to achieve the desired speed, pre-decoding of one or more bits may be done to guarantee that decoded data is available quickly. In some examples, to avoid the need to be able to update a large run count every clock cycle, both codeword decoding and run counting for long codes are partitioned.

The bit generator for R2(0) codes is uncomplicated. A codeword is requested every time a bit is requested. The bit generated is simply the codeword (XORed with the MPS).

Codes for short run lengths, for example, R2(1), R3(1), R2(2) and R3(2), are handled in the following manner. All of the bits in a codeword are decoded and stored in a state machine that comprises a small counter (1, 2, or 3 bits respectively) and an LPS present bit. The counter and LPS present bit operate as an R-code decoder.
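The counter and LPS present bit can be modelled in a few lines. This sketch assumes the usual R2(k) convention: a "0" codeword stands for a run of 2^k MPSs, and a "1N" codeword stands for N MPSs followed by an LPS, with N written here as a plain k-bit binary number (the hardware may store N in a different form, such as its complement). The names `ShortRunBitGenerator` and `decode` are hypothetical.

```python
class ShortRunBitGenerator:
    """R2(k) bit generator modelled as a run counter plus an LPS present bit."""

    def __init__(self, k, mps=0):
        self.k = k
        self.mps = mps
        self.count = 0             # MPSs still to be output
        self.lps_present = False   # True if an LPS follows the MPSs

    def needs_codeword(self):
        return self.count == 0 and not self.lps_present

    def load(self, codeword):
        """codeword is "0", or "1" followed by k bits giving N (assumed format)."""
        if codeword == "0":
            self.count, self.lps_present = 1 << self.k, False
        else:
            n = int(codeword[1:], 2) if self.k else 0
            self.count, self.lps_present = n, True

    def next_bit(self):
        if self.count > 0:
            self.count -= 1
            return self.mps
        self.lps_present = False
        return 1 - self.mps


def decode(codewords, k, mps=0):
    gen = ShortRunBitGenerator(k, mps)
    bits = []
    for cw in codewords:
        gen.load(cw)
        while not gen.needs_codeword():
            bits.append(gen.next_bit())
    return bits
```

For k = 0 the generator reduces to the R2(0) case: each one-bit codeword yields one output bit equal to the codeword XORed with the MPS.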

For longer codes, such as R2(k) and R3(k) for k > 2, bit generators are partitioned into two units as shown in Figure 21. Referring to Figure 21, a bit generator structure for R2(k) codes for k > 2 is shown having a short run unit 2101 and a long run unit 2102. Note that although the structure is for use with R2(k>2) codes, its operation will be similar for R3(k>2) codes (and is apparent to one skilled in the art).

Short run unit 2101 is coupled to receive an enable signal and codeword[2..0] as inputs into the bit generator, and an "all ones" signal and a "count zero" signal (indicating a count of zero), both from long run unit 2102.

In response to these inputs, short run unit 2101 outputs a decoded bit and a "next" signal, which indicates that a new codeword is needed. Short run unit 2101 also generates a count enable signal, a count load signal and a count max signal to long run unit 2102. Long run unit 2102 is also coupled to receive codeword[k..3] as an input to the bit generator.

Short run unit 2101 handles runs of up to length 4 and is similar to an R2(2) bit generator. In one example, short run unit 2101 is the same for all R2(k>2) codes. The purpose of long run unit 2102 is to determine when the last 1-4 bits of the run are to be output. Long run unit 2102 has inputs, AND logic and a counter that vary in size with k.

One example of the long run count unit 2102 is shown in Figure 22. Referring to Figure 22, the long run unit 2102 comprises AND logic 2201, which is coupled to receive codeword[k..3] and outputs an "all ones" signal as a logical 1 if all of the bits in the codeword are 1s, thereby indicating that the current codeword is a "1N" codeword and that the run count is less than 4.

NOT logic 2202 is also coupled to receive the codeword and inverts it. The output of NOT logic 2202 is coupled to one input of a bit counter 2203. The bit counter 2203 is also coupled to receive the count enable signal, the count load signal and the count max signal. In response to the inputs, the bit counter 2203 generates a count zero signal.

In one example, the counter 2203 is a k-2 bit counter and is used to break long run counts into runs of four MPSs and possibly some remainder. The count enable signal indicates that four MPSs have been output and the counter should be decremented. The count load signal is used when decoding "1N" codewords and causes the counter to be loaded with the complement of codeword bits k through 3. The count max signal is used when decoding "0" codewords and loads the counter with its maximum value. A count zero output signal indicates when the counter is zero.

One example of the short run count unit 2101 is shown in Figure 23. Referring to Figure 23, the short run count unit contains a control module 2301, a two-bit counter 2302 and a three-bit counter 2303. The control module 2301 receives the enable signal, codeword[2..0], and the all ones and count zero signals from the long run count unit. The two-bit counter is used to count four-bit runs of MPSs that are part of longer runs. An R2(2) counter and LPS bit (three bits total) 2303 is used to generate the 1-4 bits at the end of a run. The enable input indicates that a bit should be generated on the bit output. The count zero input, when not asserted, indicates that a run of four MPSs should be output. Whenever the MPS counter 2302 reaches zero, the count enable output is asserted. When the count zero input is asserted, either the R2(2) counter and LPS is used, or a new codeword is decoded and the next output is asserted.

When the new codeword is decoded, the actions performed are determined by the codeword input. If the input is a "0" codeword, the MPS counter 2302 is used and the count max output is asserted. For "1N" codewords, the first three bits of the codeword are loaded into the R2(2) counter and LPS 2303, and the count load output is asserted. If the all ones input is asserted, then the R2(2) counter and LPS 2303 are used to generate bits; otherwise the MPS counter is used until the count zero input is asserted.
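The division of labor between the long run unit and the short run unit can be sketched at the behavioral level. The model below is not signal-accurate: it assumes N is the number of MPSs preceding the LPS in a "1N" codeword, and the names `partition_run` and `expand` are hypothetical. It only shows how a run splits into groups of four MPSs (counted by the k-2 bit counter) and a final tail of 1-4 bits (produced by the R2(2)-style short run unit).

```python
MPS, LPS = "M", "L"


def partition_run(k, codeword_is_zero, n=0):
    """Return (groups_of_four, tail) for one R2(k) codeword with k > 2."""
    if k <= 2:
        raise ValueError("partitioning applies to k > 2")
    if codeword_is_zero:
        # "count max": the (k-2)-bit counter is loaded with its maximum value,
        # and the short run unit emits the final four MPSs.
        return (1 << (k - 2)) - 1, [MPS] * 4
    if not 0 <= n < (1 << k):
        raise ValueError("n out of range")
    groups, remainder = divmod(n, 4)
    return groups, [MPS] * remainder + [LPS]


def expand(groups, tail):
    """Rebuild the full run from the two partial counts."""
    return [MPS] * (4 * groups) + tail
```

For k = 4, a "0" codeword gives three groups of four plus a four-MPS tail (16 MPSs in total), and a "1N" codeword with N = 9 gives two groups of four plus the tail MPS, LPS.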

From a system perspective, the number of codes must be small for the system to work well, typically 25 or less. The size of the multiplexer needed for bit and next codeword outputs and the decoder for enabling a particular bit generator must be limited for fast operation. Also, the fan-out of the codeword from the shifter must not be too high for high speed operation.

Separate bit generators for each code allow pipelining. If all codewords resulted in at least two bits, processing of codewords could be pipelined in two cycles instead of one. This might double the speed of the decoder if the bit generators were a limiting portion of the system. One way to accomplish this is for the run length zero codeword (the codeword indicates just an LPS) to be followed by one bit which is the next uncoded bit. These might be called RN(k)+1 codes and would always code at least two bits. Note that R2(0) codewords and perhaps some of the other short codewords do not need to be pipelined for speed.

Separate bit generators lend themselves to use with implicit signaling. Implicit signaling for encoding with finite memory can be accomplished in the following manner. Each bit generator has a counter that is the size of a queue address, for example, 9 bits when a size 512 queue is used. Every time a new codeword is used by a bit generator, the counter is loaded with the maximum value. Any time any bit generator requests a codeword, the counters for all bit generators are decremented. Any time a counter reaches zero, the corresponding bit generator's state is cleared (for example, the MPS counter, the R2(2) counter and LPS, and the long run count counter are cleared). Because clearing can occur even if a particular bit generator is not enabled, there is no problem with stale counts.
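The implicit signaling counters can be sketched as follows. The class name `ImplicitSignalBank` is hypothetical, and the order of operations within one request (all counters are decremented first, then the requesting generator's counter is reloaded) is an assumption; the text does not fix it.

```python
class ImplicitSignalBank:
    """Per-generator timeout counters for implicit signaling (illustrative)."""

    def __init__(self, num_generators, address_bits):
        self.max_count = (1 << address_bits) - 1   # e.g. 511 for a size 512 queue
        self.counters = [0] * num_generators
        self.state = [None] * num_generators       # None means "cleared"

    def request_codeword(self, gen, new_state):
        # Every codeword request ages all bit generators.
        for i in range(len(self.counters)):
            if self.counters[i] > 0:
                self.counters[i] -= 1
                if self.counters[i] == 0:
                    self.state[i] = None           # clear the stale run state
        # The requesting generator starts a fresh codeword.
        self.counters[gen] = self.max_count
        self.state[gen] = new_state
```

With `address_bits=2`, a generator whose codeword is followed by three requests from other generators has its state cleared on the third request.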

Initialization of Memory for Each Context Bin

In cases where memory for each context bin holds probability estimation information, additional memory bandwidth may be required to initialize the decoder (e.g., the memory) very quickly. Initializing the decoder quickly can be a problem when the decoder has many contexts and they all need to be cleared. When the decoder supports many contexts (1K or more) and the memory cannot be globally cleared, an unacceptably large number of clock cycles would be required to clear the memory.

In order to clear contexts quickly, some examples use an extra bit, referred to herein as the initialized status bit, that is stored with each context. Thus, an extra bit is stored with the PEM state (e.g., 6 bits) for each context.

The memory for each context bin and the initialization control logic are shown in Figure 24. Referring to Figure 24, a context memory 2401 is shown coupled to a register 2402. In one example, the register 2402 comprises a one bit register that indicates the current proper state for the initialized status bit. The register 2402 is coupled to one input of XOR logic 2403. Another input to XOR logic 2403 is coupled to an output of the memory 2401.

The output of XOR logic 2403 is the valid signal and is coupled to an input of control logic 2404. Other inputs of control logic 2404 are coupled to the output of counter 2405 and the context bin signal. An output of control logic 2404 is coupled to the select inputs of MUXs 2406-2407 and to an input of counter 2405. Another output of control logic 2404 is coupled to the select input of MUX 2408. The inputs of MUX 2406 are coupled to the output of counter 2405 and the context bin indication. The output of MUX 2406 is coupled to the memory 2401. The inputs of MUX 2407 are coupled to the new PEM state and zero. The output of MUX 2407 is coupled to one input of the memory 2401. The output of memory 2401 and the initial PEM state are coupled to inputs of MUX 2408. The output of MUX 2408 is the PEM state out.

The value in register 2402 is complemented every occurrence of a decode operation (i.e., each data set, not each decoded bit). XOR logic 2403 compares the validity of the accessed memory location with the register value to determine whether the accessed memory location is valid for this decode operation. This is accomplished using XOR logic 2403 to check if the initialized status bit matches the proper state in register 2402. If the data in memory 2401 is not valid, then control logic 2404 causes the data to be ignored by the state to code logic and the initial PEM state to be used instead. This is accomplished using MUX 2408. When a new PEM state is written to memory, the initialized bit is set to the current value of the register so that it will be considered valid when accessed again.
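The validity check can be modelled in software. In this sketch the class `ContextMemory` and its method names are hypothetical. The loop in `start_next_operation` stands in for the background sweep that makes every entry's bit match the register before the register is complemented; in hardware that sweep is spread over otherwise unused write cycles rather than done in one pass.

```python
class ContextMemory:
    """PEM state per context plus an initialized status bit (illustrative)."""

    def __init__(self, num_contexts, initial_state=0):
        self.initial_state = initial_state
        self.register = 0   # current proper value of the initialized status bit
        self.entries = [(initial_state, 0)] * num_contexts  # (PEM state, bit)

    def read(self, ctx):
        state, bit = self.entries[ctx]
        valid = (bit ^ self.register) == 0       # the XOR comparison
        return state if valid else self.initial_state

    def write(self, ctx, state):
        # Writing tags the entry with the current register value.
        self.entries[ctx] = (state, self.register)

    def start_next_operation(self):
        # Ensure every entry matches the register, then complement the register.
        for ctx, (_, bit) in enumerate(self.entries):
            if bit != self.register:
                self.entries[ctx] = (self.initial_state, self.register)
        self.register ^= 1
```

After `start_next_operation()`, every context reads back as the initial PEM state until it is written again, without any PEM state having been rewritten for entries that were already tagged correctly.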

Every context bin memory entry must have its initialized status bit set to the current value of the register before another decode operation can begin. Counter 2405 steps through all memory locations to assure that they are initialized. Whenever a context bin is used, but its PEM state is not updated, the unused write cycle can be used to test or update the memory location pointed to by counter 2405. After a decode operation is complete, if counter 2405 has not reached the maximum value, the remaining contexts are initialized before beginning the next operation. The following logic is used to control operation.

    write_it = false;
    counter = 0;
    all_initialized = false;
    while (counter < maximum context bin + 1)
        read PEM state from context memory
        if ((counter == context bin read) and (write_it))
            write_it = false
            counter = counter + 1
        if (PEM state changed)
            write new PEM state
        else if (write_it)
            write initial PEM state to memory location "counter"
            counter = counter + 1
        else
            read memory location "counter"
            if (initialized bit in read location is in wrong state)
                write_it = true
            else
                counter = counter + 1
    all_initialized = true;
    while (decoding)
        read PEM state from context memory
        if (PEM state changed)
            write new PEM state

The PEM used may include an adaptation scheme to allow faster adaptation regardless of the amount of data available. By doing so, the decoding is allowed to adapt more quickly initially, and to adapt more slowly as more data is available, as a means for providing a more accurate estimate. Furthermore, the PEM may be fixed in a field programmable gate array (FPGA) or ASIC implementation of a PEM state table/machine.

Tables 20-25 below describe a number of probability estimation state machines. Some tables do not use R3 codes or do not use long codes, for reduced hardware cost. All tables except for Table 20 use "fast adapting" special states to quickly adapt at the start of coding until the first LPS occurs. These fast adaptation states are shown italicized in the tables. For instance, referring to Table 21, when decoding begins, the current state is state 0. If an MPS occurs, then the decoder transitions to state 35. As long as MPSs occur, the decoder transitions upward from state 35, eventually transitioning to state 28. If an LPS occurs at any time, the decoder transitions out of the bolded fast adapting states to a state that represents the correct probability state for the data that has been received thus far.

Note that for each table, after a certain number of MPSs have been received, the decoder transitions out of fast adapting states. In the desired example, once the fast adapting states have been exited, there is no mechanism to return to them, aside from restarting the decoding process. In other examples, the state table may be designed to re-enter these fast adapting states. By allowing faster adaptation, the present invention allows the decoder to arrive at the more skewed codes faster, thereby possibly benefiting from improved compression. Note that the fast adaptation can be eliminated for a particular table by changing the table entry for current state 0 such that the table transitions only one state up or down depending on the data input.

For all the tables, the data for each state is the code for that state, the next state on a positive update (up) and the next state on a negative update (down). Asterisks indicate states where the MPS must be changed on a negative update.
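A state table of this form maps directly onto a lookup structure. The sketch below uses only the first four rows of Table 20; the dictionary layout and the function name `next_state` are illustrative choices, and the MPS switch marked by asterisks is not modelled.

```python
# First four rows of Table 20: state -> (code, up next state, down next state)
TABLE_20_EXCERPT = {
    0: ("r2(0)", 1, 0),
    1: ("r2(0)", 2, 0),
    2: ("r2(0)", 3, 1),
    3: ("r2(0)", 4, 2),
}


def next_state(table, state, positive_update):
    """Return the next state after a positive (up) or negative (down) update."""
    _code, up, down = table[state]
    return up if positive_update else down
```

Starting from state 0, two positive updates lead to state 2, and a following negative update leads back to state 1.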

Table 20

Current state  Code   Up next state  Down next state
0              r2(0)  1              0
1              r2(0)  2              0
2              r2(0)  3              1
3              r2(0)  4              2
4              r2(0)  5              3
5              r2(0)  6              4
6              r2(1)  7              5
7              r2(1)  8              6
8              r2(1)  9              7
9              r2(1)  10             8
10             r2(1)  11             9
11             r2(1)  12             10
12             r3(1)  13             11
13             r3(1)  14             12
14             r3(1)  15             13
15             r2(2)  16             14
16             r3(2)  17             15
17             r2(3)  18             16
18             r3(3)  19             17
19             r2(4)  20             18
20             r3(4)  21             19
21             r2(5)  22             20
22             r3(5)  23             21
23             r2(6)  24             22
24             r3(6)  25             23
25             r2(7)  26             24
26             r3(7)  27             25
27             r2(8)  28             26
28             r3(8)  29             27
29             r2(9)  30             28
30             r3(9)  31             29
31             r2(10) 32             30
32             r3(10) 33             31
33             r2(11) 34             32
34             r3(11) 34             33
* Switch to MPS

Table 21

Current state  Code   Up next state  Down next state
0              r2(0)  35             35
1              r2(0)  2              1
2              r2(0)  3              1
3              r2(0)  4              2
4              r2(0)  5              3
5              r2(0)  6              4
6              r2(1)  7              5
7              r2(1)  8              6
8              r2(1)  9              7
9              r2(1)  10             8
10             r2(1)  11             9
11             r2(1)  12             10
12             r3(1)  13             11
13             r3(1)  14             12
14             r3(1)  15             13
15             r2(2)  16             14
16             r3(2)  17             15
17             r2(3)  18             16
18             r3(3)  19             17
19             r2(4)  20             18
20             r3(4)  21             19
21             r2(5)  22             20
22             r3(5)  23             21
23             r2(6)  24             22
24             r3(6)  25             23
25             r2(7)  26             24
26             r3(7)  27             25
27             r2(8)  28             26
28             r3(8)  29             27
29             r2(9)  30             28
30             r3(9)  31             29
31             r2(10) 32             30
32             r3(10) 33             31
33             r2(11) 34             32
34             r3(11) 34             33
35             r2(0)  36             1
36             r2(1)  37             2
37             r2(2)  38             4
38             r2(3)  39             6
39             r2(4)  40             10
40             r2(5)  41             16
41             r2(6)  42             19
42             r2(7)  43             22
43             r2(8)  28             25
* Switch to MPS

Table 22

Current state  Code   Up next state  Down next state
0              r2(0)  35             35
1              r2(0)  2              1
2              r2(0)  3              1
3              r2(0)  4              2
4              r2(0)  5              3
5              r2(0)  6              4
6              r2(1)  7              5
7              r2(1)  8              6
8              r2(1)  9              7
9              r2(1)  10             8
10             r2(1)  11             9
11             r2(1)  12             10
12             r2(1)  13             11
13             r2(2)  14             12
14             r2(2)  15             13
15             r2(2)  16             14
16             r2(2)  17             15
17             r2(3)  18             16
18             r2(3)  19             17
19             r2(4)  20             18
20             r2(4)  21             19
21             r2(5)  22             20
22             r2(5)  23             21
23             r2(6)  24             22
24             r2(6)  25             23
25             r2(7)  26             24
26             r2(7)  27             25
27             r2(8)  28             26
28             r2(8)  29             27
29             r2(9)  30             28
30             r2(9)  31             29
31             r2(10) 32             30
32             r2(10) 33             31
33             r2(11) 33             32
35             r2(0)  36             1
36             r2(1)  37             2
37             r2(2)  38             4
38             r2(3)  39             6
39             r2(4)  40             10
40             r2(5)  41             16
41             r2(6)  42             19
42             r2(7)  43             22
43             r2(8)  28             25
* Switch MPS

Table 23

Current state  Code   Up next state  Down next state
0              r2(0)  35             35
1              r2(0)  2              1
2              r2(0)  3              1
3              r2(0)  4              2
4              r2(0)  5              3
5              r2(0)  6              4
6              r2(1)  7              5
7              r2(1)  8              6
8              r2(1)  9              7
9              r2(1)  10             8
10             r2(1)  11             9
11             r2(1)  12             10
12             r3(1)  13             11
13             r3(1)  14             12
14             r3(1)  15             13
15             r2(2)  16             14
16             r3(2)  17             15
17             r2(3)  18             16
18             r3(3)  19             17
19             r2(4)  20             18
20             r3(4)  21             19
21             r2(5)  22             20
22             r2(5)  23             21
23             r2(6)  24             22
24             r2(6)  25             23
25             r2(7)  26             24
26             r2(7)  27             25
27             r2(8)  28             26
28             r2(8)  29             27
29             r2(9)  30             28
30             r2(9)  31             29
31             r2(10) 32             30
32             r2(10) 33             31
33             r2(11) 34             32
34             r2(11) 34             33
35             r2(0)  36             1
36             r2(1)  37             2
37             r2(2)  38             4
38             r2(3)  39             6
39             r2(4)  40             10
40             r2(5)  41             16
41             r2(6)  42             19
42             r2(7)  43             22
43             r2(8)  28             25
* Switch MPS

Table 24

Current state  Code   Up next state  Down next state
0              r2(0)  35             35
1              r2(0)  2              1
2              r2(0)  3              1
3              r2(0)  4              2
4              r2(0)  5              3
5              r2(0)  6              4
6              r2(1)  7              5
7              r2(1)  8              6
8              r2(1)  9              7
9              r2(1)  10             8
10             r2(1)  11             9
11             r2(1)  12             10
12             r3(1)  13             11
13             r3(1)  14             12
14             r3(1)  15             13
15             r2(2)  16             14
16             r3(2)  17             15
17             r2(3)  18             16
18             r3(3)  19             17
19             r2(4)  20             18
20             r3(4)  21             19
21             r2(5)  22             20
22             r3(5)  23             21
23             r2(6)  24             22
24             r3(6)  25             23
25             r2(7)  26             24
26             r2(7)  27             25
27             r2(7)  27             26
35             r2(0)  36             1
36             r2(1)  37             2
37             r2(2)  38             4
38             r2(3)  39             6
39             r2(4)  40             10
40             r2(5)  41             16
41             r2(6)  42             19
42             r2(7)  25             22
* Switch MPS

Table 25

Current state  Code   Up next state  Down next state
0              r2(0)  35             35
1              r2(0)  2              1
2              r2(0)  3              1
3              r2(0)  4              2
4              r2(0)  5              3
5              r2(0)  6              4
6              r2(1)  7              5
7              r2(1)  8              6
8              r2(1)  9              7
9              r2(1)  10             8
10             r2(1)  11             9
11             r2(1)  12             10
12             r2(1)  13             11
13             r2(2)  14             12
14             r2(2)  15             13
15             r2(2)  16             14
16             r2(2)  17             15
17             r2(3)  18             16
18             r2(3)  19             17
19             r2(4)  20             18
20             r2(4)  21             19
21             r2(5)  22             20
22             r2(5)  23             21
23             r2(6)  24             22
24             r2(6)  25             23
25             r2(7)  26             24
26             r2(7)  27             25
27             r2(7)  28             26
28             r2(7)  28             27
35             r2(0)  36             1
36             r2(1)  37             2
37             r2(2)  38             4
38             r2(3)  39             6
39             r2(4)  40             10
40             r2(5)  41             16
41             r2(6)  42             19
42             r2(7)  25             22
43             r2(8)  28             25
* Switch MPS

Adding fast adaptation to probability estimation only helps at the start of coding. Other methods can be used to improve adaptation during coding when the statistics of a context bin change more rapidly than the previously described PEM state tables can track.

One method of maintaining fast adaptation throughout coding is to add an acceleration term to the PEM state update. This acceleration could be incorporated into a PEM state table by repeating every code a constant number of times (e.g., 8). Then an acceleration term M (e.g., a positive integer) can be added or subtracted from the current state when updating.

When M is 1, the system operates the same as one without acceleration and the slowest adaptation occurs. When M is greater than 1, faster adaptation occurs. Initially, M may be set to some value greater than 1 to provide an initial fast adaptation.
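The effect of the acceleration term can be sketched with a few lines. The function names are hypothetical, and clamping the state to the range 0..max_state is an assumption made so the example is self-contained; the handling of the MPS switch at the bottom of the table is not modelled.

```python
def accelerated_update(state, positive_update, m, max_state):
    """Move M states up or down, clamped to the table bounds (assumed)."""
    step = m if positive_update else -m
    return max(0, min(max_state, state + step))


def code_index(state, repeat=8):
    """Index of the code used in a table where every code is repeated."""
    return state // repeat
```

With every code repeated 8 times, M = 1 needs up to 8 consecutive updates in one direction to move to the next code, while M = 4 needs at most 2.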

One method for updating the value of M is based on the number of consecutive codewords. For instance, if a predetermined number of codewords occurred consecutively, then the value of M is increased. For instance, if four consecutive codewords are "0" "0" "0" "0" or "1N" "1N" "1N" "1N", then the value of M is increased. On the other hand, a pattern of switching between "0" and "1N" codewords may be used to decrease the value of M. For instance, if four consecutive codewords are "0" "1N" "0" "1N" or "1N" "0" "1N" "0", then the value of M is decreased.
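The pattern test on four consecutive codewords can be sketched as follows. The function name `adjust_m`, the step size of 1 and the bounds `m_min` and `m_max` are assumptions; the text only states that M is increased or decreased. The three-element history corresponds to three stored bits, one per previous codeword type.

```python
def adjust_m(m, history, current, m_min=1, m_max=8):
    """history: the three previous codeword types, each "0" or "1N"."""
    window = list(history) + [current]
    if all(c == window[0] for c in window):
        return min(m_max, m + 1)    # four identical codewords: accelerate
    if all(window[i] != window[i + 1] for i in range(3)):
        return max(m_min, m - 1)    # strictly alternating: decelerate
    return m
```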

Another method of acceleration uses state tables in which each code is repeated S times, where S is a positive integer. S is an inverse acceleration parameter. When S is one, adaptation is fast, and when S is larger, adaptation is slower. The value of S can be initially set to 1 to provide initial fast adaptation. Using a similar method to the one described above, the value of S may be updated when four consecutive codewords are "0" "0" "0" "0" or "1N" "1N" "1N" "1N". In such a case, the value of S is decreased. In contrast, if four consecutive codewords are "0" "1N" "0" "1N" or "1N" "0" "1N" "0", then the value of S is increased.

The definition of consecutive codewords can have several meanings. In a "by context" system, consecutive codewords may refer to consecutive codewords in one context bin. In a "by probability" system, consecutive codewords may refer to consecutive codewords in one probability class.

Alternatively, in either system consecutive codewords may refer to consecutive codewords globally (without regard to context bin or probability class). For these three examples, the number of bits of storage required to maintain a history of codewords is 3 x number_of_context_bins, 3 x number_of_probability_classes, and 3, respectively. Maintaining acceleration for each context bin might provide the best adaptation. Since poor tracking is often due to a global change in the uncoded data, determining acceleration globally might also provide good adaptation.

System Applications

One virtue of any compression system is to reduce storage requirements for a set of data. The parallel system of the present invention may be substituted for any application currently fulfilled by a lossless coding system, and may be applied to systems operating on audio, text, databases, computer executables, or other digital data, signals or symbols. Exemplary lossless coding systems include facsimile compression, database compression, compression of bitmap graphic images, and compression of transform coefficients in image compression standards such as JPEG and MPEG. The present invention allows small efficient hardware implementations and relatively fast software implementations, making it a good choice even for applications that do not require high speed.

The real virtue that the present invention has over the prior art is the possibility of operation at very high speeds, especially for decoding. In this manner, the present invention can make full use of expensive high speed channels, such as high speed computer networks, satellite and terrestrial broadcast channels. Figure 28 illustrates such a system, wherein broadcast data or a high speed computer network supplies data to decoding system 2801, which decodes the data in parallel to produce output data. Current hardware entropy coders (such as the Q-Coder) would slow the throughput of these systems. All of these systems are designed, at great cost, to have high bandwidth. It is counterproductive to have a decoder slow the throughput. The parallel system of the present invention not only accommodates these high bandwidths, it actually increases the effective bandwidth because the data can be transmitted in a compressed form.

The parallel system of the present invention is also applicable to obtaining more effective bandwidth out of moderately fast channels like ISDN, CD-ROM, and SCSI. Such a bandwidth matching system is shown in Figure 29, wherein data from sources such as a CD-ROM, Ethernet, Small Computer System Interface (SCSI), or other similar source is coupled to decoding system 2901, which receives and decodes the data to produce an output. These channels are still faster than some current coders. Often these channels are used to service a data source that requires more bandwidth than the channel has, such as real-time video or computer based multimedia. The system of the present invention can perform the role of bandwidth matching.

The system of the present invention is an excellent choice for the entropy coder part of a real-time video system like the High Definition Television (HDTV) and the MPEG video standards. Such a system is shown in Figure 30. Referring to Figure 30, the real-time video system includes decoding system 3001, which is coupled to compressed image data. System 3001 decodes the data and outputs it to lossy decoder 3002. Lossy decoder 3002 could be the transform, color conversion and subsampling portion of an HDTV or MPEG decoder. Monitor 3003 may be a television or video monitor.

Whereas many alterations and modifications of the present invention will no doubt become apparent to a person of ordinary skill in the art after having read the foregoing description, it is to be understood that the particular embodiment shown and described by way of illustration is in no way intended to be considered limiting. Therefore, references to details of the preferred embodiment are not intended to limit the scope of the claims which in themselves recite only those features regarded as essential to the invention.

Thus, a method and apparatus for parallel decoding and encoding of data has been described.

Attention is drawn to the following UK patent applications:

Patent Application Number 9518375.2 (Publication Number GB 2 293 735), from which the present application is divided, which claims aspects of an encoding method and encoding system using an encoder, context model and state memory, reorder unit and reorder memory;
Patent Application Number (Publication Number GB ), a further divisional of Application Number 9518375.2, which claims aspects of the encoding and decoding systems described above incorporating fast adaption of states and acceleration;
Patent Application Number (Publication Number GB ), a further divisional of Application Number 9518375.2, which claims aspects of a decoder for decoding a plurality of interleaved words, comprising a variable length shifting mechanism, a run length decoder, a probability estimation machine and a plurality of registers; and
Patent Application Number (Publication Number GB ), a further divisional of Application Number 9518375.2, which claims aspects of a decoding method employing a counter associated with each run counter which is loaded with the count value corresponding to the size of codeword memory used during encoding.


Claims (13)

C L A I M S
1. A decoder for decoding coded data, said decoder comprising:
a context modelling mechanism for providing contexts, wherein the context modelling mechanism comprises a plurality of integrated circuits;
a memory coupled to the context model for storing state information, wherein the memory provides state information in response to each context provided by the context model; and
a plurality of decoders coupled to the memory for decoding codewords using the state information from the memory, wherein the plurality of decoders decode codewords using a plurality of R-codes, wherein the plurality of R-codes include at least one non-maximum length run of most probable symbols that is not followed by a least probable symbol.
2. The decoder defined in claim 1 wherein non-maximum length run counts have a uniquely decodeable prefix.
3. A system for decoding a code stream having a plurality of codewords, said system comprising:
a context modelling mechanism for providing contexts, wherein the context modelling mechanism comprises a plurality of integrated circuits;
a memory coupled to the context model for storing state information, wherein the memory provides state information in response to each context provided by the context model; and
a plurality of decoders coupled to the memory for decoding codewords using the state information from the memory.
4. The system defined in claim 3 wherein the context modelling mechanism comprises at least one context model providing contexts from one of the plurality of integrated circuits and at least one context model providing contexts from a second of the plurality of integrated circuits.
5. The system defined in claim 4, wherein said at least one context model on said one of the plurality of integrated circuits comprises a zero order context model.
6. The system defined in claim 3 wherein contexts from the plurality of integrated circuits are provided directly to the memory.
7. The system defined in claim 3 wherein a first portion of a first context is provided by one integrated circuit and a second portion of the first context is provided by a second integrated circuit.
8. A system for decoding a code stream having a plurality of codewords, said system comprising:
a context modelling mechanism for providing contexts;
a memory coupled to the context model for storing state information, wherein the memory provides state information in response to each context provided by the context model; and
a plurality of decoders coupled to the memory for decoding codewords using the state information from the memory, wherein at least one of the plurality of decoders comprises a delay tolerant decoder.
9. The system defined in claim 8, wherein at least one of the plurality of decoders performs variable length shifting based on decoded data that is available after a delay.
10. The system defined in claim 8, wherein each of the plurality of decoders receives variable length data as input.
11. The system defined in claim 10, wherein the plurality of decoders decode variable length input data in parallel.
12. The system defined in claim 8, wherein output of the plurality of decoders is divided into fixed length interleaved words.
13. A decoder or system for decoding a code stream according to any one of the preceding claims and substantially as described herein with reference to Figures 17 and 18.
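The arrangement recited in claims 3 and 8 (a context modelling mechanism, a memory that returns state information for each context, and a plurality of decoders that use that state) can be illustrated with a minimal software sketch. Everything named below is hypothetical and chosen only for illustration: the `State` fields, the rule that the context is the previously decoded bit, the assignment of contexts to decoders, and the one-bit "codewords" are assumptions and are not taken from the specification, which describes hardware.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class State:
    """Hypothetical state information stored per context."""
    decoder_id: int  # which of the decoders handles this context (assumption)
    mps: int         # most probable symbol for this context, 0 or 1 (assumption)


class StateMemory:
    """Memory that provides state information in response to a context."""

    def __init__(self, num_contexts: int, num_decoders: int) -> None:
        # Illustrative initialisation: contexts are spread over the decoders.
        self._states = [
            State(decoder_id=c % num_decoders, mps=0) for c in range(num_contexts)
        ]

    def lookup(self, context: int) -> State:
        return self._states[context]


class ToyDecoder:
    """Stand-in decoder: each toy codeword is one flag, 1 = output the MPS."""

    def __init__(self, codewords: Iterable[int]) -> None:
        self._codewords = iter(codewords)

    def decode(self, state: State) -> int:
        hit = next(self._codewords)
        return state.mps if hit else 1 - state.mps


def context_model(history: List[int]) -> int:
    """Hypothetical context: the previously decoded bit (0 if none yet)."""
    return history[-1] if history else 0


def decode_stream(memory: StateMemory, decoders: List[ToyDecoder], n: int) -> List[int]:
    """Decode n bits: context -> memory lookup -> selected decoder -> bit."""
    out: List[int] = []
    for _ in range(n):
        state = memory.lookup(context_model(out))
        out.append(decoders[state.decoder_id].decode(state))
    return out


memory = StateMemory(num_contexts=2, num_decoders=2)
decoders = [ToyDecoder([1, 0, 1]), ToyDecoder([0, 1])]
result = decode_stream(memory, decoders, 4)
```

The sketch runs the decoders one after another; the parallel operation, delay tolerance and interleaved fixed length words of claims 8 to 12 are not modelled here.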
GB9624754A 1994-09-30 1995-09-07 Apparatus for decoding data Expired - Fee Related GB2306868B (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
US31611694A 1994-09-30 1994-09-30
GB9518375A GB2293735B (en) 1994-09-30 1995-09-07 Method and apparatus for encoding data

Publications (3)

Publication Number Publication Date
GB9624754D0 GB9624754D0 (en) 1997-01-15
GB2306868A GB2306868A (en) 1997-05-07
GB2306868B GB2306868B (en) 1997-10-22

Family

ID=26307711

Family Applications (4)

Application Number Title Priority Date Filing Date
GB9624358A Expired - Fee Related GB2306280B (en) 1994-09-30 1995-09-07 A coding system and entropy decoder
GB9624640A Expired - Fee Related GB2306281B (en) 1994-09-30 1995-09-07 Method for decoding data
GB9624357A Expired - Fee Related GB2306279B (en) 1994-09-30 1995-09-07 Apparatus for decoding data
GB9624754A Expired - Fee Related GB2306868B (en) 1994-09-30 1995-09-07 Apparatus for decoding data

Family Applications Before (3)

Application Number Title Priority Date Filing Date
GB9624358A Expired - Fee Related GB2306280B (en) 1994-09-30 1995-09-07 A coding system and entropy decoder
GB9624640A Expired - Fee Related GB2306281B (en) 1994-09-30 1995-09-07 Method for decoding data
GB9624357A Expired - Fee Related GB2306279B (en) 1994-09-30 1995-09-07 Apparatus for decoding data

Country Status (1)

Country Link
GB (4) GB2306280B (en)

Cited By (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2017074539A1 (en) * 2015-10-28 2017-05-04 Qualcomm Incorporated Parallel arithmetic coding techniques
WO2017161272A1 (en) * 2016-03-18 2017-09-21 Oracle International Corporation Run length encoding aware direct memory access filtering engine for scratchpad-enabled multi-core processors
US9886459B2 (en) 2013-09-21 2018-02-06 Oracle International Corporation Methods and systems for fast set-membership tests using one or more processors that support single instruction multiple data instructions
US10025823B2 (en) 2015-05-29 2018-07-17 Oracle International Corporation Techniques for evaluating query predicates during in-memory table scans
US10061714B2 (en) 2016-03-18 2018-08-28 Oracle International Corporation Tuple encoding aware direct memory access engine for scratchpad enabled multicore processors
US10061832B2 (en) 2016-11-28 2018-08-28 Oracle International Corporation Database tuple-encoding-aware data partitioning in a direct memory access engine
US10067954B2 (en) 2015-07-22 2018-09-04 Oracle International Corporation Use of dynamic dictionary encoding with an associated hash table to support many-to-many joins and aggregations
US10176114B2 (en) 2016-11-28 2019-01-08 Oracle International Corporation Row identification number generation in database direct memory access engine
US10380058B2 (en) 2016-09-06 2019-08-13 Oracle International Corporation Processor core to coprocessor interface with FIFO semantics
US10402425B2 (en) 2016-03-18 2019-09-03 Oracle International Corporation Tuple encoding aware direct memory access engine for scratchpad enabled multi-core processors

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6222468B1 (en) 1998-06-04 2001-04-24 Ricoh Company, Ltd. Adaptive coding with adaptive speed
GB2356508B (en) 1999-11-16 2004-03-17 Sony Uk Ltd Data processor and data processing method

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5381145A (en) * 1993-02-10 1995-01-10 Ricoh Corporation Method and apparatus for parallel decoding and encoding of data
GB2285374A (en) * 1993-12-23 1995-07-05 Ricoh Kk Parallel encoding and decoding of data

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CA1291820C (en) * 1986-09-15 1991-11-05 William B. Pennebaker Probability estimation based on decision history
CA1291821C (en) * 1986-09-15 1991-11-05 Glen G. Langdon, Jr. Arithmetic coding encoder and decoder system
KR960006827Y1 (en) * 1990-03-31 1996-08-08 구자홍 A multipurpose plat of a front panel
US5272478A (en) * 1992-08-17 1993-12-21 Ricoh Corporation Method and apparatus for entropy coding
US5475388A (en) * 1992-08-17 1995-12-12 Ricoh Corporation Method and apparatus for using finite state machines to perform channel modulation and error correction and entropy coding
JP3220598B2 (en) * 1994-08-31 2001-10-22 三菱電機株式会社 The variable length code table and a variable length coding apparatus

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5381145A (en) * 1993-02-10 1995-01-10 Ricoh Corporation Method and apparatus for parallel decoding and encoding of data
GB2285374A (en) * 1993-12-23 1995-07-05 Ricoh Kk Parallel encoding and decoding of data

Cited By (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9886459B2 (en) 2013-09-21 2018-02-06 Oracle International Corporation Methods and systems for fast set-membership tests using one or more processors that support single instruction multiple data instructions
US10216794B2 (en) 2015-05-29 2019-02-26 Oracle International Corporation Techniques for evaluating query predicates during in-memory table scans
US10025823B2 (en) 2015-05-29 2018-07-17 Oracle International Corporation Techniques for evaluating query predicates during in-memory table scans
US10067954B2 (en) 2015-07-22 2018-09-04 Oracle International Corporation Use of dynamic dictionary encoding with an associated hash table to support many-to-many joins and aggregations
WO2017074539A1 (en) * 2015-10-28 2017-05-04 Qualcomm Incorporated Parallel arithmetic coding techniques
US10419772B2 (en) 2015-10-28 2019-09-17 Qualcomm Incorporated Parallel arithmetic coding techniques
US10061714B2 (en) 2016-03-18 2018-08-28 Oracle International Corporation Tuple encoding aware direct memory access engine for scratchpad enabled multicore processors
US10055358B2 (en) 2016-03-18 2018-08-21 Oracle International Corporation Run length encoding aware direct memory access filtering engine for scratchpad enabled multicore processors
WO2017161272A1 (en) * 2016-03-18 2017-09-21 Oracle International Corporation Run length encoding aware direct memory access filtering engine for scratchpad-enabled multi-core processors
US10402425B2 (en) 2016-03-18 2019-09-03 Oracle International Corporation Tuple encoding aware direct memory access engine for scratchpad enabled multi-core processors
US10380058B2 (en) 2016-09-06 2019-08-13 Oracle International Corporation Processor core to coprocessor interface with FIFO semantics
US10176114B2 (en) 2016-11-28 2019-01-08 Oracle International Corporation Row identification number generation in database direct memory access engine
US10061832B2 (en) 2016-11-28 2018-08-28 Oracle International Corporation Database tuple-encoding-aware data partitioning in a direct memory access engine

Also Published As

Publication number Publication date
GB2306279B (en) 1997-10-22
GB2306868B (en) 1997-10-22
GB2306281B (en) 1997-10-22
GB9624358D0 (en) 1997-01-08
GB9624357D0 (en) 1997-01-08
GB9624640D0 (en) 1997-01-15
GB2306281A (en) 1997-04-30
GB2306280A (en) 1997-04-30
GB9624754D0 (en) 1997-01-15
GB2306280B (en) 1997-10-22
GB2306279A (en) 1997-04-30

Similar Documents

Publication Publication Date Title
US4935882A (en) Probability adaptation for arithmetic coders
US5838597A (en) MPEG-2 decoding with a reduced RAM requisite by ADPCM recompression before storing MPEG-2 decompressed data
US6166664A (en) Efficient data structure for entropy encoding used in a DWT-based high performance image compression
CN101480054B (en) Hardware-based cabac decoder with parallel binary arithmetic decoding
US6624762B1 (en) Hardware-based, LZW data compression co-processor
US7079057B2 (en) Context-based adaptive binary arithmetic coding method and apparatus
EP2559166B1 (en) Probability interval partioning encoder and decoder
US5179378A (en) Method and apparatus for the compression and decompression of data using Lempel-Ziv based techniques
US5325092A (en) Huffman decoder architecture for high speed operation and reduced memory
US5436626A (en) Variable-length codeword encoder
CN1044183C (en) Compression of palettized image and binarization for bitwise coding of m-aray alphbets therefor
US4494108A (en) Adaptive source modeling for data file compression within bounded memory
JP2870515B2 (en) Variable-length encoding device
EP0663774A2 (en) Adaptive bit stream demultiplexing apparatus in a decoding system
US7777654B2 (en) System and method for context-based adaptive binary arithematic encoding and decoding
EP0568305A2 (en) Data compression
KR101955143B1 (en) Entropy encoding and decoding scheme
US5289577A (en) Process-pipeline architecture for image/video processing
EP0589682B1 (en) Variable length code decoder
US5226082A (en) Variable length decoder
US7660352B2 (en) Apparatus and method of parallel processing an MPEG-4 data stream
US5351047A (en) Data decoding method and apparatus
EP0154860B1 (en) Model driven data compression/decompression system for data transfer
US5623423A (en) Apparatus and method for video decoding
US6263019B1 (en) Variable rate MPEG-2 video syntax processor

Legal Events

Date Code Title Description
PCNP Patent ceased through non-payment of renewal fee

Effective date: 20070907