METHOD AND APPARATUS OF TRANSMITTING AND RECEIVING VARIABLE BIT RATE STREAMS
BACKGROUND
A motion picture program includes video, audio, and data streams. Since video and audio streams occupy most of the bandwidth in a motion picture program, efficient distribution of a motion picture program requires audio and video compression. Well-known compression techniques, for example, MPEG and JPEG, can reduce the necessary bandwidth by a factor of ten or more.
The International Organization for Standardization (ISO) has adopted a standard (ISO/IEC 13818-1) that addresses the combining of one or more "elementary streams" of video and audio, as well as other data, into single or multiple streams suitable for storage or transmission. The ISO/IEC 13818-1 standard, hereinafter referred to as the MPEG-2 standard, is described in detail in the ISO draft document "Generic Coding of Moving Pictures and Associated Audio", ISO/IEC JTC1/SC29/WG11 N0801 (13 November 1994).
The MPEG-2 standard defines an individual coded video, audio or other coded bitstream as an "elementary stream". The contents of an elementary stream may be broken into a sequence of discrete units, in which case the elementary stream is structured as a Packetized Elementary Stream (PES). The individual units, or packets, are known as PES packets, which can be of large and variable size. The MPEG-2 standard defines generic structures for PES packet formats and specifies particular rules for creating PESs from digital video and audio elementary streams.
The MPEG-2 standard defines two methods of creating a multiplex of PESs. In a Program Stream (PS), all components in the multiplex are assumed to belong to a single "Program", that is, a collection of elementary streams which may sensibly be presented as a unity to a user, all components being referenced to a common time base, together with certain coordinating control information. PES packets from component PESs are multiplexed by PES packet.
In a Transport Stream (TS), the components of the multiplex may belong to many programs. Each transport stream packet is assigned a "packet identifier" (PID). A sequence of packets identified by the same value of the PID field generally represents a single service component, typically a video or an audio component, or a user data component. The PES packets are broken into small, fixed-size units (188 bytes) called transport packets, which may be multiplexed with transport packets from other PESs. The syntax for Transport Stream packets is shown in FIG. 1.
The Transport Stream is transmitted at a transport rate which is sufficient to accommodate the bandwidth requirements of all components carried within the Transport Stream. Since the transport rate may, either momentarily or in aggregate, exceed the bandwidth requirements of the constituent components, the MPEG-2 standard reserves PID 0x1FFF as the "null PID". Packets with this PID are "null packets" and do not carry any component. An MPEG decoder can discard them with impunity. Certain PID numbers may also be reserved for user data. The protocol inside the user data packet can be standardized or proprietary.
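By way of illustration only, the following Python sketch shows how the 13-bit PID may be read from the 4-byte header of a 188-byte transport packet and how a null packet may be recognized or constructed. The sync byte value 0x47 and the header bit layout are taken from the MPEG-2 standard rather than from this description; the function names and the 0xFF stuffing of the null packet payload are assumptions chosen for the example.

```python
TS_PACKET_SIZE = 188   # fixed transport packet size in bytes
SYNC_BYTE = 0x47       # first byte of every transport packet (MPEG-2 standard)
NULL_PID = 0x1FFF      # PID reserved for null packets


def parse_pid(packet: bytes) -> int:
    """Return the 13-bit PID carried in bytes 1 and 2 of the header."""
    if len(packet) != TS_PACKET_SIZE or packet[0] != SYNC_BYTE:
        raise ValueError("not a 188-byte transport packet")
    # Low 5 bits of byte 1 are the top of the PID, byte 2 is the rest.
    return ((packet[1] & 0x1F) << 8) | packet[2]


def is_null_packet(packet: bytes) -> bool:
    """True when the packet carries the null PID and may be discarded."""
    return parse_pid(packet) == NULL_PID


def make_null_packet() -> bytes:
    """Build a null packet; the 0xFF payload fill is an assumption."""
    # 0x10 in byte 3: payload only, continuity counter 0.
    header = bytes([SYNC_BYTE, 0x1F, 0xFF, 0x10])
    return header + b"\xff" * (TS_PACKET_SIZE - len(header))
```

A demultiplexer built along these lines would route each packet by the value returned from `parse_pid` and drop those for which `is_null_packet` is true.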
An MPEG-2 TS can also be carried by other lower layer network protocols in an Open System Interconnect (OSI) architecture. Popular lower layer network protocols used to carry MPEG-2 include Asynchronous Transfer Mode (ATM) and Internet Protocol (IP).
The TS packet can also contain an adaptation field. One of the most important types of information carried in an adaptation field is the Program Clock Reference (PCR) time stamp derived from a System Time Clock. The MPEG-2 standard specifies that PCR time stamps are to be carried periodically at intervals of no more than 100 ms.
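For illustration, a PCR time stamp may be read from the adaptation field roughly as sketched below in Python. The field layout used here (a 33-bit base counted at 90 kHz, six reserved bits, and a 9-bit extension, combined as base x 300 + extension to give 27 MHz ticks) comes from the MPEG-2 standard and is not spelled out in this description; the function name is hypothetical.

```python
from typing import Optional


def extract_pcr(packet: bytes) -> Optional[int]:
    """Return the PCR in 27 MHz ticks, or None if the packet has no PCR."""
    if len(packet) != 188 or packet[0] != 0x47:
        raise ValueError("not a 188-byte transport packet")
    adaptation_field_control = (packet[3] >> 4) & 0x3
    if adaptation_field_control not in (2, 3):
        return None                      # no adaptation field present
    adaptation_field_length = packet[4]
    # One flags byte plus six PCR bytes are needed, and PCR_flag must be set.
    if adaptation_field_length < 7 or not (packet[5] & 0x10):
        return None
    raw = int.from_bytes(packet[6:12], "big")   # 48 bits in total
    base = raw >> 15                            # upper 33 bits, 90 kHz units
    extension = raw & 0x1FF                     # lower 9 bits, 27 MHz units
    return base * 300 + extension
```

Dividing the returned value by 27,000,000 gives the time stamp in seconds.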
The System Clock Frequency of the System Time Clock in MPEG-2 is specified as 27 MHz. The system clock is used for the synchronization of audio and video access units. An audio access unit is the coded representation of an audio frame. In the case of video, a video access unit includes all coded data for a picture and any stuffing that may follow it. During compression, decoding time stamps and presentation time stamps are attached to each access unit for use by an MPEG-2 decoder. Both decoding time stamps and presentation time stamps are derived using the System Clock Frequency.
In an MPEG-2 decoder, PCR time stamps are used to recover the System Clock Frequency. By reading the decoding time stamps and presentation time stamps, audio and video data can be decoded and scheduled to be presented accordingly. By synchronizing the respective 27 MHz System Time Clocks in the MPEG-2 encoder and decoder, and by conforming to the MPEG standard decoder buffer model, the probability of overflow or underflow in video and audio buffers of the decoder can be eliminated.
The generation of PCR time stamps in the MPEG-2 encoder and the recovery of the System Clock Frequency in the MPEG-2 decoder rely on a basic assumption that the Transport Stream is a Constant Bit Rate (CBR) data stream. Based on this assumption, the difference between a retrieved PCR time stamp and a local counter at the decoder is considered solely due to the frequency offset. Therefore, the System Clock Frequency at the decoder can be synchronized with that of the encoder. However, if the corresponding Transport Stream is a Variable Bit Rate (VBR) data stream, the time uncertainty combined with frequency uncertainty can result in the failure of frequency synchronization between the respective 27 MHz System Time Clocks.
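The effect of the CBR assumption can be illustrated with a small Python calculation. The sketch below estimates the frequency offset between encoder and decoder from two consecutive PCR values and the decoder's local counter readings at their arrival; the function name and the sample numbers are hypothetical, and counter wraparound is ignored for brevity.

```python
def estimate_offset_ppm(pcr_prev: int, pcr_now: int,
                        local_prev: int, local_now: int) -> float:
    """Apparent encoder-minus-decoder frequency offset in parts per million.

    All arguments are counts of 27 MHz ticks. The estimate is only
    meaningful if the delay between encoder and decoder is constant.
    """
    d_pcr = pcr_now - pcr_prev          # ticks elapsed at the encoder
    d_local = local_now - local_prev    # ticks elapsed at the decoder
    return (d_pcr - d_local) / d_local * 1e6


# Constant delay: both counters advance by 100 ms worth of ticks.
cbr_case = estimate_offset_ppm(0, 2_700_000, 0, 2_700_000)

# Variable delay: the second PCR arrives 1 ms (27,000 ticks) late, so the
# decoder sees a large apparent offset although the clocks are identical.
vbr_case = estimate_offset_ppm(0, 2_700_000, 0, 2_727_000)
```

In the second case the delay variation is indistinguishable from a frequency error, which is the failure mode described above.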
In reality, the compressed video is usually a VBR data stream. For example, fast moving scenes may require a higher data rate than slowly moving ones after compression. In other words, the output data rate of a compressed video signal is usually content dependent. In order to maintain a CBR data stream, MPEG-2 encoders usually insert null TS packets to stuff a VBR data stream into a CBR output.
However, a VBR data stream is often preferred in some communication links. Transmitting several VBR data streams in one communication link can take advantage of statistical multiplexing and reduce the overall bandwidth. If all VBR data streams are first stuffed into CBR data streams, the total required bandwidth is higher than that achieved by statistically multiplexing VBR data streams.
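The bandwidth argument can be made concrete with a short Python calculation. The per-interval bit rates below are invented for the example and do not come from any measurement; the function names are likewise hypothetical.

```python
# Hypothetical bit rates (Mbit/s) of three VBR streams over four intervals.
streams = [
    [2, 6, 3, 2],
    [5, 2, 2, 4],
    [3, 2, 6, 3],
]


def cbr_bandwidth(rates):
    """Each stream is stuffed up to its own peak before multiplexing."""
    return sum(max(stream) for stream in rates)


def statmux_bandwidth(rates):
    """Streams share the link, so only the peak of the sum is needed."""
    return max(sum(interval) for interval in zip(*rates))


cbr_total = cbr_bandwidth(streams)          # 6 + 5 + 6 = 17
statmux_total = statmux_bandwidth(streams)  # max(10, 10, 11, 9) = 11
```

Because the peaks of the individual streams rarely coincide, the peak of the aggregate is lower than the sum of the individual peaks.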
SUMMARY
To realize the bandwidth savings of statistical multiplexing, a technique is needed that allows VBR data streams (e.g., MPEG-2 Transport Streams) to use time stamps (e.g., PCR) to recover the System Clock Frequency.
Accordingly, a method of encoding variable bit rate (VBR) data includes converting a source VBR data stream to a constant bit rate (CBR) data stream by inserting null packets into the source VBR data stream. Time stamps are inserted into packets of the CBR data stream at an insertion interval. Consecutive null packets in the CBR data stream are replaced with an indication of the number of consecutive null packets replaced to provide an output VBR data stream. In an embodiment, the indication of the number of consecutive null packets is carried in a user data packet (UDP). In alternate embodiments, the indication is carried in an ATM cell or an Internet Protocol packet.
The output VBR data stream is transmitted over a communications medium and received at a decoder. At the decoder, the UDP, ATM cell or IP packet is replaced with the number of consecutive null packets replaced to provide a received CBR data stream. Time stamps can be retrieved from the CBR data stream to derive a local system clock for use in further decoding the CBR data stream content.
BRIEF DESCRIPTION OF THE DRAWINGS
The foregoing and other objects, features and advantages of the invention will be apparent from the following more particular description of preferred embodiments of the invention, as illustrated in the accompanying drawings in which like reference characters refer to the same parts throughout the different views. The drawings are not necessarily to scale, emphasis instead being placed upon illustrating the principles of the invention.
FIG. 1 is a diagram illustrating the syntax for Transport Stream packets.
FIG. 2 is a schematic block diagram showing an embodiment of an encoder in accordance with the present invention.
FIG. 3 is a flow diagram illustrating the processing of Transport Stream packets in a pre-processing unit of the encoder in FIG. 2.
FIGs. 4A-4C are diagrams which illustrate Transport Stream packets as processed in the encoder in FIG. 2.
FIG. 5 is a schematic block diagram showing an embodiment of a decoder in accordance with the present invention.
FIG. 6 is a flow diagram illustrating the processing of Transport Stream packets in a post-processing unit of the decoder in FIG. 5.
FIGs. 7A-7B are diagrams which illustrate Transport Stream packets as processed in the decoder in FIG. 5.
DETAILED DESCRIPTION
FIG. 2 is a schematic block diagram of an embodiment of an encoder 10. The encoder 10 includes a variable bit rate (VBR) data source 300, Transport Stream packetizing unit 302, PCR time stamping unit 304 and VBR pre-processing unit 308. The encoder further includes a 27 MHz System Time Clock 312 and counter 314. The encoder allows PCR time stamps to be used in a VBR data stream.
The VBR data source 300 can be a video or audio encoder which generates a source variable bit rate data stream 301. The TS packetizing unit 302 converts the source VBR data stream into a stream of Transport Stream packets and fills up the VBR data stream by inserting null packets (i.e., having null PID = 0x1FFF) to provide a CBR data stream 303. The PCR stamping unit 304 generates PCR time stamps at an insertion interval (e.g., less than or equal to 100 ms) based on the received CBR data stream.
The values of PCR time stamps are derived from counter 314 which is driven by the 27 MHz System Time Clock 312. The PCR stamping unit 304 periodically stamps the TS packets received in CBR data stream 303 by reading the content of the counter 314. The output 306 is a Transport Stream containing PCR time stamps.
It should be understood that, according to the MPEG-2 standard, a PCR time stamp must be placed into a non-null packet. Thus, if the PCR stamping unit 304 encounters a null packet, it replaces the null PID value 0x1FFF in the null packet with a PCR PID value, thereby converting the null packet to a non-null packet.
In order to transmit VBR data streams, the VBR pre-processing unit 308 replaces consecutive null packets in the CBR data stream 306 with an indication of the number of replaced null packets to provide an output VBR data stream 310. The VBR data stream 310 can be coupled to a transmission medium or channel 15 for transport to a downstream decoder.
In one embodiment, the VBR pre-processing unit 308 replaces consecutive null packets with one MPEG-2 user data packet (UDP), also referred to as an MPEG private data packet. An indication is provided in the UDP of the number of consecutive null packets that have been replaced. To provide for a UDP to carry the count information, a PID can be specified as being dedicated for this purpose through a user-private descriptor defined in an MPEG-2 Program Map Table (PMT). A format for the UDP comprises an MPEG transport packet having a payload formatted as follows:
UDP = { no_packets 16 bits }
where the count of null packets is assumed to include the UDP and equals the value given as no_packets. That is, 1 < no_packets ≤ 65535. If the number of consecutive null packets exceeds 65535, then additional UDPs can be inserted as needed.
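As a minimal sketch of this format, the Python code below builds and reads a transport packet whose payload begins with the 16-bit no_packets field, and splits a long run into several counts when it exceeds 65535. The PID value 0x0100, the big-endian byte order, the 0xFF fill of the remaining payload, and all function names are assumptions made for the example; the description above fixes only the 16-bit width of the field.

```python
import struct

COUNT_PID = 0x0100    # hypothetical PID dedicated to the count packets
MAX_COUNT = 0xFFFF    # largest value that fits in the 16-bit field


def make_count_packet(no_packets: int, pid: int = COUNT_PID) -> bytes:
    """Build a 188-byte user data packet carrying no_packets."""
    if not 0 < no_packets <= MAX_COUNT:
        raise ValueError("no_packets must fit in 16 bits")
    header = bytes([0x47, (pid >> 8) & 0x1F, pid & 0xFF, 0x10])
    payload = struct.pack(">H", no_packets)      # big-endian: an assumption
    return header + payload + b"\xff" * (188 - len(header) - len(payload))


def read_count(packet: bytes) -> int:
    """Return the no_packets value from the first two payload bytes."""
    return struct.unpack(">H", packet[4:6])[0]


def split_run(run_length: int) -> list:
    """Split a run of null packets into counts of at most MAX_COUNT each."""
    counts = []
    while run_length > 0:
        chunk = min(run_length, MAX_COUNT)
        counts.append(chunk)
        run_length -= chunk
    return counts
```

For example, `split_run(70000)` yields two counts, 65535 and 4465, each of which would be carried in its own user data packet.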
The protocol for replacement which operates in the pre-processing unit 308 can be understood with reference to the flow diagram in FIG. 3. Initially, a counter NULL_PKT which indicates the number of consecutive null packets received is set to zero at block 502. At block 504 a Transport Stream packet is received. A determination is made whether the received TS packet is a null packet at block 506. If the TS packet is a null packet, the counter NULL_PKT is incremented at block 508 and the next TS packet is received. If the TS packet is not a null packet, then the counter NULL_PKT is checked at block 510 to determine whether the count exceeds 1.
If the count exceeds 1, meaning that more than one null packet has been received consecutively, then a user data packet containing an indication of consecutive null packets (e.g., COUNT=3) is output at block 512. In addition, the non-null TS packet is output following the user data packet. Processing continues with a reset of the counter NULL_PKT at block 502. However, if the count does not exceed 1, then the non-null TS packet is output at block 514. Processing of the next TS packet in the data stream continues at block 504.
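The replacement protocol of FIG. 3 may be sketched in Python as follows. Packets are modeled as simple labels rather than 188-byte buffers, with the string "NULL" standing for a null packet and the tuple ("UDP", n) standing for a user data packet carrying a count of n. Three points are assumptions of this sketch and are not stated in the flow described above: a single isolated null packet is forwarded unchanged, a run of null packets at the very end of the input is flushed, and runs longer than 65535 are not split.

```python
NULL = "NULL"   # stand-in label for a null transport packet


def _flush(out: list, null_pkt: int) -> None:
    """Emit whatever represents a run of null_pkt null packets."""
    if null_pkt > 1:
        out.append(("UDP", null_pkt))   # block 512: one packet with the count
    elif null_pkt == 1:
        out.append(NULL)                # assumption: lone null passes through


def preprocess(packets) -> list:
    """Replace runs of consecutive null packets with a count packet."""
    out = []
    null_pkt = 0                        # block 502: counter starts at zero
    for pkt in packets:                 # block 504: receive next TS packet
        if pkt == NULL:                 # block 506: null packet?
            null_pkt += 1               # block 508: count it, output nothing
            continue
        _flush(out, null_pkt)           # block 510: check the counter
        null_pkt = 0
        out.append(pkt)                 # output the non-null packet
    _flush(out, null_pkt)               # assumption: flush a trailing run
    return out


cbr_stream = ["TS1", "TS2", NULL, NULL, NULL, "TS6"]
vbr_stream = preprocess(cbr_stream)
# vbr_stream == ["TS1", "TS2", ("UDP", 3), "TS6"]
```

Nothing is emitted while a run of null packets is being counted, which is why the output contains an idle interval ahead of the count packet.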
An example Transport Stream as processed in the encoder 10 (FIG. 2) is illustrated in FIGs. 4A-4C. FIG. 4A shows the data stream 303 that is output from the TS packetizing unit 302 (FIG. 2) and includes consecutive null packets indicated as TS packets TS3, TS4, TS5. In FIG. 4B, the data stream 306 that is output from the PCR stamping unit 304 (FIG. 2) is shown. TS packets TS1 and TS6 in the data stream 306 include PCR stamps attached by the PCR stamping unit. FIG. 4C shows the data stream 310 output from the VBR pre-processing unit 308 (FIG. 2). Note that null TS packets TS3, TS4, TS5 have been replaced with TS packet TS3' which is a user data packet that includes an indication of the number of consecutive null packets replaced, in this case the count being equal to 3. A two-packet gap or interval K is shown between TS packets TS2 and TS3' due to the processing in the VBR pre-processing unit. Following the TS packet TS3' is TS packet TS6 with its PCR value.
In another embodiment, the VBR pre-processing unit 308 replaces consecutive null packets with a unit of a lower layer network protocol, for example, an ATM cell or an IP packet. The lower layer protocol unit carries the information indicating the number of consecutive null packets that have been replaced.
FIG. 5 is a schematic block diagram of an embodiment of a decoder 20. The decoder 20 includes a VBR post-processing unit 402, a PCR stamps retrieving unit 406, a clock recovery unit 408, 27 MHz System Time Clock 410 and counter 412. The decoder 20 further includes video/audio buffers 414 and data sink 416.
The VBR post-processing unit 402 receives an input VBR Transport Stream 310' of the type encoded by the encoder 10 (FIG. 2). The VBR post-processing unit 402 converts the VBR input 310' to a CBR stream. The VBR post-processing unit includes a buffer 403 from which the CBR output 404 is extracted. Since the data stream at 404 is again CBR, the PCR stamps retrieving unit 406 can be used to recover the System Clock Frequency.
In an alternate embodiment, the buffer 403 can be located at the input to the pre-processing unit 308 (FIG. 2). In that case, the transmission gap or interval K (FIG. 4C) is between TS packets TS3' and TS6 rather than between packets TS2 and TS3'. The worst-case buffer depth is equal to the maximum number of consecutive null packets, which for the example UDP syntax is about 12.5 MB. In practice, a smaller buffer is preferred due to MPEG timing limitations.
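The worst-case figure can be checked approximately with the short Python calculation below, which multiplies the largest value of the 16-bit count by the 188-byte transport packet size. The variable names are hypothetical, and the calculation ignores any additional storage an implementation might need.

```python
MAX_COUNT = 65535        # largest value of the 16-bit no_packets field
TS_PACKET_SIZE = 188     # bytes per transport packet

worst_case_bytes = MAX_COUNT * TS_PACKET_SIZE       # 12,320,580 bytes
worst_case_megabytes = worst_case_bytes / 1_000_000
```

The product is a little over 12.3 million bytes, slightly below the rounded figure quoted above and of the same order.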
In an MPEG-2 decoder, PCR time stamps are used to recover the System Clock Frequency. The CBR Transport Stream 404 passes through the PCR stamps retrieving unit 406. The value of the PCR time stamp retrieved by the PCR stamps retrieving unit 406 is compared with the local counter 412 at clock recovery unit 408 to provide an error signal 409 that adjusts the frequency of the local 27 MHz System Time Clock 410. As noted above, by synchronizing the respective 27 MHz System Time Clocks in the encoder 10 (FIG. 2) and decoder 20 (FIG. 5), and by conforming to the MPEG standard decoder buffer model, the probability of overflow or underflow in the video/audio buffers 414 can be eliminated.
The Transport Stream 404 is buffered in video/audio buffers 414. By reading the decoding time stamps and presentation time stamps that are included in the data stream in accordance with the System Clock Frequency provided at 411, audio and video data can be decoded and scheduled for presentation accordingly at data sink 416.
In one embodiment, the VBR post-processing unit processes the UDP and replaces it with the number of null packets indicated by the user data.
The protocol for reversing the replacement which operates in the post-processing unit 402 can be understood with reference to the flow diagram in FIG. 6. Initially, a TS packet is received from stream 310' at block 602. A determination is made whether the received TS packet is a UDP at block 604. If the TS packet is not a UDP packet, the TS packet is output to the PCR stamps retrieving unit 406 (FIG. 5) at block 606 and the next TS packet is received at block 602.
If the packet at block 604 is a UDP, the COUNT of null packets is extracted from the UDP and a counter NULL_PKT is set to zero at block 608. At block 610, a null packet is output to the PCR stamps retrieving unit and counter NULL_PKT is incremented. At block 612 the counter NULL_PKT is checked to determine whether the count has reached the value of COUNT. If the COUNT value has not been reached, then another null packet is output at block 610; otherwise, the next TS packet is received at 602.
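The reverse protocol of FIG. 6 may be sketched in Python as follows, using the same simplified packet model as might be used at the encoder: the string "NULL" stands for a null packet and the tuple ("UDP", n) stands for a user data packet carrying a count of n. The model and the function name are assumptions of the example. The loop emits exactly COUNT null packets for each count packet, consistent with FIGs. 7A-7B, where a count of 3 restores the three packets TS3, TS4, TS5.

```python
NULL = "NULL"   # stand-in label for a null transport packet


def postprocess(packets):
    """Expand each count packet back into that many null packets."""
    for pkt in packets:                          # block 602: receive packet
        is_udp = isinstance(pkt, tuple) and pkt[0] == "UDP"
        if not is_udp:                           # block 604: is it a UDP?
            yield pkt                            # block 606: pass it through
            continue
        count = pkt[1]                           # block 608: extract COUNT
        null_pkt = 0
        while null_pkt < count:                  # block 612: reached COUNT?
            yield NULL                           # block 610: output a null
            null_pkt += 1


vbr_stream = ["TS1", "TS2", ("UDP", 3), "TS6"]
cbr_stream = list(postprocess(vbr_stream))
# cbr_stream == ["TS1", "TS2", "NULL", "NULL", "NULL", "TS6"]
```

The output has the same number of packet slots as the stream that entered the encoder's pre-processing unit, so PCR-bearing packets regain their original positions.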
An example Transport Stream as processed in the decoder 20 (FIG. 5) is illustrated in FIGs. 7A-7B. FIG. 7A shows the VBR data stream 310' that is received at the VBR post-processing unit and is identical in content to the VBR stream 310 shown in FIG. 4C. In FIG. 7B, the CBR data stream 404 that is output from the VBR post-processing unit 402 to the PCR time stamps retrieving unit 406 (FIG. 5) is shown. Note that in stream 404 the null TS packets TS3, TS4, TS5 have replaced TS packet TS3'.
In another embodiment, the VBR post-processing unit receives a lower layer protocol unit (e.g., an ATM cell or an IP packet) and replaces it with the number of null packets indicated by the lower layer protocol.
It will be apparent to those of ordinary skill in the art that methods disclosed herein may be embodied in a computer program product that includes a computer usable medium. For example, such a computer usable medium can include a readable memory device, such as a hard drive device, a CD-ROM, a DVD-ROM, or a computer diskette, having computer readable program code segments stored thereon. The computer readable medium can also include a communications or transmission medium, such as a bus or a communications link, either optical, wired, or wireless, having program code segments carried thereon as digital or analog data signals.
While this invention has been particularly shown and described with references to preferred embodiments thereof, it will be understood by those skilled in the art that various changes in form and details may be made therein without departing from the scope of the invention encompassed by the appended claims.