WO2005036896A2

WO2005036896A2 - Apparatus and method for supporting enhanced transcoding functionality

Info

Publication number: WO2005036896A2
Application number: PCT/US2004/026057
Authority: WO
Inventors: Gino A. Scribano; Michael J. Kirk
Original assignee: Motorola, Inc. , A Corporation Of The State Of Delaware
Priority date: 2003-09-12
Filing date: 2004-08-11
Publication date: 2005-04-21
Also published as: WO2005036896A3

Abstract

Embodiments of the present invention address the need to support enhanced transcoding functionality by various improvements to the Real Time Protocol (RTP). Enhanced RTP (200, 300, 400) can be used by base station controllers (114, 124) in a radio access network (106) to communicate bearer traffic to core network media gateways (160) containing transcoders (161). Embodiments are described that can provide some or all of the following improvements: UTC time reference for vocoders, errored radio frame indication in conjunction with the decoded radio frame information, encoder analysis frame (i.e., time) alignment, enhanced support for discontinuous transmission, support for Tandem Free Operation (TFO), and support for vocoder rate control (a.k.a. Dim & Burst).

Description

1 CE11721 R Scribano et al.

APPARATUS AND METHOD FOR SUPPORTING ENHANCED TRANSCODING FUNCTIONALITY

Reference(s) to Related Application(s)

The present application claims priority from provisional US application, Serial No. 60/502438, entitled "APPARATUS AND METHOD FOR SUPPORTING ENHANCED TRANSCODING FUNCTIONALITY," filed September 12, 2003, which is commonly owned and incorporated herein by reference in its entirety. This application is related to a US application, Serial No. --/ , entitled "METHOD AND APPARATUS FOR CONTROLLING DISTRIBUTED TRANSCODERS," filed July 9, 2004, which is assigned to the assignee of the present application.

Field of the Invention

The present invention relates generally to wireless communication systems, and, in particular, to transcoding functionality in wireless communication systems.

Background of the Invention

In a typical Code Division Multiple Access (CDMA) cellular network, such as a second generation (2G) CDMA communication network, transcoders are located in a radio access network (RAN), and in particular in a base station controller (BSC) located in the RAN. The transcoders receive compressed voice packets from a mobile station and convert the voice packets to pulse code modulated (PCM) signals for transmission through a circuit switched core network included in the cellular network. The BSCs then transmit the PCM signals upstream through the circuit switched core network and, via the core network, to a Public Switched Telephone Network (PSTN) coupled to the operator's cellular network. Similarly, PCM signals received by a 2G CDMA cellular network from a PSTN, that are intended for a mobile station serviced by the RAN, are transmitted as PCM signals through the circuit switched core network to the RAN, where the transcoder in the RAN converts the PCM signals to compressed voice packets. The RAN then transmits the compressed voice packets to the mobile station.

However, later generation CDMA communication systems are being designed with transcoding functionality in the core network. For example, transcoders will be incorporated into core network media gateways (MGWs). Standards bodies such as IETF and 3GPP2 are working to develop standards that will support the communications between RAN entities and core network MGWs. At present, the Real Time Protocol (RTP) specifications, particularly RFC 3550 and RFC 3558, only support very basic transcoding functionality. In general, a substantially more robust level of transcoding functionality would be desirable. For example, it would be desirable to reduce the bandwidth requirements between the RAN and core network and to better exploit the encoding characteristics of newer vocoders such as the Selectable Mode Vocoder (SMV).
Therefore, a need exists for an apparatus and method that support enhanced transcoding functionality.

Brief Description of the Drawings

FIG. 1 is a block diagram depiction of a wireless communication system in accordance with multiple embodiments of the present invention.

FIG. 2A is a block diagram conveying exemplary bit definitions of a Real Time Protocol (RTP) message in accordance with multiple embodiments of the present invention.

FIG. 2B is a block diagram conveying exemplary bit definitions for a vocoder frame control field in both a reverse-directed and a forward-directed RTP message in accordance with multiple embodiments of the present invention.

Detailed Description of Embodiments

Embodiments of the present invention address the need to support enhanced transcoding functionality by various improvements to the Real Time Protocol (RTP). Enhanced RTP can be used by base station controllers in a radio access network to communicate bearer traffic to core network media gateways containing transcoders. Embodiments are described that can provide some or all of the following improvements: UTC time reference for vocoders, errored radio frame indication in conjunction with the decoded radio frame information, encoder analysis frame (i.e., time) alignment, enhanced support for discontinuous transmission, support for Tandem Free Operation (TFO), and support for vocoder rate control (a.k.a. Dim & Burst).

One embodiment of the present invention encompasses a method for supporting enhanced transcoding functionality. The method comprises requesting in a reverse-directed, Real Time Protocol (RTP) packet a maximum frame size for a vocoder frame and receiving, in response to the request and in a forward-directed RTP packet, a vocoder frame having a frame size less than or equal to the maximum frame size.

Another embodiment of the present invention encompasses a base station (BS) comprising a base site controller (BSC) and a base transceiver system (BTS). The BSC is adapted to request in a reverse-directed, Real Time Protocol (RTP) packet a maximum frame size for a vocoder frame and also adapted to receive, in response to the request and in a forward-directed
RTP packet, a vocoder frame having a frame size less than or equal to the maximum frame size. The BTS is communicatively coupled to the BSC and adapted to communicate with a mobile station (MS) via an air interface. Yet another embodiment of the present invention encompasses a transcoding system comprising a BSC and a media gateway, communicatively coupled to the BSC. The BSC is adapted to request in a reverse-directed, Real Time Protocol (RTP) packet a maximum frame size for a vocoder frame and adapted to receive in a forward-directed RTP packet a vocoder frame having a frame size less than or equal to the maximum frame size. The media gateway is adapted to receive the request and adapted to send, in response to the request, the vocoder frame having a frame size less than or equal to the maximum frame size. The present invention may be more fully described with reference to FIGs. 1 and 2. FIG. 1 is a block diagram depiction of a wireless communication system 100 in accordance with multiple embodiments of the present invention. Communication system 100 includes a Radio Access Network (RAN) 106 that comprises multiple base stations (BSs) 110, 120. Each BS of the multiple BSs 110, 120 includes a respective at least one base transceiver station (BTS) 112, 122 operably coupled to a respective base station controller (BSC) 114, 124. Communication system 100 further comprises a mobile station (MS) 102 in wireless communication with a BS, such as BS 110, of RAN 106 via an air interface 104. Air interface 104 comprises a forward link (not shown) having multiple communication channels, such as one or more forward link control channels, one or more forward link traffic channels, and a forward link paging channel, and a reverse link (not shown) having multiple communication channels, such as one or more reverse link control channels, one or more reverse link traffic channels, and a reverse link access channel. 
Each BS 110, 120, preferably a respective BSC 114, 124 of the BS 110, 120, is coupled to an Inter-BS Packet Transport network 140 via a respective signaling interface 130, 134 and a respective bearer interface 132, 136. Inter-BS Packet Transport network 140 is further coupled to a packet switched controller 144 via a signaling interface 142, thereby providing a signaling link between each BS of the multiple BSs 110, 120 and the packet switched controller 144. Inter-BS Packet Transport network 140 is still further coupled to a wide area packet transport network 154 via a bearer interface 150. In turn, wide area packet transport network 154 is further coupled to a local Media Gateway (MGW) 160 via a bearer traffic interface 156, thereby providing a bearer traffic link between each BS of the multiple BSs 110, 120 and Media Gateway 160. Wide area packet transport network 154 is also coupled to packet switched controller 144 via a signaling interface 148 and to remote packet voice network 192 via a signaling interface 194 and a bearer interface 196. Preferably, each of signaling interfaces 130, 134, and 142 comprises an A1 interface that has been modified to support an exchange of signaling messages in a packet voice format (which interfaces are depicted in FIG. 1 as A1_p interfaces). In addition, preferably each of bearer interfaces 132, 136, 150, and 156 comprises an A2 interface that has been modified to support the exchange of bearer traffic in a packet voice format (which interfaces are depicted in FIG. 1 as A2_P interfaces) and that has been additionally modified as described herein.
Media Gateway 160 optionally includes a transcoder 161 that is capable of decoding voice data packets received from MS 102 into at least one of multiple bearer formats, such as SMV (IS-893), EVRC (IS-127), 13k-QCELP (IS-733), 8k-QCELP (IS-96C), and G.711, for conveyance to public network 190 or remote packet voice network 192 and is further capable of encoding voice data received from public network 190 or remote packet voice network 192 in at least one of the multiple bearer formats into voice data packets for conveyance to MS 102. Media Gateway 160 is further coupled to public network 190, preferably a Public Switched Telephone Network (PSTN), via a bearer interface 162, preferably a pulse code modulation (PCM) interface, and to packet switched controller 144 via a signaling interface 164. Public network 190 is further coupled to packet switched controller 144 via a signaling interface 152, preferably an ISDN User Part (ISUP) interface. Inter-BS Packet Transport network 140, packet switched controller 144, wide area packet transport network 154, Media Gateway 160, and interconnecting interfaces 130, 132, 142, 148, 150, 152, 156, 162, 164, 194, and 196 may be collectively referred to as a packet switched core network and provide a packet voice communication link between each BS 110, 120 and each of public network 190 and remote packet voice network 192. BS 110, preferably BSC 114 of BS 110, is further coupled to a circuit switched controller 174, preferably a circuit switched MSC, via a signaling interface 170, preferably an A1 interface, and a bearer interface 172, preferably an A2 interface. In turn, circuit switched controller 174 is coupled to public network 190 via a bearer interface 176, preferably a PCM interface, and a signaling interface 178, preferably an ISUP interface.
Circuit switched controller 174 and the associated interfaces 170, 172, 176, and 178 coupling the circuit switched controller 174 to BS 110 and public network 190 may be collectively referred to as a circuit switched core network and provide a circuit switched communication link between BS 110 and public network 190. RAN 106, BS 110, packet switched controller 144, Media Gateway 160, transport networks 140 and 154, and circuit switched controller 174 are collectively referred to herein as an infrastructure 180 of communication system 100. Communication system 100 comprises a wireless packet voice communication system. In order for MS 102 to engage in a voice communication with an external network 190, 192 connected to infrastructure 180, each of BS 110, packet switched controller 144, Media Gateway 160, transport networks 140 and 154, and circuit switched controller 174 operates in accordance with well-known wireless telecommunications protocols. By operating in accordance with well-known protocols, a user of MS 102 can be assured that MS 102 will be able to communicate with infrastructure 180 and establish a communication link with an external network 190, 192 via the infrastructure. Preferably, communication system 100 operates in accordance with the 3GPP2 and TIA/EIA (Telecommunications Industry Association/Electronic Industries Association) IS-2000 and IS-2001 standards, wherein each communication channel of the multiple communication channels of each of the forward link and the reverse link of air interface 104 comprises one or more orthogonal codes, such as Walsh codes. The standards specify wireless telecommunications system operating protocols, including radio system parameters and call processing procedures.
However, those who are of ordinary skill in the art realize that communication system 100 may alternatively operate in accordance with other wireless protocols, such as IS-95 or IS-856, for example, or any one of a variety of wireless packet-oriented voice communication systems, such as a Global System for Mobile communication (GSM) communication system, a Universal Mobile Telecommunications System (UMTS) communication system, a Time Division Multiple Access (TDMA) communication system, a Frequency Division Multiple Access (FDMA) communication system, or an Orthogonal Frequency Division Multiple Access (OFDM) communication system.

Further description of embodiments of the present invention will focus on the enhanced Real Time Protocol (RTP) messaging, such as that between BSC 114 and MGW 160, for example. FIG. 2A is a block diagram conveying exemplary bit definitions of a Real Time Protocol (RTP) message 200 in accordance with multiple embodiments of the present invention. Furthermore, FIG. 2B is a block diagram conveying exemplary bit definitions for a vocoder frame control field in both a reverse-directed and a forward-directed RTP message in accordance with multiple embodiments of the present invention. Although RTP message 200 and vocoder frame control fields 300 and 400 are depicted as including numerous enhancements to prior art RTP messages, many distinct embodiments can be implemented by considering each of these enhancements independent of the others or in combination with one or more of the others. Many additional embodiments of the present invention can be implemented by making minor changes to the bit representations of each of these enhancements as depicted in messages/fields 200, 300, and 400 as well as to the manner in which the information is indicated by the bit representations.

The header portion of RTP message 200 is shown between the "RTP Header Begin" and "RTP Header End" labels. A prior art RTP Header is specified in RFC 3550 (July 2003).
As shown in message 200, the unmodified fields include the following header fields:

Protocol Version (V): Although the field is not changed, its value should be set to '3' to indicate a new RTP protocol version.

Padding (P): Should be set to '0' because payload padding is not required for this embodiment.

Extension (X): Should be set to '0' because this is intended for 'experimental' applications.

CSRC Count (CC): Should be set to '0' because multiple synchronization sources are not required in this embodiment.

Marker (M): Should be set to '0' because marker events cannot be distinguished for multiplexed payloads.

Payload Type (PT): Should be set according to specifications for wireless vocoders.

Synchronization Source (SSRC): Should be set according to RFC 1889.

Contributing Sources (CSRC): Not used in this embodiment.

As shown in message 200, the fields that are modified (from those specified in RFC 3550) include the Sequence Number and Timestamp fields. These fields are concatenated to form a 48-bit field containing Universal Time Coordinated (UTC) information. UTC is a CDMA System time reference, known by both MSs and BSs. The primary reason to include UTC time in RTP packets is to facilitate event correlation and debugging across BS and Core Network elements. Conveying UTC to Core Network MGWs enables all bearer-path processing elements to utilize a common time reference for bearer processing events and state changes.

The prior art Sequence Number (SN) field (as specified in RFC 3550) was defined as follows: Previous definition/function - Increments by one for each packet sent. Used by receiver to detect lost packets and re-sequence packets that may be received out-of-order. Initial value should be random.

The Sequence Number (SN) field used by various embodiments of the present invention is defined as follows: New definition/function - Sequence Number is composed of the least significant 16 bits of the number of 20 ms periods since 6-Jan-1980 00:00:00 UTC. This field can also be used to detect lost packets and to resequence packets that may be received out-of-order. The rollover period of the Sequence Number field is about 22 minutes (65,536 periods of 20 ms each).
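As an illustration of how a receiver might use the redefined Sequence Number, the following Python sketch computes the distance between two received SN values while allowing for the 16-bit rollover. The function names are hypothetical and do not appear in the specification, and the sketch assumes at most one packet per 20 ms period. A skipped period does not by itself prove packet loss, since it may also reflect discontinuous transmission.

```python
SN_MODULUS = 1 << 16  # the SN field carries 16 bits


def sn_gap(prev_sn: int, new_sn: int) -> int:
    """Number of 20 ms periods from prev_sn to new_sn, modulo the 16-bit rollover."""
    return (new_sn - prev_sn) % SN_MODULUS


def missing_periods(prev_sn: int, new_sn: int) -> int:
    """Periods skipped between two consecutively received packets (0 if none)."""
    return max(sn_gap(prev_sn, new_sn) - 1, 0)


def looks_out_of_order(prev_sn: int, new_sn: int) -> bool:
    """Heuristic (an assumption): a gap above half the modulus means new_sn is older."""
    return sn_gap(prev_sn, new_sn) > SN_MODULUS // 2
```

For example, `sn_gap(65535, 0)` evaluates to 1, so a packet stamped just after the rollover is still recognized as the immediate successor of the one stamped just before it.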

The prior art Timestamp (TS) field (as specified in RFC 3550) was defined as follows: Previous definition/function - The timestamp reflects the sampling instant of the first octet in the RTP data packet. The clock frequency is dependent on the format of data carried as payload and is specified statically in the profile or payload format specification that determines the format, or may be specified dynamically for payload formats defined through non-RTP means. If RTP packets are generated periodically, the nominal sampling instant as determined from the sampling clock is to be used, not a reading of the system clock. As an example, for fixed-rate audio the timestamp clock would likely increment by one for each sampling period. If an audio application reads blocks covering 160 sampling periods from the input device, the timestamp would be increased by 160 for each such block, regardless of whether the block is transmitted in a packet or dropped as silent. The initial value of the timestamp should be random, as for the sequence number.

The Timestamp (TS) field used by various embodiments of the present invention is defined as follows: New definition/function - The Timestamp field contains the next more significant 32 bits, relative to the SN field, of the number of 20 ms periods since 6-Jan-1980 00:00:00 UTC. This field cannot be used to determine the number of audio samples represented by the payload information. Rather, the number of samples represented by the payload information is obtained by the combination of the RTP Payload Type (PT) and Interleaved/Bundled Packet fields described below. The rollover period of the TS field is about 178,500 years.
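The SN and TS definitions above can be sketched in Python as follows. This is a minimal illustration rather than a definitive implementation: the function names are hypothetical, and the period count is taken as a plain calendar difference from the epoch, which ignores leap seconds (an assumption, since the text does not say how they are treated).

```python
from datetime import datetime, timezone

# Epoch named in the field definitions: 6-Jan-1980 00:00:00 UTC.
EPOCH = datetime(1980, 1, 6, tzinfo=timezone.utc)
PERIOD_MS = 20  # one vocoder frame period


def periods_since_epoch(when: datetime) -> int:
    """Whole 20 ms periods elapsed since the epoch (leap seconds ignored)."""
    delta = when - EPOCH
    total_ms = (delta.days * 86_400 + delta.seconds) * 1000 + delta.microseconds // 1000
    return total_ms // PERIOD_MS


def split_sn_ts(periods: int):
    """Split a period count into (SN, TS): the low 16 bits and the next 32 bits."""
    sn = periods & 0xFFFF
    ts = (periods >> 16) & 0xFFFFFFFF
    return sn, ts


def join_sn_ts(sn: int, ts: int) -> int:
    """Rebuild the 48-bit concatenated value, with TS as the more significant part."""
    return (ts << 16) | sn


# Rollover periods implied by the field widths, in seconds.
SN_ROLLOVER_S = (1 << 16) * PERIOD_MS / 1000  # 1310.72 s, roughly 22 minutes
TS_ROLLOVER_S = (1 << 48) * PERIOD_MS / 1000  # on the order of 178,000 years
```

A sender would call `split_sn_ts(periods_since_epoch(datetime.now(timezone.utc)))` once per frame, and a receiver can recover the full count with `join_sn_ts`.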

The BSC should source the SN and TS fields as a concatenated field representing the number of 20 ms periods since 6-Jan-1980 00:00:00 UTC. The vocoder should source the SN and TS fields as a concatenated field representing either a time-corrected loopback of the TS+SN contained in the previous BSC-sourced packet or an independently derived number of 20 ms periods since 6-Jan-1980 00:00:00 UTC. Default initial values of '00000000H' and '0000H' should be used for the vocoder-sourced TS and SN fields, respectively, when relative or absolute UTC is not available. This is intended for call 'start-up' conditions when BSC-sourced packets containing UTC information have not yet been received.

The Interleaved/Bundled Frame Header portion of RTP message 200 is shown between the "Interleaved/Bundled Frame Header Begin" and "Interleaved/Bundled Frame Header End" labels. A prior art Interleaved/Bundled Frame Header is specified in RFC 3558 (July 2003). As shown in message 200, the unmodified fields include the RR, LLL, NNN, MMM, and Count fields. Their definitions, as follows, remain unchanged from those specified in RFC 3558.

Reserved (RR): Reserved. Must be set to zero by sender.

Interleave Length (LLL): Indicates length of interleave.

Interleave Index (NNN): Indicates index within interleave group.

Mode Request (MMM): Requests a codec-specific encoding mode. The definition of this field is constrained to conveying codec-encoding mode for codecs that support multiple modes (e.g., SMV). This field will not be used to request encoded data frame sizes. A new parameter (FSM-R) is defined in the VFC field to request encoded data maximum frame sizes.

Frame Count (Count): Indicates number of VFC fields (and vocoder frames) in the packet.

As shown in message 200, the Table of Contents (TOC) field as specified in RFC 3558 (July 2003) is replaced with a Vocoder Frame Control (VFC) field as depicted by VFC fields 300 and 400 and described below. The prior art Table of Contents (TOC) field as specified in RFC 3558 was defined as follows: Previous definition/function -

Value   Rate      Total codec data frame size (in octets)
0       Blank     0 (0 bit)
1       1/8       2 (16 bits)
2       1/4       5 (40 bits; not valid for EVRC)
3       1/2       10 (80 bits)
4       1         22 (171 bits; 5 padded at end with zeros)
5       Erasure   0 (SHOULD NOT be transmitted by sender)

The Vocoder Frame Control (VFC) field and its constituent sub-fields, as depicted by VFC fields 300 and 400 and as used by various embodiments of the present invention, are defined as follows:

FE: Frame Error indicates estimated recovered air frame quality for the encapsulated vocoder frame.
FE='1' (Errored) if one or more air interface frame errors are detected in this vocoder frame.
FE='0' (Not Errored) if zero air interface frame errors are detected in this vocoder frame.

FT: Frame Type indicates the vocoder frame type, and hence size, for the encapsulated vocoder frame.
FT='000' (Null) if the expected vocoder frame is not available. For the reverse link, this Frame Type is used when a vocoder frame cannot be recovered from the air interface due to frame errors. This case is accompanied by FE='1' to distinguish between frame unavailability due to air interface errors and other causes.
FT='001' (Eighth Rate)
FT='010' (Quarter Rate)
FT='011' (Half Rate)
FT='100' (Full Rate)
FT='101' (Spare)
FT='110' (Spare)
FT='111' (Spare)

FSM-R: Frame Size Maximum Request indicates the maximum frame size to be generated by the speech encoder for one corresponding vocoder frame in the forward direction. This field enables Dim & Burst, Blank & Burst, as well as general frame-by-frame Vocoder Rate Control functionality.
FSM-R='000' (Null) Allows a 'blank frame' encoder state update for vocoders that support this feature (e.g., 8K/13K QCELP)
FSM-R='001' (Eighth Rate)
FSM-R='010' (Quarter Rate)
FSM-R='011' (Half Rate)
FSM-R='100' (Full Rate)
FSM-R='101' (Spare)
FSM-R='110' (Spare)
FSM-R='111' (Spare)

FSM-A: Frame Size Maximum Acknowledge indicates the accepted maximum frame size request to be generated by the speech encoder for one corresponding vocoder frame in the forward direction. In cases where the encoder does not immediately satisfy an FSM-R, the FSM-A will contain an echo of the FSM-R value that is pending.
FSM-A='000' (Spare)
FSM-A='001' (Eighth Rate)
FSM-A='010' (Quarter Rate)
FSM-A='011' (Half Rate)
FSM-A='100' (Full Rate)
FSM-A='101' (Spare)
FSM-A='110' (Spare)
FSM-A='111' (Spare)
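The relationship between the FT and FSM-R codes can be illustrated with a short Python sketch. It assumes that the numeric order of the codes (Null, Eighth, Quarter, Half, Full) matches increasing frame size, which is consistent with the rate table above but is not stated as a rule in the text. The names `Rate`, `complies`, and `clamp_rate` are hypothetical.

```python
from enum import IntEnum


class Rate(IntEnum):
    """3-bit rate codes shared by the FT and FSM-R fields (spare codes omitted)."""
    NULL = 0b000
    EIGHTH = 0b001
    QUARTER = 0b010
    HALF = 0b011
    FULL = 0b100


def complies(frame_type: int, fsm_r: int) -> bool:
    """True if a frame of type frame_type is no larger than the requested maximum."""
    # Rate(...) raises ValueError for the spare codes '101' through '111'.
    return Rate(frame_type) <= Rate(fsm_r)


def clamp_rate(desired: int, fsm_r: int) -> Rate:
    """Largest rate the encoder may use when it would otherwise pick `desired`."""
    return Rate(min(Rate(desired), Rate(fsm_r)))
```

For instance, a BSC that needs room for signaling in a frame (Dim & Burst) could request `Rate.HALF`, and a full-rate frame returned in response would fail `complies(Rate.FULL, Rate.HALF)`.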

S: Selects the function of the dual-use Size Delay (SD)/Time Alignment (TA) field.
S='0' indicates that the SD/TA field contains SD information.
S='1' indicates that the SD/TA field contains TA information.

SD/TA-R: This is a dual-use field, conveying reverse direction Size Delay (SD) or Time Adjustment (TA) Requests. Note that SD and TA requests should not occur concurrently.

SD-R indicates a maximum number of vocoder frames, relative to the receipt of this request, in which the encoder can 'wait' before honoring the corresponding FSM-R in the Vocoder Frame Control field. This is intended to enable improved speech quality by providing the encoder an opportunity to comply with the FSM-R during an interval of low speech activity. An SD-R value of '00000' indicates that the FSM-R should be immediately applied to the next possible encoded frame.

TA-R indicates that the encoder should adjust the analysis frame reference point relative to the current reference point. TA-R values can be positive or negative, and are encoded in 2's complement format. TA-R increments represent 1.25 ms (i.e., 10 PCM samples).

SD-R field:
SD-R='00000' indicates zero frame delay request
SD-R='00001' indicates 1 frame (i.e., 20 ms) maximum delay request
...
SD-R='11111' indicates 32 frame (i.e., 640 ms) maximum delay request

TA-R field:
TA-R='00000' indicates zero encoder sampling time adjustment request
TA-R='00001' indicates +1.25 ms advance request (i.e., delete 10 PCM samples)
...
TA-R='01111' indicates +18.75 ms advance request (i.e., delete 150 PCM samples)
TA-R='10000' indicates -20.00 ms retard request (i.e., insert 160 PCM samples)
...
TA-R='11110' indicates -2.50 ms retard request (i.e., insert 20 PCM samples)
TA-R='11111' indicates -1.25 ms retard request (i.e., insert 10 PCM samples)
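The TA-R encoding listed above is a 5-bit two's complement count of 1.25 ms steps. A minimal Python sketch of the conversion follows; the function names are hypothetical, and the sign convention for the returned sample count (positive means samples to delete, negative means samples to insert) is a choice made for this illustration.

```python
TA_STEP_MS = 1.25      # one TA increment
SAMPLES_PER_STEP = 10  # PCM samples per increment


def decode_ta(bits: str):
    """Decode a 5-bit TA value into (milliseconds, PCM samples).

    Positive results are advances (delete samples); negative results are
    retards (insert samples).
    """
    if len(bits) != 5 or set(bits) - {"0", "1"}:
        raise ValueError("TA value must be a 5-character bit string")
    raw = int(bits, 2)
    steps = raw - 32 if raw & 0b10000 else raw  # two's complement sign extension
    return steps * TA_STEP_MS, steps * SAMPLES_PER_STEP


def encode_ta(steps: int) -> str:
    """Encode a signed step count (-16..15) as a 5-bit two's complement string."""
    if not -16 <= steps <= 15:
        raise ValueError("TA step count out of range for 5 bits")
    return format(steps & 0b11111, "05b")
```

Under these assumptions, `decode_ta("01111")` yields an advance of 18.75 ms (150 samples) and `decode_ta("10000")` yields a retard of 20 ms (160 samples), matching the end points of the list.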

SD/TA-P: This is a dual-use field, conveying forward direction Size Delay (SD) or Time Adjustment (TA) requests that are pending in the transcoder processing element. Note that SD and TA pending indications should not occur concurrently. If both zero-valued SD and TA requests are pending, then the transcoder should toggle the 'S' bit field relative to the 'S' bit field value in the previous frame. In this case, the SD/TA-P field value should always equal '00000'.

If either SD-P or TA-P has a non-zero-valued pending request, then the transcoder should only indicate pending SD-P or TA-P requests in consecutive frames until the request has been completely granted, yielding a zero-valued pending request. Note that the TA request may be granted in one frame period or may be granted incrementally over more than one frame period. If granted in one frame period, then the TA-R value should only be included in the first time-adjusted frame. If the TA request is granted incrementally, then the first of each incrementally time-adjusted frame set should contain the incremental adjustment value. Note that SD requests apply on a frame-by-frame basis, and may accumulate in the transcoder-processing element. The SD-P value should represent a down counter, initialized with a non-zero SD-R value, and decremented by 1 for each frame whose size is larger than the FSM-R value. The SD-P value should be set to zero in the first frame whose size is equal to or less than the FSM-R value.

If both non-zero-valued SD and TA requests are pending, then the transcoder-processing element should give precedence to the non-zero TA-P indications.
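The SD-P down counter described above can be sketched as a small Python simulation of a single pending request. Several points are assumptions of this illustration rather than statements from the text: rates are represented by their 3-bit codes (larger code means larger frame), the encoder is forced down to the requested maximum once the counter has reached zero, and the name `apply_fsm_request` is hypothetical.

```python
def apply_fsm_request(sd_r: int, fsm_r: int, natural_rates):
    """Trace one FSM-R/SD-R request through a run of encoder frames.

    sd_r          -- initial Size Delay request (frames the encoder may wait)
    fsm_r         -- requested maximum rate code
    natural_rates -- rate code the encoder would choose for each frame

    Returns a list of (emitted_rate, sd_p) pairs, one per frame.
    """
    pending = sd_r
    satisfied = False
    trace = []
    for rate in natural_rates:
        if satisfied:
            # The request applied to one frame only; later frames are unconstrained.
            trace.append((rate, 0))
            continue
        if rate <= fsm_r:
            # The encoder naturally complies: SD-P is set to zero in this frame.
            emitted, pending, satisfied = rate, 0, True
        elif pending == 0:
            # No delay left (assumption): force the frame down to the requested size.
            emitted, satisfied = fsm_r, True
        else:
            # Frame is larger than FSM-R: keep waiting and count down.
            emitted = rate
            pending -= 1
        trace.append((emitted, pending))
    return trace
```

With `sd_r=5`, `fsm_r=3` (Half Rate) and natural rates `[4, 1, 4]`, the trace is `[(4, 4), (1, 0), (4, 0)]`: the encoder waits one frame, then satisfies the request with a naturally occurring eighth-rate frame.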

SD-P field:
SD-P='00000' indicates zero frame delay pending
SD-P='00001' indicates 1 frame (i.e., 20 ms) delay pending
...
SD-P='11111' indicates 32 frame (i.e., 640 ms) delay pending

TA-P field:
TA-P='00000' indicates zero encoder sampling time adjustment pending
TA-P='00001' indicates +1.25 ms advance pending (i.e., delete 10 PCM samples)
...
TA-P='01111' indicates +18.75 ms advance pending (i.e., delete 150 PCM samples)
TA-P='10000' indicates -20.00 ms retard pending (i.e., insert 160 PCM samples)
...
TA-P='11110' indicates -2.50 ms retard pending (i.e., insert 20 PCM samples)
TA-P='11111' indicates -1.25 ms retard pending (i.e., insert 10 PCM samples)

DTX field: Indication from encoder to decoder that subsequent packets in expected time intervals may not be received. This enables stabilization of jitter buffers. In order to mitigate loss of the DTX indication at the receiver (due to packet loss), two consecutive Vocoder Control Frames must be transmitted with DTX='1' before discontinuous operation can begin. Resumption of encoder transmission can begin at any time relative to the DTX='1' indication.
DTX='0' indicates that discontinuous transmission will not occur within the next frame.
DTX='1' indicates that discontinuous transmission may begin within the next two frames. Upon reception of this indication, the decoder should 'freeze' any expected receive time reference.

Spare field: Not defined in this specification.
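The sender-side DTX rule above (two consecutive indications before transmission may stop) can be expressed as a short Python check. This is a sketch under the assumption that the sender keeps a list of the DTX bits it has already transmitted; `may_suspend` is a hypothetical name.

```python
def may_suspend(dtx_history) -> bool:
    """True once the two most recently sent frames both carried DTX='1'."""
    return len(dtx_history) >= 2 and dtx_history[-1] == 1 and dtx_history[-2] == 1
```

Requiring two indications means that the loss of a single packet still leaves the receiver with one DTX='1' frame, so it can freeze its expected receive time reference before the gap begins.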

Embodiments of the present invention can also support circuit-based Tandem Free Operation (TFO). Inter-connection of packet-based core networks and circuit-based RANs may be required for some deployment configurations. Such inter-connection would require a circuit-to-packet bearer interworking function. In order to facilitate Tandem Free Operation (TFO) for circuit-based RANs in this configuration, support for TFO must be communicated between the RANs. This communication should be supported by use of unique Payload Type (PT) values, inserted by the interworking function, and packet-based RANs with TFO capability. The TFO payloads will be unique in that they contain both compressed voice frame parameters as well as uncompressed voice (e.g., G.711) samples associated with the frame parameters.

Prior art RTP specifications (RFC 3550 and RFC 3558) only support very basic transcoding functionality required by Base Station Controllers communicating with Media Gateways containing wireless vocoders. One key enhancement provided by embodiments described herein relates to the support for vocoder rate control (a.k.a. Dim & Burst). RFC 3558 specifies a limited vocoder rate control via mode request (MMM) bits. Further, the vocoder rate control indicated via the MMM bits is dependent on the vocoder type (i.e., EVRC or SMV). For SMV, the MMM bits affect the mode of SMV operation. Frame rate control for SMV modes 2 and 3 is not supported by the SMV vocoder, so vocoder rate control via MMM bits is not achievable when operating in these modes. However, vocoder rate should instead be independent of the vocoder algorithm and determined by the maximum allowable transmission frame size from the entire set (null, eighth, quarter, half, full) of frame sizes. Therefore, in embodiments described herein, rate control is advantageously requested in terms of a maximum rate allowable from the entire set of frame rates and independent of the vocoder algorithm.
Another enhancement to RTP messaging described herein involves replacing random values in the Sequence Number (SN) and Timestamp (TS) fields with fields derived from UTC time. A UTC time reference enables bearer stream synchronization and correlation across GPS-based CDMA systems. Moreover, additional RTP enhancements described herein support errored radio frame indication in conjunction with the decoded radio frame information, encoder analysis frame (i.e., time) alignment, enhanced support for discontinuous transmission, and support for Tandem Free Operation (TFO).

In the foregoing specification, the present invention has been described with reference to specific embodiments. However, one of ordinary skill in the art will appreciate that various modifications and changes may be made without departing from the spirit and scope of the present invention as set forth in the appended claims. Accordingly, the specification and drawings are to be regarded in an illustrative rather than a restrictive sense, and all such modifications are intended to be included within the scope of the present invention. In addition, those of ordinary skill in the art will appreciate that the elements in the drawings are illustrated for simplicity and clarity, and have not necessarily been drawn to scale. For example, the dimensions of some of the elements in the drawings may be exaggerated relative to other elements to help improve an understanding of the various embodiments of the present invention. Benefits, other advantages, and solutions to problems have been described above with regard to specific embodiments of the present invention.
However, the benefits, advantages, solutions to problems, and any element(s) that may cause or result in such benefits, advantages, or solutions, or cause such benefits, advantages, or solutions to become more pronounced are not to be construed as a critical, required, or essential feature or element of any or all the claims. As used herein and in the appended claims, the term "comprises," "comprising," or any other variation thereof is intended to refer to a non-exclusive inclusion, such that a process, method, article of manufacture, or apparatus that comprises a list of elements does not include only those elements in the list, but may include other elements not expressly listed or inherent to such process, method, article of manufacture, or apparatus. The terms a or an, as used herein, are defined as one or more than one. The term plurality, as used herein, is defined as two or more than two. The term another, as used herein, is defined as at least a second or more. The terms including and/or having, as used herein, are defined as comprising (i.e., open language). The term coupled, as used herein, is defined as connected, although not necessarily directly, and not necessarily mechanically.
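As an illustration of the UTC-derived Sequence Number (SN) and Timestamp (TS) fields discussed above, the following Python sketch counts 20 ms periods since January 6, 1980 00:00:00 UTC and splits the count so that the 16 least significant bits go to SN and the next 32 more significant bits go to TS, as recited in the claims. The function names are hypothetical, and the sketch assumes plain calendar arithmetic that ignores leap seconds; an actual system would take its time reference from its own GPS-based clock.

```python
from datetime import datetime, timezone
from typing import Tuple

# Epoch named in the claims: January 6, 1980 00:00:00 UTC.
EPOCH = datetime(1980, 1, 6, tzinfo=timezone.utc)
FRAME_MS = 20  # one vocoder frame period in milliseconds


def frame_count(t: datetime) -> int:
    """Number of whole 20 ms periods between EPOCH and t.

    Assumption: leap seconds are ignored (plain datetime arithmetic).
    Integer arithmetic is used to avoid floating-point rounding.
    """
    delta = t - EPOCH
    total_ms = (delta.days * 86_400 + delta.seconds) * 1000 \
        + delta.microseconds // 1000
    return total_ms // FRAME_MS


def split_count(count: int) -> Tuple[int, int]:
    """Split the period count into (SN, TS).

    SN carries the 16 least significant bits; TS carries the next
    32 more significant bits.
    """
    sn = count & 0xFFFF
    ts = (count >> 16) & 0xFFFFFFFF
    return sn, ts


def join_count(sn: int, ts: int) -> int:
    """Rebuild the 48-bit period count from the SN and TS fields."""
    return (ts << 16) | sn


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    count = frame_count(now)
    sn, ts = split_count(count)
    print(f"count={count} SN=0x{sn:04X} TS=0x{ts:08X}")
    assert join_count(sn, ts) == count & 0xFFFFFFFFFFFF
```

With this split, SN advances by one per 20 ms frame and wraps every 65,536 frames, at which point TS increments by one, so the two fields together form a single 48-bit frame counter.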

What is claimed is:

Claims

1. A method for supporting enhanced transcoding functionality comprising: requesting in a reverse-directed, Real Time Protocol (RTP) packet a maximum frame size for a vocoder frame; in response to the request, receiving in a forward-directed RTP packet a vocoder frame having a frame size less than or equal to the maximum frame size.

2. The method of claim 1, wherein the maximum frame size comprises a frame size from the group consisting of null, eighth rate, quarter rate, half rate, and full rate.

3. The method of claim 1, further comprising: indicating in the reverse-directed RTP packet a coordinated universal time (UTC) time stamp.

4. The method of claim 3, wherein the UTC time stamp is indicated by a number of 20 millisecond (ms) periods occurring since January 6, 1980 00:00:00 UTC.

5. The method of claim 4, wherein a sequence number field of the reverse-directed RTP packet indicates the 16 least significant bits of the number of 20 millisecond periods.

6. The method of claim 4, wherein a time stamp field of the reverse-directed RTP packet indicates the 32 bits of the number of 20 millisecond periods that are the next more significant bits than the 16 least significant bits.

7. The method of claim 1, wherein the forward-directed RTP packet indicates a coordinated universal time (UTC) time stamp.

8. The method of claim 1, further comprising: indicating in the reverse-directed RTP packet that at least one air-interface frame error was detected for an included vocoder frame.

9. The method of claim 1, further comprising: indicating in the reverse-directed RTP packet that no air-interface frame errors were detected for an included vocoder frame.

10. The method of claim 1, further comprising: in response to the indication, receiving in a forward-directed, RTP packet an acknowledgment of the maximum frame size requested.

11. The method of claim 1, wherein the acknowledgment indicates that generation of a vocoder frame having a frame size less than or equal to the maximum frame size is pending.

12. The method of claim 1, further comprising: indicating in the reverse-directed RTP packet a maximum delay allowed for honoring the maximum frame size request.

13. The method of claim 12, wherein the maximum delay indication is represented as a number of vocoder frames.

14. The method of claim 1, further comprising: requesting in the reverse-directed RTP packet a vocoder-time-alignment change.

15. The method of claim 14, wherein the vocoder-time-alignment change is represented as a number of voice encoder samples to insert / discard to effect a sampling alignment change.

16. The method of claim 1, further comprising: indicating in the reverse-directed RTP packet that transmission of at least one subsequent vocoder frame may be discontinued.

17. The method of claim 1, further comprising: indicating in the reverse-directed RTP packet that transmission of at least one subsequent vocoder frame will not be discontinued.

18. The method of claim 1, further comprising: receiving in a forward-directed, RTP packet an indication that transmission of at least one subsequent vocoder frame may be discontinued.

19. The method of claim 1, further comprising: receiving in a forward-directed, RTP packet an indication that transmission of at least one subsequent vocoder frame will not be discontinued.

20. The method of claim 1, further comprising: indicating in the reverse-directed RTP packet a payload type that supports Tandem Free Operation (TFO).

21. The method of claim 1, further comprising: receiving in a forward-directed, RTP packet an indication of a payload type that supports Tandem Free Operation (TFO).

22. A base station (BS) comprising: a base site controller (BSC) adapted to request in a reverse-directed, Real Time Protocol (RTP) packet a maximum frame size for a vocoder frame and adapted to receive, in response to the request and in a forward-directed RTP packet, a vocoder frame having a frame size less than or equal to the maximum frame size; a base transceiver system (BTS), communicatively coupled to the BSC, adapted to communicate with a mobile station (MS) via an air interface.

23. The base station of claim 22, wherein the BSC is further adapted to indicate in the reverse-directed RTP packet that transmission of at least one subsequent vocoder frame may be discontinued.

24. The base station of claim 22, wherein the BSC is further adapted to indicate in the reverse-directed RTP packet that at least one air-interface frame error was detected for an included vocoder frame.

25. The base station of claim 22, wherein the BSC is further adapted to indicate in the reverse-directed RTP packet a coordinated universal time (UTC) time stamp.

26. The base station of claim 22, wherein the forward-directed RTP packet indicates a coordinated universal time (UTC) time stamp.

27. The base station of claim 22, wherein the BSC is further adapted to request in the reverse-directed RTP packet a vocoder-time-alignment change.

28. The base station of claim 22, wherein the BSC is further adapted to receive in a forward-directed, RTP packet an indication that transmission of at least one subsequent vocoder frame will not be discontinued.

29. The base station of claim 22, wherein the BSC is further adapted to indicate in the reverse-directed RTP packet a payload type that supports Tandem Free Operation (TFO).

30. The base station of claim 22, wherein the BSC is further adapted to receive in a forward-directed, RTP packet an indication of a payload type that supports Tandem Free Operation (TFO).

31. A transcoding system comprising: a base site controller (BSC) adapted to request in a reverse-directed, Real Time Protocol (RTP) packet a maximum frame size for a vocoder frame and adapted to receive in a forward-directed RTP packet a vocoder frame having a frame size less than or equal to the maximum frame size; a media gateway, communicatively coupled to the BSC, adapted to receive the request and adapted to send, in response to the request, the vocoder frame having a frame size less than or equal to the maximum frame size.

32. The base station of claim 31, wherein the BSC is further adapted to indicate in the reverse-directed RTP packet that transmission of at least one subsequent vocoder frame may be discontinued.

33. The base station of claim 31, wherein the BSC is further adapted to indicate in the reverse-directed RTP packet that at least one air-interface frame error was detected for an included vocoder frame.