Method and Apparatus in a Telecommunication System
TECHNICAL FIELD OF THE INVENTION
The present invention generally relates to a method and apparatus used between a mobile station in a mobile network and a station, which is not able to suppress echoes, e.g. a fixed station, and particularly to speech and acoustic signal processing including cancellation of echoes.
DESCRIPTION OF RELATED ART
Echo cancellers are widely used in both terrestrial and atmospheric (i.e. radio, microwave) communication to eliminate the "echo" phenomenon, which greatly affects the quality of speech and audio services. An echo canceller essentially uses a copy of the data incoming to a listener to digitally estimate the echo that should return on the outgoing line. Having calculated the estimate, the echo canceller subtracts it from the outgoing signal so that the echo cancels out.
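By way of illustration only, the estimate-and-subtract principle described above can be sketched as follows. The sketch assumes a normalised least-mean-squares (NLMS) adaptive FIR filter, which is one common choice and is not prescribed by this description; the function name, the tap count, the step size and the echo path are all hypothetical, and the returning signal is assumed to contain echo only.

```python
import random

def nlms_echo_cancel(far_end, returned, taps=16, mu=0.5, eps=1e-8):
    """Subtract an adaptive estimate of the far-end echo from the returned signal."""
    w = [0.0] * taps          # adaptive FIR model of the (unknown) echo path
    x = [0.0] * taps          # most recent far-end samples, newest first
    out = []
    for far, ret in zip(far_end, returned):
        x = [far] + x[:-1]
        echo_estimate = sum(wi * xi for wi, xi in zip(w, x))
        err = ret - echo_estimate                 # signal sent on after cancelling
        norm = sum(xi * xi for xi in x) + eps
        w = [wi + mu * err * xi / norm for wi, xi in zip(w, x)]  # NLMS update
        out.append(err)
    return out

def power(signal):
    return sum(s * s for s in signal) / len(signal)

random.seed(0)
far_signal = [random.gauss(0.0, 1.0) for _ in range(3000)]
echo_path = [0.0, 0.5, 0.0, -0.3, 0.1]            # made-up echo path
echo = [sum(h * far_signal[n - k] for k, h in enumerate(echo_path) if n - k >= 0)
        for n in range(len(far_signal))]
residual = nlms_echo_cancel(far_signal, echo)
```

Once the filter has adapted, the power of `residual` is far below the power of `echo`. The coefficient update on every sample is the part that makes echo cancelling computationally expensive.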
The problem with the implementation of existing echo cancellers (ECs) is their great computational complexity, even for the simplest algorithms; the computation is performed in digital signal processors (DSPs). Echo cancellers ECs are often pooled and thus share the digital signal processors DSPs that perform the computation. Digital signal processors DSPs increase the cost of echo cancellers ECs, and it is therefore important to limit their number.
Third generation Public Land Mobile Networks (PLMNs), such as the Universal Mobile Telecommunications System (UMTS), and enhanced second generation PLMNs, such as the European Global System for Mobile Communications (GSM), are capable of operating with compressed voice streams in their core networks (CNs). Communications transmitted and received over a radio network typically use a different bit rate from communications transmitted and received over a fixed or switched telecommunications transit network such as the Public Switched Telephone Network (PSTN). Over a radio network the bit rate of data transmissions is typically 4.75-13 kBit/s in e.g. an Adaptive Multi-Rate (AMR) coded format. For the Adaptive Multi-Rate (AMR) codec, the speech and channel codec are capable of operating at gross bit-rates of 11.4 kBit/s ("half-rate") and 22.8 kBit/s ("full-rate"). In addition, the codec may operate at various combinations of speech and channel coding (codec mode) bit-rates for each channel mode.
Bit rate transmissions over the PSTN networks are conducted synchronously at 64 kBit/s in e.g. a Pulse Code Modulation (PCM) coded format. For example, speech data transmitted over the PSTN network is converted by a transcoder (TC) into a PCM code according to ITU-T Recommendation G.711 A-Law or μ-Law at 64 kBit/s.
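As a worked example of the rates mentioned above, the following sketch derives the 64 kBit/s PCM rate from the G.711 sampling parameters and compares the number of bits per 20 ms frame with a 12.2 kBit/s compressed stream. The constant and function names are illustrative assumptions; the 20 ms frame length is that of the AMR codec.

```python
SAMPLE_RATE_HZ = 8000     # G.711 telephony sampling rate
BITS_PER_SAMPLE = 8       # one A-law or mu-law code word per sample

pcm_bit_rate = SAMPLE_RATE_HZ * BITS_PER_SAMPLE    # bits per second

def bits_per_frame(bit_rate, frame_ms=20):
    """Number of bits carried in one frame of frame_ms milliseconds."""
    return bit_rate * frame_ms // 1000

pcm_bits = bits_per_frame(pcm_bit_rate)    # PCM bits per 20 ms
amr_bits = bits_per_frame(12200)           # 12.2 kBit/s stream, bits per 20 ms
```

Under these assumptions a 20 ms frame occupies 1280 bits as PCM and 244 bits at 12.2 kBit/s, which is why keeping the voice stream compressed inside the network reduces data traffic.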
In e.g. US patent No. 5687229 and WO patent No. 9815068 the use of a silence detector for echo cancelling is described. These silence detectors are built into the echo cancelling function block.
SUMMARY OF THE INVENTION
The problem dealt with by the present invention is to provide improved/maintained voice quality communications between a mobile station and a station which is not able to suppress echoes, e.g. a fixed station, with less data traffic in the network and less workload in the echo cancellers in the network.
Briefly, the present invention solves said problem by using the silence descriptor SID information in the core network.
Specifically, the problem is solved by a method according to claim 1 and by an apparatus, echo canceller, transcoder and node according to claims 5, 9, 14 and 16.
An object of the invention is to provide improved/maintained voice quality communications between a mobile station and a station which is not able to suppress echoes, e.g. a fixed station, with less data traffic between the transcoder and the echo canceller and less workload in the echo cancellers.
Another object is to develop an apparatus which is applicable to a mobile communication system and which comprises processing steps so that problems associated with the complexity, price and reliability of the system can be solved.
A further object is to leave out the silence calculation part in the echo canceller, which reduces the workload on the digital signal processors DSPs connected to the echo cancellers, so that fewer digital signal processors DSPs are needed, leading to increased bandwidth of the echo cancellers.
An advantage of the present invention is improved bandwidth, less data traffic in the network and less workload in the echo cancellers in the network.
Another advantage is that no re-sampling of silence will occur.
Still another advantage is that fewer digital signal processors DSPs, which are very costly, are needed.
Other objects, advantages and novel features of the invention will become apparent from the following detailed description of the invention when considered in conjunction with the accompanying drawings and claims.
DESCRIPTION OF THE DRAWINGS
FIG. 1 shows a block diagram of a mobile telecommunication system including a second and third generation digital system.
FIG. 2a-c show block diagrams of how echo cancelling functions between a mobile station and a fixed station.
FIG. 3 shows a block diagram of how a signal is transmitted between a mobile station and a fixed station, and between a transcoder and an echo canceller according to the invention.
FIG. 4 shows a block diagram of how speech and silence are transmitted and transcoded between a transcoder and an echo canceller according to prior art.
FIG. 5 shows a block diagram of how speech and silence are transmitted and transcoded between a transcoder and an echo canceller according to the present invention.
FIG. 6 shows a flow chart illustrating how silence descriptor information is used according to the present invention.
DESCRIPTION OF PREFERRED EMBODIMENTS
The present invention is particularly suited, but not limited, to use in second generation digital systems, such as the European Global System for Mobile communications (GSM), and third generation Public Land Mobile Networks (PLMNs), such as the Universal Mobile Telecommunications System (UMTS) and CDMA-2000.
Mobile communications have developed from first generation, analog-based mobile radio systems to second generation digital systems, such as the European Global System for Mobile communications GSM. Current developments for a third generation of mobile radio communication include the Universal Mobile Telecommunications System UMTS and the IMT 2000 system. For simplicity, third generation systems are referred to simply as UMTS. Current mobile/cellular telecommunications networks are typically designed to connect and function with Public Switched Telephone Networks (PSTNs) and Integrated Services Digital Networks (ISDNs). Both of these networks are circuit-switched networks (rather than packet-switched) and handle relatively narrow bandwidth traffic.
Figure 1 shows, from the point of view of the invention, the essential parts of a second generation digital system 120, such as the European Global System for Mobile communications GSM, and the essential parts of a third generation mobile radio communication system 130, such as the Universal Mobile Telecommunications System UMTS. In second generation digital systems, a first mobile station MS1 111 communicates with a base station transceiver BTS 122 through the Um interface. The base station transceiver BTS 122 is controlled by a base station controller BSC 121 through the Abis interface. They are further connected to the core network (CN) 140. A Radio Access Network (RAN) under control of a base station controller BSC 121, including the base station transceivers BTSs 122 (only one BTS is shown here), is commonly referred to as a GSM Radio Access Network (GRAN) 120. The core network CN 140 comprises a mobile switching center (MSC) 141, and the interface between the mobile switching center MSC 141 and the GSM radio access network GRAN 120 is referred to as the A-interface (A). The mobile switching center MSC 141 handles the connecting of incoming and outgoing calls. It performs functions similar to those of an exchange of a public switched telephone network PSTN 150. In addition to these, it also performs functions characteristic of mobile communications only, such as subscriber location management, jointly with the subscriber registers of the network. As subscriber registers, the GSM system includes at least a Home Location Register (HLR) 149 and a Visitor Location Register (here the VLR is included in the MSC, see MSC/VLR) 141. The core network CN 140 further includes the gateway MSC (GMSC) 149, the serving GPRS support node (SGSN) 149 and the gateway GPRS support node (GGSN) 149. The mobile switching centers MSC 141 are connected to other networks, such as the public switched telephone network PSTN 150 (and/or an integrated services digital network ISDN), etc. The base station transceiver BTS 122, in conjunction with the base station controller BSC 121, receives information from the first mobile station MS1 111, processes the information and forwards the processed information to the mobile switching center MSC 141. One skilled in the art will appreciate that a base station transceiver BTS 122 and base station controller BSC 121 pair also receives information from the mobile switching centers MSC 141 and transfers such information to one or more mobile stations MS1 111. At the mobile switching centers MSCs 141, information from the base station controller BSC 121 is routed to its destination (e.g., the PSTN when the information is speech or fax/modem data).
For a third generation mobile radio communication system, such as the Universal Mobile Telecommunications System UMTS, each of the core network CN 140 service nodes connects through the Iu interface to a UMTS Terrestrial Radio Access Network (UTRAN) 130, a part of the Radio Access Network (RAN). The UTRAN 130 includes one or more radio network controllers RNCs 131-132. Each RNC 131-132 is connected to a plurality of radio base stations RBSs 133-137 (another name for a radio base station RBS in a UMTS system is Node-B) through the Iub interface, and through the Iur interface to any other RNC 131-132 in the UTRAN 130. Radio communications between the radio base stations RBS 133-137 and the second and third mobile stations MS2-MS3 112-113 are by way of the Uu radio interface. The connection-oriented PSTN 150 network is connected to the core network CN 140 via the included connection-oriented service node, shown as a mobile switching center MSC node 141, which provides circuit-switched services. A first fixed station FS1 161 is e.g. connected to the PSTN 150; one skilled in the art will appreciate that it can be any station/user equipment which is not able to suppress echoes. In the preferred embodiment, radio access is based on Wideband-CDMA (WCDMA) with individual radio channels allocated using WCDMA spreading codes. The RAN interface is an "open" interface between the GSM-based service nodes GRANs 120 or the UMTS-based service nodes UTRANs 130, which provide services to/from the first, second and third mobile stations MS1-MS3 111-113 over the radio interface, and the external core networks CNs 140 (and ultimately the external core network end users), without having to request the specific radio resources necessary to provide those services. The RAN interface essentially hides those details from the service nodes, external networks, and users. Instead, logical radio access bearers are simply requested, established, maintained, and released at the RAN interface by the service nodes. A radio access bearer is a logical connection between an external core network support node and a mobile station/user equipment through the UTRAN. It is the task of the UTRAN to map radio access bearers onto physical transport channels in a flexible, efficient, and optimal manner. Moreover, in FIG. 1 the number of each of the components depicted is reduced for the sake of simplicity.
In a mobile communication system, a speech signal is processed and converted from one format to another at several different points, which differ depending on whether the system is GSM or UMTS. The bit rate used over a mobile network, represented as a mobile transmission rate, may for the GSM system be e.g. 13 kBit/s, depending on whether full-rate or half-rate is used, see FIG. 4 and standard ETSI GSM 06.10, and for the UMTS system 12.2 kBit/s, see FIG. 5 and standard 3G TS 26.103 v.3.0.0. For both systems a fixed transit network transmission rate may be used over the network and is typically 64 kBit/s in a PCM coded format, see standard G.711. The second and third mobile stations 112-113 (GSM: the first mobile station MS1 111) in a UMTS Terrestrial Radio Access Network UTRAN 130 (GSM: GSM radio access network GRAN 120) communicate with a station/user equipment which is not able to suppress echoes, e.g. a first fixed station FS1 161 in a public switched telephone network PSTN 150, whereby user data such as the user's voice is coded and compressed within the second and third mobile stations 112-113 (GSM: first mobile station MS1 111) and transmitted over a radio link to any one of the RBSs 133-137 (GSM: base station transceiver BTS 122) at a transmission rate of e.g. 12.2...4.75 kBit/s (GSM: e.g. 13, 12.2, 5.6 kBit/s). The user data is forwarded through the mobile switching centers MSC 141 in a compressed format, at the same transmission rate of 12.2 kBit/s (GSM: 13 kBit/s), contained in a first mobile frame structure. The mobile switching centers MSC 141 will generate a fixed transit network frame structure and insert the user data to be transmitted across the transit network synchronously at the 64 kBit/s fixed transit network transmission rate. In the prior art GSM system the user data is already converted to the fixed transit network transmission rate in the base station controller BSC 121, and in the prior art UMTS system in the radio network controllers RNCs 131-132, and is forwarded from there to the mobile switching center MSC 141 at that rate.
Transcoders (TCs) are used to convert speech from one digital audio format to another (e.g. from AMR coded to PCM coded) and for data rate adaptation (e.g. from 13, 12.2, 5.6 kBit/s to 64 kBit/s, or from 12.2...4.75 kBit/s to 64 kBit/s), as described above, see FIG. 4-5. In prior art GSM systems the transcoders TCs are located in the base station controllers BSCs 121, and in prior art UMTS systems they are located in the radio network controller RNC, not shown. In the present invention, however, for both GSM and UMTS systems the first to n transcoders TC1-TCn 142-143 can e.g. be placed in the core network CN 140 as shown in FIG. 1.
Echo Cancellers (ECs), first to n echo cancellers EC1-ECn 144-145, prevent an echo returning from e.g. the first fixed station FS1 161, a station/user equipment which cannot suppress echoes, from reaching a mobile station/user equipment MS1-MS3 111-113, see FIG. 2a-c. In FIG. 2a no echo canceller is used in the network, and "Hello" is echoed back to the mobile station MS. In FIG. 2b "Silence" occurs in the mobile station MS instead, since "Hello" is cancelled in the echo canceller EC, and correspondingly in FIG. 2c "Silence" is handled by the echo canceller EC, according to prior art. The first to n echo cancellers EC1-ECn 144-145 are e.g. usually placed in connecting lines between the mobile switching centers MSCs 141, or included in the mobile switching center MSC 141, as shown in FIG. 1.
The problem with the implementation of existing echo cancellers ECs is their great computational complexity, even for the simplest algorithms. The present invention uses a Silence Descriptor (SID), explained further below, as a means to stop the echo canceller EC from adapting itself to "silent" (e.g. noise-only) periods of speech and to disable the FIR filter that is normally connected to the echo canceller EC, which brings a considerable reduction in the number of adaptations. It has been shown that on average a speaker will actively produce speech, including inter-syllable pauses, only 40% of the time. Therefore, potentially over a whole conversation, a 60% reduction in complexity is achieved with the echo canceller EC of the present invention.
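The 60% figure above can be illustrated with a small counting sketch. It assumes, purely for illustration, that a conversation is divided into equal intervals each labelled "SPEECH" or "SID", and that one filter adaptation is performed per interval unless the adaptation is gated by the SID indication; the function name and the labels are hypothetical.

```python
def count_adaptations(frame_kinds, gate_on_sid):
    """Count the intervals on which the echo canceller would adapt its filter."""
    return sum(1 for kind in frame_kinds
               if not (gate_on_sid and kind == "SID"))

# Hypothetical conversation: 40% of the intervals carry active speech.
frame_kinds = ["SPEECH"] * 40 + ["SID"] * 60

always_adapting = count_adaptations(frame_kinds, gate_on_sid=False)
sid_gated = count_adaptations(frame_kinds, gate_on_sid=True)
reduction = 1 - sid_gated / always_adapting
```

With these assumed proportions the gated canceller adapts on 40 of 100 intervals, i.e. a reduction of 0.6 in the number of adaptations.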
Echo cancellers ECs are often pooled and thus share first to n digital signal processors DSP1-DSPn 146-147 that perform the computation. The digital signal processors DSP1-DSPn 146-147 increase the cost, and with the present invention the echo cancellers ECs 144-145 can be made inoperative, e.g. by turning the echo cancellers ECs off or by by-passing them, when silence descriptor SID information or a changed format of the silence descriptor SID information is received at the echo canceller.
The echo canceller EC1-ECn 144-145 may comprise a by-pass function for by-passing the echo canceller EC1-ECn 144-145 as a response to the silence descriptor SID information or changed format of the silence descriptor SID information. The echo canceller EC1-ECn 144-145 may further comprise a turning off function for turning the echo canceller EC1-ECn 144-145 off as a response to the silence descriptor SID information or changed format of the silence descriptor SID information. The echo canceller EC1-ECn 144-145 may further comprise a transmitter for transmitting from the echo canceller EC1-ECn 144-145 a comfort noise signal as a response to the silence descriptor SID information or changed format of the silence descriptor SID information. The echo canceller EC1-ECn 144-145 may further comprise a component for generating a comfort noise signal as a response to receiving said silence descriptor SID information or changed format of said silence descriptor SID information.
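The optional functions listed above may be pictured with the following sketch, given by way of example only. The class, the enumeration and the parameter names are hypothetical; the cancelling computation is a placeholder; and the output of a turned-off canceller is not specified in this description, so the sketch simply assumes silence.

```python
from enum import Enum
import random

class SidResponse(Enum):
    BYPASS = "bypass"
    TURN_OFF = "turn_off"
    COMFORT_NOISE = "comfort_noise"

class EchoCanceller:
    def __init__(self, response, noise_level=0.01, seed=0):
        self.response = response
        self.noise_level = noise_level
        self.cancellations = 0              # how often the costly path was run
        self._rng = random.Random(seed)

    def process(self, samples, sid_received):
        if not sid_received:
            self.cancellations += 1
            return self._cancel(samples)
        if self.response is SidResponse.BYPASS:
            return list(samples)            # signal passes around the canceller
        if self.response is SidResponse.TURN_OFF:
            return [0.0] * len(samples)     # assumption: an idle canceller emits silence
        # COMFORT_NOISE: emit low-level noise instead of running the canceller
        return [self._rng.uniform(-self.noise_level, self.noise_level)
                for _ in samples]

    def _cancel(self, samples):
        # Placeholder for the real echo cancelling computation.
        return list(samples)
```

In each of the three responses the `_cancel` path is skipped while SID information is present, which is where the saving in DSP workload would come from.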
As the radiocommunication industry matures, various subscriber usage patterns have been recognized. For example, as described before, it has been found that during a typical voice connection between two subscribers, the actual voice activity transmitted over the air interface accounts for less than 50% of the total connection time. This has been exploited, for example, using a discontinuous transmitter (DTX) that becomes inoperative when a voice activity detector (VAD) detects a pause in the user's speech. During such pauses, however, nothing is reproduced at the receiving side, and the listener notices the discontinuity. One way to overcome this difficulty is to generate artificial background noise for reproduction at the receiving side when no voice signal is transmitted. This artificial background noise is commonly referred to as "comfort noise". Comfort noise can be generated by adaptive functions that monitor the background noise picked up by the microphone of a remote unit. When a pause in speech is detected, the comfort noise functions generate comfort noise information that is transmitted over the air interface instead of speech information. This information takes relatively little time to transmit, thereby allowing the transmitter to be turned off during most of each period of silence. At the receiving end, the comfort noise information is used to generate background noise so that the listener is not troubled by the discontinuity in transmission. Such a comfort noise generation technique is currently available in both GSM and UMTS. In the mobile station MS1-MS3 111-113 a comfort noise evaluation algorithm is used in the speech encoder to create parameters that include information on the level and spectrum of the background noise. The evaluated comfort noise parameters may then be encoded into a Silence Descriptor SID frame for GSM and UMTS for transmission, according to the invention, to the transcoders TC1-TCn 142-143. The SID frame is characterised by the SID gross bit patterns. It may convey information on the acoustic background noise. For simplicity a silence descriptor SID frame is referred to as silence descriptor SID information. The SID frame is transmitted at the end of a speech burst, i.e., before the transmitter is switched off. As such, for GSM the SID frame serves to initiate the comfort noise generation on the receiver side. For GSM, if the period of silence continues after transmission of the first SID frame, the mobile station MS1-MS3 111-113 transmits SID update frames. A SID update frame performs several functions. It indicates not only that the period of speech inactivity continues, but also that the cellular connection is still present. Moreover, the SID update frame serves to update the background noise detected at the remote unit. The interval at which these SID update frames are transmitted depends on the type of speech coder employed.
For example, for Full Rate (FR) and Enhanced Full Rate (EFR) speech coders in GSM, SID frames are transmitted when FN MOD 104 = 52, where FN is the Frame Number. This corresponds to SID frames being transmitted approximately every 480 ms. For a Half Rate speech coder, the rate at which SID frames are transmitted is doubled, i.e. every 240 ms. Moreover, for the newly developed Adaptive Multi-Rate (AMR) speech coder, the SID transmit rate is predicted to be up to four times higher than for the FR or EFR coders, i.e. every 120 ms. See standard 3G TS 26.103 V.3.0.0. In general, the VAD in the transmitter of a mobile station MS1-MS3 111-113 detects whether a traffic frame consists of speech or no speech (e.g. non-transparent data or background noise). For GSM, if the frame consists of no speech (e.g. only of noise), the transmitter sends one silence descriptor SID frame, and then the transmission is stopped. Following the initial SID frame, one new SID frame is sent during each SACCH period until either speech or no speech (e.g. non-transparent data) is again detected within a traffic frame. The signal quality measurement reports are sent e.g. on the SACCH. Each of the SID frames contains e.g. information about the background noise of the established connection, which is being monitored by the voice activity detector VAD. However, one skilled in the art will recognize that silence descriptor SID information may be transmitted in another form than in a SID frame as described here. The silence descriptor SID information can for example be based on the standard 3G TS 26.103 V3.0.0.
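The SID update intervals quoted above follow from the GSM frame timing, as the following sketch shows. It relies on the GSM TDMA frame duration of 120/26 ms (about 4.615 ms); the function and variable names are illustrative, and the half-rate and AMR intervals are simply the doubled and quadrupled rates stated in the text.

```python
from fractions import Fraction

TDMA_FRAME_MS = Fraction(120, 26)    # duration of one GSM TDMA frame in ms

def is_sid_update_frame(fn):
    """FR/EFR rule from the text: a SID frame is sent when FN MOD 104 = 52."""
    return fn % 104 == 52

fr_efr_interval_ms = 104 * TDMA_FRAME_MS       # one SID frame per 104 TDMA frames
half_rate_interval_ms = fr_efr_interval_ms / 2 # rate doubled
amr_interval_ms = fr_efr_interval_ms / 4       # rate up to four times higher

sid_frame_numbers = [fn for fn in range(312) if is_sid_update_frame(fn)]
```

Here `fr_efr_interval_ms` evaluates to 480, and the first SID update frame numbers are 52, 156 and 260, i.e. 104 frames apart.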
In other instances, detailed descriptions of well-known methods, devices, and circuits are omitted so as not to obscure the description of the present invention. For the present invention to work, each mobile station MS1-MS3 111-113 comprises a discontinuous transmitter DTX and a voice activity detector VAD (not shown) that operate in the manner described above. As such, each mobile station MS1-MS3 111-113 transmits speech frames (speech frame: a traffic frame that has been classified as a SPEECH frame) as well as SID frames. Each mobile station MS1-MS3 111-113 is also capable of transmitting packet data and fax/modem information.
In an exemplary arrangement according to the present invention, shown in FIG. 1, a processing block 148 with the included mobile switching center MSC/VLR 141 in the core network CN 140 carries out the tasks comprised by the transcoding and echo cancelling functions. The basic task of the transcoding is to convert speech encoded in a format optimized for transmission over the air interface to e.g. G.711 A-law or μ-law at 64 kBit/s. Echo cancellation may take place e.g. as follows. A speech sample, IF1U in FIG. 3, is encoded according to ITU-T Recommendation G.711 A-law or μ-law at 64 kBit/s in the transcoder TC (TC1-TCn 142-143 in FIG. 1); the speech thus encoded is transmitted to the echo canceller EC (EC1-ECn 144-145 in FIG. 1), IF2U in FIG. 3; in addition, the encoded speech is stored in a memory, not shown, from which a speech processing unit may retrieve it when needed; when the same speech sample comes back to the transcoder TC (TC1-TCn 142-143 in FIG. 1) from the direction of the fixed station FS1 (fixed station FS1 161 in FIG. 1), which is not able to suppress echoes, IF3D in FIG. 3, the speech processing unit retrieves the sample corresponding to the speech frame in question from the memory and on the basis of the sample it cancels, IF2D in FIG. 3, from the downlink speech frame a signal detected to be an echo. The speech processing unit consists e.g. of a digital signal processor DSP1-DSPn 146-147 with its peripheral devices, see FIG. 1.
As illustrated in FIG. 1, according to the present invention the transcoders TC1-TCn 142-143 are located at the edge or interface of a core network CN 140, where their placement next to the echo cancellers EC1-ECn 144-145 facilitates communication between the echo cancellers EC1-ECn 144-145 and the transcoders TC1-TCn 142-143. Assume that each of the plurality of transcoders TC1-TCn 142-143 receives either speech frames or SID frames during a first time slot. The received frames are decoded by a respective transcoder TC1-TCn 142-143 into e.g. PCM coded 64 kBit/s packets and, according to the present invention, transferred to an echo canceller EC1-ECn 144-145. In the core network CN 140, the transcoder TC1-TCn 142-143 processes the packets received during the first time slot irrespective of the packets' contents. That is, the transcoder TC1-TCn 142-143 processes packets that contain only SID update information with the same priority as it processes packets containing speech information. According to the present invention the SID information is used either directly in the transcoder TC1-TCn 142-143, which directly by-passes or disables (e.g. turns off) the echo cancellers EC1-ECn 144-145, or by transmitting packets containing either speech data or SID information to the echo cancellers EC1-ECn 144-145. In the transcoder TC1-TCn 142-143 the SID information can be used directly to differentiate transmitted speech data from no speech data (e.g. silence or noise) and, in response to the information of no speech, to by-pass or disable the echo cancellers EC1-ECn 144-145. If the SID information is transmitted to the echo cancellers EC1-ECn 144-145, it can be used there to differentiate speech from no speech. This saves time, since in prior art echo cancellers ECs a calculation is needed to differentiate speech from no speech, which also takes up processing time in the digital signal processors DSP1-DSPn 146-147. With the SID information the echo cancellers EC1-ECn 144-145 are, according to the invention, able to disable themselves (e.g. by turning themselves off) or to have incoming signals by-passed, thereby avoiding unnecessary echo cancellation calculations during no speech.
In prior art echo cancellers ECs a silence detector is necessary, see FIG. 4; by complicated calculation using the digital signal processors DSPs, the silence detector detects whether or not speech occurs. Using the SID information thereby provides a faster way of making the echo cancellers ECs inoperative.
The flowchart 600 of FIG. 6 illustrates an exemplary embodiment of the invention according to which silence descriptor SID information is modified to be incorporated in the transcoder TC1-TCn 142-143. A first node MS1-MS3 111-113 and a second node FS1 161 in a mobile communication system communicate over a link from the first node MS1-MS3 111-113 to the second node FS1 161. A non-desired echo signal is generated in the second node FS1 161, and the echo signal appears in a third node CN 140 and is cancelled in the third node CN 140. In the first node, communication signals are encoded in such a way that, in response to detecting that no speech is present within said first node MS1-MS3 111-113, silence descriptor SID information is sent within the link from a transmitter of the first node MS1-MS3 111-113. In a first step 601, the silence descriptor SID information is transmitted from the transmitter of the first node MS1-MS3 111-113 to a receiver of the third node CN 140. In a next step 602, the silence descriptor SID information is transmitted to a receiver of a transcoder 142-143 in the third node CN 140, and further transmitted from a transmitter of the transcoder 142-143, in the same or a changed format, to a receiver of an echo canceller 144-145. Further, in step 603, the silence descriptor SID information or the changed format of the silence descriptor SID information is processed in the echo canceller EC1-ECn 144-145, and the echo canceller EC1-ECn 144-145 is made inoperative in such a way that no echo cancelling is performed as a response to said silence descriptor SID information or changed format of said silence descriptor SID information. In step 604, the silence descriptor SID information or the changed format of the silence descriptor SID information is processed and, as a response to it, the echo canceller 144-145 is by-passed, or the echo canceller 144-145 is turned off, or a comfort noise signal is transmitted from the echo canceller 144-145.
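Steps 601 to 604 can be traced with the following sketch, which is illustrative only. The `Frame` type, the function names and the log strings are hypothetical, the decoding and cancelling computations are placeholders, and the SID indication is assumed to travel from the transcoder to the echo canceller as a simple flag (one possible "changed format").

```python
from dataclasses import dataclass

@dataclass
class Frame:
    kind: str                 # "SPEECH" or "SID", as classified in the first node
    payload: bytes = b""

def transcode(frame):
    """Step 602: decode the frame and forward the SID indication with it."""
    pcm = frame.payload       # placeholder for real decoding to 64 kBit/s PCM
    return pcm, frame.kind == "SID"

def echo_cancel(pcm, sid_flag, log):
    """Steps 603-604: make the canceller inoperative while SID is indicated."""
    if sid_flag:
        log.append("inoperative")
        return pcm            # by-passed: no cancelling computation is run
    log.append("cancelling")
    return pcm                # placeholder for the real cancelling computation

def handle_uplink(frames):
    log = []
    for frame in frames:      # step 601: frames arrive at the third node
        pcm, sid_flag = transcode(frame)
        echo_cancel(pcm, sid_flag, log)
    return log

trace = handle_uplink([Frame("SPEECH", b"\x01"), Frame("SID"),
                       Frame("SID"), Frame("SPEECH", b"\x02")])
```

In this trace the echo canceller runs only for the two speech frames and makes no silence determination of its own, since that decision has already been taken in the first node.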
As a person skilled in the art appreciates, application of the invention is in no way limited to the AMR speech codec conforming to the 3G TS 26.xxx specifications. The invention is thus also applicable to other speech coding specifications, e.g. GSM 06.xx.
As will be recognized by those skilled in the art, the innovative concepts described in the present application can be modified and varied over a wide range of applications. Accordingly, the scope of patented subject matter should not be limited to any of the specific exemplary teachings discussed.