WO2008083517A1 - Method and system for implementing voice compensation in a mobile communication network - Google Patents

Method and system for implementing voice compensation in a mobile communication network

Info

Publication number
WO2008083517A1
Authority
WO
WIPO (PCT)
Prior art keywords
frame
voice
invalid
compensation
network side
Prior art date
Application number
PCT/CN2007/000099
Other languages
English (en)
French (fr)
Inventor
Donghua Lu
Wei Ruan
Jian Cao
Hongwei Lou
Wanchun Zhang
Original Assignee
ZTE Corporation
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by ZTE Corporation
Priority to EP07702031.1A (patent EP2129051B1)
Priority to PCT/CN2007/000099 (patent WO2008083517A1)
Priority to CN2007800403922A (patent CN101529830B)
Publication of WO2008083517A1

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/005 Correction of errors induced by the transmission channel, if related to the coding algorithm

Definitions

  • The present invention relates to voice compensation techniques, and more particularly to a method and system for voice compensation when a network-side device uses no vocoder or only partially uses one.
  • In a mobile communication system, the network-side vocoder has two main functions. In the uplink, the user terminal device compresses and encodes the voice and sends it to the network side, where the network-side vocoder must decode the received compressed voice to make it suitable for transmission in the network. In the downlink, the network-side vocoder must compress and encode the voice stream carried in the network so that it is suitable for transmission over the air link.
  • Take the CDMA2000 (Code Division Multiple Access 2000) system as an example. The voice codecs currently used in CDMA2000 systems are mainly the following three: EVRC (Enhanced Variable Rate Coder), QCELP-13k (Qualcomm Code Excited Linear Predictive Coding at a rate of 13 kbps), and QCELP-8k (Qualcomm Code Excited Linear Predictive Coding at a rate of 8 kbps).
  • EVRC is the mainstream codec format in wide use today.
  • In a typical call between MS1 (Mobile Station) and MS2, both stations use the same voice codec (for example, EVRC).
  • The voice of the MS1 user reaches the ear of the MS2 user as follows. First, MS1 sends the encoded EVRC compressed voice frames to network side 1 over the uplink air link. Network side 1 uses vocoder 1 to decode the received EVRC voice frames into a circuit-mode PCM (Pulse Coded Modulation) stream, which is then circuit-switched. Network side 2 receives the circuit-switched PCM stream, uses vocoder 2 to convert it back into EVRC compressed voice frames, and sends them to MS2 over the downlink air link.
  • Vocoder encoding and decoding of voice is lossy compression, so every codec pass degrades voice quality. Still taking the call between MS1 and MS2 as an example: because MS1 and MS2 use the same codec format, the additional encoding and decoding of the EVRC compressed voice frames on the network side can be skipped, which removes two codec passes on the network side.
  • The voice of the MS1 user then reaches the ear of the MS2 user as follows: MS1 sends the encoded EVRC compressed voice frames to network side 1 over the uplink air link; network side 1 switches the received EVRC voice frames directly to network side 2; network side 2 receives the switched EVRC compressed voice frames and sends them to MS2 over the downlink air link.
  • TrFO (Transcoder Free Operation) means that, through an out-of-band negotiation mechanism, the network can negotiate the vocoder's codec type and mode before the call is established. After negotiation, calls between mobile users can bypass the network-side vocoders entirely, which improves voice quality and saves expensive vocoder resources and the power they consume.
  • RTO (Remote Transcoder Operation) is a special case of TrFO. Because the two parties' codec modes cannot be agreed in out-of-band negotiation, a vocoder is needed on the network side to convert one party's codec format into the other party's.
  • The main difference between RTO and a TDM circuit transmission network is that in a TDM network the network side performs two codec conversions, whereas RTO requires only one.
  • For example, MS1 uses the EVRC codec format and MS2 uses the QCELP-13k codec format. During a call between them, the voice of the MS1 user reaches the ear of the MS2 user as follows: first, MS1 sends the encoded EVRC compressed voice frames to network side 1 over the uplink air link; network side 1 switches the received EVRC voice frames directly to network side 2; network side 2 receives the switched EVRC compressed voice frames, converts them into QCELP-13k compressed voice frames with its vocoder, and sends them to MS2 over the downlink air link.
  • Taking CDMA2000 LMSD (Legacy Mobile Station Domain) as an example, the out-of-band negotiation of TrFO is done through signaling between the access network and the MSCe. Because CDMA2000 LMSD uses IP switching, the network side can transmit the compressed voice data encoded by the user terminal device directly as RTP (Real-Time Transport Protocol) packets over the IP network. It is no longer necessary to convert the various voice codecs to PCM and carry them over TDM circuits.
  • Taking EVRC as an example, the maximum transmission rate of EVRC is 8 kbps (the rate of full-rate frames), and EVRC also includes a large number of half-rate frames and 1/8-rate frames.
  • Statistics show that in an EVRC call, full-rate frames account for about 30% on average, at 22 bytes per 20 ms frame; half-rate frames account for about 30%, at 10 bytes per 20 ms frame; and 1/8-rate frames account for about 40%, at 2 bytes per 20 ms frame.
  • Because RTP transmission supports multi-frame packing, EVRC frames can be packed together for transmission in the network to save IP header overhead.
  • With three EVRC frames packed into one RTP packet, and with the IP header overhead added, the average rate of EVRC transmission in the network is 11.7 kbps.
  • By comparison, one PCM voice stream on a TDM circuit runs at 64 kbps, so carrying compressed voice over IP saves (1-11.7/64)=81.7% of the bandwidth. This shows that TrFO can save a lot of network bandwidth.
  • However, TrFO has run into some problems in practical use. For example, suppose MS1 and MS2 are in a TrFO call. If the air-link quality is poor, network side 1 may fail to correctly receive and parse some of the frames MS1 sends over the uplink; these are air-interface frame errors. In a TDM circuit transmission network, such unparseable frames are smoothed by the network-side vocoder. In TrFO, no vocoder is involved, so network side 1 can only fill these errored frames with the to-be-compensated frames specified by the protocol (for example, in EVRC, a half-rate frame with all bits "0" and a full-rate frame with all bits "0" are defined as to-be-compensated frames) and switch them to network side 2, which sends them on to the MS2 handset.
  • Moreover, because of the nature of IP transmission, frame loss or jitter may be introduced while the voice frames of network side 1 travel through the network to network side 2. When network side 2 does not receive a frame from network side 1 within the specified time, it likewise fills in a to-be-compensated frame as the protocol specifies and sends it to MS2.
  • An RTO call uses a vocoder on the network side. Suppose MS1 and MS2 are in an RTO call and MS1 sends frames to network side 1 over the uplink. If the air-link quality is poor and network side 1 receives errored frames, voice compensation can still be performed by the network-side vocoder.
  • However, while the compensated voice frames travel through the network, frame loss and jitter may still be introduced by network transmission quality problems, and network side 2 will then fill in the to-be-compensated frames specified by the protocol and send them to MS2. If MS2 cannot effectively compensate these frames, the overall voice quality of RTO suffers significantly.
  • TrFO and RTO reduce the number of codec passes in the network-side vocoders, which improves voice quality.
  • However, TrFO and RTO cannot use the network-side vocoder for voice compensation the way the original circuit-switched mobile communication system does, so voice compensation depends entirely on the vocoder in the user terminal.
  • In practice, the user terminal equipment produced by the various manufacturers on the market does not necessarily compensate a received to-be-compensated voice frame. The voice quality of TrFO and RTO therefore depends heavily on the compensation performance of the terminal's vocoder, and whether the terminal compensates to-be-compensated frames in the various situations has a great influence on the overall voice quality of TrFO and RTO.
  • The present invention provides a method and system for implementing voice compensation in a mobile communication network. It applies to voice with poor transmission quality when the network-side device uses no vocoder or only partially uses one, and it compensates the voice approximately to improve overall voice quality.
  • A method for implementing voice compensation in a mobile communication network comprises:
  • a. At each frame processing time, the network-side device determines whether a voice frame received or ready to be sent is an invalid frame;
  • b. The network-side device performs voice compensation processing on the invalid frame.
  • The method further includes the following steps:
  • A2. Determine whether the frame distance between the invalid frame and the last valid frame is less than or equal to the compensation threshold; if yes, proceed to the next step.
  • The voice compensation processing of the invalid frame in step b uses one of the following methods:
  • Valid frame copy: replace the current invalid frame with the last valid frame;
  • 1/4-rate frame padding: replace the current invalid frame with a 1/4-rate frame whose content is arbitrary;
  • Simulation approximation: replace the current invalid frame with a simulated frame.
  • The invalid frame is a blank frame, an erased frame, or another frame for which the protocol defines no frame rate; a frame that is not received at the specified frame processing time; or a frame that the protocol specifies requires voice compensation once the vocoder receives it.
  • The voice frame is a forward voice frame or a reverse voice frame. When it is a forward voice frame, the last valid frame is the last valid frame among the forward voice frames; when it is a reverse voice frame, the last valid frame is the last valid frame among the reverse voice frames.
  • The present invention also provides a system for implementing voice compensation in a mobile communication network. The system is disposed in a network-side device and includes:
  • an invalid frame detecting unit, which determines whether a voice frame received or ready to be sent by the network-side device is an invalid frame, sends invalid frames to the voice compensation unit, and sends valid frames to the unit that processes voice frames in the network-side device;
  • a voice compensation unit, which performs voice compensation processing on the invalid frame and sends the compensated voice frame to the unit that processes voice frames in the network-side device.
  • The voice compensation unit includes:
  • a voice compensation judging unit, which receives the invalid frames sent by the invalid frame detecting unit, sends invalid frames in the non-1/8-rate state to the voice compensation processing unit, and sends the other invalid frames to the unit that processes voice frames in the network-side device;
  • a voice compensation processing unit, which receives the invalid frames sent by the voice compensation judging unit, performs voice compensation on them, and sends the compensated voice frames to the unit that processes voice frames in the network-side device.
  • Further, the voice compensation judging unit determines whether the last valid frame before the received invalid frame is a non-1/8-rate frame. If so, the invalid frame is considered an invalid frame in the non-1/8-rate state; otherwise, it is not an invalid frame in the non-1/8-rate state.
  • Further, the voice compensation judging unit determines the frame distance between an invalid frame in the non-1/8-rate state and the last valid frame, sends invalid frames whose frame distance is less than or equal to the compensation threshold to the voice compensation processing unit, and sends invalid frames whose frame distance is greater than the compensation threshold to the unit that processes voice frames in the network-side device.
  • The voice compensation processing that the voice compensation unit performs on the invalid frame is one of the methods listed above: valid frame copy, 1/4-rate frame padding, or simulation approximation.
  • When the voice frame received by the network-side device is a blank frame, an erased frame, another frame for which the protocol defines no frame rate, a frame that is not received at the predetermined frame processing time, or a frame that the protocol specifies requires voice compensation once the vocoder receives it, the voice frame is considered an invalid frame.
  • The voice frame received by the network-side device is a forward voice frame or a reverse voice frame. When it is a forward voice frame, the last valid frame is the last valid frame among the forward voice frames; when it is a reverse voice frame, the last valid frame is the last valid frame among the reverse voice frames.
  • The network-side device is a base station, a base station controller, a radio network controller, or a mobile switching center.
  • FIG. 1 is a flow chart of a specific implementation of the method for implementing voice compensation according to the present invention;
  • FIG. 2 is a schematic diagram of a specific implementation of the system for implementing voice compensation according to the present invention;
  • FIG. 3 is a flow chart of Embodiment 1 of the present invention;
  • FIG. 4 is a flow chart of Embodiment 2 of the present invention;
  • FIG. 5 is a flow chart of Embodiment 3 of the present invention.
  • Preferred embodiment of the invention
  • The main idea of the present invention is that during a call, full-rate and half-rate frames contribute the most to the speech. If a full-rate or half-rate frame is lost or damaged, voice quality is easily affected.
  • A large number of experiments have shown that, especially during a run of continuous full-rate or half-rate frames, losing one or several full-rate frames often causes choppy speech and swallowed words, and losing one or several half-rate frames often produces vibrato. Both are uncomfortable to the human ear, and the exact degree of discomfort depends on the codec performance of the user terminal's vocoder.
  • The invention therefore aims primarily to compensate for full-rate or half-rate frames.
  • The invention provides a method for implementing voice compensation in a mobile communication network. It applies when the air wireless environment or the transmission quality is poor and the network side uses no vocoder (as in TrFO) or only partially uses one (as in RTO).
  • Step 1: At each forward voice frame processing time, the network-side device examines the forward voice frame to be processed, received from the network side or ready to be sent, and determines whether it is an invalid frame. Alternatively, at each reverse voice frame processing time, the network-side device examines the reverse voice frame to be processed, received from the user terminal device or ready to be sent, and determines whether it is an invalid frame:
  • If it is an invalid frame, go to step 2;
  • Otherwise, the voice frame is processed normally and output.
  • An invalid frame is one of the following:
  • a blank frame, an erased frame, or another frame for which the protocol defines no frame rate;
  • a frame that is not received at the specified frame processing time (for example because of frame loss or of frame delay caused by jitter);
  • a frame that the protocol specifies requires voice compensation once the vocoder receives it.
  • Step 2: The network-side device then determines whether the invalid frame needs voice compensation processing. The test is whether the invalid frame is an invalid frame in the non-1/8-rate state:
  • If it is an invalid frame in the non-1/8-rate state, it has a larger impact on voice quality, and the procedure goes to step 3;
  • If it is an invalid frame in the 1/8-rate state, it has little effect on voice quality and need not be compensated; the invalid frame is processed normally and output.
  • Whether an invalid frame is in the non-1/8-rate state is determined as follows:
  • The network-side device checks whether the last valid frame is a 1/8-rate frame. If the last valid frame is a non-1/8-rate frame, the invalid frame is an invalid frame in the non-1/8-rate state; otherwise, it is an invalid frame in the 1/8-rate state.
  • If the network-side device examined forward voice frames in step 1, the last valid frame checked in this step is the last valid forward voice frame; if it examined reverse voice frames in step 1, the last valid frame checked in this step is the last valid reverse voice frame.
  • A "valid frame" is a frame that the vocoder can encode and decode normally during a voice call; that is, any frame other than an invalid frame is a valid frame.
  • The "last valid frame" is the valid frame received or ready to be sent at the previous frame processing time. If the frame received or ready to be sent at the previous frame processing time is itself invalid, the last valid frame is the valid frame received or ready to be sent at the frame processing time before that, and so on.
  • Step 3: Determine whether the frame distance between the invalid frame and the last valid frame is less than or equal to the compensation threshold for voice compensation:
  • If the frame distance is less than or equal to the threshold, go to step 4;
  • Otherwise, the invalid frame is processed normally and output without compensation.
  • The compensation threshold is related to the performance of the mobile communication system and to the effect of the compensation. It can be chosen by comparing the results of multiple experiments and selecting, according to voice quality, the threshold that gives the best compensation. For example, if the compensation threshold is set to 6, up to six consecutive invalid frames are compensated; if it is set to 2, only two consecutive invalid frames are compensated, and the third consecutive invalid frame is no longer compensated.
  • The "frame distance" is defined as follows: in a set of sequentially arriving frames, the number of frames between frame A and frame B, plus one, is the frame distance between frame A and frame B.
  • For example, in the sequence frame a, frame b, frame c, frame d, the frame distance between frame a and frame d is 3.
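The frame-distance definition can be illustrated with a few lines of Python. This is only a sketch: the function name `frame_distance` and the use of list positions to stand for arrival order are assumptions made for the example and do not come from the patent.

```python
def frame_distance(frames, a, b):
    """Frame distance between frames a and b in an arrival-ordered sequence.

    Follows the definition in the text: the number of frames lying between
    the two frames, plus one. Arrival order is taken from list positions,
    which is an assumption of this sketch.
    """
    i, j = frames.index(a), frames.index(b)
    frames_between = abs(i - j) - 1
    return frames_between + 1


sequence = ["a", "b", "c", "d"]
print(frame_distance(sequence, "a", "d"))  # frames b and c lie between: 2 + 1 = 3
```

Under this definition, the first invalid frame right after a valid frame has frame distance 1, which is why a threshold of 2 compensates exactly two consecutive invalid frames.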
  • Step 4: The network-side device performs voice frame compensation on the invalid frame and replaces the invalid frame with the compensated voice frame, which becomes the voice frame to be processed and output.
  • The voice frame compensation method used by the network-side device is one of the following: valid frame copy, 1/4-rate frame padding, simulation approximation, and the like.
  • Valid frame copy: replace the current invalid frame with the last valid frame.
  • 1/4-rate frame padding: this method applies only to voice calls that use the EVRC codec format. The current invalid frame is replaced with a 1/4-rate frame, whose frame content may be arbitrary.
  • Simulation approximation: following a rule obtained by simulation, a frame is synthesized from the rate and content of the last valid frame and the frame distance between the current invalid frame and the last valid frame, and this synthesized frame replaces the current invalid frame.
  • The compensated voice frame is processed normally and output.
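Steps 1 to 4 can be sketched as a small per-direction state machine. The sketch below is illustrative only and uses valid frame copy in step 4. The class and field names (`Frame`, `VoiceCompensator`), the rate labels, the use of `None` for a frame that did not arrive in time, and the pass-through behaviour before any valid frame has been seen are all assumptions of the example, not details given in the patent.

```python
from dataclasses import dataclass
from typing import Optional

# Rate labels are hypothetical names chosen for this sketch.
FULL, HALF, QUARTER, EIGHTH, INVALID = "full", "half", "1/4", "1/8", "invalid"


@dataclass
class Frame:
    rate: str            # one of the labels above
    payload: bytes = b""


class VoiceCompensator:
    """State for one direction (forward or reverse) of a call."""

    def __init__(self, threshold: int):
        self.threshold = threshold                # compensation threshold (step 3)
        self.last_valid: Optional[Frame] = None   # last valid frame seen so far
        self.distance = 0                         # frame distance to last_valid

    def process(self, frame: Optional[Frame]) -> Optional[Frame]:
        """Return the frame to hand on for normal processing and output.

        `None` models a frame that was not received at its processing time.
        """
        # Step 1: a valid frame is passed on unchanged and remembered.
        if frame is not None and frame.rate != INVALID:
            self.last_valid, self.distance = frame, 0
            return frame
        self.distance += 1
        # Step 2: compensate only invalid frames in the non-1/8-rate state.
        # (Passing through when no valid frame exists yet is an assumption.)
        if self.last_valid is None or self.last_valid.rate == EIGHTH:
            return frame
        # Step 3: compare the frame distance with the compensation threshold.
        if self.distance > self.threshold:
            return frame
        # Step 4: valid frame copy, the last valid frame replaces the invalid one.
        return self.last_valid


comp = VoiceCompensator(threshold=2)
voiced = Frame(FULL, b"\x01" * 22)
out = [comp.process(f) for f in (voiced, None, Frame(INVALID), None)]
# The first two invalid frames are replaced by `voiced`; the third is passed on as-is.
```

Keeping one `VoiceCompensator` per direction mirrors the requirement that forward frames are compared with the last valid forward frame and reverse frames with the last valid reverse frame.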
  • The present invention also provides a system for implementing voice compensation in a mobile communication network. It is installed in a network-side device and applies when the air wireless environment or the transmission quality is poor and the network-side device uses no vocoder or only partially uses one.
  • The system includes:
  • an invalid frame detecting unit, which at each frame processing time determines whether the forward or reverse voice frame received or ready to be sent by the network-side device is an invalid frame, sends invalid frames to the voice compensation unit, and sends valid frames to the unit that processes voice frames in the network-side device;
  • a voice compensation unit, which includes:
  • a voice compensation judging unit, which receives the invalid frames sent by the invalid frame detecting unit, sends to the voice compensation processing unit those invalid frames that are in the non-1/8-rate state and whose frame distance from the last valid frame is less than or equal to the compensation threshold, and sends the other invalid frames to the unit that processes voice frames in the network-side device;
  • a voice compensation processing unit, which receives the invalid frames sent by the voice compensation judging unit and performs voice compensation on them using one of the methods described above (valid frame copy, 1/4-rate frame padding, or simulation approximation);
  • the compensated voice frame is sent to the unit that processes voice frames in the network-side device.
  • The network-side device may be a base station, a base station controller, a radio network controller, or a mobile switching center.
  • The present invention applies to voice calls in which the network-side device uses no vocoder or only partially uses one for voice compensation, including wireless communication systems that use TrFO, RTO, or TFO (Tandem Free Operation).
  • The present invention applies equally to wireless communication systems such as CDMA2000, WCDMA (Wideband Code Division Multiple Access), and TD-SCDMA (Time Division-Synchronous Code Division Multiple Access).
  • Application Embodiment 1: Voice compensation using the valid frame copy method.
  • The voice compensation method used in this embodiment is valid frame copy.
  • The frame-distance threshold for voice compensation is 1; that is, only the first invalid frame after a valid frame in the full-rate state is compensated, and invalid frames that arrive consecutively after it are no longer compensated. As shown in FIG. 3, the specific steps are as follows:
  • Step 101: At each forward voice frame receiving time, the network-side device examines the forward voice frame received from the network side. If it is an invalid frame, go to step 102; if it is a normal voice frame, go directly to step 104.
  • Step 102: If the previous frame is also an invalid frame, no special processing is performed and the procedure goes directly to step 104. If the previous frame is a non-full-rate frame, no special processing is performed and the procedure goes directly to step 104. If the previous frame is a full-rate frame, go to step 103. Note that the full-rate frame here is a valid frame.
  • Step 103: Discard the current invalid frame and replace it with the previous frame, that is, the full-rate frame; go to step 104.
  • Step 104: The current forward voice frame is processed normally and output.
  • Application Embodiment 2: Voice compensation using the 1/4-rate frame padding method.
  • This embodiment applies to calls that use the EVRC codec.
  • The legal EVRC encoding formats do not include 1/4-rate frames.
  • A large number of experiments have shown that the vocoders of the various user terminal devices perform voice compensation when they receive a 1/4-rate frame in the EVRC codec format. As shown in FIG. 4, the specific steps of this embodiment are as follows:
  • Step 201: At each forward voice frame processing time, the network-side device determines the rate of the forward voice frame received from the network side. If it is an invalid frame, go to step 202; if it is a normal voice frame, go directly to step 205.
  • Step 202: If the last valid frame is a full-rate frame, go to step 203; otherwise, go directly to step 205.
  • Step 203: Determine the frame distance between the last valid frame and the current invalid frame. If the frame distance is less than or equal to the preset compensation threshold, go to step 204; if it is greater than the preset compensation threshold, go to step 205.
  • Step 204: Discard the current invalid frame and replace it with a 1/4-rate frame, whose frame content may be arbitrary. This 1/4-rate frame becomes the current forward voice frame.
  • Step 205: The current forward voice frame is processed normally and output.
  • The main idea of this embodiment is to replace a run of consecutive invalid frames immediately following a full-rate frame with 1/4-rate frames. Every invalid frame whose frame distance from the last full-rate valid frame is less than or equal to the predefined threshold is replaced with a 1/4-rate frame; invalid frames with a frame distance greater than the threshold receive no additional voice compensation. That is, if a full-rate valid frame is followed by a continuous run of invalid frames, only those within the threshold are replaced.
  • The compensation threshold of this method can also be set to infinity; in that case every consecutive invalid frame immediately following a full-rate frame is replaced with a 1/4-rate frame.
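Steps 201 to 205 can be sketched over a sequence of per-frame rate labels. The example is a hedged illustration: the function name `pad_with_quarter_rate`, the string labels, and the choice to model only the frame rate (the 1/4-rate frame content is arbitrary anyway) are assumptions of the sketch. The default threshold of `math.inf` corresponds to the "infinite threshold" variant mentioned in the text.

```python
import math

FULL, QUARTER, INVALID = "full", "1/4", "invalid"   # hypothetical rate labels


def pad_with_quarter_rate(rates, threshold=math.inf):
    """Apply steps 201-205 to a sequence of per-frame rate labels.

    Invalid frames that follow a full-rate valid frame within `threshold`
    frames are replaced by a 1/4-rate frame; all other frames pass through.
    """
    out = []
    last_valid = None   # rate of the last valid frame
    distance = 0        # frame distance to the last valid frame
    for rate in rates:
        if rate != INVALID:                 # step 201: normal voice frame
            last_valid, distance = rate, 0
            out.append(rate)                # step 205
            continue
        distance += 1
        if last_valid == FULL and distance <= threshold:   # steps 202-203
            out.append(QUARTER)             # step 204
        else:
            out.append(rate)                # step 205, no compensation
    return out


rates = [FULL, INVALID, INVALID, INVALID, "half", INVALID]
print(pad_with_quarter_rate(rates, threshold=2))
# ['full', '1/4', '1/4', 'invalid', 'half', 'invalid']
```

With the default infinite threshold, all three invalid frames after the full-rate frame would become 1/4-rate frames, while the invalid frame after the half-rate frame would still pass through unchanged.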
  • Application Embodiment 3: Voice compensation using the simulation approximation method.
  • Step 301: At each forward voice frame processing time, the network-side device determines the rate of the forward voice frame received from the network side. If it is an invalid frame, go to step 302; if it is a normal voice frame, go directly to step 305.
  • Step 302: If the last valid frame is a full-rate frame, go to step 303; if the last valid frame is a non-full-rate frame, no special processing is performed and the procedure goes directly to step 305.
  • Step 303: If the frame distance between the last valid frame and the current invalid frame is less than or equal to 6, go to step 304; otherwise, go to step 305.
  • Step 304: Discard the current invalid frame. Using the content of the last valid frame and the frame distance between the last valid frame and the current invalid frame as parameters, construct a pseudo full-rate frame according to the approximation rule obtained earlier by statistical induction, replace the invalid frame with it, and use the pseudo full-rate frame as the current forward voice frame; go to step 305.
  • Step 305: The current forward voice frame is processed normally and output.
  • The main idea of this embodiment is to replace the invalid frames immediately following a full-rate frame with simulated voice frames, using the content of the full-rate frame and the frame distance between the invalid frame and the full-rate frame as inputs to the simulation.
  • The three embodiments above mainly compensate for frames in the full-rate state. In practical applications, voice compensation can also be configured to apply when the last valid frame is a full-rate frame or a half-rate frame.
  • The compensation threshold can also be set according to the actual situation.
  • The invention addresses the following problem: when the air wireless environment or the network transmission quality is poor, the network side uses no vocoder, or only partially uses one, for voice quality compensation and linear prediction, so voice quality depends heavily on whether the user terminal's vocoder compensates certain to-be-compensated frames and on how well it does so, which lowers overall voice quality and causes discomfort to the human ear. To solve it, the invention provides a system and method for implementing voice compensation on the network side.
  • With the technical solution of the present invention, when the air wireless environment or the network transmission quality is poor and the network side uses no vocoder or only partially uses one, the voice can be compensated with a certain approximation. This reduces the discomfort that swallowed words, vibrato, and speech discontinuity cause to the human ear, improves overall voice quality, and reduces the call's dependence on the performance of the user terminal and its vocoder.

Landscapes

  • Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Signal Processing (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Telephonic Communication Services (AREA)
  • Mobile Radio Communication Systems (AREA)

Description

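A Method and System for Implementing Voice Compensation in a Mobile Communication Network
Technical Field
The present invention relates to voice compensation technology, and in particular to a method and system for performing voice compensation when a network-side device uses no vocoder or only partially uses one.
Background Art
In a mobile communication system, the network-side vocoder has two main functions. In the uplink, the user terminal device compresses and encodes the voice and sends it to the network side, where the network-side vocoder must decode the received compressed voice to make it suitable for transmission in the network. In the downlink, the network-side vocoder must compress and encode the voice stream carried in the network to make it suitable for transmission over the air link.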
以 CDMA2000( Code Division Multiple Access 2000码分多址接入 2000 ) 系统为例。 CDMA2000 系统目前所使用的语音编解码方式主要包括以下三 种: EVRC( Enhanced Variable Rate Coder增强型可变速率编码)、 QCELP-13k ( Qualcomm Code Excited Linear Predictive Coding-Qualcomm- 13k速率为 13kps的码激励线性预测编码)、 QCELP-8k ( Qualcomm Code Excited Linear Predictive Coding-Qualcomm-8k速率为 8kps的码激励线性预测编码) 。 其 中, EVRC为当前被广泛采用的主流编解码格式。在一次典型的 MS1( Mobile Station移动台)和 MS2通话过程中, MS1和 MS2都使用相同的语音编解 码方式(例如 EVRC ) 。 MS1用户的语音是通过以下方式到达 MS2用户的 耳中的: 首先 MS1通过上行空中链路将编码后的 EVRC压缩语音帧发送到 网络侧 1 , 网络侧 1使用声码器 1将接收到的 EVRC语音帧进行解码, 转换 成电路方式的 PCM ( Pulse Coded Modulation脉冲编码调制)码流, 再进行 电路交换处理; 网絡侧 2接收到网络侧电路交换过来的 PCM码流后, 再利 用声码器 2将 PCM码流转换成为 EVRC压缩语音帧, 并通过下行空中链路 发送给 MS2。
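Vocoder encoding and decoding is lossy compression, so every encode/decode pass lowers voice quality. Consider the MS1 and MS2 call again. Because MS1 and MS2 use the same codec, the additional encoding and decoding of EVRC frames on the network side can be skipped, which removes two codec passes on the network side. MS1's speech then reaches the MS2 user as follows: MS1 first sends the encoded EVRC compressed speech frames over the uplink air link to network side 1; network side 1 switches the received EVRC frames directly to network side 2; network side 2 receives the switched EVRC compressed speech frames and sends them to MS2 over the downlink air link.
As this example shows, removing two lossy codec passes on the network side noticeably improves voice quality, saves network-side vocoder resources, and reduces the transmission and processing delay of the speech. In the early days of mobile communication systems, voice calls were mainly between mobile users and fixed users, so these effects were not yet significant. Traffic statistics show that calls between mobile users now dominate, and the original vocoder configuration both raises equipment cost and degrades system performance. Improving the network architecture and strategy for vocoder configuration and management has therefore become an active research topic.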
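Supporting traditional voice services and packet data services at lower cost and in a more flexible and efficient way is the main driver behind the development of all-IP mobile communication technology. When supporting traditional voice services, an all-IP mobile network faces the problem of how to support multiple vocoders at lower cost, that is, how to support the so-called vocoder-free features TrFO (Transcoder Free Operation) and RTO (Remote Transcoder Operation).
TrFO means that, through an out-of-band negotiation mechanism, the network can negotiate the vocoder codec type and mode before the call is set up. After negotiation, calls between mobile users can bypass the network-side vocoder entirely, which improves voice quality and saves expensive vocoder resources and the power they consume.
RTO is a special case of TrFO. When the two parties cannot agree on a codec during out-of-band negotiation, a vocoder is still needed on the network side to convert one party's coding format into the other's. The main difference between RTO and a TDM circuit transport network is that in a TDM network the network side performs two codec conversions, whereas RTO needs only one. An example of RTO: MS1 uses the EVRC codec and MS2 uses the QCELP-13k codec. During the call, MS1's speech reaches the MS2 user as follows: MS1 first sends the encoded EVRC compressed speech frames over the uplink air link to network side 1; network side 1 switches the received EVRC frames directly to network side 2; network side 2 uses its vocoder to convert the received EVRC compressed speech frames into QCELP-13k compressed speech frames and sends them to MS2 over the downlink air link.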
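Take CDMA2000 LMSD (Legacy Mobile Station Domain) as an example: TrFO out-of-band negotiation is carried out through signaling between the access network and the MSCe. Because CDMA2000 LMSD uses IP switching, the network side can carry the compressed speech data encoded by the user terminal directly as RTP (Real-time Transport Protocol) packets over the IP network, without converting each speech coding format to PCM for transport over TDM circuits.
Take EVRC as an example. The maximum EVRC transmission rate is 8 kbps (the rate of full-rate frames), and EVRC also produces large numbers of half-rate and 1/8-rate frames. Statistics show that in an EVRC call, full-rate frames make up about 30% of frames on average, at 22 bytes per 20 ms frame; half-rate frames make up about 30%, at 10 bytes per 20 ms frame; and 1/8-rate frames make up about 40%, at 2 bytes per 20 ms frame. In addition, because RTP supports bundling multiple frames into one packet, EVRC frames can also be bundled for transport to save IP header overhead. With three EVRC frames bundled into one RTP packet, and including the IP header overhead, the average EVRC rate in the network is 11.7 kbps. In conventional TDM circuit transport, one PCM voice stream is carried at 64 kbps, so sending compressed speech over all-IP saves (1 - 11.7/64) = 81.7% of the bandwidth used by the PCM stream in TDM circuit mode. This example shows that TrFO can save a large amount of network bandwidth.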
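The arithmetic in the paragraph above can be checked with a short sketch. The frame mix, frame sizes, the 11.7 kbps packetized rate and the 64 kbps PCM rate are taken from the text; the function names are illustrative only. The text does not state the per-packet header size, so the sketch takes 11.7 kbps as given rather than trying to derive it.

```python
# Frame mix quoted in the text: (share of frames, bytes per 20 ms frame).
FRAME_MIX = {
    "full": (0.30, 22),
    "half": (0.30, 10),
    "eighth": (0.40, 2),
}
FRAME_PERIOD_S = 0.020  # one EVRC frame every 20 ms


def average_payload_kbps(mix=FRAME_MIX, period_s=FRAME_PERIOD_S):
    """Average speech payload rate, before any RTP/IP header overhead."""
    avg_bytes = sum(share * size for share, size in mix.values())
    return avg_bytes * 8 / period_s / 1000.0


def bandwidth_saving(packet_kbps, pcm_kbps=64.0):
    """Fraction of the PCM bandwidth saved by sending compressed speech."""
    return 1.0 - packet_kbps / pcm_kbps


payload_kbps = average_payload_kbps()
saving = bandwidth_saving(11.7)  # 11.7 kbps is the figure stated in the text
print(f"payload only: {payload_kbps:.2f} kbps, saving vs PCM: {saving:.1%}")
```

The payload-only average is well below 11.7 kbps; the difference is the header overhead that the text folds into its figure.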
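In practice, however, TrFO runs into problems. For example, suppose MS1 and MS2 are in a TrFO call. If air-link quality is poor, network side 1 may fail to correctly receive and parse some of the frames that MS1 sends on the uplink; these are air-interface frame errors. In a TDM circuit transport network, such unparseable frames are smoothed by the network-side vocoder. In TrFO, with no vocoder involved, network side 1 can only pad these errored frames into the to-be-compensated frames defined by the protocol (for example, in EVRC a half-rate frame with all bits "0" or a full-rate frame with all bits "0" is defined as a frame to be compensated) and switch them to network side 2, which sends them on to the MS2 handset. Likewise, because of the nature of IP transport, speech frames travelling from network side 1 to network side 2 may be lost or jittered. When network side 2 does not receive a frame from network side 1 within the specified time, it also pads in a to-be-compensated frame as the protocol requires and sends it to MS2. If MS2 performed voice compensation on these frames, which are introduced by air-link quality and network transmission quality, there would be no problem. Extensive experiments show, however, that the great majority of user terminals do not compensate such frames. These frames therefore have a large impact on overall TrFO voice quality.
RTO has the same problem. An RTO call uses a vocoder on the network side. Suppose MS1 and MS2 are in an RTO call. If air-link quality is poor, network side 1 can still use the network-side vocoder to compensate the errored frames it receives from MS1 on the uplink. But when the compensated speech frames travel to network side 2, frame loss and jitter may still be introduced by network transmission quality, in which case network side 2 pads in protocol-defined to-be-compensated frames for MS2. If MS2 cannot compensate these frames effectively, overall RTO voice quality suffers badly.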
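In summary, when the radio link environment is good and network transmission quality is ideal, TrFO and RTO do improve voice quality by reducing the number of network-side codec passes. But when the radio link environment is poor or network transmission quality is poor, TrFO and RTO cannot rely on the network-side vocoder for voice compensation as the original circuit-switched mobile systems do. Voice compensation then depends entirely on the vocoder in the user terminal. At present, terminals from different vendors differ in whether they compensate received to-be-compensated frames at all, so TrFO and RTO voice quality depends heavily on the compensation performance of the terminal vocoder and on whether it compensates each kind of to-be-compensated frame. This has a large impact on overall TrFO and RTO voice quality.
Our practice has shown that, during a run of consecutive full-rate frames, if a full-rate frame is corrupted or lost under TrFO or RTO, the user terminal receives a frame to be compensated. Voice quality when the terminal handles such frames is clearly worse than in a TDM circuit transport network where a network-side vocoder handles them; the former sometimes exhibits word swallowing, warbling and choppy speech. The degree of degradation also varies among terminals with different vocoders.
Summary of the Invention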
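To address the shortcomings above, the present invention provides a method and system for implementing voice compensation in a mobile communication network. It applies when transmission quality is poor and the network-side device does not use a vocoder, or uses one only partially, and it compensates the speech approximately to improve overall voice quality.
The technical solution adopted by the invention is as follows.
A method for implementing voice compensation in a mobile communication network, characterized by comprising:
a. at each frame processing instant, the network-side device determines whether the speech frame it has received or is about to send is an invalid frame; if so, it proceeds to the next step;
b. the network-side device performs voice compensation on the invalid frame.
Further, after step a the method also includes:
a1. determining whether the invalid frame is a frame in the non-1/8-rate state; if so, proceeding to the next step.
Further, in step a1, whether the invalid frame is an invalid frame in the non-1/8-rate state is determined as follows:
determine whether the last valid frame before the invalid frame is a non-1/8-rate frame; if so, the invalid frame is an invalid frame in the non-1/8-rate state; otherwise it is not.
Further, after step a1 the method also includes:
a2. determining whether the frame distance between the invalid frame and the last valid frame is less than or equal to the compensation threshold; if so, proceeding to the next step.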
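Further, the voice compensation performed on the invalid frame in step b uses one of the following methods:
valid-frame copy: replace the current invalid frame with the last valid frame;
1/4-rate frame filling: replace the current invalid frame with a 1/4-rate frame of arbitrary content;
simulation approximation: replace the current invalid frame with a frame obtained by simulation.
Further, an invalid frame is a blank frame, an erased frame, or another frame whose frame rate is not defined in the protocol; a frame not received by the specified frame processing instant; or a frame that, according to the protocol, requires voice compensation when the vocoder receives it.
Further, the speech frame is a forward speech frame or a reverse speech frame;
when the speech frame is a forward speech frame, the last valid frame is the last valid forward speech frame;
when the speech frame is a reverse speech frame, the last valid frame is the last valid reverse speech frame.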
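The invention also provides a system for implementing voice compensation in a mobile communication network, characterized in that the system resides in a network-side device and comprises:
an invalid frame detection unit, which determines whether the speech frame that the network-side device has received or is about to send is an invalid frame, sends invalid frames to the voice compensation unit, and sends valid frames to the unit in the network-side device that processes speech frames; and
a voice compensation unit, which performs voice compensation on invalid frames and sends the compensated speech frames to the unit in the network-side device that processes speech frames.
Further, the voice compensation unit comprises:
a voice compensation decision unit, which receives the invalid frames sent by the invalid frame detection unit, sends invalid frames in the non-1/8-rate state to the voice compensation processing unit, and sends all other invalid frames to the unit in the network-side device that processes speech frames; and
a voice compensation processing unit, which receives the invalid frames sent by the voice compensation decision unit, performs voice compensation on them, and sends the compensated speech frames to the unit in the network-side device that processes speech frames.
Further, the voice compensation decision unit determines whether the last valid frame before the received invalid frame is a non-1/8-rate frame; if so, the invalid frame is treated as an invalid frame in the non-1/8-rate state; otherwise it is not.
Further, the voice compensation decision unit determines the frame distance between an invalid frame in the non-1/8-rate state and the last valid frame, sends invalid frames whose frame distance is less than or equal to the compensation threshold to the voice compensation processing unit, and sends invalid frames whose frame distance is greater than the compensation threshold to the unit in the network-side device that processes speech frames.
Further, the voice compensation performed by the voice compensation unit on an invalid frame is one of the following:
replacing the current invalid frame with the last valid frame;
replacing the current invalid frame with a 1/4-rate frame of arbitrary content; or
replacing the current invalid frame with a frame obtained by simulation.
Further, the invalid frame detection unit treats a speech frame as invalid when the speech frame received by the network-side device is a blank frame, an erased frame, another frame whose frame rate is not defined in the protocol, a frame not received by the specified frame processing instant, or a frame that, according to the protocol, requires voice compensation once the vocoder receives it.
Further, the speech frame received by the network-side device is a forward speech frame or a reverse speech frame;
when the speech frame is a forward speech frame, the last valid frame is the last valid forward speech frame;
when the speech frame is a reverse speech frame, the last valid frame is the last valid reverse speech frame.
Further, the network-side device is a base station, a base station controller, a radio network controller, or a mobile switching center.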
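In calls where no vocoder participates on the network side, or where the vocoder participates only partially, the system and method of the invention effectively address the listener discomfort caused by a poor radio environment or poor network transmission quality, including a clear reduction in choppy speech, warbling and word swallowing during a call. Because the solution performs voice compensation in the network-side device, it effectively reduces the dependence of the call on the performance of the user terminal and its vocoder, and meets the voice quality needs of a wide range of user terminals.
Brief Description of the Drawings
Figure 1 is a flowchart of a specific implementation of the voice compensation method of the invention;
Figure 2 is a schematic diagram of a specific implementation of the voice compensation system of the invention;
Figure 3 is a flowchart of Embodiment 1 of the invention;
Figure 4 is a flowchart of Embodiment 2 of the invention;
Figure 5 is a flowchart of Embodiment 3 of the invention.
Preferred Embodiments of the Invention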
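The invention is described in more detail below with reference to the drawings and specific embodiments.
The main idea of the invention is as follows. During a call, full-rate and half-rate frames contribute most to the speech, so losing or corrupting them easily degrades voice quality. Extensive experiments show that, especially in a run of consecutive full-rate or half-rate frames, the loss or corruption of one or several full-rate frames often causes choppy speech and word swallowing, and the loss or corruption of one or several half-rate frames often causes warbling. Both are uncomfortable to the listener, to a degree that depends on the codec performance of the terminal's vocoder. The invention therefore mainly targets compensation of full-rate and half-rate frames.
The invention proposes a method for implementing voice compensation in a mobile communication network, applied when the radio environment or transmission quality is poor and the network side does not use a vocoder (as in TrFO) or uses one only partially (as in RTO). As shown in Figure 1, it comprises the following steps:
Step 1: At each forward speech frame processing instant, the network-side device examines the pending forward speech frame, received from the network side or about to be sent, and determines whether it is an invalid frame; or, at each reverse speech frame processing instant, the network side examines the pending reverse speech frame, received from the user terminal or about to be sent, and determines whether it is an invalid frame:
if it is an invalid frame, go to step 2;
if it is not an invalid frame, process the speech frame normally and output it.
An "invalid frame" is any of the following:
a blank frame, an erased frame, or another frame whose frame rate is not defined in the protocol;
a frame not received by the specified frame processing instant (for example a lost frame, or a frame delayed by jitter); or a frame that, according to the protocol, requires voice compensation once the vocoder receives it.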
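The invalid-frame test of step 1 can be sketched as a small classifier. This is a minimal illustration, not the patent's implementation: the `FrameKind` names, the `Frame` type and the use of `None` for a frame that missed its processing instant are assumptions introduced here.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class FrameKind(Enum):
    # Valid speech frame rates mentioned in the text.
    FULL = "full"
    HALF = "half"
    EIGHTH = "eighth"
    # Invalid categories listed in the definition above (names are hypothetical).
    BLANK = "blank"
    ERASED = "erased"
    UNDEFINED_RATE = "undefined_rate"
    TO_BE_COMPENSATED = "to_be_compensated"  # e.g. a protocol-defined all-zero frame


@dataclass(frozen=True)
class Frame:
    kind: FrameKind
    payload: bytes = b""


INVALID_KINDS = frozenset({
    FrameKind.BLANK,
    FrameKind.ERASED,
    FrameKind.UNDEFINED_RATE,
    FrameKind.TO_BE_COMPENSATED,
})


def is_invalid(frame: Optional[Frame]) -> bool:
    """Return True for an invalid frame.

    None models a frame that did not arrive by its processing instant
    (lost, or delayed by jitter).
    """
    return frame is None or frame.kind in INVALID_KINDS
```

For example, `is_invalid(None)` and `is_invalid(Frame(FrameKind.ERASED))` are both true, while a full-rate frame with a payload is valid.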
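Step 2: The network-side device next determines whether the invalid frame needs voice compensation. The criterion is whether the invalid frame is an invalid frame in the non-1/8-rate state:
if it is an invalid frame in the non-1/8-rate state, it will have a significant effect on voice quality; go to step 3;
if it is an invalid frame in the 1/8-rate state, its effect on voice quality is small and compensation can be skipped; process the invalid frame normally and output it.
Whether an invalid frame is in the non-1/8-rate state is determined as follows:
the network-side device checks whether the last valid frame is a 1/8-rate frame. If that valid frame is a non-1/8-rate frame, the invalid frame is an invalid frame in the non-1/8-rate state; otherwise it is an invalid frame in the 1/8-rate state.
If in step 1 the network-side device examined a forward speech frame, this step considers the last valid forward speech frame; if in step 1 it examined a reverse speech frame, this step considers the last valid reverse speech frame.
A "valid frame" is a frame that the vocoder can encode and decode normally during a voice call; that is, every frame other than an invalid frame is a valid frame.
The "last valid frame" is the valid frame received or about to be sent at the previous frame processing instant; if the frame at the previous processing instant was invalid, it is the valid frame at the processing instant before that, and so on.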
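Step 3: Determine whether the frame distance between the invalid frame and the last valid frame is less than or equal to the compensation threshold for voice compensation:
if the frame distance is less than or equal to the threshold, go to step 4;
if the frame distance is greater than the threshold, do not compensate; process the invalid frame normally and output it.
The compensation threshold depends on the performance of the mobile communication system and on the compensation effect. It can be chosen by comparing the results of repeated experiments and selecting, on the basis of voice quality, the value that gives the best compensation. For example, with a threshold of 6, six consecutive invalid frames are compensated; with a threshold of 2, only two consecutive invalid frames are compensated, and the third consecutive invalid frame is not.
The "frame distance" is defined as follows: in a sequence of frames arriving in order, the number of frames between frame A and frame B, plus 1, is the frame distance between frame A and frame B. For example, in the ordered sequence frame a, frame b, frame c, frame d, the frame distance between frame a and frame d is 3.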
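The frame distance and the threshold test of step 3 can be illustrated as follows. The sketch assumes frames are numbered by arrival order, so that "frames in between, plus 1" reduces to the difference of the two indices; the function names are illustrative.

```python
def frame_distance(index_a: int, index_b: int) -> int:
    """Frames strictly between A and B, plus 1, for frames numbered in arrival order."""
    return abs(index_b - index_a)


def within_threshold(last_valid_index: int, invalid_index: int, threshold: int) -> bool:
    """Step 3: compensate only if the invalid frame is close enough to the last valid frame."""
    return frame_distance(last_valid_index, invalid_index) <= threshold


# The example from the text: frames a, b, c, d arrive in order.
order = ["a", "b", "c", "d"]
distance_a_d = frame_distance(order.index("a"), order.index("d"))

# A valid frame at position 0 followed by three invalid frames, with threshold 2:
# the first two invalid frames are compensated, the third is not.
decisions = [within_threshold(0, i, threshold=2) for i in (1, 2, 3)]
print(distance_a_d, decisions)
```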
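Step 4: The network-side device performs speech frame compensation on the invalid frame, and the compensated speech frame replaces the invalid frame as the frame to be processed and output this time. The network-side device uses one of the following compensation methods: valid-frame copy, 1/4-rate frame filling, simulation approximation, and so on.
Valid-frame copy: replace the current invalid frame with the last valid frame.
1/4-rate frame filling: this method applies only to voice calls using the EVRC codec. A 1/4-rate frame replaces the current invalid frame; the content of the 1/4-rate frame can be arbitrary.
Simulation approximation: according to rules obtained by simulation, a frame is simulated from the rate and content of the last valid frame and the frame distance between the current invalid frame and the last valid frame, and this simulated frame replaces the current invalid frame.
After compensation, the compensated speech frame is processed normally and output.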
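Steps 1 to 4 can be put together in one small state machine. The following is a minimal sketch using the valid-frame copy method, under stated assumptions: the `Frame` tuple, the rate strings, the `Compensator` class and the use of `None` for a frame that never arrived are all illustrative choices, and one instance would be kept per direction (forward or reverse).

```python
from typing import NamedTuple, Optional


class Frame(NamedTuple):
    rate: str            # "full", "half", "eighth", or "invalid" (illustrative labels)
    payload: bytes = b""


class Compensator:
    """Per-direction state for network-side compensation by valid-frame copy."""

    def __init__(self, threshold: int = 2):
        self.threshold = threshold
        self.last_valid: Optional[Frame] = None
        self.distance = 0  # frame distance from the last valid frame

    def process(self, frame: Optional[Frame]) -> Optional[Frame]:
        # Step 1: valid frames pass through and become the new reference.
        if frame is not None and frame.rate != "invalid":
            self.last_valid, self.distance = frame, 0
            return frame
        self.distance += 1
        # Step 2: skip compensation in the 1/8-rate state (or with no reference yet).
        if self.last_valid is None or self.last_valid.rate == "eighth":
            return frame
        # Step 3: skip compensation beyond the compensation threshold.
        if self.distance > self.threshold:
            return frame
        # Step 4: valid-frame copy.
        return self.last_valid


comp = Compensator(threshold=2)
stream = [Frame("full", b"A"), None, Frame("invalid"), Frame("invalid"),
          Frame("eighth", b"e"), Frame("invalid")]
output = [comp.process(f) for f in stream]
print([f.rate if f else None for f in output])
```

With a threshold of 2, the first two invalid frames after the full-rate frame are replaced by a copy of it, the third is passed through, and the invalid frame after the 1/8-rate frame is left alone.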
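The invention also provides a system for implementing voice compensation in a mobile communication network. The system resides in a network-side device and applies when the radio environment or transmission quality is poor and the network-side device does not use a vocoder or uses one only partially. As shown in Figure 2, the system comprises:
an invalid frame detection unit, which at each frame processing instant determines whether the forward or reverse speech frame that the network-side device has received or is about to send is an invalid frame, sends invalid frames to the voice compensation unit, and sends valid frames to the unit in the network-side device that processes speech frames.
The voice compensation unit comprises:
a voice compensation decision unit, which receives the invalid frames sent by the invalid frame detection unit, sends to the voice compensation processing unit those invalid frames that are in the non-1/8-rate state and whose frame distance from the last valid frame is less than or equal to the compensation threshold, and sends all other invalid frames to the unit in the network-side device that processes speech frames;
a voice compensation processing unit, which receives the invalid frames sent by the voice compensation decision unit and performs voice compensation on them, that is, one of the following:
replacing the current invalid frame with the last valid frame;
replacing the current invalid frame with a 1/4-rate frame of arbitrary content; or
replacing the current invalid frame with a frame obtained by simulation;
and which then sends the compensated speech frame to the unit in the network-side device that processes speech frames.
The network-side device may be a base station, a base station controller, a radio network controller, or a mobile switching center.
The invention applies to voice calls in which the network-side device does not use a vocoder, or uses one only partially, for voice compensation, including wireless communication systems that use TrFO, RTO, or TFO (Tandem Free Operation). The invention likewise applies to wireless communication systems such as CDMA2000, WCDMA (Wideband Code Division Multiple Access) and TD-SCDMA (Time Division-Synchronous Code Division Multiple Access).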
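Three application embodiments of the invention are described below.
Application Embodiment 1: voice compensation by valid-frame copy.
This embodiment uses the valid-frame copy method. The frame-distance threshold for compensation is 1; that is, only the first invalid frame following a valid frame in the full-rate state is compensated, and invalid frames that arrive consecutively after it are not compensated. As shown in Figure 3, the steps are:
101: At each forward speech frame processing instant, the network-side device examines the forward speech frame received from the network side:
if an invalid frame has arrived this time, continue to step 102;
if a normal speech frame has arrived this time, go directly to step 104;
102: Examine the previous frame that arrived:
if the previous frame was also an invalid frame, do nothing special and go directly to step 104;
if the previous frame was a non-full-rate frame, do nothing special and go directly to step 104;
if the previous frame was a full-rate frame, continue to step 103; note that the full-rate frame here is a valid frame.
103: Discard the current invalid frame and use the previous frame, that is, the full-rate frame, in its place; continue to step 104;
104: Process the current forward speech frame normally and output it.
Although this embodiment describes only how the network-side device detects and compensates forward speech frames from the network side, it applies equally to detecting and compensating reverse speech frames from the user terminal; the details are not repeated here.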
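Steps 101 to 104 of Embodiment 1 can be sketched as a single pass over the arriving frames. This is an illustration under assumptions: frames are modelled as `(rate, payload)` tuples, an invalid frame is the tuple `("invalid", b"")`, and the function name is hypothetical. Because the threshold is 1, the sketch compares against the previous frame as it arrived, not against the compensated output.

```python
INVALID = ("invalid", b"")


def copy_last_full_rate(frames):
    """Embodiment 1: replace an invalid frame only if the frame just before it was full-rate."""
    out = []
    previous = None  # previous frame as received, before any compensation
    for frame in frames:
        # Steps 101-102: an invalid frame whose predecessor was a (valid) full-rate frame.
        if frame[0] == "invalid" and previous is not None and previous[0] == "full":
            out.append(previous)   # step 103: copy the full-rate frame
        else:
            out.append(frame)      # step 104: output unchanged
        previous = frame
    return out


received = [("full", b"F1"), INVALID, INVALID, ("half", b"H1"), INVALID]
compensated = copy_last_full_rate(received)
print(compensated)
```

Only the first invalid frame after `F1` is replaced; the second follows an invalid frame and the last follows a half-rate frame, so both are left as they are.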
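Application Embodiment 2: voice compensation by 1/4-rate frame filling.
This embodiment applies to calls using the EVRC codec. The legal EVRC coding formats do not include a 1/4-rate frame, and extensive experiments show that the vocoders of various user terminals perform voice compensation when they receive a 1/4-rate frame in EVRC mode. As shown in Figure 4, the steps of this embodiment are:
201: At each forward speech frame processing instant, the network-side device determines the rate of the forward speech frame received from the network side:
if an invalid frame has arrived this time, continue to step 202;
if a normal speech frame has arrived this time, go directly to step 205;
202: Determine the frame rate of the last valid frame:
if the last valid frame is a full-rate frame, go to step 203;
if the last valid frame is a non-full-rate frame, do nothing special and go directly to step 205;
203: Determine the frame distance between the last valid frame and the current invalid frame:
if the frame distance is less than or equal to the preset compensation threshold, go to step 204;
if the frame distance is greater than the preset compensation threshold, go to step 205;
204: Discard the current invalid frame and replace it with a 1/4-rate frame, whose content can be arbitrary. Use this 1/4-rate frame as the current forward speech frame. Continue to step 205;
205: Process the current forward speech frame normally and output it.
As these steps show, the main idea of this embodiment is to replace the run of consecutive invalid frames that immediately follows a full-rate frame with 1/4-rate frames. Every invalid frame whose frame distance from the last full-rate valid frame is less than or equal to the predefined threshold is replaced with a 1/4-rate frame; invalid frames whose frame distance exceeds the threshold receive no additional voice compensation. In other words, once the number of consecutive invalid frames following a full-rate valid frame exceeds the maximum limit, the invalid frames beyond the limit are not compensated; that maximum number of invalid frames is the compensation threshold. In practice, the compensation threshold of this method can also be set to infinity, so that all consecutive invalid frames following a full-rate frame are replaced with 1/4-rate frames.
Although this embodiment describes only how the network-side device detects and compensates forward speech frames from the network side, it applies equally to detecting and compensating reverse speech frames from the user terminal; the details are not repeated here.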
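Steps 201 to 205 of Embodiment 2 can be sketched in the same style. The tuple representation, the function name and the filler frame are assumptions made for illustration; in particular, the 5-byte zero payload of `QUARTER_FILLER` is only a placeholder, since the text says the content of the 1/4-rate frame is arbitrary and does not give its length.

```python
import math

INVALID = ("invalid", b"")
# Placeholder 1/4-rate frame; content is arbitrary per the text, length is an assumption.
QUARTER_FILLER = ("quarter", bytes(5))


def fill_with_quarter_rate(frames, threshold=math.inf):
    """Embodiment 2: after a full-rate valid frame, replace invalid frames within
    the threshold by a 1/4-rate filler so the terminal vocoder compensates them."""
    out = []
    last_valid_rate = None
    distance = 0  # frame distance from the last valid frame
    for rate, payload in frames:
        if rate != "invalid":                      # step 201: normal frame
            last_valid_rate, distance = rate, 0
            out.append((rate, payload))
            continue
        distance += 1
        # Steps 202-203: last valid frame is full-rate and the distance is within the threshold.
        if last_valid_rate == "full" and distance <= threshold:
            out.append(QUARTER_FILLER)             # step 204
        else:
            out.append((rate, payload))            # step 205 without compensation
    return out


received = [("full", b"F"), INVALID, INVALID, INVALID, ("half", b"H"), INVALID]
limited = fill_with_quarter_rate(received, threshold=2)
unlimited = fill_with_quarter_rate(received)  # threshold left at infinity
print([rate for rate, _ in limited])
print([rate for rate, _ in unlimited])
```

Leaving `threshold` at its default corresponds to the "infinite threshold" variant mentioned in the text.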
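Application Embodiment 3: voice compensation by simulation approximation.
In this embodiment, a large body of full-rate speech data from past practice is analysed statistically to derive approximate rules for how frame content and rate evolve. When an invalid frame is compensated, a frame is simulated from these rules using the content and rate of the last valid frame and the frame distance between the invalid frame and the last valid frame, and its rate and content replace those of the invalid frame. This document calls the simulated frame a pseudo full-rate frame. In this embodiment the compensation threshold is preset to 6. As shown in Figure 5, the steps of this embodiment are:
301: At each forward speech frame processing instant, the network-side device determines the rate of the forward speech frame received from the network side:
if an invalid frame has arrived this time, continue to step 302;
if a normal speech frame has arrived this time, go directly to step 305;
302: Determine the frame rate of the retained last valid frame:
if the last valid frame is a full-rate frame, go to step 303;
if the last valid frame is a non-full-rate frame, do nothing special and go directly to step 305;
303: Determine the frame distance between the last valid frame and the current invalid frame:
if the frame distance is less than or equal to 6, go to step 304;
if the frame distance is greater than 6, go to step 305;
304: Discard the current invalid frame. Using the content of the last valid frame and the frame distance between the last valid frame and the current invalid frame as parameters, construct a pseudo full-rate frame by simulation according to the approximate rules derived from past statistics, and replace the invalid frame with it. Use this pseudo full-rate frame as the current forward speech frame; continue to step 305;
305: Process the current forward speech frame normally and output it.
As these steps show, the main idea of this embodiment is to replace the invalid frames that immediately follow a full-rate frame with simulated speech frames. The simulation uses the content of the full-rate frame and the frame distance between the invalid frame and that full-rate frame, and by applying the statistical rules it can compensate the 6 consecutive invalid frames that follow a full-rate frame.
Although this embodiment describes only how the network-side device detects and compensates forward speech frames from the network side, it applies equally to detecting and compensating reverse speech frames from the user terminal; the details are not repeated here.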
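The control flow of steps 301 to 305 can be sketched with the simulation rule left as a pluggable function. The text does not disclose the statistical rule itself, so `rule` below is a caller-supplied assumption, and `echo_rule` is a stand-in used only to make the example run; it is not the patent's rule. Frame tuples and names are likewise illustrative.

```python
from typing import Callable, List, Optional, Tuple

Frame = Tuple[str, bytes]            # (rate, payload); rate labels are illustrative
Rule = Callable[[bytes, int], bytes]  # (last full-rate payload, frame distance) -> payload

INVALID: Frame = ("invalid", b"")


def compensate_by_simulation(frames: List[Frame], rule: Rule, threshold: int = 6) -> List[Frame]:
    """Embodiment 3: replace invalid frames that follow a full-rate frame
    with pseudo full-rate frames built by `rule`."""
    out: List[Frame] = []
    last_valid: Optional[Frame] = None  # only this frame needs to be retained
    distance = 0
    for frame in frames:
        if frame[0] != "invalid":                       # step 301: normal frame
            last_valid, distance = frame, 0
            out.append(frame)
            continue
        distance += 1
        # Steps 302-303: last valid frame is full-rate and the distance is within the threshold.
        if last_valid is not None and last_valid[0] == "full" and distance <= threshold:
            out.append(("pseudo_full", rule(last_valid[1], distance)))  # step 304
        else:
            out.append(frame)                           # step 305 without compensation
    return out


def echo_rule(payload: bytes, distance: int) -> bytes:
    """Stand-in rule: reuse the last payload unchanged (not the patent's statistical rule)."""
    return payload


received = [("full", b"F")] + [INVALID] * 7
compensated = compensate_by_simulation(received, echo_rule)
print([rate for rate, _ in compensated])
```

With the threshold of 6 used in this embodiment, the first six invalid frames become pseudo full-rate frames and the seventh is passed through.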
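Each of the three embodiments has its advantages, but in terms of voice quality alone the simulation approximation method is slightly better, since it can compensate multiple erased frames in a run of consecutive full-rate frames. Its overhead is also small: only the content of the most recent full-rate frame needs to be retained.
The three embodiments mainly compensate frames in the full-rate state. In practice, compensation can be configured to occur whenever the last valid frame is a full-rate or half-rate frame. The compensation threshold can likewise be set to suit actual conditions. All such corresponding changes and variations fall within the scope of protection of the claims appended to the invention.
Industrial Applicability
The invention addresses the following problem: when the radio environment is poor and network transmission quality is poor, and the network side does not use a vocoder (or uses one only partially) for voice quality compensation and linear prediction, voice quality depends heavily on whether the vocoder in the user terminal compensates certain to-be-compensated frames and on how well it does so, which lowers overall voice quality and causes discomfort to the listener. The invention provides a system and method for implementing voice compensation on the network side. When the radio environment or network transmission quality is poor and the network side does not use a vocoder or uses one only partially, the technical solution of the invention compensates the speech to a certain approximation, reduces the listener discomfort caused by word swallowing, warbling and choppy speech, improves overall voice quality, and reduces the dependence of the call on the performance of the user terminal and its vocoder.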

Claims

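1. A method for implementing voice compensation in a mobile communication network, characterized by comprising:
a. at each frame processing instant, the network-side device determines whether the speech frame it has received or is about to send is an invalid frame; if so, it proceeds to the next step;
b. the network-side device performs voice compensation on the invalid frame.
2. The method of claim 1, characterized in that after step a there is also:
a1. determining whether the invalid frame is a frame in the non-1/8-rate state; if so, proceeding to the next step.
3. The method of claim 2, characterized in that in step a1, whether the invalid frame is an invalid frame in the non-1/8-rate state is determined as follows:
determine whether the last valid frame before the invalid frame is a non-1/8-rate frame; if so, the invalid frame is an invalid frame in the non-1/8-rate state; otherwise it is not.
4. The method of claim 2, characterized in that after step a1 the method further includes:
a2. determining whether the frame distance between the invalid frame and the last valid frame is less than or equal to the compensation threshold; if so, proceeding to the next step.
5. The method of claim 1, characterized in that the voice compensation performed on the invalid frame in step b uses one of the following methods:
valid-frame copy: replace the current invalid frame with the last valid frame;
1/4-rate frame filling: replace the current invalid frame with a 1/4-rate frame of arbitrary content;
simulation approximation: replace the current invalid frame with a frame obtained by simulation.
6. The method of claim 1, characterized in that an invalid frame is a blank frame, an erased frame, or another frame whose frame rate is not defined in the protocol; a frame not received by the specified frame processing instant; or a frame that, according to the protocol, requires voice compensation once the vocoder receives it.
7. The method of any one of claims 3 to 5, characterized in that the speech frame is a forward speech frame or a reverse speech frame;
when the speech frame is a forward speech frame, the last valid frame is the last valid forward speech frame;
when the speech frame is a reverse speech frame, the last valid frame is the last valid reverse speech frame.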
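8. A system for implementing voice compensation in a mobile communication network, characterized in that the system resides in a network-side device and comprises:
an invalid frame detection unit, which determines whether the speech frame that the network-side device has received or is about to send is an invalid frame, sends invalid frames to the voice compensation unit, and sends valid frames to the unit in the network-side device that processes speech frames; and
a voice compensation unit, which performs voice compensation on invalid frames and sends the compensated speech frames to the unit in the network-side device that processes speech frames.
9. The system of claim 8, characterized in that the voice compensation unit comprises:
a voice compensation decision unit, which receives the invalid frames sent by the invalid frame detection unit, sends invalid frames in the non-1/8-rate state to the voice compensation processing unit, and sends all other invalid frames to the unit in the network-side device that processes speech frames; and
a voice compensation processing unit, which receives the invalid frames sent by the voice compensation decision unit, performs voice compensation on them, and sends the compensated speech frames to the unit in the network-side device that processes speech frames.
10. The system of claim 9, characterized in that the voice compensation decision unit determines whether the last valid frame before the received invalid frame is a non-1/8-rate frame; if so, the invalid frame is treated as an invalid frame in the non-1/8-rate state; otherwise it is not.
11. The system of claim 9, characterized in that the voice compensation decision unit determines the frame distance between an invalid frame in the non-1/8-rate state and the last valid frame, sends invalid frames whose frame distance is less than or equal to the compensation threshold to the voice compensation processing unit, and sends invalid frames whose frame distance is greater than the compensation threshold to the unit in the network-side device that processes speech frames.
12. The system of claim 8, characterized in that the voice compensation performed by the voice compensation unit on an invalid frame is one of the following:
replacing the current invalid frame with the last valid frame;
replacing the current invalid frame with a 1/4-rate frame of arbitrary content; or
replacing the current invalid frame with a frame obtained by simulation.
13. The system of claim 8, characterized in that the invalid frame detection unit treats a speech frame as invalid when the speech frame received by the network-side device is a blank frame, an erased frame, another frame whose frame rate is not defined in the protocol, a frame not received by the specified frame processing instant, or a frame that, according to the protocol, requires voice compensation once the vocoder receives it.
14. The system of claim 10, 11 or 12, characterized in that the speech frame received by the network-side device is a forward speech frame or a reverse speech frame;
when the speech frame is a forward speech frame, the last valid frame is the last valid forward speech frame;
when the speech frame is a reverse speech frame, the last valid frame is the last valid reverse speech frame.
15. The system of claim 8, characterized in that the network-side device is a base station, a base station controller, a radio network controller, or a mobile switching center.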

Priority Applications (3)

Application Number Priority Date Filing Date Title
EP07702031.1A EP2129051B1 (en) 2007-01-10 2007-01-10 A method and system for realizing the voice compensation in the mobile communication network
PCT/CN2007/000099 WO2008083517A1 (fr) 2007-01-10 2007-01-10 Method and system for realizing voice compensation in a mobile communication network
CN2007800403922A CN101529830B (zh) 2007-01-10 2007-01-10 Method and system for realizing voice compensation in a mobile communication network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2007/000099 WO2008083517A1 (fr) 2007-01-10 2007-01-10 Method and system for realizing voice compensation in a mobile communication network

Publications (1)

Publication Number Publication Date
WO2008083517A1 true WO2008083517A1 (fr) 2008-07-17

Family

ID=39608312

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2007/000099 WO2008083517A1 (fr) 2007-01-10 2007-01-10 Method and system for realizing voice compensation in a mobile communication network

Country Status (3)

Country Link
EP (1) EP2129051B1 (zh)
CN (1) CN101529830B (zh)
WO (1) WO2008083517A1 (zh)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR102242260B1 (ko) 2014-10-14 2021-04-20 삼성전자 주식회사 Method and apparatus for improving voice quality in a mobile communication network
CN107393559B (zh) * 2017-07-14 2021-05-18 深圳永顺智信息科技有限公司 Method and device for checking voice detection results

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH0738608A (ja) * 1993-07-19 1995-02-07 Nec Corp Voice packet receiving device
JP2005223375A (ja) * 2004-02-03 2005-08-18 Elwing Co Ltd Data transmission method and apparatus therefor

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN100393085C (zh) * 2000-12-29 2008-06-04 诺基亚公司 Audio signal quality enhancement in a digital network
FI20010235A (fi) * 2001-02-08 2002-08-09 Nokia Corp Method for processing information frames

Also Published As

Publication number Publication date
CN101529830A (zh) 2009-09-09
EP2129051B1 (en) 2017-08-09
CN101529830B (zh) 2013-01-30
EP2129051A4 (en) 2010-07-28
EP2129051A1 (en) 2009-12-02

Similar Documents

Publication Publication Date Title
EP1368979B1 (en) Mobile communications using wideband terminals allowing tandem-free operation
EP1782644B1 (en) Interoperability for wireless user devices with different speech processing formats
JP4365029B2 (ja) Voice and data transmission switching in a digital communication system
US8432935B2 (en) Tandem-free intersystem voice communication
US20110294501A1 (en) Codec deployment using in-band signals
WO2001082640A1 (fr) Multipoint communication method and communication control device
FI106510B (fi) System for transmitting speech between a mobile telephone network and a fixed network terminal
US20080133247A1 (en) Speech coding arrangement for communication networks
EP2108193B1 (en) Methods, systems, and computer program products for silence insertion descriptor (sid) conversion
US7379877B2 (en) Signal processing device and signal processing method
CN103871415B (zh) Method, system and TFO conversion device for voice interworking between different systems
WO2008083517A1 (fr) Method and system for realizing voice compensation in a mobile communication network
KR20050007977A (ko) Method for controlling vocoder mode and transmission rate in a mobile communication system
WO2009036693A1 (fr) Method and system for processing uplink and downlink data in a wireless communication network
WO2007118392A1 (fr) Method and device for transmitting voice data
RU2426250C2 (ru) Method and system for speech compensation in a mobile communication network
US8300622B2 (en) Systems and methods for tandem free operation signal transmission
US20050195861A1 (en) Sound communication system and mobile station
KR20040106777A (ko) Apparatus and method for sharing sound data in a terminal having a plurality of modems
AU4299499A (en) Alternating speech and data transmission in digital communications systems

Legal Events

Date Code Title Description
WWE Wipo information: entry into national phase

Ref document number: 200780040392.2

Country of ref document: CN

121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 07702031

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

REEP Request for entry into the european phase

Ref document number: 2007702031

Country of ref document: EP

WWE Wipo information: entry into national phase

Ref document number: 2007702031

Country of ref document: EP

WWE Wipo information: entry into national phase

Ref document number: 4646/CHENP/2009

Country of ref document: IN

ENP Entry into the national phase

Ref document number: 2009129402

Country of ref document: RU

Kind code of ref document: A