MXPA99011737A

MXPA99011737A - Speech quality measurement based on radio link parameters and objective measurement of received speech signals

Info

Publication number: MXPA99011737A
Application number: MXPA/A/1999/011737A
Authority: MX
Inventors: Uvliden Anders; Bjorn Minde Tor; Karlsson Anders; Heikkila Gunnar
Original assignee: Telefonaktiebolaget Lm Ericsson (Publ)
Priority date: 1997-06-24
Filing date: 1999-12-15
Publication date: 2000-06-01

Abstract

An improved method and system for measuring the perceived speech quality in a mobile telecommunications network are disclosed herein. In an embodiment of the invention, the method uses both radio link parameters and an objective measuring technique performed on received signals to estimate the speech quality perceived by the end user. A radio link processing stage extracts temporal information from a set of available radio link parameters such as the BER, FER, RxLev, handover statistics, soft information, and speech energy. Concurrently, a speech processing stage processes a sequence of original signals and a sequence of received signals obtained from the output of a telecommunications system. The signal sequences are processed by an objective measuring technique such as the Perceptual Speech Quality Measure (PSQM). The outputs from the radio link processing and speech processing stages are used to calculate an estimate of speech quality. Furthermore, a weight may be given to radio link processing and speech processing in accordance with their performance under various conditions, such that the overall speech quality is calculated with respect to the best approach.

Description

MEASUREMENT OF VOICE QUALITY BASED ON RADIO LINK PARAMETERS AND OBJECTIVE MEASUREMENT OF RECEIVED VOICE SIGNALS

FIELD OF THE INVENTION

The present invention relates generally to the measurement of voice quality in wireless telecommunication systems and, more specifically, to a method for measuring speech quality using radio link parameters together with objective measurement techniques based on the received voice.

BACKGROUND OF THE INVENTION

In the wireless telecommunications industry, service providers are interested in providing high-quality, reliable services for their customers in today's highly competitive environment. Reliability problems such as dropped calls, and quality issues such as fading, multipath interference, and co-channel interference, are concerns that cellular operators constantly face. Another issue of great interest to operators is improving the voice quality perceived by the end user within the cellular system. It is therefore desirable for operators to be able to determine which areas of the network are exhibiting quality problems.

In the past, a large number of methods have been used to measure voice quality in cellular networks. A commonly used method tests a cellular network by transmitting known signals and comparing the received signals with a previously defined signal database in order to produce an estimate of the quality. The term signal is used here to refer to perceptible sounds within the human audio frequency range, including voice and tones. This method is illustrated in Figure 1. A database of known signals 2 is shown, from which predetermined signals are sent through a system under study 4. The system under study 4 represents all the operating components of a cellular network, including a mobile switching center (MSC), a radio base station (RBS), all communication links, and the air interface. Once the transmitted signals have been received, they are compared in step 8 with a second signal database 6 containing the original signal patterns. An estimate of the received signal quality for the network is then calculated.

In digital systems, the conversion of analog voice signals to digital signals requires much more bandwidth for transmission than is desirable. Bandwidth limitations in wireless telecommunication systems have resulted in the need for low bit rate speech coders, which work by reducing the number of bits that must be transmitted while maintaining both quality and clarity. In general, it is desirable to transmit at lower bit rates, but quality tends to decrease as the bit rate decreases. The speech coders used in these applications encode the voice while removing redundancies inherent in speech production. Typically, speech coders obtain their low bit rates by modeling human speech production in order to obtain a more efficient representation of the speech signal. The original voice signal can then be synthesized using several estimated filter parameters. Since many of the prior art test methods use audio tones in the test procedure, they are not well suited to testing digital systems: speech coders are modeled on speech production and are not optimal for tones, so errors are likely to arise in tone regeneration.

Another source of potential problems with the method of Figure 1 when using voice signals lies in the comparison and estimation of step 8. The voice database 2 contains a limited number of predetermined, repeated sentences (for example, 6-8 sentences) that are representative of voice patterns typically carried over a mobile network. The estimation portion of step 8 uses perceptual models that mimic the listening process. Models of this type tend to work well when the distortion is small but can present problems under conditions of high distortion.
As an example, an error condition that causes a previous frame to be repeated may sound satisfactory to the listener, especially for vowel sounds, but the perceptual model may erroneously determine that the distortion is severe when the frame is compared with the original frame.

A predominant factor affecting voice quality in digital systems is the bit error rate (BER). The BER is the frequency with which bit errors are introduced in the transmitted frames. Bit errors tend to be introduced during transmission over the air interface. High BER situations occur frequently under conditions of high co-channel interference, weak signals (such as when a mobile station moves out of range), and fading caused by multipath interference due to obstructions such as buildings. Even when attempts are made to correct those errors, an excessively high BER has a negative effect on voice quality. In a Global System for Mobile Communication (GSM) network, for example, the BER and other related parameters, such as the reception quality (RxQual) and the reception level (RxLev), are monitored in order to evaluate voice quality. This method has limitations, since the correlation relationships and the temporal information that can be obtained from the parameters are not used. For example, extracting temporal information allows a set of relationships between the variables to be formulated and exploited to measure voice quality. The voice quality perceived by the end user is associated with a time average over roughly the length of a sentence at its finest resolution. The final quality is averaged over the whole conversation, which means that the coarsest resolution is on the order of a few minutes.
Therefore, the use of derived and correlated temporal parameters, which is missing in GSM, offers a clearer picture of the state of voice quality observed in many situations. The RxQual parameter in the GSM system is measured every 0.5 seconds and depends inherently on the BER of each 20 millisecond frame. In addition, RxQual can fluctuate widely due to fading, noise, or interference, which can produce quality measurements that fluctuate much faster than the perceived sound quality. A seemingly obvious solution would be to increase the temporal resolution with a time constant in the region of 2-5 seconds. It has been found, however, that the relationship between the digital communication link and voice quality does not depend only on a time-averaged BER. What is required is a method that combines the information obtained from the radio link parameters with signal-based objective measurement techniques, in such a way that the benefits of both are obtained and the drawbacks of the prior art methods are avoided.

SUMMARY OF THE INVENTION

In order to achieve the above and other objects in accordance with the purpose of the present invention, an improved method and system for measuring voice quality in a mobile telecommunication network are presented here. In one embodiment of the present invention, the method includes extracting temporal information from a set of available radio link parameters in a radio link processing stage. A set of correlated temporal parameters is then produced by the radio link processing stage. Concurrently, a sequence of original signals and a sequence of received signals (signals such as speech, tones, or otherwise) output by the telecommunications system, for example speech encoded by a digital speech coder, are processed using an objective measurement technique in order to produce a set of speech processing parameters.
The outputs of the radio link processing and voice processing stages are fed to an estimator in order to calculate the voice quality. In addition, a weight may be applied to the output of the radio link processing stage and to that of the speech processing stage according to their relative performance under the current conditions of the mobile connection. The voice quality is then calculated with the appropriate significance assigned to the respective stages, for improved performance under various conditions.

In one aspect of the invention, an improved system for objective measurement of voice quality in a wireless telecommunication network is presented. The system includes a radio link processor to extract temporal information from the radio link parameters, and a signal processor to objectively measure the (voice) signals. An estimator is included to calculate the overall perceived voice quality by combining the parameters from both the radio link processor and the speech processor. The estimator can be implemented as a linear estimator, a non-linear estimator, a state machine, or a neural network. These and other advantages of the present invention will become apparent upon reading the following detailed description and studying the various figures of the drawings.
BRIEF DESCRIPTION OF THE DRAWINGS

The invention, together with other objects and advantages thereof, can be better understood with reference to the following description in combination with the accompanying drawings, in which:

Figure 1 shows a prior art method for measuring voice quality using signal databases;
Figure 2 illustrates a method for the temporal processing of radio link parameters in accordance with an embodiment of the present invention;
Figure 3 illustrates a method for voice processing of received signals in accordance with an embodiment of the present invention;
Figure 4 depicts a flow chart of the voice processing method in accordance with an embodiment of the present invention; and
Figure 5 shows a diagram for estimating voice quality using both radio link parameters and voice processing in accordance with one embodiment of the present invention.

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS

Figure 1, which shows a prior art method of voice quality measurement, was discussed in the preceding sections. In a basic cellular system, a mobile switching center (MSC) is linked to a plurality of geographically dispersed base stations (BS) to form the cellular coverage area of the system. Each base station is designed to cover a specific area known as a cell, in which two-way radio communication can take place between the mobile station MS and the base station BS of the associated cell. The quality of coverage is not uniform across the coverage area because of several uncontrollable factors. The quality perceived by the end user therefore provides important information about the current level of performance of the network. The quality of voice received through a mobile telecommunication network can generally be separated into the distinct areas of intelligibility and naturalness.
A highly synthesized voice, for example, may be highly intelligible in terms of conveying information without necessarily being of high quality. Cellular systems that employ low bit rate speech coders tend to maintain intelligibility, but at the expense of naturalness. In situations in which identifying the speaker is important, for example in voice recognition applications, voice quality cannot be compromised. Numerous methods have been proposed to measure speech quality objectively using mathematical models. To date, none has shown exceptional agreement with subjective assessments in digital networks. To that end, a technique for estimating voice quality in digital networks using both radio link parameters and objective speech processing is described below.

Figure 2 illustrates a voice quality measurement process employing temporal information obtained from radio link parameters, in accordance with one embodiment of the present invention. Radio link processing is carried out in a multi-stage configuration that includes a temporal processing stage 16 and a correlation stage 18. Radio link parameters available in, for example, a D-AMPS network, such as the BER, frame error rate (FER), RxLev, handover statistics, soft information, and voice energy, are input to the temporal processing stage 16. New parameters carrying temporal information are calculated from the radio link parameters. So-called "sliding windows", or simply "windows", for example a rectangular, exponential, or sine-squared (sin^2) window, are applied to the parameters to achieve a temporal weighting. The parameters can then be correlated by taking, for example, the root, exponential, or logarithm of the function in order to obtain a more suitable form.
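The windowing and transformation steps described above can be sketched as follows. This is a minimal illustration in Python, not the implementation of the described embodiment: the function names, the smoothing factor `alpha`, the exact shape chosen for the sin^2 window, and the small offset used in the logarithm are all assumptions made for the example.

```python
import math

def window_weights(kind, n, alpha=0.3):
    """Return n normalised weights, oldest sample first (hypothetical helper)."""
    if n < 1:
        raise ValueError("window needs at least one sample")
    if kind == "rectangular":
        w = [1.0] * n
    elif kind == "exponential":
        # The most recent sample (index n-1) receives the largest weight.
        w = [(1.0 - alpha) ** (n - 1 - i) for i in range(n)]
    elif kind == "sin2":
        # sin^2 taper: small weights at both ends, peak in the middle.
        w = [math.sin(math.pi * (i + 1) / (n + 1)) ** 2 for i in range(n)]
    else:
        raise ValueError(f"unknown window: {kind}")
    total = sum(w)
    return [x / total for x in w]

def windowed_value(samples, kind, n, alpha=0.3):
    """Weighted average of the last n samples of a radio link parameter."""
    recent = samples[-n:]
    w = window_weights(kind, len(recent), alpha)
    return sum(wi * xi for wi, xi in zip(w, recent))

# Transformations named in the text: root, exponential, logarithm.
TRANSFORMS = {
    "root": math.sqrt,
    "exponential": math.exp,
    "logarithmic": lambda x: math.log(x + 1e-9),  # offset (assumed) avoids log(0)
}

if __name__ == "__main__":
    ber_history = [0.01, 0.02, 0.03, 0.04]  # synthetic BER samples
    mean_ber = windowed_value(ber_history, "rectangular", 4)
    print(mean_ber, TRANSFORMS["root"](mean_ber))
```

The weights are normalised so that every window type yields a value on the same scale as the input parameter, which keeps the later transformation step independent of the window chosen.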
In addition, the transformed data can be analyzed with statistical methods, which can include determining the maximum value, minimum value, mean value, standard deviation, skewness, kurtosis, etc. These processes can be carried out independently and in any order to obtain the desired relationships. Temporal processing can extract information about what has happened to specific parameters during a specified period of time. For example, by examining the history of measurements of a variable, it is possible to calculate temporal parameters such as the mean value over the last X seconds, an estimate of the standard deviation over Y seconds, or the autocorrelation function over the last Z seconds. The mean BER over the last 3 seconds and the number of frames erased during the last 5 seconds are examples of new derived parameters that are closely related to an aspect of voice quality.

A correlation stage 18 combines the original or new parameters, using the relationships between them, to produce parameters that are more directly correlated with voice quality. For example, modern cellular systems try to conceal the loss of a frame due to bit errors by repeating the previous 20 ms frame, in the hope that the loss will not be heard. This means that the number of bit errors in the lost frame is not relevant, since the content of the frame never reaches the listener. This suggests new parameters that correlate more closely with voice quality, for example a combination of BER with FrameLoss.

In a first example illustrating temporal processing and correlation, the mean of the BER is calculated over an interval of 0.5 seconds in the temporal processing stage 16 to create a new parameter, for example RXQ_MEAN_5. In the correlation stage 18, the parameter RXQ_MEAN_5 is correlated by applying a cubic transformation, giving the correlated parameter (RXQ_MEAN_5)^3. A second example may include calculating the FER over intervals of 5 seconds to form the temporal parameter FER_BURSTS_5. Correlation is then achieved by applying a square root transformation to the temporal parameter, forming the correlated parameter (FER_BURSTS_5)^(1/2). Another example is to determine the mean residual bit error rate (RBER) over 3 seconds, which is the BER calculated for the "good" frames. It will be noted that temporal processing and statistical analysis can be carried out on the correlated parameters, and that some calculations, for example the RBER, can be carried out on the "raw" data. The parameters can be combined and correlated in various ways, as will be appreciated by one skilled in the art, to achieve better results in particular situations, and all such variations are intended to fall within the scope of the present invention. Further temporal processing and correlation of parameters are described in the co-pending application of Minde, serial number ..., entitled "Speech Quality Measurement in Mobile Telecommunication Networks Based on Radio Link Parameters", which is incorporated herein by reference in its entirety.

Figure 3 shows an objective speech processing method employed in combination with the aforementioned temporal processing and correlation stages. The objective measurement processing uses two signal sequences to produce a set of parameters highly correlated with voice quality. A first signal sequence, containing undisturbed original signals 24, enters stage 22 for processing, together with a second sequence containing received signals 26, which have been sent through the cellular telecommunication system and subjected to distortion.
Figure 4 illustrates a typical method of objective speech quality measurement using the original signal 24 and the received signal 26 output from a cellular telecommunication system 30. An objective measurement process 32 is applied to the original signal 24 and the received signal 26 in order to measure the quality characteristics of the signal. Objective measurement techniques usually measure signal quality by determining waveform, spectral, and spectral envelope distortions. By way of example, the distortions between the original and received signals are detected and plotted in the time and frequency domains. In addition, distortions in the frequency domain can be measured in the spectral characteristics or the spectral envelope of the signals.

An objective measurement technique that works well with the present invention is the Perceptual Speech Quality Measure (PSQM), as specified in ITU-T Recommendation P.861. As those skilled in the art will appreciate, PSQM shows substantial correlation with the subjective quality of coded speech. Various parameters, such as listening level, weighting of silent intervals, ambient noise on the receiving side, hearing threshold characteristics, and the sending and receiving characteristics of the mobile station, are used in the method in order to mimic the sound perception of subjects in "real life" situations. A more complete description of the PSQM methodology is provided in the aforementioned ITU-T Recommendation P.861.
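As a simple illustration of comparing an original signal with a received signal, the sketch below computes a segmental signal-to-noise ratio, a waveform-distortion measure. It is not PSQM, which applies a perceptual model defined in ITU-T Recommendation P.861. The frame length of 160 samples (20 ms at an assumed 8 kHz sampling rate), the clamping limits, and the function name are assumptions made for the example.

```python
import math

def segmental_snr_db(original, received, frame_len=160,
                     floor_db=-10.0, ceil_db=35.0):
    """Mean per-frame SNR in dB between two time-aligned sample sequences.

    frame_len, floor_db and ceil_db are illustrative assumptions.
    """
    if len(original) != len(received):
        raise ValueError("sequences must be time-aligned and equally long")
    values = []
    for start in range(0, len(original) - frame_len + 1, frame_len):
        o = original[start:start + frame_len]
        r = received[start:start + frame_len]
        signal = sum(x * x for x in o)
        noise = sum((x - y) ** 2 for x, y in zip(o, r))
        if signal == 0.0:
            continue  # skip silent frames, which carry no speech energy
        if noise == 0.0:
            snr = ceil_db  # identical frame: clamp to the upper limit
        else:
            snr = 10.0 * math.log10(signal / noise)
        values.append(min(ceil_db, max(floor_db, snr)))
    if not values:
        raise ValueError("no usable frames")
    return sum(values) / len(values)

if __name__ == "__main__":
    original = [math.sin(0.1 * i) for i in range(320)]  # synthetic test tone
    received = [0.9 * x for x in original]              # synthetic attenuation
    print(segmental_snr_db(original, received))
```

Averaging per frame rather than over the whole sequence prevents loud segments from dominating the result, which is the usual motivation for a segmental measure.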
In addition, those skilled in the art will note that other well-known objective measurement methods can be adapted for use with the present invention, for example Signal-to-Noise Ratio (SNR), Segmental SNR (SegSNR), Noise-to-Mask Ratio (NMR), and Cepstral Distance (CD) techniques.

Figure 5 illustrates an embodiment of the present invention for estimating voice quality using both radio link parameters and the processing of received signals. The parameters, correlated or otherwise, are output from the radio link processing and voice processing stages respectively and are fed directly into an estimator 36. The estimator 36 combines the parameters and calculates an estimate of the perceived voice quality. The architecture of the estimator 36 can be based on several mathematical models, for example a linear estimator, a non-linear estimator, a state machine, or a neural network. In many cases a linear estimator provides satisfactory results and can take the form:

Estimate = A * (parameter 1) + B * (parameter 2) + ...

where the coefficients A and B are optimized for best performance. The coefficients can be derived, for example, by applying a linear regression technique to subjectively graded training material, as is known to those skilled in the art.
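Deriving the coefficients by linear regression can be sketched as follows. This is a minimal least-squares fit for two parameters with no constant term, solved through the normal equations. The function name and the training values are hypothetical and synthetic; real training material would consist of parameter values paired with scores from a listening panel.

```python
def fit_two_coefficients(p1, p2, scores):
    """Least-squares fit of scores ~ A * p1 + B * p2 (no intercept)."""
    if not (len(p1) == len(p2) == len(scores)):
        raise ValueError("inputs must have equal length")
    # Entries of the 2x2 normal-equation system.
    s11 = sum(a * a for a in p1)
    s12 = sum(a * b for a, b in zip(p1, p2))
    s22 = sum(b * b for b in p2)
    t1 = sum(a * y for a, y in zip(p1, scores))
    t2 = sum(b * y for b, y in zip(p2, scores))
    det = s11 * s22 - s12 * s12
    if abs(det) < 1e-12:
        raise ValueError("parameters are collinear; coefficients are not unique")
    coeff_a = (t1 * s22 - s12 * t2) / det
    coeff_b = (s11 * t2 - s12 * t1) / det
    return coeff_a, coeff_b

if __name__ == "__main__":
    # Synthetic training set generated from A = 2.0 and B = -0.5.
    p1 = [1.0, 2.0, 3.0, 4.0]
    p2 = [1.0, 0.0, 1.0, 0.0]
    scores = [2.0 * a - 0.5 * b for a, b in zip(p1, p2)]
    print(fit_two_coefficients(p1, p2, scores))
```

With more than two parameters the same approach generalises to a larger normal-equation system, for which a linear algebra library would normally be used instead of the closed-form 2x2 solution.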
An exemplary procedure that employs a linear estimate can be carried out on the correlated parameters of a previous example and can take the form:

Estimate = A * (FER_BURSTS_5)^(1/2) + B * (...)

Even though a linear estimate often gives adequate results, non-linear estimators can give a more accurate estimate when the relationships between the parameters are significantly non-linear. A relatively simple method of non-linear estimation uses multiple linear estimators, approximating the non-linear segments of the curves with successive linear estimators. This multi-line estimator approach offers a relatively simple and accurate model for many correlated parameters.

Another type of estimator that can be employed with the present invention is a neural network. For example, a neural network estimator can be trained by simultaneously recording the radio link parameters together with test speech. The recorded speech is evaluated and graded by a listener panel, and the grades are combined with the results of the radio link processing and used to train the network. Using a neural network can be less complicated, since the network may be better suited to this task than ordinary estimators. An example of a neural network that works well with the present invention is given in U.S. Patent No. 5,432,778, which is incorporated herein by reference.

Another type of estimator that can be employed with the present invention is a finite state machine. An estimator based on a finite state machine operates by changing state in accordance with some dynamic criteria. For example, the estimator can be configured to change state in response to a change in mobile speed, or to a change from frequency hopping to non-hopping and vice versa. Several suitable estimators are presented in the incorporated co-pending application of Minde et al.
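A state-machine estimator of the kind just described can be sketched as a small class that keeps one linear estimator per state and switches state when the hopping mode changes. The state names, the class and method names, and the coefficient values are hypothetical, chosen only to show the mechanism.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class LinearEstimator:
    """Estimate = a * p1 + b * p2, with coefficients fixed per state."""
    a: float
    b: float

    def estimate(self, p1, p2):
        return self.a * p1 + self.b * p2

class StateMachineEstimator:
    """Selects a linear estimator according to the current connection state."""

    def __init__(self, estimators, initial_state):
        if initial_state not in estimators:
            raise KeyError(f"unknown state: {initial_state}")
        self._estimators = dict(estimators)
        self.state = initial_state

    def on_hopping_changed(self, hopping):
        # State transition triggered by a change in frequency hopping mode.
        new_state = "hopping" if hopping else "non_hopping"
        if new_state not in self._estimators:
            raise KeyError(f"no estimator configured for state: {new_state}")
        self.state = new_state

    def estimate(self, p1, p2):
        return self._estimators[self.state].estimate(p1, p2)

if __name__ == "__main__":
    # Coefficients are placeholders, not values taken from the patent.
    machine = StateMachineEstimator(
        {"hopping": LinearEstimator(1.0, 2.0),
         "non_hopping": LinearEstimator(3.0, 4.0)},
        initial_state="hopping",
    )
    print(machine.estimate(1.0, 1.0))
    machine.on_hopping_changed(False)
    print(machine.estimate(1.0, 1.0))
```

A transition on mobile speed could be added in the same way, by mapping speed ranges to further states that each hold their own coefficient set.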
Another aspect of the present invention is the ability to assign respective weights to the radio link processing stage and the signal processing stage. For example, since high BER levels are known to cause the voice processing methods to perform unsatisfactorily, in that situation a relatively higher weight is given to the processing of the radio link parameters than to the processing of the received voice. The estimator 36 therefore gives greater importance to the radio link parameter processing when calculating the voice quality estimate. In contrast, greater importance is given to the voice processing component under low BER conditions, since objective measurement techniques have better resolution than the radio link parameters under those conditions. Thus, shifting the importance between the different types of processing while calculating voice quality reduces the probability of performing calculations under conditions of high error level.

The present invention provides an improved method for measuring speech quality in a cellular telecommunication system by using both radio link parameters and voice processing information. The method offers the flexibility and advantage of using temporal information from radio link parameters together with objective quality measurements to provide an improved estimate of the voice quality perceived by the end user. Improved performance also results from the ability to shift the reliance of the estimate appropriately toward the best approach under different conditions. Although the invention has been described in some respects with reference to a specific preferred embodiment, various modifications and applications thereof will be apparent to those skilled in the art.
In particular, the concept of the present invention can be applied not only to D-AMPS but also to other systems based on Time Division Multiple Access (TDMA), for example Global System for Mobile Communication (GSM) and Personal Digital Cellular (PDC), or to other types of systems such as Code Division Multiple Access (CDMA) and Frequency Division Multiple Access (FDMA), etc. Accordingly, the following claims should not be given a restrictive interpretation but should be viewed as encompassing variations and modifications derived from the subject matter of the present invention.
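As an illustration of the weighting between the radio link processing stage and the speech processing stage described in this section, the sketch below shifts weight toward the radio link estimate as the BER rises. The BER thresholds, the weight limits, the linear interpolation between them, and the function names are all assumptions made for the example; the description does not prescribe a particular weighting rule.

```python
def stage_weights(ber, low_ber=0.01, high_ber=0.05, min_w=0.2, max_w=0.8):
    """Return (radio_weight, speech_weight), which always sum to 1.

    All thresholds and weight limits are illustrative assumptions.
    """
    if high_ber <= low_ber:
        raise ValueError("high_ber must exceed low_ber")
    # Position of the BER between the two thresholds, clamped to [0, 1].
    t = (ber - low_ber) / (high_ber - low_ber)
    t = min(1.0, max(0.0, t))
    radio = min_w + t * (max_w - min_w)
    return radio, 1.0 - radio

def combined_estimate(radio_estimate, speech_estimate, ber):
    """Weighted combination of the two stage estimates."""
    radio_w, speech_w = stage_weights(ber)
    return radio_w * radio_estimate + speech_w * speech_estimate

if __name__ == "__main__":
    # Synthetic stage outputs on an arbitrary quality scale.
    for ber in (0.0, 0.03, 0.10):
        print(ber, combined_estimate(2.0, 4.0, ber))
```

Keeping both weights above zero means neither stage is ever ignored completely, so the estimate degrades gradually rather than switching abruptly at a threshold.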

Claims

1. A method for measuring voice quality in a mobile telecommunication network, comprising the steps of: receiving a set of radio link parameters; processing said radio link parameters by extracting temporal information to calculate a set of temporal parameters; receiving a sequence of an original signal; receiving a sequence of a received signal that is sent from said telecommunications network; processing said original signal and said received signal by an objective measurement technique in order to produce a set of signal processing parameters; and estimating the voice quality from said temporal parameters and said signal processing parameters with an estimator.
2. A method according to claim 1, wherein said received radio link parameters include BER, FER, RxLev, handover statistics, soft information, and speech energy parameters.
3. A method according to claim 1, wherein the signal processing step includes the use of the Perceptual Speech Quality Measure (PSQM) objective measurement technique.
4. A method according to claim 1, wherein the processing step further comprises calculating the distortion between the original signal and the received signal.
5. A method according to claim 1, wherein the processing step further includes applying an objective processing technique selected from the group consisting of signal-to-noise ratio, segmental SNR, noise-to-mask ratio, and cepstral distance.
6. A method according to claim 1, wherein the estimation step further includes the step of identifying the state of a mobile connection from the radio link parameters and the output of the objective measurement technique.
7. A method according to claim 1, wherein the estimation step further includes the step of assigning a weighted value to the temporal parameters and speech processing parameters in relation to the performance of a particular mobile connection state.
8. A method according to claim 7, wherein the estimation step further includes the step of shifting the relative importance between the correlated temporal parameters and the speech processing parameters, wherein an estimate of the voice quality is calculated in accordance with the weighted values.
9. A method according to claim 7, wherein the estimation step employs a linear estimate.
10. A method according to claim 7, wherein the estimation step employs a non-linear estimate.
11. A voice quality measurement system for wireless telecommunication networks, comprising: a radio link parameter processor for extracting temporal information from a set of radio link parameters; a signal processor to objectively measure speech quality aspects of signals; and an estimator to estimate the voice quality from the outputs of the radio link parameter processor and the signal processor.
12. A voice quality measurement system according to claim 11, wherein the radio link parameters include BER, FER, RxLev, handover statistics, soft information, and voice energy parameters.
13. A voice quality measurement system according to claim 11, wherein the estimator is a linear estimator.
14. A voice quality measurement system according to claim 11, wherein the estimator is a non-linear estimator.
15. A voice quality measurement system according to claim 11, wherein the estimator is a neural network.
16. A voice quality measurement system according to claim 11, wherein the estimator comprises multiple linear estimators.
17. A voice quality measurement system according to claim 11, wherein the estimator comprises a state machine configured to alter its state in response to a change in any of said parameters.
18. A voice quality measurement system according to claim 11, wherein the estimator comprises a state machine configured to alter its state in response to the speed of a traveling mobile station.
19. A voice quality measurement system according to claim 11, wherein the estimator comprises a state machine configured to alter its state in response to a change from frequency hopping to no frequency hopping and vice versa.