DE10116984A1

DE10116984A1 - Voice, audio signal transmission involves dividing parameters into several packets so packet to be transmitted can also contain parameters of current and preceding or subsequent frames

Info

Publication number: DE10116984A1
Application number: DE10116984A
Authority: DE
Inventors: Rainer Martin
Original assignee: Rainer Martin
Current assignee: MARTIN, RAINER, DR.-ING., 38106 BRAUNSCHWEIG, DE
Priority date: 2001-04-05
Filing date: 2001-04-05
Publication date: 2002-10-10

Abstract

The method involves using a voice or audio coding technique. The parameters determined by the coding process for a single frame are divided among several packets, so that a packet to be transmitted can also contain parameters of the current and of preceding or subsequent frames. Parameters that are erroneous or missing because of transmission noise or packet loss are reconstructed using a priori knowledge. An independent claim is also included for a method of storing coded voice or audio signals.

Description

The invention relates to a method for transmitting or storing voice or audio signals. It can be used, for example, in mobile radio networks, in packet-switched networks, or in connection with the Internet Protocol ("Voice over IP"). In these networks the quality of the received signal can be impaired by transmission errors. The invention describes a method for reducing such transmission errors. The invention further relates to a device for carrying out this method.

In modern communication networks (mobile radio, long-distance networks, Voice over IP), voice or audio signals are encoded with a voice or audio coding method before transmission and decoded in the receiver. The coding reduces the redundancy of the data to be transmitted and thus also the bit rate required for transmission (see, e.g., P. Vary, U. Heute, W. Hess, Digitale Sprachsignalverarbeitung, Teubner Verlag, Stuttgart, 1998). The transmitter modifies the coded signal so that reception that is optimal in terms of signal quality is possible. Such a transmission chain, consisting of encoder, transmitter, channel, and receiver, is shown in Fig. 1.

The coding is usually based on a model. A speech coder, for example, uses a model of speech production consisting of a linear prediction filter and an excitation model (see, e.g., P. Vary, U. Heute, W. Hess, Digitale Sprachsignalverarbeitung, Teubner Verlag, Stuttgart, 1998). In audio coding (e.g., MPEG "MP3"), auditory models are commonly used. The coding process computes the parameters of the model from the signal segment to be coded and quantizes these parameters by mapping them to a finite number of bit patterns. The quantized parameters belonging to a signal segment are packed into frames as bits or bit groups and transmitted to the receiver. The receiver finally regenerates a voice or audio signal from the quantized parameters and the underlying model.

Examples of such coding methods can be found in the ETSI/3GPP standards. In the Adaptive Multi-Rate speech codec (EN 301 703, V7.0.2: Digital cellular telecommunications system (Phase 2+); Adaptive Multi-Rate (AMR) speech processing functions; General Description, 1999), for example, the speech signal to be coded is divided into segments of 20 milliseconds (ms). From these signal segments ("frames") and a look-ahead segment of 5 ms, the parameters for a linear prediction filter, 2 excitation codebooks, and 2 weighting factors are then computed and quantized. Depending on the selected bit rate, this yields between 95 and 244 bits per speech frame. Despite the redundancy-reducing coding, the parameters remain correlated with one another and across frame boundaries. This correlation can be exploited in the receiver to improve the quality of the received signal. Suitable methods are described, e.g., in C. G. Gerlach, "A Probabilistic Framework for Optimum Speech Extrapolation in Digital Mobile Radio", Proc. Intl. Conf. Acoustics, Speech, Signal Processing (ICASSP), pp. 419-422, IEEE, 1993, or in T. Fingscheidt, P. Vary, "Speech Decoding with Error Concealment Using Residual Source Redundancy", Proc. IEEE Speech Coding Workshop, pp. 91-92, IEEE, 1997. These methods rely on a priori information about the probability density function of the parameters to be improved and on optimal estimation. In the methods known so far, the a priori information is stored in the form of histograms. Because the memory required for such histograms grows exponentially with the number of bits produced by the coding method, these methods are practical only for parameters that are quantized with relatively few bits.

On the other hand, the quality of the received signal can also be improved by redundancy-increasing measures at the transmitter. The foremost of these are channel coding (see, e.g., M. Bossert, Kanalcodierung, Teubner, 1992), interleaving, and the repeated transmission of information or frames. For audio transmission over the Internet, the method described in V. Hardman, M. A. Sasse, M. Handley, and A. Watson, "Reliable Audio for Use over the Internet", Proc. INET-95, International Networking Conference, Hawaii, 1995, is suitable in principle. In this method, a version of a preceding signal frame coded at a low bit rate is appended to each packet. If a packet is lost, the parameters of the frame can be recovered from this additional information, at correspondingly lower quality. Requesting a lost frame again ("Automatic Repeat Request", ARQ) is not possible for voice or audio transmission because of the delay it incurs.

The disadvantage of the previously known transmitter-side measures for reducing the disturbances caused by transmission errors is that they usually entail an increase in bit rate. This is unfavorable for transmission over the Internet, which also includes low-rate modem connections, among others. The receiver-side methods based on optimal estimation are limited to relatively simple applications with few bits and are by no means suited to exploiting the correlation of the parameters across an entire frame.

The object of the invention is therefore to specify a method and a device that effectively reduce the quality losses associated with transmission errors while requiring neither additional redundancy nor memory or computational effort that grows exponentially with the number of bits.

This object is achieved by a method having the features of claim 1.

The method is based on distributing the parameters of a frame over two or more packets in such a way that, if a packet is corrupted or lost, the corrupted or missing parameters can be estimated from the parameters that were received. The a priori knowledge required for this is represented not as histograms but as statistical mixture models. This leads to novel parameter estimators that can be implemented with little memory and computational effort. With the aid of the mixture models, optimal estimators are specified that correct the impaired parameters or fill in the missing ones. The advantage of the method is that, with error-free transmission, the full quality of the signal is achieved without having to provide additional bit rate for, e.g., channel coding. When packets are lost, quality is reduced because the optimal estimate is subject to error. As measurements show, however, the quality loss can be kept small, and it can be reduced further if parameters that are important for quality are transmitted twice.

The method is explained by way of example in Fig. 2. Assume that the frames produced by the coding method consist of N = 6 parameters and that these parameters are combined into a vector U_k, where k is a frame index that increments over time. The quantized parameters U_{1,k} to U_{6,k}, i.e., the components of the parameter vector U_k, of successive frames R_{k-1}, R_k, R_{k+1} are distributed over different, successive packets P_{k-1}, P_k, P_{k+1}. If a packet is lost, the missing parameters are estimated using the correlation among the parameters and a priori information, the a priori information being stored in the form of mixture models. If no transmission errors occur, the receiver reassigns the parameters contained in a packet to their frames. The parameter estimator, usually an optimal estimator, computes the missing parameters by, e.g., minimizing the mean square estimation error (MMSE estimator) or maximizing the a posteriori probabilities (MAP estimator), exploiting the correlation of the parameters with one another and over time.
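The distribution of frame parameters over successive packets can be illustrated with a minimal sketch. The exact assignment used in Fig. 2 cannot be recovered from the text, so the split below (packet k carries the even-indexed components of frame k and the odd-indexed components of frame k-1) is an assumption, as are the function names `packetize` and `reassemble`.

```python
from typing import List, Optional, Tuple

N = 6  # parameters per frame, as in the example in the text

# A packet carries (frame index, component index, value) triples.
Packet = List[Tuple[int, int, float]]


def packetize(frames: List[List[float]]) -> List[Packet]:
    """Spread each frame over two successive packets.

    Assumed mapping (not taken from Fig. 2): packet k holds the even
    components of frame k and the odd components of frame k-1. One extra
    packet flushes the odd components of the last frame.
    """
    packets: List[Packet] = []
    for k in range(len(frames) + 1):
        packet: Packet = []
        if k < len(frames):
            packet += [(k, j, frames[k][j]) for j in range(0, N, 2)]
        if k >= 1:
            packet += [(k - 1, j, frames[k - 1][j]) for j in range(1, N, 2)]
        packets.append(packet)
    return packets


def reassemble(packets: List[Optional[Packet]], num_frames: int) -> List[List[Optional[float]]]:
    """Reassign received parameters to their frames; a lost packet (None) leaves gaps."""
    frames: List[List[Optional[float]]] = [[None] * N for _ in range(num_frames)]
    for packet in packets:
        if packet is None:
            continue
        for k, j, value in packet:
            frames[k][j] = value
    return frames


frames = [[10.0 * k + j for j in range(N)] for k in range(3)]
packets = packetize(frames)
received: List[Optional[Packet]] = list(packets)
received[1] = None  # simulate the loss of packet 1
recovered = reassemble(received, len(frames))
missing = [(k, j) for k in range(3) for j in range(N) if recovered[k][j] is None]
print(missing)
```

With this mapping, losing one packet removes only half of the components of two neighbouring frames, so every affected frame still has received components from which the missing ones can be estimated.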

A second typical arrangement is shown in Fig. 3. In this transmission method, the parameters U_{1,k} to U_{6,k} of a frame are split between two (or more) packets P_k and P'_k. These packets are transmitted independently of one another ("in parallel"). If, because of transmission errors on the channel, only one of the two packets is received, the missing parameters are estimated from those received. In this realization, only the parameters of the current frame are used to estimate the missing parameters, in order to reduce delay.

If the temporal sequence of quantized parameter vectors U_k is combined into vectors, and this sequence of vector-valued quantized parameters can take κ different values, indexed l = 1, …, κ, then the optimal estimate in the mean-square-error sense is given by equation (1), where P(·) denotes the joint probability of occurrence of the temporal sequence of quantized parameter vectors and z denotes the sequence of received parameter vectors. P(·) represents the a priori knowledge about the distribution of the quantized parameters. The density p(z | ·), conditioned on the l-th possible value of the transmitted sequence, gives the transition probability from the transmitted to the received parameters and thus represents the properties of the transmission channel.
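The MMSE estimate described here is the conditional mean of the transmitted sequence given the received sequence. One way to write it out, as a sketch consistent with the definitions of P(·) and p(z | ·) rather than a reproduction of the patent's own equation, is the following. The symbols $\tilde{\mathbf{U}}^{(l)}$ for the l-th possible value of the stacked sequence and $\hat{\mathbf{U}}$ for its estimate are assumptions introduced for illustration.

```latex
\hat{\mathbf{U}}
  = \sum_{l=1}^{\kappa} \tilde{\mathbf{U}}^{(l)}\,
      P\bigl(\tilde{\mathbf{U}}^{(l)} \mid \mathbf{z}\bigr)
  = \frac{\displaystyle\sum_{l=1}^{\kappa} \tilde{\mathbf{U}}^{(l)}\,
            p\bigl(\mathbf{z} \mid \tilde{\mathbf{U}}^{(l)}\bigr)\,
            P\bigl(\tilde{\mathbf{U}}^{(l)}\bigr)}
         {\displaystyle\sum_{l=1}^{\kappa}
            p\bigl(\mathbf{z} \mid \tilde{\mathbf{U}}^{(l)}\bigr)\,
            P\bigl(\tilde{\mathbf{U}}^{(l)}\bigr)}
```

The second form follows from Bayes' rule and makes explicit that both sums run over all κ possible values, which is the source of the complexity problem of histogram-based priors.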

Instead of minimizing the mean square error, the optimal estimator can also select the quantized value that maximizes the a posteriori probability. This yields the maximum a posteriori (MAP) estimate. If, for example, each component of the parameter vector U_k is quantized with 3 bits, each parameter vector contains N = 6 components, and 2 successive vectors U_k enter into the joint probability P(·) of all quantized values, then the sum in equation (1) must be evaluated over (2^3 · 2^3 · 2^3 · 2^3 · 2^3 · 2^3)^2 ≈ 6.9 · 10^10 terms, which leads to unacceptable memory and computational effort.
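The size of this sum can be checked with a few lines of arithmetic. The variable names are illustrative; the numbers are those of the example in the text.

```python
bits_per_component = 3
components_per_vector = 6  # N
vectors_in_joint = 2       # successive vectors entering the joint probability

# Number of distinct quantized values of one parameter vector: (2^3)^6 = 2^18.
levels_per_vector = (2 ** bits_per_component) ** components_per_vector
# Number of terms in the sum over two successive vectors: (2^18)^2 = 2^36.
summands = levels_per_vector ** vectors_in_joint

print(levels_per_vector, summands, f"{summands:.1e}")
```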

The invention differs from the prior art in that the a priori probability P(·) is represented not by memory-intensive histograms but by mixture models. Mixture models are used, for example, in speech recognition in connection with hidden Markov models, but have not previously been used in speech transmission. The a priori joint probability can thus be represented by a weighted sum of functions V_i (equation (3)).

The use of mixture models allows the a priori probability P(·) to be represented with a relatively small number of stored parameters. The use of multivariate Gaussian distributions is of particular interest. For example, the joint distribution of a parameter vector U_k can be represented by a weighted sum of multivariate Gaussian densities, where each N-dimensional component density 𝒩_i = 𝒩(U_k, µ_i, C_i) has mean vector µ_i and covariance matrix C_i, and α_i denotes the a priori probability of the i-th mixture component, i.e., P(𝒩_i) = α_i.
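For reference, the standard form of a Gaussian mixture with these symbols can be written as follows. This is the textbook definition, offered as a sketch: the upper limit M (the number of mixture components, which the description names further below) and the script symbol $\mathcal{N}$ are assumptions about the notation, and no claim is made about the patent's equation numbers.

```latex
p(\mathbf{U}_k)
  = \sum_{i=1}^{M} \alpha_i\,
      \mathcal{N}(\mathbf{U}_k, \boldsymbol{\mu}_i, \mathbf{C}_i),
  \qquad \sum_{i=1}^{M} \alpha_i = 1,

\mathcal{N}(\mathbf{U}_k, \boldsymbol{\mu}_i, \mathbf{C}_i)
  = \frac{1}{(2\pi)^{N/2}\,\lvert\mathbf{C}_i\rvert^{1/2}}
    \exp\!\Bigl(-\tfrac{1}{2}
      (\mathbf{U}_k - \boldsymbol{\mu}_i)^{T}\,
      \mathbf{C}_i^{-1}\,
      (\mathbf{U}_k - \boldsymbol{\mu}_i)\Bigr)
```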

The mixture probabilities α_i, the mean vectors µ_i (centroids), and the covariance matrices C_i are determined by numerical training methods. To reduce the number of parameters to be stored, covariance matrices with nonzero values only on the main diagonal can also be used.

As an exemplary embodiment of the parameter estimators shown in Figs. 2 and 3, consider transmission over a channel with packet losses. As shown in Fig. 3, assume that some of the parameters are received undisturbed and the rest are missing. The parameter vector U_k is therefore partitioned into a received part U_k^(p) and a missing part U_k^(m):

U_k = (U_k^(m), U_k^(p))^T    (7)

With Gaussian mixture models, the centroids µ_i and covariance matrices C_i of all mixture components are likewise partitioned, analogously to equation (7), into received and missing components.

The conditional probability of the missing components given the received components can now be expressed using a Gaussian mixture model. Because both the conditional density and every marginal distribution of Gaussian random variables are again (multivariate) Gaussian, the joint density 𝒩(U_k, µ_i, C_i) can be split into a conditional distribution and a marginal distribution (see, e.g., S. Kotz, N. Balakrishnan, N. L. Johnson, Continuous Multivariate Distributions, Wiley, 2000).

One then defines the a posteriori probabilities of the mixture components and obtains a corresponding expression for the conditional density. Using equations (12) and (15), the estimate of the missing parameters that is optimal in the least-squares sense is given by the conditional expected value.
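The chain of steps from the partition in equation (7) to the conditional expected value can be sketched with the standard results for partitioned Gaussian vectors. The block below is a reconstruction under assumptions, not the patent's own equations: the notation $\boldsymbol{\mu}_i^{(m|p)}$ and $P(i \mid \mathbf{U}_k^{(p)})$ is introduced here for illustration, and the equation numbers of the original are deliberately not reproduced.

```latex
% Partition of mean vectors and covariance matrices, analogous to equation (7)
\boldsymbol{\mu}_i =
  \begin{pmatrix} \boldsymbol{\mu}_i^{(m)} \\ \boldsymbol{\mu}_i^{(p)} \end{pmatrix},
\qquad
\mathbf{C}_i =
  \begin{pmatrix}
    \mathbf{C}_i^{(m,m)} & \mathbf{C}_i^{(m,p)} \\
    \mathbf{C}_i^{(p,m)} & \mathbf{C}_i^{(p,p)}
  \end{pmatrix}

% A posteriori probability of mixture component i given the received part
P\bigl(i \mid \mathbf{U}_k^{(p)}\bigr)
  = \frac{\alpha_i\,
          \mathcal{N}\bigl(\mathbf{U}_k^{(p)}, \boldsymbol{\mu}_i^{(p)}, \mathbf{C}_i^{(p,p)}\bigr)}
         {\sum_{n=1}^{M} \alpha_n\,
          \mathcal{N}\bigl(\mathbf{U}_k^{(p)}, \boldsymbol{\mu}_n^{(p)}, \mathbf{C}_n^{(p,p)}\bigr)}

% Conditional mean of the missing part within component i
\boldsymbol{\mu}_i^{(m|p)}
  = \boldsymbol{\mu}_i^{(m)}
    + \mathbf{C}_i^{(m,p)} \bigl(\mathbf{C}_i^{(p,p)}\bigr)^{-1}
      \bigl(\mathbf{U}_k^{(p)} - \boldsymbol{\mu}_i^{(p)}\bigr)

% MMSE estimate: conditional expected value of the missing part
\hat{\mathbf{U}}_k^{(m)}
  = E\bigl\{\mathbf{U}_k^{(m)} \mid \mathbf{U}_k^{(p)}\bigr\}
  = \sum_{i=1}^{M} P\bigl(i \mid \mathbf{U}_k^{(p)}\bigr)\, \boldsymbol{\mu}_i^{(m|p)}
```

The sum now runs over the M mixture components rather than over all quantization cells, which is where the saving in memory and computation comes from.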

If diagonal covariance matrices are used, the off-diagonal blocks C_i^(m,p) and C_i^(p,m) are zero, and the estimate simplifies accordingly; the a posteriori probabilities can then be computed easily as a product of univariate normal densities. Here U_{k,j}^(p) denotes the j-th component of the k-th vector U_k^(p) of received components, and µ_{i,j}^(p) and (σ_{i,j}^(p,p))^2 denote the mean and the variance of the j-th vector component of the i-th mixture component. The sums Σ_j and products Π_j run over the received components, j = 1, …, N_p, where N_p is the number of received components.
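The diagonal-covariance estimator can be sketched in a few lines of Python using only the standard library. This is an illustration under stated assumptions, not the patent's reference implementation: the function name `estimate_missing`, its argument layout, and the log-domain normalisation are choices made here. With zero cross-covariances, the conditional mean of a missing component within mixture component i is simply its mean µ_i, so the estimate is a posterior-weighted average of the component means.

```python
import math
from typing import List, Sequence


def estimate_missing(
    received: Sequence[float],             # values of the received components
    received_idx: Sequence[int],           # their positions in the parameter vector
    missing_idx: Sequence[int],            # positions of the missing components
    weights: Sequence[float],              # alpha_i
    means: Sequence[Sequence[float]],      # mu_i, one row per mixture component
    variances: Sequence[Sequence[float]],  # diagonal of C_i, one row per component
) -> List[float]:
    """MMSE estimate of the missing components under a diagonal-covariance GMM."""
    # log( alpha_i * prod_j N(u_j; mu_ij, sigma_ij^2) ) over the received components
    log_post = []
    for alpha, mu, var in zip(weights, means, variances):
        lp = math.log(alpha)
        for u, j in zip(received, received_idx):
            lp += -0.5 * math.log(2.0 * math.pi * var[j]) - 0.5 * (u - mu[j]) ** 2 / var[j]
        log_post.append(lp)
    # Normalise in the log domain to avoid underflow of the product.
    peak = max(log_post)
    unnorm = [math.exp(lp - peak) for lp in log_post]
    total = sum(unnorm)
    posterior = [w / total for w in unnorm]
    # Posterior-weighted average of the component means of the missing entries.
    return [sum(p * mu[j] for p, mu in zip(posterior, means)) for j in missing_idx]


# Toy model (hypothetical numbers): two mixture components in two dimensions.
alphas = [0.5, 0.5]
mus = [[0.0, 0.0], [10.0, 10.0]]
sigmas2 = [[1.0, 1.0], [1.0, 1.0]]
print(estimate_missing([5.0], [0], [1], alphas, mus, sigmas2))
```

In the toy model, a received value halfway between the two centroids gives equal posterior weights, so the missing component is estimated halfway between the two means; a received value close to one centroid pulls the estimate toward that component's mean.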

The memory requirement of this novel solution is directly proportional to the order N of the parameter vector and to the number M of mixture components; i.e., for a parameter vector with 6 components and Gaussian mixture models with diagonal covariance matrices, (6 + 6 + 1) · M values must be stored. Since M is on the order of 100, the memory requirement of the invention is significantly smaller than that of a histogram-based method, which, with sufficiently fine quantization of the parameter vector, requires on the order of 300,000 memory locations.
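The comparison can be made concrete with a short calculation. The text gives only the order of magnitude of the histogram size; reading it as a joint histogram over one 6-component vector quantized with 3 bits per component (2^18 bins) is an assumption carried over from the earlier example.

```python
N = 6                   # components per parameter vector
M = 100                 # mixture components (order of magnitude named in the text)
bits_per_component = 3  # assumption, taken from the earlier 3-bit example

# Per mixture component: N means + N variances + 1 mixture weight.
gmm_values = (N + N + 1) * M
# One bin per joint quantization cell of a single parameter vector.
histogram_bins = (2 ** bits_per_component) ** N

print(gmm_values, histogram_bins, round(histogram_bins / gmm_values))
```

Under these assumptions the mixture model needs roughly two hundred times fewer stored values than the histogram.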

The estimation method described in the invention can also be used to advantage in the storage of voice or audio signals. For example, certain parameters produced by the coding method can be omitted during storage if they can later be reconstructed with small error when the stored signals are retrieved. This reduces the memory required to store a coded voice or audio signal.

Claims

1. A method for frame-by-frame transmission of coded voice or audio signals over communication networks using a voice or audio coding method, characterized in that the parameters determined by the coding method for a signal frame are distributed over several packets, so that a packet to be transmitted can contain, in addition to parameters of the current signal frame, parameters of preceding or subsequent signal frames, and in that, in the event of transmission disturbances or packet loss, the disturbed or missing parameters are reconstructed using a priori knowledge, wherein the a priori knowledge about one or more of the parameters produced by the coding method is stored in the form of mixture models, a mixture model being understood as a weighted sum of functions according to equation (3).

2. A method for storing coded voice or audio signals using a voice or audio coding method, characterized in that selected parameters determined by the coding method for a signal frame are not stored, and in that, when the stored signals are restored, the missing parameters are reconstructed using a priori knowledge, wherein the a priori knowledge about one or more of the parameters produced by the coding method is stored in the form of mixture models, a mixture model being understood as a weighted sum of functions according to equation (3).

3. A method according to any one of the preceding claims, characterized in that the mixture models approximate the joint density functions or marginal density functions of the unquantized or the quantized parameters;

4. A method according to any one of the preceding claims, characterized in that the mixture models are formed from (multivariate) Gaussian distributions;

5. A method according to any one of the preceding claims, characterized in that the mixture models are formed from (multivariate) gamma or Laplace distributions;

6. A method according to any one of the preceding claims, characterized in that the mixture models represent or approximate a priori knowledge for a single parameter of the coding method;

7. A method according to any one of the preceding claims, characterized in that the mixture models represent or approximate a priori knowledge for several parameters of one frame of the coding method;

8. A method according to any one of the preceding claims, characterized in that the mixture models represent or approximate a priori knowledge for one or more parameters of successive frames of the coding method;

9. A method according to any one of the preceding claims, characterized in that the disturbed or missing parameters are reconstructed by means of an optimal estimator;

10. A method according to any one of the preceding claims, characterized in that the optimal estimator minimizes the mean square estimation error;

11. A method according to any one of the preceding claims, characterized in that the optimal estimator maximizes the a posteriori probability of the disturbed or missing parameters;

12. A method according to any one of the preceding claims, characterized in that an estimator according to equations (4) - (16) is used;

13. A method according to any one of the preceding claims, characterized in that an estimator according to equations (17) - (19) is used;

14. A method according to any one of the preceding claims, characterized in that the mixture model has fully populated covariance matrices;

15. A method according to any one of the preceding claims, characterized in that the mixture model has diagonally populated covariance matrices;

16. A method according to any one of the preceding claims, characterized in that the mixture model has sparsely populated covariance matrices;

17. A method according to any one of the preceding claims, characterized in that the transmission takes place over a mobile radio network;

18. A method according to any one of the preceding claims, characterized in that the transmission takes place over a packet-switched network;

19. A method according to any one of the preceding claims, characterized in that the transmission takes place using the Internet protocols (IP, UDP, TCP, RTP, RCTP);

20. A method according to any one of the preceding claims, characterized in that the parameters of the coding method are the coefficients of a linear prediction filter or transformations of these coefficients (e.g., "Line Spectral Frequencies");

21. A method according to any one of the preceding claims, characterized in that the quantized parameters are protected against transmission errors before transmission by adding redundancy;

22. A method according to any one of the preceding claims, characterized in that redundancy added to the quantized parameters before transmission allows transmission errors to be detected in the receiver;

23. A method according to any one of the preceding claims, characterized in that one and the same parameter of the coding method can also be assigned to several packets;

24. A method according to any one of the preceding claims, characterized in that the ETSI/3GPP Adaptive Multi-Rate (AMR) speech codec is used for transmission;

25. A method according to any one of the preceding claims, characterized in that an audio coding method according to MPEG (Moving Pictures Expert Group), e.g., "MP3", is used;

26. A method according to any one of the preceding claims, characterized in that a coding method of the Global System for Mobile Communications (GSM) is used;

27. A method according to any one of the preceding claims, characterized in that the parameters of the coding method are quantized signal samples or bits of these samples.