WO1993006590A1

WO1993006590A1 - A speech coding device

Info

Publication number: WO1993006590A1
Application number: PCT/BE1991/000067
Authority: WO
Inventors: Gao Yang
Original assignee: Lernout & Hauspie Speechproducts
Priority date: 1991-09-20
Filing date: 1991-09-20
Publication date: 1993-04-01
Also published as: JPH06502928A; EP0558492A1

Abstract

A speech coding device having a first input for receiving a first error word determined upon an ideal and an estimated excitation word itself determined from a speech signal, said first input being connected to a second input of a first unit, a third input of which being connected to a code word generator, an output of said first unit being connected to a filter unit which is close looped with said code word generator, said code word generator being provided for generating a first series of first code words having a substantially periodic character with a period associated with an input pitch period, said code word generator being also provided for generating a second series of second code words having a substantially non-periodic character, said coding device comprises processing means provided for determining upon words output by said filter unit an energy value and for selecting among said code words the one having produced the lowest energy value.

Description

"A speech coding device"

The invention relates to a speech coding device having a first input for receiving a first error word determined upon an ideal and an estimated excitation word themselves determined from a speech signal, said first input being connected to a second input of a first unit, a third input of which being connected to a code word generator, said first unit being provided for determining a second error word upon inputted words, an output of said first unit being connected to a filter unit which is close looped with said code word generator, said code word generator being provided for generating a series of code words, said coding device comprises processing means provided for determining upon words output by said filter unit an energy value and for selecting among said code words the one having produced the lowest energy value.

Such a speech coding device is known from the article by H. Hassanein, A. Brind'Amour and K. Bryden entitled "A 4800 bps CELP Vocoder with an Improved Excitation", published in the papers of the International Mobile Satellite Conference of Ottawa, 1990.

The first error word input into the speech coding device originates for example from a weighting filter unit. The code word generator comprises a code book having a first and a second part, in which periodic code words for voiced speech signals and non-periodic code words for unvoiced speech signals are respectively stored. The choice between a first and a second code word is made upon an analysis of the input speech signal in order to determine whether the input speech signal is voiced or unvoiced. Such an analysis is done by using the LTP (Long Term Prediction) gain and a predetermined threshold value. The term "word", be it an error word or a code word, means a sequence of binary samples.

A drawback of the known speech coding device is that the generated code word is fully dependent on the choice made between a voiced and an unvoiced speech signal. Even if the choice is correctly made, it does not necessarily imply that the best code word is found in the selected code book. That best code word could as well be found in the non-chosen code book. When the voiced-unvoiced choice is not correctly made, this could also lead to a wrong choice of code book. The basic problem in choosing between a voiced and an unvoiced speech signal is that many speech signals are in fact, most of the time, built up of a mixture of voiced and unvoiced signals.

It is an object of the present invention to realize a speech coding device wherein an adequate code word is generated, independently of whether the processed signal is voiced or unvoiced, a voiced-unvoiced mixture, a plosive sound or any other type of speechlike signal.

A speech coding device according to the invention is therefore characterized in that said code word generator being provided for generating a first series of first code words having a substantially periodic character with a period associated with an input pitch period, said code word generator being also provided for generating a second series of second code words having a substantially non-periodic character. Since the code word generator now generates first as well as second code words, it is no longer necessary to make a choice between voiced and unvoiced speech, so that errors caused by an erroneous choice are avoided.

A first preferred embodiment of a speech coding device according to the invention is characterized in that said code word generator is provided for generating a third series of third code words having a substantially periodic character with a period corresponding to a fraction m (m ∈ R) or a multiple r (r ∈ R) of said input pitch value. The pitch value is generally determined by an LTP analysis. Because this LTP analysis sometimes does not determine the pitch value correctly, an error in the pitch value would automatically lead to erroneous first code words. The provision of said third code words makes it possible to take into account errors made in determining the pitch value.

The invention will now be described in more detail by means of the drawings, in which:

Figure 1 shows an embodiment of a coding device according to the invention.

Figures 2 and 3 show a first and a second series of code words, respectively.

The illustrated speech coding device comprises a first input 1 for receiving a first error word. This first error word is for example determined by computing the difference between an ideal and an estimated excitation word obtained after processing a speech signal which originates for example from a human voice.

The first input 1 is connected to a second input of a first unit 2.

The first unit is for example formed by a subtracting unit. An output of said first unit 2 is connected to a filter unit 3, which is close looped with a code word generator 6 via a unit provided for determining the energy value of the output signal of said filter unit 3 and via a minimization unit 5. An output of the code word generator 6 is connected via a gain multiplying element 7 to a third input of the first unit 2.

An input first error word is supplied to the first unit 2, which also receives an estimated word. Said estimated word is formed by multiplying a code word output by the code word generator 6 by a gain value, this operation being carried out by element 7. The first unit subtracts the estimated word from the first error word and thus forms a second error word, which is supplied to the filter unit 3 in order to form a third error word. The energy-determining unit computes the energy value of the input third error word. This energy value is then input into the minimization unit 5, which controls the code word generator 6.

The code word generator 6 is provided for generating a first series of first code words as well as a second series of second code words. For the sake of clarity, the first code words will be considered first. Figure 2 shows an example of a first series of first code words. Consider for example a subframe of length L = 50 samples and a pitch length P = 20 samples. The pitch length is for example determined by means of an LTP (Long Term Prediction) analysis. With L = 50 and P = 20, the initial first code word (a in figure 2) in the first series has a bit quantified as non-zero bit at sample positions 0, 20 and 40. Thus, the distance between successive non-zero bits in the first code words is each time 20 samples, i.e. the pitch length P. Each subsequent first code word (b, c, ... i) of said first series is then obtained by shifting the non-zero bits in time over one sample period. Thus, for example, the first code word illustrated under b in figure 2 has non-zero bits at sample positions 1, 21 and 41, while the last first code word (i in figure 2) has only one non-zero bit, at sample position 49. In the example illustrated in figure 2, P < L; if P > L there is of course only one non-zero bit in each first code word.
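The construction of the first series can be illustrated with a minimal Python sketch. It assumes that the series contains one code word per shift from 0 to L - 1, which is consistent with the last code word having a single non-zero bit at position 49, and it represents every non-zero bit as +1, since the text does not specify amplitudes. The names periodic_code_words and pulse_positions are hypothetical.

```python
def periodic_code_words(L, P):
    """Build L code words of length L; word number `shift` has non-zero
    bits at shift, shift + P, shift + 2P, ... as long as they stay below L."""
    words = []
    for shift in range(L):
        word = [0] * L
        for pos in range(shift, L, P):
            word[pos] = 1  # assumption: non-zero bits are represented as +1
        words.append(word)
    return words


def pulse_positions(word):
    """Return the sample positions of the non-zero bits of a code word."""
    return [i for i, v in enumerate(word) if v != 0]


series = periodic_code_words(50, 20)
print(pulse_positions(series[0]))   # [0, 20, 40]
print(pulse_positions(series[1]))   # [1, 21, 41]
print(pulse_positions(series[49]))  # [49]
```

When P > L, the inner loop runs only once per word, so each code word contains a single non-zero bit, as stated above.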

In operating the speech coding device according to the invention, each first code word of said first series is successively supplied to said third input of the first unit in order to determine the second and third error words. The code word generator is further provided for selecting, among the first code words, the one having produced the third error word with the lowest energy value. This is for example realized by comparing, each time, the energy value obtained with the considered code word with a temporarily memorized lowest energy value obtained with a preceding first code word during the same operation, and by overwriting the memorized lowest energy value if it is larger than the one obtained with the considered first code word. Together with that lowest energy value there is also stored the sequence number of the considered first code word, or another index identifying that sequence number, in order to identify the first code word having produced the third error word with the lowest energy value.

The code word generator is also provided for generating a second series of second code words upon a received predetermined code value Q indicating the number of bits quantified as non-zero bits that have to be considered in said second code words. In figure 3, the signal e illustrates an example of an analog form of a first error word, while f illustrates an example of a second code word wherein Q = 6 and d = 1, d being a shift value indicating the position of the first non-zero bit in the second code word. The distance between successive non-zero bits is determined by L/Q, where L is again the subframe length.
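The search over the first series, with its memorized lowest energy value and stored sequence number, can be sketched as follows. This is only an illustration under stated assumptions: the filter unit is modelled as a short causal FIR filter with hypothetical taps, the gain is taken as a given constant, and the names fir_filter, energy and search_lowest_energy do not appear in the source.

```python
def fir_filter(x, taps):
    """Causal FIR filter: y[n] = sum over k of taps[k] * x[n - k]."""
    return [sum(t * x[n - k] for k, t in enumerate(taps) if n - k >= 0)
            for n in range(len(x))]


def energy(x):
    """Energy value of a word: sum of squared samples."""
    return sum(v * v for v in x)


def search_lowest_energy(first_error, code_words, gain, taps):
    """Try every code word in turn and keep the index of the one whose
    third error word has the lowest energy value."""
    best_index, best_energy = None, float("inf")
    for index, word in enumerate(code_words):
        # Second error word: first error word minus the estimated word.
        second_error = [e - gain * c for e, c in zip(first_error, word)]
        # Third error word: second error word after the filter unit.
        third_error = fir_filter(second_error, taps)
        value = energy(third_error)
        if value < best_energy:  # overwrite the memorized lowest value
            best_index, best_energy = index, value
    return best_index, best_energy


# Small example: subframe length 8, pitch length 4, one word per shift.
L, P = 8, 4
code_words = [[1 if i >= s and (i - s) % P == 0 else 0 for i in range(L)]
              for s in range(L)]
first_error = [0, 0, 1.0, 0, 0, 0, 1.0, 0]
index, value = search_lowest_energy(first_error, code_words,
                                    gain=1.0, taps=[1.0, 0.5])
print(index)  # 2
```

In this toy case the code word with shift 2 reproduces the first error word exactly, so its third error word has zero energy and its index is the one retained.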

The code word generator comprises processing means for determining, from said second code word f and said first error word, a first and a second candidate word, an example of which is shown in figure 3. The first candidate word is for example obtained by multiplying f by a binary form of e, while the second candidate word is obtained by inverting the first non-zero bit value of the first candidate word.

The first and the second candidate words are now successively input into the first unit and the filter unit in order to obtain each time a third error word, e_{3,1} and e_{3,2} respectively. The processing means of the code word generator are also provided for comparing e_{3,1} with e_{3,2} and for selecting, among both third error words, the one having the lowest energy value. The candidate word having produced the third error word with the lowest energy value is then considered for further processing.

Suppose now that e_{3,1} < e_{3,2}. Under those circumstances, the first candidate word is chosen for further processing because its related lower energy value indicates that this candidate word is closer to the first error word.

A third candidate word is now generated by taking the first candidate word and inverting the second non-zero bit. A third error word e_{3,3} is now determined by considering the third candidate word. The energy value of e_{3,3} is compared with that of e_{3,1} and again the lowest one is selected. The candidate word having generated the lowest energy value is then again considered for determining the fourth candidate word, and the process described hereinbefore is repeated until all Q non-zero bits have been considered, i.e. until the Qth candidate word has been generated and considered.
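The candidate procedure can be sketched in Python as follows. Several points are assumptions rather than statements of the source: non-zero bits are placed at d + floor(k·L/Q) because L/Q need not be an integer, the "binary form" of the first error word is taken to be its sign (+1 or -1), and the filter unit is modelled by a two-tap FIR filter. All function names are hypothetical.

```python
def second_code_word_positions(L, Q, d):
    """Positions of the Q non-zero bits, spaced by about L/Q, starting at d."""
    positions = []
    for k in range(Q):
        pos = d + (k * L) // Q  # assumption: integer rounding of k * L / Q
        if pos < L:
            positions.append(pos)
    return positions


def sign(v):
    """Assumed binary form of an error sample: +1 or -1."""
    return 1 if v >= 0 else -1


def error_energy(first_error, word, gain, taps=(1.0, 0.5)):
    """Energy of the third error word for a given candidate word."""
    second = [e - gain * c for e, c in zip(first_error, word)]
    third = [sum(t * second[n - k] for k, t in enumerate(taps) if n - k >= 0)
             for n in range(len(second))]
    return sum(v * v for v in third)


def refine_candidate(first_error, positions, gain):
    """Start from the first candidate word, then invert one non-zero bit at
    a time and keep whichever candidate gives the lower energy value."""
    best = [0] * len(first_error)
    for p in positions:
        best[p] = sign(first_error[p])
    best_energy = error_energy(first_error, best, gain)
    for p in positions:
        trial = list(best)
        trial[p] = -trial[p]  # invert this non-zero bit
        trial_energy = error_energy(first_error, trial, gain)
        if trial_energy < best_energy:
            best, best_energy = trial, trial_energy
    return best, best_energy


print(second_code_word_positions(50, 6, 1))  # [1, 9, 17, 26, 34, 42]
best, best_energy = refine_candidate([0.0, 0.1, -2.0, 0.0], [1], gain=1.0)
print(best)  # [0, -1, 0, 0]
```

In the small example the inverted bit wins because, after filtering, it partly cancels the large negative sample that follows it. This is the kind of case in which the sign taken directly from the first error word is not the best choice.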

After having considered all Q generated candidate words, the energy value of the remaining candidate word is compared with the energy value of the first code word selected by considering the series of first code words. Of these two energy values, the lowest is chosen. The first code word or the candidate word having the lowest energy value is then selected, and this word is the one output by the speech coding unit.

From the description set out hereinbefore, it will be clear that, contrary to what is done in the known coding devices, it is not necessary to make a selection between voiced and unvoiced speech in order to determine a first code word or a second code word.

In the description given hereinbefore only one Q value has been considered for determining the candidate words. However, it will be clear that several Q values can be considered. For each Q value considered, the described process is then repeated, and the candidate words issued for each Q value are compared with each other on the basis of their energy values, the one with the lowest energy value being chosen.
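The final selection amounts to taking the minimum over the best first code word and the best candidate word obtained for each Q value. A minimal sketch, in which each result is represented as a hypothetical (energy, word) pair and the words are shown as labels for brevity:

```python
def select_output_word(first_series_result, candidate_results):
    """Each result is an (energy, word) pair; return the pair whose energy
    value is lowest among the first code word and all candidate words."""
    all_results = [first_series_result] + list(candidate_results)
    return min(all_results, key=lambda result: result[0])


# Illustrative values only; they are not taken from the source.
best_first = (3.2, "first code word, shift 7")
best_candidates = [(2.9, "candidate word, Q = 6"),
                   (3.5, "candidate word, Q = 8")]
chosen = select_output_word(best_first, best_candidates)
print(chosen[1])  # candidate word, Q = 6
```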

It could happen that the pitch value determined by the LTP analysis is not the correct one. This would of course have consequences for the first series of first code words. In order to provide a solution for this problem, the code word generator of the speech coding device according to the present invention is preferably provided for generating a third series of third code words. The third code words are built up in an analogous manner to the first code words, but the distance between successive bits quantified as non-zero bits now equals a fraction m (m ∈ R) or a multiple r (r ∈ R) of the input pitch value P. In a preferred embodiment m = 2 and thus the distance between successive non-zero bits is P/2. The third code words are considered in an analogous manner to those of the first series. However, computation time can be saved by not considering the last third code words of the third series. Since the third code words are, just like the first code words, obtained by shifting the non-zero bits, the last third code words equal the corresponding first code words, so that it would be redundant to consider them again.
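The redundancy of the last third code words can be checked with a short Python sketch. It assumes, as an illustration only, that both series contain one code word per shift from 0 to L - 1, that non-zero bits are represented as +1, and that P is divisible by m. The function name shifted_pulse_words is hypothetical.

```python
def shifted_pulse_words(L, period):
    """One code word per shift; non-zero bits at shift, shift + period, ..."""
    words = []
    for shift in range(L):
        word = [0] * L
        for pos in range(shift, L, period):
            word[pos] = 1  # assumption: non-zero bits are represented as +1
        words.append(word)
    return words


L, P, m = 50, 20, 2
first_series = shifted_pulse_words(L, P)
third_series = shifted_pulse_words(L, P // m)  # distance P/2 = 10 samples

# Shifts for which the third code word is identical to the first code word.
redundant = [s for s in range(L) if third_series[s] == first_series[s]]
print(redundant)  # [40, 41, 42, 43, 44, 45, 46, 47, 48, 49]
```

With L = 50 and P/2 = 10, the last ten code words of the third series contain a single non-zero bit and coincide with the first code words of the same shift, so they need not be evaluated a second time.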

Of course, a code word generator provided for generating, besides a first and a third series, also a fourth and even further series, each having a different m or r value, could also be realized.

Claims

1. A speech coding device having a first input for receiving a first error word determined upon an ideal and an estimated excitation word itself determined from a speech signal, said first input being connected to a second input of a first unit, a third input of which being connected to a code word generator, said first unit being provided for determining a second error word upon inputted words, an output of said first unit being connected to a filter unit which is close looped with said code word generator, said code word generator being provided for generating a series of code words, said coding device comprises processing means provided for determining upon words output by said filter unit an energy value and for selecting among said code words the one having produced the lowest energy value, characterized in that said code word generator being provided for generating a first series of first code words having a substantially periodic character with a period associated with an input pitch period, said code word generator being also provided for generating a second series of second code words having a substantially non-periodic character.

2. A speech coding device as claimed in claim 1, characterized in that said code word generator is provided for generating a third series of third code words having a substantially periodic character with a period corresponding to a fraction m (m ∈ R) of said input pitch value.

3. A speech coding device as claimed in claim 2, characterized in that said code word generator is provided for generating a third series of third code words having a substantially periodic character with a period corresponding to a multiple r (r ∈ R) of said input pitch value.