CA2083335A1 - Method for the quantification of the energy of the speech signal in a vocoder with very low bit rate - Google Patents

Method for the quantification of the energy of the speech signal in a vocoder with very low bit rate

Info

Publication number
CA2083335A1
Authority
CA
Canada
Prior art keywords
energy
value
frame
determined number
energies
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Abandoned
Application number
CA 2083335
Other languages
French (fr)
Inventor
Pierre-Andre Laurent
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Thales SA
Original Assignee
Individual
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/02 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using spectral analysis, e.g. transform vocoders or subband vocoders
    • G10L19/032 Quantisation or dequantisation of spectral components
    • G10L19/038 Vector quantisation, e.g. TwinVQ audio
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/04 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using predictive techniques
    • G10L19/06 Determination or coding of the spectral characteristics, e.g. of the short-term prediction coefficients

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Spectroscopy & Molecular Physics (AREA)
  • Computational Linguistics (AREA)
  • Signal Processing (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Compression, Expansion, Code Conversion, And Decoders (AREA)

Abstract

ABSTRACT OF THE DISCLOSURE
The method consists in dividing the speech signal into packets of a determined number of frames of a constant duration by the sampling of a determined number n of energy values in each frame, quantifying the first energy value measured in each first frame of a packet according to a determined number Q0 of bits and the variations of the k - 1 remaining energies in relation to the first value of the energy sampled on a determined number Q1 of bits smaller than Q0, the variations of the k - 1 energies being selected from a table of "slopes" enabling each energy sample k to be assigned the energy "slope" that separates it from the energy of the "k - 1th" previous sample. Application: vocoders.
Figure 3

Description

METHOD FOR THE QUANTIFICATION OF THE ENERGY OF THE
SPEECH SIGNAL IN A VOCODER WITH VERY LOW BIT RATE
BACKGROUND OF THE INVENTION
1. Field of the Invention
The present invention relates to a method for the quantification of the energy of the speech signal in a vocoder with a very low bit rate.
It can be applied notably to the making of the linear prediction vocoders used for the transmission of speech by radio, similar to those described for example in the Revue Technique THOMSON-CSF (THOMSON-CSF Technical Journal), volume 14, No. 3, September 1982, pp. 715 to 731, in which the speech signal is identified at the output of a digital filter, the input of which receives either a periodic waveform corresponding to the waveforms of its voiced sounds such as the vowels or a random waveform corresponding to the waveforms of its unvoiced sounds such as most of its consonants.
2. Description of the Prior Art
It is known that the auditory quality of linear prediction vocoders depends greatly on the precision with which their predictive filter is quantified, but also on the quality of the restitution of the power profile of the excitation. This is especially true for certain transitory sounds such as many consonants: for example, poor quality restitution does not allow a "d" to be distinguished from a "t" or from a "k".
As a rule, the speech signal is segmented into frames of constant duration, and a single value of power (or energy) is given for each frame.
In vocoders with very low bit rate, one way to lower the bit rate is to increase the duration of the frame, for example from 22.5 ms to 30 ms, and to group together the parameters relating to several frames so that they are quantified in a single operation. This enables the different parameters of synthesis to be renewed less frequently.
Unfortunately, the intelligibility of the restituted speech is diminished, for the transmitting of only one value of energy per frame no longer enables the appropriate restitution of certain transitory sounds.
A first known way to overcome these difficulties consists in grouping the frames together in packets while considering k values of energy per packet, each of which can be represented by the coordinates of a point referenced in a k-dimensional space. A statistical analysis makes it possible to determine the main axes of the cloud of the points observed. The quantification takes place on the coordinates of the points borne by the main axes, each point being quantified on a number of bits depending on the eigenvalue or characteristic value associated with each axis considered. However, the drawback of operating in this way is that it is necessary to plan a procedure of correction at the synthesis filter so that the values of the energies computed are not negative. Furthermore, in this processing operation, no special attention is paid to the fidelity of restitution of the transitory sounds.
According to a second method, also known, which partly follows the procedure of the first method by the grouping of frames in packets and which also takes k values of energy per packet into consideration, the k values of energy are no longer encoded in a scalar way but vectorially by means of a dictionary containing $M = 2^{Q}$ multiplets of k values each, the k values being quantified on Q bits.
In this case, the difficulties of setting up the system arise from the fact that it is necessary, firstly, to create and store a dictionary and, secondly, to carry out a quantification. Since the dictionary is generally poorly structured and since it is necessary to count at least two bits per value of energy, the encoding on Q bits occupies no fewer than $2^{2k}$ combinations, which represents a very heavy computing load for the signal processors of the vocoders.
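The size of such a dictionary can be tabulated for a few packet lengths. The following is a rough sketch only: it assumes exactly two bits per energy value (so that Q = 2k) and an exhaustive search costing k multiply-accumulate operations per dictionary entry. The function name `dictionary_search_cost` and the operation count are illustrative assumptions, not figures taken from the description.

```python
def dictionary_search_cost(k, bits_per_value=2):
    """Return (entries, operations) for an exhaustive dictionary search.

    Assumptions (not from the source text): Q = bits_per_value * k, and
    comparing the input against one entry costs k multiply-accumulates.
    """
    q = bits_per_value * k          # total number of bits Q for the packet
    entries = 2 ** q                # M = 2^Q multiplets in the dictionary
    return entries, entries * k     # every entry holds k energy values


# Print the growth of the dictionary with the number of energies per packet.
for k in (4, 6, 8):
    entries, operations = dictionary_search_cost(k)
    print(f"k={k}: {entries} entries, {operations} operations per packet")
```

Under these assumptions the dictionary grows by a factor of four for every additional energy value in the packet.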
SUMMARY OF THE INVENTION
It is the aim of the invention to overcome the above-mentioned drawbacks. To this effect, an object of the invention is a method for the quantification of the energy of the speech signal in a vocoder with very low bit rate, said method consisting in dividing (1) the speech signal into packets of a determined number of frames of a constant duration by the sampling of a determined number n of energy values in each frame, quantifying (2, 3, 4) the first energy value measured in each first frame of a packet according to a determined number Q0 of bits and the variations of the k - 1 remaining energies in relation to the first value of the energy sampled on a determined number Q1 of bits smaller than Q0, the variations of the k - 1 energies being selected from a table of "slopes" enabling each energy sample k to be assigned the energy "slope" that separates it from the energy of the "k - 1th" or "k - 1 order" previous sample.
The main advantage of the method according to the invention is that it can be used to obtain high quality energy in each frame of the speech signal while at the same time respecting the energy transitions from frame to frame without thereby affecting the computation load and the necessary memory space in the vocoder.
BRIEF DESCRIPTION OF THE DRAWINGS
Other features and advantages of the invention shall appear from the following description, made with reference to the appended drawings, of which:
- Figures 1 and 2 show two graphs to illustrate the principle of quantification of the energy of a vocoder implemented by the invention;
- Figure 3 is a flow chart illustrating the different steps of the method according to the invention.
MORE DETAILED DESCRIPTION
The method according to the invention consists, in the manner shown in figure 1, in segmenting the speech signal into frames with a constant determined duration ranging, for example, from 22.5 to 30 ms, grouping the frames in packets of L frames with a determined number n of energy values of the signal measured in each frame, so as to transmit, in each packet, only the first quantified value of the energy E1 measured in the first frame of a packet as well as the k - 1 values of the differences of the energies existing between the samples that follow, k being equal to n·L. In reception, the differences of the energies received are placed end to end after the first energy value that is received in the first frame of each packet to reconstitute the profile of the quantified values of the energies at emission.
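The reconstitution performed in reception can be sketched as a running sum. This is a minimal illustration, assuming that the energies are expressed in dB so that the received differences simply add; the function name `reconstruct_profile` is hypothetical.

```python
def reconstruct_profile(first_energy_db, differences_db):
    """Rebuild the k quantified energies of one packet.

    first_energy_db: the first quantified energy of the packet (dB, assumed).
    differences_db:  the k - 1 received energy differences (dB, assumed).
    """
    profile = [first_energy_db]
    for difference in differences_db:
        # Each difference is placed end to end after the previous value.
        profile.append(profile[-1] + difference)
    return profile


# Example with arbitrary values: a rise of 7 dB, a plateau, then a 3 dB fall.
print(reconstruct_profile(30.0, [7.0, 0.0, -3.0]))
```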
To do this, in the emission vocoder, a first value of energy is quantified in each first frame k0 of a packet in a determined number Q0 of bits and the variations of the k - 1 remaining energies are quantified with a determined number Q1 of bits smaller than Q0. The $2^{Q_0}$ possible initial values include a zero value representing the silences. The other values are distributed according to an almost logarithmic scale which is best suited to following the properties of sensitivity of the ear: the higher the level of the speech signal, the smaller is the quantification step. Typically, a 3 dB step is adopted for the low levels and a 1 dB step is adopted for the high levels. The $m = 2^{Q_1}$ other values represent energy increments dj, also referred to hereinafter as "legal values of energy", the values of which are predetermined to emphasize the transitions. These transitions are chosen for example as being respectively equal to -3 dB, 0 dB, +2 dB and +7 dB if the number Q1 is encoded with only two bits.
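The bit budget and the shape of the initial scale can be sketched as follows. This is an illustration under stated assumptions: the split between "low" and "high" levels, the 0 dB floor, the use of `None` to mark the silence code, and the example values Q0 = 5 and k = 8 are all hypothetical; only the 3 dB and 1 dB steps and Q1 = 2 come from the text.

```python
def packet_bits(q0, q1, k):
    """Bits per packet: Q0 for the first energy, Q1 for each of the k - 1 increments."""
    return q0 + (k - 1) * q1


def build_initial_scale(q0, floor_db=0.0, n_low=None,
                        low_step_db=3.0, high_step_db=1.0):
    """Build 2**q0 initial values: one silence code, then an almost
    logarithmic scale with coarse steps at low levels and fine steps
    at high levels. n_low (number of coarse levels) is an assumption."""
    n_speech = 2 ** q0 - 1              # one code is reserved for silence
    if n_low is None:
        n_low = n_speech // 2           # arbitrary split, for illustration
    scale = [None]                      # None stands for the zero (silence) value
    level = floor_db
    for i in range(n_speech):
        scale.append(level)
        level += low_step_db if i < n_low - 1 else high_step_db
    return scale


print(packet_bits(q0=5, q1=2, k=8))     # bits needed for one packet
print(build_initial_scale(3))           # small scale, easy to inspect
```

With these example values the packet costs fewer than half the bits that would be needed if each of the k energies were coded independently on Q0 bits.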
As can be seen in figure 2, the energy increments can be used to make a search, from each quantified value B of a frame k, for the quantified values of the energy in the k - 1th preceding frame which could lead to said value B by a legal increment dj, starting with the zero increment d0.
The numbers Q0 and Q1 are determined according to the steps 1 to 5 of the method represented by the flow chart of figure 3. The first step, referenced 1 in figure 3, groups together the frames in packets of L frames. The values of the energies E1 to Ek are computed at the step 2. These are quantified in the manner shown in figures 1 and 2 between two values Emax and Emin in relation to a scale comprising P graduations which may be identified for convenience's sake with the $2^{Q_0}$ possible values of the initial energy E1 measured in the first frame. The quantified values corresponding to the $2^{Q_0}$ possible values are designated in figure 2 by $e_0, e_1, \ldots, e_{P-1}$ with $e_0 = E_{\min}$ and $e_{P-1} = E_{\max}$.
The method continues at the step 3 in figure 3 by an initialization stage in which a set of P distances is computed between the first value of energy E1 and the P possible quantified values of this energy. The corresponding distances D(p) are memorized in the form of a first table (D), not shown, in a memory of the vocoder. The computations take place by squaring the differences between the first energy E1 and the quantified values $e_0, e_1, \ldots, e_{P-1}$ according to the relationship:
$D(p) = (E_1 - e_p)^2$, where p = 0, 1 ... P - 1
The computed distances are all the smaller as the quantified value $e_p$ is closer to the value E1.
The next step 4 consists, in a manner similar to the known VITERBI algorithm, in carrying out k - 1 iterations aimed at estimating the distances between all the potential quantification profiles and the real energy profile, and in eliminating the least probable quantification profiles. A second table (D'), not shown and referenced "slope", is prepared. For each of the iterations 1 to k - 1, this second table D' associates a slope or a legal energy increment dj with each quantified value p of the iteration k. A search for the quantified value of the preceding k - 1th iteration is done by the ticking off, in the "slopes" table, of the "part" or legal increment dj that can lead directly thereto, beginning with the zero increment d0. The sequence of the programming instructions to be implemented is the following:

- FOR p = 0 ... P - 1, DO
    /* initialization for a zero increment */
    - Let Dmin = D'(p - d0) = D'(p) and let PrecIndex = 0
    /* test of the non-zero increments */
    - FOR j = 1 ... m - 1, DO
        - IF p - dj >= 0 AND p - dj <= P - 1 THEN /* legal value dj */
            - IF D'(p - dj) < Dmin THEN /* shorter distance */
                - DO Dmin = D'(p - dj)
                - DO PrecIndex = j
            - END IF
        - END IF
    - END DO
    - DO SlopeIndex(k, p) = PrecIndex /* memorize the most probable quantified value at the preceding step */
    - DO D(p) = Dmin + (Ek - ep)^2 /* update the distance */
- END DO
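The initialization and the k - 1 iterations above can be sketched in Python as follows. This is one reading of the listing, not a reference implementation: it assumes that the increments are expressed as offsets on the index of the quantification scale (as the test on p - dj suggests), that the first increment is the zero increment d0, and that D' denotes the distances of the previous iteration. The names `viterbi_forward`, `levels` and `increments` are hypothetical.

```python
def viterbi_forward(energies, levels, increments):
    """Forward pass over one packet.

    energies:   measured energies E1..Ek.
    levels:     quantified values e_0..e_{P-1}.
    increments: legal index offsets d_0..d_{m-1}, with d_0 == 0 (assumed).
    Returns (dist, slope_index): the final cumulated distances D(p) and,
    for each iteration, the index of the best increment reaching p.
    """
    P = len(levels)
    # Initialization: D(p) = (E1 - e_p)^2
    dist = [(energies[0] - e) ** 2 for e in levels]
    slope_index = []
    for energy in energies[1:]:
        prev = dist                      # D' in the listing
        dist = [0.0] * P
        row = [0] * P
        for p in range(P):
            best, best_j = prev[p], 0    # start with the zero increment
            for j in range(1, len(increments)):
                q = p - increments[j]    # candidate predecessor index
                if 0 <= q <= P - 1 and prev[q] < best:
                    best, best_j = prev[q], j
            row[p] = best_j
            dist[p] = best + (energy - levels[p]) ** 2
        slope_index.append(row)
    return dist, slope_index


# Small example: four levels, offsets 0, -1, +1, +2 and two measured energies.
dist, slopes = viterbi_forward([1.0, 3.0], [0.0, 1.0, 2.0, 3.0], [0, -1, 1, 2])
print(dist, slopes)
```

Each iteration costs on the order of P × m comparisons, which is why the cumulated distances can be updated frame by frame.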

Thus, at the k - 1th iteration, a table of distances D(p) is prepared. This table, at the position p, contains the cumulated distance between the best quantified profile that arrives at the position p and the original profile. This makes it possible to keep, in memory, a table of slope indices wherein the value SlopeIndex(k, p) represents the index of the best possible slope to arrive at the quantified value ep at the step k. The two tables thus obtained make it possible to arrive at a final decision. To do this, the method entails carrying out a search in the table D(p) for the index Pmin which corresponds to the minimum value. Then it consists in making a trace-back in the slopes table by carrying out k - 1 iterations programmed as follows:
- FOR k = K - 1, K - 2, ..., 1 DO
    - DO DiffIndex(k) = SlopeIndex(k, Pmin)
    - DO Pmin = Pmin - dj, where j = DiffIndex(k)
- END DO
The index values DiffIndex(1 ... K - 1) are the indices of the best quantified values possible for the slopes dj. The final value of Pmin is then simply the most probable quantified value of the first energy.
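The final decision and the trace-back can be sketched as below. The sketch assumes that the slope table stores, for each iteration and each position p, the index j of the increment dj used to reach p, and that increments are index offsets on the quantification scale; the tables in the example are hand-built for illustration, and the name `trace_back` is hypothetical.

```python
def trace_back(dist, slope_index, increments):
    """Recover the first-level index and the k - 1 increment indices.

    dist:        final cumulated distances D(p).
    slope_index: slope_index[t][p] = index of the best increment reaching p
                 at iteration t + 1 (list of k - 1 rows, assumed layout).
    increments:  legal index offsets d_0..d_{m-1}.
    """
    # Final decision: position with the minimum cumulated distance.
    p = min(range(len(dist)), key=dist.__getitem__)
    diff_index = [0] * len(slope_index)
    for t in range(len(slope_index) - 1, -1, -1):
        j = slope_index[t][p]
        diff_index[t] = j
        p -= increments[j]       # step back to the predecessor position
    return p, diff_index


# Hand-built tables for a two-energy packet with four levels.
first_level, diffs = trace_back([9.0, 4.0, 1.0, 0.0],
                                [[1, 0, 2, 3]],
                                [0, -1, 1, 2])
print(first_level, diffs)
```

The emitter would then transmit the first-level index on Q0 bits followed by each increment index on Q1 bits.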
The correspondence between the original profile of the values of the energies to be quantified and the final profile after quantification is shown in figure 1. The fact that the algorithm automatically eliminates the aberrant values resulting from a false analysis appears in the fourth value of energy shown in figure 1.
Naturally, the method that has just been described can always be matched to particular characteristics of the system of analysis. In particular, if this system tends to find erroneous values for energy, it is always possible to minimize the influence of the erroneous values through the replacement, for example, of the squaring operations used for the distance measurements by absolute values that enable the profile of the quantified values to be linked with the correct values of energy, provided that they are more numerous than the incorrect values.
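The effect of replacing squares by absolute values can be illustrated on a toy profile. The values below are invented for the illustration: one measured energy out of five is aberrant, and two candidate quantified profiles are compared, one that stays flat and one that chases the aberrant value with increments of +7, +7, -3 and -3. The function names are hypothetical.

```python
def squared(a, b):
    """Squared difference, as used in the distance computation of the method."""
    return (a - b) ** 2


def absolute(a, b):
    """Absolute difference, the more outlier-tolerant alternative."""
    return abs(a - b)


def profile_distance(measured, quantified, metric):
    """Cumulated distance between a measured and a quantified profile."""
    return sum(metric(m, q) for m, q in zip(measured, quantified))


measured = [10, 10, 30, 10, 10]      # the third value is aberrant
flat = [10, 10, 10, 10, 10]          # ignores the aberrant value
chasing = [10, 17, 24, 21, 18]       # follows it: +7, +7, -3, -3

for name, metric in (("squared", squared), ("absolute", absolute)):
    print(name,
          profile_distance(measured, flat, metric),
          profile_distance(measured, chasing, metric))
```

On this example the squared distance favours the profile that chases the aberrant value, whereas the absolute distance favours the flat profile that matches the four correct values.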
Furthermore, the operations of matching and fine tuning for the vocoder require only modifications of the quantified starting values (number and values), the increments (number and values), or again the number of iterations.
Finally, the method that has just been described represents only a small computation load since the initialization is done starting with the very first frame, and the kth iteration is done at the k + 1 frame. This enables the distribution of the computation load in time, except for the last frame where the final decision is taken without the arrangement's being costly in terms of computation power.

Claims (6)

1. A method for the quantification of the energy of the speech signal in a vocoder with very low bit rate, said method consisting in dividing the speech signal into packets of a determined number of frames of a constant duration by the sampling of a determined number n of energy values in each frame, quantifying the first energy value measured in each first frame of a packet according to a determined number Q0 of bits and the variations of the k - 1 remaining energies in relation to the first value of the energy sampled on a determined number Q1 of bits smaller than Q0, the variations of the k - 1 energies being selected from a table of "slopes" enabling each energy sample k to be assigned the energy "slope" that separates it from the energy of the "k - 1th" previous sample.
2. A method according to claim 1, consisting in memorizing the energy slopes associated with each energy sample in the order of appearance of the energy samples.
3. A method according to any one of the claims 1 or 2, wherein the first energy value measured in each first frame is quantified according to an almost logarithmic scale of quantification in giving a greater step value to the low levels of energy.
4. A method according to claim 3, wherein the variations of the k - 1 energies are quantified on levels distributed about a zero level of increase.
5. A method according to claim 4, wherein the selection of the parts of energy is done in making a search, in the table of the slopes, for one of the slopes corresponding to the quantification levels, starting with the zero slope increment d0 which leads, from an energy sample k of a frame, to an energy value closest to the value of the energy of the k - 1th preceding sample.
6. A method according to claim 5, wherein the determination of the variations of the k - 1 energies takes place by the application of the VITERBI algorithm.
CA 2083335 1991-11-22 1992-11-19 Method for the quantification of the energy of the speech signal in a vocoder with very low bit rate Abandoned CA2083335A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
FR9114402 1991-11-22
FR9114402A FR2684225A1 (en) 1991-11-22 1991-11-22 METHOD FOR QUANTIFYING SPEECH SIGNAL ENERGY IN A VOCODER AT VERY LOW SPEED.

Publications (1)

Publication Number Publication Date
CA2083335A1 true CA2083335A1 (en) 1993-05-23

Family

ID=9419210

Family Applications (1)

Application Number Title Priority Date Filing Date
CA 2083335 Abandoned CA2083335A1 (en) 1991-11-22 1992-11-19 Method for the quantification of the energy of the speech signal in a vocoder with very low bit rate

Country Status (3)

Country Link
EP (1) EP0543700A3 (en)
CA (1) CA2083335A1 (en)
FR (1) FR2684225A1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110109646A (en) * 2019-03-28 2019-08-09 北京迈格威科技有限公司 Data processing method, device and adder and multiplier and storage medium

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
FR2331207A2 (en) * 1975-11-05 1977-06-03 Ibm France BLOCK QUANTIFICATION PROCESS OF SAMPLES OF AN ELECTRIC SIGNAL, AND DEVICE FOR IMPLEMENTING THE SAID PROCESS
US4870685A (en) * 1986-10-26 1989-09-26 Ricoh Company, Ltd. Voice signal coding method
FR2661541A1 (en) * 1990-04-27 1991-10-31 Thomson Csf METHOD AND DEVICE FOR CODING LOW SPEECH FLOW
DE4103277A1 (en) * 1991-02-04 1992-08-06 Hilberg Wolfgang Encoding of place or time-dependent analogue functions - assembling successive samples with symbolic significance into quasi-words for stepwise abstraction in hierarchical associative memory

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110109646A (en) * 2019-03-28 2019-08-09 北京迈格威科技有限公司 Data processing method, device and adder and multiplier and storage medium
CN110109646B (en) * 2019-03-28 2021-08-27 北京迈格威科技有限公司 Data processing method, data processing device, multiplier-adder and storage medium

Also Published As

Publication number Publication date
EP0543700A2 (en) 1993-05-26
EP0543700A3 (en) 1993-09-29
FR2684225A1 (en) 1993-05-28

Legal Events

Date Code Title Description
FZDE Dead