CA2083335A1 - Method for the quantification of the energy of the speech signal in a vocoder with very low bit rate - Google Patents
Method for the quantification of the energy of the speech signal in a vocoder with very low bit rate
- Publication number
- CA2083335A1
- Authority
- CA
- Canada
- Prior art keywords
- energy
- value
- frame
- determined number
- energies
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Abandoned
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
- G10L19/02—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using spectral analysis, e.g. transform vocoders or subband vocoders
- G10L19/032—Quantisation or dequantisation of spectral components
- G10L19/038—Vector quantisation, e.g. TwinVQ audio
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
- G10L19/04—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using predictive techniques
- G10L19/06—Determination or coding of the spectral characteristics, e.g. of the short-term prediction coefficients
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Spectroscopy & Molecular Physics (AREA)
- Computational Linguistics (AREA)
- Signal Processing (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Human Computer Interaction (AREA)
- Acoustics & Sound (AREA)
- Multimedia (AREA)
- Compression, Expansion, Code Conversion, And Decoders (AREA)
Abstract
ABSTRACT OF THE DISCLOSURE
The method consists in dividing the speech signal into packets of a determined number of frames of constant duration, sampling a determined number n of energy values in each frame, and quantifying the first energy value measured in the first frame of each packet on a determined number Q0 of bits and the variations of the k - 1 remaining energies, in relation to the first energy value, on a determined number Q1 of bits smaller than Q0. The variations of the k - 1 energies are selected from a table of "slopes" enabling each energy sample k to be assigned the energy "slope" that separates it from the energy of the previous sample k - 1. Application:
Vocoders.
Figure 3
Description
METHOD FOR THE QUANTIFICATION OF THE ENERGY OF THE
SPEECH SIGNAL IN A VOCODER WITH VERY LOW BIT RATE
BACKGROUND OF THE INVENTION
1. Field of the Invention
The present invention relates to a method for the quantification of the energy of the speech signal in a vocoder with a very low bit rate.
It can be applied notably to the making of linear prediction vocoders used for the transmission of speech by radio, similar to those described, for example, in the Revue Technique THOMSON-CSF (THOMSON-CSF Technical Journal), volume 14, No. 3, September 1982, pp. 715 to 731, in which the speech signal is identified at the output of a digital filter, the input of which receives either a periodic waveform corresponding to the waveforms of its voiced sounds such as the vowels or a random waveform corresponding to the waveforms of its unvoiced sounds such as most of its consonants.
2. Description of the Prior Art
It is known that the auditory quality of linear prediction vocoders depends greatly on the precision with which their predictive filter is quantified, but also on the quality of the restitution of the power profile of the excitation. This is especially true for certain transitory sounds such as many consonants: for example, poor quality restitution does not allow a "d" to be distinguished from a "t" or from a "k".
As a rule, the speech signal is segmented into frames of constant duration, and a single value of power (or energy) is given for each frame.
In vocoders with very low bit rate, one way to lower the bit rate is to increase the duration of the frame, for example from 22.5 ms to 30 ms, and to group together and quantify the parameters relating to several frames all at once. This enables the different parameters of synthesis to be renewed less frequently.
Unfortunately, the intelligibility of the restituted speech is diminished, because transmitting only one value of energy per frame no longer enables the appropriate restitution of certain transitory sounds.
A first known way to overcome these difficulties consists in grouping the frames together in packets while considering k values of energy per packet, which can be represented by the coordinates of a point referenced in a k-dimensional space. A statistical analysis makes it possible to determine the main axes of the cloud of the points observed. The quantification takes place on the coordinates of the points borne by the main axes, each point being quantified on a number of bits depending on the eigenvalue or characteristic value associated with each axis considered. However, the drawback of operating in this way is that it is necessary to plan a procedure of correction at the synthesis filter so that the values of the energies computed are not negative. Furthermore, in this processing operation, no special attention is paid to the fidelity of restitution of the transitory sounds.
According to a second method, also known, which partly follows the procedure of the first method by the grouping of frames in packets and which also takes k values of energy per packet into consideration, the k values of energy are no longer encoded in a scalar way but vectorially, by means of a dictionary containing M = $2^Q$ multiplets of k values each, the k values being quantified together on Q bits.
In this case, the difficulties of setting up the system arise from the fact that it is necessary, firstly, to create and store a dictionary and, secondly, to carry out a quantification. Since the dictionary is generally poorly structured and since it is necessary to count at least two bits per value of energy, the encoding of the number Q occupies no fewer than $2^{2k}$ combinations, which represents a very major computing load for the signal processors of the vocoders.
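To give a sense of scale for the dictionary of this second method, the following minimal sketch computes its size under the constraint stated above of at least two bits per energy value, so that Q is at least 2k and M = 2^Q is at least 2^(2k). The function name `vq_codebook_size` and the parameter `bits_per_value` are illustrative assumptions, not terms taken from the source.

```python
from typing import Tuple


def vq_codebook_size(k: int, bits_per_value: int = 2) -> Tuple[int, int]:
    """Return (number of multiplets M, number of stored energy values).

    Assumption: the dictionary is indexed on Q = bits_per_value * k bits,
    so it holds M = 2**Q multiplets of k values each.
    """
    if k < 1 or bits_per_value < 1:
        raise ValueError("k and bits_per_value must be positive")
    q = bits_per_value * k
    m = 2 ** q
    return m, m * k
```

With k = 8 energies per packet, for instance, an exhaustive search would have to compare each packet against 65536 multiplets, which illustrates why the dictionary approach weighs on both memory and computation.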
SUMMARY OF THE INVENTION
It is the aim of the invention to overcome the above-mentioned drawbacks. To this effect, an object of the invention is a method for the quantification of the energy of the speech signal in a vocoder with very low bit rate, said method consisting in dividing (1) the speech signal into packets of a determined number of frames of a constant duration by the sampling of a determined number n of energy values in each frame, quantifying (2, 3, 4) the first energy value measured in each first frame of a packet according to a determined number Q0 of bits and the variations of the k - 1 remaining energies in relation to the first value of the energy sampled on a determined number Q1 of bits smaller than Q0, the variations of the k - 1 energies being selected from a table of "slopes" enabling each energy sample k to be assigned the energy "slope" that separates it from the energy of the "k - 1th" or "k - 1 order" previous sample.
The main advantage of the method according to the invention is that it gives a high-quality restitution of the energy in each frame of the speech signal while respecting the energy transitions from frame to frame, without thereby affecting the computation load and the necessary memory space in the vocoder.
BRIEF DESCRIPTION OF THE DRAWINGS
Other features and advantages of the invention shall appear from the following description, made with reference to the appended drawings, of which:
- Figures 1 and 2 show two graphs to illustrate the principle of quantification of the energy of a vocoder implemented by the invention;
- Figure 3 is a flow chart illustrating the different steps of the method according to the invention.
MORE DETAILED DESCRIPTION
The method according to the invention consists, in the manner shown in figure 1, in segmenting the speech signal into frames with a constant determined duration ranging, for example, from 22.5 to 30 ms, grouping the frames in packets of L frames, and sampling a determined number n of energy values of the signal in each frame, so as to transmit, in each packet, only the first quantified value of the energy E1 measured in the first frame of the packet as well as the k - 1 values of the differences of the energies existing between the samples that follow, k being equal to n × L. In reception, the differences of the energies received are placed end to end after the first energy value that is received in the first frame of each packet to reconstitute the profile of the quantified values of the energies at emission.
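The reception side described above amounts to a running sum: the received differences are placed end to end after the first energy value. A minimal sketch, assuming that energies and differences are both expressed in dB; the function name `reconstruct_profile` is hypothetical and not taken from the source.

```python
from itertools import accumulate
from typing import List, Sequence


def reconstruct_profile(first_energy_db: float,
                        increments_db: Sequence[float]) -> List[float]:
    """Rebuild the k quantified energies of a packet from the first value
    and the k - 1 received differences, placed end to end."""
    # accumulate(..., initial=x) yields x, x + d1, x + d1 + d2, ...
    return list(accumulate(increments_db, initial=first_energy_db))
```

For example, a first value of 30 dB followed by the differences +2 dB, +7 dB and -3 dB gives the profile 30, 32, 39, 36 dB.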
To do this, in the emission vocoder, a first value of energy is quantified in each first frame k0 of a packet on a determined number Q0 of bits, and the variations of the k - 1 remaining energies are quantified on a determined number Q1 of bits smaller than Q0. The $2^{Q_0}$ possible initial values include a zero value representing the silences. The other values are distributed according to an almost logarithmic scale, which is best suited to following the properties of sensitivity of the ear: the higher the level of the speech signal, the smaller the quantification step.
Typically, a 3 dB step is adopted for the low levels and a 1 dB step is adopted for the high levels. The m = $2^{Q_1}$ other values represent energy increments dj, also referred to hereinafter as "legal values of energy", the values of which are predetermined to emphasize the transitions. These transitions are chosen, for example, as being respectively equal to -3 dB, 0 dB, +2 dB and +7 dB if the number Q1 is encoded with only two bits.
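The scale of initial values and the resulting bit budget can be sketched as follows. The 3 dB and 1 dB steps and the reserved silence value come from the description; everything else is an assumption made for illustration: the function names, the choice of where the scale switches from the coarse to the fine step (`n_coarse`), and the lowest level `lowest_db` are not specified in the source.

```python
from typing import List, Optional

SILENCE = None  # in this sketch, code 0 is reserved for silence


def build_scale(q0_bits: int, lowest_db: float, n_coarse: int,
                coarse_step_db: float = 3.0,
                fine_step_db: float = 1.0) -> List[Optional[float]]:
    """Return the 2**q0_bits initial values: silence, then levels in dB.

    The first n_coarse intervals use the coarse step (low levels),
    the remaining intervals use the fine step (high levels).
    """
    if q0_bits < 1:
        raise ValueError("q0_bits must be at least 1")
    n_levels = 2 ** q0_bits - 1  # number of non-silent values
    if not 0 <= n_coarse <= n_levels - 1:
        raise ValueError("n_coarse out of range")
    levels = [lowest_db]
    for i in range(1, n_levels):
        step = coarse_step_db if i <= n_coarse else fine_step_db
        levels.append(levels[-1] + step)
    return [SILENCE] + levels


def packet_bits(q0_bits: int, q1_bits: int, k: int) -> int:
    """Bits spent on energy per packet: one initial value, k - 1 increments."""
    return q0_bits + (k - 1) * q1_bits
```

As a hypothetical illustration, with Q0 = 5, Q1 = 2 and k = 8 energies per packet, the energy profile costs 5 + 7 × 2 = 19 bits per packet.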
As can be seen in figure 2, the energy increments can be used to make a search, from each quantified value B of a frame k, for the quantified values of the energy in the preceding frame k - 1 which could lead to said value B by a legal increment dj, starting with the zero increment d0.
The numbers Q0 and Q1 are determined according to the steps 1 to 5 of the method represented by the flow chart of figure 3. The first step, referenced 1 in figure 3, groups together the frames in packets of L frames. The values of the energies E1 to Ek are computed at the step 2. These are quantified in the manner shown in figures 1 and 2 between two values Emax and Emin in relation to a scale comprising P graduations, which may be identified for convenience's sake with the $2^{Q_0}$ possible values of the initial energy E1 measured in the first frame. The quantified values corresponding to the $2^{Q_0}$ possible values are designated in figure 2 by $e_0, e_1, \ldots, e_{P-1}$, with $e_0$ = Emin and $e_{P-1}$ = Emax.
The method continues at the step 3 in figure 3 by an initialization stage in which a set of P distances is computed between the first value of energy E1 and the P possible quantified values of this energy.
The corresponding distances D(p) are memorized in the form of a first table (D), not shown, in a memory of the vocoder. The computations take place by squaring the differences between the first energy E1 and the quantified values $e_0, e_1, \ldots, e_{P-1}$ according to the relationship:
$D(p) = (E_1 - e_p)^2$, where p = 0, 1 ... P - 1
The closer the quantified value $e_p$ is to the value E1, the smaller the computed distance. The next step 4 consists, in a manner similar to the known VITERBI algorithm, in carrying out k - 1 iterations aimed at estimating the distances between all the potential quantification profiles and the real energy profile, and in eliminating the least probable quantification profiles. A second table (D'), not shown and referenced "slope", is prepared. For each of the iterations 1 to k - 1, this second table D' associates a slope or a legal energy increment dj with each quantified value p of the iteration k. A search for the quantified value of the preceding iteration k - 1 is done by the ticking off, in the "slopes" table, of the "part" or legal increment dj that can lead directly thereto, beginning with the zero increment d0. The sequence of the programming instructions to be implemented is the following:
- FOR p = 0 ... P - 1, DO
    /* initialization for a zero increment */
    - LET Dmin = D'(p - d0) = D'(p) and LET PrecIndex = 0
    /* test of the non-zero increments */
    - FOR j = 1 ... m, DO
        - IF p - dj >= 0 AND p - dj <= P - 1 THEN /* legal value dj */
            - IF D'(p - dj) < Dmin THEN /* shorter distance */
                - DO Dmin = D'(p - dj)
                - DO PrecIndex = j
            - END IF
        - END IF
    - END DO
    - DO SlopeIndex(k, p) = PrecIndex /* memorize the most probable quantified value at the preceding step */
    - DO D(p) = Dmin + (Ek - ep)^2 /* update the distance */
- END DO
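The initialization of step 3 and one iteration of step 4 can be sketched in Python as follows. This is an illustrative reading of the instruction sequence above, not the patented implementation: it assumes that the legal increments are expressed as integer offsets in graduations of the scale (as the test on p - dj suggests), that `offsets[0]` is the zero increment, and that the previous distances are kept in a separate list. The names `initial_distances`, `viterbi_step` and `offsets` are hypothetical.

```python
from typing import List, Sequence, Tuple


def initial_distances(levels: Sequence[float], first_energy: float) -> List[float]:
    """Step 3: squared distance between E1 and each quantified value e_p."""
    return [(first_energy - e) ** 2 for e in levels]


def viterbi_step(prev_dist: Sequence[float], levels: Sequence[float],
                 offsets: Sequence[int],
                 energy: float) -> Tuple[List[float], List[int]]:
    """Step 4, one iteration: keep the cheapest legal predecessor of each level.

    prev_dist[p] is the cumulated distance of the best profile ending at level p.
    Returns the updated distances and, per level, the index j of the increment kept.
    """
    n = len(levels)
    new_dist: List[float] = []
    slope_index: List[int] = []
    for p in range(n):
        best, best_j = prev_dist[p], 0        # start with the zero increment
        for j in range(1, len(offsets)):      # test the non-zero increments
            q = p - offsets[j]                # candidate predecessor level
            if 0 <= q < n and prev_dist[q] < best:
                best, best_j = prev_dist[q], j
        slope_index.append(best_j)
        new_dist.append(best + (energy - levels[p]) ** 2)  # update the distance
    return new_dist, slope_index
```

Calling `viterbi_step` once per remaining energy value and storing each returned `slope_index` list gives the two tables used for the final decision. Returning a fresh distance list, rather than updating in place, avoids reading a value that has already been overwritten in the same iteration.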
Thus, at the k - 1th iteration, a table of distances D(p) is prepared. This table, at the position p, contains the cumulated distance between the best quantified profile that arrives at the position p and the original profile. This makes it possible to keep, in memory, a table of slope indices wherein the slope index value (k, p) represents the index of the best possible slope to arrive at the quantified value $e_p$ at the step k. The two tables thus obtained make it possible to arrive at a final decision. To do this, the method entails carrying out a search in the table D(p) for the index Pmin which corresponds to the minimum value. Then it consists in making a trace-back in the slopes table by carrying out k - 1 iterations programmed as follows:
- FOR k = K - 1, K - 2, ..., 1 DO
    - DiffIndex(k) = SlopeIndex(k, Pmin)
    - Pmin = Pmin - SlopeIndex(k, Pmin)
- END DO
The index values DiffIndex(1 ... K - 1) are the indices of the best quantified values possible for the slopes dj. The final value of Pmin is then simply the most probable quantified value.
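The final decision and the trace-back can be sketched as below. This is a hedged illustration under stated assumptions: the slopes table is taken to store, for each iteration and each level, the index j of the increment kept, and the increments are taken to be integer offsets in graduations of the scale, so the sketch steps back by `offsets[j]` rather than by the stored index itself. The names `trace_back` and `offsets` are hypothetical.

```python
from typing import List, Sequence, Tuple


def trace_back(final_dist: Sequence[float],
               slope_index: Sequence[Sequence[int]],
               offsets: Sequence[int]) -> Tuple[int, List[int]]:
    """Return (index of the first quantified value, increment indices in order).

    slope_index[i][p] is the increment index memorized at iteration i + 1
    for level p; final_dist[p] is the cumulated distance after the last iteration.
    """
    # Final decision: the level with the minimum cumulated distance.
    p = min(range(len(final_dist)), key=lambda i: final_dist[i])
    diff_index: List[int] = []
    for row in reversed(slope_index):   # iterations K - 1, K - 2, ..., 1
        j = row[p]
        diff_index.append(j)
        p -= offsets[j]                 # step back to the predecessor level
    diff_index.reverse()
    return p, diff_index
```

The returned pair is what a packet would carry in this sketch: the index of the initial value, to be encoded on Q0 bits, followed by k - 1 increment indices, each to be encoded on Q1 bits.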
The correspondence between the original profile of the values of the energies to be quantified and the final profile after quantification is shown in figure 1. The fact that the algorithm automatically eliminates the aberrant values resulting from a false analysis appears in the fourth value of energy shown in figure 1.
Naturally, the method that has just been described can always be matched to particular characteristics of the system of analysis. In particular, if this system tends to find erroneous values for energy, it is always possible to minimize the influence of the erroneous values by replacing, for example, the squaring operations used for the distance measurements by absolute values, which enable the profile of the quantified values to be linked with the correct values of energy, provided that these are more numerous than the incorrect values.
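The effect of swapping the squared distance for an absolute value can be seen on a deliberately simplified case, which is not the full trellis search: choosing the single level that best fits a set of measured energies containing one erroneous value. The function names and the sample values below are hypothetical.

```python
from typing import Callable, Iterable, Sequence


def squared(x: float, e: float) -> float:
    """Distance used in the description above."""
    return (x - e) ** 2


def absolute(x: float, e: float) -> float:
    """Alternative distance, less sensitive to isolated erroneous values."""
    return abs(x - e)


def best_level(observations: Sequence[float], levels: Iterable[float],
               distance: Callable[[float, float], float]) -> float:
    """Level minimizing the cumulated distance to all the observations."""
    return min(levels, key=lambda e: sum(distance(x, e) for x in observations))
```

With four measurements at 10 dB and one erroneous measurement at 40 dB, on a scale of integer levels from 0 to 40 dB, the squared distance settles on 16 dB, pulled toward the outlier, whereas the absolute distance settles on 10 dB, the value shared by the majority.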
Furthermore, the operations of matching and fine tuning for the vocoder require only modifications of the quantified starting values (number and values), the increments (number and values), or again the number of iterations.
Finally, the method that has just been described represents only a small computation load since the initialization is done starting with the very first frame, and the kth iteration is done at the (k + 1)th frame. This enables the distribution of the computation load in time, except for the last frame where the final decision is taken, without the arrangement being costly in terms of computation power.
'
It is known that the auditory quality of linear prediction vocoders depends greatly on the precision with which their predictive filter is quantified, but also on the quality of the restitution of the power profile of the excita~ion. This is especially true for certain transitory sounds such as many consonants: for .
.
~ ' example, poor quality restitution does not allow a "d"
to be distinguished from a "t" or from a "k".
As a rule, the speech signal is segmented into frames of constant duration, and a single value of power (or energy) is given ~ox each frame.
In vocoders with very low bit rate, one way to lower the bit rate is to increase the duration of the frame, for example from 22.5 ms to 30 ms as well as to group together and quantify the parameters relating to several frames once alone. This enables the dlfferent parameters of synthesis to be renewed less frequently.
Unfortunately, the intelligibility of the restituted speech is diminished, for the transmitting of only one value of ener~y per frame no longer enables the appropriate restitution of certain transitory sounds.
A first known way to overcome these difficulties consists in grouping the frames together in packets while considering k values o~ energy per packet, each of which can be represented by the coordinates of a point referenced in a k-dimensional space. A
statistical analysis makes it possible to determine the main axas of the cloud of the poin~s observed. The quantification takes place on the coordinates of the points borne by the main axes t each point being quantified on a number o~ bits depending on the eigen value or characteristic value associated with each axis considered. However, he drawback of operating in this ~3~
way is that it is necessary to plan a p.rocedure of correction at the synthesis filter so that the values of the energies compute~ are not negative. Furthermore, in this processing operation, no special attention is paid to the fidelity of restitution of the transitory sounds.
According to a second method, also known, which partly follows the procedure of the first method by the grouping of frames in packets and which also takes k values of energy per packet into consideration, the k values of energy are no longer encoded in a scalar way but vectorially by means of a dictionary containing M =
2Q multiplets of k v~lues each in considering the k values to be quantified on Q bits.
In this case, the difficulties of setting up the system appear from the fact that it is necessary, firstly, to create and store a dictionary and, secondly, to carry out a quantification. Since the dictionary is generally poorly structured and since it is necessary to count at least two bits per value of energy, the encoding o~ the number Q occupies no less than 22 combinations which represents very major computing loads for the signal processors of the vocoders.
SUMMARY OF THE INVENTIO~
:
It is the aim of :the invention to overcome the above-mentioned drawbacks. To this effect, an object of '.
:
, - 2~833~
the invention is a method ~or the quan-tification of the energy of the speech signal in a vocoder with very low bit rate, said method consisting in dividing (1) the speech signal into packets of a determined number of ~frames of a constant duration by the sampling of a determined number n o~ energy values in each frame, quantifying ~2, 3, 4) the first energy value measured in each fixst frame of a packet according to a determinéd number QO of bits and the variations of the k - 1 remaining energies in relation to the first value of the energy sampled on a determined number Q1 of bits smaller t,han Q0, the variations~ of the k - 1 energies being selected from a table of "slopes" enabling each energy sample k ~o be assigned the energy "slope'i that separates~i~ from the energy of the "k - 1th" or "k - 1 order" previous sample.
The main advantage of the method according to the invention is that it can be used to obtain high quality energy in each frame of the speech signal while at the same time respecting the energy transitions from frame to frame without thereby affecting the computation load and the necessary memory space in the vocoder.
BRIEF DESCRIPTION OF THE DRAWINGS
Other features and advantages of the invention shall appear~from~the~followlng description, made with reference to the appended drawings, of which:
:
-- 2~833~
s .
- Figures 1 and 2 show two graphs to illustrate the principle of quantification of the energy of a vocoder implemented by the inventlon;
- Figure 3 is a flow chart illus~rating the different steps of the method according to the invention.
MOEE DETAILED DESCRIPTIOM~ ~
The method~according to the lnvention consists, in :
the manne~ shown in fi~ure 1, in segmenting the speech signal into irames wlth a constant determlned duration ranging, for examplel from 22.5 to 30 ms, grouping the frames in packets of a determined number n of energy values of the signal in each frame to transmit, in each packet, only~the first quantified value of the energy ~measured~El in the~first frame of a packet as well as the k - l values o~ the diffPrences of the energies existing between the frames that follow, k being equal to n.L. In reception, the differences of the energies received are placed end to end after the first energy value that~;is received in the first frame of each packet to reconstitute the profile of the quantified values of the energies at emission.
To do this, in the emission vocoder, a first value :: :
; of energy is quantified in each first frame ko of a packet in a determined number QO of bits and the :: ~ : ::
variations of ~ the k~ - 1 remaining energies are quantified with a determined number Q1 of bits smaller . . . : . . :
: - ~ .
' 2~3~3~
than QO. The 2Qo possible initial values include a zero value representing the silences. ThP other values are distributed according to an almost logarithmic scale which is best suited to following the properties of sensitivity of the ear: the higher the level of the speech signal, the smaller is the quantification step.
Typically, a 3dB step is adopted for the low levels and a 1 dB step is adopted for the high levels. The m = 2Q1 other values represent energy increments d; also referred to hereinafter as "legal values of energy", the values of which are predetermined to emphasize the transitions. These transitions are chosen for example as being respectively e~ual to -3dR, OdB, +2dB and +7dB
f the number Ql ls~encoded with only two bits.
As can be~seen ln figure 2, the energy increments can be used to make a search, from each quantified value B of a frame k, for the quantified values ~ of the energy in the k - 1th preceding ~rame which could lead to said value B by a legal increment dj starting with the zero increment do~
The numbers QO and Q1 are determined according to the steps 1 to S of the method represented by the flow chart of figure; 3. The flrst step referenced 1 in figure 3 groups together the frames in packets of L
frames. The values of the energies E1 to Ek are computed~at the step 2. These are quantified in the manner shown in figures 1 and 2 between two values Emax - ~$3~3~
and Emjn in relation to a scale comprising P
graduatlons which may be identified for convenience's sake with the 2Qo possible values of the initial energy E1 measured in the first frame. The quantified values corresponding to the 2Qo posslble values are designated in figure 2 by eO, e1 ... ep_1 with eO = Emjn and ep_1 = Emax The method continues at the step 3 in figure 3 by an initialization stage in which a set of P distances is computed between the first value of energy E and the P possible quantified value.s of this energy.
The corresponding distances Dp are memorized in the form of a first table (D~, not shown, in a memory of the vocoder. The computations take place by squaring the differences between the first energy EI and the quantified values eO, el... ep_1 according to the relationship:
D(p) = (El - ep)2 where p = 0, 1 ... P - 1 The computed distances are all the smaller as the quantified value ep is closer to the value El. The next step 4 consists, in a manner similar to the known VITERBI algorit~mt in in carrying out k - 1 iterations aimed at estimating the distances between all the potential quantification profiles and the real energy profile, in eliminating the least probable quantification profiles. ~ second table (D') not shown and referenced ~Islope~ is prepared. For each of the - . . .
. - ., -.
. .
- 2~33~
iterations l to k - l r this second table D' associates a slope or a legal energy increment dj with each quantified value P of the i.teration k. A search for the quantified value of the preceding k - 1th iteration is ; 5 done by the ticking off, in the "slopes" table, of the "part" or legal increment dj that can lead directly thereto, beginning with the zero increment do. The sequence ~of ~the programming instructions to be implemented is the following:
- FOR p = 0 ... P - l, DO
/* initiaIization for a zero incrementation*/
- Let Dm;n = D (p~do~ = D' (p) and let PrecIndex = O
/*test of:the~non-zero Lncrementations */
- FOR i =~1 ... m, DO
~: - If p - dj > = O AND p - dj < = P - 1 THEN /*legal value dl*/
- If D'(p - dj) < Dmin then /* shorter distance */
DO Dmjn = ~' (P dj) - DO PrecIndex = i - END IF
- END IF
END DO
~ DO SlopeIndex ~k~p)=precIndex/* memoriæe the most probable quant1fied value at the preceding step*/
- DO D(p) = Dmjn -~ Ek-ep)2/* update the distance*/
END DO
.. . . . .
':' ''. ' ' ' : `' ` ` ~
, .
` ` ` . ' 2~3~3~
Thus, at -the k - 1th iteration, a table of distances D~) is prepared. This table, at the position p, contains the cumulated distance between the best quantified profile that arrives at the position p and the original profile. This makes it possible to keep, in memory, a table of slope indices wherein the slope index value (k, p) represents the index of the best possible s~ope to arrive at the quantified value ep at the step k. The two tables thus obtained make it 1~ possible~to arri.ve at a fina} decision. To do this, the method entails carrying out a search in the table D(+) for the index Pmin which corresponds to t~e minimum value. Then it conslsts in making a trace-back in the slopes table by carrying out k - 1 iterations programmed as follows:
- for k = K - 1, K - 2, ...., 1 DO
- Dif~Index(f) = SlopeIndex(k,p ~ Pmi n = Pmi n - SlopeIndex(kl Pmi n ) END DO
The index values Index Diff (1 .......... K - 1) are the indices of the best quantified values possible for the slopes Dj. The final value of Pmin is then simply the most probable quantified value~
The correspondence between the original profile o~
the values o~ the energies to be quantified after the final profile~after quantification is shown in figure 1. The fact that the algorithm automatically eliminates ' ~ . ': : . - ' ' .
- : - ~ ., ,: ' ,... . .
' ' ''' '. .
. ,' :~, ~ , - - 2~8~33~
the aberrant values resulting from a false analysis appears in the fourth value of energy shown in figure 1.
Naturally, the method that has just been described can always be matched to particular characteristics of the system of analysis. In particular, 1f this system tends to find erroneous values for energy, it is always possible to minimize the influence of the erroneous values through the replacement, for example, of the squaring operations used for the distance measurements by absolute values that enable the profile of the quantified values to be linked with the correct values of energy, provided that they are more numerous than the incorrect~values.
Furthermore, the operat1ons of matching and fine tuning fcr the vocoder require only modifications of the quantified starting values (number and values), the increments (number and values), or again the number of iterations.
~ Finally, the method that has just been described represents only a small computation load since the initialization is done starting with the very first frame, and the kth iteration is done at the k ~ 1 frame. This.enables the distribution of the computation load in time, except for the last frame where the final decision is taken without the arrangement's being costly in terms of computation power.
Claims (6)
1. A method for the quantification of the energy of the speech signal in a vocoder with very low bit rate, said method consisting in dividing the speech signal into packets of a determined number of frames of a constant duration by the sampling of a determined number n of energy values in each frame, quantifying the first energy value measured in each first frame of a packet according to a determined number Q0 of bits and the variations of the k - 1 remaining energies in relation to the first value of the energy sampled on a determined number Q1 of bits smaller than Q0, the variations of the k - 1 energies being selected from a table of "slopes" enabling each energy sample k to be assigned the energy "slope" that separates it from the energy of the "k - 1th" previous sample.
2. A method according to claim 1, consisting in memorizing the energy slopes associated with each energy sample in the order of appearance of the energy samples.
3. A method according to either of claims 1 or 2, wherein the first energy value measured in each first frame is quantified according to an almost logarithmic scale of quantification, giving a greater step value to the low levels of energy.
4. A method according to claim 3, wherein the variations of the k - 1 energies are quantified on levels distributed about a zero level of increase.
5. A method according to claim 4, wherein the selection of the parts of energy is done by making a search, in the table of the slopes, for one of the slopes corresponding to the quantification levels, starting with the zero slope increment d0, which leads, from an energy sample k of a frame, to an energy value closest to the value of the energy of the k - 1th preceding sample.
6. A method according to claim 5, wherein the determination of the variations of the k - 1 energies takes place by the application of the VITERBI algorithm.
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
FR9114402 | 1991-11-22 | ||
FR9114402A FR2684225A1 (en) | 1991-11-22 | 1991-11-22 | METHOD FOR QUANTIFYING SPEECH SIGNAL ENERGY IN A VOCODER AT VERY LOW SPEED. |
Publications (1)
Publication Number | Publication Date |
---|---|
CA2083335A1 true CA2083335A1 (en) | 1993-05-23 |
Family
ID=9419210
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CA 2083335 Abandoned CA2083335A1 (en) | 1991-11-22 | 1992-11-19 | Method for the quantification of the energy of the speech signal in a vocoder with very low bit rate |
Country Status (3)
Country | Link |
---|---|
EP (1) | EP0543700A3 (en) |
CA (1) | CA2083335A1 (en) |
FR (1) | FR2684225A1 (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110109646A (en) * | 2019-03-28 | 2019-08-09 | 北京迈格威科技有限公司 | Data processing method, device and adder and multiplier and storage medium |
Family Cites Families (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
FR2331207A2 (en) * | 1975-11-05 | 1977-06-03 | Ibm France | BLOCK QUANTIFICATION PROCESS OF SAMPLES OF AN ELECTRIC SIGNAL, AND DEVICE FOR IMPLEMENTING THE SAID PROCESS |
US4870685A (en) * | 1986-10-26 | 1989-09-26 | Ricoh Company, Ltd. | Voice signal coding method |
FR2661541A1 (en) * | 1990-04-27 | 1991-10-31 | Thomson Csf | METHOD AND DEVICE FOR CODING LOW SPEECH FLOW |
DE4103277A1 (en) * | 1991-02-04 | 1992-08-06 | Hilberg Wolfgang | Encoding of place or time-dependent analogue functions - assembling successive samples with symbolic significance into quasi-words for stepwise abstraction in hierarchical associative memory |
1991
- 1991-11-22 FR FR9114402A patent/FR2684225A1/en active Pending
1992
- 1992-11-10 EP EP19920403025 patent/EP0543700A3/en not_active Withdrawn
- 1992-11-19 CA CA 2083335 patent/CA2083335A1/en not_active Abandoned
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110109646A (en) * | 2019-03-28 | 2019-08-09 | 北京迈格威科技有限公司 | Data processing method, device and adder and multiplier and storage medium |
CN110109646B (en) * | 2019-03-28 | 2021-08-27 | 北京迈格威科技有限公司 | Data processing method, data processing device, multiplier-adder and storage medium |
Also Published As
Publication number | Publication date |
---|---|
EP0543700A2 (en) | 1993-05-26 |
EP0543700A3 (en) | 1993-09-29 |
FR2684225A1 (en) | 1993-05-28 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
DE69322313T2 (en) | C.E.L.P. - vocoder | |
EP0504627B1 (en) | Speech parameter coding method and apparatus | |
US6148283A (en) | Method and apparatus using multi-path multi-stage vector quantizer | |
Juang et al. | Distortion performance of vector quantization for LPC voice coding | |
CN1954642B (en) | Multi-channel synthesizer and method for generating a multi-channel output signal | |
Sugamura et al. | Speech analysis and synthesis methods developed at ECL in NTT—From LPC to LSP— | |
US6202046B1 (en) | Background noise/speech classification method | |
EP0848374B1 (en) | A method and a device for speech encoding | |
EP0696026B1 (en) | Speech coding device | |
US5694426A (en) | Signal quantizer with reduced output fluctuation | |
JP3254687B2 (en) | Audio coding method | |
Soong et al. | Optimal quantization of LSP parameters using delayed decisions | |
EP0186763A1 (en) | Method of and device for speech signal coding and decoding by vector quantization techniques | |
US20040153318A1 (en) | System and method for enhancing bit error tolerance over a bandwidth limited channel | |
US5313553A (en) | Method to evaluate the pitch and voicing of the speech signal in vocoders with very slow bit rates | |
CA2083335A1 (en) | Method for the quantification of the energy of the speech signal in a vocoder with very low bit rate | |
CA2026823C (en) | Pitch period searching method and circuit for speech codec | |
Kroon et al. | Experimental evaluation of different approaches to the multi-pulse coder | |
CA2054849C (en) | Speech parameter encoding method capable of transmitting a spectrum parameter at a reduced number of bits | |
EP0694907A2 (en) | Speech coder | |
Ribeiro et al. | Application of speaker modification techniques to phonetic vocoding | |
JPH08234797A (en) | Voice parameter quantization device and vector quantization device | |
KR960015861B1 (en) | Quantizer & quantizing method of linear spectrum frequency vector | |
EP0755047B1 (en) | Speech parameter encoding method capable of transmitting a spectrum parameter at a reduced number of bits | |
EP0910064B1 (en) | Speech parameter coding apparatus |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
FZDE | Dead |