EP0543700A2 - Method for quantizing speech signal energy in a low bit rate vocoder - Google Patents
Method for quantizing speech signal energy in a low bit rate vocoder
- Publication number
- EP0543700A2 (application EP19920403025)
- Authority
- EP
- European Patent Office
- Prior art keywords
- energy
- frame
- value
- values
- variations
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Withdrawn
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
- G10L19/02—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using spectral analysis, e.g. transform vocoders or subband vocoders
- G10L19/032—Quantisation or dequantisation of spectral components
- G10L19/038—Vector quantisation, e.g. TwinVQ audio
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
- G10L19/04—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using predictive techniques
- G10L19/06—Determination or coding of the spectral characteristics, e.g. of the short-term prediction coefficients
Definitions
- the present invention relates to a method for quantizing the energy of the speech signal in a very low bit rate vocoder.
- the speech signal is segmented into frames of constant duration, and a single value of the power (or energy) of the excitation is provided for each frame.
- one way of lowering the bit rate is to increase the frame duration, for example from 22.5 ms to 30 ms, and to group the parameters of several frames and quantize them in a single operation. This makes it possible to refresh the synthesis parameters less often. Unfortunately, the intelligibility of the reconstructed speech decreases, because transmitting a single energy value per frame no longer allows certain transients to be reproduced adequately.
- a first known method consists in grouping the frames into packets of k energy values, each packet being representable by the coordinates of a point in a k-dimensional space.
- a statistical analysis makes it possible to determine the principal axes of the observed cloud of points.
- the quantization is applied to the coordinates of the points along the principal axes, each coordinate being quantized with a number of bits that depends on the eigenvalue associated with the axis considered.
- the drawback of this approach is that a correction procedure is needed at the synthesis filter so that the calculated energy values are not negative.
- in this processing, no particular attention is paid to the fidelity with which the transients are reproduced.
- the subject of the invention is a method for quantizing the energy of the speech signal in a very low bit rate vocoder, characterized in that it consists in dividing (1) the speech signal into packets of a determined number of frames of constant duration, sampling a determined number n of energy values in each frame, and quantizing (2, 3, 4) the first energy value measured in the first frame of each packet with a determined number Q0 of bits and the variations of the k - 1 remaining energies relative to the first sampled energy value with a determined number Q1 of bits less than Q0, the variations of the k - 1 energies being selected from a table of "slopes" that assigns to each energy sample k the "slope" of energy separating it from the energy of the previous sample k - 1.
- the main advantage of the method according to the invention is that it allows the energy to be quantized with good quality in each frame of the speech signal, while respecting the energy transitions from frame to frame, without affecting the computational load or the memory space required by the vocoder.
- the method according to the invention consists, as shown in FIG. 1, in segmenting the speech signal into frames of constant determined duration, for example between 22.5 and 30 ms, grouping the frames into packets, and sampling a determined number n of energy values of the signal in each frame, so as to transmit in each packet only the quantized value of the energy E1 measured in the first frame of the packet together with the k - 1 values of the differences between the successive energies that follow, k being equal to n·L, where L is the number of frames per packet.
- at the receiver, the received energy differences are placed end to end after the first energy value received in the first frame of each packet, which reconstructs the profile of quantized energy values established at transmission.
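The end-to-end reconstruction described above can be sketched as a running sum. This is only an illustration under stated assumptions: the function name `reconstruct_profile` is hypothetical, energies are taken to be in dB, and each received difference is assumed to add to the previously reconstructed value.

```python
from itertools import accumulate


def reconstruct_profile(first_energy_db, increments_db):
    """Rebuild the quantized energies of one packet, in dB.

    first_energy_db: decoded first energy value of the packet (assumed dB).
    increments_db:   decoded differences for the k - 1 remaining samples.
    """
    # Each increment is added to the previously reconstructed value,
    # so the profile is the running sum starting from the first energy.
    return list(accumulate(increments_db, initial=first_energy_db))


# Example packet with k = 5 energy values (numbers are illustrative only).
profile = reconstruct_profile(30.0, [2.0, 7.0, 0.0, -3.0])
print(profile)  # [30.0, 32.0, 39.0, 39.0, 36.0]
```

Because only the first value is absolute, a transmission error on one increment would shift every later value of the same packet; the next packet starts again from a fresh absolute value.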
- the first energy value, in the first frame k0 of each packet, is quantized with a determined number Q0 of bits, and the variations of the k - 1 remaining energies are quantized with a determined number Q1 of bits less than Q0.
- the 2^Q0 possible initial values include a null value representing silence.
- the other values are distributed on a quasi-logarithmic scale, which best follows the sensitivity properties of the ear, the quantization step being smaller the higher the level of the speech signal. Typically, a 3 dB step is adopted for low levels and a 1 dB step for high levels.
- the m = 2^Q1 other values represent energy increments d_i, hereinafter also called "legal" energy increments, whose values are predetermined to favor the transitions; they are chosen, for example, equal to -3 dB, 0 dB, +2 dB and +7 dB respectively if the number Q1 is coded with only 2 bits.
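A minimal sketch of such a scale and increment table follows. The text only gives the 3 dB and 1 dB steps, the silence code, and the four example increments; the function name `build_scale`, the choice Q0 = 5, the lowest level of 0 dB, and the point where the step changes are all assumptions made for illustration.

```python
def build_scale(q0_bits, e_min_db, n_coarse, coarse_step_db=3.0, fine_step_db=1.0):
    """Return 2**q0_bits initial values: None for silence, then dB levels.

    n_coarse is the (assumed) number of coarse steps used at low levels
    before the scale switches to the fine step for high levels.
    """
    n_levels = 2 ** q0_bits - 1  # one code is reserved for silence
    levels = [e_min_db]
    for i in range(1, n_levels):
        step = coarse_step_db if i <= n_coarse else fine_step_db
        levels.append(levels[-1] + step)
    return [None] + levels


# Example increments quoted in the text for Q1 = 2 bits (2**2 = 4 values).
LEGAL_INCREMENTS_DB = (-3.0, 0.0, 2.0, 7.0)

scale = build_scale(q0_bits=5, e_min_db=0.0, n_coarse=10)
print(len(scale), scale[1], scale[11], scale[-1])  # 32 0.0 30.0 50.0
```

With these illustrative settings the scale spans 0 to 50 dB: ten 3 dB steps at the bottom and twenty 1 dB steps at the top, plus the silence code.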
- as shown in FIG. 2, the energy increments make it possible to search, from each quantized value B of a frame k, for the quantized values A of the energy in the previous frame k - 1 that can lead to it through a legal increment d_i, starting with the zero increment d0.
- the determination of the numbers Q0 and Q1 is carried out according to steps 1 to 5 of the method represented by the flowchart in FIG. 3.
- the first step, referenced 1 in FIG. 3, groups the frames into packets of L frames.
- the values of the energies E1 to Ek are calculated in step 2. They are quantized in the manner shown in FIGS. 1 and 2 between two values Emax and Emin, relative to a scale comprising P graduations which, for convenience, can be taken to coincide with the 2^Q0 possible values of the initial energy E1 measured in the first frame.
- step 3 in FIG. 3 is an initialization phase that consists in calculating a set of P distances between the first energy value E1 and the P possible quantized values of this energy.
- the corresponding distances Dp are stored in the form of a first table (D), not shown, in a memory of the vocoder.
- step 4 consists, in a manner similar to the known Viterbi algorithm, in performing k - 1 iterations that estimate the distances between all the potential quantization profiles and the real energy profile, eliminating the least likely quantization profiles.
- a second table (D'), not shown, denoted "slope", is constructed which, for each of the iterations 1 to k - 1, associates a slope, or legal energy increment d_i, with each quantized value p of iteration k.
- a search for the quantized value of the preceding iteration k - 1 is carried out by looking up in the table of "slopes" the "slope", or legal increment d_i, that can lead there directly, starting with the null increment d0.
- a table of distances D(+) is constructed which, at position p, contains the cumulative distance between the best quantized profile arriving at position p and the original profile. This makes it possible to keep in memory a table of slope indices whose entry slope_index(k, p) is the index of the best possible slope for arriving at the quantized value e_p at step k.
- the two tables thus obtained make it possible to arrive at a final decision. To do this, the process searches table D(+) for the index p_min that corresponds to the minimum value.
- the values Diff_index(1 ... k-1) are the indices of the best possible quantized values for the slopes d_i.
- the final value of p_min is then simply the index of the most probable quantized value.
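The initialization, the k - 1 iterations and the final decision can be sketched as a small trellis search. This is a sketch under assumptions, not the patented implementation: the function name `quantize_profile` is hypothetical, the legal increments are expressed as index steps on the scale (which matches the dB values only on a uniform 1 dB scale), squared differences serve as distances, and 0 is assumed to be among the legal increments so that every level always has a predecessor.

```python
def quantize_profile(energies, levels, increments):
    """Viterbi-style search for the legal quantized profile closest to `energies`.

    levels:     sorted quantized energy values (the P graduations).
    increments: legal index steps between consecutive samples (must include 0).
    Returns (index of the first quantized value, list of increment indices).
    """
    n_levels = len(levels)
    inf = float("inf")
    # Initialization: distance between the first energy and each level.
    dist = [(energies[0] - lv) ** 2 for lv in levels]
    back = []  # back[t][p] = (previous level index, increment index)
    # k - 1 iterations: keep only the best path arriving at each level.
    for e in energies[1:]:
        new_dist = [inf] * n_levels
        choice = [None] * n_levels
        for p in range(n_levels):
            for i, step in enumerate(increments):
                q = p - step  # predecessor that reaches p through increment i
                if 0 <= q < n_levels and dist[q] < new_dist[p]:
                    new_dist[p] = dist[q]
                    choice[p] = (q, i)
            if choice[p] is not None:
                new_dist[p] += (e - levels[p]) ** 2
        dist = new_dist
        back.append(choice)
    # Final decision: smallest cumulative distance, then trace the path back.
    p = min(range(n_levels), key=dist.__getitem__)
    slopes = []
    for choice in reversed(back):
        p, i = choice[p]
        slopes.append(i)
    slopes.reverse()
    return p, slopes


levels = list(range(41))      # assumed uniform 1 dB scale, 0 to 40 dB
increments = (-3, 0, 2, 7)    # example increments from the text, as index steps
first, slopes = quantize_profile([10.2, 12.1, 19.0, 18.7, 16.1], levels, increments)
print(first, slopes)  # 10 [2, 3, 1, 0]
```

In this example the search selects the level 10 dB followed by the increments +2, +7, 0 and -3 dB, so the sharp rise between the second and third samples is preserved. Memory grows with P times k for the back-pointer table, which is small for the packet sizes discussed here.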
- the method which has just been described can always be adapted to the particular characteristics of the analysis system.
- it is always possible to minimize the influence of erroneous values by replacing, for example, the squared differences that serve as distance measures with absolute values, which allow the profile of the quantized values to lock onto the correct energy values, provided that these are more numerous than the incorrect values.
- adapting and tuning the method for a particular vocoder requires only the modification of the quantized starting values (number and values), the increments (number and values), or even the number of iterations.
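The effect of replacing squared distances by absolute values can be illustrated on a deliberately simplified case: choosing one constant level for a handful of measurements, one of which is erroneous. The helper `best_level` and the numbers are hypothetical; the full method applies the same distance inside the profile search rather than to a single level.

```python
def best_level(values, levels, distance):
    """Return the level minimizing the summed distance to all values."""
    return min(levels, key=lambda lv: sum(distance(v - lv) for v in values))


values = [20, 20, 20, 20, 35]  # four correct measurements, one erroneous
levels = range(41)             # assumed uniform 1 dB scale

squared = best_level(values, levels, lambda d: d * d)
absolute = best_level(values, levels, abs)
print(squared, absolute)  # 23 20
```

With squared distances the single outlier pulls the chosen level up to 23, whereas with absolute values the choice stays at 20, the value shared by the majority of measurements.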
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Spectroscopy & Molecular Physics (AREA)
- Computational Linguistics (AREA)
- Signal Processing (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Human Computer Interaction (AREA)
- Acoustics & Sound (AREA)
- Multimedia (AREA)
- Compression, Expansion, Code Conversion, And Decoders (AREA)
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
FR9114402A FR2684225A1 (fr) | 1991-11-22 | 1991-11-22 | Method for quantizing the energy of the speech signal in a very low bit rate vocoder |
FR9114402 | 1991-11-22 |
Publications (2)
Publication Number | Publication Date |
---|---|
EP0543700A2 (de) | 1993-05-26 |
EP0543700A3 (en) | 1993-09-29 |
Family
ID=9419210
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
EP19920403025 Withdrawn EP0543700A3 (en) | 1991-11-22 | 1992-11-10 | Method for quantification of speech signal energy in a low bit rate vocoder |
Country Status (3)
Country | Link |
---|---|
EP (1) | EP0543700A3 (de) |
CA (1) | CA2083335A1 (de) |
FR (1) | FR2684225A1 (de) |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110109646B (zh) * | 2019-03-28 | 2021-08-27 | 北京迈格威科技有限公司 | Data processing method, device, multiply-adder and storage medium |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
FR2331207A2 (fr) * | 1975-11-05 | 1977-06-03 | Ibm France | Method for block quantization of samples of an electrical signal, and device for implementing said method |
DE3736193A1 (de) * | 1986-10-26 | 1988-05-05 | Ricoh Kk | Speech signal coding method |
EP0454552A2 (de) * | 1990-04-27 | 1991-10-30 | Thomson-Csf | Method and device for low bit rate speech coding |
DE4103277A1 (de) * | 1991-02-04 | 1992-08-06 | Hilberg Wolfgang | Memory-controlled coding of time-dependent functions with symbolic meaning |
- 1991-11-22: FR, application FR9114402A, publication FR2684225A1, status: pending
- 1992-11-10: EP, application EP19920403025, publication EP0543700A3, status: withdrawn
- 1992-11-19: CA, application CA 2083335, publication CA2083335A1, status: abandoned
Non-Patent Citations (2)
Title |
---|
ICASSP'87 (1987 INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, Dallas, Texas, 6-9 April 1987), vol. 4, pages 1949-1952, IEEE, New York, US; S. ROUCOS et al.: "A segment vocoder algorithm for real-time implementation" * |
ICASSP'87 (1987 INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, Dallas, Texas, 6-9 April 1987), vol. 3, pages 1653-1656, IEEE, New York, US; J. PICONE et al.: "Low rate speech coding using contour quantization" * |
Also Published As
Publication number | Publication date |
---|---|
EP0543700A3 (en) | 1993-09-29 |
CA2083335A1 (en) | 1993-05-23 |
FR2684225A1 (fr) | 1993-05-28 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
EP0784311B1 (de) | Method and device for detecting voice activity in a speech signal, and a communication device | |
KR100361883B1 (ko) | Audio signal compression method, audio signal compression device, speech signal compression method, speech signal compression device, speech recognition method, and speech recognition device | |
EP1320087B1 (de) | Synthesis of an excitation signal for use in a comfort noise generator | |
KR100334202B1 (ko) | ASIC for performing high-speed speech compression in a mobile telephone | |
RU2138030C1 (ru) | Transmission system, terminal unit, encoding device, decoding device and adaptive filter | |
EP0501421B1 (de) | Speech coding system | |
EP0801789B1 (de) | Method for speech coding using analysis-by-synthesis | |
EP0490740A1 (de) | Method and device for determining the fundamental frequency of speech in very low data rate vocoders | |
EP0882287B1 (de) | System and method for error correction in a correlation-based pitch estimator | |
EP0685833B1 (de) | Method for speech coding using linear prediction | |
EP1267325A1 (de) | Method for detecting voice activity in a signal, and speech coder with a device for carrying out the method | |
Nikolić et al. | Low complex forward adaptive loss compression algorithm and its application in speech coding | |
EP0506535A1 (de) | Method and device for processing pre-echoes of a digital audio signal coded by means of a frequency transform | |
EP2347411B1 (de) | Pre-echo attenuation in a digital audio signal | |
EP0234993A1 (de) | Method and device for automatic target recognition from Doppler echoes | |
EP2769378A2 (de) | Improved hierarchical coding | |
WO1997035301A1 (en) | Vocoder system and method for performing pitch estimation using an adaptive correlation sample window | |
SE470577B (sv) | Method and device for encoding and/or decoding background noise | |
US6397177B1 (en) | Speech-encoding rate decision apparatus and method in a variable rate | |
EP0543700A2 (de) | Method for quantizing speech signal energy in a low bit rate vocoder | |
RU2317595C1 (ru) | Method for detecting pauses in speech signals and device implementing it | |
EP0616315A1 (de) | Device for digital speech coding and decoding, method for searching a pseudo-logarithmic LTP delay codebook, and method for LTP analysis | |
US7715447B2 (en) | Method and system for tone detection | |
FR2631146A1 (fr) | Method and device for coding the energy of the speech signal in very low bit rate vocoders | |
JPH0784596A (ja) | Method for evaluating the quality of coded speech |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| PUAI | Public reference made under article 153(3) epc to a published international application that has entered the european phase | Free format text: ORIGINAL CODE: 0009012 |
| AK | Designated contracting states | Kind code of ref document: A2; Designated state(s): BE DE ES GB IT |
| PUAL | Search report despatched | Free format text: ORIGINAL CODE: 0009013 |
| AK | Designated contracting states | Kind code of ref document: A3; Designated state(s): BE DE ES GB IT |
| RAP3 | Party data changed (applicant data changed or rights of an application transferred) | Owner name: THOMSON-CSF |
| STAA | Information on the status of an ep patent application or granted ep patent | Free format text: STATUS: THE APPLICATION IS DEEMED TO BE WITHDRAWN |
| 18D | Application deemed to be withdrawn | Effective date: 19940330 |