EP0543700A2 - Method for quantizing the speech signal energy in a low bit rate vocoder - Google Patents

Info

Publication number
EP0543700A2
EP0543700A2 EP19920403025 EP92403025A EP0543700A2 EP 0543700 A2 EP0543700 A2 EP 0543700A2 EP 19920403025 EP19920403025 EP 19920403025 EP 92403025 A EP92403025 A EP 92403025A EP 0543700 A2 EP0543700 A2 EP 0543700A2
Authority
EP
European Patent Office
Prior art keywords
energy
frame
value
values
variations
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
EP19920403025
Other languages
English (en)
French (fr)
Other versions
EP0543700A3 (en)
Inventor
Pierre Laurent
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Thales SA
Original Assignee
Thomson CSF SA
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
1991-11-22
Filing date
1992-11-10
Publication date
1993-05-26
Application filed by Thomson CSF SA
Publication of EP0543700A2
Publication of EP0543700A3
Legal status: Withdrawn

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/02 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using spectral analysis, e.g. transform vocoders or subband vocoders
    • G10L19/032 Quantisation or dequantisation of spectral components
    • G10L19/038 Vector quantisation, e.g. TwinVQ audio
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/04 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using predictive techniques
    • G10L19/06 Determination or coding of the spectral characteristics, e.g. of the short-term prediction coefficients

Definitions

  • The present invention relates to a method for quantizing the energy of the speech signal in a vocoder operating at a very low bit rate.
  • The speech signal is segmented into frames of constant duration, and a single value of the power (or energy) of the excitation is provided for each frame.
  • One way of lowering the bit rate is to increase the frame duration, for example from 22.5 ms to 30 ms, and to group the parameters of several frames and quantize them together. The synthesis parameters are then refreshed less often. Unfortunately, the intelligibility of the restored speech decreases, because transmitting a single energy value per frame no longer allows certain transients to be reproduced adequately.
  • A first known method groups the frames into packets of k energy values each, so that each packet can be represented by the coordinates of a point in a k-dimensional space.
  • A statistical analysis determines the principal axes of the observed cloud of points.
  • Quantization is applied to the coordinates of the points along the principal axes, each coordinate being quantized on a number of bits that depends on the eigenvalue associated with its axis.
  • The drawback of this approach is that a correction procedure is needed at the synthesis filter so that the computed energy values are never negative.
  • In this processing, no particular attention is paid to how faithfully the transients are reproduced.
  • The subject of the invention is a method for quantizing the energy of the speech signal in a very low bit rate vocoder, characterized in that it consists in dividing (1) the speech signal into packets of a determined number of frames of constant duration, sampling a determined number n of energy values in each frame, and quantizing (2, 3, 4) the first energy value measured in the first frame of each packet on a determined number Q0 of bits and the variations of the k - 1 remaining energies relative to that first value on a determined number Q1 of bits smaller than Q0, the variations of the k - 1 energies being selected from a table of "slopes" which assigns to each energy sample k the energy "slope" separating it from the energy of the previous sample k - 1.
  • The main advantage of the method according to the invention is that it quantizes the energy in each frame of the speech signal with good quality and respects the energy transitions from frame to frame, without affecting the computational load or the memory space required by the vocoder.
  • As shown in FIG. 1, the method according to the invention consists in segmenting the speech signal into frames of constant determined duration, for example between 22.5 and 30 ms, grouping the frames into packets of L frames, and sampling a determined number n of energy values of the signal in each frame, so that each packet carries only the quantized value of the energy E1 measured in the first frame of the packet together with the k - 1 differences between the successive energies that follow, k being equal to n·L.
  • On reception, the energy differences are placed end to end after the first energy value, received for the first frame of each packet, to reconstruct the profile of the energy values quantized at transmission.
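  • The end-to-end reconstruction on reception can be sketched as follows. This is a minimal illustration, assuming the energies are expressed in dB so that the increments simply add; the function name `reconstruct_profile` and the sample values are hypothetical and do not come from the patent.

```python
from itertools import accumulate


def reconstruct_profile(first_energy_db, increments_db):
    """Rebuild the k quantized energies of one packet.

    first_energy_db: quantized energy of the first sample (assumed in dB).
    increments_db:   the k - 1 decoded energy differences (assumed in dB).
    """
    # Each energy is the previous one plus the received difference.
    return list(accumulate(increments_db, initial=first_energy_db))


# Hypothetical packet: a first value of 40 dB followed by four increments.
profile = reconstruct_profile(40.0, [7.0, 2.0, 0.0, -3.0])
```

  • Because each value depends on the previous one, an error on the first value shifts the whole packet, which is consistent with coding the first value on more bits than the differences.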
  • A first energy value is quantized in the first frame k0 of each packet on a determined number Q0 of bits, and the variations of the k - 1 remaining energies are quantized on a determined number Q1 of bits smaller than Q0.
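  • The resulting bit budget per packet is Q0 + (k - 1)·Q1 bits. The short sketch below works this out; the function names and the numeric settings in the example (Q0 = 5, Q1 = 2, 3 frames of 30 ms, 2 energy values per frame) are assumptions chosen for illustration, not figures stated in the patent.

```python
def packet_bits(q0, q1, k):
    """Bits for one packet: Q0 for the first energy, Q1 for each of the k - 1 variations."""
    if k < 1:
        raise ValueError("a packet needs at least one energy value")
    if q1 >= q0:
        raise ValueError("Q1 is expected to be smaller than Q0")
    return q0 + (k - 1) * q1


def energy_bit_rate(q0, q1, frames_per_packet, values_per_frame, frame_ms):
    """Bit rate in bit/s spent on energy, with k = n * L values per packet."""
    k = frames_per_packet * values_per_frame
    packet_seconds = frames_per_packet * frame_ms / 1000.0
    return packet_bits(q0, q1, k) / packet_seconds


# Hypothetical setting: 15 bits per 90 ms packet.
rate = energy_bit_rate(q0=5, q1=2, frames_per_packet=3, values_per_frame=2, frame_ms=30.0)
```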
  • The 2^Q0 possible initial values include a null value representing silence.
  • The other values are distributed on a quasi-logarithmic scale, which best matches the sensitivity of the ear, the quantization step being smaller the higher the level of the speech signal. Typically, a 3 dB step is adopted for low levels and a 1 dB step for high levels.
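  • One possible way of building such a table of initial values is sketched below. Only the null value for silence and the 3 dB and 1 dB steps come from the text; the floor level, the number of coarse steps and the function names are hypothetical parameters introduced for the example.

```python
import math


def build_initial_levels(q0, floor_db=0.0, coarse_step_db=3.0,
                         fine_step_db=1.0, coarse_count=8):
    """Return 2**q0 levels in dB: index 0 is silence, the rest rise quasi-logarithmically."""
    levels = [-math.inf]              # null value representing silence
    level = floor_db
    for i in range(2 ** q0 - 1):
        levels.append(level)
        # Coarse steps at low levels, fine steps at high levels.
        level += coarse_step_db if i < coarse_count else fine_step_db
    return levels


def nearest_level_index(energy_db, levels):
    """Index of the level closest to energy_db; silence maps to index 0."""
    if energy_db == -math.inf:
        return 0
    return min(range(1, len(levels)), key=lambda p: abs(levels[p] - energy_db))


levels = build_initial_levels(q0=5)
index = nearest_level_index(26.4, levels)
```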
  • The m = 2^Q1 other values represent energy increments d_i, hereinafter also called "legal energy values", whose values are predetermined to favor transitions. For example, they are chosen equal to -3 dB, 0 dB, +2 dB and +7 dB respectively if the number Q1 is coded on only 2 bits.
  • As shown in FIG. 2, the energy increments make it possible to search, from each quantized value B of a frame k, for the quantized values A of the energy in the previous frame k - 1 that can lead to it through a legal increment d_i, starting with the zero increment d_0.
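  • The backward search for legal predecessors can be illustrated as follows. The sketch assumes, for simplicity, a uniform grid of integer levels spaced 1 dB apart, so that an increment in dB is also an offset between level indices; the scale described above is not uniform, so this is a simplification. The ordering of the increments (zero first) and the function name are likewise assumptions.

```python
# Increments for Q1 = 2 bits, in dB, with the zero increment listed first.
LEGAL_INCREMENTS_DB = (0, -3, 2, 7)


def predecessors(level_b, num_levels, increments=LEGAL_INCREMENTS_DB):
    """List (increment index, level A) pairs such that A + increment == B.

    Levels are integer positions on a hypothetical 1 dB grid, 0 <= level < num_levels.
    """
    found = []
    for i, inc in enumerate(increments):
        level_a = level_b - inc
        if 0 <= level_a < num_levels:   # discard predecessors outside the scale
            found.append((i, level_a))
    return found


candidates = predecessors(5, 32)
```

  • With these values, level 5 can be reached from level 5 (increment 0), level 8 (increment -3) or level 3 (increment +2); the +7 dB increment would require a predecessor below the bottom of the scale and is discarded.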
  • The numbers Q0 and Q1 are determined according to steps 1 to 5 of the method represented by the flowchart in FIG. 3.
  • The first step, referenced 1 in FIG. 3, groups the frames into packets of L frames.
  • The energy values E1 to Ek are calculated in step 2. They are quantized in the manner shown in FIGS. 1 and 2 between two values Emax and Emin on a scale of P graduations which, for convenience, can be taken to coincide with the 2^Q0 possible values of the initial energy E1 measured in the first frame.
  • Step 3 in FIG. 3 is an initialization phase which computes a set of P distances between the first energy value E1 and the P possible quantized values of this energy.
  • The corresponding distances Dp are stored as a first table (D), not shown, in a memory of the vocoder.
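  • The initialization of table D amounts to one distance per graduation. The sketch below uses the squared difference as the distance, which matches the squared measure mentioned later in the description; the function name and the sample scale are hypothetical.

```python
def initial_distances(e1, levels):
    """Table D: squared distance between the first energy E1 and each of the P levels."""
    return [(e1 - e_p) ** 2 for e_p in levels]


# Hypothetical scale of P = 5 graduations and a measured first energy of 2.5.
table_d = initial_distances(2.5, [0, 1, 2, 3, 4])
```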
  • Step 4 consists, in a manner similar to the known VITERBI algorithm, in performing k - 1 iterations which estimate the distances between all the potential quantization profiles and the real energy profile, eliminating the least likely quantization profiles.
  • A second table (D'), not shown, called the "slope" table, is constructed; for each of the iterations 1 to k - 1 it associates a slope, or legal energy increment d_i, with each quantized value p of iteration k.
  • The quantized value of the preceding iteration k - 1 is found by looking up, in the table of "slopes", the slope or legal increment d_i that can lead directly to the current value, starting with the zero increment d_0.
  • A table of distances D(+) is constructed which, at position p, contains the cumulative distance between the original profile and the best quantized profile arriving at position p. A table of slope indices is also kept in memory, in which the entry slope_index(k, p) is the index of the best possible slope for arriving at the quantized value e_p at step k.
  • The two tables thus obtained make it possible to reach a final decision: the process searches table D(+) for the index p_min corresponding to the minimum value.
  • The values Diff_index(1 ... k - 1) are the indices of the best possible quantized values for the slopes d_i.
  • The final value of p_min is then simply the index of the most probable quantized value.
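  • Steps 3 to 5 can be put together in a compact Viterbi-style search, sketched below under stated assumptions: levels and increments are integers in dB on a uniform grid so that predecessors can be looked up exactly, the increment table contains the zero increment so that every level is reachable, and an extra table of predecessor positions is kept to simplify backtracking. The function name and these tables are illustrative choices, not the patent's own data layout.

```python
import math


def quantize_packet(energies_db, levels_db, increments_db):
    """Return (index of the first quantized level, list of k - 1 increment indices)."""
    position = {level: p for p, level in enumerate(levels_db)}
    count = len(levels_db)

    # Step 3: distance between the first energy and every possible level.
    dist = [(energies_db[0] - level) ** 2 for level in levels_db]
    history = []  # per iteration: (best slope index, best predecessor) for each level

    # Step 4: k - 1 iterations, keeping only the best profile reaching each level.
    for energy in energies_db[1:]:
        new_dist = [math.inf] * count
        best_slope = [-1] * count
        best_prev = [-1] * count
        for p, level in enumerate(levels_db):
            for i, inc in enumerate(increments_db):
                q = position.get(level - inc)      # predecessor reached by a legal increment
                if q is not None and dist[q] < new_dist[p]:
                    new_dist[p], best_slope[p], best_prev[p] = dist[q], i, q
            new_dist[p] += (energy - level) ** 2   # cumulative distance at position p
        history.append((best_slope, best_prev))
        dist = new_dist

    # Step 5: final decision, then backtracking through the slope indices.
    p = min(range(count), key=dist.__getitem__)
    slopes = []
    for best_slope, best_prev in reversed(history):
        slopes.append(best_slope[p])
        p = best_prev[p]
    slopes.reverse()
    return p, slopes


levels = list(range(61))            # hypothetical 0..60 dB grid, 1 dB apart
increments = (0, -3, 2, 7)          # zero increment first
first, slopes = quantize_packet([40.2, 46.8, 49.1, 48.9, 46.3], levels, increments)
```

  • In this example the search settles on a first level of 40 dB followed by the increments +7, +2, 0 and -3 dB, which is the legal profile closest to the measured energies.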
  • The method just described can always be adapted to the particular characteristics of the analysis system.
  • For example, the influence of erroneous values can be reduced by replacing the squared differences used as distance measures with absolute values, which allow the profile of the quantized values to lock onto the correct energy values, provided that these outnumber the incorrect ones.
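  • The effect of the distance measure can be seen on a small made-up example: four correct energy values of 40 dB and one erroneous value of 70 dB, compared against two flat candidate profiles. The data and the function names are hypothetical and serve only to illustrate the remark above.

```python
def cumulative_distance(measured, candidate, metric):
    """Sum of per-sample distances between a measured and a candidate profile."""
    return sum(metric(m - c) for m, c in zip(measured, candidate))


def squared(error):
    return error * error


measured = [40, 40, 70, 40, 40]     # the third value is erroneous
on_target = [40] * 5                # matches the four correct values
shifted = [46] * 5                  # pulled towards the erroneous value

# Squared distance prefers the shifted profile (720 against 900).
squared_costs = (cumulative_distance(measured, on_target, squared),
                 cumulative_distance(measured, shifted, squared))

# Absolute distance prefers the profile on the correct values (30 against 48).
absolute_costs = (cumulative_distance(measured, on_target, abs),
                  cumulative_distance(measured, shifted, abs))
```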
  • Adapting and tuning the method for a particular vocoder requires only modifying the quantized starting values (number and values), the increments (number and values), or the number of iterations.

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Spectroscopy & Molecular Physics (AREA)
  • Computational Linguistics (AREA)
  • Signal Processing (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Compression, Expansion, Code Conversion, And Decoders (AREA)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
FR9114402A FR2684225A1 (fr) 1991-11-22 1991-11-22 Method for quantizing the energy of the speech signal in a very low bit rate vocoder
FR9114402 1991-11-22

Publications (2)

Publication Number Publication Date
EP0543700A2 (de) 1993-05-26
EP0543700A3 (en) 1993-09-29

Family

ID=9419210

Family Applications (1)

Application Number Status Publication Priority Date Filing Date Title
EP19920403025 Withdrawn EP0543700A3 (en) 1991-11-22 1992-11-10 Method for quantification of speech signal energy in a low bit rate vocoder

Country Status (3)

Country Link
EP (1) EP0543700A3 (de)
CA (1) CA2083335A1 (de)
FR (1) FR2684225A1 (de)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110109646B (zh) * 2019-03-28 2021-08-27 北京迈格威科技有限公司 Data processing method and apparatus, multiply-adder, and storage medium

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
FR2331207A2 (fr) * 1975-11-05 1977-06-03 Ibm France Method for block quantization of samples of an electrical signal, and device for implementing said method
DE3736193A1 (de) * 1986-10-26 1988-05-05 Ricoh Kk Speech signal coding method
EP0454552A2 (de) * 1990-04-27 1991-10-30 Thomson-Csf Method and device for low bit rate speech coding
DE4103277A1 (de) * 1991-02-04 1992-08-06 Hilberg Wolfgang Memory-controlled coding of time-dependent functions with symbolic meaning

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
ICASSP'87 (1987 INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, Dallas, Texas, 6-9 April 1987), vol. 4, pages 1949-1952, IEEE, New York, US; S. ROUCOS et al.: "A segment vocoder algorithm for real-time implementation" *
ICASSP'87 (1987 INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, Dallas, Texas, 6-9 April 1987), vol. 3, pages 1653-1656, IEEE, New York, US; J. PICONE et al.: "Low rate speech coding using contour quantization" *

Also Published As

Publication number Publication date
EP0543700A3 (en) 1993-09-29
CA2083335A1 (en) 1993-05-23
FR2684225A1 (fr) 1993-05-28

Legal Events

Date Code Title Description
PUAI Public reference made under article 153(3) epc to a published international application that has entered the european phase

Free format text: ORIGINAL CODE: 0009012

AK Designated contracting states

Kind code of ref document: A2

Designated state(s): BE DE ES GB IT

PUAL Search report despatched

Free format text: ORIGINAL CODE: 0009013

AK Designated contracting states

Kind code of ref document: A3

Designated state(s): BE DE ES GB IT

RAP3 Party data changed (applicant data changed or rights of an application transferred)

Owner name: THOMSON-CSF

STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: THE APPLICATION IS DEEMED TO BE WITHDRAWN

18D Application deemed to be withdrawn

Effective date: 19940330