EP0402947B1 - Arrangement and method for encoding speech signal using regular pulse excitation scheme

Publication number
EP0402947B1
EP0402947B1 EP19900111360 EP90111360A EP0402947B1 EP 0402947 B1 EP0402947 B1 EP 0402947B1 EP 19900111360 EP19900111360 EP 19900111360 EP 90111360 A EP90111360 A EP 90111360A EP 0402947 B1 EP0402947 B1 EP 0402947B1
Authority
EP
European Patent Office
Prior art keywords
parameters
signal
generating
circuit
frame
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Lifetime
Application number
EP19900111360
Other languages
English (en)
French (fr)
Other versions
EP0402947A3 (de)
EP0402947A2 (de)
Inventor
Yoshihiro Unno (c/o NEC Corporation)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
NEC Corp
Original Assignee
NEC Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority claimed from JP1150770A (JPH0315900A)
Priority claimed from JP1254458A (JP2900431B2)
Application filed by NEC Corp
Publication of EP0402947A2
Publication of EP0402947A3
Application granted
Publication of EP0402947B1
Anticipated expiration
Expired - Lifetime

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/04 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using predictive techniques
    • G10L19/08 Determination or coding of the excitation function; Determination or coding of the long-term prediction parameters
    • G10L19/10 Determination or coding of the excitation function; Determination or coding of the long-term prediction parameters the excitation function being a multipulse excitation
    • G10L19/113 Regular pulse excitation

Definitions

  • The present invention relates generally to an arrangement and method for encoding a discrete-time speech signal using a regular pulse excitation scheme, and more specifically to such an arrangement and method for encoding a speech signal at a low bit rate of less than 16 kbit/s.
  • an a/d (analog-to-digital) converted speech signal is applied via an input terminal 10 to a pre-processing circuit 12 on a frame by frame basis.
  • the speech frame applied to the circuit 12 is pre-processed to produce an offset-free signal, which signal is then subjected to a first order pre-emphasis filter.
  • An original speech signal has been sampled at a rate of 8 kHz. Since the frame length is 20 ms in this prior art, the one frame consists of 160 signal samples.
  • the 160 samples thus obtained are applied to a short term LPC (Linear Predictive Coding) analysis circuit 14 and also to a short term analysis filter 16.
  • the 160 samples, applied to the short term LPC analysis circuit 14, are analyzed to determine 8 orders of reflection coefficients which represent a spectrum envelope of each frame.
  • the short term LPC analysis circuit 14 further transforms or encodes the reflection coefficients into log area ratios (LAR), which are applied to the short term analysis filter 16 and a multiplexor 30.
  • the short term analysis filter 16 decodes the LAR into the reflection coefficients and obtains 160 samples of the short term residual signal.
  • the term "short term analysis" has the same meaning as spectrum envelope analysis.
  • the short term residual signal, outputted from the filter 16, is applied to a subtractor 18 and a long term analysis circuit 22.
  • the long term analysis circuit 22 divides the speech frame into 4 sub-frames (5 ms) each of which consists of 40 samples forming the short term residual signal. Each sub-frame is processed blockwise by the subsequent function blocks.
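
The frame and sub-frame sizes quoted above follow directly from the sampling rate. A minimal sketch of that arithmetic, assuming plain Python lists for the samples; the helper name `split_into_subframes` is hypothetical and not taken from the patent:

```python
SAMPLE_RATE_HZ = 8000      # sampling rate stated in the text
FRAME_MS = 20              # frame length stated in the text
SUBFRAMES_PER_FRAME = 4    # sub-frames per frame stated in the text


def split_into_subframes(frame):
    """Split one frame into equally sized sub-frames (hypothetical helper)."""
    if len(frame) % SUBFRAMES_PER_FRAME != 0:
        raise ValueError("frame length must be a multiple of the sub-frame count")
    size = len(frame) // SUBFRAMES_PER_FRAME
    return [frame[i:i + size] for i in range(0, len(frame), size)]


samples_per_frame = SAMPLE_RATE_HZ * FRAME_MS // 1000   # 8000 * 20 / 1000
frame = list(range(samples_per_frame))                  # placeholder sample values
subframes = split_into_subframes(frame)
```

Each sub-frame then covers 5 ms of signal and is processed as one block by the later stages.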
  • the long term analysis circuit 22 produces a long term prediction (LTP) lag and an LTP gain on the basis of the two signals: the short term residual samples applied from the circuit 16 and an output sequence from an adder 26.
  • the term "long term analysis" has the same meaning as pitch analysis, and the LTP lag and the LTP gain respectively correspond to a pitch period and a pitch gain.
  • the subtractor 18 outputs a block of 40 long term residual signal samples by subtracting the output of a long term analysis filter 20 from the short term residual signal applied from the filter 16.
  • the position of the j-th excitation pulse ($m_j$) within a sub-frame is given by the following equation.
  • $m_j = p \cdot j + q \quad (0 \le j \le N/p - 1 \text{ and } 0 \le q \le p)$
  • p denotes a predetermined pulse interval
  • q an RPE grid
  • N the number of samples within one sub-frame.
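
The pulse positions defined above can be enumerated directly. The following is only a sketch: the function name `pulse_positions` is hypothetical, positions are counted from zero as the formula suggests, and the index j is taken to run from 0 to floor(N/p) − 1. The admissible range of the grid q is not enforced here, because the relation signs are partly lost in this text.

```python
def pulse_positions(p, q, n_samples):
    """Return m_j = p*j + q for j = 0 .. floor(n_samples / p) - 1."""
    if p <= 0:
        raise ValueError("pulse interval p must be positive")
    return [p * j + q for j in range(n_samples // p)]


# Example with assumed values: a 40-sample sub-frame, interval 4, grid 1.
positions = pulse_positions(p=4, q=1, n_samples=40)
```

Changing q shifts the whole regular pulse pattern by that many samples, which is why a single grid value is enough to describe the pulse locations.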
  • the RPE grid q of the excitation pulse sequence is obtained from the following equation.
  • max(q) indicates the maximum value of the right term when changing the value of q.
  • the amplitude of the excitation pulse sequence can be determined by quantizing $x_j(m_j)$.
  • the excitation pulse generator 28 decodes the signal applied from the circuit 24 to determine an excitation pulse, which is fed to the adder 26.
  • the adder 26 adds the excitation pulse from the circuit 28 and the output sequence of the long term analysis filter 20, and applies the resultant sum to the filter 20 as well as the analysis circuit 22.
  • the multiplexor 30 combines the encoded outputs of the blocks 14, 22 and 24, and applies the result to a transmission line coupled to an output terminal 32.
  • the above-mentioned prior art has encountered the difficulty of low quality of the reconstructed or reproduced speech. This is because the amplitude of each excitation pulse is determined on the basis of the short term residual signal applied to the subtractor 18. In other words, according to the prior art, the long term residual signal outputted from the subtractor 18 is shifted by an RPE grid and then every predetermined number of samples are quantized.
  • the aforesaid prior art has encountered another problem in that the reproduced speech is degraded by quantizing distortion. This results from the fact that the number of quantizing bits is insufficient at a bit rate in the order of 13k bps.
  • EP-A-0 374 941 discloses a communication system for improving the speech quality by calculating the excitation multi-pulses by means of an encoder for encoding a sequence of digital speech signals classified into a voiced sound and an unvoiced sound into a sequence of output signals by the use of a spectrum parameter and pitch parameters at every frame.
  • a judging circuit judges whether the digital speech signals are classified into the voiced sound or the unvoiced sound in order to produce a judged signal representative of a result of judging.
  • a processing unit processes the digital speech signals in accordance with the judged signal to selectively produce a first set of primary sound source signals and a second set of secondary sound source signals.
  • The first set is produced when the judged signal represents the voiced sound, and is representative of locations and amplitudes of a first set of excitation multi-pulses calculated at every frame.
  • the second set of secondary sound source signals is produced when the judged signal represents the unvoiced sound, and is representative of the amplitudes of a second set of excitation multi-pulses, each of which is located at intervals of a preselected number of samples.
  • Another object of the present invention is to provide a method for encoding a discrete-time speech signal at a low bit rate less than 16k-bit per second using a regular pulse excitation scheme.
  • a pre-processing circuit is provided to receive a discrete-time speech signal, which is then divided into a plurality of frames.
  • a parameter extracting circuit is coupled to the pre-processing circuit and extracts a plurality of parameters therefrom.
  • an impulse response calculating circuit is coupled to receive the plurality of parameters from the parameter extracting circuit, and generates an impulse response function signal using the plurality of parameters.
  • An autocorrelation function circuit is coupled to receive the impulse response signal and generates an autocorrelation function signal using the signal applied.
  • a cross-correlation function circuit generates a cross-correlation function signal using the discrete-time speech signal and the impulse response function signal.
  • a grid signal generator receives the output of the cross-correlation function calculating circuit, and outputs a grid signal indicative of a location of a first excitation pulse within one frame.
  • a pulse amplitude calculating circuit receives the autocorrelation function signal, the cross-correlation function signal and the grid signal, and determines an amplitude sequence of excitation pulses within one frame.
  • one aspect of this invention takes the form of an arrangement for encoding a speech signal using a regular pulse excitation scheme, as set out in the appended claims.
  • Another aspect of this invention takes the form of a method for encoding a speech signal using a regular pulse excitation scheme, as set out in the appended claims.
  • the present invention is characterized by algorithms for calculating an amplitude of each of the excitation pulses. It should be noted that the location of the excitation pulse can be determined in accordance with the prior art disclosed in Paper 1. The above mentioned algorithms will be discussed below.
  • the location of a j-th excitation pulse within a frame can be specified by equation (1), which is shown again as equation (3).
  • $m_j = p \cdot j + q \quad (0 \le j \le N/p - 1 \text{ and } 0 \le q \le p)$
  • The algorithm for obtaining the RPE grid q will be described later.
  • Fig. 3 shows a synthesis filter 122 which comprises two digital filters 310 and 320 coupled in series.
  • the filter 310 includes an adder 322, a coefficient weighting circuit 324 and a delay 326.
  • the filter 320 includes an adder 328, a coefficient weighting circuit 330 and a delay 332.
  • the synthesis filter 122 forms part of the arrangement shown in Fig. 2, and will again be referred to later. Consequently, the detailed description of Fig. 3 will be postponed.
  • the filter 310 is a long term prediction filter whose output represents a pitch structure, while the filter 320 is a short term prediction filter whose output represents spectrum envelope characteristics.
  • the synthesis filter 122 is supplied with the excitation pulse series and outputs a reconstructed signal sequence x'(n) in accordance with the following equation: where $\beta$ denotes an LTP gain representative of the tap coefficient of the long term filter 310, and Md an LTP lag indicative of a pitch period of an incoming speech signal.
  • $x_d(n)$ denotes an output signal of the filter 310, Np a prediction order of the short term prediction filter 320, and $a_i$ ($1 \le i \le Np$) a prediction coefficient of the filter 320 ($a_i$ corresponds to LAR in Fig. 3).
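
The equation itself is not reproduced in this text, so the following is only a sketch of one plausible reading of Fig. 3: a one-tap long term filter followed by an all-pole short term filter. The recursive form, the plus signs, the zero initial state and the name `synthesize` are all assumptions made for illustration.

```python
def synthesize(d, beta, lag, a):
    """Cascade of a long term filter (310) and a short term filter (320).

    Assumed recursions (not confirmed by the source):
      x_d(n) = d(n) + beta * x_d(n - lag)
      x'(n)  = x_d(n) + sum_{i=1..Np} a[i-1] * x'(n - i)
    Samples before n = 0 are taken as zero.
    """
    x_d, x_rec = [], []
    for n, sample in enumerate(d):
        past = x_d[n - lag] if lag > 0 and n >= lag else 0.0
        x_d.append(sample + beta * past)
        stp = sum(a[i - 1] * x_rec[n - i] for i in range(1, len(a) + 1) if n >= i)
        x_rec.append(x_d[n] + stp)
    return x_rec


# Response to a unit impulse with assumed parameters (gain 0.5, lag 2, Np = 1).
response = synthesize([1.0, 0.0, 0.0, 0.0, 0.0, 0.0], beta=0.5, lag=2, a=[0.5])
```

Feeding a unit impulse through the same cascade yields the impulse response that the later correlation steps rely on.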
  • $\beta$ and Md can be obtained in accordance with the prior art techniques disclosed in Paper 1.
  • $\beta$ and Md can also be determined from the peak amplitude of the autocorrelation function sequence of an input speech signal and the position of said peak. The algorithms by which this can be achieved have been disclosed in the document entitled "Adaptive predictive coding of speech signals" by B.S. Atal et al., pages 1973 to 1986, The Bell System Technical Journal, October 1970 (referred to as Paper 2).
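
As a rough illustration of the peak-picking idea (not the algorithm of Paper 2, which is not reproduced here), the lag can be taken as the position of the autocorrelation peak within a search range and the gain as the peak value normalised by the frame energy. The normalisation, the search range and the function names are assumptions.

```python
def autocorrelation(x, lag):
    """R(lag) = sum over n of x[n] * x[n + lag] within the frame."""
    return sum(x[n] * x[n + lag] for n in range(len(x) - lag))


def estimate_ltp(x, min_lag, max_lag):
    """Return (Md, gain): lag of the autocorrelation peak and a normalised gain."""
    energy = autocorrelation(x, 0)
    if energy == 0:
        return min_lag, 0.0            # silent frame: no pitch information
    lag = max(range(min_lag, max_lag + 1), key=lambda k: autocorrelation(x, k))
    return lag, autocorrelation(x, lag) / energy


# Pulse train with a period of 40 samples inside a 160-sample frame.
frame = [1.0 if n % 40 == 0 else 0.0 for n in range(160)]
md, gain = estimate_ltp(frame, min_lag=20, max_lag=120)
```

For this synthetic pulse train the peak falls at the true period; real speech would need a more careful search, for example to guard against pitch doubling.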
  • the weighted square error J between the input speech signal x(n) and the reproduced signal x'(n) within one frame can be represented by: where N denotes the number of samples within one frame and w(n) a weighting function.
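
The expression for J is not reproduced in this text. A sketch consistent with the description, assuming J is the sum over one frame of the squared difference signal after convolution with w(n), and assuming the convolution is truncated to the frame, could look like this (the function name is hypothetical):

```python
def weighted_square_error(x, x_rec, w):
    """J = sum_n [ ((x - x') * w)(n) ]^2 over one frame ('*' is convolution)."""
    e = [a - b for a, b in zip(x, x_rec)]
    # Convolve the error with the weighting function, keeping len(e) samples.
    e_w = [
        sum(w[k] * e[n - k] for k in range(min(n + 1, len(w))))
        for n in range(len(e))
    ]
    return sum(v * v for v in e_w)


j_plain = weighted_square_error([1.0, 2.0, 3.0], [1.0, 1.0, 1.0], [1.0])
j_weighted = weighted_square_error([1.0, 2.0, 3.0], [1.0, 1.0, 1.0], [1.0, 1.0])
```

With w(n) reduced to a unit impulse the measure falls back to the ordinary squared error.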
  • Equation (7) is rewritten as follows, where the term x'(n) * w(n) can be modified according to the following equation.
  • $X_w'(Z) = X'(Z) \cdot W(Z)$
  • $X'(Z) = H(Z) \cdot D(Z)$, where D(Z) represents the Z-transform of the excitation pulse series given by equation (4), and H(Z) the Z-transform of the impulse response of the synthesis filter 122.
  • Equation (16): $g_k = \dfrac{\phi_{xh}(-m_k) - g_1\,\phi_{hh}(m_1, m_k) - \cdots - g_{k-1}\,\phi_{hh}(m_{k-1}, m_k)}{\phi_{hh}(m_k, m_k)}$
  • $\phi_{xh}(\cdot)$ represents a cross-correlation function sequence computed from $x_w(n)$ and $h_w(n)$
  • $\phi_{hh}(\cdot)$ represents an autocorrelation function sequence of $h_w(n)$.
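
One reading of equation (16) is a sequential solve: each amplitude is the cross-correlation at its pulse position, reduced by the contributions of the amplitudes already found, and divided by the energy term at that position. The sketch below follows that reading. The left-hand side being the amplitude g_k, the correlation definitions, zero-based positions inside the frame and all function names are assumptions.

```python
def cross_corr(x_w, h_w, m):
    """Cross-correlation at position m: x_w(n) against h_w(n - m) in the frame."""
    return sum(x_w[n] * h_w[n - m] for n in range(m, min(len(x_w), m + len(h_w))))


def cov_hh(h_w, m_i, m_k, n_samples):
    """Correlation of h_w(n - m_i) with h_w(n - m_k) inside the frame."""
    start = max(m_i, m_k)
    stop = min(n_samples, min(m_i, m_k) + len(h_w))
    return sum(h_w[n - m_i] * h_w[n - m_k] for n in range(start, stop))


def pulse_amplitudes(x_w, h_w, positions):
    """Sequentially compute g_k from the earlier amplitudes g_1 .. g_(k-1)."""
    n_samples = len(x_w)
    gains = []
    for k, m_k in enumerate(positions):
        numerator = cross_corr(x_w, h_w, m_k)
        for g_i, m_i in zip(gains, positions[:k]):
            numerator -= g_i * cov_hh(h_w, m_i, m_k, n_samples)
        gains.append(numerator / cov_hh(h_w, m_k, m_k, n_samples))
    return gains


# Target built from two non-overlapping copies of h_w with amplitudes 2 and -1.
h = [1.0, 0.5]
target = [2.0, 1.0, 0.0, 0.0, -1.0, -0.5, 0.0, 0.0]
gains = pulse_amplitudes(target, h, positions=[0, 4])
```

When the shifted impulse responses do not overlap, as in this example, the amplitudes are recovered exactly; with overlapping responses the sequential form only accounts for pulses placed earlier.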
  • The RPE grid q is calculated using the cross-correlation function obtained by equation (18). That is to say, the RPE grid q can be determined so as to satisfy the following equation, where max(q) indicates the maximum value of the right term when changing the value of q.
  • Merely by way of example, the RPE grid q can assume the values 0, 1, 2 and 3 in the prior art disclosed in Paper 1.
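
The selection rule itself is not reproduced in this text, so the score used below (the sum of squared cross-correlation values at the candidate pulse positions) is an assumed stand-in chosen only to show the "try every q, keep the maximum" structure. The function name and the zero-based indexing are likewise hypothetical.

```python
def select_grid(phi_xh, p, grids):
    """Pick the grid q whose pulse positions m_j = p*j + q score highest.

    phi_xh[m] holds the cross-correlation value at position m.
    The score (sum of squares) is an assumption, not the source's criterion.
    """
    def score(q):
        return sum(phi_xh[m] ** 2 for m in range(q, len(phi_xh), p))
    return max(grids, key=score)


# Assumed example: most of the correlation energy sits on grid 2.
phi = [0.0] * 40
phi[1], phi[2], phi[6] = 1.0, 3.0, -2.0
best_q = select_grid(phi, p=4, grids=range(4))
```

Only the cross-correlation values are needed for this step, which is why the grid can be fixed before any amplitude is computed.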
  • an amplitude sequence of the excitation signal can be precisely obtained using equation (22), and hence a high quality reproduced voice can be realized.
  • an a/d (analog-to-digital) converted speech signal is applied via an input terminal 110 to a pre-processing circuit 112 on a frame by frame basis.
  • the pre-processing circuit 112 can be configured in the same manner as the circuit 12 of Fig. 1.
  • the speech frame applied to the circuit 112 is pre-processed to produce an offset-free signal, which is then subjected to a first order pre-emphasis filter.
  • An original speech signal to be applied to the input terminal 110 has been sampled at a predetermined rate such as 8 kHz.
  • the one frame consists of 160 signal samples.
  • the samples thus obtained are applied to a short term LPC (Linear Predictive Coding) analysis circuit 114 and also to a long term (pitch) analysis filter 116.
  • the reflection coefficients represent a spectrum envelope of each frame.
  • An LAR coding circuit 118 is supplied with the LAR(i)s and transforms or encodes them into log area ratios (coded-LAR(i)) based on predetermined quantizing levels (quantizing bits), and then applies them to a multiplexor 300. Further, the LAR coding circuit 118 decodes the coded-LAR(i)s, applies the decoded LAR'(i) to an impulse response calculating circuit 120 as well as a synthesis filter 122.
  • the long term analysis circuit 116 receives the one-frame samples from the pre-processing circuit 112 to calculate the LTP lag Md and the LTP gain $\beta$ in accordance with the algorithms disclosed in the above-mentioned Paper 2.
  • Md and $\beta$ are fed to a long term (pitch) coding circuit 124, which encodes them and applies the coded-Md and coded-$\beta$ to the multiplexor 300. Further, the long term coding circuit 124 decodes the coded-Md and the coded-$\beta$ into Md' and $\beta'$, respectively.
  • the decoded LTP lag (Md') and the decoded LTP gain ($\beta'$) are applied to the impulse response calculating circuit 120 and also to the synthesis filter 122.
  • the impulse response calculating circuit 120 comprises an impulse generator 400, a long term prediction (LTP) filter 402 and a short term prediction (STP) filter 404, which are coupled in series.
  • the LTP filter 402 includes an adder 406, a coefficient weighting circuit 408 and a delay circuit 410.
  • the STP filter 404 includes an adder 412, a coefficient weighting circuit 414 and a delay circuit 416.
  • the operation of each of the filters 402 and 404 is known to those skilled in the art, and hence detailed descriptions thereof will be omitted.
  • the decoded Md' and $\beta'$ are applied to the coefficient weighting circuit 408, while the decoded LAR'(i) is applied to the coefficient weighting circuit 414.
  • the impulse response calculating circuit 120 determines an impulse response of a predetermined number of samples and applies the output $h_w(n)$ to an autocorrelation function calculating circuit 126 and a cross-correlation function calculating circuit 128.
  • the circuit 126 calculates an autocorrelation function $R_{hh}(m_i - m_k)$ according to equation (21), and applies the result to a pulse amplitude calculating circuit 132.
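
Equation (21) is not reproduced in this text. A minimal sketch of an autocorrelation that depends only on the distance between two pulse positions, assuming the sum runs over the available samples of the impulse response and that the function is symmetric in the lag, is shown below. Both function names are hypothetical.

```python
def autocorr_hh(h_w, lag):
    """R_hh(lag) = sum_n h_w(n) * h_w(n + |lag|) over the available samples."""
    lag = abs(lag)
    return sum(h_w[n] * h_w[n + lag] for n in range(len(h_w) - lag))


def autocorr_table(h_w, positions):
    """R_hh(m_i - m_k) for every pair of pulse positions."""
    return [[autocorr_hh(h_w, m_i - m_k) for m_k in positions] for m_i in positions]


h = [1.0, 0.5]
table = autocorr_table(h, positions=[0, 1, 4])
```

Because the table depends on the impulse response alone, it can be computed once per frame and reused for every candidate grid.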
  • a subtractor 134 coupled to the pre-processing circuit 112 and the synthesis filter 122, subtracts the output sequence of the filter 122 from the speech signal sequence x(n), and applies the resultant difference to a weighting circuit 136.
  • the synthesis filter 122 has already stored one frame of response signal sequence, which is obtained by using an excitation pulse one frame before the present frame as an excitation signal and thereafter delayed to the present frame by making the excitation signal zero.
  • the speech signal sequence of the present frame can be expressed by the sum of a signal sequence obtained by delaying the output signal of the synthesis filter driven by an excitation pulse one frame before to the present frame by making the excitation signal zero, and by the output signal sequence of the synthesis filter driven by the excitation pulse sequence of the present frame.
  • the weighting circuit 136 is supplied with the parameter LAR'(i) from the LAR coding circuit 118, and calculates the weighting function w(n) in such a manner that its Z-transform satisfies equation (8). This calculation can also be implemented through the use of another frequency weighting scheme.
  • the weighting circuit 136 performs a convolution of the difference from the subtractor 134 with the function w(n), and applies its output $x_w(n)$ to the cross-correlation function circuit 128.
  • This circuit 128 is further supplied with the impulse response $h_w(n)$, and calculates the cross-correlation function $\phi_{xh}(-m_k)$ (where $1 \le m_k \le N$), which is applied to an RPE grid selector 130 and also to the pulse amplitude calculating circuit 132.
  • the grid selector 130 determines or selects a grid q, using the cross-correlation function $\phi_{xh}(-m_k)$, according to equation (23) and applies the selected grid to the pulse amplitude calculating circuit 132.
  • the circuit 132 is synchronously supplied with the above-mentioned three outputs (viz., the autocorrelation function $R_{hh}(m_i - m_k)$, the cross-correlation function $\phi_{xh}(-m_k)$ and the selected grid q), and determines an amplitude of each of the excitation pulses within one frame. In other words, the circuit 132 determines a so-called amplitude sequence of the excitation pulses in one frame.
  • a pulse coding circuit 137 receives the output sequence of the circuit 132 and encodes the selected grid q and the amplitude sequence $g_k$ of the excitation pulses using normalizing coefficients, and applies the encoded information to the multiplexor 300.
  • the normalizing coefficients are also encoded within the pulse coding circuit 137 and applied to the multiplexor 300.
  • the circuit 137 further decodes the encoded data (viz., the grid and the amplitude sequence and the normalizing coefficients) to apply them to a pulse sequence generator 138.
  • the decoded grid and the decoded amplitude sequence are respectively denoted by q' and $g_k'$.
  • the operation of the pulse coding circuit 137 has been disclosed in the above-mentioned Paper 1.
  • the pulse sequence generator 138 outputs an excitation pulse sequence of one frame using $g_k'$ and $m_k'$, which pulse sequence has an amplitude $g_k'$ at a position $m_k'$.
  • the synthesis filter 122 receives the excitation pulse sequence, and also receives the coefficients LAR'(i) and the pitch information (Md' and $\beta'$) from the circuits 118 and 124, respectively. It should be noted that the synthesis filter 122 converts LAR'(i) into a prediction parameter $a_i$ ($1 \le i \le Np$) by means of a well known method. The filter 122 appends one frame of zeros to the excitation signal applied thereto and determines a response signal sequence x'(n) for the resulting two-frame signal.
  • the sequence x'(n) can be represented by equation (24), which is identical to equation (5).
  • the excitation signal d(n) represents the output pulse signal generated by the pulse sequence generator 138 when $1 \le n \le N$, while representing a series of all zeros in the case of $(N + 1) \le n \le 2N$.
  • the subtractor 134 receives x'(n) obtained using equation (24) (wherein $N + 1 \le n \le 2N$).
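
The two-frame procedure described above amounts to computing the filter's ringing into the next frame and removing it from the incoming speech before the excitation search. The sketch below shows that structure only: `one_pole` is a toy stand-in for the synthesis filter 122, and all function names are hypothetical.

```python
def ringing_into_next_frame(d_frame, synth):
    """Return samples N+1 .. 2N of the response to d_frame followed by N zeros."""
    n = len(d_frame)
    response = synth(list(d_frame) + [0.0] * n)
    return response[n:]


def one_pole(d, a1=0.5):
    """Toy stand-in for synthesis filter 122: y(n) = d(n) + a1 * y(n - 1)."""
    y, prev = [], 0.0
    for sample in d:
        prev = sample + a1 * prev
        y.append(prev)
    return y


def target_signal(x_frame, previous_excitation, synth):
    """Subtract the previous frame's ringing from the current speech frame."""
    ring = ringing_into_next_frame(previous_excitation, synth)
    return [x - r for x, r in zip(x_frame, ring)]


ring = ringing_into_next_frame([1.0, 0.0], one_pole)
target = target_signal([1.0, 1.0], [1.0, 0.0], one_pole)
```

Passing the filter in as a callable keeps the sketch independent of how the long term and short term stages are actually realised.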
  • the multiplexor 300 combines the outputs of the circuits 137, 118 and 124, which are applied to a transmission line via an output terminal 302.
  • the arrangement of Fig. 5 differs from that of Fig. 2 in that the former further includes a switch 500, a decision circuit 502, a gate 504 and a section 506.
  • This section 506 is arranged in exactly the same manner as the arrangement of a section 508, although the functions of the two sections 506 and 508 are slightly different.
  • each of the blocks 120', 126', 128', 130', 132' and 136' in the section 506 bears the same reference numeral as the counterpart in the section 508 but has a prime for the purposes of differentiation.
  • the section 508 operates in the same manner as described above and hence further descriptions thereof will be omitted for simplicity.
  • Because the blocks included in the section 506 operate in the same manner as their counterparts in the section 508, their operation is not described in detail.
  • the impulse response calculating circuit 120' in the section 506 receives the decoded LAR'(i) at the coefficient weighting circuit 414 (Fig. 4), determines an impulse response of a predetermined number of samples, and applies the output $h_w'(n)$ to the autocorrelation function calculating circuit 126' as well as the cross-correlation function calculating circuit 128'.
  • the autocorrelation function calculating circuit 126' calculates an autocorrelation function $R_{hh}'(m_i - m_k)$ according to equation (21), and applies the result to the pulse amplitude calculating circuit 132'.
  • the weighting circuit 136' operates in the same manner as the counterpart 136, and applies its output $x_w(n)$ to the cross-correlation function calculating circuit 128'.
  • This circuit 128' is further supplied with the impulse response $h_w'(n)$, and calculates the cross-correlation function $\phi_{xh}'(-m_k)$ (where $1 \le m_k \le N$), which is applied to the RPE grid selector 130' and also to the pulse amplitude calculating circuit 132'.
  • the grid selector 130' determines or selects a grid q', using the cross-correlation function $\phi_{xh}'(-m_k)$, according to equation (23) and applies the selected grid q' to the pulse amplitude calculating circuit 132'.
  • the circuit 132' is synchronously supplied with the above-mentioned three outputs (viz., the autocorrelation function R hh '(
  • the decision circuit 502 is coupled to the circuits 132 and 132' to be supplied with the outputs: the autocorrelation functions R hh (
  • the decision circuit 502 determines power or energy J of an error signal between the incoming and reconstructed signals, according to the following equation (25), in connection with each of the two excitation pulse series which are obtained at the sections 508 and 506. Equation (25) can be obtained by substituting equations (15) and (22) into equation (9).
  • $R_{xx}(0)$ represents power or energy of the output $x_w(n)$ of the weighting circuit 136 (or 136').
  • the error signal energy can approximately be obtained using the following equation (26) instead of equation (25).
  • $J \approx \phi_{xh}^2(-m_i) / R_{hh}$. Equation (26) utilizes an error of the cross-correlation function, which can be obtained by calculating the excitation pulse series.
  • the decision circuit 502 compares the two kinds of power or energy: one obtained depending on the parameters from the section 508 (referred to as Jo) and the other obtained depending on the parameters from the section 506 (referred to as Jo'). In the event of Jo' ≤ Jo, the decision circuit 502 determines that the excitation pulse series obtained through the section 506 is more suitable than that obtained through the section 508. In this case, the decision circuit 502 instructs the switch 500 to relay the output of the section 506 to the pulse coding circuit 137. Further, the decision circuit 502 opens the gate 504 allowing the coded information (coded-LAR(i), coded-Md and coded-$\beta$) to be applied to the multiplexor 300.
  • The gate 504 attaches a predetermined code to the coded-Md and coded-$\beta$. Conversely, in the event of Jo' > Jo, the decision circuit 502 forces the switch 500 to relay the output of the section 508 to the circuit 137, and opens the gate 504 to pass the above-mentioned coded information therethrough.
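
The comparison made by the decision circuit 502 reduces to a single test on the two error energies. A minimal sketch, in which the function name and the use of the section numbers as return values are purely illustrative:

```python
def choose_excitation(j_with_pitch, j_without_pitch):
    """Return the section whose excitation is coded: 506 (no pitch) or 508 (pitch).

    j_with_pitch corresponds to Jo (section 508), j_without_pitch to Jo' (section 506).
    """
    return 506 if j_without_pitch <= j_with_pitch else 508


selected = choose_excitation(j_with_pitch=1.0, j_without_pitch=0.5)
```

A tie goes to the section 506, matching the condition under which the text selects that section.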
  • the impulse response calculating circuit 120 can be adapted to calculate both of the above-mentioned functions $h_w(n)$ and $h_w'(n)$. In this case the circuit 120 generates $h_w'(n)$ by setting to zero the parameters Md' and $\beta'$ which are applied to the coefficient weighting circuit 408. It goes without saying that $h_w(n)$ may be calculated first and $h_w'(n)$ thereafter, or vice versa; the same applies to the other blocks in which two kinds of computation are implemented.
  • the second embodiment can be modified such that the pitch gain $\beta$ is compared with a predetermined threshold. If the pitch gain $\beta$ is less than the threshold, then the pitch gain $\beta$ is set to zero. This means that the excitation pulses are generated using the spectrum parameters only. This modification no longer requires the decision circuit 502 or the calculations of equations (25) and (26), and can therefore reduce the number of operations.
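
The threshold variant can be sketched in a few lines. The threshold value is not given in the text, so the one used in the example is an assumption, as is the function name.

```python
def gate_pitch_gain(gain, threshold):
    """Return the pitch gain unchanged, or 0.0 when it is below the threshold."""
    return gain if gain >= threshold else 0.0


# Assumed threshold of 0.3, chosen only for illustration.
weak = gate_pitch_gain(0.1, threshold=0.3)
strong = gate_pitch_gain(0.8, threshold=0.3)
```

With the gain forced to zero the long term filter contributes nothing, so the excitation search uses the spectrum parameters alone.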

Landscapes

  • Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Signal Processing (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Compression, Expansion, Code Conversion, And Decoders (AREA)

Claims (6)

  1. Anordnung zur Codierung eines Sprachsignals unter Verwendung eines Regulär-Pulsanregungsschemas mit:
    einer ersten Einrichtung (112, 114, 116), die dazu bestimmt ist, mit einem Diskretzeit-Sprachsignal versorgt zu werden und das Diskretzeit-Sprachsignal in mehrere Rahmen zu teilen;
    einer zweiten Einrichtung (118, 124) zum Extrahieren mehrerer Parameter aus jedem der von der ersten Einrichtung übergebenen Rahmen;
    Syntheseeinrichtung (122) zum Erzeugen eines Signals unter Verwendung der mehreren Parameter und einer Folge von Anregungsimpulsen;
    einer dritten Einrichtung (120) zum Erzeugen eines Impulsantwortfunktionssignals unter Verwendung der mehreren Parameter;
    einer vierten Einrichtung (126) zum Erzeugen eines Autokorrelationsfunktionssignals unter Verwendung des Impulsantwortfunktionssignals; und
    einer fünften Einrichtung (128) zum Erzeugen eines Kreuzkorrelationsfunktionssignals unter Verwendung des Impulsantwortfunktionssignals und einer gewichteten Differenz zwischen einem der Rahmen des Diskretzeit-Sprachsignals und einem Rahmen des durch die Syntheseeinrichtung erzeugten Signals;
       gekennzeichnet durch:
    eine sechste Einrichtung (130) zum Erzeugen eines Rastersignals, das die Lage eines ersten Anregungsimpulses innerhalb eines Rahmens anzeigt, unter Verwendung des Kreuzkorrelationsfunktionssignals; und
    eine siebente Einrichtung (132) zum Empfangen des Autokorrelationsfunktionssignals, des Kreuzkorrelationsfunktionssignals und des Rastersignals, wobei die siebente Einrichtung eine Amplitudenfolge der Anregungsimpulse innerhalb eines Rahmens bestimmt.
  2. Anordnung nach Anspruch 1, wobei die zweite Einrichtung (118, 124) aufweist:
    eine achte Einrichtung, die einen oder mehrere erste Parameter, die eine spektrale Hüllkurve darstellen, aus jedem der von der ersten Einrichtung übergebenen Rahmen extrahiert, die ersten Parameter codiert, die codierten ersten Parameter decodiert und die decodierten ersten Parameter erzeugt; und
    eine neunte Einrichtung, die zweite und dritte Parameter aus jedem der von der ersten Einrichtung übergebenen Rahmen extrahiert, wobei die zweiten und dritten Parameter jeweils eine Tonhöhenperiode bzw. eine Tonhöhenverstärkung darstellen, wobei die neunte Einrichtung die codierten zweiten und dritten Parameter decodiert und die decodierten zweiten und dritten Parameter erzeugt,
    wobei die decodierten ersten, zweiten und dritten Parameter an die dritte Einrichtung (120) übergeben werden.
  3. Anordnung nach Anspruch 2, wobei die dritte Einrichtung (120) aufweist:
    einen Impulsgenerator (400) zum Erzeugen eines Impulses;
    ein Langzeit-Prädiktionsfilter (402), das den Impuls sowie die zweiten und dritten Parameter empfängt; und
    ein Kurzzeit-Prädiktionsfilter (404), das in Reihe mit dem Langzeit-Prädiktionsfilter (402) geschaltet ist und die ersten Parameter und das Ausgangssignal des Langzeit-Prädiktionsfilters empfängt.
  4. Verfahren zum Codieren eines Sprachsignals unter Verwendung eines Regulär-Pulsanregungsschemas mit den Schritten:
    (a) Empfangen eines Diskretzeit-Sprachsignals und Teilen des Diskretzeit-Sprachsignals in mehrere Rahmen;
    (b) Extrahieren mehrerer Parameter aus jedem der Rahmen des Diskretzeit-Sprachsignals;
    (c) Erzeugen eines Signals unter Verwendung der mehreren Parameter und einer Folge von Anregungsimpulsen;
    (d) Erzeugen eines Impulsantwortfunktionssignals unter Verwendung der mehreren Parameter;
    (e) Erzeugen eines Autokorrelationsfunktionssignals unter Verwendung des Impulsantwortsignals; und
    (f) Erzeugen eines Kreuzkorrelationsfunktionssignals unter Verwendung des Impulsantwortfunktionssignals und einer gewichteten Differenz zwischen einem der Rahmen des Diskretzeit-Sprachsignals und einem Rahmen des Signals;
       gekennzeichnet durch:
    (g) Erzeugen eines Rastersignals, das die Lage eines ersten Anregungsimpulses innerhalb eines Rahmens kennzeichnet, unter Verwendung des Kreuzkorrelationsfunktionssignals; und
    (h) Empfangen des Autokorrelationsfunktionssignals, des Kreuzkorrelationsfunktionssignals und des Rastersignals und Bestimmen einer Amplitudenfolge der Anregungsimpulse innerhalb eines Rahmens.
  5. Verfahren nach Anspruch 4, wobei der Schritt (b) die Schritte aufweist:
    Extrahieren eines oder mehrerer erster Parameter, die eine spektrale Hüllkurve darstellen, aus jedem der Rahmen des Diskretzeit-Sprachsignals und Codieren der ersten Parameter, Decodieren der codierten ersten Parameter und Erzeugen der decodierten ersten Parameter; und
    Extrahieren von zweiten und dritten Parametern aus jedem der Rahmen des Diskretzeit-Sprachsignals, wobei die zweiten und dritten Parameter jeweils eine Tonhöhenperiode bzw. eine Tonhöhenverstärkung darstellen, und Decodieren der codierten zweiten und dritten Parameter und Erzeugen der decodierten zweiten und dritten Parameter,
    wobei die decodierten ersten, zweiten und dritten Parameter den mehreren Parametern in Schritt (d) entsprechen.
  6. A method according to claim 5, wherein step (d) comprises the steps of:
    generating an impulse;
    receiving the impulse and the second and third parameters, and generating an output signal representing a pitch structure; and
    receiving the first parameters and the output signal representing a pitch structure, and generating an output signal representing a spectral envelope characteristic.
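
For illustration only, steps (d) through (h) of claim 4 and the filter cascade of claim 6 can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the patented implementation: all function names are hypothetical, the spectral envelope filter is assumed to be an all-pole filter 1 / (1 - sum a_k z^-k), the grid is assumed to be chosen as the phase with the largest cross-correlation energy, and the amplitudes are assumed to come from normal equations built from the autocorrelation values. The claims themselves do not fix any of these choices. The weighted difference signal of step (f) is taken as a ready-made input (`target`).

```python
from typing import List, Sequence


def impulse_response(lpc: Sequence[float], pitch_period: int,
                     pitch_gain: float, length: int) -> List[float]:
    """Claim 6 sketch: impulse -> pitch structure -> spectral envelope."""
    x = [1.0] + [0.0] * (length - 1)              # unit impulse
    p = [0.0] * length                            # pitch synthesis output
    for n in range(length):
        fb = pitch_gain * p[n - pitch_period] if n >= pitch_period else 0.0
        p[n] = x[n] + fb
    h = [0.0] * length                            # envelope synthesis output
    for n in range(length):
        acc = p[n]
        for k, a in enumerate(lpc, start=1):      # assumed sign convention
            if n >= k:
                acc += a * h[n - k]
        h[n] = acc
    return h


def autocorrelation(h: Sequence[float]) -> List[float]:
    """Step (e): R[m] = sum_n h[n] * h[n + m]."""
    n = len(h)
    return [sum(h[i] * h[i + m] for i in range(n - m)) for m in range(n)]


def cross_correlation(target: Sequence[float], h: Sequence[float]) -> List[float]:
    """Step (f): phi[m] = sum_n target[n] * h[n - m]; len(h) >= len(target)."""
    n = len(target)
    return [sum(target[i] * h[i - m] for i in range(m, n)) for m in range(n)]


def select_grid(phi: Sequence[float], spacing: int) -> int:
    """Step (g), assumed criterion: phase with the largest energy of phi."""
    energies = [sum(v * v for v in phi[k::spacing]) for k in range(spacing)]
    return max(range(spacing), key=lambda k: energies[k])


def solve_amplitudes(r: Sequence[float], phi: Sequence[float],
                     grid: int, spacing: int) -> List[float]:
    """Step (h), assumed form: solve sum_j R[|m_i - m_j|] g_j = phi[m_i]."""
    pos = list(range(grid, len(phi), spacing))    # regular pulse positions
    n = len(pos)
    a = [[r[abs(i - j)] for j in pos] + [phi[i]] for i in pos]
    for c in range(n):                            # Gaussian elimination
        piv = max(range(c, n), key=lambda row: abs(a[row][c]))
        a[c], a[piv] = a[piv], a[c]
        for row in range(c + 1, n):
            f = a[row][c] / a[c][c]               # fails if matrix is singular
            for col in range(c, n + 1):
                a[row][col] -= f * a[c][col]
    g = [0.0] * n
    for row in range(n - 1, -1, -1):              # back substitution
        s = a[row][n] - sum(a[row][col] * g[col] for col in range(row + 1, n))
        g[row] = s / a[row][row]
    return g
```

With a trivial impulse response (no envelope coefficients, zero pitch gain), the cross-correlation equals the target itself, so a target whose energy sits at positions 1, 4 and 7 yields grid 1 for a pulse spacing of 3, and the amplitudes equal the target samples at those positions.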
EP19900111360 1989-06-14 1990-06-15 Apparatus and method for speech coding with regular-pulse excitation Expired - Lifetime EP0402947B1 (de)

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
JP1150770A JPH0315900A (ja) 1989-06-14 1989-06-14 Speech signal coding apparatus
JP150770/89 1989-06-14
JP1254458A JP2900431B2 (ja) 1989-09-29 1989-09-29 Speech signal coding apparatus
JP254458/89 1989-09-29

Publications (3)

Publication Number Publication Date
EP0402947A2 (de) 1990-12-19
EP0402947A3 (de) 1991-10-23
EP0402947B1 (de) 1997-11-26

Family

ID=26480256

Family Applications (1)

Application Number Title Priority Date Filing Date
EP19900111360 Expired - Lifetime EP0402947B1 (de) 1989-06-14 1990-06-15 Apparatus and method for speech coding with regular-pulse excitation

Country Status (2)

Country Link
EP (1) EP0402947B1 (de)
DE (1) DE69031749T2 (de)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CA2084323C (en) * 1991-12-03 1996-12-03 Tetsu Taguchi Speech signal encoding system capable of transmitting a speech signal at a low bit rate

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5091946A (en) * 1988-12-23 1992-02-25 Nec Corporation Communication system capable of improving a speech quality by effectively calculating excitation multipulses

Also Published As

Publication number Publication date
EP0402947A3 (de) 1991-10-23
DE69031749T2 (de) 1998-05-14
EP0402947A2 (de) 1990-12-19
DE69031749D1 (de) 1998-01-08

Similar Documents

Publication Publication Date Title
EP0409239B1 (de) Speech coding and decoding method
KR100417836B1 (ko) High frequency content recovery method and device for over-sampled synthesized wideband signal
EP0709827B1 (de) Apparatus and method for speech coding and decoding, and apparatus for extracting a phase-amplitude characteristic
US4821324A (en) Low bit-rate pattern encoding and decoding capable of reducing an information transmission rate
EP0673013B1 (de) Signal encoding and decoding system
EP0802524B1 (de) Speech coder
US4716592A (en) Method and apparatus for encoding voice signals
KR20010102004A (ko) CELP transcoding
KR100218214B1 (ko) Speech coding apparatus and speech coding/decoding apparatus
US4945565A (en) Low bit-rate pattern encoding and decoding with a reduced number of excitation pulses
US20040111257A1 (en) Transcoding apparatus and method between CELP-based codecs using bandwidth extension
EP0390975B1 (de) Encoder device capable of improving speech quality by using a dual pulse-generation arrangement
EP1597721B1 (de) 600 bps MELP (mixed excitation linear prediction) transcoding
EP0715297B1 (de) Restoration of a sequence of speech code parameters by means of classification and a directory of parameter trajectories
US6006178A (en) Speech encoder capable of substantially increasing a codebook size without increasing the number of transmitted bits
EP0849724A2 (de) High-quality speech coding apparatus and method
CA1229681A (en) Method and apparatus for speech-band signal coding
US5704002A (en) Process and device for minimizing an error in a speech signal using a residue signal and a synthesized excitation signal
EP0744069B1 (de) Linear prediction by pulse excitation
EP0402947B1 (de) Apparatus and method for speech coding with regular-pulse excitation
US5708756A (en) Low delay, middle bit rate speech coder
JP3153075B2 (ja) Speech coding apparatus
KR0155798B1 (ko) Speech signal encoding and decoding method
JP2900431B2 (ja) Speech signal coding apparatus
JP3089967B2 (ja) Speech coding apparatus

Legal Events

Date Code Title Description
PUAI Public reference made under article 153(3) epc to a published international application that has entered the european phase

Free format text: ORIGINAL CODE: 0009012

17P Request for examination filed

Effective date: 19900710

AK Designated contracting states

Kind code of ref document: A2

Designated state(s): BE DE FR GB NL SE

PUAL Search report despatched

Free format text: ORIGINAL CODE: 0009013

AK Designated contracting states

Kind code of ref document: A3

Designated state(s): BE DE FR GB NL SE

17Q First examination report despatched

Effective date: 19940926

GRAH Despatch of communication of intention to grant a patent

Free format text: ORIGINAL CODE: EPIDOS IGRA

GRAH Despatch of communication of intention to grant a patent

Free format text: ORIGINAL CODE: EPIDOS IGRA

GRAA (expected) grant

Free format text: ORIGINAL CODE: 0009210

AK Designated contracting states

Kind code of ref document: B1

Designated state(s): BE DE FR GB NL SE

REF Corresponds to:

Ref document number: 69031749

Country of ref document: DE

Date of ref document: 19980108

ET Fr: translation filed
PLBE No opposition filed within time limit

Free format text: ORIGINAL CODE: 0009261

STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: NO OPPOSITION FILED WITHIN TIME LIMIT

26N No opposition filed
PGFP Annual fee paid to national office [announced via postgrant information from national office to epo]

Ref country code: BE

Payment date: 20000522

Year of fee payment: 11

PGFP Annual fee paid to national office [announced via postgrant information from national office to epo]

Ref country code: SE

Payment date: 20000605

Year of fee payment: 11

PGFP Annual fee paid to national office [announced via postgrant information from national office to epo]

Ref country code: FR

Payment date: 20000612

Year of fee payment: 11

PGFP Annual fee paid to national office [announced via postgrant information from national office to epo]

Ref country code: GB

Payment date: 20000614

Year of fee payment: 11

Ref country code: DE

Payment date: 20000614

Year of fee payment: 11

PGFP Annual fee paid to national office [announced via postgrant information from national office to epo]

Ref country code: NL

Payment date: 20000629

Year of fee payment: 11

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: GB

Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES

Effective date: 20010615

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: SE

Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES

Effective date: 20010616

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: BE

Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES

Effective date: 20010630

BERE Be: lapsed

Owner name: NEC CORP.

Effective date: 20010630

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: NL

Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES

Effective date: 20020101

EUG Se: european patent has lapsed

Ref document number: 90111360.5

GBPC Gb: european patent ceased through non-payment of renewal fee

Effective date: 20010615

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: FR

Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES

Effective date: 20020228

NLV4 Nl: lapsed or annulled due to non-payment of the annual fee

Effective date: 20020101

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: DE

Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES

Effective date: 20020403