EP0482699B1 - Method for coding and decoding a sampled analog signal having a repetitive nature and a device for coding and decoding by said method

Info

Publication number
EP0482699B1
EP0482699B1 EP91202675A EP91202675A EP0482699B1 EP 0482699 B1 EP0482699 B1 EP 0482699B1 EP 91202675 A EP91202675 A EP 91202675A EP 91202675 A EP91202675 A EP 91202675A EP 0482699 B1 EP0482699 B1 EP 0482699B1
Authority
EP
European Patent Office
Prior art keywords
term prediction
amplitudes
signal
prediction analysis
combined
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Lifetime
Application number
EP91202675A
Other languages
German (de)
French (fr)
Other versions
EP0482699A3 (en)
EP0482699A2 (en)
Inventor
John Gerard Beerends
Frank Muller
Robertus Lambertus Adrianus Van Ravesteijn
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Koninklijke PTT Nederland NV
Original Assignee
Koninklijke PTT Nederland NV
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Koninklijke PTT Nederland NV
Publication of EP0482699A2
Publication of EP0482699A3
Application granted
Publication of EP0482699B1
Anticipated expiration
Expired - Lifetime

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/04 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using predictive techniques
    • G10L19/06 Determination or coding of the spectral characteristics, e.g. of the short-term prediction coefficients

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • Signal Processing (AREA)
  • Health & Medical Sciences (AREA)
  • Spectroscopy & Molecular Physics (AREA)
  • Human Computer Interaction (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Compression, Expansion, Code Conversion, And Decoders (AREA)
  • Reduction Or Emphasis Of Bandwidth Of Signals (AREA)
  • Transmission Systems Not Characterized By The Medium Used For Transmission (AREA)
  • Analogue/Digital Conversion (AREA)
  • Electrically Operated Instructional Devices (AREA)

Abstract

Frequency components are calculated from the STP-filtered speech signal. The amplitudes of these are combined in a manner such that the resultant values are associated with frequencies which are situated equidistantly on a linear Bark scale. Said components are quantised, possibly after scaling. In the decoder, the components are again distributed over the frequency spectrum. In the coder, the fundamental regularity D is determined with an LTP technique, after which it is transmitted. In the decoder the phases of the reconstructed signal at the spacing D in the past are determined. These phases are combined with the amplitudes already present in the frequency spectrum, after which transformation back to the time domain takes place. Inverse STP filtering is then carried out.

Description

  • The invention relates to a method for coding a sampled analog signal having a repetitive nature, in which the sampled signal is split into consecutive segments each containing a predetermined number of samples; in which a short-term prediction analysis is performed on said segments and in which the coefficients determined in said short-term prediction analysis are transmitted and are also fed to a short-term prediction filter, in which a long-term prediction analysis performed on a residual signal available at an output of said filter and the information determined in said long-term prediction analysis is also transmitted, and in which the information present in the residual signal is coded and transmitted.
  • Such a method is known from "An error protected transform coder for cellular mobile radio", by H. Suda et al, disclosed during the IEEE Workshop on speech coding "Advances in speech coding", Vancouver, CA, 5-8 September 1989, pages 81-86. This paper discloses in its figure 1 an encoder in which a short-term prediction analysis and a long-term prediction analysis are performed. This encoder comprises a short-term prediction filter for generating a residual signal and comprises a multiplexing unit for multiplexing and then transmitting (after coding) information present in the residual signal, information determined in said short-term prediction analysis and information determined in said long-term prediction analysis.
  • This known method is disadvantageous, inter alia, because it transmits information in an inefficient way, i.e. with a large number of bits/second.
  • It is an object of the invention, inter alia, to provide a method for very efficiently transmitting the information, i.e. with a small number of bits/second, without the quality, experienced by the listener, of the speech reconstructed by a decoding method at the receiving side being impaired.
  • To this end, the method according to the invention is characterised in that the residual signal is transformed into the frequency domain, in that the amplitudes of at least a number of the frequency components obtained by the transformation into the frequency domain are combined to form a smaller number of frequency components in a manner such that the frequencies associated with the combined amplitudes are situated equidistantly on a linear Bark scale, and in that a signal is transmitted which is representative of said combined amplitudes.
  • According to the present invention, the residual signal is coded perceptively, which means that only that information is transmitted which is relevant for differences in the decoded received signal which can be detected by the human ear.
  • In the first place, use is made for this purpose of the known fact that the human ear is not sensitive to absolute phase values, but only to phase relationships, so that it is not necessary in principle to transmit the phase information from the residual signal to be coded, provided only that it is possible to reconstruct the original phase relationships at the receiving end.
  • In addition, the present invention makes use of the insight known for some time that human hearing functions in fact as a chain consisting of a number of filters having adjacent frequency bands but having different bandwidths, the so-called critical bands or Barks, the bandwidth of such critical bands being much smaller for low frequencies than for high frequencies. A frequency scale formed in accordance with this insight is referred to as a linear Bark scale. For a further explanation of the principle of the Bark scale, reference is made to B. Scharf and S. Buus, "Stimulus, Physiology, Thresholds" in L. Kaufman, K.R. Boff and J. P. Thomas, editors, Handbook of Perception and Human Performance, chapter 14, pages 1-43, Wiley, New York, 1986.
  • It is also pointed out that the principle of first transforming a residual signal to be transmitted in speech coding to the frequency domain and then transmitting the information available after this transformation has already been put forward earlier. For this purpose reference can be made, for example, to the paper entitled "Fourier Transform Vector Quantisation for Speech Coding" by P. Chang et al. in IEEE Transactions on Communications, Vol. COM 35, No. 10, pages 1059-1068.
  • According to this publication, however, after the transformation use is made of vector quantisation and there is no mention of transmitting purely amplitude information.
  • The invention further relates to a method for decoding a signal coded by the method described above, in which the long-term prediction analysis information received and the other information received from the residual signal are combined and the combined signal, together with the short-term prediction analysis coefficients received, is fed to an inverse short-term prediction filter at whose output a series of samples is delivered which is representative of the sampled analog signal.
  • The method for decoding is characterised in that original amplitudes in the frequency domain are reconstructed from the combined amplitude values received, in that the information transmitted as a result of the long-term prediction analysis is used to calculate the phase values associated with said amplitudes, and in that the calculated phase values, together with the associated amplitudes are transformed to the time domain.
  • The invention also relates to an apparatus for coding a sampled analog signal having a repetitive nature, comprising
    • splitting means for splitting the sampled signal into consecutive segments each containing a predetermined number of samples,
    • short-term prediction means for performing a short-term prediction analysis on said segments and for generating coefficients,
    • a short-term prediction filter for receiving the coefficients determined in said short-term prediction analysis,
    • long-term prediction means for performing a long-term prediction analysis on a residual signal available at an output of said filter,
    • coding means for coding the information present in the residual signal, which information is to be transmitted, and
    • an output for transmitting the coefficients, the information determined in said long-term prediction analysis and the information present in the residual signal.
  • This apparatus is characterised in that the apparatus comprises
    • transformation means for transforming the residual signal into the frequency domain,
    • combination means for combining the amplitudes of at least a number of the frequency components obtained by the transformation into the frequency domain to form a smaller number of frequency components in a manner such that the frequencies associated with the combined amplitudes are situated equidistantly on a linear Bark scale, a signal which is representative of said combined amplitudes being transmittable.
  • The invention further also relates to an apparatus for decoding a coded signal, comprising
    • an input for receiving long-term prediction analysis information and other information received from the residual signal and short-term prediction analysis coefficients,
    • combination means for combining the long-term prediction analysis information and the other information received from the residual signal into a combined signal,
    • an inverse short-term prediction filter for receiving the combined signal and the short-term prediction analysis coefficients and for generating at its output a series of samples which is representative of the sampled analog signal.
  • This apparatus is characterised in that the apparatus comprises
    • reconstruction means for reconstructing original amplitudes in the frequency domain from the combined amplitude values received,
    • calculation means for using the information transmitted as a result of the long-term prediction analysis to calculate the phase values associated with said amplitudes, and
    • transformation means for transforming the calculated phase values, together with the associated amplitudes, into the time domain.
  • It should be observed that it is known that analog signals having a strongly consistent nature such as, for example, speech signals can be efficiently coded after sampling by consecutively performing a number of different transformations on consecutive segments of the signal which each have a particular time duration. One of the known transformations for this purpose is linear predictive coding (LPC), for an explanation of which reference can be made to the book entitled "Digital Processing of Speech Signals" by L.R. Rabiner and R.W. Schafer; Prentice Hall, New Jersey; chapter 8. As stated, LPC is always used for signal segments having a particular time duration, in the case of speech signals, for example, 20 ms, and is considered as short-term coding. It is also known to make use not only of a short-term prediction but also of a long-term prediction (LTP), a very efficient coding being obtained by a combination of these two techniques. The principle of LTP is described in Frequenz (Frequency), volume 42, no. 2-3, 1988; pages 85-93; P. Vary et al.: "Sprachcodec für das Europäische Funkfernsprechnetz" ("Speech coder/decoder for the European Radiotelephone Network"), while an improved version of the LTP principle is described in the Dutch Patent Application 9001985.
  • The invention will be explained in greater detail below on the basis of an exemplary embodiment with reference to the drawing, wherein:
  • Figure 1a shows a block diagram of an exemplary embodiment of a coding unit for the device according to the invention.
  • Figure 1b shows a block diagram of an exemplary embodiment of a decoding unit for the device according to the invention.
  • An analog signal delivered by a microphone 1 is limited in bandwidth by a low-pass filter 2 and converted in an analog/digital converter 3 into a series of amplitude and time-discrete samples which are representative of the analog signal. The output signal of the converter 3 is fed to the input of a short-term analysis unit 4 and to the input of a short-term prediction filter 5. These two units cater for the abovementioned short-term prediction (STP) on segments of, for example, 160 samples and the analysis unit 4 provides an output signal in the form of short-term prediction filter coefficients which are quantised, coded and transmitted to the decoder unit shown in Figure 1b. The structure and the function of the filter 5 and the unit 4 are well known to those skilled in the art in the field of speech coding and are of no further importance for the present invention, so that a further explanation can be omitted. The STP-filtered signal is fed to a long-term prediction (LTP) analysis unit 6. In this analysis unit, an LTP analysis is applied twice per segment of 160 samples in a manner such as that described, for example, in Dutch Patent Application 9001985. In such an LTP analysis, for a signal segment to be coded, a search is always made, in accordance with a particular search strategy, for a segment which is as similar as possible in a signal period preceding said segment having a particular duration and a signal is transmitted in coded form which is representative of the number of samples D situated between the starting instant of the segment found and the starting instant of the segment to be coded.
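  • The lag search described above can be illustrated with a minimal sketch. The text leaves the search strategy to Dutch Patent Application 9001985, so the similarity measure used here (normalised cross-correlation) is an assumption, as are the function name `find_lag` and the parameters `min_lag` and `max_lag`.

```python
import math

def find_lag(residual, start, length, min_lag, max_lag):
    """Return the spacing D (in samples) between the segment
    residual[start:start+length] and the most similar earlier segment.

    Assumption of this sketch: similarity is measured by the
    cross-correlation normalised by the candidate's energy.
    """
    target = residual[start:start + length]
    best_lag, best_score = None, -math.inf
    for lag in range(min_lag, max_lag + 1):
        if start - lag < 0:
            break  # not enough signal history for this lag
        cand = residual[start - lag:start - lag + length]
        energy = sum(c * c for c in cand)
        if energy == 0.0:
            continue  # a silent candidate cannot be normalised
        corr = sum(t * c for t, c in zip(target, cand))
        score = corr / math.sqrt(energy)
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag
```

  • In this sketch the returned value plays the role of D; a real coder would additionally quantise and code it before transmission.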
  • The output signal of the STP filter unit 5 is referred to as the residual signal and, according to the invention, said residual signal is transmitted in coded form in a manner such that only the information which, seen perceptively, is relevant is transmitted. For this purpose, the segments of 160 samples in said residual signal are divided into 8 subsegments of 30 samples in the circuit 7. This is done by first dividing the segment supplied into eight subsegments of 20 samples and then completing these at the leading edge with the ten last samples of the previous subsegment. This implies that the last ten samples of every segment have to be stored in order to also be able to complete the first subsegment of the subsequent segment. Then every subsegment of 30 samples is multiplied in a circuit 8 by a window function such as, for example, a cosine function. The window function is so chosen that, for every sample in the overlapping parts of the subsegments, the sum of the squares of the two multiplication factors is unity. The reason that this has to be the case for the squares is that the multiplication by the window function takes place both in the coding unit and in the decoding unit shown in Figure 1b. A Discrete Fourier Transform (DFT) is performed on the windowed subsegment in a circuit 9, 16 different frequency components being obtained for every subsegment. Of these 16 frequency components, numbered 0 to 15 inclusive, the amplitudes A of the components 1 to 13 inclusive are calculated in a circuit 10. The components 0, 14 and 15 can be ignored because they are situated outside the frequency band of 300 - 3,400 Hz chosen for speech communication. If a greater or a smaller frequency band is relevant, the number of amplitude components taken into consideration can be adjusted accordingly. Starting from the said 13 components, four so-called Bark amplitude components are calculated in a circuit 11. 
These are amplitudes associated with frequencies which are situated equidistantly on a linear Bark scale. The Bark amplitude components B1 to B4 inclusive can, for example, be calculated as follows from the DFT amplitudes A1 to A13 inclusive:
    $$B_1 = \sqrt{A_1^2 + A_2^2}$$
    $$B_2 = \sqrt{A_3^2 + A_4^2 + A_5^2}$$
    $$B_3 = \sqrt{A_6^2 + A_7^2 + A_8^2 + A_9^2}$$
    $$B_4 = \sqrt{A_{10}^2 + A_{11}^2 + A_{12}^2 + A_{13}^2}$$
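  • The coder-side steps from subsegmenting to the Bark amplitude components can be sketched as follows. This is an illustration under stated assumptions, not the circuitry of the patent: the window shape (sine-shaped flanks over the ten overlapping samples with a flat middle) is only one choice that satisfies the sum-of-squares condition, the Bark components are taken as root-sum-squares of the DFT amplitudes, and all function names are hypothetical.

```python
import cmath
import math

SUB, OVERLAP = 20, 10      # new samples per subsegment, samples shared with the previous one
N = SUB + OVERLAP          # 30-sample subsegment
BANDS = [(1, 2), (3, 5), (6, 9), (10, 13)]   # DFT components combined into B1..B4

def sine_window(n=N, overlap=OVERLAP):
    """Window whose squared factors sum to one over the overlap (assumed shape)."""
    w = [1.0] * n
    for i in range(overlap):
        v = math.sin(math.pi * (i + 0.5) / (2 * overlap))
        w[i] = v             # rising flank
        w[n - 1 - i] = v     # falling flank, mirror image of the rising one
    return w

def subsegments(segment, tail):
    """Split a segment into 30-sample subsegments; `tail` holds the last
    ten samples of the previous segment."""
    ext = list(tail) + list(segment)
    return [ext[k * SUB:k * SUB + N] for k in range(len(segment) // SUB)]

def dft_amplitudes(x):
    """Amplitudes of DFT components 0..len(x)/2 of a real sequence."""
    n = len(x)
    return [abs(sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)))
            for k in range(n // 2 + 1)]

def bark_amplitudes(a):
    """Combine the amplitudes A1..A13 into the four components B1..B4."""
    return [math.sqrt(sum(a[k] ** 2 for k in range(lo, hi + 1)))
            for lo, hi in BANDS]
```

  • For a 160-sample segment, `subsegments` yields eight subsegments; each is multiplied sample by sample with `sine_window()` before `dft_amplitudes` and `bark_amplitudes` are applied.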
  • If desired, a gain factor G is calculated as a scaling value in circuit 12 from the four Bark amplitude components in accordance with:
    $$G = \sqrt{B_1^2 + B_2^2 + B_3^2 + B_4^2}$$
  • The application of the scaling value G has the advantage that the scaled amplitudes can be coded more efficiently. The value of G is quantised in a circuit 13 and then transmitted to the decoding unit. If the scale factor G has been calculated, every Bark component is divided by the quantised gain factor Ĝ in a circuit 14. The result of this division is quantised in a circuit 15, coded and then also transmitted to the decoding unit.
  • If no use is made of a scaling value, the circuits 12, 13 and 14 can be omitted and the four calculated values for the Bark amplitude components can be transmitted directly after quantisation in circuit 15.
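  • The optional gain scaling can be sketched as below. The patent does not specify the quantisers, so the uniform quantiser and the step sizes `gain_step` and `shape_step` are purely hypothetical, and G is taken as the root of the summed squares of the Bark components.

```python
import math

def gain(bark):
    """Scaling value G computed from the four Bark amplitude components."""
    return math.sqrt(sum(b * b for b in bark))

def quantise(value, step):
    """Uniform scalar quantiser (an assumption; the text names no quantiser)."""
    return round(value / step) * step

def scale_and_quantise(bark, gain_step=0.5, shape_step=1 / 16):
    """Quantise G first, then divide every Bark component by the quantised
    gain and quantise the result, mirroring circuits 12 to 15."""
    g_hat = quantise(gain(bark), gain_step)
    if g_hat == 0.0:
        return 0.0, [0.0] * len(bark)   # silent subsegment, nothing to scale
    return g_hat, [quantise(b / g_hat, shape_step) for b in bark]
```

  • Dividing by the quantised gain rather than by G itself means that the decoder, which only knows the quantised value, undoes exactly the scaling that the coder applied.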
  • After decoding in a circuit 16 in the decoder unit, the four scaled Bark amplitude components are multiplied in a multiplier 18 by the gain factor Ĝ, decoded in a circuit 17, as a result of which the reconstructed Bark amplitude components B̂1 to B̂4 inclusive are obtained. This is of course not applicable if no scaling factor is used in the coding unit. In a circuit 19, the amplitudes in the frequency domain Â1 to Â13 inclusive (equidistant on the Hz scale) are calculated by means of the following formulae:
    $$\hat{A}_1 = \hat{A}_2 = \frac{\hat{B}_1}{\sqrt{2}}$$
    $$\hat{A}_3 = \hat{A}_4 = \hat{A}_5 = \frac{\hat{B}_2}{\sqrt{3}}$$
    $$\hat{A}_6 = \hat{A}_7 = \hat{A}_8 = \hat{A}_9 = \frac{\hat{B}_3}{2}$$
    $$\hat{A}_{10} = \hat{A}_{11} = \hat{A}_{12} = \hat{A}_{13} = \frac{\hat{B}_4}{2}$$
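  • The redistribution of the reconstructed Bark amplitudes over the DFT components can be sketched as follows, assuming each Bark component is spread evenly over the components it was formed from, so that the root-sum-square per band is preserved. The name `spread_bark` and the `BANDS` table are hypothetical.

```python
import math

BANDS = [(1, 2), (3, 5), (6, 9), (10, 13)]   # DFT components belonging to B1..B4

def spread_bark(bark_hat, n_components=16):
    """Return amplitudes for DFT components 0..15; components outside
    the four bands (0, 14 and 15) stay at zero."""
    a_hat = [0.0] * n_components
    for b, (lo, hi) in zip(bark_hat, BANDS):
        width = hi - lo + 1
        for k in range(lo, hi + 1):
            a_hat[k] = b / math.sqrt(width)   # equal share per component
    return a_hat
```

  • The fine structure within a band is not recovered by this step; only the band's overall amplitude is.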
  • In order to be able to transform the 13 frequency components considered in the coder back to the time domain by means of an inverse DFT (IDFT) in the IDFT circuit, the amplitudes and the phases are required.
  • The phases are determined in the following manner with the aid of the LTP information decoded in a circuit 23 and consisting of the sample spacing D.
  • The 120 most recent samples of the reconstructed STP residue such as are present at the output of the circuit 22 to be discussed in greater detail below are stored in each case. In a circuit 24, the subsegment is determined which is situated at a spacing of D samples in the past with respect to the present subsegment and this subsegment is multiplied in a circuit 25 by the same window function as was used in the circuit 8 in the coder unit. A DFT is then applied to said subsegment in a circuit 26, after which the phases of the 13 components considered can be calculated in a circuit 27. With the aid of the phases determined in this way and the amplitudes already calculated, an IDFT is performed in the circuit 20, the amplitudes of Â0, Â14, Â15 and Â16 being set equal to zero.
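  • The phase borrowing and inverse transform can be sketched as below. Two assumptions are made that the text does not state: `history[-1]` is taken to be the sample immediately preceding the first sample of the present subsegment, and the spacing D (`lag`) is taken to be at least one subsegment length, so that the earlier subsegment lies entirely within the stored samples. The function names are hypothetical.

```python
import cmath
import math

N = 30  # subsegment length

def dft(x):
    """Complex DFT components 0..len(x)/2 of a real sequence."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n // 2 + 1)]

def synthesise(a_hat, history, lag, window):
    """Combine the amplitudes a_hat[0..15] with the phases of the windowed
    subsegment `lag` samples in the past, then transform back to time."""
    if not N <= lag <= len(history):
        raise ValueError("this sketch assumes N <= lag <= len(history)")
    start = len(history) - lag
    past = [history[start + t] * window[t] for t in range(N)]
    phases = [cmath.phase(c) for c in dft(past)]
    spec = [a * cmath.exp(1j * p) for a, p in zip(a_hat, phases)]
    out = []
    for t in range(N):
        acc = spec[0].real                        # component 0
        for k in range(1, N // 2):                # components 1..14 and their mirrors
            acc += 2.0 * (spec[k] * cmath.exp(2j * math.pi * k * t / N)).real
        acc += (spec[N // 2] * cmath.exp(1j * math.pi * t)).real   # component 15
        out.append(acc / N)
    return out
```

  • When the amplitudes of components 0, 14 and 15 are zero, as in the decoder described here, the corresponding terms simply drop out of the sum.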
  • At the output of the circuit 20 a reconstruction of the subsegment, 30 samples long, is now available, but this has also been modified by the window function applied in the coder unit. The reconstructed subsegment is therefore multiplied again by the window function in a circuit 21. In a circuit 22, the last ten samples of the previous subsegment, likewise multiplied twice by the window function and stored for this purpose, are added to the first ten samples of the subsegment now multiplied twice by the window function. As a result of this, the sum of the multiplication factors in the resultant ten samples is equal to unity.
  • The last ten samples in this subsegment are stored. The first twenty samples form a portion of the reconstruction of a segment of the STP residue. After eight subsegments have been reconstructed and combined, a completely reconstructed segment of the STP residue is obtained, and this is situated ten samples in the past with respect to the segment on which the STP analysis has been performed in the coding unit.
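  • The double windowing and the ten-sample overlap can be illustrated with a short NumPy sketch. The window shape is an assumption chosen so that the squared rising and falling flanks sum to unity, which is the property the text relies on. `window` and `overlap_add` are hypothetical names.

```python
import numpy as np


def window():
    """ASSUMED window: 10-sample sine rise, 10 flat samples, 10-sample fall.

    The squared flanks satisfy sin^2 + cos^2 = 1 in the overlap region.
    """
    rise = np.sin(0.5 * np.pi * (np.arange(10) + 0.5) / 10.0)
    return np.concatenate([rise, np.ones(10), rise[::-1]])


def overlap_add(subsegments):
    """Circuits 21 and 22 for a run of 30-sample IDFT outputs.

    Each input subsegment already carries the coder's window once.
    Returns 20 finished samples per subsegment.
    """
    w = window()
    tail = np.zeros(10)  # stored last ten samples of the previous subsegment
    finished = []
    for sub in subsegments:
        twice = np.asarray(sub, dtype=float) * w  # second windowing
        twice[:10] += tail                        # add the stored tail
        finished.append(twice[:20])               # first twenty samples are done
        tail = twice[20:].copy()                  # keep the last ten for later
    return np.concatenate(finished)
```

    Feeding eight subsegments of a constant signal (each equal to the window itself) returns 160 samples that are equal to one everywhere except in the first ten, which have no predecessor to overlap with.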
  • An inverse STP filtering is performed on this segment in a filter circuit 28 in a manner known per se with the aid of the STP coefficients received, the filter coefficients from the previous segment being used for the first ten samples.
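  • The text leaves the inverse STP filter to the skilled reader. The sketch below assumes a conventional all-pole synthesis filter, s[n] = e[n] + Σ a_k · s[n−k], and only illustrates the switch from the previous segment's coefficients to the current ones after ten samples. The sign convention, the filter order and the name `inverse_stp` are assumptions.

```python
def inverse_stp(residue, coeffs, prev_coeffs, state):
    """All-pole synthesis over one segment of reconstructed residue.

    residue:     residue samples e[n] of the segment
    coeffs:      predictor coefficients a_1..a_p of the current segment
    prev_coeffs: coefficients of the previous segment (used for n < 10)
    state:       last p output samples of the previous segment, newest last
    """
    p = len(coeffs)
    if len(prev_coeffs) != p or len(state) != p:
        raise ValueError("coefficient sets and state must all have length p")
    memory = list(state)
    output = []
    for n, e in enumerate(residue):
        a = prev_coeffs if n < 10 else coeffs  # first ten samples: old filter
        s = e + sum(a[k] * memory[-1 - k] for k in range(p))
        output.append(s)
        memory.append(s)
        memory.pop(0)
    return output
```

    With a first-order filter, `prev_coeffs = [0.0]` and `coeffs = [0.5]`, a residue of ones passes through unchanged for ten samples and only then starts to accumulate.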
  • The output signal of the filter 28 is converted in a digital/analog converter 29 into an analog signal which is fed via a low-pass filter 30 to a loudspeaker 31 which gives a high-fidelity reproduction of the speech signal supplied to the microphone 1, it having been possible to transmit said speech signal in coded form with a low number of bits due to the measures according to the invention.
  • If desired, a circuit 23' can be included between the circuits 23 and 24 to first subject the value of D received by the decoder additionally to a number of operations in order to obtain an optimum value of D for the reconstruction of the speech signal. These may be three consecutive operations.
    • 1) If the series of values of D received exhibit a trend, the present D received, if it falls outside said trend by a certain margin, is replaced by a value which is in keeping with said trend. Algorithms for determining a trend in a series of consecutive values and for determining a replacement value for a signal which falls outside said trend are well known per se to those skilled in the art.
    • 2) Three intermediate values (I1, I2 and I3) are calculated between two consecutive values of D (D1 and D2), possibly adjusted with the aid of such an algorithm, by means of interpolation. This is done, for example, in the following manner:
      $$I_1 = 0.75 \cdot D_1 + 0.25 \cdot D_2$$
      $$I_2 = 0.5 \cdot D_1 + 0.5 \cdot D_2$$
      $$I_3 = 0.25 \cdot D_1 + 0.75 \cdot D_2$$
      The interpolation is carried out because the spacing D is determined in the coding unit twice per segment. Without interpolation, decoding of four consecutive subsegments would be carried out with the same value of D. If no fundamental regularity is present in the signal in the coding unit, a regularity would consequently wrongly be provided in the decoder during four subsegments. This problem is overcome by the interpolation.
      If fundamental regularity is in fact present in the speech signal, the repetition spacing in the signal will in general vary slowly. Due to the interpolation, the variation in the value of D now also has a smooth nature in the decoder.
    • 3) After equalising the values of D by, if necessary, calculating a replacement value and after interpolation, the calculated spacing D corresponds as well as possible with the actual repetition spacing present in the signal. If, however, said spacing D is less than 30, D is multiplied by an integer chosen such that the result is at least 30. This is necessary because not all the samples of a subsegment situated at a spacing of less than 30 from the present subsegment have been reconstructed yet, so that such a subsegment cannot be used to calculate the phases.
  • Spacings D of less than 30 are nevertheless transmitted because, if the fundamental regularity in the signal encompasses fewer than 30 samples, this prevents the decoded spacing D from assuming values which are mutually unequal multiples of the actual repetition spacing. Such unequal multiples would give the equalisation algorithm less opportunity of detecting a trend.
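  • The three operations of circuit 23' can be sketched as follows. The interpolation weights come from the text. The trend rule (replacing a value that deviates from the median of recent values by more than a relative margin) is purely hypothetical, since the text leaves the algorithm open, and choosing the smallest suitable integer multiplier in the last step is likewise an assumption. Rounding of fractional spacings is not specified in the text and is not attempted here.

```python
import math
from statistics import median

MIN_SPACING = 30  # subsegment length; shorter spacings reach unreconstructed samples


def replace_outlier(recent, d, margin=0.2):
    """Step 1, HYPOTHETICAL rule: fall back to the median of recent values
    when d deviates from it by more than `margin` (relative)."""
    if not recent:
        return d
    trend = median(recent)
    return trend if abs(d - trend) > margin * trend else d


def interpolate(d1, d2):
    """Step 2: the three intermediate values I1, I2, I3 between D1 and D2."""
    return [0.75 * d1 + 0.25 * d2,
            0.50 * d1 + 0.50 * d2,
            0.25 * d1 + 0.75 * d2]


def lift_to_minimum(d, minimum=MIN_SPACING):
    """Step 3: multiply D by the smallest integer that makes it >= minimum."""
    if d <= 0:
        raise ValueError("spacing must be positive")
    return d * math.ceil(minimum / d)
```

    For example, `interpolate(40, 48)` gives 42, 44 and 46, and `lift_to_minimum(12)` gives 36, so a short repetition spacing is mapped onto a multiple of itself that lies entirely in the reconstructed past.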

Claims (10)

  1. Method for coding a sampled analog signal having a repetitive nature, in which the sampled signal is split into consecutive segments each containing a predetermined number of samples; in which a short-term prediction analysis is performed on said segments and in which the coefficients determined in said short-term prediction analysis are transmitted and are also fed to a short-term prediction filter, in which a long-term prediction analysis is performed on a residual signal available at an output of said filter and the information determined in said long-term prediction analysis is also transmitted, and in which the information present in the residual signal is coded and transmitted, characterised in that the residual signal is transformed into the frequency domain, in that the amplitudes of at least a number of the frequency components obtained by the transformation into the frequency domain are combined to form a smaller number of frequency components in a manner such that the frequencies associated with the combined amplitudes are situated equidistantly on a linear Bark scale, and in that a signal is transmitted which is representative of said combined amplitudes.
  2. Method for decoding a signal coded by the method according to Claim 1, in which the long-term prediction analysis information received and the other information received from the residual signal are combined and the combined signal, together with the short-term prediction analysis coefficients received, is fed to an inverse short-term prediction filter at whose output a series of samples is delivered which is representative of the sampled analog signal, characterised in that original amplitudes in the frequency domain are reconstructed from the received amplitude values which are combined according to claim 1, in that the information transmitted as a result of the long-term prediction analysis is used to calculate the phase values associated with said amplitudes, and in that the calculated phase values, together with the associated amplitudes are transformed to the time domain.
  3. Method according to Claim 1, characterised in that the amplitudes of thirteen frequency components A1 to A13 inclusive obtained by the transformation into the frequency domain are transformed into amplitudes B1 to B4 inclusive of four frequency components situated equidistantly on a Bark scale in accordance with:
    $$B_1 = \sqrt{A_1^2 + A_2^2}$$
    $$B_2 = \sqrt{A_3^2 + A_4^2 + A_5^2}$$
    $$B_3 = \sqrt{A_6^2 + A_7^2 + A_8^2 + A_9^2}$$
    $$B_4 = \sqrt{A_{10}^2 + A_{11}^2 + A_{12}^2 + A_{13}^2}$$
    and in that these values for B are transmitted after quantisation.
  4. Method according to Claim 3, characterised in that a scaling factor G is calculated for the four frequency components B1 to B4 inclusive situated equidistantly on a Bark scale in accordance with:
    $$G = \sqrt{B_1^2 + B_2^2 + B_3^2 + B_4^2}$$
    in that this value for G is quantised, and in that the values of B1 to B4 inclusive are divided by the quantised scaling factor before they are quantised.
  5. Method according to Claim 2 and 3 or 4, characterised in that combined amplitude values B1' to B4' are constructed from the information received, in that amplitude values A1' to A13' inclusive are obtained therefrom in accordance with:
    $$A'_1 = A'_2 = B'_1 / \sqrt{2}$$
    $$A'_3 = A'_4 = A'_5 = B'_2 / \sqrt{3}$$
    $$A'_6 = A'_7 = A'_8 = A'_9 = B'_3 / 2$$
    $$A'_{10} = A'_{11} = A'_{12} = A'_{13} = B'_4 / 2$$
    and in that the information transmitted as a result of the long-term prediction analysis is representative of the number of samples D which is situated between the starting instant of a group of samples found with the aid of the long-term prediction analysis and transmitted earlier and the starting instant of a group of samples to be decoded.
  6. Method according to Claim 5, characterised in that the group of samples transmitted earlier which is situated at a spacing D with respect to a group of samples to be decoded is transformed to the frequency domain, in that the phase value is determined of at least a number of the frequency components calculated with said transformation, in that said phase values are combined with the amplitude values A1' to A13' inclusive, and in that these combinations are transformed back into the time domain.
  7. Method according to Claim 5 or 6, characterised in that the variation in the received values of D is equalised according to a predetermined algorithm by, if necessary, calculating a replacement value for a received value of D and in that three intermediate values are calculated for D between two consecutive values of D by means of interpolation.
  8. Method according to Claim 7, characterised in that three intermediate values I1, I2 and I3 are calculated from the known values D1 and D2 in accordance with:
    $$I_1 = 0.75 \cdot D_1 + 0.25 \cdot D_2$$
    $$I_2 = 0.50 \cdot D_1 + 0.50 \cdot D_2$$
    $$I_3 = 0.25 \cdot D_1 + 0.75 \cdot D_2$$
  9. Apparatus for coding a sampled analog signal having a repetitive nature, comprising
    - splitting means for splitting the sampled signal into consecutive segments each containing a predetermined number of samples,
    - short-term prediction means for performing a short-term prediction analysis on said segments and for generating coefficients,
    - a short-term prediction filter for receiving the coefficients determined in said short-term prediction analysis,
    - long-term prediction means for performing a long-term prediction analysis on a residual signal available at an output of said filter,
    - coding means for coding the information present in the residual signal, which information is to be transmitted, and
    - an output for transmitting the coefficients, the information determined in said long-term prediction analysis and the information present in the residual signal,
    characterised in that the apparatus comprises
    - transformation means for transforming the residual signal into the frequency domain,
    - combination means for combining the amplitudes of at least a number of the frequency components obtained by the transformation into the frequency domain to form a smaller number of frequency components in a manner such that the frequencies associated with the combined amplitudes are situated equidistantly on a linear Bark scale, a signal which is representative of said combined amplitudes being transmittable.
  10. Apparatus for decoding a signal coded by the method according to claim 1, comprising
    - an input for receiving long-term prediction analysis information and other information received from the residual signal and short-term prediction analysis coefficients,
    - combination means for combining the long-term prediction analysis information and the other information received from the residual signal into a combined signal,
    - an inverse short-term prediction filter for receiving the combined signal and the short-term prediction analysis coefficients and for generating at its output a series of samples which is representative of the sampled analog signal,
    characterised in that the apparatus comprises
    - reconstruction means for reconstructing original amplitudes in the frequency domain from the received amplitude values which are combined according to claim 1,
    - calculation means for using the information transmitted as a result of the long-term prediction analysis to calculate the phase values associated with said amplitudes, and
    - transformation means for transforming the calculated phase values, together with the associated amplitudes, into the time domain.
EP91202675A 1990-10-23 1991-10-16 Method for coding and decoding a sampled analog signal having a repetitive nature and a device for coding and decoding by said method Expired - Lifetime EP0482699B1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
NL9002308A NL9002308A (en) 1990-10-23 1990-10-23 METHOD FOR CODING AND DECODING A SAMPLED ANALOGUE SIGNAL WITH A REPEATING CHARACTER AND AN APPARATUS FOR CODING AND DECODING ACCORDING TO THIS METHOD
NL9002308 1990-10-23

Publications (3)

Publication Number Publication Date
EP0482699A2 EP0482699A2 (en) 1992-04-29
EP0482699A3 EP0482699A3 (en) 1992-08-19
EP0482699B1 EP0482699B1 (en) 1997-08-20

Family

ID=19857866

Family Applications (1)

Application Number Title Priority Date Filing Date
EP91202675A Expired - Lifetime EP0482699B1 (en) 1990-10-23 1991-10-16 Method for coding and decoding a sampled analog signal having a repetitive nature and a device for coding and decoding by said method

Country Status (11)

Country Link
EP (1) EP0482699B1 (en)
JP (1) JP2958726B2 (en)
AT (1) ATE157188T1 (en)
CA (1) CA2053133C (en)
DE (1) DE69127339T2 (en)
DK (1) DK0482699T3 (en)
ES (1) ES2106051T3 (en)
FI (1) FI105623B (en)
NL (1) NL9002308A (en)
NO (1) NO305188B1 (en)
PT (1) PT99294A (en)

Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH07261797A (en) * 1994-03-18 1995-10-13 Mitsubishi Electric Corp Signal encoding device and signal decoding device
JPH09127995A (en) * 1995-10-26 1997-05-16 Sony Corp Signal decoding method and signal decoder
JP2000165251A (en) * 1998-11-27 2000-06-16 Matsushita Electric Ind Co Ltd Audio signal coding device and microphone realizing the same
FI116992B (en) 1999-07-05 2006-04-28 Nokia Corp Methods, systems, and devices for enhancing audio coding and transmission
EP1113432B1 (en) * 1999-12-24 2011-03-30 International Business Machines Corporation Method and system for detecting identical digital data
CN114519996B (en) * 2022-04-20 2022-07-08 北京远鉴信息技术有限公司 Method, device and equipment for determining voice synthesis type and storage medium

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPS5650398A (en) * 1979-10-01 1981-05-07 Hitachi Ltd Sound synthesizer
US4742550A (en) * 1984-09-17 1988-05-03 Motorola, Inc. 4800 BPS interoperable relp system
JP2892462B2 (en) * 1990-08-27 1999-05-17 沖電気工業株式会社 Code-excited linear predictive encoder

Also Published As

Publication number Publication date
EP0482699A3 (en) 1992-08-19
NO305188B1 (en) 1999-04-12
FI914993A (en) 1992-04-24
NO914105L (en) 1992-04-24
DE69127339D1 (en) 1997-09-25
CA2053133A1 (en) 1992-04-24
PT99294A (en) 1994-01-31
CA2053133C (en) 1996-05-21
ES2106051T3 (en) 1997-11-01
NO914105D0 (en) 1991-10-18
JP2958726B2 (en) 1999-10-06
FI914993A0 (en) 1991-10-23
EP0482699A2 (en) 1992-04-29
DK0482699T3 (en) 1998-03-30
FI105623B (en) 2000-09-15
JPH05268098A (en) 1993-10-15
NL9002308A (en) 1992-05-18
ATE157188T1 (en) 1997-09-15
DE69127339T2 (en) 1998-01-29

Legal Events

Date Code Title Description
PUAI Public reference made under article 153(3) epc to a published international application that has entered the european phase

Free format text: ORIGINAL CODE: 0009012

AK Designated contracting states

Kind code of ref document: A2

Designated state(s): AT BE CH DE DK ES FR GB GR IT LI LU NL SE

PUAL Search report despatched

Free format text: ORIGINAL CODE: 0009013

AK Designated contracting states

Kind code of ref document: A3

Designated state(s): AT BE CH DE DK ES FR GB GR IT LI LU NL SE

17P Request for examination filed

Effective date: 19930210

17Q First examination report despatched

Effective date: 19950913

GRAG Despatch of communication of intention to grant

Free format text: ORIGINAL CODE: EPIDOS AGRA

GRAH Despatch of communication of intention to grant a patent

Free format text: ORIGINAL CODE: EPIDOS IGRA

GRAH Despatch of communication of intention to grant a patent

Free format text: ORIGINAL CODE: EPIDOS IGRA

GRAA (expected) grant

Free format text: ORIGINAL CODE: 0009210

AK Designated contracting states

Kind code of ref document: B1

Designated state(s): AT BE CH DE DK ES FR GB GR IT LI LU NL SE

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: GR

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 19970820

REF Corresponds to:

Ref document number: 157188

Country of ref document: AT

Date of ref document: 19970915

Kind code of ref document: T

REG Reference to a national code

Ref country code: CH

Ref legal event code: EP

Ref country code: CH

Ref legal event code: NV

Representative=s name: ISLER & PEDRAZZINI AG

REF Corresponds to:

Ref document number: 69127339

Country of ref document: DE

Date of ref document: 19970925

ITF It: translation for a ep patent filed

Owner name: STUDIO MASSARI S.R.L.

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: LU

Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES

Effective date: 19971031

REG Reference to a national code

Ref country code: ES

Ref legal event code: FG2A

Ref document number: 2106051

Country of ref document: ES

Kind code of ref document: T3

ET Fr: translation filed
REG Reference to a national code

Ref country code: DK

Ref legal event code: T3

PLBE No opposition filed within time limit

Free format text: ORIGINAL CODE: 0009261

STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: NO OPPOSITION FILED WITHIN TIME LIMIT

26N No opposition filed
REG Reference to a national code

Ref country code: CH

Ref legal event code: PFA

Free format text: KONINKLIJKE PTT NEDERLAND N.V. TRANSFER- KONINKLIJKE KPN N.V.

REG Reference to a national code

Ref country code: GB

Ref legal event code: IF02

REG Reference to a national code

Ref country code: CH

Ref legal event code: PCAR

Free format text: ISLER & PEDRAZZINI AG;POSTFACH 1772;8027 ZUERICH (CH)

PGFP Annual fee paid to national office [announced via postgrant information from national office to epo]

Ref country code: NL

Payment date: 20101013

Year of fee payment: 20

Ref country code: AT

Payment date: 20101014

Year of fee payment: 20

Ref country code: DK

Payment date: 20101015

Year of fee payment: 20

Ref country code: FR

Payment date: 20101104

Year of fee payment: 20

PGFP Annual fee paid to national office [announced via postgrant information from national office to epo]

Ref country code: CH

Payment date: 20101025

Year of fee payment: 20

Ref country code: DE

Payment date: 20101022

Year of fee payment: 20

PGFP Annual fee paid to national office [announced via postgrant information from national office to epo]

Ref country code: GB

Payment date: 20101021

Year of fee payment: 20

Ref country code: BE

Payment date: 20101013

Year of fee payment: 20

Ref country code: SE

Payment date: 20101014

Year of fee payment: 20

Ref country code: IT

Payment date: 20101026

Year of fee payment: 20

PGFP Annual fee paid to national office [announced via postgrant information from national office to epo]

Ref country code: ES

Payment date: 20101025

Year of fee payment: 20

REG Reference to a national code

Ref country code: DE

Ref legal event code: R071

Ref document number: 69127339

Country of ref document: DE

REG Reference to a national code

Ref country code: DE

Ref legal event code: R071

Ref document number: 69127339

Country of ref document: DE

REG Reference to a national code

Ref country code: NL

Ref legal event code: V4

Effective date: 20111016

BE20 Be: patent expired

Owner name: KONINKLIJKE *PTT NEDERLAND N.V.

Effective date: 20111016

REG Reference to a national code

Ref country code: CH

Ref legal event code: PL

Ref country code: DK

Ref legal event code: EUP

REG Reference to a national code

Ref country code: GB

Ref legal event code: PE20

Expiry date: 20111015

REG Reference to a national code

Ref country code: SE

Ref legal event code: EUG

REG Reference to a national code

Ref country code: AT

Ref legal event code: MK07

Ref document number: 157188

Country of ref document: AT

Kind code of ref document: T

Effective date: 20111016

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: NL

Free format text: LAPSE BECAUSE OF EXPIRATION OF PROTECTION

Effective date: 20111016

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: GB

Free format text: LAPSE BECAUSE OF EXPIRATION OF PROTECTION

Effective date: 20111015

REG Reference to a national code

Ref country code: ES

Ref legal event code: FD2A

Effective date: 20120604

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: ES

Free format text: LAPSE BECAUSE OF EXPIRATION OF PROTECTION

Effective date: 20111017

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: DE

Free format text: LAPSE BECAUSE OF EXPIRATION OF PROTECTION

Effective date: 20111017