WO2002021091A1 - Analyseur de signal de bruit, synthetiseur de signal de bruit, procede d'analyse de signal de bruit et procede de synthese de signal de bruit - Google Patents
- Publication number
- WO2002021091A1 (PCT/JP2001/007630)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- spectrum
- model
- noise signal
- noise
- signal
- Prior art date
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
- G10L19/012—Comfort noise or silence coding
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
- G10L25/48—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use
Definitions
- Noise signal analyzer, noise signal synthesizer, noise signal analysis method, and noise signal synthesis method
- The present invention relates to a noise signal analysis/synthesis apparatus for analyzing and synthesizing a background noise signal superimposed on a speech signal, and to a speech coding apparatus for encoding a speech signal using the analysis/synthesis apparatus.
- FIG. 1 is a block diagram showing the configuration of a conventional coding apparatus that employs the CS-ACELP coding method with DTX control.
- An input speech signal is input to a speech/non-speech discriminator 11, a CS-ACELP speech coder 12, and a silent section coder 13.
- The speech/non-speech discriminator 11 determines whether the input speech signal is in a speech section or a silent section (a section containing only background noise).
- When the speech/non-speech discriminator 11 determines that the section contains speech, the CS-ACELP speech coder 12 performs speech coding on the speech section. The coded data of the speech section is output to the DTX control and multiplexing unit 14.
- The silent section coder 13 encodes the noise signal in a silent section. Using the input speech signal, the silent section coder 13 calculates the same LPC coefficients as those used for coding a speech section, together with the LPC prediction residual energy of the input speech signal, and outputs these to the DTX control and multiplexing unit 14 as the coded data of the silent section.
- The coded data of the silent section is transmitted only intermittently, in sections where a predetermined change in the characteristics (LPC coefficients and energy) of the input signal is detected.
- The DTX control and multiplexing unit 14 uses the outputs of the speech/non-speech discriminator 11, the CS-ACELP speech coder 12, and the silent section coder 13 to control which data is to be transmitted, multiplexes it, and outputs it as transmission data.
- In this way, the CS-ACELP speech coder encodes only the speech sections of the input speech signal, and the silent sections (noise-only sections) of the input speech signal are processed by the silent section coder.
- However, in a receiving device that receives data encoded by such a transmitting device, the quality of the decoded signal with respect to the noise signal in silent sections deteriorates for the following reasons.
- The first factor is that the silent section coder (noise signal analysis/encoding unit) in the transmitting device, like the speech coder, models the signal on a short-interval basis (about 10 to 50 ms) with an AR-type LPC synthesis filter, and the decoded signal is generated by driving that LPC synthesis filter with a noise signal.
- In addition, the receiving device synthesizes (generates) noise using coded data obtained by analyzing the input noise signal only intermittently in the transmitting device.
- An object of the present invention is to represent the noise signal by a statistical model. Specifically, a plurality of stationary noise models are used, each represented by an amplitude spectrum time series that follows a certain statistical distribution, with the duration of each amplitude spectrum time series following another statistical distribution, and the noise signal is expressed as a spectrum sequence that transitions statistically between these stationary noise models.
- FIG. 1 is a block diagram showing the configuration of an encoding apparatus that employs a conventional CS-ACELP encoding system with DTX control.
- FIG. 2 is a block diagram showing a configuration of the noise signal analyzer according to the first embodiment of the present invention.
- FIG. 3 is a block diagram showing a configuration of the noise signal synthesizer according to the first embodiment of the present invention.
- FIG. 4 is a flowchart showing the operation of the noise signal analyzer according to the first embodiment of the present invention.
- FIG. 5 is a flowchart showing the operation of the noise signal synthesizer according to the first embodiment of the present invention.
- FIG. 6 is a block diagram illustrating a configuration of a speech encoding device according to a second embodiment of the present invention.
- FIG. 7 is a block diagram showing a configuration of the speech decoding device according to the second embodiment of the present invention.
- FIG. 8 is a flowchart showing an operation of the speech coding apparatus according to the second embodiment of the present invention.
- FIG. 9 is a flowchart showing the operation of the speech decoding apparatus according to the second embodiment of the present invention.
- FIG. 10 is a block diagram showing the configuration of the noise signal analyzing apparatus according to the third embodiment of the present invention.
- FIG. 11 is a diagram illustrating a spectral model parameter according to the third embodiment of the present invention.
- FIG. 12 is a block diagram showing a configuration of the noise signal synthesizing apparatus according to the third embodiment of the present invention.
- FIG. 13 is a schematic diagram showing the operation of the noise signal analyzer according to the third embodiment of the present invention.
- FIG. 14 is a flowchart showing the operation of the spectrum model parameter calculation / quantization unit according to the third embodiment of the present invention.
- FIG. 15 is a schematic diagram showing an operation of the noise signal synthesizer according to the third embodiment of the present invention.
- FIG. 16 is a block diagram showing a configuration of a speech coding apparatus according to Embodiment 4 of the present invention.
- FIG. 17 is a block diagram showing a configuration of a speech decoding apparatus according to Embodiment 4 of the present invention.
- FIG. 18 is a flowchart showing the operation of the speech coding apparatus according to the fourth embodiment of the present invention.
- FIG. 19 is a flowchart showing an operation of the speech decoding device according to the fourth embodiment of the present invention.
- As described above, the noise signal is represented by a statistical model. That is, using a plurality of stationary noise models, each represented by an amplitude spectrum time series that follows a certain statistical distribution and whose duration also follows a statistical distribution, the noise signal is expressed as a spectrum sequence that transitions statistically between these models.
- Each stationary noise model i is represented by an amplitude spectrum (vector) time series {Si(n)} (n = 1, ..., Li; i = 1, ..., M). Li indicates the duration of each amplitude spectrum time series {Si(n)} (here expressed as a number of frames). {Si(n)} and Li each follow a statistical distribution, represented here by a normal distribution.
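- For illustration, the following sketch (not part of the patent; the class and variable names NoiseStatModel, Sav, Sdv, Lav, Ldv, and p_trans are our own assumptions) collects these model parameters in Python and shows how a spectrum and a duration could be drawn from the assumed normal distributions.

```python
from dataclasses import dataclass
import numpy as np

@dataclass
class NoiseStatModel:
    """Statistical noise model with M stationary spectrum models.

    Sav, Sdv : (M, n_bins) mean and standard deviation of each model's amplitude spectrum
    Lav, Ldv : (M,) mean and standard deviation of each model's duration Li (in frames)
    p_trans  : (M, M) transition probabilities p(i, j) between models (rows sum to 1)
    """
    Sav: np.ndarray
    Sdv: np.ndarray
    Lav: np.ndarray
    Ldv: np.ndarray
    p_trans: np.ndarray

    def sample_spectrum(self, i: int, rng: np.random.Generator) -> np.ndarray:
        # One amplitude spectrum drawn from model i's normal distribution (clipped at 0)
        return np.maximum(rng.normal(self.Sav[i], self.Sdv[i]), 0.0)

    def sample_duration(self, i: int, rng: np.random.Generator) -> int:
        # Duration Li (number of frames) drawn from model i's normal distribution
        return max(1, int(round(rng.normal(self.Lav[i], self.Ldv[i]))))
```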
- FIG. 2 is a block diagram showing a configuration of the noise signal analyzer according to the first embodiment of the present invention.
- The windowing unit 101 applies windowing to the input noise signal of the m-th frame (m = 0, 1, 2, ...), which is input for each fixed interval (hereinafter referred to as a "frame").
- The FFT (Fast Fourier Transform) unit 102 converts the windowed input noise signal into a frequency spectrum and calculates the input amplitude spectrum X(m) of the m-th frame.
- The spectrum model sequence calculation unit 104 calculates and outputs the spectrum model number sequence {index(m)} (1 ≤ index(m) ≤ M; m = 0, 1, 2, ...) corresponding to the amplitude spectrum sequence {X(m)} (m = 0, 1, 2, ...) of the input noise signal.
- The duration model and transition probability calculation unit 105 uses the spectrum model number sequence {index(m)} obtained by the spectrum model sequence calculation unit 104 to calculate the statistics of the number of consecutive frames Li for each Si and the transition probabilities between the models.
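- The per-frame analysis path can be sketched as follows (an illustrative sketch only, assuming a Hanning window, 50% frame overlap, and a Euclidean spectral distance; none of these details are fixed by the text above).

```python
import numpy as np

def analyze_frames(noise: np.ndarray, frame_len: int, Sav: np.ndarray) -> list[int]:
    """Assign each frame of the input noise signal to its nearest spectrum model.

    noise : input noise signal samples
    Sav   : (M, frame_len // 2 + 1) average amplitude spectra of the M models
    Returns the spectrum model number sequence {index(m)}.
    """
    window = np.hanning(frame_len)
    hop = frame_len // 2                      # 50% overlap between frames (assumption)
    index_seq = []
    for start in range(0, len(noise) - frame_len + 1, hop):
        frame = noise[start:start + frame_len] * window
        X = np.abs(np.fft.rfft(frame))        # input amplitude spectrum X(m)
        # choose the model i whose average amplitude Sav_i is closest to X(m)
        i = int(np.argmin(np.linalg.norm(Sav - X, axis=1)))
        index_seq.append(i)
    return index_seq
```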
- FIG. 3 is a block diagram showing the configuration of the noise signal synthesizer according to the first embodiment of the present invention.
- The transition sequence generation unit 201 receives the model parameters obtained by the noise signal analyzer shown in FIG. 2 (the average value Lav_i and standard deviation Ldv_i of Li, and the transition probabilities p(i, j) between the Si). Using the transition probabilities p(i, j), it generates a spectrum model number transition sequence {index'(l)} in which the spectrum model Si transitions according to the given transition probabilities p(i, j). For each i = index'(l), the duration control unit 203 generates a number of consecutive frames L that is controlled so as to follow a normal distribution with average value Lav_i and standard deviation Ldv_i.
- The spectrum generation unit 205 applies a random phase to the amplitude spectrum time series of a predetermined time length (number of frames) generated along the transition sequence {index'(l)} as described above, thereby creating a spectrum time series.
- the spectrum generation unit 205 may perform smoothing on the generated amplitude spectrum time series so that the spectrum changes smoothly.
- An IFFT (Inverse Fast Fourier Transform) unit 206 converts the spectrum time series generated by the spectrum generation unit 205 into a time-domain waveform.
- The superposition addition (overlap-add) unit 207 outputs the final synthesized noise signal by overlap-adding the signals of successive frames.
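- A minimal synthesis sketch along the lines of FIG. 3, reusing the NoiseStatModel structure assumed above: a model transition sequence is generated from the transition probabilities, each model is held for a duration drawn from its normal distribution, a random phase is applied to each amplitude spectrum, and the IFFT frames are overlap-added. All function and variable names are illustrative, and the rows of p_trans are assumed to sum to one.

```python
import numpy as np

def synthesize_noise(model, n_frames: int, frame_len: int, seed: int = 0) -> np.ndarray:
    """Generate a noise waveform from the statistical model (cf. FIG. 3)."""
    rng = np.random.default_rng(seed)
    window = np.hanning(frame_len)
    hop = frame_len // 2
    out = np.zeros(hop * n_frames + frame_len)

    M = model.p_trans.shape[0]
    i = int(rng.integers(M))                          # initial spectrum model
    remaining = model.sample_duration(i, rng)         # frames left in model i
    for m in range(n_frames):
        if remaining == 0:                            # transition to the next model
            i = int(rng.choice(M, p=model.p_trans[i]))
            remaining = model.sample_duration(i, rng)
        amp = model.sample_spectrum(i, rng)           # amplitude spectrum for this frame
        phase = rng.uniform(0.0, 2 * np.pi, amp.shape)  # random phase
        frame = np.fft.irfft(amp * np.exp(1j * phase), n=frame_len)
        out[m * hop:m * hop + frame_len] += frame * window  # overlap-add
        remaining -= 1
    return out
```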
- FIG. 4 is a flowchart showing an operation of the noise signal analyzer according to the first embodiment of the present invention.
- FIG. 5 is a flowchart showing an operation of the noise signal synthesizing device according to the first embodiment of the present invention.
- First, the input noise signal of the m-th frame (m = 0, 1, 2, ...) is windowed by the windowing unit 101 using a Hanning window or the like.
- The windowed input noise signal is subjected to an FFT (Fast Fourier Transform) by the FFT unit 102 and converted into a frequency spectrum, whereby the input amplitude spectrum X(m) of the m-th frame is calculated.
- The corresponding spectrum model number sequence is obtained by finding, for each frame, the number i of the spectrum model Si whose average amplitude Sav_i has the smallest distance from the input amplitude spectrum X(m). The above processing of ST301 to ST304 is performed frame by frame.
- The duration model and transition probability calculation unit 105 then calculates the statistical parameters of the number of consecutive frames Li for each Si (the average value Lav_i and standard deviation Ldv_i of Li) and the transition probabilities p(i, j) between the Si.
- These are output as the model parameters of the input noise signal. The model parameters are calculated and transmitted at regular or arbitrary intervals.
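- The duration statistics and transition probabilities can be estimated from the model number sequence {index(m)} roughly as in the following sketch (illustrative only; the text above does not prescribe this particular estimator).

```python
import numpy as np

def duration_and_transition_stats(index_seq: list[int], M: int):
    """Estimate Lav_i, Ldv_i and p(i, j) from the spectrum model number sequence."""
    runs = [[] for _ in range(M)]           # run lengths Li observed for each model i
    counts = np.zeros((M, M))               # transition counts between consecutive runs
    run_start = 0
    for m in range(1, len(index_seq) + 1):
        if m == len(index_seq) or index_seq[m] != index_seq[run_start]:
            i = index_seq[run_start]
            runs[i].append(m - run_start)   # one completed run of model i
            if m < len(index_seq):
                counts[i, index_seq[m]] += 1
            run_start = m
    Lav = np.array([np.mean(r) if r else 1.0 for r in runs])
    Ldv = np.array([np.std(r) if r else 0.0 for r in runs])
    row_sums = counts.sum(axis=1, keepdims=True)
    p_trans = np.divide(counts, row_sums,
                        out=np.full_like(counts, 1.0 / M), where=row_sums > 0)
    return Lav, Ldv, p_trans
```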
- The model parameters obtained by the noise signal analyzer (the average value Lav_i and standard deviation Ldv_i of Li, and the transition probabilities p(i, j) between the Si) are input to the transition sequence generation unit 201 and to the duration control unit 203. The transition sequence generation unit 201 uses the transition probabilities p(i, j) among the input model parameters to generate a spectrum model number transition sequence {index'(l)} in which the spectrum model Si transitions according to the given transition probabilities p(i, j). The duration control unit 203 uses the statistical parameters of the number of consecutive frames Li for the spectrum model Si (average value Lav_i, standard deviation Ldv_i) among the input model parameters and, for each i = index'(l), generates a number of consecutive frames L controlled to follow a normal distribution with average value Lav_i and standard deviation Ldv_i.
- Next, a random phase is generated by the random phase generation unit 204. The spectrum generation unit 205 then uses the statistical parameters of the spectrum model Si (the average amplitude Sav_i and the standard deviation Sdv_i) to generate an amplitude spectrum time series for each index'(l), as shown in Expression (1). The generated amplitude spectrum time series may be smoothed so that the spectrum changes smoothly.
- The random phase generated in ST404 is applied to the amplitude spectrum time series of a predetermined time length (number of frames) generated along the transition sequence {index'(l)}, thereby creating a spectrum time series.
- The created spectrum time series is converted into a time-domain waveform by the IFFT unit 206, and then, in ST407, overlap-add of the signals between frames is performed. The overlap-added signal is output as the final synthesized noise signal.
- As described above, according to the present embodiment, the background noise signal is represented by a statistical model. That is, the noise signal analyzer (transmitting-side device) generates, from the noise signal, statistical information (statistical model parameters) that includes the temporal change of the noise signal spectrum, and transmits the generated information to the noise signal synthesizer (receiving-side device). The noise signal synthesizer (receiving-side device) synthesizes a noise signal using this information (statistical model parameters) transmitted from the noise signal analyzer (transmitting-side device). As a result, the noise signal synthesizer can synthesize the noise signal using statistical information that includes the temporal change of the noise signal spectrum, rather than only a spectrum of the noise signal analyzed intermittently, so that a noise signal with little perceptual deterioration can be synthesized.
- The above description applies to the noise signal analysis/synthesis apparatus having the configurations shown in FIGS. 2 and 3 and to the noise signal analysis/synthesis methods shown in FIGS. 4 and 5.
- In the present embodiment, the statistical model of the spectrum Si has been described as being prepared by preliminary learning as the spectrum model information. It is also possible to quantize the spectrum model using other spectral representation parameters such as LPC coefficients and to transmit it to the synthesis side.
- It is also possible to prepare, in advance, patterns of the statistical parameters of the spectrum duration (the average Lav_i and standard deviation Ldv_i of Li) and of the statistical transition parameters between the spectrum models Si, select the pattern appropriate for the input noise signal over a certain period, transmit it, and perform synthesis based on it.
- In Embodiment 2, the case will be described where a speech coding apparatus is realized using the noise signal analysis apparatus described in Embodiment 1, and a speech decoding apparatus is realized using the noise signal synthesis apparatus described in Embodiment 1.
- FIG. 6 is a block diagram showing a configuration of the speech encoding device according to the second embodiment of the present invention.
- An input speech signal is input to a speech/non-speech determiner 501, a speech encoder 502, and a noise signal encoder 503.
- The speech/non-speech determiner 501 determines whether the input speech signal is in a speech section or a silent section (a section containing only noise) and outputs the determination result.
- Any speech/non-speech determiner 501 may be used; in general, the determination is made using the instantaneous values or amounts of change of a plurality of parameters such as the input signal power, spectrum, and pitch period.
- When the result of the determination by the speech/non-speech determiner 501 is speech, the speech encoder 502 performs speech encoding on the input speech signal and outputs the encoded data to the DTX control and multiplexing unit 504.
- The speech encoder 502 is an encoder for speech sections and may be any encoder that encodes speech with high efficiency.
- When the result of the determination by the speech/non-speech determiner 501 is silence, the noise signal encoder 503 encodes the input speech signal and outputs encoded model parameter data for the input noise signal.
- This noise signal encoder 503 differs from the noise signal analyzer described in Embodiment 1 (see FIG. 2) in that a configuration is added for quantizing and encoding the output model parameters and outputting the encoded data.
- The DTX control and multiplexing unit 504 uses the outputs of the speech/non-speech determiner 501, the speech encoder 502, and the noise signal encoder 503 to control which information is to be transmitted, multiplexes the transmission information, and outputs the transmission data.
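- The DTX control described here can be sketched as follows (an illustrative sketch only; the callables vad, speech_enc, and noise_analyzer and the model_interval parameter are assumptions, not elements of the patent). Speech frames are always encoded and sent, while noise model parameters are sent only intermittently.

```python
def encode_stream(frames, vad, speech_enc, noise_analyzer, model_interval=100):
    """Per-frame DTX control sketch.

    frames         : iterable of input signal frames
    vad(frame)     : returns True for a speech frame, False for a noise-only frame
    speech_enc     : encodes one speech frame into a payload
    noise_analyzer : accumulates noise frames and emits model parameters on demand
    """
    packets = []
    noise_frame_count = 0
    for frame in frames:
        if vad(frame):
            packets.append(("SPEECH", speech_enc(frame)))
            noise_frame_count = 0
        else:
            noise_analyzer.update(frame)
            noise_frame_count += 1
            # transmit noise model parameters only intermittently (DTX)
            if noise_frame_count % model_interval == 1:
                packets.append(("NOISE_MODEL", noise_analyzer.encode_model_params()))
            # otherwise nothing is transmitted for this frame
    return packets
```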
- FIG. 7 is a block diagram showing a configuration of the speech decoding device according to the second embodiment of the present invention.
- The transmission data transmitted by the speech coding apparatus shown in FIG. 6 is input to the demultiplexing and DTX control unit 601 as received data. The demultiplexing and DTX control unit 601 separates the received data into the speech coded data or noise model coding parameters required for speech decoding and noise generation, and a speech/non-speech determination flag.
- When the speech/non-speech determination flag indicates a speech section, the speech decoder 602 performs speech decoding using the speech coded data and outputs decoded speech.
- When the speech/non-speech determination flag indicates a silent section, the noise signal decoder 603 generates and outputs a noise signal using the noise model coding parameters.
- This noise signal decoder 603 differs from the noise signal synthesizer described in Embodiment 1 (FIG. 3) in that a configuration is added for decoding the input model coding parameters into the respective model parameters.
- The output switch 604 switches between the output of the speech decoder 602 and the output of the noise signal decoder 603 in accordance with the speech/non-speech determination flag, and outputs the selected signal.
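- On the receiving side, the switching between decoded speech and synthesized noise can be sketched as follows (illustrative only; in a full DTX system the noise synthesizer keeps generating frames locally between intermittent model updates, which this sketch simplifies, and the object interfaces shown are assumptions).

```python
def decode_stream(packets, speech_dec, noise_synth):
    """Receiver-side sketch: switch between decoded speech and synthesized noise.

    packets     : list of ("SPEECH", payload) or ("NOISE_MODEL", params) tuples
    speech_dec  : decodes one speech payload into a frame of samples
    noise_synth : update_model() stores decoded model parameters,
                  generate_frame() synthesizes one noise frame from them
    """
    output_frames = []
    for kind, payload in packets:
        if kind == "SPEECH":
            output_frames.append(speech_dec(payload))
        else:  # silent section: refresh the noise model, then synthesize noise
            noise_synth.update_model(payload)
            output_frames.append(noise_synth.generate_frame())
    return output_frames
```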
- FIG. 8 is a flowchart showing an operation of the speech coding apparatus according to the second embodiment of the present invention.
- A speech signal is input for each frame, and in ST702 it is determined whether the input speech signal is in a speech section or a silent section (a section containing only noise); the determination result is output.
- The speech/non-speech determination may be performed by any method; in general, it uses the instantaneous values or amounts of change of a plurality of parameters such as the input signal power, spectrum, and pitch period.
- This speech encoding process encodes the speech section and may use any method that encodes speech with high efficiency.
- The noise signal encoding differs from the noise signal analysis method described in Embodiment 1 in that a step is added for quantizing and encoding the output model parameters and outputting the encoded data.
- Control of the information to be transmitted (DTX control) and multiplexing of the transmission information are performed using the speech/non-speech determination result and the outputs of the speech encoding and noise signal encoding, and the result is output as transmission data in ST706.
- FIG. 9 is a flowchart showing an operation of the speech decoding device according to the second embodiment of the present invention.
- Transmission data obtained by encoding the input signal on the encoding side is input as received data. The received data is separated into the speech coded data or noise model coding parameters required for speech decoding and noise generation, and a speech/non-speech determination flag.
- When the speech/non-speech determination flag indicates a speech section, speech decoding is performed using the speech coded data in ST804, and decoded speech is output. When the speech/non-speech determination flag indicates a silent section, a noise signal is generated in ST805 using the noise model coding parameters and output.
- This noise signal decoding process differs from the noise signal synthesis method described in Embodiment 1 in that a step is added for decoding the input model coding parameters into the respective model parameters.
- The output of the speech decoding in ST804 or of the noise signal decoding in ST805 is output as the decoded signal according to the speech/non-speech determination flag.
- As described above, according to the present embodiment, speech can be encoded with high quality in speech sections, and in silent sections the noise signal is encoded and decoded using a noise signal analysis apparatus and synthesis apparatus that cause little perceptual deterioration, so that high-quality coding can be performed even in a background noise environment. Since the statistical characteristics of the noise signal in an actual ambient noise environment can be assumed to be constant over a relatively long period (for example, several seconds to several tens of seconds), a transmission period of that order is sufficient for the model parameters; therefore, the amount of information on the noise signal model parameters to be transmitted to the decoding side can be reduced, and efficient transmission can be realized.
- FIG. 10 is a block diagram illustrating a configuration of the noise signal analyzer according to the third embodiment of the present invention.
- The windowing unit 901 applies windowing to the input noise signal of the m-th frame (m = 0, 1, 2, ...), which is input for each fixed interval (hereinafter referred to as a "frame"). The FFT (Fast Fourier Transform) unit 902 converts the windowed input noise signal into a frequency spectrum and calculates the input amplitude spectrum X(m) of the m-th frame.
- The spectrum model parameter calculation and quantization unit 903 divides the amplitude spectrum sequence {X(m)} (m = 0, 1, 2, ...) of the input noise signal into unit intervals for modeling (modeling intervals), each consisting of a fixed number of frames or of a number of frames determined adaptively by some index, and calculates and quantizes the spectrum model parameters in each modeling interval.
- It also outputs the spectrum model number sequence {index(m)} (1 ≤ index(m) ≤ M; m = mk, mk+1, mk+2, ..., mk+NFRM-1, where mk is the first frame number of the modeling interval and NFRM is the number of frames in the modeling interval) corresponding to the amplitude spectrum sequence {X(m)} of the input noise signal.
- Using the spectrum model number sequence {index(m)} of the modeling interval obtained by the spectrum model parameter calculation and quantization unit 903, the duration model and transition probability calculation and quantization unit 904 calculates the statistical parameters of the number of consecutive frames Li for each Si (the duration model parameters: the average value Lav_i and standard deviation Ldv_i of Li) and the transition probabilities p(i, j) between Si and Sj, quantizes them, and outputs their quantization indexes.
- The quantization method is arbitrary; for example, each element of Lav_i, Ldv_i, and p(i, j) may be scalar-quantized.
- The above quantization indexes of the spectrum model parameters, the duration model parameters, and the transition probability parameters are output as the statistical model parameter quantization indexes of the input noise signal in the modeling interval.
- FIG. 11 is a block diagram illustrating a detailed configuration of the spectral model parameter calculation / quantization unit 903 of FIG. 10.
- The power normalization unit 1002 normalizes the power of the input amplitude spectrum using the power value calculated by the power calculation unit 1001. The clustering unit 1004 then performs clustering (vector quantization) of the power-normalized input amplitude spectrum, using the representative vectors in the noise spectrum representative vector storage unit 1003 as the cluster centers, and outputs information on which cluster each input spectrum belongs to. The spectrum model number sequence is generated as a sequence of numbers belonging to the top M clusters, based on the sequence of cluster (representative vector) numbers assigned by the clustering unit 1004. Frames that do not belong to the top M clusters can be handled by any method (for example, by re-clustering them into one of the M clusters, replacing them with the cluster number of the previous frame, or deleting them from the sequence).
- The modeling interval average power quantization unit 1006 averages the per-frame power values calculated by the power calculation unit 1001 over the entire modeling interval, quantizes the average power by an arbitrary method such as scalar quantization, and outputs the power quantization index and the quantized modeling interval average power E. Then, in the error spectrum and power correction value quantization unit 1007, each Sav_i is expressed, as shown in Expression (2), by the error spectrum di from the corresponding representative vector Ci, the modeling interval average power E, and a power correction value ei relative to E for each spectrum model, and di and ei are quantized by an arbitrary method such as scalar quantization.
- The M representative vector indexes obtained by the cluster-wise average spectrum calculation unit 1005, the error spectrum quantization indexes and power correction value quantization indexes obtained by the error spectrum and power correction value quantization unit 1007, and the power quantization index obtained by the modeling interval average power quantization unit 1006 are output as the quantization indexes of the spectrum model parameters.
- As the standard deviation Sdv_i, the within-cluster standard deviation for Ci obtained when the noise spectrum representative vectors were learned can be used as it is; by storing this value in the noise spectrum representative vector storage unit in advance, it becomes unnecessary to output a quantization index for it. Alternatively, the cluster-wise average spectrum calculation unit 1005 may also calculate and quantize the within-cluster standard deviation when calculating the average spectrum; in this case, its quantization index is output as a part of the quantization indexes of the spectrum model parameters.
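- The spectrum model parameter quantization of FIG. 11 can be sketched roughly as follows (illustrative only: the codebook, the scalar quantizer step counts and ranges, and the exact form of the power correction are assumptions, and Expression (2) itself is not reproduced in this text).

```python
import numpy as np

def quantize_spectrum_models(frames_spec, codebook, M, n_steps=64):
    """Sketch of spectrum model parameter calculation and quantization.

    frames_spec : (n_frames, n_bins) amplitude spectra of one modeling interval
    codebook    : (K, n_bins) noise spectrum representative vectors Ci
    Returns the top-M representative vector indexes, quantized error spectra di,
    quantized power correction values ei, and the interval average power E.
    """
    power = np.mean(frames_spec ** 2, axis=1)            # per-frame power
    E = float(np.mean(power))                            # modeling-interval average power
    norm_spec = frames_spec / np.sqrt(power[:, None])    # power normalization

    # clustering (vector quantization) against the representative vectors Ci
    dists = np.linalg.norm(norm_spec[:, None, :] - codebook[None, :, :], axis=2)
    labels = np.argmin(dists, axis=1)
    # keep only the M most frequent clusters in this interval
    top = np.argsort(np.bincount(labels, minlength=len(codebook)))[::-1][:M]

    d_idx, e_idx = [], []
    for i in top:
        members = norm_spec[labels == i]
        Sav_i = members.mean(axis=0) if len(members) else codebook[i]
        di = Sav_i - codebook[i]                         # error spectrum from Ci
        ei = float(np.mean(power[labels == i]) / E) if len(members) else 1.0
        # uniform scalar quantization of di and ei (illustrative ranges)
        d_idx.append(np.clip(np.round((di + 1.0) / 2.0 * (n_steps - 1)),
                             0, n_steps - 1).astype(int))
        e_idx.append(int(np.clip(round(ei * (n_steps - 1) / 4.0), 0, n_steps - 1)))
    return top, d_idx, e_idx, E
```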
- FIG. 12 is a block diagram showing the configuration of the noise signal synthesizer according to the third embodiment of the present invention. In the noise signal synthesizer shown in FIG. 12, the transition sequence generation unit 1101 decodes the transition probabilities p(i, j) from their quantization index, among the statistical model parameter quantization indexes obtained by the noise signal analyzer shown in FIG. 10, and generates a spectrum model number transition sequence {index'(l)} (1 ≤ index'(l) ≤ M; l = 0, 1, 2, ...) in which the spectrum model Si transitions according to the given transition probabilities p(i, j).
- The average amplitude Sav_i and the standard deviation Sdv_i (i = 1, ..., M), which are the statistical parameters of the spectrum model Si, are decoded from the quantization indexes of the spectrum model parameters.
- The decoding of the average amplitude Sav_i is performed based on Expression (2), using the quantization indexes obtained by the spectrum model parameter calculation and quantization unit 903 of the encoder and the representative vectors in a noise spectrum representative vector storage unit, identical to that on the encoding side, provided in the spectrum model parameter decoding unit 1103.
- As for the standard deviation Sdv_i, if the encoding device uses the within-cluster standard deviation for Ci obtained during the learning of the noise spectrum representative vectors as it is, the corresponding value is decoded by obtaining it from the noise spectrum representative vector storage unit 1003.
- The spectrum S_index'(l) is assumed to follow a normal distribution with average amplitude Sav_i and standard deviation Sdv_i for i = index'(l). The duration control unit 1102 uses the decoded duration model parameters (the average value Lav_i and standard deviation Ldv_i of Li) to generate, for each i = index'(l), a number of consecutive frames L controlled to follow a normal distribution with average value Lav_i and standard deviation Ldv_i.
- The spectrum generation unit 1105 creates a spectrum time series by applying the random phase generated by the random phase generation unit 1104 to the amplitude spectrum time series of a predetermined time length (equal to the number of frames NFRM in the modeling interval) generated along the transition sequence {index'(l)} as described above. The spectrum generation unit 1105 may smooth the generated amplitude spectrum time series so that the spectrum changes smoothly.
- the IFFT (Inverse Fast Fourier Transform) unit 1106 converts the spectrum time series created by the spectrum generation unit 111 into time-domain waveforms.
- The overlap-add unit 1107 outputs the final synthesized noise signal by overlap-adding the signals of successive frames.
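- The corresponding decoding on the synthesis side could look like the following sketch, which simply inverts the illustrative quantizers assumed in the encoder sketch above; since Expression (2) is not reproduced in this text, the reconstruction formula shown is an assumption.

```python
import numpy as np

def decode_spectrum_models(top, d_idx, e_idx, E, codebook, n_steps=64):
    """Sketch of spectrum model parameter decoding on the synthesis side."""
    Sav = []
    for i, dq, eq in zip(top, d_idx, e_idx):
        di = dq / (n_steps - 1) * 2.0 - 1.0     # dequantized error spectrum
        ei = eq * 4.0 / (n_steps - 1)           # dequantized power correction value
        # reconstruct the average amplitude spectrum of model i (assumed formula)
        Sav.append((codebook[i] + di) * np.sqrt(E * ei))
    return np.asarray(Sav)
```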
- The windowing unit 901 windows the input noise signal of the m-th frame (m = 0, 1, 2, ...) with a Hanning window or the like.
- The FFT unit 902 performs an FFT (Fast Fourier Transform) on the windowed input noise signal and converts it into a frequency spectrum, whereby the input amplitude spectrum X(m) of the m-th frame is calculated.
- In the spectrum model parameter calculation and quantization unit 903, the amplitude spectrum sequence {X(m)} (m = 0, 1, 2, ...) of the input noise signal is divided into unit intervals for modeling (modeling intervals), each consisting of a fixed number of frames or of a number of frames determined adaptively by some index. The spectrum model parameters in each modeling interval are calculated and quantized, the quantization indexes of the spectrum model parameters are output, and the spectrum model number sequence {index(m)} (1 ≤ index(m) ≤ M; m = mk, mk+1, mk+2, ..., mk+NFRM-1, where mk is the first frame number of the modeling interval and NFRM is the number of frames in the modeling interval) corresponding to the amplitude spectrum sequence {X(m)} of the input noise signal is also output.
- Using the spectrum model number sequence {index(m)}, the duration model and transition probability calculation and quantization unit 904 calculates and quantizes the statistical parameters of the number of consecutive frames Li for each Si (the duration model parameters: the average value Lav_i and standard deviation Ldv_i of Li) and the transition probabilities p(i, j) between Si and Sj, and outputs their quantization indexes.
- The quantization method is arbitrary; for example, each element of Lav_i, Ldv_i, and p(i, j) may be scalar-quantized.
- The above quantization indexes of the spectrum model parameters, the duration model parameters, and the transition probability parameters are output as the statistical model parameter quantization indexes of the input noise signal in the modeling interval.
- FIG. 14 is a flowchart showing the detailed operation of the spectrum model parameter calculation and quantization unit 903 (ST122).
- In the spectrum model parameter calculation and quantization unit 903, clustering (vector quantization) of the power-normalized input amplitude spectra of the modeling interval is performed using, as cluster centers, the representative vectors of a previously prepared representative vector set of amplitude spectra representing noise signals, and information on which cluster each input spectrum belongs to is output. Then, in ST1305, the cluster-wise average spectrum calculation unit 1005 generates the spectrum model number sequence from the sequence of cluster (representative vector) numbers assigned by the clustering unit 1004. This number sequence is generated as a sequence of numbers belonging to the top M clusters; frames that do not belong to the top M clusters are handled in an arbitrary manner (for example, by re-clustering them into one of the M clusters, associating them with the cluster number of the previous frame, or deleting them from the sequence).
- The modeling interval average power quantization unit 1006 averages the per-frame power values calculated by the power calculation unit 1001 over the entire modeling interval, quantizes the average power by an arbitrary method such as scalar quantization, and outputs the power quantization index and the quantized modeling interval average power E.
- Then, in the error spectrum and power correction value quantization unit 1007, each Sav_i is expressed, as shown in Expression (2), by the error spectrum di from the corresponding representative vector Ci, the modeling interval average power E, and a power correction value ei relative to E for each spectrum model, and di and ei are quantized by an arbitrary method such as scalar quantization.
- In the quantization of the error spectrum di, di may be divided into a plurality of bands, and the average value of each band may be scalar-quantized band by band. Then, in ST1308, the M representative vector indexes obtained in ST1305, the error spectrum quantization indexes and power correction value quantization indexes obtained in ST1307, and the power quantization index obtained in ST1306 are output as the quantization indexes of the spectrum model parameters.
- As the standard deviation Sdv_i of the spectrum model parameters, the within-cluster standard deviation for Ci obtained when the noise spectrum representative vectors were learned can be used as it is; by storing this value in the noise spectrum representative vector storage unit in advance, it becomes unnecessary to output a quantization index for it.
- Alternatively, the within-cluster standard deviation may be calculated and quantized when the average spectrum is calculated by the cluster-wise average spectrum calculation unit 1005; in this case, its quantization index is output as a part of the quantization indexes of the spectrum model parameters.
- The quantization of the error spectrum has been described here as band-by-band scalar quantization, but it may also be performed by another method, such as vector quantization of the entire band.
- A configuration has been described in which the power information is expressed by the average power of the modeling interval and a correction value for the average power of each model, but the power information may also be represented in another manner.
- Next, the operation of the noise signal synthesizer according to the present embodiment will be described with reference to FIG. 15. First, in ST1401, each quantization index of the statistical model parameters obtained by the noise signal analyzer is input.
- The spectrum model parameter decoding unit 1103 obtains the statistical parameters of the spectrum model Si (the average amplitude Sav_i and the standard deviation Sdv_i) from the quantization indexes of the spectrum model parameters.
- Using the decoded values (the average value Lav_i and standard deviation Ldv_i of Li) obtained from the quantization indexes of the statistical model parameters of the number of consecutive frames Li for the spectrum model Si, the duration control unit 1102 generates, for each i = index'(l), a number of consecutive frames L controlled to follow a normal distribution with average value Lav_i and standard deviation Ldv_i.
- A random phase is generated by the random phase generation unit 1104.
- Using the model number sequence index'(l) obtained in ST1403 and the spectrum model statistical parameters obtained in ST1402, the spectrum generation unit 1105 generates the amplitude spectrum (vector) time series {X'(n)}.
- In ST1408, the overlap-add unit 1107 performs overlap-add of the signals between frames, and the overlap-added signal is output as the final synthesized noise signal.
- As described above, according to the present embodiment, the background noise signal is represented by a statistical model. That is, the noise signal analyzer (transmitting-side device) generates, from the noise signal, statistical information (statistical model parameters) that includes the temporal change of the noise signal spectrum, and transmits the generated information to the noise signal synthesizer (receiving-side device). The noise signal synthesizer (receiving-side device) synthesizes a noise signal using this information (statistical model parameters) transmitted from the noise signal analyzer (transmitting-side device). As a result, the noise signal synthesizer can synthesize the noise signal using statistical information that includes the temporal change of the noise signal spectrum, rather than only a spectrum of the noise signal analyzed intermittently, so that a noise signal with little perceptual deterioration can be synthesized. Furthermore, since the statistical characteristics of the noise signal in an actual ambient noise environment can be assumed to be constant over a relatively long period (for example, several seconds to several tens of seconds), a transmission period of that order is sufficient for the model parameters; therefore, the amount of information on the noise signal model parameters to be transmitted to the decoding side can be reduced, and efficient transmission can be realized.
- In Embodiment 4, the case will be described where a speech coding apparatus is realized using the noise signal analysis apparatus described in Embodiment 3, and a speech decoding apparatus is realized using the noise signal synthesis apparatus described in Embodiment 3.
- FIG. 16 is a block diagram showing a configuration of the speech coding apparatus according to the fourth embodiment of the present invention.
- The input speech signal is input to the speech/non-speech determiner 1501, the speech encoder 1502, and the noise signal encoder 1503.
- The speech/non-speech determiner 1501 determines whether the input speech signal is in a speech section or a silent section (a section containing only noise) and outputs the determination result.
- Any speech/non-speech determiner 1501 may be used; in general, the determination is made using the instantaneous values or amounts of change of a plurality of parameters such as the input signal power, spectrum, and pitch period.
- When the speech/non-speech determiner 1501 determines that the section contains speech, the speech encoder 1502 performs speech encoding on the input speech signal and outputs the encoded data to the DTX control and multiplexing unit 1504.
- The speech encoder 1502 is an encoder for speech sections and may be any encoder that encodes speech with high efficiency.
- When the result of the determination by the speech/non-speech determiner 1501 is silence, the noise signal encoder 1503 performs noise signal encoding on the input speech signal and outputs the quantization indexes of the statistical model parameters for the input noise signal as encoded data.
- As the noise signal encoder 1503, the noise signal analyzer (FIG. 10) described in Embodiment 3 is used.
- The DTX control and multiplexing unit 1504 uses the outputs of the speech/non-speech determiner 1501, the speech encoder 1502, and the noise signal encoder 1503 to control which information is to be transmitted, multiplexes the transmission information, and outputs the transmission data.
- FIG. 17 is a block diagram showing a configuration of the speech decoding device according to the fourth embodiment of the present invention.
- The transmission data transmitted by the speech coding apparatus shown in FIG. 16 is input to the demultiplexing and DTX control unit 1601 as received data. The demultiplexing and DTX control unit 1601 separates the received data into the speech coded data or noise model coding parameters required for speech decoding and noise generation, and a speech/non-speech determination flag.
- When the speech/non-speech determination flag indicates a speech section, the speech decoder 1602 performs speech decoding using the speech coded data and outputs decoded speech.
- When the speech/non-speech determination flag indicates a silent section, the noise signal decoder 1603 generates and outputs a noise signal using the noise model coding parameters. As the noise signal decoder 1603, the noise signal synthesizer (FIG. 12) described in Embodiment 3 is used.
- The output switch 1604 switches between the output of the speech decoder 1602 and the output of the noise signal decoder 1603 in accordance with the speech/non-speech determination flag, and outputs the selected signal.
- FIG. 18 is a flowchart showing an operation of the speech coding apparatus according to the fourth embodiment of the present invention.
- A speech signal is input for each frame in ST1701, and in ST1702 it is determined whether the input speech signal is in a speech section or a silent section (a section containing only noise); the determination result is output.
- This speech/non-speech determination may be performed by any method; in general, it uses the instantaneous values or amounts of change of a plurality of parameters such as the input signal power, spectrum, and pitch period.
- This speech encoding process encodes the speech section and may use any method that encodes speech with high efficiency.
- In ST1704, noise signal encoding is performed on the input speech signal, and the model parameters for the input noise signal are output.
- For the noise signal encoding, the noise signal analysis method described in Embodiment 3 is used.
- Control of the information to be transmitted (DTX control) and multiplexing of the transmission information are performed using the speech/non-speech determination result and the outputs of the speech encoding and noise signal encoding, and the result is output as transmission data in ST1706.
- FIG. 19 is a flowchart showing an operation of the speech decoding device according to the fourth embodiment of the present invention.
- Data obtained by encoding the input signal on the encoding side and transmitting it is received as received data. The received data is separated into the speech coded data or noise model coding parameters required for speech decoding and noise generation, and a speech/non-speech determination flag.
- When the speech/non-speech determination flag indicates a speech section, speech decoding is performed using the speech coded data in ST1804, and decoded speech is output.
- When the speech/non-speech determination flag indicates a silent section, a noise signal is generated using the noise model coding parameters in ST1805 and output.
- For this noise signal decoding process, the noise signal synthesis method described in Embodiment 3 is used.
- The output of the speech decoding in ST1804 or of the noise signal decoding in ST1805 is output as the decoded signal according to the speech/non-speech determination flag.
- In the present embodiment, the decoded signal has been described as being output by switching between the decoded speech signal in speech sections and the synthesized noise signal in silent sections; however, the synthesized noise signal may also be added to the decoded speech signal in speech sections and output.
- It is also possible to provide, on the speech encoding side, means for separating an input speech signal containing a noise signal into a noise signal and a noise-free speech signal, encode the separated speech signal and noise signal, and, on the decoding side, add the noise signal synthesized as in the silent sections to the decoded speech signal in the speech sections and output it.
- As described above, according to the present embodiment, the speech signal is encoded with high quality in speech sections, and in silent sections the noise signal is encoded and decoded using a noise signal analyzer and synthesizer that cause little perceptual deterioration, so that high-quality coding can be achieved even in a background noise environment. Since the statistical characteristics of the noise signal in an actual ambient noise environment can be assumed to be constant over a relatively long period (for example, several seconds to several tens of seconds), a transmission period of that order is sufficient for the model parameters; therefore, the amount of information on the noise signal model parameters to be transmitted to the decoding side can be reduced, and efficient transmission can be realized.
- The processing performed by the noise signal analyzer and noise signal synthesizer described in Embodiments 1 and 3, and the processing performed by the speech coding apparatus and speech decoding apparatus described in Embodiments 2 and 4, may be realized by software (a program), and this software (program) can be stored in a computer-readable recording medium.
- The present invention is suitable for a noise signal analysis/synthesis apparatus that analyzes and synthesizes a background noise signal superimposed on a speech signal, and for a speech coding apparatus that encodes the speech signal using the analysis/synthesis apparatus.
Landscapes
- Engineering & Computer Science (AREA)
- Computational Linguistics (AREA)
- Signal Processing (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Human Computer Interaction (AREA)
- Physics & Mathematics (AREA)
- Acoustics & Sound (AREA)
- Multimedia (AREA)
- Compression, Expansion, Code Conversion, And Decoders (AREA)
Description
Claims
Priority Applications (3)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
AU2001282616A AU2001282616A1 (en) | 2000-09-06 | 2001-09-04 | Noise signal analyzer, noise signal synthesizer, noise signal analyzing method, and noise signal synthesizing method |
EP01961335A EP1258715B1 (en) | 2000-09-06 | 2001-09-04 | Noise signal analyzer, noise signal synthesizer, noise signal analyzing method, and noise signal synthesizing method |
US10/129,076 US6934650B2 (en) | 2000-09-06 | 2001-09-04 | Noise signal analysis apparatus, noise signal synthesis apparatus, noise signal analysis method and noise signal synthesis method |
Applications Claiming Priority (4)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP2000-270588 | 2000-09-06 | ||
JP2000270588 | 2000-09-06 | ||
JP2001070148A JP3670217B2 (ja) | 2000-09-06 | 2001-03-13 | 雑音符号化装置、雑音復号装置、雑音符号化方法および雑音復号方法 |
JP2001-70148 | 2001-03-13 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2002021091A1 true WO2002021091A1 (fr) | 2002-03-14 |
Family
ID=26599385
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/JP2001/007630 WO2002021091A1 (fr) | 2000-09-06 | 2001-09-04 | Analyseur de signal de bruit, synthetiseur de signal de bruit, procede d'analyse de signal de bruit et procede de synthese de signal de bruit |
Country Status (5)
Country | Link |
---|---|
US (1) | US6934650B2 (ja) |
EP (1) | EP1258715B1 (ja) |
JP (1) | JP3670217B2 (ja) |
AU (1) | AU2001282616A1 (ja) |
WO (1) | WO2002021091A1 (ja) |
Families Citing this family (13)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US7171356B2 (en) * | 2002-06-28 | 2007-01-30 | Intel Corporation | Low-power noise characterization over a distributed speech recognition channel |
JP2004029674A (ja) * | 2002-06-28 | 2004-01-29 | Matsushita Electric Ind Co Ltd | 雑音信号符号化装置及び雑音信号復号化装置 |
US8670988B2 (en) * | 2004-07-23 | 2014-03-11 | Panasonic Corporation | Audio encoding/decoding apparatus and method providing multiple coding scheme interoperability |
CN1815550A (zh) * | 2005-02-01 | 2006-08-09 | 松下电器产业株式会社 | 可识别环境中的语音与非语音的方法及系统 |
CN1953052B (zh) * | 2005-10-20 | 2010-09-08 | 株式会社东芝 | 训练时长预测模型、时长预测和语音合成的方法及装置 |
KR100785471B1 (ko) | 2006-01-06 | 2007-12-13 | 와이더댄 주식회사 | 통신망을 통해 가입자 단말기로 전송되는 오디오 신호의출력 품질 개선을 위한 오디오 신호의 처리 방법 및 상기방법을 채용한 오디오 신호 처리 장치 |
US20080312916A1 (en) * | 2007-06-15 | 2008-12-18 | Mr. Alon Konchitsky | Receiver Intelligibility Enhancement System |
US8190440B2 (en) * | 2008-02-29 | 2012-05-29 | Broadcom Corporation | Sub-band codec with native voice activity detection |
EP2151821B1 (en) * | 2008-08-07 | 2011-12-14 | Nuance Communications, Inc. | Noise-reduction processing of speech signals |
JP6053272B2 (ja) * | 2011-10-19 | 2016-12-27 | オリンパス株式会社 | 顕微鏡装置 |
US10066962B2 (en) | 2013-07-01 | 2018-09-04 | Battelle Energy Alliance, Llc | Apparatus, system, and method for sensor authentication |
CN113066472B (zh) * | 2019-12-13 | 2024-05-31 | 科大讯飞股份有限公司 | 合成语音处理方法及相关装置 |
CN118424363B (zh) * | 2024-07-04 | 2024-09-10 | 深圳深蕾科技股份有限公司 | 基于杂散光数据和光电编码器的设备健康评估方法及系统 |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JPH01502779A (ja) * | 1987-04-03 | 1989-09-21 | アメリカン テレフォン アンド テレグラフ カムパニー | 適応多変数推定装置 |
JPH01502853A (ja) * | 1987-04-03 | 1989-09-28 | アメリカン テレフォン アンド テレグラフ カムパニー | 有声判定装置および有声判定方法 |
JPH0962299A (ja) * | 1995-08-23 | 1997-03-07 | Oki Electric Ind Co Ltd | コード励振線形予測符号化装置 |
JPH09321793A (ja) * | 1996-05-21 | 1997-12-12 | Hewlett Packard Co <Hp> | ネットワークシステム |
JPH1097292A (ja) * | 1996-01-29 | 1998-04-14 | Texas Instr Inc <Ti> | 音声信号伝送方法および不連続伝送システム |
JPH11163744A (ja) * | 1997-11-28 | 1999-06-18 | Oki Electric Ind Co Ltd | ディジタル通信用音声送受信装置 |
Family Cites Families (20)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
GB2102254B (en) * | 1981-05-11 | 1985-08-07 | Kokusai Denshin Denwa Co Ltd | A speech analysis-synthesis system |
US4720802A (en) * | 1983-07-26 | 1988-01-19 | Lear Siegler | Noise compensation arrangement |
US4897878A (en) * | 1985-08-26 | 1990-01-30 | Itt Corporation | Noise compensation in speech recognition apparatus |
US4852181A (en) * | 1985-09-26 | 1989-07-25 | Oki Electric Industry Co., Ltd. | Speech recognition for recognizing the catagory of an input speech pattern |
JPH0636158B2 (ja) * | 1986-12-04 | 1994-05-11 | 沖電気工業株式会社 | 音声分析合成方法及び装置 |
US5761639A (en) * | 1989-03-13 | 1998-06-02 | Kabushiki Kaisha Toshiba | Method and apparatus for time series signal recognition with signal variation proof learning |
US5148489A (en) * | 1990-02-28 | 1992-09-15 | Sri International | Method for spectral estimation to improve noise robustness for speech recognition |
US5465317A (en) * | 1993-05-18 | 1995-11-07 | International Business Machines Corporation | Speech recognition system with improved rejection of words and sounds not in the system vocabulary |
EP0692880B1 (en) * | 1993-11-04 | 2001-09-26 | Sony Corporation | Signal encoder, signal decoder, recording medium and signal encoding method |
US5774846A (en) * | 1994-12-19 | 1998-06-30 | Matsushita Electric Industrial Co., Ltd. | Speech coding apparatus, linear prediction coefficient analyzing apparatus and noise reducing apparatus |
SE507370C2 (sv) * | 1996-09-13 | 1998-05-18 | Ericsson Telefon Ab L M | Metod och anordning för att alstra komfortbrus i linjärprediktiv talavkodare |
JP4006770B2 (ja) | 1996-11-21 | 2007-11-14 | 松下電器産業株式会社 | ノイズ推定装置、ノイズ削減装置、ノイズ推定方法、及びノイズ削減方法 |
JP3464371B2 (ja) | 1996-11-15 | 2003-11-10 | ノキア モービル フォーンズ リミテッド | 不連続伝送中に快適雑音を発生させる改善された方法 |
US5960389A (en) * | 1996-11-15 | 1999-09-28 | Nokia Mobile Phones Limited | Methods for generating comfort noise during discontinuous transmission |
US5924065A (en) * | 1997-06-16 | 1999-07-13 | Digital Equipment Corporation | Environmently compensated speech processing |
US6144937A (en) * | 1997-07-23 | 2000-11-07 | Texas Instruments Incorporated | Noise suppression of speech by signal processing including applying a transform to time domain input sequences of digital signals representing audio information |
JP4216364B2 (ja) | 1997-08-29 | 2009-01-28 | 株式会社東芝 | 音声符号化/復号化方法および音声信号の成分分離方法 |
US6182033B1 (en) * | 1998-01-09 | 2001-01-30 | At&T Corp. | Modular approach to speech enhancement with an application to speech coding |
US6453285B1 (en) * | 1998-08-21 | 2002-09-17 | Polycom, Inc. | Speech activity detector for use in noise reduction system, and methods therefor |
US20020116196A1 (en) * | 1998-11-12 | 2002-08-22 | Tran Bao Q. | Speech recognizer |
-
2001
- 2001-03-13 JP JP2001070148A patent/JP3670217B2/ja not_active Expired - Fee Related
- 2001-09-04 US US10/129,076 patent/US6934650B2/en not_active Expired - Fee Related
- 2001-09-04 AU AU2001282616A patent/AU2001282616A1/en not_active Abandoned
- 2001-09-04 WO PCT/JP2001/007630 patent/WO2002021091A1/ja active IP Right Grant
- 2001-09-04 EP EP01961335A patent/EP1258715B1/en not_active Expired - Lifetime
Non-Patent Citations (1)
Title |
---|
See also references of EP1258715A4 * |
Also Published As
Publication number | Publication date |
---|---|
US20020165681A1 (en) | 2002-11-07 |
JP2002156999A (ja) | 2002-05-31 |
EP1258715A1 (en) | 2002-11-20 |
EP1258715A4 (en) | 2005-10-12 |
EP1258715B1 (en) | 2008-01-30 |
JP3670217B2 (ja) | 2005-07-13 |
AU2001282616A1 (en) | 2002-03-22 |
US6934650B2 (en) | 2005-08-23 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
JP5343098B2 (ja) | スーパーフレーム構造のlpcハーモニックボコーダ | |
KR100647336B1 (ko) | 적응적 시간/주파수 기반 오디오 부호화/복호화 장치 및방법 | |
KR100566713B1 (ko) | 음향 파라미터 부호화, 복호화 방법, 장치 및 프로그램, 음성 부호화, 복호화 방법, 장치 및 프로그램 | |
US7599833B2 (en) | Apparatus and method for coding residual signals of audio signals into a frequency domain and apparatus and method for decoding the same | |
JP4270866B2 (ja) | 非音声のスピーチの高性能の低ビット速度コード化方法および装置 | |
US6678655B2 (en) | Method and system for low bit rate speech coding with speech recognition features and pitch providing reconstruction of the spectral envelope | |
JP4445328B2 (ja) | 音声・楽音復号化装置および音声・楽音復号化方法 | |
CA2918345A1 (en) | Unvoiced/voiced decision for speech processing | |
WO2002021091A1 (fr) | Analyseur de signal de bruit, synthetiseur de signal de bruit, procede d'analyse de signal de bruit et procede de synthese de signal de bruit | |
WO2000077774A1 (fr) | Codeur de signaux de bruit et codeur de signaux vocaux | |
EP2009623A1 (en) | Speech coding | |
KR20050006883A (ko) | 광대역 음성 부호화기 및 그 방법과 광대역 음성 복호화기및 그 방법 | |
JP3353852B2 (ja) | 音声の符号化方法 | |
JP4578145B2 (ja) | 音声符号化装置、音声復号化装置及びこれらの方法 | |
JP3348759B2 (ja) | 変換符号化方法および変換復号化方法 | |
JP3916934B2 (ja) | 音響パラメータ符号化、復号化方法、装置及びプログラム、音響信号符号化、復号化方法、装置及びプログラム、音響信号送信装置、音響信号受信装置 | |
JP2797348B2 (ja) | 音声符号化・復号化装置 | |
JP2004246038A (ja) | 音声楽音信号符号化方法、復号化方法、符号化装置、復号化装置、符号化プログラム、および復号化プログラム | |
KR101377667B1 (ko) | 오디오/스피치 신호의 시간 도메인에서의 부호화 방법 | |
KR20080092823A (ko) | 부호화/복호화 장치 및 방법 | |
JP2002073097A (ja) | Celp型音声符号化装置とcelp型音声復号化装置及び音声符号化方法と音声復号化方法 | |
JP5724338B2 (ja) | 符号化装置および符号化方法、復号装置および復号方法、並びにプログラム | |
JP2002169595A (ja) | 固定音源符号帳及び音声符号化/復号化装置 | |
KR20080034819A (ko) | 부호화/복호화 장치 및 방법 | |
JPH11249696A (ja) | 音声符号化/復号化方法 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
AK | Designated states |
Kind code of ref document: A1 Designated state(s): AE AG AL AM AT AU AZ BA BB BG BR BY BZ CA CH CN CO CR CU CZ DE DK DM DZ EC EE ES FI GB GD GE GH GM HR HU ID IL IN IS KE KG KP KR KZ LC LK LR LS LT LU LV MA MD MG MK MN MW MX MZ NO NZ PH PL PT RO RU SD SE SG SI SK SL TJ TM TR TT TZ UA UG US UZ VN YU ZA ZW |
|
AL | Designated countries for regional patents |
Kind code of ref document: A1 Designated state(s): GH GM KE LS MW MZ SD SL SZ TZ UG ZW AM AZ BY KG KZ MD RU TJ TM AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE TR BF BJ CF CG CI CM GA GN GQ GW ML MR NE SN TD TG |
|
WWE | Wipo information: entry into national phase |
Ref document number: 10129076 Country of ref document: US |
|
WWE | Wipo information: entry into national phase |
Ref document number: 2001961335 Country of ref document: EP |
|
121 | Ep: the epo has been informed by wipo that ep was designated in this application | ||
WWP | Wipo information: published in national office |
Ref document number: 2001961335 Country of ref document: EP |
|
REG | Reference to national code |
Ref country code: DE Ref legal event code: 8642 |
|
WWG | Wipo information: grant in national office |
Ref document number: 2001961335 Country of ref document: EP |