WO2011065741A2 - Audio signal processing method and apparatus - Google Patents
Audio signal processing method and apparatus
- Publication number
- WO2011065741A2 WO2011065741A2 PCT/KR2010/008336 KR2010008336W WO2011065741A2 WO 2011065741 A2 WO2011065741 A2 WO 2011065741A2 KR 2010008336 W KR2010008336 W KR 2010008336W WO 2011065741 A2 WO2011065741 A2 WO 2011065741A2
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- output signal
- memory
- current frame
- synthesis
- parameter
- Prior art date
Links
- 230000005236 sound signal Effects 0.000 title claims abstract description 29
- 238000003672 processing method Methods 0.000 title claims abstract description 11
- 230000015572 biosynthetic process Effects 0.000 claims description 45
- 238000003786 synthesis reaction Methods 0.000 claims description 45
- 238000000034 method Methods 0.000 claims description 35
- 230000005284 excitation Effects 0.000 claims description 19
- 238000013213 extrapolation Methods 0.000 claims description 10
- 238000013139 quantization Methods 0.000 claims description 4
- 230000007774 longterm Effects 0.000 abstract description 6
- 238000010586 diagram Methods 0.000 description 6
- 230000002194 synthesizing effect Effects 0.000 description 5
- 238000004458 analytical method Methods 0.000 description 4
- 230000003595 spectral effect Effects 0.000 description 4
- 230000003044 adaptive effect Effects 0.000 description 3
- 230000005540 biological transmission Effects 0.000 description 3
- 239000002131 composite material Substances 0.000 description 3
- 230000000694 effects Effects 0.000 description 2
- 239000003623 enhancer Substances 0.000 description 2
- 238000012986 modification Methods 0.000 description 2
- 230000004048 modification Effects 0.000 description 2
- 230000002238 attenuated effect Effects 0.000 description 1
- 230000001427 coherent effect Effects 0.000 description 1
- 238000004891 communication Methods 0.000 description 1
- 238000013500 data storage Methods 0.000 description 1
- 230000003287 optical effect Effects 0.000 description 1
- 238000001228 spectrum Methods 0.000 description 1
- 230000001360 synchronised effect Effects 0.000 description 1
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
- G10L19/005—Correction of errors induced by the transmission channel, if related to the coding algorithm
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
- G10L19/02—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using spectral analysis, e.g. transform vocoders or subband vocoders
- G10L19/028—Noise substitution, i.e. substituting non-tonal spectral components by noisy source
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
- G10L19/04—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using predictive techniques
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
- G10L19/04—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using predictive techniques
- G10L19/08—Determination or coding of the excitation function; Determination or coding of the long-term prediction parameters
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
- G10L19/04—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using predictive techniques
- G10L19/08—Determination or coding of the excitation function; Determination or coding of the long-term prediction parameters
- G10L19/12—Determination or coding of the excitation function; Determination or coding of the long-term prediction parameters the excitation function being a code excitation, e.g. in code excited linear prediction [CELP] vocoders
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
- G10L19/04—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using predictive techniques
- G10L19/08—Determination or coding of the excitation function; Determination or coding of the long-term prediction parameters
- G10L19/12—Determination or coding of the excitation function; Determination or coding of the long-term prediction parameters the excitation function being a code excitation, e.g. in code excited linear prediction [CELP] vocoders
- G10L19/125—Pitch excitation, e.g. pitch synchronous innovation CELP [PSI-CELP]
-
- G—PHYSICS
- G11—INFORMATION STORAGE
- G11B—INFORMATION STORAGE BASED ON RELATIVE MOVEMENT BETWEEN RECORD CARRIER AND TRANSDUCER
- G11B20/00—Signal processing not specific to the method of recording or reproducing; Circuits therefor
- G11B20/10—Digital recording or reproducing
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
- G10L2019/0001—Codebooks
Definitions
- the present invention relates to an audio signal processing method and apparatus capable of encoding or decoding an audio signal.
- The transmission of audio signals is often intended for real-time conversation, so the less delay introduced by encoding and decoding the audio signal, the better.
- The present invention has been devised to solve the above problems, and an object of the present invention is to provide an audio signal processing method and apparatus for concealing frame loss at the receiving end. Another object of the present invention is to provide an audio signal processing method and apparatus for minimizing the propagation, into the next frame, of an error caused by a signal arbitrarily generated to conceal frame loss.
- the present invention provides the following effects and advantages.
- According to the receiver-based loss concealment method, since no additional bits of side information are required for frame error concealment, loss can be effectively concealed even in a low-bit-rate environment.
- FIG. 1 is a block diagram of an audio signal processing apparatus according to an embodiment of the present invention.
- FIG. 2 is a flowchart of an audio signal processing method according to an embodiment of the present invention.
- FIG. 3 is a detailed block diagram of an error concealment unit 130 according to an embodiment of the present invention.
- FIG. 4 is a flowchart of an error concealment step (S400).
- FIG. 5 is a diagram for explaining signals generated by the error concealment unit according to an embodiment of the present invention.
- FIG. 6 is a detailed block diagram of a re-encoding unit 140 according to an embodiment of the present invention.
- FIG. 9 is a flowchart of a decoding step (S700).
- FIG. 10 is a diagram for explaining signals generated by a decoding unit according to an embodiment of the present invention.
- In the audio signal processing method, an audio signal including data of a current frame is received; if an error occurs in the data of the current frame, a first temporary output signal of the current frame is generated by performing frame error concealment on the data of the current frame using a random codebook; a parameter is generated by performing one or more of short-term prediction, long-term prediction, and fixed codebook search based on the first temporary output signal; and a memory for the next frame is updated with the parameter, wherein the parameter includes one or more of pitch gain, pitch delay, fixed codebook gain, and fixed codebook.
- When an error occurs in the data of the current frame, the method may further include generating a second temporary output signal of the current frame by performing an extrapolation method on a past input signal, and selecting the first temporary output signal or the second temporary output signal according to the voice characteristic of the previous frame. The parameter can then be generated from the selected temporary output signal by performing one or more of short-term prediction, long-term prediction, and fixed codebook search.
- The voice characteristic of the previous frame concerns whether the voiced characteristic or the unvoiced characteristic is dominant; the voiced characteristic may be dominant when the pitch gain is large and the change in pitch delay is small.
- The memory may include a memory for long-term prediction and a memory for short-term prediction, and may also include a memory used for parameter quantization of the prediction techniques.
- The method may further include generating a final output signal for the current frame by performing one or more of fixed codebook acquisition, adaptive codebook synthesis, and short-term synthesis using the parameter.
- The method may further include updating the memory with the excitation signal obtained through the long-term synthesis and the fixed codebook synthesis, and with the final output signal.
- The method may further include performing one or more of long-term synthesis and short-term synthesis on the next frame based on the memory.
- The apparatus includes a de-multiplexer that receives an audio signal including data of a current frame and checks whether an error has occurred in the data of the current frame;
- An error concealment unit generating a first temporary output signal of the current frame by performing a frame error concealment on the data of the current frame when an error occurs in the data of the current frame;
- a re-encoding unit for generating a parameter by performing at least one of short-term prediction, long-term prediction, and fixed codebook search based on the first temporary output signal;
- a decoder which updates the memory for the next frame with the parameter, wherein the parameter includes one or more of pitch gain, pitch delay, fixed codebook gain, and fixed codebook.
- The error concealment unit may include an extrapolation unit for generating, when an error occurs in the data of the current frame, a second temporary output signal of the current frame by performing an extrapolation method on the past input signal; and a selector configured to select the first temporary output signal or the second temporary output signal according to a voice characteristic of a previous frame, wherein the parameter is generated from the selected one of the first temporary output signal and the second temporary output signal by performing one or more of short-term prediction, long-term prediction, and fixed codebook search.
- The voice characteristic of the previous frame concerns whether the voiced characteristic or the unvoiced characteristic is dominant; the voiced characteristic may be dominant when the pitch gain is large and the change in pitch delay is small.
- The memory may include a memory for long-term prediction and a memory for short-term prediction, and may also include a memory used for parameter quantization of the prediction techniques.
- The decoding unit may generate a final output signal for the current frame by performing at least one of fixed codebook acquisition, adaptive codebook synthesis, and short-term synthesis using the parameter.
- The decoder may update the memory with the excitation signal obtained through the long-term synthesis and the fixed codebook synthesis, and with the final output signal.
- The decoder may further perform one or more of long-term synthesis and short-term synthesis on the next frame based on the memory.
- Coding can be interpreted as encoding or decoding depending on context, and information is a term encompassing values, parameters, coefficients, elements, and the like; meanings may be interpreted differently in some cases, but the present invention is not limited thereto.
- The audio signal is, in a broad sense, a concept distinguished from a video signal, and refers to a signal that can be identified aurally during reproduction.
- In a narrow sense, an audio signal is a concept distinguished from a speech signal, and means a signal having little or no speech characteristics.
- The audio signal in the present invention should be interpreted broadly, and can be understood as a narrow-sense audio signal when used separately from a speech signal.
- Coding may also refer to encoding only, but may be used as a concept including both encoding and decoding.
- FIG. 1 is a diagram showing the configuration of an audio signal processing apparatus according to an embodiment of the present invention, and FIG. 2 is a flowchart showing the procedure of an audio signal processing method according to an embodiment of the present invention.
- The audio signal processing apparatus 100 includes an error concealment unit 130 and a re-encoding unit 140, and may further include a de-multiplexer 110 and a decoding unit 120.
- each component will be described with reference to FIGS. 1 and 2.
- the de-multiplexer 110 receives an audio signal including data of a current frame through a network (S100).
- Channel decoding is performed on the packet of the received audio signal, and it is checked whether an error has occurred (step S200).
- the de-multiplexer 110 transmits the data of the received current frame to the decoder 120 or the error concealment unit 130 according to an error check result (BFI: bad frame indicator).
- the error concealment unit 130 generates a temporary output signal by performing error concealment on the current frame using the random codebook and past information (step S400).
- the process performed by the error concealment unit 130 will be described in detail later with reference to FIGS. 3 to 5.
- The re-encoding unit 140 generates an encoded parameter by performing re-encoding on the temporary output signal (step S500).
- The re-encoding may include one or more of short-term prediction, long-term prediction, and codebook search, and the parameter may include one or more of pitch gain, pitch delay, fixed codebook gain, and fixed codebook.
- The re-encoding unit 140 transmits the encoded parameter to the decoding unit 120 (step S600).
- The decoder 120 performs decoding on the data of the current frame extracted from the bitstream (step S700). Alternatively, decoding is performed based on the encoded parameter for the current frame received from the re-encoding unit 140 (step S700). The operation of the decoder 120 in step S700 will be described in detail later with reference to FIGS. 8 to 10.
- FIG. 3 is a detailed block diagram of the error concealment unit 130 according to an embodiment of the present invention, FIG. 4 is a flowchart of the error concealment step (S400), and FIG. 5 is a diagram for explaining signals generated according to an embodiment of the present invention.
- The error concealment unit 130 may include a long-term synthesis unit 132, a random signal generating unit 134, an enhancer 136, a short-term synthesis unit 138, an extrapolation unit 138-2, and a selector 139.
- The long-term synthesis unit 132 first obtains a random pitch gain (g_pa) and a random pitch delay (D_a) (step S410).
- The pitch gain and pitch delay are parameters generated by long-term prediction (LTP), and the long-term prediction synthesis filter can be expressed by the following equation:
- [Equation 1] 1 / P(z) = 1 / (1 − g_p · z^(−D))
- where g_p is the pitch gain and D is the pitch delay.
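As an illustration of the recursion behind Equation 1, the long-term synthesis can be sketched as a one-tap pitch-predictor feedback loop. This is a minimal sketch, not the patent's actual implementation; the function name and buffer handling are assumptions.

```python
def ltp_synthesis(excitation, past, g_p, D):
    """Long-term (pitch) synthesis per Equation 1:
    out[n] = excitation[n] + g_p * out[n - D].

    `past` holds at least D previous output samples (oldest first),
    so the filter can look back D samples across the frame boundary.
    """
    buf = list(past)            # output history
    out = []
    for x in excitation:
        y = x + g_p * buf[-D]   # one-tap long-term predictor feedback
        out.append(y)
        buf.append(y)
    return out
```

With a single pulse as excitation, the output repeats every D samples with gain g_p, which is exactly the pitch-periodic behavior the adaptive codebook models.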
- When no error occurs, the received pitch gain and the received pitch delay constitute the adaptive codebook and are substituted into Equation 1 above.
- Since an error has occurred, the long-term synthesis unit 132 may substitute the random pitch gain g_pa and the random pitch delay D_a in place of the received pitch gain and the received pitch delay.
- The random pitch gain (g_pa) may be the pitch gain value of the previous frame, or may be calculated as a weighted sum giving the largest weight to the most recent gain value among the stored values, but the present invention is not limited thereto.
- In step S420, the long-term synthesis unit 132 performs long-term synthesis, for which the past excitation signal of the previous frame received from the decoder 120 may be used. Referring to FIG. 5(A), an example is shown of the long-term synthesis signal (adaptive codebook) of the previous frame and the long-term synthesis signal g_pa·v(n) of the current frame generated based on the random pitch delay and random pitch gain.
- The random signal generator 134 generates a random codebook signal g_ca·rand(n), using a random codebook gain g_ca and a random codebook rand(n), to replace the fixed codebook signal (step S430).
- The random codebook gain g_ca may likewise be obtained as a weighted sum giving the largest weight to the most recent gain value among the values stored for previous frames, and the weighted sum may be appropriately attenuated according to the characteristics of the speech signal.
- the present invention is not limited thereto.
- Referring to FIG. 5(B), an example of a fixed codebook replacement signal g_ca·rand(n) generated with a random codebook gain g_ca and a random codebook rand(n) is shown.
- An error-concealed excitation signal u_fec(n) is generated using the adaptive codebook signal generated in step S420 and the random codebook signal generated in step S430 (step S440):
- u_fec(n) = g_pa · v(n) + g_ca · rand(n)
- where u_fec(n) is the error-concealed excitation signal, g_pa is the random pitch gain (adaptive codebook gain), v(n) is the adaptive codebook, g_ca is the random codebook gain, and rand(n) is the random codebook.
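The construction of the error-concealed excitation can be sketched as follows. This is an illustrative stand-in, assuming a uniform random codebook; the function name, seeding, and codebook distribution are assumptions, not the patent's specification.

```python
import random

def concealed_excitation(v, g_pa, g_ca, n, seed=0):
    """u_fec(n) = g_pa * v(n) + g_ca * rand(n): the adaptive-codebook
    contribution plus a random codebook replacing the lost fixed codebook."""
    rng = random.Random(seed)   # deterministic for reproducibility
    rand_cb = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    return [g_pa * v[i] + g_ca * rand_cb[i] for i in range(n)]
```

Setting g_ca to zero leaves only the scaled adaptive-codebook part, which makes the two-term structure of u_fec(n) easy to verify.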
- The enhancer 136 removes, from the error-concealed excitation signal u_fec(n), artificial artifacts that may be caused by insufficient information, such as in a low-bit-rate mode or the error-concealed case.
- For example, an FIR filter may be applied so that missing pulses in the fixed codebook are compensated naturally, and the gains of the adaptive codebook and the fixed codebook may be adjusted through speech characteristic classification.
- the present invention is not limited thereto.
- The short-term synthesis unit 138 first obtains a spectral vector into which arbitrary short-term prediction coefficients (or arbitrary linear prediction coefficients) have been converted for the current frame.
- The arbitrary short-term prediction coefficients are generated to replace the received short-term prediction coefficients, because an error has occurred in the data of the current frame.
- The arbitrary short-term prediction coefficients are generated based on the short-term prediction coefficients of previous frames, and may be generated according to the following equation, but the present invention is not limited thereto:
- f̂ = Σ_i α_i · f^(−i), with Σ_i α_i = 1
- where f^(−i) is the ISF (Immittance Spectral Frequency) vector of each order corresponding to the short-term prediction coefficients stored for the i-th previous frame, and α_i is a weight.
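The weighted-sum extrapolation of stored ISF vectors can be sketched as below. This is a minimal sketch under the assumption that the weights sum to one with the largest weight on the most recent frame; the function name is hypothetical.

```python
def extrapolate_isf(past_isfs, weights):
    """Arbitrary ISF vector as a weighted sum of stored past ISF vectors.

    past_isfs: list of ISF vectors, most recent first.
    weights:   one weight per stored vector, summing to 1.
    """
    assert abs(sum(weights) - 1.0) < 1e-9
    order = len(past_isfs[0])
    return [sum(w * isf[k] for w, isf in zip(weights, past_isfs))
            for k in range(order)]
```

The resulting vector is converted back to prediction coefficients before being used in the short-term synthesis filter.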
- The short-term synthesis unit 138 then performs short-term prediction synthesis (LPC synthesis) using the arbitrary short-term spectral vector.
- The short-term prediction (STP) synthesis filter may be based on the following equation, but the present invention is not limited thereto:
- [Equation 4] 1 / Â(z) = 1 / (1 − Σ_{i=1..p} â_i · z^(−i))
- where â_i are the arbitrary short-term prediction coefficients and p is the prediction order.
- The excitation signal corresponds to the input signal of the short-term prediction synthesis filter.
- The excitation signal may be passed through the short-term prediction synthesis filter to generate the first temporary output signal.
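The short-term (LPC) synthesis step can be sketched as the all-pole difference equation below. This is an illustrative sketch assuming the convention A(z) = 1 − Σ a_i z^(−i); the function name and history handling are assumptions.

```python
def lpc_synthesis(excitation, a, past_out):
    """Short-term synthesis through 1/A(z):
    out[n] = exc[n] + sum_i a[i] * out[n - 1 - i].

    a:        prediction coefficients a_1..a_p.
    past_out: the p output samples preceding the frame (most recent last).
    """
    p = len(a)
    buf = list(past_out)
    out = []
    for x in excitation:
        y = x + sum(a[i] * buf[-1 - i] for i in range(p))
        out.append(y)
        buf.append(y)
    return out
```

Feeding the concealed excitation u_fec(n) through this filter yields the first temporary output signal described above.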
- The extrapolation unit 138-2 performs extrapolation, which generates a future signal based on past signals, to produce a second temporary output signal for error concealment (step S470).
- Specifically, pitch analysis is performed on the past signal, and one pitch period of the signal is stored.
- A second temporary output signal can then be generated by performing a Pitch Synchronous Overlap and Add (PSOLA) method, but the present invention is not limited to PSOLA in performing the extrapolation.
- The selector 139 selects the target signal of the re-encoding unit 140 from the first temporary output signal and the second temporary output signal (step S480). Voice characteristic classification of the past signal can be performed so that the first temporary output signal is selected for unvoiced sound and the second temporary output signal for voiced sound. For example, if the pitch gain is large and the change in pitch delay is small, the sound can be classified as voiced, but the present invention is not limited thereto.
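The selection rule can be sketched as below. The threshold values are assumptions chosen purely for illustration; the patent only states that a large pitch gain and a small pitch-delay change indicate voiced speech.

```python
def select_signal(sig_random_cb, sig_extrap, pitch_gain, pitch_delay_delta,
                  gain_thr=0.7, delta_thr=4):
    """Pick the extrapolated signal for voiced-like past frames
    (large pitch gain, stable pitch delay), else the random-codebook
    concealment signal. Thresholds are illustrative assumptions."""
    voiced = pitch_gain > gain_thr and abs(pitch_delay_delta) < delta_thr
    return sig_extrap if voiced else sig_random_cb
```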
- The re-encoding unit 140 will be described with reference to FIGS. 6 and 7.
- FIG. 6 is a detailed block diagram of the re-encoding unit 140 according to an embodiment of the present invention, and FIG. 7 is a flowchart of the re-encoding step (S500).
- The re-encoding unit 140 includes one or more of a short-term prediction unit 142, a perceptual weighting filter unit 144, a long-term prediction unit 146, and a fixed codebook search unit 148.
- The short-term prediction unit 142 receives either the first temporary output signal or the second temporary output signal, which is the output signal of the error concealment unit 130 described above with reference to FIG. 1, and performs short-term prediction analysis on the signal (step S510).
- Linear prediction coefficients (LPCs) may be obtained through the short-term prediction analysis.
- Step S510 generates, through the short-term analysis, short-term prediction coefficients that minimize the error of the short-term prediction (STP) filter, that is, the prediction error that is the difference between the original signal and the estimated signal.
- The perceptual weighting filter unit 144 applies a perceptual weighting filter to the residual signal r(n), which is the difference between the short-term prediction signal and the temporary output signal (step S520).
- The perceptual weighting filter may be the filter shown in the following equation:
- W(z) = A(z/γ1) / A(z/γ2)
- where A(z) is the short-term prediction (LPC) analysis filter and γ1, γ2 are bandwidth-expansion weighting factors.
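A direct-form sketch of W(z) = A(z/γ1)/A(z/γ2) follows. It assumes the convention A(z) = 1 − Σ a_i z^(−i); the function name and the default γ values are assumptions typical of CELP coders, not values taken from the patent.

```python
def perceptual_weight(x, a, g1=0.92, g2=0.68):
    """Apply W(z) = A(z/g1) / A(z/g2), with A(z) = 1 - sum_i a_i z^-i.

    a: LPC coefficients a_1..a_p. g1 > g2 expands formant bandwidths,
    shaping quantization noise under the speech spectrum.
    """
    p = len(a)
    out = []
    for n in range(len(x)):
        # FIR part: v(n) = x(n) - sum_i a_i * g1^i * x(n - i)
        v = x[n] - sum(a[i] * g1 ** (i + 1) * x[n - 1 - i]
                       for i in range(p) if n - 1 - i >= 0)
        # IIR part: y(n) = v(n) + sum_i a_i * g2^i * y(n - i)
        y = v + sum(a[i] * g2 ** (i + 1) * out[n - 1 - i]
                    for i in range(p) if n - 1 - i >= 0)
        out.append(y)
    return out
```

When g1 == g2 the filter reduces to the identity, which is a convenient sanity check.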
- The long-term prediction unit 146 first obtains an open-loop pitch delay value D by performing an open-loop search on the weighted input signal to which the perceptual weighting filter has been applied, and then performs a closed-loop search within ±d around D.
- The final long-term prediction delay value T and the corresponding gain are then selected.
- Here, d may be 8 samples, but the present invention is not limited thereto.
- The long-term prediction is preferably performed with the same method as that used at the encoding stage.
- The delay value (pitch delay) D of the long-term prediction may be calculated according to the following equation; the long-term prediction delay D is the value of k at which the expression is maximized:
- [Equation 6] D = argmax_k [ Σ_n d(n) · d(n − k) / √( Σ_n d(n − k)² ) ]
- The long-term prediction gain (pitch gain) may be calculated according to the following equation, using the long-term prediction delay D determined according to Equation 6 above:
- g_p = Σ_n d(n) · v(n) / Σ_n v(n)²
- where D is the long-term prediction delay value (pitch delay), g_p is the long-term prediction gain (pitch gain), and d(n) may be the output signal ŷ(n) in the closed-loop search and the weighted input signal wx(n), to which the perceptual weighting filter has been applied, in the open-loop search.
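The open-loop part of the search can be sketched as a normalized-correlation maximization over candidate delays. This is a minimal sketch; the delay range, summation window, and function name are assumptions for illustration.

```python
def open_loop_pitch(wx, d_min=20, d_max=143):
    """Pick the delay k maximizing the normalized correlation
    sum wx[n]*wx[n-k] / sqrt(sum wx[n-k]^2), per Equation 6."""
    n0 = d_max                      # first sample with a full k-history
    best_k, best_val = d_min, float('-inf')
    for k in range(d_min, d_max + 1):
        num = sum(wx[n] * wx[n - k] for n in range(n0, len(wx)))
        den = sum(wx[n - k] ** 2 for n in range(n0, len(wx))) ** 0.5
        val = num / den if den > 0 else float('-inf')
        if val > best_val:
            best_k, best_val = k, val
    return best_k
```

On a pulse train with period 30, the correlation peaks exactly at a delay of 30 samples; the closed-loop refinement then searches ±d around this estimate.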
- The long-term prediction unit 146 generates the pitch gain (g_p) and the long-term prediction delay value (D) through the above-described process, and obtains the fixed codebook target signal c(n) by removing the adaptive codebook signal from the residual signal r(n) of the short-term prediction:
- c(n) = r(n) − g_p · v(n)
- where c(n) is the fixed codebook target signal, r(n) is the residual signal of the short-term prediction, and v(n) is the pitch signal corresponding to the adaptive codebook delay D. The target signal c(n) is transmitted to the fixed codebook search unit 148.
- v(n) may be an adaptive codebook signal obtained using the long-term predictor from the past excitation signal memory.
- The past memory may be the memory of the decoder 120 described with reference to FIG. 1.
- The fixed codebook search unit 148 generates a fixed codebook gain g_c and a fixed codebook c(n) with respect to the fixed codebook target signal (step S540). In this case, it is preferable to use the same method as that performed in the codebook search at the encoding stage.
- The encoded parameter may be determined by generating the parameters in a closed-loop manner.
- the parameter generated through the above process is transferred to the decoder 120 as described above with reference to FIGS. 1 and 2.
- The decoder 120 includes a switch 121, a long-term synthesis unit 122, a fixed codebook acquisition unit 124, a short-term synthesis unit 126, and a memory 128.
- The switch 121 receives the parameter from the de-multiplexer 110 or the re-encoding unit 140 according to the error check result (BFI) (step S710).
- the parameter received from the de-multiplexer 110 refers to a parameter included in the bitstream and extracted by the de-multiplexer 110.
- The parameter received from the re-encoding unit 140 refers to a parameter that, for a section (for example, a frame) in which an error occurred, was produced through the error concealment performed by the error concealment unit 130 and then encoded by the re-encoding unit 140. The following description is based on the latter case.
- The long-term synthesis unit 122 generates the adaptive codebook signal by performing long-term synthesis based on the long-term prediction gain g_p and the long-term prediction delay D (step S720).
- The operation of the long-term synthesis unit 122 is similar to that of the long-term synthesis unit 132 described above, except that the input parameters are different.
- Referring to FIG. 10(A), an example of a long-term synthesis signal g_p·v(n) generated using the received pitch gain and pitch delay is shown.
- The fixed codebook acquisition unit 124 generates a fixed codebook signal g_c·c(n) using the received fixed codebook gain g_c and fixed codebook parameters (step S730).
- Referring to FIG. 10(B), an example of a fixed codebook signal generated using a fixed codebook gain and a fixed codebook index is shown.
- The excitation signal u(n) is generated by adding the adaptive codebook (pitch) signal and the fixed codebook signal.
- Unlike the error concealment unit, the fixed codebook acquisition unit 124 uses the received fixed codebook rather than a random codebook.
- The short-term synthesis unit 126 performs short-term synthesis based on the short-term prediction coefficients and the signal of the previous frame, and generates the final output signal by synthesizing the excitation signal u(n) through the short-term synthesis filter (step S740). In this case, the following equation may be applied:
- u(n) = g_p · v(n) + g_c · c(n)
- where u(n) is the excitation signal, v(n) is the adaptive codebook signal corresponding to the pitch delay D, g_c is the fixed codebook gain, and c(n) is the fixed codebook with unit magnitude.
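The decoder-side excitation reconstruction can be sketched as below, mirroring the concealment-side combination but with received rather than random parameters. The function name is an assumption for illustration.

```python
def decoder_excitation(v, c, g_p, g_c):
    """u(n) = g_p * v(n) + g_c * c(n): adaptive-codebook plus
    fixed-codebook contributions from the received parameters."""
    return [g_p * v[i] + g_c * c[i] for i in range(len(v))]
```

The resulting u(n) is then fed through the short-term synthesis filter to produce the final output signal, and is also stored in the memory for synthesis of the next frame.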
- the memory 128 may be divided into a memory for error concealment (128-1: not shown) and a memory for decoding (128-2: not shown).
- The error concealment memory 128-1 stores data for the error concealment unit 130 (long-term prediction gain, pitch delay value history, fixed codebook gain, short-term prediction coefficients, etc.).
- The decoding memory 128-2 stores data necessary for the decoding unit 120 to perform decoding (for example, the excitation signal, gain values, and the final output signal of the current frame for synthesis of the next frame). The two memories may be implemented as a single memory 128 rather than being separated. Meanwhile, the decoding memory 128-2 may include a memory for long-term prediction and a memory for short-term prediction: the former holds the excitation signal needed to perform long-term synthesis in the next frame, and the latter holds the data needed for short-term synthesis.
- The audio signal processing method according to the present invention can be produced as a program for execution on a computer and stored in a computer-readable recording medium, and multimedia data having a data structure according to the present invention can also be stored in a computer-readable recording medium.
- the computer readable recording medium includes any type of storage device in which data that can be read by a computer system is stored.
- Examples of computer-readable recording media include ROM, RAM, CD-ROM, magnetic tape, floppy disks, and optical data storage devices, and the medium may also be implemented in the form of a carrier wave (for example, transmission over the Internet).
- the bitstream generated by the encoding method may be stored in a computer-readable recording medium or transmitted through a wired / wireless communication network.
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Signal Processing (AREA)
- Computational Linguistics (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Human Computer Interaction (AREA)
- Acoustics & Sound (AREA)
- Multimedia (AREA)
- Spectroscopy & Molecular Physics (AREA)
- Compression, Expansion, Code Conversion, And Decoders (AREA)
Abstract
Description
Claims
Priority Applications (5)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US13/511,331 US9020812B2 (en) | 2009-11-24 | 2010-11-24 | Audio signal processing method and device |
EP10833553.0A EP2506253A4 (en) | 2009-11-24 | 2010-11-24 | METHOD AND DEVICE FOR PROCESSING AUDIO SIGNAL |
CN201080053308.2A CN102648493B (zh) | 2009-11-24 | 2010-11-24 | 音频信号处理方法和设备 |
KR1020127012638A KR101761629B1 (ko) | 2009-11-24 | 2010-11-24 | 오디오 신호 처리 방법 및 장치 |
US14/687,991 US9153237B2 (en) | 2009-11-24 | 2015-04-16 | Audio signal processing method and device |
Applications Claiming Priority (6)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US26424809P | 2009-11-24 | 2009-11-24 | |
US61/264,248 | 2009-11-24 | ||
US28518309P | 2009-12-10 | 2009-12-10 | |
US61/285,183 | 2009-12-10 | ||
US29516610P | 2010-01-15 | 2010-01-15 | |
US61/295,166 | 2010-01-15 |
Related Child Applications (2)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US13/511,331 A-371-Of-International US9020812B2 (en) | 2009-11-24 | 2010-11-24 | Audio signal processing method and device |
US14/687,991 Continuation US9153237B2 (en) | 2009-11-24 | 2015-04-16 | Audio signal processing method and device |
Publications (2)
Publication Number | Publication Date |
---|---|
WO2011065741A2 true WO2011065741A2 (ko) | 2011-06-03 |
WO2011065741A3 WO2011065741A3 (ko) | 2011-10-20 |
Family
ID=44067093
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/KR2010/008336 WO2011065741A2 (ko) | 2009-11-24 | 2010-11-24 | 오디오 신호 처리 방법 및 장치 |
Country Status (5)
Country | Link |
---|---|
US (2) | US9020812B2 (ko) |
EP (1) | EP2506253A4 (ko) |
KR (1) | KR101761629B1 (ko) |
CN (1) | CN102648493B (ko) |
WO (1) | WO2011065741A2 (ko) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113782050A (zh) * | 2021-09-08 | 2021-12-10 | 浙江大华技术股份有限公司 | 声音变调方法、电子设备及存储介质 |
Families Citing this family (18)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
KR20140067512A (ko) * | 2012-11-26 | 2014-06-05 | 삼성전자주식회사 | 신호 처리 장치 및 그 신호 처리 방법 |
TR201808890T4 (tr) | 2013-06-21 | 2018-07-23 | Fraunhofer Ges Forschung | Bir konuşma çerçevesinin yeniden yapılandırılması. |
CN110265044B (zh) | 2013-06-21 | 2023-09-12 | 弗朗霍夫应用科学研究促进协会 | 在错误隐藏过程中在不同域中改善信号衰落的装置及方法 |
WO2014202539A1 (en) * | 2013-06-21 | 2014-12-24 | Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. | Apparatus and method for improved concealment of the adaptive codebook in acelp-like concealment employing improved pitch lag estimation |
EP2922056A1 (en) | 2014-03-19 | 2015-09-23 | Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. | Apparatus, method and corresponding computer program for generating an error concealment signal using power compensation |
EP2922055A1 (en) * | 2014-03-19 | 2015-09-23 | Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. | Apparatus, method and corresponding computer program for generating an error concealment signal using individual replacement LPC representations for individual codebook information |
EP2922054A1 (en) | 2014-03-19 | 2015-09-23 | Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. | Apparatus, method and corresponding computer program for generating an error concealment signal using an adaptive noise estimation |
TWI602172B (zh) * | 2014-08-27 | 2017-10-11 | 弗勞恩霍夫爾協會 | 使用參數以加強隱蔽之用於編碼及解碼音訊內容的編碼器、解碼器及方法 |
EP3483880A1 (en) | 2017-11-10 | 2019-05-15 | Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. | Temporal noise shaping |
EP3483886A1 (en) | 2017-11-10 | 2019-05-15 | Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. | Selecting pitch lag |
EP3483884A1 (en) | 2017-11-10 | 2019-05-15 | Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. | Signal filtering |
EP3483878A1 (en) | 2017-11-10 | 2019-05-15 | Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. | Audio decoder supporting a set of different loss concealment tools |
EP3483882A1 (en) | 2017-11-10 | 2019-05-15 | Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. | Controlling bandwidth in encoders and/or decoders |
WO2019091573A1 (en) | 2017-11-10 | 2019-05-16 | Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. | Apparatus and method for encoding and decoding an audio signal using downsampling or interpolation of scale parameters |
EP3483883A1 (en) | 2017-11-10 | 2019-05-15 | Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. | Audio coding and decoding with selective postfiltering |
WO2019091576A1 (en) | 2017-11-10 | 2019-05-16 | Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. | Audio encoders, audio decoders, methods and computer programs adapting an encoding and decoding of least significant bits |
EP3483879A1 (en) * | 2017-11-10 | 2019-05-15 | Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. | Analysis/synthesis windowing function for modulated lapped transformation |
CN112992160B (zh) * | 2021-05-08 | 2021-07-27 | 北京百瑞互联技术有限公司 | 一种音频错误隐藏方法及装置 |
Family Cites Families (32)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP3102015B2 (ja) * | 1990-05-28 | 2000-10-23 | 日本電気株式会社 | 音声復号化方法 |
JPH04264597A (ja) * | 1991-02-20 | 1992-09-21 | Fujitsu Ltd | 音声符号化装置および音声復号装置 |
EP1763020A3 (en) * | 1991-06-11 | 2010-09-29 | Qualcomm Incorporated | Variable rate vocoder |
US5615298A (en) * | 1994-03-14 | 1997-03-25 | Lucent Technologies Inc. | Excitation signal synthesis during frame erasure or packet loss |
US5450449A (en) * | 1994-03-14 | 1995-09-12 | At&T Ipm Corp. | Linear prediction coefficient generation during frame erasure or packet loss |
US5699478A (en) * | 1995-03-10 | 1997-12-16 | Lucent Technologies Inc. | Frame erasure compensation technique |
DE69633164T2 (de) * | 1995-05-22 | 2005-08-11 | Ntt Mobile Communications Network Inc. | Tondekoder |
CN1163870C (zh) * | 1996-08-02 | 2004-08-25 | 松下电器产业株式会社 | 声音编码装置和方法,声音译码装置,以及声音译码方法 |
JP3206497B2 (ja) * | 1997-06-16 | 2001-09-10 | 日本電気株式会社 | インデックスによる信号生成型適応符号帳 |
US6810377B1 (en) * | 1998-06-19 | 2004-10-26 | Comsat Corporation | Lost frame recovery techniques for parametric, LPC-based speech coding systems |
JP3319396B2 (ja) * | 1998-07-13 | 2002-08-26 | 日本電気株式会社 | 音声符号化装置ならびに音声符号化復号化装置 |
KR100281181B1 (ko) * | 1998-10-16 | 2001-02-01 | 윤종용 | 약전계에서 코드 분할 다중 접속 시스템의 코덱 잡음 제거 방법 |
US6597961B1 (en) * | 1999-04-27 | 2003-07-22 | Realnetworks, Inc. | System and method for concealing errors in an audio transmission |
US6636829B1 (en) * | 1999-09-22 | 2003-10-21 | Mindspeed Technologies, Inc. | Speech communication system and method for handling lost frames |
JP3478209B2 (ja) * | 1999-11-01 | 2003-12-15 | 日本電気株式会社 | 音声信号復号方法及び装置と音声信号符号化復号方法及び装置と記録媒体 |
CA2290037A1 (en) * | 1999-11-18 | 2001-05-18 | Voiceage Corporation | Gain-smoothing amplifier device and method in codecs for wideband speech and audio signals |
US6584438B1 (en) * | 2000-04-24 | 2003-06-24 | Qualcomm Incorporated | Frame erasure compensation method in a variable rate speech coder |
EP1199709A1 (en) * | 2000-10-20 | 2002-04-24 | Telefonaktiebolaget Lm Ericsson | Error Concealment in relation to decoding of encoded acoustic signals |
US7031926B2 (en) * | 2000-10-23 | 2006-04-18 | Nokia Corporation | Spectral parameter substitution for the frame error concealment in a speech decoder |
JP3582589B2 (ja) * | 2001-03-07 | 2004-10-27 | 日本電気株式会社 | 音声符号化装置及び音声復号化装置 |
CA2388439A1 (en) * | 2002-05-31 | 2003-11-30 | Voiceage Corporation | A method and device for efficient frame erasure concealment in linear predictive based speech codecs |
KR100462024B1 (ko) * | 2002-12-09 | 2004-12-17 | 한국전자통신연구원 | 부가 음성 데이터를 이용한 패킷 손실 복구 방법 및 이를이용한 송수신기 |
US7146309B1 (en) * | 2003-09-02 | 2006-12-05 | Mindspeed Technologies, Inc. | Deriving seed values to generate excitation values in a speech coder |
US7613606B2 (en) * | 2003-10-02 | 2009-11-03 | Nokia Corporation | Speech codecs |
US7873515B2 (en) * | 2004-11-23 | 2011-01-18 | Stmicroelectronics Asia Pacific Pte. Ltd. | System and method for error reconstruction of streaming audio information |
US7519535B2 (en) | 2005-01-31 | 2009-04-14 | Qualcomm Incorporated | Frame erasure concealment in voice communications |
KR100612889B1 (ko) * | 2005-02-05 | 2006-08-14 | 삼성전자주식회사 | 선스펙트럼 쌍 파라미터 복원 방법 및 장치와 그 음성복호화 장치 |
US7831421B2 (en) | 2005-05-31 | 2010-11-09 | Microsoft Corporation | Robust decoder |
US8798172B2 (en) | 2006-05-16 | 2014-08-05 | Samsung Electronics Co., Ltd. | Method and apparatus to conceal error in decoded audio signal |
KR101261528B1 (ko) * | 2006-05-16 | 2013-05-07 | 삼성전자주식회사 | 복호화된 오디오 신호의 오류 은폐 방법 및 장치 |
US8010351B2 (en) | 2006-12-26 | 2011-08-30 | Yang Gao | Speech coding system to improve packet loss concealment |
US8630863B2 (en) * | 2007-04-24 | 2014-01-14 | Samsung Electronics Co., Ltd. | Method and apparatus for encoding and decoding audio/speech signal |
-
2010
- 2010-11-24 KR KR1020127012638A patent/KR101761629B1/ko active IP Right Grant
- 2010-11-24 WO PCT/KR2010/008336 patent/WO2011065741A2/ko active Application Filing
- 2010-11-24 EP EP10833553.0A patent/EP2506253A4/en not_active Ceased
- 2010-11-24 CN CN201080053308.2A patent/CN102648493B/zh not_active Expired - Fee Related
- 2010-11-24 US US13/511,331 patent/US9020812B2/en not_active Expired - Fee Related
-
2015
- 2015-04-16 US US14/687,991 patent/US9153237B2/en not_active Expired - Fee Related
Non-Patent Citations (2)
Title |
---|
None |
See also references of EP2506253A4 |
Also Published As
Publication number | Publication date |
---|---|
EP2506253A4 (en) | 2014-01-01 |
CN102648493B (zh) | 2016-01-20 |
CN102648493A (zh) | 2012-08-22 |
US9020812B2 (en) | 2015-04-28 |
US9153237B2 (en) | 2015-10-06 |
WO2011065741A3 (ko) | 2011-10-20 |
US20120239389A1 (en) | 2012-09-20 |
KR101761629B1 (ko) | 2017-07-26 |
KR20120098701A (ko) | 2012-09-05 |
EP2506253A2 (en) | 2012-10-03 |
US20150221311A1 (en) | 2015-08-06 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
WO2011065741A2 (ko) | 오디오 신호 처리 방법 및 장치 | |
US7496506B2 (en) | Method and apparatus for one-stage and two-stage noise feedback coding of speech and audio signals | |
RU2419891C2 (ru) | Способ и устройство эффективной маскировки стирания кадров в речевых кодеках | |
EP1363273B1 (en) | A speech communication system and method for handling lost frames | |
RU2257556C2 (ru) | Квантование коэффициентов усиления для речевого кодера линейного прогнозирования с кодовым возбуждением | |
US20080297380A1 (en) | Signal decoding apparatus and signal decoding method | |
JP2002328700A (ja) | フレーム消去の隠蔽およびその方法 | |
JP2002541499A (ja) | Celp符号変換 | |
AU2001255422A1 (en) | Gains quantization for a celp speech coder | |
KR20070028373A (ko) | 음성음악 복호화 장치 및 음성음악 복호화 방법 | |
JP3357795B2 (ja) | 音声符号化方法および装置 | |
US7302385B2 (en) | Speech restoration system and method for concealing packet losses | |
US8265929B2 (en) | Embedded code-excited linear prediction speech coding and decoding apparatus and method | |
JP2002509294A (ja) | 暗騒音条件下における音声符号化の方法 | |
JP3426871B2 (ja) | 音声信号のスペクトル形状調整方法および装置 | |
WO2014034697A1 (ja) | 復号方法、復号装置、プログラム、及びその記録媒体 | |
JP2001154699A (ja) | フレーム消去の隠蔽及びその方法 | |
JP2018511086A (ja) | オーディオ信号を符号化するためのオーディオエンコーダー及び方法 | |
EP1397655A1 (en) | Method and device for coding speech in analysis-by-synthesis speech coders | |
JP3754819B2 (ja) | 音声通信方法及び音声通信装置 | |
JPH05165497A (ja) | コード励振線形予測符号化器及び復号化器 | |
WO2005045808A1 (en) | Harmonic noise weighting in digital speech coders | |
KR20050007854A (ko) | 서로 다른 celp 방식의 음성 코덱 간의 상호부호화장치 및 그 방법 | |
JP3350340B2 (ja) | 音声符号化方法および音声復号化方法 | |
JPH10149200A (ja) | 線形予測符号化装置 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
WWE | Wipo information: entry into national phase |
Ref document number: 201080053308.2 Country of ref document: CN |
|
121 | Ep: the epo has been informed by wipo that ep was designated in this application |
Ref document number: 10833553 Country of ref document: EP Kind code of ref document: A1 |
|
ENP | Entry into the national phase |
Ref document number: 20127012638 Country of ref document: KR Kind code of ref document: A |
|
WWE | Wipo information: entry into national phase |
Ref document number: 13511331 Country of ref document: US |
|
NENP | Non-entry into the national phase |
Ref country code: DE |
|
WWE | Wipo information: entry into national phase |
Ref document number: 2010833553 Country of ref document: EP |