WO2015111970A1 - Encoding apparatus and method using residual coding - Google Patents

Encoding apparatus and method using residual coding

Info

Publication number
WO2015111970A1
Authority
WO
WIPO (PCT)
Prior art keywords
signal
harmonic
residual
generating
parameter
Prior art date
Application number
PCT/KR2015/000763
Other languages
English (en)
Korean (ko)
Inventor
박지훈
Original Assignee
재단법인 다차원 스마트 아이티 융합시스템 연구단
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 재단법인 다차원 스마트 아이티 융합시스템 연구단
Publication of WO2015111970A1 (fr)

Links

Images

Classifications

    • G: PHYSICS
    • G10: MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L: SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00: Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/008: Multichannel audio signal coding or decoding using interchannel correlation to reduce redundancy, e.g. joint-stereo, intensity-coding or matrixing
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04S: STEREOPHONIC SYSTEMS
    • H04S3/00: Systems employing more than two channels, e.g. quadraphonic
    • H04S3/008: Systems employing more than two channels, e.g. quadraphonic, in which the audio signals are in digital form, i.e. employing more than two discrete digital channels
    • G: PHYSICS
    • G10: MUSICAL INSTRUMENTS; ACOUSTICS
    • G10H: ELECTROPHONIC MUSICAL INSTRUMENTS; INSTRUMENTS IN WHICH THE TONES ARE GENERATED BY ELECTROMECHANICAL MEANS OR ELECTRONIC GENERATORS, OR IN WHICH THE TONES ARE SYNTHESISED FROM A DATA STORE
    • G10H2210/00: Aspects or methods of musical processing having intrinsic musical character, i.e. involving musical theory or musical parameters or relying on musical knowledge, as applied in electrophonic musical tools or instruments
    • G10H2210/031: Musical analysis, i.e. isolation, extraction or identification of musical elements or musical parameters from a raw acoustic signal or from an encoded audio signal
    • G10H2210/066: Musical analysis, i.e. isolation, extraction or identification of musical elements or musical parameters from a raw acoustic signal or from an encoded audio signal, for pitch analysis as part of wider processing for musical purposes, e.g. transcription, musical performance evaluation; Pitch recognition, e.g. in polyphonic sounds; Estimation or use of missing fundamental
    • G: PHYSICS
    • G10: MUSICAL INSTRUMENTS; ACOUSTICS
    • G10H: ELECTROPHONIC MUSICAL INSTRUMENTS; INSTRUMENTS IN WHICH THE TONES ARE GENERATED BY ELECTROMECHANICAL MEANS OR ELECTRONIC GENERATORS, OR IN WHICH THE TONES ARE SYNTHESISED FROM A DATA STORE
    • G10H2210/00: Aspects or methods of musical processing having intrinsic musical character, i.e. involving musical theory or musical parameters or relying on musical knowledge, as applied in electrophonic musical tools or instruments
    • G10H2210/155: Musical effects
    • G10H2210/265: Acoustic effect simulation, i.e. volume, spatial, resonance or reverberation effects added to a musical sound, usually by appropriate filtering or delays
    • G10H2210/295: Spatial effects, musical uses of multiple audio channels, e.g. stereo
    • G10H2210/305: Source positioning in a soundscape, e.g. instrument positioning on a virtual soundstage, stereo panning or related delay or reverberation changes; Changing the stereo width of a musical source
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04S: STEREOPHONIC SYSTEMS
    • H04S2400/00: Details of stereophonic systems covered by H04S but not provided for in its groups
    • H04S2400/03: Aspects of down-mixing multi-channel audio to configurations with lower numbers of playback channels, e.g. 7.1 -> 5.1
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04S: STEREOPHONIC SYSTEMS
    • H04S2420/00: Techniques used in stereophonic systems covered by H04S but not provided for in its groups
    • H04S2420/03: Application of parametric coding in stereophonic audio systems

Definitions

  • the following embodiments relate to an encoding apparatus and method using residual coding.
  • SAOC stands for spatial audio object coding.
  • S-TSC stands for SAOC two-step coding.
  • International Publication No. 2010-143907 discloses a method and encoding apparatus for encoding a multi-object audio signal, a decoding method and a decoding apparatus, and a transcoding method and a transcoder.
  • The multi-object audio signal encoding apparatus discloses a method of encoding object signals other than foreground object signals among a plurality of input object signals, and encoding the foreground object signals separately, to provide satisfactory sound quality to a listener.
  • Embodiments of the present invention provide a residual encoding apparatus and method capable of generating a residual signal for each of a plurality of input object signals.
  • Embodiments of the present invention provide a residual encoding apparatus and method capable of generating a residual signal for each of a downmix signal, an OLD parameter, and a plurality of object signals.
  • Embodiments of the present invention provide a residual encoding apparatus and method capable of generating a residual signal for each of a plurality of object signals using an OLD parameter.
  • The residual encoding apparatus may include: a downmix signal generator configured to generate a downmix signal by taking a weighted sum of a plurality of input object signals; a spatial parameter generator for generating a spatial parameter by normalizing the subband power of each of the plurality of object signals; and a residual signal generator configured to generate a residual signal for each of the plurality of object signals using the downmix signal and the spatial parameter.
  • The spatial parameter generator calculates the spatial parameter OLD using an equation in which P represents the parameter subband power, B represents the number of parameter subbands, and N represents the number of input objects.
  • the residual signal generator generates the residual signal using the OLD.
  • the residual encoding method may include generating a downmix signal by weighting a plurality of input object signals; Generating a spatial parameter by normalizing subband power of each of the plurality of object signals; And generating a residual signal for each of the plurality of object signals using the downmix signal and the spatial parameter.
  • A computer-readable recording medium has recorded thereon a program for performing a residual encoding method, the program comprising instructions for: generating a downmix signal by taking a weighted sum of a plurality of input object signals; generating a spatial parameter by normalizing the subband power of each of the plurality of object signals; and generating a residual signal for each of the plurality of object signals using the downmix signal and the spatial parameter.
  • Embodiments of the present invention provide a residual encoding apparatus and method capable of generating a residual signal for each of a plurality of input object signals.
  • Embodiments of the present invention provide a residual encoding apparatus and method capable of generating a residual signal for each of a downmix signal, an OLD parameter, and a plurality of object signals.
  • Embodiments of the present invention provide a residual encoding apparatus and method capable of generating a residual signal for each of a plurality of object signals using an OLD parameter.
  • FIG. 1 is a diagram illustrating a SAOC encoder and a decoder.
  • FIG. 2 is a block diagram illustrating an encoding apparatus and a decoding apparatus for vocal harmonic coding.
  • FIG. 3 is a graph showing harmonic information.
  • FIG. 4 is a flowchart illustrating a pitch extraction method, according to an exemplary embodiment.
  • FIG. 5 is a graph according to the pitch extraction method of FIG. 4.
  • FIG. 6 is a flowchart illustrating an MVF extraction method according to an embodiment.
  • FIG. 7 is a graph according to the MVF extraction method of FIG. 6.
  • FIG. 9 is a graph illustrating a harmonic filtering and a smoothing filtering process.
  • FIG. 10 is a graph illustrating test results according to vocal harmonic coding.
  • FIG. 11 is a flowchart illustrating an encoding method for vocal harmonic coding.
  • FIG. 12 is a flowchart illustrating a decoding method for vocal harmonic coding.
  • FIG. 13 is a block diagram illustrating a personal audio studio system according to an embodiment of the present invention.
  • FIG. 14 is a diagram illustrating an encoding apparatus capable of selectively using any one of SAOC coding, residual coding, and vocal harmonic coding.
  • FIG. 15 is a diagram illustrating an encoding apparatus for performing residual coding according to an embodiment of the present invention.
  • FIG. 16 is a diagram illustrating the residual signal generator shown in FIG. 15 in more detail.
  • FIG. 1 is a diagram illustrating a SAOC encoder and a decoder.
  • SAOC stands for spatial audio object coding.
  • the SAOC encoder converts the input object signals into downmix signals and spatial parameters and sends them to the SAOC decoder.
  • the decoder reproduces the object signal using the received downmix signal and spatial parameters, and the renderer renders the respective objects according to user input to generate final music.
  • the SAOC encoder calculates the downmix signal and the spatial parameter OLD (Object Level Difference).
  • the downmix signal can be obtained by the weighted sum of the input signals.
  • OLD may be obtained by normalizing each object's subband power by the largest of the subband powers among the objects. OLD may be defined according to [Equation 1].
  • P represents the parameter subband power
  • B represents the number of parameter subbands
  • N represents the number of input objects.
  • the SAOC decoder can reproduce the object signal through the downmix signal and the OLD.
  • the SAOC decoder may reproduce the object signal using Equation 2.
  • When a specific object is to be adjusted, the SAOC decoder adjusts that object from the downmix signal using only the OLD.
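Equation 2 itself is not reproduced in this text, so the following sketch uses a commonly used OLD-only reconstruction gain, sqrt(OLD_i / Σ_j OLD_j) per subband; that gain form is an assumption made for illustration, not the patent's stated equation.

```python
import numpy as np

def reproduce_object(downmix_sig, OLD, i, eps=1e-12):
    """Estimate object i from the downmix using only OLD.
    Assumed gain form: sqrt(OLD_i(b) / sum_j OLD_j(b)) per subband b."""
    spec = np.fft.rfft(downmix_sig)
    B = OLD.shape[1]
    bands = np.array_split(np.arange(spec.size), B)
    gains = np.sqrt(OLD[i] / (OLD.sum(axis=0) + eps))
    for b, idx in enumerate(bands):
        spec[idx] *= gains[b]  # scale the downmix spectrum band by band
    return np.fft.irfft(spec, n=downmix_sig.size)
```

Because all objects share the downmix's phase, objects that overlap in a subband cannot be fully separated this way, which is exactly the limitation the residual signal later compensates for.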
  • FIG. 2 is a block diagram illustrating an encoding apparatus and a decoding apparatus for vocal harmonic coding.
  • FIG. 2 shows the SAOC parameter generator 211, the harmonic information generator 212, the object signal reproducing unit 221, the harmonic filtering unit 222, the smoothing filtering unit 223, and the rendering unit 224.
  • The SAOC parameter generator 211 generates a downmix signal by taking a weighted sum of a plurality of input object signals including a vocal object signal and an instrument object signal, and generates a spatial parameter by normalizing the subband powers of the plurality of input object signals.
  • the SAOC parameter generator 211 may correspond to the SAOC encoder of FIG. 1.
  • the downmix signal and the spatial parameter are transmitted to the harmonic information generator 212.
  • the harmonic information generation unit 212 generates harmonic information from the vocal object signal in order to remove the harmonic component generated when reproducing the instrument object signal from the downmix signal using spatial parameters.
  • When the vocal object signal is removed from the downmix signal based on the OLD, a difference arises between how well the unvoiced portion and the voiced portion of the vocal object signal are removed. In practice, when the vocal object signal is removed from the downmix signal based on the OLD in order to obtain a background signal composed of the instrument object signal, the removal performance in the voiced signal portion is lowered.
  • the harmonic information may include the pitch of the voiced sound signal included in the vocal object signal, the harmonic maximum frequency of the voiced sound signal, and the spectral harmonic magnitude of the voiced sound signal.
  • the harmonic component may correspond to the voiced sound signal.
  • The harmonic information generation unit 212 generates pitch information of the voiced sound signal included in the vocal object signal, generates harmonic maximum frequency information of the voiced sound signal using the pitch information, and generates the spectral harmonic magnitude of the voiced sound signal using the pitch information and the maximum frequency information.
  • The harmonic information generation unit 212 may quantize the spectral harmonic magnitude of the voiced sound signal included in the vocal object signal using a quantization table calculated based on the subband power of the vocal object signal and an average value of that subband power. Quantization of the spectral harmonic magnitude of the voiced signal is described in detail with reference to FIG. 8.
  • the object signal reproducing unit 221 reproduces the vocal object signal and the instrument object signal from the downmix signal using spatial parameters.
  • the object signal reproducing unit 221 may correspond to the SAOC decoder of FIG. 1.
  • the harmonic filtering unit 222 removes the harmonic component from the reproduced instrument object signal using the reproduced vocal object signal and the harmonic information.
  • the harmonic information is information generated by the encoding apparatus to remove harmonic components generated when reproducing the instrument object signal in the downmix signal. A detailed operation of the harmonic filtering unit 222 will be described with reference to FIG. 9.
  • the smoothing filtering unit 223 smoothes the instrument object signal from which the harmonic component is removed.
  • The flattening of the instrument object signal is an operation for reducing the discontinuity introduced by the harmonic filtering unit 222.
  • a detailed operation of the smoothing filtering unit 223 will be described with reference to FIG. 9.
  • The rendering unit 224 generates the SAOC decoded output using the reproduced vocal object signal and the reproduced instrument object signal.
  • the renderer 224 may correspond to the renderer of FIG. 1.
  • the output signal of the rendering unit 224 may be output through the speaker as it is.
  • the output signal of the rendering unit 224 may be transmitted to the harmonic filtering unit 222.
  • the output signal of the rendering unit 224 may be output as the improved background music through the harmonic filtering unit 222 and the smoothing filtering unit 223.
  • FIG. 3 is a graph showing harmonic information.
  • Harmonic information is information used to remove harmonic components that occur when reproducing an instrument object signal in a downmix signal using spatial parameters.
  • the harmonic information may include the pitch of the voiced sound signal included in the vocal object signal, the harmonic maximum frequency of the voiced sound signal, and the spectral harmonic magnitude of the voiced sound signal. Since vocal harmonics are mostly generated by voiced sound signals of vocal object signals, the harmonic information may be information about voiced sound signals.
  • In FIG. 3, a time-domain graph (left) and a frequency-domain graph (right) of a voiced signal are shown.
  • The pitch period, that is, the interval between repeating peaks of the voiced sound waveform, corresponds to the pitch of the voiced sound signal.
  • the inverse of the pitch of the voiced sound signal may be a fundamental frequency (F0).
  • the maximum voiced frequency (MVF) may be the harmonic maximum frequency of the voiced sound signal. MVF may represent a frequency band in which harmonics are distributed.
  • The harmonic amplitude (HA) may be the spectral harmonic magnitude of the voiced signal. The harmonic amplitude indicates the magnitude of each harmonic.
  • FIG. 4 is a flowchart illustrating a pitch extraction method, according to an exemplary embodiment.
  • a pitch may be extracted through Discrete Fourier Transform (DFT), Spectral Whitening, and Salience for a vocal object signal.
  • The pitch can be extracted according to various commonly used methods. FIG. 4 shows a pitch extraction method using the salience function of [Equation 3].
  • τ is a candidate pitch value.
  • FIG. 5 shows graphs according to the pitch extraction method of FIG. 4: a graph of a vocal object, a graph after spectral whitening, and a graph of the salience function result. The salience graph plots the salience function against τ in [Equation 3], and the index of its maximum value is predicted as the pitch value.
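Since Equation 3 is not reproduced in this text, the sketch below uses a generic harmonic-summation salience (whitened magnitude spectrum summed at harmonics of each candidate F0) rather than the patent's exact salience function; the whitening envelope, the candidate F0 range, and the number of harmonics are all assumptions.

```python
import numpy as np

def whiten(spec, eps=1e-12):
    """Crude spectral whitening: divide by a moving-average envelope (assumed)."""
    env = np.convolve(spec, np.ones(16) / 16, mode="same")
    return spec / (env + eps)

def salience_pitch(x, fs, f_min=80.0, f_max=400.0, n_harm=5):
    """Pick the candidate F0 whose harmonics carry the most whitened energy.
    Generic harmonic summation, standing in for the patent's Equation 3."""
    spec = whiten(np.abs(np.fft.rfft(x)))
    freqs = np.fft.rfftfreq(x.size, 1.0 / fs)
    best_f0, best_s = f_min, -np.inf
    for f0 in np.arange(f_min, f_max, 1.0):
        # sum the whitened spectrum at the bins nearest each harmonic of f0
        idx = [np.argmin(np.abs(freqs - m * f0)) for m in range(1, n_harm + 1)]
        s = spec[idx].sum()
        if s > best_s:
            best_f0, best_s = f0, s
    return best_f0  # the pitch period would then be fs / best_f0
```

Whitening matters here: without it, a strong low harmonic dominates the sum and biases the maximum toward candidates that merely hit that one peak.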
  • FIG. 6 is a flowchart illustrating an MVF extraction method according to an embodiment.
  • The harmonic information generator 212 may use an LP residual signal to find harmonic peaks in frequency and predict the MVF. Each step shown in FIG. 6 is described in detail with reference to FIG. 7.
  • FIG. 7 is a graph according to the MVF extraction method of FIG. 6.
  • The harmonic information generator 212 calculates the LP residual signal through LP (linear predictive) analysis of the input signal and extracts local peaks at fundamental-frequency intervals. It then predicts a shaping curve by linear interpolation of the local peaks, truncates the residual signal at 3 dB below the shaping curve, normalizes the intervals between peak points of the truncated signal by the fundamental frequency, and predicts the MVF through the MVF decision.
  • the example shown in FIG. 7 is the result of using 0.5 and 1.5 as thresholds for the determination of MVF.
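The final MVF decision step can be sketched as follows, using the 0.5 and 1.5 interval thresholds mentioned above. The earlier LP analysis and 3 dB truncation are assumed to have already produced the list of surviving peak frequencies; only the decision rule is shown, and treating "last conforming peak" as the MVF is an interpretation of the text.

```python
import numpy as np

def decide_mvf(peak_freqs, f0, lo=0.5, hi=1.5):
    """Walk upward through the surviving spectral peaks; while consecutive
    peak spacings, normalized by F0, stay inside [lo, hi], the peaks are
    still harmonic, and the last conforming peak marks the MVF."""
    peak_freqs = np.sort(np.asarray(peak_freqs, dtype=float))
    mvf = peak_freqs[0]
    for prev, cur in zip(peak_freqs, peak_freqs[1:]):
        ratio = (cur - prev) / f0
        if lo <= ratio <= hi:
            mvf = cur       # spacing is still roughly one harmonic apart
        else:
            break           # harmonic structure breaks down here
    return mvf
```

The normalized-interval test is what makes the rule robust: noise peaks above the voiced band are spaced irregularly, so their spacing falls outside [0.5, 1.5] times F0 and terminates the walk.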
  • the harmonic information generator 212 may calculate the HA from the power spectrum at the harmonic peak point.
  • Since the HA varies in magnitude, quantization is required.
  • an adaptive quantization technique using an OLD parameter and an arithmetic mean may be used for HA.
  • the harmonic quantization table for the adaptive quantization technique may be generated using the maximum and minimum values calculated through Equations 4 to 6 below.
  • To quantize the m-th harmonic amplitude, the minimum and maximum values at which the m-th harmonic may exist are derived in Equations 4 to 6, as shown in the figure on the right.
  • In Equation 4, the maximum value is Pv(b), the b-th subband power of the vocal signal, and the minimum value is Pv(b)/(nD), the average of Pv(b).
  • n is the number of harmonics included in the sub band
  • D is the duration of the sub band.
  • Taking the logarithm of Equation 4 gives Equation 5, and normalizing Equation 5 gives the minimum and maximum values of the quantization table as shown in Equation 6.
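The resulting adaptive quantizer can be sketched as follows, with Pv(b) as the maximum and Pv(b)/(nD) as the minimum per Equations 4 to 6; the number of quantization levels is an assumed parameter not given in the text, and uniform steps in the normalized log domain are likewise an assumption.

```python
import numpy as np

def quantize_ha(ha, pv_b, n_harm, duration, levels=16, eps=1e-12):
    """Adaptive HA quantizer sketch: the m-th harmonic power is assumed to
    lie between Pv(b)/(n*D) (Equation 4's minimum) and Pv(b) (its maximum).
    Work in the log domain (Equation 5), normalize to [0, 1] (Equation 6),
    then quantize uniformly with an assumed number of levels."""
    lo = np.log(pv_b / (n_harm * duration) + eps)
    hi = np.log(pv_b + eps)
    t = (np.log(ha + eps) - lo) / (hi - lo)          # normalized position
    return np.clip(np.round(t * (levels - 1)), 0, levels - 1).astype(int)

def dequantize_ha(q, pv_b, n_harm, duration, levels=16, eps=1e-12):
    """Inverse mapping back from the quantizer index to a harmonic power."""
    lo = np.log(pv_b / (n_harm * duration) + eps)
    hi = np.log(pv_b + eps)
    return np.exp(lo + q / (levels - 1) * (hi - lo))
```

Because the table bounds are derived from Pv(b) itself, the quantizer adapts per subband: loud vocal bands get a wider dynamic range without spending extra bits.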
  • FIG. 9 is a graph illustrating a harmonic filtering and a smoothing filtering process.
  • The harmonic filtering unit 222 operates according to Equation 7, in which the output of the harmonic filter is the instrument object signal from which the harmonic component has been removed, and the input is the reproduced instrument object signal.
  • G E (k) is the transfer function of the harmonic filter, which is designed according to Equation (8).
  • Equation 8 is expressed in terms of the reproduced vocal object signal and the reproduced instrument object signal.
  • The harmonic amplitude H(m) according to the harmonic information is the power spectrum of the m-th harmonic in the frequency domain. H(m) is defined as shown in [Equation 9].
  • F 0 represents the fundamental frequency
  • m is an integer
  • M is the number of harmonics.
  • M is determined from f_mvf / F0, that is, the number of harmonics below the MVF, where f_mvf is the MVF frequency.
  • X v represents a vocal object signal.
  • The smoothing filtering unit 223 operates according to Equation 10, in which the input of the smoothing filter is the instrument object signal from which the harmonic component has been removed (the output of the harmonic filter), the output is the flattened instrument object signal, and Gs(k) is the transfer function of the smoothing filter. Gs(k) is defined as shown in [Equation 11].
  • W denotes the bandwidth of the harmonic according to the smoothing range
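Equations 7 to 11 are not reproduced in this text, so the following is only an illustrative stand-in: a spectral gain that notches harmonics of F0 below the MVF, followed by a moving-average smoothing of that gain over W bins. The actual G_E(k) is derived from the reproduced vocal and instrument spectra; the notch width, depth, and smoothing width used here are assumed values.

```python
import numpy as np

def harmonic_filter_gain(freqs, f0, f_mvf, notch_width=10.0, depth=0.05):
    """Illustrative stand-in for G_E(k): attenuate bins within notch_width Hz
    of each harmonic m*F0 below the MVF (assumption; not Equations 7-9)."""
    m_max = int(f_mvf // f0)            # M harmonics below the MVF
    gain = np.ones_like(freqs)
    for m in range(1, m_max + 1):
        gain[np.abs(freqs - m * f0) <= notch_width] = depth
    return gain

def smoothing_filter_gain(gain, width_bins=3):
    """Illustrative stand-in for Gs(k): moving-average the harmonic-filter
    gain over W bins to soften the discontinuities the notches introduce."""
    kernel = np.ones(width_bins) / width_bins
    return np.convolve(gain, kernel, mode="same")
```

The smoothing stage mirrors the role of the smoothing filtering unit 223: hard gain steps at notch edges cause audible discontinuities, and averaging over W bins tapers them.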
  • FIG. 10 is a graph illustrating test results according to vocal harmonic coding.
  • VHC stands for Vocal Harmonic Coding.
  • Although the VHC shows a lower score than the TSC II, considering that the bit rate of the VHC is much lower than that of the TSC II, its overall performance is good.
  • FIG. 11 is a flowchart illustrating an encoding method for vocal harmonic coding.
  • the encoding apparatus weights a plurality of input object signals including a vocal object signal and an instrument object signal to generate a downmix signal.
  • In operation 1120, the encoding apparatus generates a spatial parameter by normalizing the subband powers of the plurality of input object signals.
  • the encoding apparatus generates harmonic information from the vocal object signal.
  • the harmonic information may include the pitch of the voiced sound signal included in the vocal object signal, the maximum harmonic frequency of the voiced sound signal, and the spectral harmonic size of the voiced sound signal.
  • The encoding apparatus may generate the harmonic information by generating pitch information of the voiced sound signal included in the vocal object signal, generating harmonic maximum frequency information of the voiced sound signal using the pitch information, and generating the spectral harmonic magnitude of the voiced sound signal using the pitch information and the maximum frequency information.
  • the encoding apparatus may quantize the spectral harmonic size of the voiced sound signal included in the vocal object signal using a quantization table calculated based on the average value of the subband power of the vocal object signal and the subband power of the vocal object signal.
  • FIG. 12 is a flowchart illustrating a decoding method for vocal harmonic coding.
  • In step 1210, the decoding apparatus reproduces a vocal object signal and an instrument object signal from a downmix signal using spatial parameters.
  • the decoding apparatus removes the harmonic component from the reproduced instrument object signal using the reproduced vocal object signal and the harmonic information.
  • Step 1220 may be performed through a harmonic filter.
  • the harmonic information may include the pitch of the voiced sound signal included in the vocal object signal, the maximum harmonic frequency of the voiced sound signal, and the spectral harmonic size of the voiced sound signal.
  • the decoding apparatus flattens the instrument object signal from which the harmonic component is removed using a smoothing filter.
  • The decoding apparatus may generate the SAOC decoded output using the reproduced vocal object signal and the reproduced instrument object signal.
  • FIG. 13 is a block diagram illustrating a personal audio studio system according to an embodiment of the present invention.
  • the personal audio studio system may selectively receive either original sound or compressed content as input content.
  • the user can set whether the input content is original sound or compressed content.
  • A selector, shown in the form of a switch, routes the input: if the input content is original sound, the signal is input to object control module 1; conversely, if the input content is compressed content, it is input to object control module 2.
  • The object control module 1 may generate the compressed content (SAOC-based contents) by compressing the original sound using any one of SAOC coding, residual coding, and vocal harmonic coding.
  • the object control module 2 may perform at least one of object insertion, object addition, and object editing (addition after object removal) in the compressed state.
  • FIG. 14 is a diagram illustrating an encoding apparatus capable of selectively using any one of SAOC coding, residual coding, and vocal harmonic coding.
  • the object control module 1 illustrated in FIG. 13 includes a SAOC-based encoder, and the SAOC-based encoder may selectively use any one of various coding methods.
  • the SAOC-based encoder may selectively use any one of SAOC coding, residual coding, and vocal harmonic coding, and the SAOC encoder and the S-VHC encoder (vocal harmonic encoder) are as described above.
  • The S-RC encoder is the residual encoder.
  • characteristics of the SAOC encoder, the S-VHC encoder (vocal harmonic encoder), and the S-RC encoder may be expressed as follows.
  • the SAOC encoder has a downmixed signal and OLD as outputs, and has a very low bit rate and low quality.
  • the vocal harmonic encoder has a downmixed signal and OLD and harmonic information as outputs, and has a low bit rate and relatively good quality, and has characteristics suitable for a karaoke service.
  • The S-RC encoder (residual encoder) generates a residual signal in addition to the downmix signal and the OLD, as described below.
  • c1 and c2 may be calculated as follows from a spatial parameter called CLD (channel level difference).
  • the residual signal can be calculated as follows.
  • the residual signal may be represented as follows.
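The c1/c2 equations themselves are not reproduced in this text; the sketch below uses the standard form found in CLD-based parametric coding (MPEG Surround-style), which is an assumption about what "as follows" refers to.

```python
import numpy as np

def cld_gains(cld_db):
    """Derive the two channel gains c1, c2 from a CLD given in dB.
    Standard parametric-coding form (assumed, not quoted from the patent):
    c1^2 + c2^2 = 1, with the power split set by the CLD ratio."""
    r = 10.0 ** (cld_db / 10.0)         # linear power ratio between channels
    c1 = np.sqrt(r / (1.0 + r))
    c2 = np.sqrt(1.0 / (1.0 + r))
    return c1, c2
```

The energy-preserving constraint c1² + c2² = 1 is what lets a decoder split the downmix between two channels without changing the total power.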
  • the residual encoder illustrated in FIG. 15 may generate a downmix signal, a spatial parameter, and a residual signal as follows. More specifically, the downmix signal generator may generate the downmix signal Xd (k) as follows.
  • the spatial parameter calculator can calculate the spatial parameter OLD for each object as follows.
  • i is the index of the object in the input content
  • B is the number of parameter subbands
  • N is the number of objects in the input content.
  • Pi(b) represents the subband power in the b-th subband of the i-th object, and is defined as follows.
  • a residual signal can be generated as follows using the OLD spatial parameter calculated by the spatial parameter calculator without the need to separately calculate the CLD.
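Putting the pieces together, the OLD-based residual generation described above can be sketched end to end: compute the downmix and per-object OLD, estimate each object from the downmix, and keep the difference as the residual. The OLD-derived estimation gain sqrt(OLD_i / Σ_j OLD_j) is an assumed form, since the exact equation is not reproduced in this text.

```python
import numpy as np

def residual_signals(objects, weights=None, num_subbands=8, eps=1e-12):
    """Per-object residual sketch without any CLD: estimate each object from
    the downmix with an OLD-derived subband gain (assumed form), then
    subtract the estimate from the true object signal."""
    objects = np.asarray(objects, dtype=float)
    N = objects.shape[0]
    w = np.full(N, 1.0 / N) if weights is None else np.asarray(weights, float)
    d = np.tensordot(w, objects, axes=1)                  # downmix Xd
    specs = np.abs(np.fft.rfft(objects, axis=1)) ** 2
    bands = np.array_split(np.arange(specs.shape[1]), num_subbands)
    P = np.array([[specs[i, idx].mean() for idx in bands] for i in range(N)])
    OLD = P / (P.max(axis=0, keepdims=True) + eps)        # Equation-1 style
    spec_d = np.fft.rfft(d)
    residuals = np.empty_like(objects)
    for i in range(N):
        gains = np.sqrt(OLD[i] / (OLD.sum(axis=0) + eps)) # assumed gain form
        est = spec_d.copy()
        for b, idx in enumerate(bands):
            est[idx] *= gains[b]
        residuals[i] = objects[i] - np.fft.irfft(est, n=d.size)
    return d, OLD, residuals
```

For objects that occupy disjoint subbands the estimate is nearly exact and the residual is tiny; the residual only carries real information where objects overlap, which is why residual coding improves quality at a modest bit-rate cost.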
  • FIG. 16 is a diagram illustrating the residual signal generator shown in FIG. 15 in more detail.
  • the residual encoder receives an original sound including audio signals for a plurality of objects and generates a downmix signal.
  • the generated downmix signal is provided to the residual signal generator and the spatial parameter calculator, and the spatial parameter calculator calculates OLD for each object.
  • the downmix signal and the calculated OLD for each object are provided to the residual signal generator, and the residual signal generator generates the residual signal for each object based on the following equation defined above.
  • the method according to the embodiment may be embodied in the form of program instructions that can be executed by various computer means and recorded in a computer readable medium.
  • the computer readable medium may include program instructions, data files, data structures, etc. alone or in combination.
  • the program instructions recorded on the media may be those specially designed and constructed for the purposes of the embodiments, or they may be of the kind well-known and available to those having skill in the computer software arts.
  • Examples of computer-readable recording media include magnetic media such as hard disks, floppy disks, and magnetic tape; optical media such as CD-ROMs and DVDs; and magneto-optical media such as floptical disks.
  • Examples of program instructions include not only machine code generated by a compiler, but also high-level language code that can be executed by a computer using an interpreter or the like.
  • the hardware device described above may be configured to operate as one or more software modules to perform the operations of the embodiments, and vice versa.

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Signal Processing (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Mathematical Physics (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Compression, Expansion, Code Conversion, And Decoders (AREA)

Abstract

The invention relates to a residual encoding apparatus comprising: a downmix signal generator for generating a downmix signal by taking a weighted sum of a plurality of input object signals; a spatial parameter generator for generating a spatial parameter by normalizing a subband power of each of the plurality of object signals; and a residual signal generator for generating a residual signal for each of the plurality of object signals using the spatial parameter.
PCT/KR2015/000763 2014-01-23 2015-01-23 Dispositif de codage et procédé utilisant un codage résiduel WO2015111970A1 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
KR1020140008595A KR101536855B1 (ko) 2014-01-23 2014-01-23 레지듀얼 코딩을 이용하는 인코딩 장치 및 방법
KR10-2014-0008595 2014-01-23

Publications (1)

Publication Number Publication Date
WO2015111970A1 (fr) 2015-07-30

Family

ID=53681693

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/KR2015/000763 WO2015111970A1 (fr) 2014-01-23 2015-01-23 Dispositif de codage et procédé utilisant un codage résiduel

Country Status (2)

Country Link
KR (1) KR101536855B1 (fr)
WO (1) WO2015111970A1 (fr)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR20080037106A (ko) * 2005-08-30 2008-04-29 엘지전자 주식회사 오디오 신호의 인코딩 및 디코딩 장치, 및 방법
KR100917843B1 (ko) * 2006-09-29 2009-09-18 한국전자통신연구원 다양한 채널로 구성된 다객체 오디오 신호의 부호화 및복호화 장치 및 방법
KR20100007740A (ko) * 2008-07-10 2010-01-22 한국전자통신연구원 공간정보 기반의 다객체 오디오 부호화에서의 오디오 객체 편집 방법 및 그 장치
KR20100114450A (ko) * 2009-04-15 2010-10-25 한국전자통신연구원 가변 비트율을 갖는 잔차 신호 부호화를 이용한 고품질 다객체 오디오 부호화 및 복호화 장치
KR20100132913A (ko) * 2009-06-10 2010-12-20 한국전자통신연구원 다객체 오디오 신호를 부호화하는 방법 및 부호화 장치, 복호화 방법 및 복호화 장치, 그리고 트랜스코딩 방법 및 트랜스코더
KR20110018728A (ko) * 2009-08-18 2011-02-24 삼성전자주식회사 멀티 채널 오디오 신호의 부호화 방법 및 장치, 그 복호화 방법 및 장치


Also Published As

Publication number Publication date
KR101536855B1 (ko) 2015-07-14

Similar Documents

Publication Publication Date Title
WO2012036487A2 Apparatus and method for encoding and decoding a signal for high-frequency bandwidth extension
WO2010107269A2 Apparatus and method for encoding/decoding a multichannel signal
WO2013141638A1 High-frequency encoding/decoding method and apparatus for bandwidth extension
WO2013183977A1 Frame error concealment method and apparatus, and audio decoding method and apparatus
WO2012157931A2 Noise filling and audio decoding
WO2010005272A2 Method and apparatus for multiplexed encoding and decoding
WO2010087614A2 Method for encoding and decoding an audio signal and apparatus therefor
WO2013002623A2 Apparatus and method for generating a bandwidth extension signal
WO2010050740A2 Apparatus and method for encoding/decoding a multichannel signal
WO2012144878A2 Method of quantizing linear predictive coding coefficients, sound encoding method, method of de-quantizing linear predictive coding coefficients, sound decoding method, and recording medium
WO2013058635A2 Frame error concealment method and apparatus, and audio decoding method and apparatus
WO2017039422A2 Signal processing methods and apparatuses for enhancing sound quality
WO2018174310A1 Method and apparatus for processing a speech signal adaptively to a noise environment
WO2013115625A1 Method and apparatus for processing audio signals at low complexity
WO2020145472A1 Neural vocoder implementing a speaker-adaptive model and generating a synthesized speech signal, and method for training the neural vocoder
EP2700072A2 Apparatus for quantizing linear predictive coding coefficients, sound encoding apparatus, apparatus for de-quantizing linear predictive coding coefficients, sound decoding apparatus, and electronic device therefor
WO2009145449A2 Method for processing a noisy speech signal, apparatus therefor, and computer-readable recording medium
AU2012246799A1 Method of quantizing linear predictive coding coefficients, sound encoding method, method of de-quantizing linear predictive coding coefficients, sound decoding method, and recording medium
WO2016024853A1 Method and device for enhancing sound quality, method and device for sound decoding, and multimedia device employing same
WO2014185569A1 Method and device for encoding and decoding an audio signal
WO2016032021A1 Apparatus and method for recognizing voice commands
WO2017222356A1 Signal processing method and device adaptive to noise environment, and terminal equipment employing same
WO2020050509A1 Speech synthesis device
WO2015170899A1 Method and device for quantizing a linear predictive coefficient, and method and device for dequantizing same
WO2014163231A1 Speech signal extraction method and speech signal extraction apparatus for use in speech recognition in an environment where multiple sound sources are output

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 15740782

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 15740782

Country of ref document: EP

Kind code of ref document: A1