WO2015111970A1 - Encoding apparatus and method using residual coding - Google Patents
Encoding apparatus and method using residual coding
- Publication number: WO2015111970A1
- Application number: PCT/KR2015/000763
- Authority: WO, WIPO (PCT)
- Prior art keywords: signal, harmonic, residual, generating, parameter
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
- G10L19/008—Multichannel audio signal coding or decoding using interchannel correlation to reduce redundancy, e.g. joint-stereo, intensity-coding or matrixing
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04S—STEREOPHONIC SYSTEMS
- H04S3/00—Systems employing more than two channels, e.g. quadraphonic
- H04S3/008—Systems employing more than two channels, e.g. quadraphonic in which the audio signals are in digital form, i.e. employing more than two discrete digital channels
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10H—ELECTROPHONIC MUSICAL INSTRUMENTS; INSTRUMENTS IN WHICH THE TONES ARE GENERATED BY ELECTROMECHANICAL MEANS OR ELECTRONIC GENERATORS, OR IN WHICH THE TONES ARE SYNTHESISED FROM A DATA STORE
- G10H2210/00—Aspects or methods of musical processing having intrinsic musical character, i.e. involving musical theory or musical parameters or relying on musical knowledge, as applied in electrophonic musical tools or instruments
- G10H2210/031—Musical analysis, i.e. isolation, extraction or identification of musical elements or musical parameters from a raw acoustic signal or from an encoded audio signal
- G10H2210/066—Musical analysis, i.e. isolation, extraction or identification of musical elements or musical parameters from a raw acoustic signal or from an encoded audio signal for pitch analysis as part of wider processing for musical purposes, e.g. transcription, musical performance evaluation; Pitch recognition, e.g. in polyphonic sounds; Estimation or use of missing fundamental
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10H—ELECTROPHONIC MUSICAL INSTRUMENTS; INSTRUMENTS IN WHICH THE TONES ARE GENERATED BY ELECTROMECHANICAL MEANS OR ELECTRONIC GENERATORS, OR IN WHICH THE TONES ARE SYNTHESISED FROM A DATA STORE
- G10H2210/00—Aspects or methods of musical processing having intrinsic musical character, i.e. involving musical theory or musical parameters or relying on musical knowledge, as applied in electrophonic musical tools or instruments
- G10H2210/155—Musical effects
- G10H2210/265—Acoustic effect simulation, i.e. volume, spatial, resonance or reverberation effects added to a musical sound, usually by appropriate filtering or delays
- G10H2210/295—Spatial effects, musical uses of multiple audio channels, e.g. stereo
- G10H2210/305—Source positioning in a soundscape, e.g. instrument positioning on a virtual soundstage, stereo panning or related delay or reverberation changes; Changing the stereo width of a musical source
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04S—STEREOPHONIC SYSTEMS
- H04S2400/00—Details of stereophonic systems covered by H04S but not provided for in its groups
- H04S2400/03—Aspects of down-mixing multi-channel audio to configurations with lower numbers of playback channels, e.g. 7.1 -> 5.1
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04S—STEREOPHONIC SYSTEMS
- H04S2420/00—Techniques used stereophonic systems covered by H04S but not provided for in its groups
- H04S2420/03—Application of parametric coding in stereophonic audio systems
Definitions
- the following embodiments relate to an encoding apparatus and method using residual coding.
- SAOC: spatial audio object coding
- S-TSC: SAOC two-step coding
- International Publication No. 2010-143907 discloses a method and encoding apparatus for encoding a multi-object audio signal, a decoding method and a decoding apparatus, and a transcoding method and a transcoder.
- The multi-object audio signal encoding apparatus encodes the object signals other than the foreground object signals among a plurality of input object signals, and separately encodes the foreground object signals, to provide a satisfactory sound quality to a listener.
- Embodiments of the present invention provide a residual encoding apparatus and method capable of generating a residual signal for each of a plurality of input object signals.
- Embodiments of the present invention provide a residual encoding apparatus and method capable of generating a residual signal for each of a downmix signal, an OLD parameter, and a plurality of object signals.
- Embodiments of the present invention provide a residual encoding apparatus and method capable of generating a residual signal for each of a plurality of object signals using an OLD parameter.
- The residual encoding apparatus may include: a downmix signal generator configured to generate a downmix signal as a weighted sum of a plurality of input object signals; a spatial parameter generator configured to generate a spatial parameter by normalizing the subband power of each of the plurality of object signals; and a residual signal generator configured to generate a residual signal for each of the plurality of object signals using the downmix signal and the spatial parameter.
- The spatial parameter generator calculates the spatial parameter OLD using an equation in which P represents the parameter subband power, B represents the number of parameter subbands, and N represents the number of input objects.
- the residual signal generator generates the residual signal using the OLD.
- The residual encoding method may include: generating a downmix signal as a weighted sum of a plurality of input object signals; generating a spatial parameter by normalizing the subband power of each of the plurality of object signals; and generating a residual signal for each of the plurality of object signals using the downmix signal and the spatial parameter.
- A computer-readable recording medium may have recorded thereon a program for performing a residual encoding method, the program comprising instructions for: generating a downmix signal as a weighted sum of a plurality of input object signals; generating a spatial parameter by normalizing the subband power of each of the plurality of object signals; and generating a residual signal for each of the plurality of object signals using the downmix signal and the spatial parameter.
- FIG. 1 is a diagram illustrating a SAOC encoder and a decoder.
- FIG. 2 is a block diagram illustrating an encoding apparatus and a decoding apparatus for vocal harmonic coding.
- FIG. 3 is a graph showing harmonic information.
- FIG. 4 is a flowchart illustrating a pitch extraction method, according to an exemplary embodiment.
- FIG. 5 is a graph according to the pitch extraction method of FIG. 4.
- FIG. 6 is a flowchart illustrating an MVF extraction method according to an embodiment.
- FIG. 7 is a graph according to the MVF extraction method of FIG. 6.
- FIG. 9 is a graph illustrating a harmonic filtering and a smoothing filtering process.
- FIG. 10 is a graph illustrating test results according to vocal harmonic coding.
- FIG. 11 is a flowchart illustrating an encoding method for vocal harmonic coding.
- FIG. 12 is a flowchart illustrating a decoding method for vocal harmonic coding.
- FIG. 13 is a block diagram illustrating a personal audio studio system according to an embodiment of the present invention.
- FIG. 14 is a diagram illustrating an encoding apparatus capable of selectively using any one of SAOC coding, residual coding, and vocal harmonic coding.
- FIG. 15 is a diagram illustrating an encoding apparatus for performing residual coding according to an embodiment of the present invention.
- FIG. 16 is a diagram illustrating the residual signal generator shown in FIG. 15 in more detail.
- FIG. 1 is a diagram illustrating a SAOC encoder and a decoder.
- SAOC stands for spatial audio object coding.
- the SAOC encoder converts the input object signals into downmix signals and spatial parameters and sends them to the SAOC decoder.
- the decoder reproduces the object signal using the received downmix signal and spatial parameters, and the renderer renders the respective objects according to user input to generate final music.
- the SAOC encoder calculates the downmix signal and the spatial parameter OLD (Object Level Difference).
- the downmix signal can be obtained by the weighted sum of the input signals.
- OLD may be obtained by normalizing each subband power by the largest of the subband powers of the objects. OLD may be defined according to [Equation 1].
- In Equation 1, P represents the parameter subband power, B represents the number of parameter subbands, and N represents the number of input objects.
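The two encoder outputs described above can be sketched in a few lines. This is a minimal illustration, not the patent's implementation: Equation 1 is not reproduced in this text, so the sketch assumes the form OLD_i(b) = P_i(b) / max_j P_j(b), which follows from the description of normalizing by the largest subband power. The function names `downmix` and `object_level_differences` are hypothetical.

```python
from typing import List, Sequence


def downmix(objects: Sequence[Sequence[float]], weights: Sequence[float]) -> List[float]:
    """Weighted sum of the object signals, computed sample by sample."""
    if len(objects) != len(weights):
        raise ValueError("one weight per object is required")
    length = len(objects[0])
    return [sum(w * obj[k] for obj, w in zip(objects, weights)) for k in range(length)]


def object_level_differences(powers: Sequence[Sequence[float]]) -> List[List[float]]:
    """powers[i][b] is the power of object i in parameter subband b.

    Assumed form of Equation 1: OLD[i][b] = P_i(b) / max_j P_j(b).
    """
    n_objects = len(powers)
    n_bands = len(powers[0])
    old = [[0.0] * n_bands for _ in range(n_objects)]
    for b in range(n_bands):
        peak = max(powers[i][b] for i in range(n_objects))
        for i in range(n_objects):
            # Guard against an all-silent subband.
            old[i][b] = powers[i][b] / peak if peak > 0.0 else 0.0
    return old
```

With this normalization, the loudest object in each subband has an OLD of 1 and every other object has a value between 0 and 1.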
- the SAOC decoder can reproduce the object signal through the downmix signal and the OLD.
- the SAOC decoder may reproduce the object signal using Equation 2.
- When a specific object is to be adjusted, the SAOC decoder adjusts that object in the downmix signal using only the OLD.
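A rough sketch of parametric object reproduction follows. Equation 2 is not reproduced in this text, so the gain used here is an assumption: each object is estimated by scaling the downmix in a subband by its share of the total OLD, a common power-ratio estimate for uncorrelated objects mixed with unit weights. The function name `reconstruct_objects` is hypothetical.

```python
from typing import List, Sequence


def reconstruct_objects(downmix_band: Sequence[float],
                        old_band: Sequence[float]) -> List[List[float]]:
    """Estimate every object in one parameter subband.

    downmix_band: downmix samples (or spectral coefficients) in the subband.
    old_band:     OLD value of each object for the same subband.
    Assumed gain (not the source's Equation 2): OLD_i / sum_j OLD_j.
    """
    total = sum(old_band)
    estimates = []
    for old_i in old_band:
        gain = old_i / total if total > 0.0 else 0.0
        estimates.append([gain * x for x in downmix_band])
    return estimates
```

Because the gains sum to one, the estimates add back up to the downmix. Adjusting one object then amounts to rescaling its estimate before re-mixing.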
- FIG. 2 is a block diagram illustrating an encoding apparatus and a decoding apparatus for vocal harmonic coding.
- Referring to FIG. 2, the SAOC parameter generator 211, the harmonic information generator 212, the object signal reproducing unit 221, the harmonic filtering unit 222, the smoothing filtering unit 223, and the rendering unit 224 are shown.
- The SAOC parameter generator 211 generates a downmix signal as a weighted sum of a plurality of input object signals including a vocal object signal and an instrument object signal, and generates a spatial parameter by normalizing the subband powers of the plurality of input object signals.
- the SAOC parameter generator 211 may correspond to the SAOC encoder of FIG. 1.
- the downmix signal and the spatial parameter are transmitted to the harmonic information generator 212.
- the harmonic information generation unit 212 generates harmonic information from the vocal object signal in order to remove the harmonic component generated when reproducing the instrument object signal from the downmix signal using spatial parameters.
- When the vocal object signal is removed from the downmix signal based on the OLD, removal performance may differ between the unvoiced sound signal and the voiced sound signal included in the vocal object signal. In practice, when the vocal object signal is removed from the downmix signal based on the OLD to obtain a background signal composed of the instrument object signal, removal performance is lower in the voiced portion.
- the harmonic information may include the pitch of the voiced sound signal included in the vocal object signal, the harmonic maximum frequency of the voiced sound signal, and the spectral harmonic magnitude of the voiced sound signal.
- the harmonic component may correspond to the voiced sound signal.
- The harmonic information generation unit 212 generates pitch information of the voiced sound signal included in the vocal object signal, generates harmonic maximum frequency information of the voiced sound signal using the pitch information, and generates the spectral harmonic magnitude of the voiced sound signal using the pitch information and the maximum frequency information.
- The harmonic information generation unit 212 may quantize the spectral harmonic magnitude of the voiced sound signal included in the vocal object signal using a quantization table calculated from the subband power of the vocal object signal and the average value of that subband power. Quantization of the spectral harmonic magnitude of the voiced signal is described in detail with reference to FIG. 8.
- the object signal reproducing unit 221 reproduces the vocal object signal and the instrument object signal from the downmix signal using spatial parameters.
- the object signal reproducing unit 221 may correspond to the SAOC decoder of FIG. 1.
- the harmonic filtering unit 222 removes the harmonic component from the reproduced instrument object signal using the reproduced vocal object signal and the harmonic information.
- the harmonic information is information generated by the encoding apparatus to remove harmonic components generated when reproducing the instrument object signal in the downmix signal. A detailed operation of the harmonic filtering unit 222 will be described with reference to FIG. 9.
- the smoothing filtering unit 223 smoothes the instrument object signal from which the harmonic component is removed.
- Smoothing the instrument object signal reduces the discontinuity introduced by the harmonic filtering unit 222.
- a detailed operation of the smoothing filtering unit 223 will be described with reference to FIG. 9.
- The rendering unit 224 generates the SAOC demodulation output by using the reproduced vocal object signal and the reproduced instrument object signal.
- The rendering unit 224 may correspond to the renderer of FIG. 1.
- the output signal of the rendering unit 224 may be output through the speaker as it is.
- the output signal of the rendering unit 224 may be transmitted to the harmonic filtering unit 222.
- the output signal of the rendering unit 224 may be output as the improved background music through the harmonic filtering unit 222 and the smoothing filtering unit 223.
- FIG. 3 is a graph showing harmonic information.
- Harmonic information is information used to remove harmonic components that occur when reproducing an instrument object signal in a downmix signal using spatial parameters.
- the harmonic information may include the pitch of the voiced sound signal included in the vocal object signal, the harmonic maximum frequency of the voiced sound signal, and the spectral harmonic magnitude of the voiced sound signal. Since vocal harmonics are mostly generated by voiced sound signals of vocal object signals, the harmonic information may be information about voiced sound signals.
- FIG. 3 shows a voiced signal in the time domain (left) and in the frequency domain (right).
- The interval between successive pitch pulses of the voiced sound, that is, the pitch period, may be the pitch of the voiced sound signal.
- The inverse of the pitch of the voiced sound signal may be the fundamental frequency (F0).
- The maximum voiced frequency (MVF) may be the harmonic maximum frequency of the voiced sound signal. The MVF may represent the frequency band over which harmonics are distributed.
- The harmonic amplitude (HA) may be the spectral harmonic magnitude of the voiced signal. The HA indicates the magnitude of each harmonic.
- FIG. 4 is a flowchart illustrating a pitch extraction method, according to an exemplary embodiment.
- Referring to FIG. 4, a pitch may be extracted from a vocal object signal through a Discrete Fourier Transform (DFT), spectral whitening, and a salience function.
- The pitch can be extracted by any of various commonly used methods. FIG. 4 shows a pitch extraction method using the salience function of [Equation 3].
- τ is a candidate pitch value.
- FIG. 5 is a graph according to the pitch extraction method of FIG. 4.
- FIG. 5 shows a graph of a vocal object signal, a graph after spectral whitening, and a graph of the result of the salience function.
- The graph of the salience function plots the salience against τ in [Equation 3]; the index of the maximum value is taken as the predicted pitch value.
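The DFT, whitening, and salience steps can be sketched as follows. Equation 3 is not reproduced in this text, so this sketch substitutes a simple harmonic-sum salience with 1/m weighting and a power-law compression as a stand-in for spectral whitening; both are assumptions chosen for illustration, and all function names are hypothetical.

```python
import cmath
import math
from typing import List


def magnitude_spectrum(frame: List[float]) -> List[float]:
    """Magnitude of the DFT for bins 0 .. N/2 (direct O(N^2) evaluation)."""
    n = len(frame)
    return [abs(sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)))
            for k in range(n // 2 + 1)]


def whiten(spectrum: List[float], exponent: float = 0.33) -> List[float]:
    """Crude whitening: compress magnitudes so the spectral envelope flattens."""
    return [m ** exponent for m in spectrum]


def salience(spectrum: List[float], tau: int, n_fft: int, max_harmonics: int = 10) -> float:
    """Sum the whitened magnitude at the harmonics of candidate period tau (in samples)."""
    total = 0.0
    for m in range(1, max_harmonics + 1):
        k = round(m * n_fft / tau)  # DFT bin nearest to the m-th harmonic
        if k >= len(spectrum):
            break
        total += spectrum[k] / m  # assumed 1/m weighting discourages sub-octave errors
    return total


def estimate_pitch_period(frame: List[float], tau_min: int, tau_max: int) -> int:
    """Return the candidate period (in samples) with the largest salience."""
    spec = whiten(magnitude_spectrum(frame))
    n = len(frame)
    return max(range(tau_min, tau_max + 1), key=lambda tau: salience(spec, tau, n))
```

For a frame sampled at rate fs, the fundamental frequency follows as F0 = fs / τ for the selected period τ.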
- FIG. 6 is a flowchart illustrating an MVF extraction method according to an embodiment.
- The harmonic information generator 212 may use an LP residual signal to find harmonic peaks in the frequency domain and predict the MVF. Each step shown in FIG. 6 is described in detail below.
- FIG. 7 is a graph according to the MVF extraction method of FIG. 6.
- The harmonic information generator 212 calculates the LP residual signal through LP (Linear Predictive) analysis of the input signal, and extracts local peaks at intervals of the fundamental frequency. In addition, the harmonic information generator 212 may predict the shaping curve by linear interpolation of the local peaks.
- The harmonic information generator 212 truncates the residual signal at 3 dB below the shaping curve.
- The harmonic information generator 212 normalizes the intervals between peak points of the truncated signal by the fundamental frequency and predicts the MVF through the MVF decision.
- the example shown in FIG. 7 is the result of using 0.5 and 1.5 as thresholds for the determination of MVF.
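The MVF decision step can be illustrated with a short sketch. It shows one possible reading of the thresholds, offered as an assumption rather than the source's exact rule: a peak continues the harmonic series while the spacing to the previous peak, normalized by F0, stays between 0.5 and 1.5, and the MVF is the last peak in that run. The function name `estimate_mvf` and the peak list are hypothetical.

```python
from typing import Sequence


def estimate_mvf(peak_freqs: Sequence[float], f0: float,
                 low: float = 0.5, high: float = 1.5) -> float:
    """Return the frequency of the last peak that still follows harmonic spacing.

    peak_freqs: ascending peak frequencies (Hz) of the truncated LP residual spectrum.
    f0:         fundamental frequency (Hz).
    """
    if not peak_freqs:
        return 0.0
    mvf = peak_freqs[0]
    for prev, cur in zip(peak_freqs, peak_freqs[1:]):
        ratio = (cur - prev) / f0  # peak spacing normalized by F0
        if not (low <= ratio <= high):
            break  # spacing is no longer harmonic, so stop here
        mvf = cur
    return mvf
```

For example, with peaks at 200, 405, 598, 802, 1190, and 1300 Hz and F0 = 200 Hz, the gap from 802 to 1190 Hz normalizes to 1.94, so the run ends at 802 Hz.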
- the harmonic information generator 212 may calculate the HA from the power spectrum at the harmonic peak point.
- Because HA varies in magnitude, quantization is required.
- an adaptive quantization technique using an OLD parameter and an arithmetic mean may be used for HA.
- the harmonic quantization table for the adaptive quantization technique may be generated using the maximum and minimum values calculated through Equations 4 to 6 below.
- Equations 4 to 6 give the minimum and maximum values that the m-th harmonic may take, which are used to quantize the m-th harmonic amplitude, as shown in the right-hand figure.
- In Equation 4, the maximum value is Pv(b), the b-th subband power of the vocal signal.
- The minimum value is Pv(b) / (nD), which is an average of Pv(b).
- n is the number of harmonics included in the subband.
- D is the duration of the subband.
- Taking the logarithm of Equation 4 gives Equation 5. Normalizing Equation 5 gives the minimum and maximum values of the quantization table, as shown in Equation 6.
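The adaptive table can be sketched as below. The bounds Pv(b) and Pv(b) / (nD) come from the description of Equation 4; everything else is an assumption for illustration, including the decibel scale, the uniform step between the bounds, the 4-bit default, and the function names. Equations 5 and 6 are not reproduced in this text.

```python
import math
from typing import List


def harmonic_quant_table(p_v: float, n_harmonics: int, duration: float,
                         bits: int = 4) -> List[float]:
    """Build a log-domain table spanning the allowed range of a harmonic's power.

    Upper bound: Pv(b).  Lower bound: Pv(b) / (n * D).  Both are converted to dB.
    """
    upper_db = 10.0 * math.log10(p_v)
    lower_db = 10.0 * math.log10(p_v / (n_harmonics * duration))
    levels = 2 ** bits
    step = (upper_db - lower_db) / (levels - 1)  # assumed uniform spacing
    return [lower_db + i * step for i in range(levels)]


def quantize(value_db: float, table: List[float]) -> int:
    """Index of the table entry closest to value_db."""
    return min(range(len(table)), key=lambda i: abs(table[i] - value_db))
```

Because the table is derived from the subband power, its bounds follow the vocal signal's level and do not need to be fixed in advance.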
- FIG. 9 is a graph illustrating a harmonic filtering and a smoothing filtering process.
- Equation 7 describes the harmonic filtering unit 222.
- In Equation 7, the output of the harmonic filter is the instrument object signal from which the harmonic component has been removed, and the input of the harmonic filter is the reproduced instrument object signal.
- GE(k) is the transfer function of the harmonic filter, which is designed according to Equation 8.
- Equation 8 uses the reproduced vocal object signal and the reproduced instrument object signal.
- The harmonic amplitude H(m) according to the harmonic information is the power spectrum of the m-th harmonic in the frequency domain. H(m) is defined as shown in [Equation 9].
- F0 represents the fundamental frequency.
- m is an integer.
- M is the number of harmonics.
- M = ⟨f_mvf / F0⟩.
- f_mvf is the MVF frequency.
- Xv represents a vocal object signal.
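As a small illustration of how the harmonic positions could be located for filtering, the sketch below lists the DFT bins of the harmonics m·F0 up to the MVF. It assumes that M is taken as the integer part of f_mvf / F0 and that harmonics sit at exact multiples of F0. The function name `harmonic_bins` and the sample-rate and FFT-size parameters are hypothetical and do not appear in the source.

```python
from typing import List


def harmonic_bins(f0: float, f_mvf: float, sample_rate: float, n_fft: int) -> List[int]:
    """DFT bin index of each harmonic m * F0 for m = 1 .. M.

    Assumption: M = integer part of f_mvf / F0.
    """
    m_count = int(f_mvf // f0)
    return [round(m * f0 * n_fft / sample_rate) for m in range(1, m_count + 1)]
```

For example, with F0 = 200 Hz, an MVF of 1000 Hz, an 8 kHz sample rate, and a 400-point DFT, the five harmonics fall in bins 10, 20, 30, 40, and 50.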
- Equation 10 describes the smoothing filtering unit 223.
- In Equation 10, the input of the smoothing filter is the instrument object signal from which the harmonic component has been removed (the output of the harmonic filter), the output of the smoothing filter is the smoothed instrument object signal, and Gs(k) denotes the transfer function of the smoothing filter. Gs(k) is defined as shown in [Equation 11].
- W denotes the bandwidth of the harmonic according to the smoothing range.
- FIG. 10 is a graph illustrating test results according to vocal harmonic coding.
- VHC stands for vocal harmonic coding.
- the VHC shows a lower score than the TSC II, but considering that the bit rate of the VHC is much lower than the bit rate of the TSC II, the overall performance is good.
- FIG. 11 is a flowchart illustrating an encoding method for vocal harmonic coding.
- the encoding apparatus weights a plurality of input object signals including a vocal object signal and an instrument object signal to generate a downmix signal.
- In operation 1120, the encoding apparatus generates a spatial parameter by normalizing the subband powers of the plurality of input object signals.
- the encoding apparatus generates harmonic information from the vocal object signal.
- the harmonic information may include the pitch of the voiced sound signal included in the vocal object signal, the maximum harmonic frequency of the voiced sound signal, and the spectral harmonic size of the voiced sound signal.
- The encoding apparatus may generate the harmonic information by generating pitch information of the voiced sound signal included in the vocal object signal, generating harmonic maximum frequency information of the voiced sound signal using the pitch information, and generating the spectral harmonic size of the voiced sound signal using the pitch information and the maximum frequency information.
- the encoding apparatus may quantize the spectral harmonic size of the voiced sound signal included in the vocal object signal using a quantization table calculated based on the average value of the subband power of the vocal object signal and the subband power of the vocal object signal.
- FIG. 12 is a flowchart illustrating a decoding method for vocal harmonic coding.
- In step 1210, the decoding apparatus reproduces a vocal object signal and an instrument object signal from a downmix signal using spatial parameters.
- the decoding apparatus removes the harmonic component from the reproduced instrument object signal using the reproduced vocal object signal and the harmonic information.
- Step 1220 may be performed through a harmonic filter.
- the harmonic information may include the pitch of the voiced sound signal included in the vocal object signal, the maximum harmonic frequency of the voiced sound signal, and the spectral harmonic size of the voiced sound signal.
- The decoding apparatus smooths the instrument object signal from which the harmonic component has been removed, using a smoothing filter.
- the decoding apparatus may generate a SAOC demodulation output using the reproduced vocal object signal and the reproduced instrument object signal.
- FIG. 13 is a block diagram illustrating a personal audio studio system according to an embodiment of the present invention.
- the personal audio studio system may selectively receive either original sound or compressed content as input content.
- the user can set whether the input content is original sound or compressed content.
- If the input content is original sound, the selector, shown in the form of a switch, routes the signal to the object control module 1; conversely, if the input content is compressed content, the signal is routed to the object control module 2.
- The object control module 1 may generate the compressed content (SAOC-based contents) by compressing the original sound using any one of SAOC coding, residual coding, and vocal harmonic coding.
- The object control module 2 may perform at least one of object insertion, object addition, and object editing (addition after object removal) in the compressed state.
- FIG. 14 is a diagram illustrating an encoding apparatus capable of selectively using any one of SAOC coding, residual coding, and vocal harmonic coding.
- the object control module 1 illustrated in FIG. 13 includes a SAOC-based encoder, and the SAOC-based encoder may selectively use any one of various coding methods.
- the SAOC-based encoder may selectively use any one of SAOC coding, residual coding, and vocal harmonic coding, and the SAOC encoder and the S-VHC encoder (vocal harmonic encoder) are as described above.
- the S-RC encoder (residual encoder)
- characteristics of the SAOC encoder, the S-VHC encoder (vocal harmonic encoder), and the S-RC encoder may be expressed as follows.
- the SAOC encoder has a downmixed signal and OLD as outputs, and has a very low bit rate and low quality.
- the vocal harmonic encoder has a downmixed signal and OLD and harmonic information as outputs, and has a low bit rate and relatively good quality, and has characteristics suitable for a karaoke service.
- the S-RC encoder (residual encoder)
- c1 and c2 may be calculated as follows by a spatial parameter called CLD.
- the residual signal can be calculated as follows.
- the residual signal may be represented as follows.
- the residual encoder illustrated in FIG. 15 may generate a downmix signal, a spatial parameter, and a residual signal as follows. More specifically, the downmix signal generator may generate the downmix signal Xd (k) as follows.
- the spatial parameter calculator can calculate the spatial parameter OLD for each object as follows.
- i is the index of the object in the input content
- B is the number of parameter subbands
- N is the number of objects in the input content.
- Pi (b) represents the subband power in the b th subband of the i th object, and is defined as follows.
- a residual signal can be generated as follows using the OLD spatial parameter calculated by the spatial parameter calculator without the need to separately calculate the CLD.
- FIG. 16 is a diagram illustrating the residual signal generator shown in FIG. 15 in more detail.
- the residual encoder receives an original sound including audio signals for a plurality of objects and generates a downmix signal.
- the generated downmix signal is provided to the residual signal generator and the spatial parameter calculator, and the spatial parameter calculator calculates OLD for each object.
- the downmix signal and the calculated OLD for each object are provided to the residual signal generator, and the residual signal generator generates the residual signal for each object based on the following equation defined above.
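The idea of deriving a per-object residual from the downmix and the OLD alone can be sketched as follows. The patent's residual equations are not reproduced in this text, so this is an assumed formulation: each object is first estimated from the downmix with a gain of OLD_i / Σ_j OLD_j, and the residual is the difference between the original object and that estimate. The function name `residual_signals` is hypothetical.

```python
from typing import List, Sequence


def residual_signals(objects: Sequence[Sequence[float]],
                     downmix_band: Sequence[float],
                     old_band: Sequence[float]) -> List[List[float]]:
    """Residual of each object in one parameter subband.

    objects[i]:   original samples (or spectral coefficients) of object i.
    downmix_band: downmix samples in the same subband.
    old_band:     OLD of each object in the subband.
    Assumed model: residual_i = object_i - (OLD_i / sum_j OLD_j) * downmix.
    """
    total = sum(old_band)
    residuals = []
    for obj, old_i in zip(objects, old_band):
        gain = old_i / total if total > 0.0 else 0.0
        residuals.append([x - gain * d for x, d in zip(obj, downmix_band)])
    return residuals
```

Under this model a decoder that forms the same OLD-based estimate and adds the transmitted residual recovers the original object, and no separate CLD computation is needed.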
- the method according to the embodiment may be embodied in the form of program instructions that can be executed by various computer means and recorded in a computer readable medium.
- the computer readable medium may include program instructions, data files, data structures, etc. alone or in combination.
- the program instructions recorded on the media may be those specially designed and constructed for the purposes of the embodiments, or they may be of the kind well-known and available to those having skill in the computer software arts.
- Examples of computer-readable recording media include magnetic media such as hard disks, floppy disks, and magnetic tape; optical media such as CD-ROMs and DVDs; and magneto-optical media such as floptical disks.
- Examples of program instructions include not only machine code generated by a compiler, but also high-level language code that can be executed by a computer using an interpreter or the like.
- the hardware device described above may be configured to operate as one or more software modules to perform the operations of the embodiments, and vice versa.
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Signal Processing (AREA)
- Acoustics & Sound (AREA)
- Multimedia (AREA)
- Mathematical Physics (AREA)
- Computational Linguistics (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Human Computer Interaction (AREA)
- Compression, Expansion, Code Conversion, And Decoders (AREA)
Abstract
The invention relates to a residual encoding apparatus comprising: a downmix signal generation unit for generating a downmix signal by computing a weighted sum of a plurality of input object signals; a spatial parameter generation unit for generating a spatial parameter by normalizing the subband power of each of the plurality of object signals; and a residual signal generation unit for generating a residual signal for each of the plurality of object signals using the spatial parameter.
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
KR1020140008595A KR101536855B1 (ko) | 2014-01-23 | 2014-01-23 | Encoding apparatus and method using residual coding |
KR10-2014-0008595 | 2014-01-23 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2015111970A1 (fr) | 2015-07-30 |
Family
ID=53681693
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/KR2015/000763 WO2015111970A1 (fr) | Encoding apparatus and method using residual coding | 2014-01-23 | 2015-01-23 |
Country Status (2)
Country | Link |
---|---|
KR (1) | KR101536855B1 (fr) |
WO (1) | WO2015111970A1 (fr) |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
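KR20080037106A (ko) * | 2005-08-30 | 2008-04-29 | LG Electronics Inc. | Apparatus and method for encoding and decoding an audio signal |
KR100917843B1 (ko) * | 2006-09-29 | 2009-09-18 | Electronics and Telecommunications Research Institute | Apparatus and method for encoding and decoding a multi-object audio signal composed of various channels |
KR20100007740A (ko) * | 2008-07-10 | 2010-01-22 | Electronics and Telecommunications Research Institute | Method and apparatus for editing audio objects in spatial-information-based multi-object audio coding |
KR20100114450A (ko) * | 2009-04-15 | 2010-10-25 | Electronics and Telecommunications Research Institute | High-quality multi-object audio encoding and decoding apparatus using variable-bit-rate residual signal coding |
KR20100132913A (ko) * | 2009-06-10 | 2010-12-20 | Electronics and Telecommunications Research Institute | Method and encoding apparatus for encoding a multi-object audio signal, decoding method and decoding apparatus, and transcoding method and transcoder |
KR20110018728A (ko) * | 2009-08-18 | 2011-02-24 | Samsung Electronics Co., Ltd. | Method and apparatus for encoding a multi-channel audio signal, and method and apparatus for decoding the same |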
-
2014
- 2014-01-23 KR KR1020140008595A patent/KR101536855B1/ko active IP Right Grant
-
2015
- 2015-01-23 WO PCT/KR2015/000763 patent/WO2015111970A1/fr active Application Filing
Also Published As
Publication number | Publication date |
---|---|
KR101536855B1 (ko) | 2015-07-14 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
WO2012036487A2 (fr) | | Apparatus and method for encoding and decoding a signal for high-frequency bandwidth extension |
WO2010107269A2 (fr) | | Apparatus and method for encoding/decoding a multichannel signal |
WO2013141638A1 (fr) | | Method and apparatus for high-frequency encoding/decoding for bandwidth extension |
WO2013183977A1 (fr) | | Method and apparatus for concealing frame errors and method and apparatus for audio decoding |
WO2012157931A2 (fr) | | Noise filling and audio decoding |
WO2010005272A2 (fr) | | Method and apparatus for multiplex encoding and decoding |
WO2010087614A2 (fr) | | Method for encoding and decoding an audio signal and apparatus for same |
WO2013002623A2 (fr) | | Apparatus and method for generating a bandwidth extension signal |
WO2010050740A2 (fr) | | Apparatus and method for encoding/decoding a multichannel signal |
WO2012144878A2 (fr) | | Method of quantizing linear predictive coding coefficients, sound encoding method, method of de-quantizing linear predictive coding coefficients, sound decoding method, and recording medium |
WO2013058635A2 (fr) | | Method and apparatus for concealing frame errors and method and apparatus for audio decoding |
WO2017039422A2 (fr) | | Signal processing methods and apparatuses for enhancing sound quality |
WO2018174310A1 (fr) | | Method and apparatus for processing a speech signal adaptive to a noise environment |
WO2013115625A1 (fr) | | Method and apparatus for processing audio signals with low complexity |
WO2020145472A1 (fr) | | Neural vocoder for implementing a speaker-adaptive model and generating a synthesized speech signal, and method for training the neural vocoder |
EP2700072A2 (fr) | | Apparatus for quantizing linear predictive coding coefficients, sound encoding apparatus, apparatus for de-quantizing linear predictive coding coefficients, sound decoding apparatus, and electronic device therefor |
WO2009145449A2 (fr) | | Method for processing a noisy speech signal, apparatus for same, and computer-readable recording medium |
AU2012246799A1 (en) | | Method of quantizing linear predictive coding coefficients, sound encoding method, method of de-quantizing linear predictive coding coefficients, sound decoding method, and recording medium |
WO2016024853A1 (fr) | | Method and device for improving sound quality, method and device for sound decoding, and multimedia device using same |
WO2014185569A1 (fr) | | Method and device for encoding and decoding an audio signal |
WO2016032021A1 (fr) | | Apparatus and method for recognizing voice commands |
WO2017222356A1 (fr) | | Signal processing method and device adaptive to a noise environment, and terminal equipment using same |
WO2020050509A1 (fr) | | Speech synthesis device |
WO2015170899A1 (fr) | | Method and device for quantizing a linear predictive coefficient, and method and device for de-quantizing same |
WO2014163231A1 (fr) | | Speech signal extraction method and speech signal extraction apparatus for use in speech recognition in an environment in which multiple sound sources are output |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| | 121 | EP: the EPO has been informed by WIPO that EP was designated in this application | Ref document number: 15740782; Country of ref document: EP; Kind code of ref document: A1 |
| | NENP | Non-entry into the national phase | Ref country code: DE |
| | 122 | EP: PCT application non-entry in European phase | Ref document number: 15740782; Country of ref document: EP; Kind code of ref document: A1 |