WO2010091555A1 - Stereo encoding method and apparatus - Google Patents

Stereo encoding method and apparatus

Info

Publication number
WO2010091555A1
Authority
WO
WIPO (PCT)
Prior art keywords
frame
delay
stereo signal
current
adjustment
Prior art date
Application number
PCT/CN2009/070428
Other languages
English (en)
French (fr)
Inventor
吴文海
郎玥
苗磊
刘泽新
胡晨
塔迪·哈维·米希尔
张清
Original Assignee
华为技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 华为技术有限公司
Priority to PCT/CN2009/070428 (WO2010091555A1, zh)
Priority to CN2009801545991A (CN102292769B, zh)
Priority to EP09839878.7A (EP2395504B1, en)
Publication of WO2010091555A1
Priority to US13/208,460 (US8489406B2, en)

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/008 Multichannel audio signal coding or decoding using interchannel correlation to reduce redundancy, e.g. joint-stereo, intensity-coding or matrixing
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04S STEREOPHONIC SYSTEMS
    • H04S1/00 Two-channel systems
    • H04S1/007 Two-channel systems in which the audio signals are in digital form
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04S STEREOPHONIC SYSTEMS
    • H04S2420/00 Techniques used in stereophonic systems covered by H04S but not provided for in its groups
    • H04S2420/03 Application of parametric coding in stereophonic audio systems

Definitions

  • The present invention relates to the field of stereo technology, and in particular to a method and apparatus for stereo encoding.
  • The purpose of stereo is to convey or reconstruct a particular sound field, reproducing for the listener the sound and spatial characteristics of the original sound field.
  • In recent years, advances in computer technology and digital signal processing, together with the needs of high-definition television sound systems and home audio-visual systems, have driven considerable development of stereo technology and have placed higher requirements on it, especially on encoding and decoding.
  • Existing stereo encoding methods fall into two categories, the first of which is the early waveform-based stereo encoding.
  • The second category is parametric stereo encoding, which is currently more common.
  • In parametric stereo encoding, the left and right channel signals are usually not encoded directly; instead, they are downmixed, the downmixed signal is encoded, and some additional side information is encoded as well.
  • The stereo signal is recovered at the decoding end from the downmix signal and this side information.
  • The quality of the stereo signal depends to a large extent on the quality of the downmix signal.
  • The sound source usually moves relative to the two microphones recording the left and right channels, or is at different distances from them, which inevitably introduces a certain delay between the left and right signals, so they cannot be fully synchronized. If this delay can be adjusted during downmixing, that is, if the left and right channel signals can be synchronized, the quality of the synthesized stereo signal can be improved considerably.
  • FIG. 1 is a schematic flowchart of a stereo encoding method in the prior art.
  • The left and right signals are first downsampled by 4, and after Linear Predictive Coding (LPC) analysis and LPC filtering, residual signals are obtained.
  • The delays of the left and right signals are then extracted separately. If the delays of the two signals differ between two consecutive frames, delay adjustment is performed before downmixing.
  • Embodiments of the present invention provide a method and apparatus for stereo encoding that can reduce the distortion caused by delay adjustment.
  • Specifically, an embodiment of the present invention provides a method for stereo encoding, including: extracting the current inter-channel delay of a stereo signal and the previous delay adjacent to the current inter-channel delay; when the current delay differs from the previous delay, performing an adjustment-frame judgment according to the characteristics of the current stereo signal; and if the frame containing the current delay is judged to be an adjustment frame, using the current inter-channel delay to adjust the delay of the stereo signal.
  • Another embodiment provides a stereo encoding apparatus, including: a delay extraction unit, configured to obtain the current inter-channel delay of a stereo signal and the previous delay adjacent to the current inter-channel delay; a judging unit, configured to perform an adjustment-frame judgment according to the characteristics of the current stereo signal when the current delay obtained by the delay extraction unit differs from the previous delay; and a delay adjustment unit, configured to use the current inter-channel delay to adjust the delay of the stereo signal when the judging unit judges that the frame containing the current delay is an adjustment frame.
  • FIG. 1 is a schematic flowchart of a stereo encoding method in the prior art.
  • FIG. 2 is a flowchart of a stereo encoding method according to an embodiment of the present invention.
  • FIG. 3 is a schematic flowchart of a stereo encoding method according to an embodiment of the present invention.
  • FIG. 4 is a flowchart for determining unvoiced and voiced sound within one channel according to an embodiment of the present invention.
  • FIG. 5 is a schematic structural diagram of a stereo encoding apparatus according to an embodiment of the present invention.
  • A method for stereo encoding provided by an embodiment of the present invention includes:
  • Step 21: Extract the current inter-channel delay of the stereo signal and the previous delay adjacent to the current inter-channel delay.
  • Step 22: When the current delay differs from the previous delay, perform an adjustment-frame judgment according to the characteristics of the current stereo signal.
  • Step 23: If the frame containing the current delay is judged to be an adjustment frame, use the current inter-channel delay to adjust the delay of the stereo signal.
  • The method extracts the current inter-channel delay of the stereo signal and the previous delay adjacent to it; when the two differ, it performs an adjustment-frame judgment according to the characteristics of the current stereo signal, and only when the frame containing the current delay is judged to be an adjustment frame does it use the current inter-channel delay to adjust the delay of the stereo signal. The delay is therefore adjusted only when adjustment is appropriate, which reduces the distortion caused by delay adjustment.
  • FIG. 3 is a schematic flowchart of a stereo encoding method provided by an embodiment of the present invention.
  • As in the prior art, the left and right signals are first downsampled by 4, and residual signals are obtained after LPC analysis and LPC filtering. The delays of the left and right signals are then extracted separately. If the delays differ between two consecutive frames, a judgment is made before downmixing as to whether delay adjustment is appropriate: at the point where the stereo signal would need delay adjustment, an adjustment-frame judgment is made according to the characteristics of the current stereo signal, and if the frame containing the current delay is judged to be an adjustment frame, the current inter-channel delay is used to adjust the delay of the stereo signal.
  • One method is to judge according to the category of the stereo signal.
  • Specifically, the frame containing the current delay is judged to be an adjustment frame when the stereo signal is an unvoiced frame or a silence frame, and a non-adjustment frame when the stereo signal is a voiced frame.
  • The procedure determines the category of the signal from the average value, the maximum value, and the zero-crossing rate over one pitch period of the stereo signal.
  • First, the pitch period of the signal is extracted and a counter count is initialized to 0. The maximum value and the average value within the pitch period are then extracted, and the average value is compared with a set average threshold: if it is greater than the average threshold, count is incremented by 1; otherwise count is unchanged.
  • Next, the ratio of the maximum value to the average value within the pitch period is compared with a set ratio threshold: if it is greater than the ratio threshold, count is incremented by 1; otherwise count is unchanged. A zero-crossing-rate threshold is applied in the same way: count is incremented by 1 when its condition is met and is unchanged otherwise. Finally, count is compared with 2: if it is greater than 2, the signal is judged to be voiced; otherwise it is judged to be unvoiced.
  • Silence can be treated in the same way as unvoiced sound. Following the judgment procedure above, an implementation can output 1 for a voiced frame and 0 for an unvoiced or silence frame.
  • The category of the whole stereo signal is determined from the categories of the left and right channel signals.
  • The stereo signal is judged to be voiced only when both the left and the right channel signals are voiced.
  • Another method is to judge according to the energy of the stereo signal. Specifically, the frame containing the current delay is judged to be an adjustment frame when the frame energy of the stereo signal is less than a set threshold, and a non-adjustment frame when the frame energy is greater than or equal to that threshold.
  • Yet another method is to judge according to a combination of the category and the energy of the stereo signal.
  • Specifically, when the stereo signal is an unvoiced frame or a silence frame and its frame energy is less than a set threshold, the frame containing the current delay is judged to be an adjustment frame; otherwise it is judged to be a non-adjustment frame. Alternatively, when the stereo signal is an unvoiced frame or a silence frame, or when its frame energy is less than a set threshold, the frame containing the current delay is judged to be an adjustment frame; otherwise it is judged to be a non-adjustment frame.
  • For example, for a speech signal with relatively strong background noise or a music signal with weak periodicity, other methods may also be used to judge the adjustment frame.
  • An embodiment of the present invention further provides an apparatus for stereo encoding, including a delay extraction unit 51, configured to obtain the current inter-channel delay of a stereo signal and the previous delay adjacent to the current inter-channel delay.
  • The judging unit 52 is configured to perform an adjustment-frame judgment according to the characteristics of the current stereo signal when the current delay obtained by the delay extraction unit differs from the previous delay.
  • The delay adjustment unit 53 is configured to use the current inter-channel delay to adjust the delay of the stereo signal when the judging unit judges that the frame containing the current delay is an adjustment frame.
  • The judging unit 52 includes any one of the following modules:
  • A category judging module, configured to perform the adjustment-frame judgment according to the category of the stereo signal.
  • An energy judging module, configured to perform the adjustment-frame judgment according to the energy of the stereo signal.
  • A category-energy judging module, configured to perform the adjustment-frame judgment according to a combination of the category and the energy of the stereo signal.
  • The category judging module is configured to judge that the frame containing the current delay is an adjustment frame when the stereo signal is an unvoiced frame or a silence frame, and a non-adjustment frame when the stereo signal is a voiced frame.
  • The energy judging module is configured to judge that the frame containing the current delay is an adjustment frame when the frame energy of the stereo signal is less than a set threshold, and a non-adjustment frame when the frame energy is greater than or equal to that threshold.
  • The category-energy judging module is configured to judge that the frame containing the current delay is an adjustment frame when the stereo signal is an unvoiced frame or a silence frame and its frame energy is less than a set threshold, and a non-adjustment frame otherwise; or, alternatively, to judge that the frame is an adjustment frame when the stereo signal is an unvoiced frame or a silence frame, or when its frame energy is less than a set threshold, and a non-adjustment frame otherwise.
  • The judging unit is not limited to the judging modules above.
  • The modules above are described only as preferred embodiments of the present invention, and other judging modules may also be used to judge the adjustment frame.
  • The present invention imposes no particular limitation in this respect.
  • In the apparatus for stereo encoding provided by the embodiment of the present invention, the delay extraction unit 51 extracts the current inter-channel delay of the stereo signal and the previous delay adjacent to it; when the two differ, the judging unit 52 performs an adjustment-frame judgment according to the characteristics of the current stereo signal; and only when the frame containing the current delay is judged to be an adjustment frame does the delay adjustment unit 53 use the current inter-channel delay to adjust the delay of the stereo signal. The delay is therefore adjusted only when adjustment is appropriate, which reduces the distortion caused by delay adjustment.
  • The storage medium may be a magnetic disk, an optical disc, a read-only memory (ROM), or a random access memory (RAM).
  • The functional units in the embodiments of the present invention may be integrated into one processing module, each unit may exist physically on its own, or two or more units may be integrated into one module.
  • The integrated modules may be implemented in the form of hardware or in the form of software functional modules.
  • If the integrated modules are implemented as software functional modules and sold or used as independent products, they may also be stored in a computer-readable storage medium.
  • The storage medium mentioned above may be a read-only memory, a magnetic disk, an optical disc, or the like.

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Computational Linguistics (AREA)
  • Signal Processing (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Compression, Expansion, Code Conversion, And Decoders (AREA)
  • Stereophonic System (AREA)

Description

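Stereo encoding method and apparatus
Technical Field
The present invention relates to the field of stereo technology, and in particular to a method and apparatus for stereo encoding.
Background
The purpose of stereo is to convey or reconstruct a particular sound field, reproducing for the listener the sound and spatial characteristics of the original sound field. In recent years, advances in computer technology and digital signal processing, together with the needs of high-definition television sound systems and home audio-visual systems, have driven considerable development of stereo technology, and have also placed higher requirements on stereo technology, especially on encoding and decoding.
Existing stereo encoding methods fall into two categories. The first is the early waveform-based stereo encoding. The second is parametric stereo encoding, which is currently more common. In parametric stereo encoding, the left and right channel signals are usually not encoded directly; instead, they are downmixed, the downmixed signal is encoded, and some additional side information is encoded as well. At the decoding end, the stereo signal is recovered from the downmix signal and this side information.
The quality of a stereo signal depends to a large extent on the quality of the downmix signal. The better synchronized the left and right channel signals are, the less information is lost during downmixing. Usually, however, the sound source moves relative to the two microphones recording the left and right channels, or is at different distances from them, which inevitably introduces a certain delay between the two signals, so they cannot be fully synchronized. If this delay can be adjusted during downmixing, that is, if the left and right channel signals can be synchronized, the quality of the synthesized stereo signal can be improved considerably.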
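The effect of inter-channel delay on downmixing can be illustrated with a small sketch. It is not taken from this application, which does not say how the delay is extracted: the cross-correlation search, the averaging downmix, and the names `estimate_delay`, `shift`, and `downmix` are all assumptions chosen for illustration.

```python
def estimate_delay(left, right, max_lag):
    """Return the lag (in samples) at which `right` best matches `left`.

    A positive result means `right` lags behind `left` by that many samples.
    Assumption: the delay is found by maximizing the cross-correlation.
    """
    best_lag, best_score = 0, float("-inf")
    # Visit small lags first so that ties are resolved toward zero delay.
    for lag in sorted(range(-max_lag, max_lag + 1), key=abs):
        # Cross-correlation of left[n] with right[n + lag] over the valid overlap.
        score = sum(
            left[n] * right[n + lag]
            for n in range(len(left))
            if 0 <= n + lag < len(right)
        )
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


def shift(signal, lag):
    """Advance `signal` by `lag` samples, zero-padding where no data exists."""
    n = len(signal)
    return [signal[i + lag] if 0 <= i + lag < n else 0.0 for i in range(n)]


def downmix(left, right, max_lag=8):
    """Align `right` to `left`, then average the channels into a mono downmix."""
    lag = estimate_delay(left, right, max_lag)
    aligned_right = shift(right, lag)
    mono = [(l + r) / 2.0 for l, r in zip(left, aligned_right)]
    return mono, lag
```

When the right channel is simply a delayed copy of the left, aligning it first makes the downmix equal to the left channel, whereas averaging the unaligned channels would smear the pulse across both positions.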
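Referring to FIG. 1, FIG. 1 is a schematic flowchart of a stereo encoding method in the prior art. The left and right signals are first downsampled by 4, and after Linear Predictive Coding (LPC) analysis and LPC filtering, residual signals are obtained. The delays of the left and right signals are then extracted separately, and if the delays of the two signals differ between two consecutive frames, delay adjustment is performed before downmixing.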
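In the course of implementing the present invention, the inventors found the following:
Delay adjustment requires overlap-adding the left and right channel signals, and this process introduces distortion; moreover, stereo signals with different characteristics, when overlap-added, produce different distortion effects on the discontinuity of inter-frame data. Because the prior art does not distinguish the characteristics of the stereo signal at the time of delay adjustment, and performs delay adjustment immediately whenever the delays of the left and right signals differ between two consecutive frames, it may introduce very severe distortion.
Summary of the Invention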
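Embodiments of the present invention provide a method and apparatus for stereo encoding that can reduce the distortion caused by delay adjustment.
Specifically, an embodiment of the present invention provides a method for stereo encoding, including: extracting the current inter-channel delay of a stereo signal and the previous delay adjacent to the current inter-channel delay; when the current delay differs from the previous delay, performing an adjustment-frame judgment according to the characteristics of the current stereo signal; and if the frame containing the current delay is judged to be an adjustment frame, using the current inter-channel delay to adjust the delay of the stereo signal.
Another embodiment of the present invention provides an apparatus for stereo encoding, including: a delay extraction unit, configured to obtain the current inter-channel delay of a stereo signal and the previous delay adjacent to the current inter-channel delay; a judging unit, configured to perform an adjustment-frame judgment according to the characteristics of the current stereo signal when the current delay obtained by the delay extraction unit differs from the previous delay; and a delay adjustment unit, configured to use the current inter-channel delay to adjust the delay of the stereo signal when the judging unit judges that the frame containing the current delay is an adjustment frame.
As can be seen from the technical solutions above, the current inter-channel delay of the stereo signal and the previous delay adjacent to it are extracted; when the two differ, an adjustment-frame judgment is performed according to the characteristics of the current stereo signal; and only when the frame containing the current delay is judged to be an adjustment frame is the current inter-channel delay used to adjust the delay of the stereo signal. The delay is therefore adjusted only when adjustment is appropriate, which reduces the distortion caused by delay adjustment.
Brief Description of the Drawings
The accompanying drawings needed for describing the embodiments or the prior art are briefly introduced below. Evidently, the drawings described below show only some embodiments of the present invention, and a person of ordinary skill in the art can derive other drawings from them without creative effort.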
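FIG. 1 is a schematic flowchart of a stereo encoding method in the prior art;
FIG. 2 is a flowchart of a stereo encoding method according to an embodiment of the present invention;
FIG. 3 is a schematic flowchart of a stereo encoding method according to an embodiment of the present invention;
FIG. 4 is a flowchart for determining unvoiced and voiced sound within one channel according to an embodiment of the present invention;
FIG. 5 is a schematic structural diagram of a stereo encoding apparatus according to an embodiment of the present invention.
Detailed Description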
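To make the objectives, technical solutions, and advantages of the present invention clearer, the technical solutions provided by the present invention are described in further detail below with reference to the accompanying drawings and embodiments. Evidently, the described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present invention without creative effort fall within the protection scope of the present invention.
Referring to FIG. 2, a method for stereo encoding provided by an embodiment of the present invention includes:
Step 21: extract the current inter-channel delay of the stereo signal and the previous delay adjacent to the current inter-channel delay;
Step 22: when the current delay differs from the previous delay, perform an adjustment-frame judgment according to the characteristics of the current stereo signal;
Step 23: if the frame containing the current delay is judged to be an adjustment frame, use the current inter-channel delay to adjust the delay of the stereo signal.
The method for stereo encoding provided by the embodiment of the present invention extracts the current inter-channel delay of the stereo signal and the previous delay adjacent to it; when the two differ, it performs an adjustment-frame judgment according to the characteristics of the current stereo signal; and only when the frame containing the current delay is judged to be an adjustment frame does it use the current inter-channel delay to adjust the delay of the stereo signal. The delay is therefore adjusted only when adjustment is appropriate, which reduces the distortion caused by delay adjustment.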
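Steps 21 to 23 can be sketched as a per-frame loop. This is a minimal illustration under stated assumptions, not the applicant's implementation: the function name `select_delays` is hypothetical, the "previous delay" is taken to be the delay extracted for the preceding frame, and a frame that is not an adjustment frame is assumed to keep the last applied delay, which the text does not spell out.

```python
def select_delays(extracted_delays, adjustable_flags, initial_delay=0):
    """Return the delay applied to each frame.

    extracted_delays: inter-channel delay extracted per frame (step 21).
    adjustable_flags: True where the frame is judged to be an adjustment frame.
    """
    applied = initial_delay   # delay currently used for alignment (assumption)
    previous = initial_delay  # delay extracted for the preceding frame
    result = []
    for current, adjustable in zip(extracted_delays, adjustable_flags):
        # Step 22: the judgment only matters when the delay has changed.
        if current != previous and adjustable:
            # Step 23: adopt the current inter-channel delay.
            applied = current
        result.append(applied)
        previous = current
    return result
```

Under these assumptions, a delay change that arrives on a non-adjustment frame is skipped, and the applied delay is only updated at the next change that falls on an adjustment frame.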
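Referring to FIG. 3, which is a schematic flowchart of a stereo encoding method provided by an embodiment of the present invention: as in the prior art, the left and right signals are first downsampled by 4, residual signals are obtained after LPC analysis and LPC filtering, and the delays of the left and right signals are then extracted separately. If the delays of the two signals differ between two consecutive frames, a judgment is made before downmixing as to whether delay adjustment is appropriate. That is, when the delays of two consecutive frames differ, at the point where the stereo signal would need delay adjustment, an adjustment-frame judgment is made according to the characteristics of the current stereo signal, and if the frame containing the current delay is judged to be an adjustment frame, the current inter-channel delay is used to adjust the delay of the stereo signal.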
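For the adjustment-frame judgment according to stereo signal characteristics, the embodiments of the present invention provide the following judgment methods:
One method is to judge according to the category of the stereo signal. Specifically, the frame containing the current delay is judged to be an adjustment frame when the stereo signal is an unvoiced frame or a silence frame, and a non-adjustment frame when the stereo signal is a voiced frame.
Referring to FIG. 4, FIG. 4 is a flowchart for determining unvoiced and voiced sound within one channel. The flow judges the category of the signal from the average value, the maximum value, and the zero-crossing rate over one pitch period of the stereo signal. First, the pitch period of the signal is extracted and a counter count is initialized to 0. The maximum value and the average value within the pitch period are then extracted, and the average value is compared with a set average threshold: if it is greater than the average threshold, count is incremented by 1; otherwise count is unchanged. Next, the ratio of the maximum value to the average value within the pitch period is compared with a set ratio threshold: if it is greater than the ratio threshold, count is incremented by 1; otherwise count is unchanged. A zero-crossing-rate threshold is applied in the same way: count is incremented by 1 when its condition is met and is unchanged otherwise. Finally, count is compared with 2: if it is greater than 2, the signal is judged to be voiced; otherwise it is judged to be unvoiced.
It should be noted that silence can be treated in the same way as unvoiced sound. Following the judgment procedure above, an implementation can output 1 for a voiced frame and 0 for an unvoiced or silence frame.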
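The counting procedure of FIG. 4 can be sketched as follows. Several details are assumptions rather than statements from the text: the "average" and "maximum" are taken over absolute sample values, the zero-crossing test increments count when the rate is below its threshold (voiced speech typically has few zero crossings, but the direction of this comparison is not recoverable from the text), and the function name and threshold parameters are hypothetical.

```python
def classify_voicing(period, mean_threshold, ratio_threshold, zcr_threshold):
    """Return 1 (voiced) or 0 (unvoiced or silence) for one pitch period."""
    if len(period) < 2:
        return 0
    magnitudes = [abs(x) for x in period]
    mean_mag = sum(magnitudes) / len(magnitudes)
    peak = max(magnitudes)
    # Fraction of adjacent sample pairs whose signs differ.
    crossings = sum(
        1 for a, b in zip(period, period[1:]) if (a >= 0) != (b >= 0)
    )
    zcr = crossings / (len(period) - 1)

    count = 0
    if mean_mag > mean_threshold:            # average-value test
        count += 1
    if mean_mag > 0 and peak / mean_mag > ratio_threshold:  # peak-to-average test
        count += 1
    if zcr < zcr_threshold:                  # zero-crossing test (assumed direction)
        count += 1
    # Voiced only if count is greater than 2, i.e. all three tests pass.
    return 1 if count > 2 else 0
```

Because the decision requires count to exceed 2, a period must satisfy all three tests to be classified as voiced; silence fails the average-value test and therefore falls into the same output as unvoiced sound.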
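The category of the whole stereo signal is determined from the categories of the left and right channel signals: the stereo signal is judged to be voiced only when both the left and the right channel signals are voiced.
Another method is to judge according to the energy of the stereo signal. Specifically, the frame containing the current delay is judged to be an adjustment frame when the frame energy of the stereo signal is less than a set threshold, and a non-adjustment frame when the frame energy is greater than or equal to that threshold.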
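The two rules above, combining the per-channel categories and thresholding the frame energy, can be sketched as below. The text does not define how the frame energy of a stereo signal is computed; the sum of squared samples over both channels used here is an assumption, as are the function names.

```python
def stereo_is_voiced(left_voiced, right_voiced):
    """Stereo frame is voiced only if both channels are voiced (1 = voiced)."""
    return left_voiced == 1 and right_voiced == 1


def frame_energy(left, right):
    """Assumed definition: sum of squared samples over both channels."""
    return sum(x * x for x in left) + sum(x * x for x in right)


def is_adjustment_frame_by_energy(left, right, energy_threshold):
    """Adjustment frame if the frame energy is strictly below the threshold."""
    return frame_energy(left, right) < energy_threshold
```

Note that the comparison is strict: a frame whose energy equals the threshold is a non-adjustment frame, matching the "greater than or equal to" wording.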
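Yet another method is to judge according to a combination of the category and the energy of the stereo signal. Specifically, when the stereo signal is an unvoiced frame or a silence frame and its frame energy is less than a set threshold, the frame containing the current delay is judged to be an adjustment frame; otherwise it is judged to be a non-adjustment frame. Alternatively, when the stereo signal is an unvoiced frame or a silence frame, or when its frame energy is less than a set threshold, the frame containing the current delay is judged to be an adjustment frame; otherwise it is judged to be a non-adjustment frame.
The present invention imposes no particular limitation in this respect. For example, for a speech signal with relatively strong background noise or a music signal with weak periodicity, other methods may also be used to judge the adjustment frame.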
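The two combined variants described above differ only in whether the category condition and the energy condition are joined by "and" or by "or". A minimal sketch, in which the function name and the `mode` parameter are hypothetical:

```python
def is_adjustment_frame(voiced, energy, energy_threshold, mode="and"):
    """Combine the category test and the energy test.

    voiced: True for a voiced frame, False for an unvoiced or silence frame.
    mode:   "and" requires both conditions, "or" requires either one.
    """
    unvoiced_or_silent = not voiced
    low_energy = energy < energy_threshold
    if mode == "and":
        return unvoiced_or_silent and low_energy
    if mode == "or":
        return unvoiced_or_silent or low_energy
    raise ValueError("mode must be 'and' or 'or'")
```

The "and" variant is the stricter of the two and adjusts the delay on fewer frames; the "or" variant also allows adjustment on quiet voiced frames and on loud unvoiced frames.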
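Referring to FIG. 5, an embodiment of the present invention further provides an apparatus for stereo encoding, including:
a delay extraction unit 51, configured to obtain the current inter-channel delay of a stereo signal and the previous delay adjacent to the current inter-channel delay;
a judging unit 52, configured to perform an adjustment-frame judgment according to the characteristics of the current stereo signal when the current delay obtained by the delay extraction unit differs from the previous delay; and
a delay adjustment unit 53, configured to use the current inter-channel delay to adjust the delay of the stereo signal when the judging unit judges that the frame containing the current delay is an adjustment frame.
Preferably, the judging unit 52 includes any one of the following modules:
a category judging module, configured to perform the adjustment-frame judgment according to the category of the stereo signal;
an energy judging module, configured to perform the adjustment-frame judgment according to the energy of the stereo signal;
a category-energy judging module, configured to perform the adjustment-frame judgment according to a combination of the category and the energy of the stereo signal.
Specifically, the category judging module is configured to judge that the frame containing the current delay is an adjustment frame when the stereo signal is an unvoiced frame or a silence frame, and a non-adjustment frame when the stereo signal is a voiced frame.
The energy judging module is configured to judge that the frame containing the current delay is an adjustment frame when the frame energy of the stereo signal is less than a set threshold, and a non-adjustment frame when the frame energy is greater than or equal to that threshold.
The category-energy judging module is configured to judge that the frame containing the current delay is an adjustment frame when the stereo signal is an unvoiced frame or a silence frame and its frame energy is less than a set threshold, and a non-adjustment frame otherwise; or, alternatively, the category-energy judging module is configured to judge that the frame is an adjustment frame when the stereo signal is an unvoiced frame or a silence frame, or when its frame energy is less than a set threshold, and a non-adjustment frame otherwise.
Of course, the judging unit is not limited to the judging modules above. These modules are described only as preferred embodiments of the present invention; other judging modules may also be used to judge the adjustment frame, and the present invention imposes no particular limitation in this respect.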
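The interchangeable judging modules suggest a structure in which the judging unit is a pluggable component. The sketch below is one hypothetical way to arrange this in software; the class name, the `features` dictionary, and the choice to keep the last applied delay on non-adjustment frames are all assumptions, not details given in the text.

```python
def category_module(features):
    """Category judging module: adjust on unvoiced or silence frames."""
    return not features["voiced"]


def make_energy_module(threshold):
    """Energy judging module: adjust when frame energy is below `threshold`."""
    def energy_module(features):
        return features["energy"] < threshold
    return energy_module


class DelayAdjustmentStage:
    """Wires together the roles of units 51 to 53 around a judging module."""

    def __init__(self, judging_module, initial_delay=0):
        self.judging_module = judging_module  # role of judging unit 52
        self.previous_delay = initial_delay   # last delay seen from unit 51
        self.applied_delay = initial_delay    # delay used by unit 53

    def process(self, current_delay, features):
        """Return the delay to apply to the frame with the given features."""
        changed = current_delay != self.previous_delay
        if changed and self.judging_module(features):
            self.applied_delay = current_delay
        self.previous_delay = current_delay
        return self.applied_delay
```

Any callable that maps frame features to a boolean can serve as the judging module, which mirrors the statement that the judging unit is not limited to the three modules listed.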
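In the apparatus for stereo encoding provided by the embodiment of the present invention, the delay extraction unit 51 extracts the current inter-channel delay of the stereo signal and the previous delay adjacent to it; when the two differ, the judging unit 52 performs an adjustment-frame judgment according to the characteristics of the current stereo signal; and only when the frame containing the current delay is judged to be an adjustment frame does the delay adjustment unit 53 use the current inter-channel delay to adjust the delay of the stereo signal. The delay is therefore adjusted only when adjustment is appropriate, which reduces the distortion caused by delay adjustment.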
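Finally, it should be noted that a person of ordinary skill in the art will understand that all or part of the procedures of the methods in the foregoing embodiments may be implemented by a computer program instructing relevant hardware. The program may be stored in a computer-readable storage medium and, when executed, may include the procedures of the method embodiments described above. The storage medium may be a magnetic disk, an optical disc, a read-only memory (ROM), a random access memory (RAM), or the like.
The functional units in the embodiments of the present invention may be integrated into one processing module, each unit may exist physically on its own, or two or more units may be integrated into one module. The integrated modules may be implemented in the form of hardware or in the form of software functional modules. If the integrated modules are implemented as software functional modules and sold or used as independent products, they may also be stored in a computer-readable storage medium. The storage medium mentioned above may be a read-only memory, a magnetic disk, an optical disc, or the like.
The specific embodiments above are not intended to limit the present invention. For a person of ordinary skill in the art, any modification, equivalent replacement, or improvement made without departing from the principles of the present invention shall fall within the protection scope of the present invention.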

Claims

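1. A method for stereo encoding, characterized by comprising:
extracting the current inter-channel delay of a stereo signal and the previous delay adjacent to the current inter-channel delay;
when the current delay differs from the previous delay, performing an adjustment-frame judgment according to the characteristics of the current stereo signal; and
if the frame containing the current delay is judged to be an adjustment frame, using the current inter-channel delay to adjust the delay of the stereo signal.
2. The method according to claim 1, characterized in that performing the adjustment-frame judgment according to the characteristics of the current stereo signal comprises one or a combination of the following:
performing the adjustment-frame judgment according to the category of the stereo signal;
or performing the adjustment-frame judgment according to the energy of the stereo signal.
3. The method according to claim 2, characterized in that performing the adjustment-frame judgment according to the category of the stereo signal is specifically:
judging that the frame containing the current delay is an adjustment frame when the stereo signal is an unvoiced frame or a silence frame; and
judging that the frame containing the current delay is a non-adjustment frame when the stereo signal is a voiced frame.
4. The method according to claim 2, characterized in that performing the adjustment-frame judgment according to the energy of the stereo signal is specifically:
judging that the frame containing the current delay is an adjustment frame when the frame energy of the stereo signal is less than a set threshold; and
judging that the frame containing the current delay is a non-adjustment frame when the frame energy of the stereo signal is greater than or equal to the set threshold.
5. The method according to claim 2, characterized in that performing the adjustment-frame judgment according to a combination of the category and the energy of the stereo signal is specifically:
when the stereo signal is an unvoiced frame or a silence frame and the frame energy of the stereo signal is less than a set threshold, judging that the frame containing the current delay is an adjustment frame, and otherwise judging that the frame containing the current delay is a non-adjustment frame;
or, when the stereo signal is an unvoiced frame or a silence frame, or when the frame energy of the stereo signal is less than a set threshold, judging that the frame containing the current delay is an adjustment frame, and otherwise judging that the frame containing the current delay is a non-adjustment frame.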
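6. An apparatus for stereo encoding, characterized by comprising:
a delay extraction unit, configured to obtain the current inter-channel delay of a stereo signal and the previous delay adjacent to the current inter-channel delay;
a judging unit, configured to perform an adjustment-frame judgment according to the characteristics of the current stereo signal when the current delay obtained by the delay extraction unit differs from the previous delay; and
a delay adjustment unit, configured to use the current inter-channel delay to adjust the delay of the stereo signal when the judging unit judges that the frame containing the current delay is an adjustment frame.
7. The apparatus according to claim 6, characterized in that the judging unit comprises any one of the following modules:
a category judging module, configured to perform the adjustment-frame judgment according to the category of the stereo signal;
an energy judging module, configured to perform the adjustment-frame judgment according to the energy of the stereo signal;
a category-energy judging module, configured to perform the adjustment-frame judgment according to a combination of the category and the energy of the stereo signal.
8. The apparatus according to claim 7, characterized in that
the category judging module is specifically configured to judge that the frame containing the current delay is an adjustment frame when the stereo signal is an unvoiced frame or a silence frame, and a non-adjustment frame when the stereo signal is a voiced frame.
9. The apparatus according to claim 7, characterized in that
the energy judging module is specifically configured to judge that the frame containing the current delay is an adjustment frame when the frame energy of the stereo signal is less than a set threshold, and a non-adjustment frame when the frame energy of the stereo signal is greater than or equal to the set threshold.
10. The apparatus according to claim 7, characterized in that
the category-energy judging module is specifically configured to judge that the frame containing the current delay is an adjustment frame when the stereo signal is an unvoiced frame or a silence frame and the frame energy of the stereo signal is less than a set threshold, and a non-adjustment frame otherwise; or is specifically configured to judge that the frame containing the current delay is an adjustment frame when the stereo signal is an unvoiced frame or a silence frame, or when the frame energy of the stereo signal is less than a set threshold, and a non-adjustment frame otherwise.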
PCT/CN2009/070428 2009-02-13 2009-02-13 Stereo encoding method and apparatus WO2010091555A1 (zh)

Priority Applications (4)

Application Number Priority Date Filing Date Title
PCT/CN2009/070428 WO2010091555A1 (zh) 2009-02-13 2009-02-13 Stereo encoding method and apparatus
CN2009801545991A CN102292769B (zh) 2009-02-13 2009-02-13 Stereo encoding method and apparatus
EP09839878.7A EP2395504B1 (en) 2009-02-13 2009-02-13 Stereo encoding method and apparatus
US13/208,460 US8489406B2 (en) 2009-02-13 2011-08-12 Stereo encoding method and apparatus

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2009/070428 WO2010091555A1 (zh) 2009-02-13 2009-02-13 Stereo encoding method and apparatus

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US13/208,460 Continuation US8489406B2 (en) 2009-02-13 2011-08-12 Stereo encoding method and apparatus

Publications (1)

Publication Number Publication Date
WO2010091555A1 (zh) 2010-08-19

Family

ID=42561374

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2009/070428 WO2010091555A1 (zh) 2009-02-13 2009-02-13 Stereo encoding method and apparatus

Country Status (4)

Country Link
US (1) US8489406B2 (zh)
EP (1) EP2395504B1 (zh)
CN (1) CN102292769B (zh)
WO (1) WO2010091555A1 (zh)

Families Citing this family (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2010091555A1 (zh) * 2009-02-13 2010-08-19 华为技术有限公司 Stereo encoding method and apparatus
CN104681029B (zh) 2013-11-29 2018-06-05 华为技术有限公司 Method and apparatus for encoding stereo phase parameters
EP3353779B1 (en) 2015-09-25 2020-06-24 VoiceAge Corporation Method and system for encoding a stereo sound signal using coding parameters of a primary channel to encode a secondary channel
US10115403B2 (en) * 2015-12-18 2018-10-30 Qualcomm Incorporated Encoding of multiple audio signals
US10074373B2 (en) * 2015-12-21 2018-09-11 Qualcomm Incorporated Channel adjustment for inter-frame temporal shift variations
US9978381B2 (en) * 2016-02-12 2018-05-22 Qualcomm Incorporated Encoding of multiple audio signals
US10217468B2 (en) * 2017-01-19 2019-02-26 Qualcomm Incorporated Coding of multiple audio signals
CN108877815B (zh) * 2017-05-16 2021-02-23 华为技术有限公司 Stereo signal processing method and apparatus
CN109215667B (zh) 2017-06-29 2020-12-22 华为技术有限公司 Delay estimation method and apparatus
US10872611B2 (en) * 2017-09-12 2020-12-22 Qualcomm Incorporated Selecting channel adjustment method for inter-frame temporal shift variations

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101091206A (zh) * 2004-12-28 2007-12-19 松下电器产业株式会社 Speech encoding apparatus and speech encoding method
EP1953736A1 (en) * 2005-10-31 2008-08-06 Matsushita Electric Industrial Co., Ltd. Stereo encoding device, and stereo signal predicting method
CN101253557A (zh) * 2005-08-31 2008-08-27 松下电器产业株式会社 Stereo encoding apparatus, stereo decoding apparatus, and stereo encoding method

Family Cites Families (45)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5434948A (en) * 1989-06-15 1995-07-18 British Telecommunications Public Limited Company Polyphonic coding
US5920842A (en) * 1994-10-12 1999-07-06 Pixel Instruments Signal synchronization
US6377919B1 (en) * 1996-02-06 2002-04-23 The Regents Of The University Of California System and method for characterizing voiced excitations of speech and acoustic signals, removing acoustic noise from speech, and synthesizing speech
US6202046B1 (en) * 1997-01-23 2001-03-13 Kabushiki Kaisha Toshiba Background noise/speech classification method
EP0878790A1 (en) * 1997-05-15 1998-11-18 Hewlett-Packard Company Voice coding system and method
US6865215B1 (en) * 2000-02-16 2005-03-08 Iowa State University Research Foundation, Inc. Spread spectrum digital data communication overlay system and method
US6973184B1 (en) * 2000-07-11 2005-12-06 Cisco Technology, Inc. System and method for stereo conferencing over low-bandwidth links
US7358974B2 (en) * 2001-01-29 2008-04-15 Silicon Graphics, Inc. Method and system for minimizing an amount of data needed to test data against subarea boundaries in spatially composited digital video
US7319703B2 (en) * 2001-09-04 2008-01-15 Nokia Corporation Method and apparatus for reducing synchronization delay in packet-based voice terminals by resynchronizing during talk spurts
CA2365203A1 (en) * 2001-12-14 2003-06-14 Voiceage Corporation A signal modification method for efficient coding of speech signals
JP2003243988A (ja) * 2002-02-20 2003-08-29 Tadahiro Omi Data processing device
US7502743B2 (en) * 2002-09-04 2009-03-10 Microsoft Corporation Multi-channel audio encoding and decoding with multi-channel transform selection
US7299190B2 (en) * 2002-09-04 2007-11-20 Microsoft Corporation Quantization and inverse quantization for audio
JP4676140B2 (ja) * 2002-09-04 2011-04-27 マイクロソフト コーポレーション Audio quantization and inverse quantization
US7412376B2 (en) * 2003-09-10 2008-08-12 Microsoft Corporation System and method for real-time detection and preservation of speech onset in a signal
US7761304B2 (en) * 2004-11-30 2010-07-20 Agere Systems Inc. Synchronizing parametric coding of spatial audio with externally provided downmix
WO2006059567A1 (ja) * 2004-11-30 2006-06-08 Matsushita Electric Industrial Co., Ltd. Stereo encoding apparatus, stereo decoding apparatus, and methods thereof
EP1852850A4 (en) * 2005-02-01 2011-02-16 Panasonic Corp SCALABLE CODING DEVICE AND SCALABLE CODING METHOD
US7573912B2 (en) * 2005-02-22 2009-08-11 Fraunhofer-Gesellschaft Zur Foerderung Der Angewandten Forschunng E.V. Near-transparent or transparent multi-channel encoder/decoder scheme
JP5173840B2 (ja) * 2006-02-07 2013-04-03 エルジー エレクトロニクス インコーポレイティド Encoding/decoding apparatus and method
US7454335B2 (en) * 2006-03-20 2008-11-18 Mindspeed Technologies, Inc. Method and system for reducing effects of noise producing artifacts in a voice codec
CA2650419A1 (en) * 2006-04-27 2007-11-08 Technologies Humanware Canada Inc. Method for the time scaling of an audio signal
WO2007137232A2 (en) * 2006-05-20 2007-11-29 Personics Holdings Inc. Method of modifying audio content
CN1983909B (zh) * 2006-06-08 2010-07-28 华为技术有限公司 Frame loss concealment apparatus and method
KR101056325B1 (ko) * 2006-07-07 2011-08-11 프라운호퍼 게젤샤프트 쭈르 푀르데룽 데어 안겐반텐 포르슝 에. 베. Apparatus and method for combining multiple parametrically coded audio sources
US8015000B2 (en) * 2006-08-03 2011-09-06 Broadcom Corporation Classification-based frame loss concealment for audio signals
KR20090013178A (ko) * 2006-09-29 2009-02-04 엘지전자 주식회사 Method and apparatus for encoding and decoding object-based audio signals
BRPI0715559B1 (pt) * 2006-10-16 2021-12-07 Dolby International Ab Enhanced coding and parameter representation of multichannel downmixed object coding
TWI396187B (zh) * 2007-02-14 2013-05-11 Lg Electronics Inc Method and apparatus for encoding and decoding object-based audio signals
KR101411901B1 (ko) * 2007-06-12 2014-06-26 삼성전자주식회사 Method and apparatus for encoding/decoding an audio signal
KR101513028B1 (ko) * 2007-07-02 2015-04-17 엘지전자 주식회사 Broadcast receiver and broadcast signal processing method
EP2201566B1 (en) * 2007-09-19 2015-11-11 Telefonaktiebolaget LM Ericsson (publ) Joint multi-channel audio encoding/decoding
WO2009039897A1 (en) * 2007-09-26 2009-04-02 Fraunhofer - Gesellschaft Zur Förderung Der Angewandten Forschung E.V. Apparatus and method for extracting an ambient signal in an apparatus and method for obtaining weighting coefficients for extracting an ambient signal and computer program
WO2009081567A1 (ja) * 2007-12-21 2009-07-02 Panasonic Corporation Stereo signal conversion apparatus, stereo signal inverse conversion apparatus, and methods thereof
EP2313886B1 (en) * 2008-08-11 2019-02-27 Nokia Technologies Oy Multichannel audio coder and decoder
US8400566B2 (en) * 2008-08-21 2013-03-19 Dolby Laboratories Licensing Corporation Feature optimization and reliability for audio and video signature generation and detection
EP2345027B1 (en) * 2008-10-10 2018-04-18 Telefonaktiebolaget LM Ericsson (publ) Energy-conserving multi-channel audio coding and decoding
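WO2010084756A1 (ja) * 2009-01-22 2010-07-29 パナソニック株式会社 Stereo acoustic signal encoding apparatus, stereo acoustic signal decoding apparatus, and methods thereof
WO2010091555A1 (zh) * 2009-02-13 2010-08-19 华为技术有限公司 Stereo encoding method and apparatus
KR101313116B1 (ko) * 2009-03-24 2013-09-30 후아웨이 테크놀러지 컴퍼니 리미티드 Method and apparatus for switching a signal delay
CN101848412B (zh) * 2009-03-25 2012-03-21 华为技术有限公司 Method and apparatus for inter-channel delay estimation, and encoder
WO2010127489A1 (zh) * 2009-05-07 2010-11-11 华为技术有限公司 Method for detecting signal delay, detection apparatus, and encoder
CN101556799B (zh) * 2009-05-14 2013-08-28 华为技术有限公司 Audio decoding method and audio decoder
CN101989429B (zh) * 2009-07-31 2012-02-01 华为技术有限公司 Transcoding method, apparatus, device, and system
CN102157150B (zh) * 2010-02-12 2012-08-08 华为技术有限公司 Stereo decoding method and apparatus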

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
See also references of EP2395504A4 *

Also Published As

Publication number Publication date
EP2395504B1 (en) 2013-09-18
CN102292769A (zh) 2011-12-21
EP2395504A1 (en) 2011-12-14
US20110301962A1 (en) 2011-12-08
EP2395504A4 (en) 2012-07-11
US8489406B2 (en) 2013-07-16
CN102292769B (zh) 2012-12-19

Legal Events

Date Code Title Description
WWE Wipo information: entry into national phase

Ref document number: 200980154599.1

Country of ref document: CN

121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 09839878

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

WWE Wipo information: entry into national phase

Ref document number: 2009839878

Country of ref document: EP