CN1524399A - Audio channel translation - Google Patents

Audio channel translation


Publication number: CN1524399A (also listed as CNA028046625A, CN02804662A)
Language: Chinese (zh)
Other versions: CN1275498C (granted publication)
Priority: US26728401P
Applicant: 多尔拜实验特许公司 (Dolby Laboratories Licensing Corporation)



    • H04S3/00 Systems employing more than two channels, e.g. quadraphonic
    • H04S3/002 Non-adaptive circuits, e.g. manually adjustable or static, for enhancing the sound image or the spatial distribution
    • H04S5/00 Pseudo-stereo systems, e.g. in which additional channel signals are derived from monophonic signals by means of phase shifting, time delay or reverberation
    • H04S5/005 Pseudo-stereo systems, e.g. in which additional channel signals are derived from monophonic signals by means of phase shifting, time delay or reverberation, of the pseudo five- or more-channel type, e.g. virtual surround


The invention relates to a method for translating M input channels representing a sound field into N output channels representing the same sound field, wherein each channel is a single audio stream representing sound arriving from one direction, M and N are positive integers, and M is at least 2. The method produces one or more sets of output channels, each set having one or more output channels. Each set is associated with two or more spatially adjacent input channels, and each output channel in a set is produced by a process that includes determining a measure of the correlation of the two or more input channels and the level interrelationships of the two or more input channels.


Channel Translation

Technical Field

The invention relates to audio signal processing. More particularly, it relates to translating M input channels representing a sound field into N output channels representing the same sound field, wherein each channel is a single audio stream representing sound arriving from one direction, M and N are positive integers, and M is at least 2.

Background Art

Although humans have only two ears, we hear sound as three-dimensional, relying on several localization cues such as head-related transfer functions (HRTFs) and head motion. Fully faithful sound reproduction therefore requires retaining and reproducing the full three-dimensional sound field, or at least its perceptual cues. Unfortunately, sound recording technology is not suited to capturing a three-dimensional sound field, nor a two-dimensional plane of sound, nor even a one-dimensional line of sound. Current sound recording technology is suited only to capturing, storing, and presenting zero-dimensional discrete channels.

Since Edison's invention of sound recording, most efforts to improve fidelity have focused on overcoming the defects of his original analog groove-modulated cylinder and disc media. These defects include limited and uneven frequency response, noise, distortion, wow and flutter, speed accuracy, wear, dirt, and copying degradation. Although there were a number of piecemeal attempts at partial improvement, including electronic amplification, tape recording, noise reduction, and playback machines costing more than some cars, the traditional problems of individual channel quality were arguably not finally resolved until the development of digital recording in general, and the introduction of the audio Compact Disc (CD) in particular. Since then, apart from some efforts to extend the quality of digital recording further to 24 bits at 96 kHz sampling, the main efforts in sound reproduction research have focused on reducing the amount of data needed to maintain individual channel quality, mostly by means of perceptual coders, and on improving spatial fidelity. The latter problem is the subject of this document.

Efforts to improve spatial fidelity have proceeded along two routes: trying to convey the perceptual cues of the full sound field, and trying to convey an approximation of the actual original sound field. Examples of systems using the former approach include binaural recording and two-speaker virtual surround systems. Such systems have a number of unfortunate shortcomings, particularly in reliably localizing sounds in some directions and in requiring either headphones or a single fixed listening position.

Whether in a living room or in a commercial venue such as a cinema, the only feasible way to present spatial sound to multiple listeners is to try to approximate the actual original sound field. Given the discrete-channel nature of sound recording, it is not surprising that most efforts to date have involved what might be called conservative increases in the number of presentation channels. Representative systems include the panned-mono three-speaker film soundtracks of the early 1950s, conventional stereo sound, the quadraphonic systems of the 1960s, five-channel discrete magnetic soundtracks on 70 mm film, matrix-based Dolby Surround in the 1970s, AC-3 5.1-channel surround sound in the 1990s, and more recently Surround EX 6.1-channel surround sound. "Dolby", "Pro Logic" and "Surround EX" are trademarks of Dolby Laboratories Licensing Corporation. To varying degrees, these systems provide improved spatial reproduction compared with monophonic presentation. However, mixing a larger number of channels imposes greater time and cost burdens on content producers, and the resulting perception is typically one of a few scattered, discrete channels rather than a continuous sound field. Dolby Pro Logic decoding is described in U.S. Patent 4,799,260, which is incorporated herein by reference in its entirety.
Details of AC-3 are set forth in document A/52, "Digital Audio Compression Standard (AC-3)", published by the Advanced Television Systems Committee (ATSC) on December 20, 1995 (available on the World Wide Web at www.atsc.org/Standards/A52/a-52.doc). See also the errata of July 22, 1999 (available on the World Wide Web at err.pdf).

Summary of the Invention

The basis for reconstructing an arbitrary distribution in a source-free wave medium is provided by a theorem of Gauss, which states that the wave field within a region is completely determined by the pressure distribution along the boundary of that region. This implies that the sound field of a concert hall could in principle be recreated within the confines of a living room as follows: place the living room, with soundproof walls, inside the concert hall, then make the walls acoustically transparent by covering their outer surface with an infinite number of infinitesimal microphones, each connected through suitable amplification to a corresponding loudspeaker just inside the wall. By interposing a suitable recording medium between the microphones and the loudspeakers, a complete, if impractical, system for accurate three-dimensional sound reproduction is realized. The remaining design task is to make this system practical.

A first step toward practicality can be taken by noting that the signals of interest are band-limited, with an upper limit of about 20 kHz, and by applying the spatial sampling theorem, a variant of the more familiar time-domain sampling theorem. The latter states that no information is lost if a continuous band-limited time-domain waveform is discretely sampled at a rate of at least twice the highest frequency in the source. By the same reasoning, the spatial sampling theorem states that the spatial sampling density must be at least twice the density of the shortest wavelength in order to avoid loss of information. Because the wavelength of 20 kHz in air is about 3/8 inch, an accurate three-dimensional sound system could be implemented with an array of microphones and loudspeakers spaced no more than 3/16 inch apart. Extended over all the surfaces of a typical 9 ft × 12 ft room, this comes to about 2.5 million channels, a marked improvement over an infinite number but still impractical at present. It does, however, establish the basic approach of using an array of discrete channels as spatial samples, from which the sound field can be recreated by applying suitable interpolation.

Once the sound field has been characterized, it is possible in principle for a decoder to derive the optimal signal feed for any output loudspeaker. The channels feeding such a decoder are referred to variously in this document as "base", "transmitted", and "input" channels, and any output channel whose location does not correspond to the location of one of the base channels is referred to as an "intermediate" channel. An output channel may also have a location coincident with that of a base input channel.

It is therefore desirable to reduce the number of discrete spatial-sample channels, or base channels. One basis for doing so is the fact that above about 1500 Hz the ear no longer follows individual cycles, but only the critical-band envelope. This allows a channel spacing corresponding to 1500 Hz, or about 3 inches. That would reduce the total number of channels for the 9 ft × 12 ft room to about 6000, a saving of about 2.49 million channels compared with the previous arrangement.

In any case, the number of spatial-sampling channels can in theory be reduced further by appealing to the psychoacoustic limits of localization. For centered sounds, the horizontal resolution limit is about 1 degree of arc, and the corresponding vertical resolution limit is about 5 degrees. If this density is suitably extended over a sphere, the result is still hundreds to thousands of channels.


In accordance with the invention, a process translates M input channels representing a sound field into N output channels representing the same sound field, wherein each channel is a single audio stream representing sound arriving from one direction, M and N are positive integers, and M is at least 2. One or more sets of output channels are generated, each set having one or more output channels. Each set is associated with two or more spatially adjacent input channels, and each output channel in a set is generated by a process that includes determining a measure of the correlation of the two or more input channels and the level interrelationships of the two or more input channels.

In one aspect of the invention, multiple sets of output channels are associated with more than two input channels, and the process determines the correlation of the input channels associated with each set of output channels according to a hierarchical order, such that each set or sets is ranked according to the number of input channels with which its output channel or channels are associated. The greatest number of input channels corresponds to the highest rank, and the processing handles the sets in order of their hierarchical rank. Further in accordance with an aspect of the invention, the processing takes into account the results of processing higher-ranked sets.

The playback or decoding aspects of the invention assume that each of the M input channels representing sound arriving from a direction was generated by a passive-matrix, nearest-neighbor, amplitude-panned encoding of each source direction (that is, a source direction is assumed to map primarily to the nearest base channel or channels), without requiring additional side-chain information (the use of side-chain or auxiliary information is optional), so that the invention is compatible with existing mixing techniques, consoles, and formats. Although such source signals may be generated by explicitly employing a passive encoding matrix, most conventional recording techniques inherently generate them (thus constituting an "effective encoding matrix"). The playback or decoding aspects of the invention are also largely compatible with natural-recording source signals, such as might be made with five real directional microphones, since, allowing for some possible time delay, sounds arriving from intermediate directions tend to map principally to the nearest microphones (in a horizontal array, specifically to the nearest pair of microphones).
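
The nearest-neighbor amplitude-panned encoding assumed above can be illustrated with a minimal sketch. It assumes five horizontal base channels at hypothetical azimuths and a constant-power sine/cosine pan law; neither the angles nor the pan law is specified in the text, and the name `pan_to_nearest_pair` is illustrative only.

```python
import math

# Hypothetical azimuths (degrees) of five horizontal base channels.
# These values are an assumption; the text does not specify them.
BASE_AZIMUTHS = [0.0, 30.0, 110.0, 250.0, 330.0]


def pan_to_nearest_pair(source_az, azimuths=BASE_AZIMUTHS):
    """Return one gain per base channel for a source at source_az degrees.

    Only the two base channels that bracket the source direction receive
    signal (nearest-neighbor mapping). A constant-power sine/cosine law is
    assumed, so the squared gains sum to 1.
    """
    ring = sorted(a % 360.0 for a in azimuths)
    s = source_az % 360.0
    gains = {a: 0.0 for a in ring}
    for i, lo in enumerate(ring):
        hi = ring[(i + 1) % len(ring)]
        span = (hi - lo) % 360.0      # arc from lo to hi, wrapping past 360
        offset = (s - lo) % 360.0     # source position along that arc
        if offset <= span:
            frac = offset / span      # 0 at lo, 1 at hi
            gains[lo] = math.cos(frac * math.pi / 2)
            gains[hi] = math.sin(frac * math.pi / 2)
            break
    # Report gains in the caller's channel order.
    return [gains[a % 360.0] for a in azimuths]


if __name__ == "__main__":
    # A source halfway between the channels at 0 and 30 degrees.
    print(pan_to_nearest_pair(15.0))
```

A source that coincides with a base channel position ends up almost entirely in that channel, which is consistent with the statement that an output channel may share its position with a base channel.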

A decoder or decoding process according to the invention may be implemented as a lattice of coupled processing modules or modular functions (hereinafter "decoding modules"), each of which is used to generate one or more output channels (or, alternatively, control signals usable to generate one or more output channels) from the two or more spatially nearest base channels associated with that decoding module. The output channels represent the relative proportions of the audio signals in the spatially nearest base channels associated with the particular decoding module. As explained in more detail below, the decoding modules are loosely coupled to one another in the sense that modules share nodes and there is a hierarchy of decoding modules. Modules are ranked in the hierarchy according to the number of base channels with which they are associated (the module or modules with the greatest number of associated base channels rank highest). A supervisor function presides over the modules so that common node signals are shared equitably and higher-ranked decoding modules can affect the output of lower-ranked modules.

Each decoding module may effectively include a matrix, so that it generates output signals directly, or each decoding module may generate control signals that are used, together with the control signals generated by other decoding modules, to vary the coefficients of a variable matrix or the scale factors of the inputs to or outputs from a fixed matrix, in order to generate all of the output signals.

The decoding modules emulate the operation of the human ear in an attempt to provide perceptually transparent reproduction. Each decoding module may be implemented as either a wideband or a multiband structure or function, in the latter case with either a continuous filter bank or a block structure, for example a transform-based processor, applying essentially the same processing in each frequency band.

Although the basic invention relates generally to the spatial translation of M input channels to N output channels, where M and N are positive integers and M is at least 2, another aspect of the invention is that the number of loudspeakers receiving the N output channels can be reduced to a practical number by judicious reliance on virtual imaging, that is, the creation of perceived sonic images at positions in space where no loudspeaker is located. The most common use of virtual imaging is the stereo reproduction of an image part way between two speakers, by panning a monophonic signal between the channels. Virtual imaging is not considered a viable technique for group presentation with a small number of channels, because it requires the listener to be equidistant or nearly equidistant from the two speakers. In cinemas, for example, the left and right front speakers are too far apart to yield a useful phantom center image for most of the audience, so, given the importance of the center channel as the source of much of the dialog, a physical center speaker is used.

As the density of the speakers is increased, however, a point is reached at which virtual imaging between any pair of speakers is viable for most of the audience, at least to the extent that pans are smooth; with sufficient speakers, the gaps between the speakers are no longer perceived. Such an array has the potential to be nearly indistinguishable from the array of some two million channels derived earlier.

To test the invention, we developed a horizontal array with five speakers on each wall, 16 in total once the shared corner speakers are taken into account, plus a ring of six speakers above the listener at a vertical angle of about 45 degrees, plus a single speaker directly overhead, for a total of 23, plus a subwoofer (LFE channel), for a grand total of 24, all fed from a PC (personal computer) set up for 24-channel playback. Although in current parlance this might be called a 23.1-channel system, for simplicity it is referred to here as a 24-channel system.

Figure 1 is a top plan view showing schematically an idealized decoding arrangement in the manner of the test arrangement just described. Five wide-range horizontal base channels are shown as squares 1', 3', 5', 9' and 13' on the outer circle. A vertical channel, which may be derived from the five wide-range base channels by correlation or generated reverberation, or supplied separately, is shown as the dashed square 23' at the center. The 23 wide-range output channels are shown as solid circles labeled 1 through 23. The 16 output channels on the outer circle lie in a horizontal plane, and the six output channels on the inner circle lie 45 degrees above the horizontal plane. Output channel 23 is directly above one or more listeners. Five two-input decoding modules, shown as arrows 24 to 28 around the outer circle, are connected between each pair of horizontal base channels. Five additional two-input vertical decoding modules, shown as arrows 29 to 33, connect the vertical channel to each of the horizontal channels. The elevated center rear channel, output channel 21, is derived by a three-input decoding module, shown by the arrows between output channel 21 and base channels 9', 13' and 23'. Each module is thus associated with a respective pair or trio of spatially nearest base channels. Although the decoding modules shown in Figure 1 have three, four or five output channels, a decoding module may have any reasonable number of output channels.
An output channel may be located between base channels or at the same position as a base channel. Thus, in the example of Figure 1, there is also an output channel at each base channel position. Each input channel is shared by two or three decoding modules.
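
The module lattice of Figure 1 and its hierarchical processing order can be sketched as a small data structure. This is an illustration under stated assumptions, not the patent's implementation: the class and function names are hypothetical, the horizontal modules are assumed to join adjacent base channels in ring order, and the mapping of individual arrow numbers to particular channel pairs is a guess.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DecodingModule:
    name: str      # label used here for the module (arrow number or similar)
    inputs: tuple  # base channels feeding the module


# Horizontal base channels of Figure 1 in assumed ring order, plus the
# vertical base channel.
HORIZONTAL = ["1'", "3'", "5'", "9'", "13'"]
VERTICAL = "23'"


def build_lattice():
    """Build the eleven decoding modules described for Figure 1."""
    modules = []
    # Five two-input modules (arrows 24-28), one per adjacent horizontal
    # pair. Which arrow number belongs to which pair is an assumption.
    for k, left in enumerate(HORIZONTAL):
        right = HORIZONTAL[(k + 1) % len(HORIZONTAL)]
        modules.append(DecodingModule(str(24 + k), (left, right)))
    # Five two-input vertical modules (arrows 29-33).
    for k, channel in enumerate(HORIZONTAL):
        modules.append(DecodingModule(str(29 + k), (channel, VERTICAL)))
    # One three-input module that derives output channel 21.
    modules.append(DecodingModule("out21", ("9'", "13'", VERTICAL)))
    return modules


def processing_order(modules):
    """Rank modules by number of inputs, highest rank first.

    sorted() is stable, so modules of equal rank keep their original order.
    """
    return sorted(modules, key=lambda m: len(m.inputs), reverse=True)


if __name__ == "__main__":
    for module in processing_order(build_lattice()):
        print(module.name, module.inputs)
```

Processing the three-input module first lets its result be taken into account when the two-input modules that share its base channels are handled afterwards.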

As will be discussed, a design goal of the invention is that the playback processor should be able to work in principle with any number and arrangement of speakers. The 24-channel array is used as an illustrative example, but it is not the only example of the density and arrangement needed to achieve a convincing, continuously perceived sound field in accordance with the invention.

The desire to use a large, and possibly user-selectable, number of playback channels raises the question of the number of discrete channels and/or other information that must be conveyed to the playback processor for it to derive, at least as one option, the 24 channels described above. Obviously, one possible approach is simply to transmit 24 discrete channels. However, apart from the fact that it would likely be burdensome for content producers to mix that many separate channels, and for the transmission medium to convey them, it is preferable not to do so, because the 24-channel arrangement is only one of many possibilities, and it is desirable to be able to derive more or fewer playback channels from a common array of transmitted signals.

One way to regenerate the output channels is to apply formal spatial interpolation, producing for each output a fixed weighted sum of the transmitted channels, assuming that the density of those channels is great enough to permit this. However, this would require thousands to millions of transmitted channels, analogous to using an FIR filter with hundreds of taps to perform time-domain interpolation of a single signal. Reducing the transmitted channels to a practical number requires the application of psychoacoustic principles and more aggressive, dynamic interpolation from far fewer channels, but this still leaves unanswered the question of how many channels are needed to produce the perception of a complete sound field.
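
For contrast with the dynamic interpolation the invention pursues, the fixed weighted sum mentioned above can be sketched as follows. The function name `fixed_interpolation` and the weight values are hypothetical illustrations; the text gives no actual weights.

```python
def fixed_interpolation(weights, transmitted):
    """Form each output channel as a fixed weighted sum of transmitted channels.

    weights[n][m] is the constant gain from transmitted channel m to output
    channel n; transmitted[m] is a list of samples. All transmitted channels
    are assumed to have the same length.
    """
    num_samples = len(transmitted[0])
    outputs = []
    for row in weights:
        out = [0.0] * num_samples
        for gain, channel in zip(row, transmitted):
            for t, sample in enumerate(channel):
                out[t] += gain * sample
        outputs.append(out)
    return outputs


if __name__ == "__main__":
    # Two transmitted channels, three outputs; the middle output is an
    # equal mix of its two neighbors (illustrative weights only).
    weights = [[1.0, 0.0], [0.5, 0.5], [0.0, 1.0]]
    transmitted = [[1.0, 2.0], [3.0, 4.0]]
    print(fixed_interpolation(weights, transmitted))
```

Because the weights never change with the signal, a scheme like this can only approximate the sound field when the transmitted channels are very dense, which is the limitation the paragraph above points out.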

That question was addressed by an experiment performed by the present inventor some years ago and recently replicated by others. The basis, at least of the earlier experiment, was the observation that conventional two-channel binaural recordings can reproduce a realistic left/right image spread but suffer from unstable front/back localization, in part because of imperfections in the HRTFs employed and the absence of head-motion cues. To sidestep this shortcoming, a dual-binaural (four-channel) recording was made using two pairs of directional microphones, each pair spaced apart by about the width of a human head. One pair faced forward and the other rearward. The resulting recording was played over four speakers spaced close to the head, to mitigate acoustic cross-coupling effects. This arrangement gave realistic left/right timing and amplitude localization cues from each pair of speakers, while the corresponding discrete positions of the microphones and speakers gave unambiguous front/back information. The result was a very convincing surround-sound presentation, lacking only a proper rendition of height information. Recent experiments by others added a center front channel and two height channels, and gave the same realism, perhaps even improved by the added height information.

Thus, on the evidence of both psychoacoustic considerations and experiment, it appears that the relevant perceptual information can be conveyed in perhaps four to five "binaural-like" horizontal channels plus one or more vertical channels. However, the signal cross-feed characteristic of binaural channel pairs makes them unsuitable for direct playback over a group of loudspeakers, because there is very little separation at middle and low frequencies. Rather than introducing cross-feed at the encoder (as is done for a binaural pair) only to have to undo it at the decoder, it is simpler and more direct to keep the channels isolated from one another and to mix the output channel signals from the nearest transmitted channels. Doing so not only permits direct playback over the same number of speakers without a decoder, if desired, plus optional downmixing to fewer channels with a passive matrix decoder, but also corresponds essentially to the existing standard 5.1-channel arrangement, at least in the horizontal plane. It is also largely compatible with natural recordings, such as might be made with five real directional microphones, since, allowing for some possible time delay, sounds arriving from intermediate directions tend to map principally to the nearest microphones (in a horizontal array, specifically to the nearest pair of microphones).

From a perceptual point of view, therefore, it should be possible for a channel translation decoder to accept a standard 5.1-channel program and present it convincingly over any number of horizontally arrayed speakers, including the 16 horizontal speakers of the 24-channel array described above. With the addition of a vertical channel, as is sometimes proposed for digital cinema systems, the entire 24-channel array could be fed with individually derived, perceptually valid signals that together produce a continuous sound field perceived at most listening positions. Of course, if fine-grained source channels are available at the encoding site, additional information about them could be used to actively vary the encoding matrix scale factors to pre-compensate for limitations of the decoder, or could simply be included as additional side-chain (auxiliary) information, perhaps similar to the coupling coordinates used in AC-3 (Dolby Digital) multichannel coding. Perceptually, however, such additional information should not be necessary, and in practice a requirement to include it is undesirable. The intended operation of the channel translation decoder is not limited to 5.1-channel sources, and it may use fewer or more channels, but there is at least reason to believe that credible performance can be obtained from 5.1-channel sources.

This still leaves unanswered the question of how to extract the intermediate output channels from the sparse array of transmitted channels. The solution proposed by one aspect of the invention is to exploit the concept of virtual imaging once again, but in a slightly different manner. It was noted earlier that virtual imaging is unsuitable for group presentation over a sparse speaker array, because it requires the listener to be approximately equidistant from each speaker. Suitably adapted, however, it can give an irregularly seated listener the impression of intermediate phantom channels for signals that have been amplitude-panned between the nearest real output channels. It is therefore proposed in one aspect of the invention that the channel translation decoder consist of a series of modular interpolating signal processors, each in effect emulating an optimally seated listener and each operating in a manner modeled on the human auditory system, so as to extract from amplitude-panned signals the components that would form virtual images and feed them to real loudspeakers. The speakers are preferably arranged densely enough that natural virtual imaging can fill in the remaining gaps between them.

In general, each decoding module derives its inputs from the nearest transmitted basic channels; for a canopy (overhead) loudspeaker array, for example, these may be three or more basic channels. One way to generate output channels that relate to more than two basic channels would be to perform a series of pairwise operations, with the outputs of some pairwise decoding modules feeding the inputs of other modules. This has two drawbacks. The first is that cascading decoding modules introduces multiple cascaded time constants, so that some output channels react more quickly than others, causing audible position artifacts. The second is that pairwise correlation can only place intermediate or derived output channels along the straight line between a pair of channels; using three or more basic channels removes this restriction. An extension of ordinary pairwise correlation has therefore been developed to correlate three or more output signals. This technique is described below.

Horizontal localization in the human ear is based primarily on two localization cues: interaural amplitude differences and interaural time differences. The latter cue is valid only for signal pairs that are nearly aligned in time, within about ±600 microseconds. The net effect is that phantom intermediate images occur only at positions corresponding to a particular left/right amplitude difference, assuming that the signal content common to the two real channels is correlated or nearly so. (Note: two signals can have a cross-correlation value between +1 and -1. Fully correlated signals (correlation = 1) have the same waveform and are aligned in time, but may have different amplitudes, corresponding to off-center image positions.) As the correlation of a signal pair falls below 1, the perceived image broadens until, for two uncorrelated signals, there is no intermediate image, only separate and distinct left and right images. The ear generally treats negative correlation much like an uncorrelated signal pair, although the two images may spread over a wider range. The correlation is carried out on a critical-band basis, and above about 1500 Hz the critical-band signal envelopes are used in place of the signals themselves, to economize on the ear's computational requirements (MIPS).
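
The correlation values mentioned in the note can be illustrated with a short sketch. The function name `correlation` and the test tones are illustrative assumptions, not part of the patent; the sketch computes a plain zero-lag normalized cross-correlation over a whole signal, whereas the text describes a per-critical-band measure.

```python
import math

def correlation(a, b):
    """Zero-lag normalized cross-correlation of two equal-length sequences."""
    num = sum(x * y for x, y in zip(a, b))
    den = math.sqrt(sum(x * x for x in a) * sum(y * y for y in b))
    return num / den if den else 0.0

n = 1000
left = [math.sin(2 * math.pi * 5 * i / n) for i in range(n)]
right_scaled = [0.3 * x for x in left]      # same waveform at a lower level (off-center image)
right_inverted = [-x for x in left]         # polarity-inverted copy (negative correlation)
right_other = [math.sin(2 * math.pi * 9 * i / n) for i in range(n)]  # unrelated tone

for name, right in [("scaled", right_scaled),
                    ("inverted", right_inverted),
                    ("other", right_other)]:
    print(name, round(correlation(left, right), 6))
```

A level difference alone leaves the correlation at +1, which is why amplitude panning moves the image without broadening it.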

Vertical localization is somewhat more complicated, relying on HRTF pinna cues and on the dynamic modulation of horizontal cues by head motion, but the final effect is similar to horizontal localization with respect to panned amplitudes, cross-correlation, and the corresponding perceived image position and fusion. Vertical spatial resolution is, however, less precise than horizontal resolution, and adequate interpolation performance does not require as dense an array of basic channels.

An advantage of using directional processors that mimic the operation of the human ear is that any imperfections or limitations of the signal processing should be perceptually masked by similar imperfections and limitations of the ear itself, which allows for the possibility that the system is perceived as nearly indistinguishable from the original, fully continuous presentation.

Although the invention is designed to make effective use of however many or few output channels are available (including playback without decoding over as many loudspeakers as there are input channels, and passive mixdown to fewer channels, including mono, stereo and Lt/Rt-compatible surround), it is preferably intended to employ a large and somewhat arbitrary, but nonetheless practical, number of presentation channels/loudspeakers, and to use as source material a similar or smaller number of encoded channels, including existing 5.1-channel surround tracks and possible next-generation 11- or 12-channel digital cinema soundtracks.

Implementations of the invention should embody four principles: error containment, dominant containment, constant power and synchronized smoothing.

Error containment is the notion that, given the likelihood of decoding errors, the decoded position of each source should be, in some reasonable sense, close to its true intended direction. This mandates a certain degree of conservatism in the decoding strategy. Where more aggressive decoding would be accompanied by possibly greater spatial disparity in the event of an error, it is usually preferable to accept less precise decoding in exchange for assured spatial containment. Even where more precise decoding could be applied with confidence, it may be unwise to do so if dynamic signal conditions are likely to force the decoder to switch back and forth between aggressive and conservative modes, producing audible artifacts.

Dominant containment is a more constrained variant of error containment. It requires that a single, well-defined dominant signal be panned by the decoder only to the nearest-neighbor output channels. This condition is necessary to maintain image fusion for dominant signals, and it contributes to the perceived discreteness of a matrix decoder. While a signal is dominant, it is suppressed from the other output channels, either by subtracting it from the associated basic signals or by directly giving the other output channels matrix coefficients complementary to those used to derive the dominant signal ("antidominant coefficients/signals").

Constant-power decoding requires not only that the total decoded output power equal the input power, but also that input and output power be equal for each channel and each directional signal encoded in the transmitted basic array. This minimizes gain-pumping artifacts.

Synchronized smoothing applies to systems with signal-dependent smoothing time constants. It requires that if any smoothing network within a decoding module is switched to its fast time-constant mode, all other smoothing networks within that module be switched likewise. This prevents a newly dominant directional signal from appearing to fade or pan slowly away from the previous dominant direction.


FIG. 1 is a schematic top plan view of an idealized decoding arrangement.

Detailed Description

Decoding Modules

Because any encoded source direction is assumed to map primarily onto the nearest channels, channel translation decoding is based on a series of semi-autonomous decoding modules. In the general sense these recover output channels, particularly intermediate output channels, each typically from a subset of all the transmitted channels, in a manner analogous to the human ear.

In a manner analogous to the human ear, a decoding module operates on a combination of amplitude ratios and cross-correlation: the amplitude ratios determine the nominal ongoing primary direction, and the cross-correlation determines the relative width of the image.

Using control information derived from the amplitude ratios and the cross-correlation, the processor generates the output channel audio signals. Because this is preferably done on a linear basis to avoid generating distortion, the decoder forms weighted sums of the basic channels that contain the signal of interest. (As explained below, it may also be desirable to include non-neighboring basic channels in the weighted sums.) This limited but dynamic form of interpolation is more commonly called matrixing. If, in the source, the desired signal is mapped (amplitude-panned) onto the nearest M basic channels, the problem is one of M:N matrix decoding. In other words, the output channels represent relative proportions of the input channels.

In the case of a two-input decoding module in particular, the problem closely resembles that of an active 2:N matrix decoder, such as the newer Dolby Pro Logic matrix decoders, with the pair of decoding module inputs corresponding to the Lt/Rt encoded signals.

Note: the outputs of a 2:N matrix decoder are sometimes called basic channels. In this document, however, "basic" refers to the input channels of the channel translation decoder.

There is, however, at least one significant difference between the operation of prior-art active 2:N decoders and that of the decoding modules of the invention. The former use left/right amplitude to indicate left/right position, as the channel translation decoder also assumes, but they additionally use inter-channel phase to indicate front/back position, specifically on the basis of the sum/difference ratio of the Lt/Rt encoded channels.

Such a 2:N decoder arrangement has two problems. The first is that a fully correlated (front) but off-center signal, for example, yields a sum/difference ratio of less than infinity, incorrectly indicating a position that is not fully to the front (and similarly for a fully anti-correlated, off-center rear signal). The result is a somewhat warped decoding space. The second is that the position mapping is many-to-one, which introduces inherent decoding errors. In a 4:2:4 matrix system, for example, a pair of uncorrelated Left-in and Right-in signals with no Front-in or Back-in maps to the same net uncorrelated Lt/Rt pair as an uncorrelated Front-in/Back-in pair with no Left-in/Right-in, or as all four inputs carrying uncorrelated content. Faced with an uncorrelated Lt/Rt pair, the decoder has no choice but to "relax the matrix", that is, to use a passive matrix that distributes sound to all output channels. It cannot decode to a signal array consisting only of Left-out/Right-out or only of Front-out/Back-out.

The underlying problem is that using inter-channel phase to encode front/back position in N:2:N matrix systems is at odds with the operation of the human ear, which does not use phase to discriminate front/back position. The invention preferably operates with at least three basic channels that do not lie on a straight line, so that front/back position is indicated by the assumed directions of the basic channels, without different directions being assigned according to their relative phase or polarity. A pair of uncorrelated or anti-correlated channel-translation basic signals is thus decoded unambiguously to isolated basic output channel signals, with no intermediate signal and no "rear" direction indicated. (Incidentally, this avoids the unfortunate "center pile-up" effect of active 2:N decoders, in which uncorrelated Left-in and Right-in signals are presented with reduced separation because the decoder feeds the sum and difference of the two signals to the center and surround channels.) It is of course possible in principle to expand an Lt/Rt signal spatially by cascading a 2:N decoder (N = 4 or 5) with an N:M channel translation system, but in that case any limitations of the 2:N decoder, such as center pile-up, are carried over to the multiplied channel outputs. These functions can also be combined in a channel translation decoder designed to accept two-channel Lt/Rt signals, which in that case alters its behavior to interpret negatively correlated signals as having a rearward direction while leaving the rest of the processing unchanged. Even then, however, the decoding ambiguities caused by having only two transmitted channels remain.

Each decoding module, especially one with two input channels, is therefore similar to a prior-art active 2:N decoder with front/back detection disabled or altered and with an arbitrary number of output channels. It is of course mathematically impossible to use matrixing to recover a larger number of channels uniquely from a smaller number, since this amounts to solving N linear equations in M unknowns with M greater than N. It is therefore to be expected that the decoding modules may at times exhibit less than perfect channel recovery in the presence of multiple active source-direction signals. The human auditory system, limited to two ears, is subject to the same limitation, which allows the system to be perceived as discrete even with all channels operating. Isolated-channel quality with the other channels muted is still a consideration, to accommodate listeners seated close to one loudspeaker.

The operation of the ear is certainly frequency-dependent, but most sonic images are correlated at all frequencies, and given the empirical success of Pro Logic decoders, which are wideband systems, a wideband channel translation system can be expected to perform satisfactorily in some applications. A multiband channel translation decoder should also be possible, applying similar processing band by band and using the same encoded signal in each case, so that the number and bandwidth of the individual bands can be left as a free parameter for the decoder implementer. Although multiband processing is likely to require more MIPS than wideband processing, the computational demand need not be excessive if the input signals are divided into data blocks and the processing is carried out block by block.

Before the algorithms that may be used by the decoding modules of the invention are described, the issue of shared nodes is considered first.

Shared Nodes

If the groups of basic channels used by the decoding modules were all independent, the decoding modules themselves could be independent, autonomous entities. This is generally not the case. A given transmitted channel usually shares separate output signals with two or more neighboring basic channels. If independent decoding modules are used to decode the array, each is influenced by the output signals of the neighboring channels, possibly causing serious errors. In effect, the output signals of two neighboring decoding modules "pull" toward each other, because the common basic node contains both signals and its level is therefore raised. If, as is often the case, the signals are dynamic, so is the amount of interaction, resulting in signal-dependent dynamic positioning errors that can be highly objectionable. The problem does not arise in Pro Logic and other active 2:N decoders, because they have only a single, isolated channel pair as decoder input.

It is therefore necessary to compensate for the "shared node" effect. One possible approach is to subtract a recovered signal from the common node before attempting to recover the output signal of a neighboring decoding module that shares that node. This is often not possible, so the following approach is used instead: each decoding module estimates the common output signal energy present on its input channels, and a supervisor routine informs each module of its neighbors' output signal energy estimates.

Pairwise Calculation of Common Energy

Suppose, for example, that the basic channel pair A/B contains a common signal X together with individual, uncorrelated signals Y and Z:

$$A = 0.707X + Y$$

$$B = 0.707X + Z$$

where the scale factor $0.707 = \sqrt{0.5}$ provides a power-preserving mapping onto the nearest-neighbor basic channels.

The RMS energy of A is

$$\int A^2\,dt = \overline{A^2} = \overline{(0.707X + Y)^2} = \overline{0.5X^2 + 1.414XY + Y^2} = 0.5\,\overline{X^2} + 1.414\,\overline{XY} + \overline{Y^2}$$

Because X and Y are uncorrelated, $\overline{XY} = 0$, so

$$\overline{A^2} = 0.5\,\overline{X^2} + \overline{Y^2}$$

That is, because X and Y are uncorrelated, the total energy in basic channel A is the sum of the energies of its X and Y components. Similarly:

$$\overline{B^2} = 0.5\,\overline{X^2} + \overline{Z^2}$$

Because X, Y and Z are uncorrelated, the averaged cross-product of A and B is

$$\overline{AB} = 0.5\,\overline{X^2}$$

Thus, when an output signal is shared equally by two neighboring basic channels, which may also contain independent, uncorrelated signals, the averaged cross-product of the signals equals the energy of the common signal component in each channel. If the common signal is not shared equally, that is, if it is panned toward one of the basic channels, the averaged cross-product is the geometric mean of the energies of the common components in A and B, from which the common energy estimate for each individual channel can be derived by normalizing by the square root of the channel amplitude ratio. The running time averages are computed with a leaky integrator having a suitable decay time constant, so as to reflect ongoing activity. The time-constant smoothing can be refined with nonlinear attack and decay time options and, in a multiband system, can be scaled with frequency.
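
The pairwise result can be checked numerically. The sketch below is an illustration under stated assumptions: the helper names (`mean`, `tone`, `mix`, `leaky_mean`), the choice of sine tones as stand-ins for uncorrelated signals, and the smoothing constant `alpha` are not from the patent. Tones with different whole numbers of cycles are exactly uncorrelated over the block, so the averages match the algebra for A = 0.707X + Y and B = 0.707X + Z.

```python
import math

def mean(values):
    values = list(values)
    return sum(values) / len(values)

def tone(cycles, n=4096):
    """Unit-power sine with a whole number of cycles in n samples."""
    return [math.sqrt(2) * math.sin(2 * math.pi * cycles * i / n) for i in range(n)]

def mix(gain, common, own):
    """Basic channel = gain * common signal + its own uncorrelated signal."""
    return [gain * c + o for c, o in zip(common, own)]

def leaky_mean(values, alpha=0.01):
    """Running average from a one-pole leaky integrator (alpha is an assumed constant)."""
    state = 0.0
    for v in values:
        state += alpha * (v - state)
    return state

X, Y, Z = tone(3), tone(7), tone(11)      # mutually uncorrelated test signals
g = math.sqrt(0.5)
A, B = mix(g, X, Y), mix(g, X, Z)

energy_A = mean(a * a for a in A)               # 0.5*mean(X^2) + mean(Y^2)
cross_AB = mean(a * b for a, b in zip(A, B))    # 0.5*mean(X^2)

# Unequal sharing: the cross-product is the geometric mean of the two common energies.
A2, B2 = mix(0.8, X, Y), mix(0.6, X, Z)
cross_unequal = mean(a * b for a, b in zip(A2, B2))
geometric_mean = math.sqrt((0.8 ** 2) * (0.6 ** 2))
```

In a running decoder the block average `mean` would be replaced by something like `leaky_mean` applied to the sample-by-sample products, so that the estimate follows ongoing activity.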

Higher-Order Calculation of Common Energy

To derive the common energy of a decoding module with three or more inputs, the averaged cross-product of all the input signals must be formed. Simply processing the inputs in pairs cannot distinguish between separate output signals lying between each pair of inputs and a signal common to all the inputs.

Consider, for example, three basic channels A, B and C, made up of uncorrelated signals W, Y and Z and a common signal X:

$$A = X + W$$

$$B = X + Y$$

$$C = X + Z$$

If the averaged cross-product is calculated as in the second-order case, all terms involving combinations of W, Y and Z cancel, leaving the average of $X^3$:

$$\overline{ABC} = \overline{X^3}$$

Unfortunately, if X is a zero-mean time signal, the average of its cube is generally zero as well. Unlike the average of $X^2$, which is positive for any nonzero value of X, $X^3$ has the same sign as X, so the positive and negative contributions tend to cancel. The same clearly holds for any odd power of X, corresponding to an odd number of module inputs, but even exponents greater than two can also give erroneous results: four inputs with components (X, X, -X, -X), for example, have the same product and average as (X, X, X, X).

This problem can be solved with a modified averaged-product technique. Before averaging, the sign of each product is discarded by taking the absolute value of the product, and the signs of the individual terms of the product are examined. If they are all the same, the absolute value of the product is passed to the averager. If any sign differs from the others, the negative of the absolute value of the product is averaged instead. Because the number of possible same-sign combinations does not equal the number of possible different-sign combinations, a weighting factor is applied to the negated absolute-value products to compensate. This factor is the ratio of the number of same-sign combinations to the number of different-sign combinations. A three-input module, for example, has two same-sign cases out of eight possibilities, leaving six different-sign cases, so the scale factor is 2/6 = 1/3. With this compensation, the integrated or summed product grows in the positive direction if and only if there is a signal component common to all inputs of the decoding module.

For the averages of modules of different orders to be comparable, however, they must all have the same dimensions. A conventional second-order correlation involves the average of two-input multiplications and therefore has the dimensions of energy or power. The terms averaged in higher-order correlations must therefore also be modified to have the dimensions of power. For a k-th order correlation, the absolute value of each product must be raised to the power 2/k before averaging.
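
A minimal sketch of this sign-aware averaged product is given below. The function name `common_energy`, the block-average form, and the sine test signal are assumptions made for illustration; the weighting 2/(2^k - 2) is simply the ratio of same-sign to different-sign combinations for k inputs (1/3 for k = 3, and 1 for k = 2, where the measure reduces to the ordinary averaged cross-product).

```python
import math

def common_energy(inputs):
    """Sign-aware averaged cross-product of k >= 2 input signals, in power units."""
    k = len(inputs)
    # Of the 2**k possible sign patterns, 2 are all-same and the rest are mixed.
    mixed_weight = 2 / (2 ** k - 2)
    total = 0.0
    count = 0
    for samples in zip(*inputs):
        # |product| raised to 2/k so that every order has dimensions of power.
        magnitude = abs(math.prod(samples)) ** (2 / k)
        same_sign = all(s > 0 for s in samples) or all(s < 0 for s in samples)
        total += magnitude if same_sign else -mixed_weight * magnitude
        count += 1
    return total / count

n = 512
X = [math.sin(2 * math.pi * 3 * i / n) for i in range(n)]   # mean square = 0.5
neg_X = [-x for x in X]

all_common_3 = common_energy([X, X, X])              # common to all three inputs
one_inverted = common_energy([X, X, neg_X])          # one input in opposite polarity
all_common_4 = common_energy([X, X, X, X])
two_inverted = common_energy([X, X, neg_X, neg_X])   # plain product could not tell this apart
```

With a component common to all inputs the result equals the mean square of that component; with any input inverted the result goes negative, so the (X, X, -X, -X) case is no longer confused with (X, X, X, X).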

Regardless of order, of course, the energy at each input node of a module can, if needed, be calculated as the average of the square of the corresponding node signal; it need not first be raised to the k-th power and then reduced to a second-order quantity.

Shared Nodes: Neighbor Levels

Using the averaged squares and the modified cross-products of the basic channel signals, the amount of common output channel signal energy can be estimated. The example above involves a single interpolation processor, but if one or more of the A/B(/C) nodes were shared with another module having its own common signal component, uncorrelated with any other signal, the averaged cross-product calculated above would be unaffected, so the calculation is inherently free of any image-pulling effect. (Note: if the two output signals are not uncorrelated, they will tend to pull the decoders somewhat, but they should have a similar effect on the human ear, so system operation again remains faithful to human hearing.) Once each decoding module has calculated the estimated common output channel signal energy at each of its basic channels, the supervisor routine can inform neighboring modules of each other's common energy, at which point generation of the output channel signals proceeds as described below. The calculation of the common energy used by a module at a node must take into account the possibly overlapping hierarchy of modules of different orders: the common energy of a higher-order module is subtracted from the estimated common energy of any lower-order module sharing the same nodes.

Suppose, for example, that there are two adjacent basic channels A and B representing two horizontal directions and a basic channel C representing a vertical direction, and suppose further that there is an intermediate or derived output channel of signal energy $X^2$ representing an interior direction (that is, a direction within the bounds of A, B and C). The common energy of the three-input module with inputs (A, B, C) is $X^2$, but so is the common energy of each of the two-input modules (A, B), (B, C) and (A, C). If the common energies of the modules connected to A, namely (A, B, C), (A, B) and (A, C), were simply added, the result would be $3X^2$ rather than $X^2$. To calculate the common node energy correctly, the common energy of each higher-order module is first subtracted from the estimated common energy of each overlapping lower-order module. The common energy $X^2$ of the higher-order module (A, B, C) is thus subtracted from the common energy estimates of the two two-input modules, giving 0 in each case, and the net common energy estimate at node A is $X^2 + 0 + 0 = X^2$.
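
The bookkeeping in this example can be sketched as follows. The names `exclusive_energy` and `net_common_energy`, the use of `frozenset` keys to identify modules, and the recursive extension to more than two levels of hierarchy are assumptions for illustration; the text itself only describes subtracting a higher-order module's common energy from overlapping lower-order estimates.

```python
def exclusive_energy(module, raw):
    """Raw estimate of `module` minus energy already claimed by higher-order
    modules whose inputs include all of this module's inputs."""
    higher = sum(exclusive_energy(m, raw) for m in raw if module < m)
    return max(raw[module] - higher, 0.0)

def net_common_energy(node, raw):
    """Net common energy at `node`, summed over every module connected to it."""
    return sum(exclusive_energy(m, raw) for m in raw if node in m)

x2 = 1.0  # energy X^2 of the interior output signal
raw_estimates = {
    frozenset("ABC"): x2,   # three-input module
    frozenset("AB"): x2,    # two-input modules see the same common energy
    frozenset("BC"): x2,
    frozenset("AC"): x2,
}

naive_at_A = sum(e for m, e in raw_estimates.items() if "A" in m)   # simple addition
net_at_A = net_common_energy("A", raw_estimates)                    # with subtraction
```

Simple addition triples the energy at node A, while the corrected sum recovers the single contribution of the interior signal.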

Output Channel Signal Generation

As noted above, recovering the ensemble of output channels from the transmitted channels in a linear fashion is essentially a matter of matrixing, that is, of forming weighted sums of the basic channels to derive the output channel signals. The optimal choice of matrix scale factors is generally signal-dependent. Indeed, if the number of currently active output channels equals the number of transmitted channels (but represents different directions), so that the system is exactly determined, it is mathematically possible to compute the inverse of the effective encoding matrix and recover isolated versions of the source signals. Even if the number of active output channels exceeds the number of basic channels, it may still be possible to compute a pseudo-inverse matrix.

Unfortunately, this approach has problems. Computational demand is a significant factor, particularly for multiband processing and for high-precision floating-point implementations. Even if intermediate signals are assumed to lie between the nearest-neighbor basic channels, the mathematical inverse or pseudo-inverse of the effective encoding matrix generally contains contributions from all basic channels to each output channel, because of node-sharing effects. If there is any imperfection in the decoding, which in practice is unavoidable, a basic channel signal may be reproduced from an output channel that is spatially distant from it, which is highly undesirable. In addition, pseudo-inverse calculations tend to produce minimum-RMS-energy solutions, which spread the sound as widely as possible and give minimal separation. This is quite at odds with the intent of the invention.

Therefore, to implement a practical, fault-tolerant decoder in which spatial decoding errors are inherently contained, the same modular structure used for signal detection is also used for signal generation.

The generation of the output signals recovered by a decoding module is detailed below. Note that the effective position of each output channel connected to a module is assumed to be determined by the amplitude ratios required to pan a signal to its physical position, that is, by the ratios of the effective matrix encoding coefficients for that direction. To avoid division by zero, the ratio is typically calculated as the matrix coefficient of one channel divided by the RMS sum (usually 1) of the matrix coefficients of all the input channels. In a two-input module with inputs L and R, for example, the energy ratio used is the L energy divided by the sum of the L and R energies (the "L-ratio"), which ranges from 0 to 1. If the two-input decoding module has five output channels whose effective encoding matrix coefficient pairs are (1.0, 0), (0.89, 0.45), (0.71, 0.71), (0.45, 0.89) and (0, 1.0), the corresponding L-ratios are 1.0, 0.89, 0.71, 0.45 and 0, because each pair of scale factors has an RMS sum of 1.0.
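
The five L-ratios can be reproduced with a few lines. This is a sketch under assumptions: the function name `l_ratio` is illustrative, the ratio is computed as a coefficient divided by the RMS sum of the pair (the form that reproduces the listed values), and the final pair is taken as (0, 1.0) so that every pair has a unit RMS sum.

```python
import math

def l_ratio(left_coeff, right_coeff):
    """Left encoding coefficient divided by the RMS sum of the pair."""
    return left_coeff / math.hypot(left_coeff, right_coeff)

# Effective encoding coefficient pairs (L, R) of the five output channels.
pairs = [(1.0, 0.0), (0.89, 0.45), (0.71, 0.71), (0.45, 0.89), (0.0, 1.0)]

rms_sums = [round(math.hypot(l, r), 2) for l, r in pairs]
ratios = [round(l_ratio(l, r), 2) for l, r in pairs]
print(rms_sums)
print(ratios)
```

Dividing by the RMS sum rather than by the other coefficient keeps the ratio finite at both ends of the range, which is the divide-by-zero concern the paragraph mentions.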

Any node-shared signal energy taken by adjacent decoding modules is subtracted from the signal energy at each input node (cardinal channel) of the decoding module, giving the normalized input signal power levels that are used for the remainder of the calculation.

The dominant direction indication is computed as the vector sum of the cardinal directions, weighted by relative energy. For a two-input module, this simplifies to the L-ratio of the normalized input signal power levels.

The output channels that bracket the dominant direction are determined by comparing the dominant-direction L-ratio from the previous step with the L-ratios of the output channels. For example, if the input L-ratio of the five-output decoding module above is 0.75, the second and third output channels bracket the dominant direction, because 0.89 > 0.75 > 0.71.
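The bracketing step can be sketched as a simple search over the output channels' L-ratios. This is a minimal illustration, assuming the ratios are listed in descending order; the function name `bracketing_channels` is hypothetical.

```python
def bracketing_channels(dominant_ratio, channel_ratios):
    """Return the 0-based indices (i, i + 1) of the adjacent output channels
    whose L-ratios enclose dominant_ratio.

    channel_ratios must be sorted in descending order.
    """
    for i in range(len(channel_ratios) - 1):
        if channel_ratios[i] >= dominant_ratio >= channel_ratios[i + 1]:
            return i, i + 1
    raise ValueError("dominant ratio lies outside the range of the channels")

# Five-output example: an input L-ratio of 0.75 falls between 0.89 and 0.71.
print(bracketing_channels(0.75, [1.0, 0.89, 0.71, 0.45, 0.0]))  # (1, 2)
```

Indices 1 and 2 are the second and third output channels, matching the example in the text.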

The panning scale factors that map the dominant signal onto the nearest bracketing channels are computed from the ratio of the channels' anti-dominant signal levels. The anti-dominant signal associated with a particular output channel is the signal that results when the corresponding decoding module input signals are matrixed with that output channel's anti-dominant matrix scale factors. An output channel's anti-dominant matrix scale factors are those scale factors with an RMS sum of 1 that produce zero output when a single dominant signal is panned to that output channel. If an output channel's encoding matrix scale factors are (A, B), the channel's anti-dominant scale factors are (B, -A).

Proof: if a single dominant signal is panned to an output channel with encoding scale factors (A, B), the signal must have amplitudes (KA, KB), where K is the overall amplitude of the signal. The anti-dominant signal for that channel is then (KA*B - KB*A) = 0.

Thus, if a dominant signal consists of the two-input module input signals (x(t), y(t)) with input amplitudes (X, Y) normalized to RMS = 1, the resulting dominant signal is dom(t) = Xx(t) + Yy(t). If the position of this signal is bracketed by output channels with matrix scale factors (A, B) and (C, D) respectively, the dominant signal scale factor that scales dom(t) for the channel with matrix scale factors (A, B) is:

SF(A,B) = sqrt((DX-CY)/((DX-CY)+(BX-AY)))

and the corresponding dominant signal scale factor for the channel with matrix scale factors (C, D) is:

SF(C,D) = sqrt((BX-AY)/((DX-CY)+(BX-AY)))

The terms DX-CY and BX-AY are the anti-dominant signal amplitudes of the (C, D) and (A, B) channels for inputs (X, Y). As the dominant direction pans from one output channel to the other, these two scale factors move in opposite directions between 0 and 1, with a constant power sum.
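The two scale factors can be evaluated numerically as in the following sketch. One assumption is made here that the text does not spell out: the anti-dominant terms are taken as magnitudes (signal levels), since DX-CY and BX-AY have opposite signs when (X, Y) lies between the two channels. The function name is illustrative.

```python
import math

def dominant_scale_factors(x, y, ab, cd):
    """Return (SF(A,B), SF(C,D)) for a dominant signal with normalized
    amplitudes (x, y) bracketed by channels with scale factors ab and cd.

    Assumption: anti-dominant terms are used as magnitudes (levels).
    """
    a, b = ab
    c, d = cd
    anti_cd = abs(d * x - c * y)  # anti-dominant level of the (C, D) channel
    anti_ab = abs(b * x - a * y)  # anti-dominant level of the (A, B) channel
    total = anti_cd + anti_ab
    if total == 0.0:
        raise ValueError("the two channels coincide with the dominant direction")
    return math.sqrt(anti_cd / total), math.sqrt(anti_ab / total)

# Dominant direction with L-ratio 0.75, between the second and third channels.
x = 0.75
y = math.sqrt(1.0 - x * x)
sf_ab, sf_cd = dominant_scale_factors(x, y, (0.89, 0.45), (0.71, 0.71))
print(sf_ab, sf_cd, sf_ab ** 2 + sf_cd ** 2)
```

When (X, Y) equals (A, B), the (A, B) channel's anti-dominant level is zero, so SF(A,B) is 1 and SF(C,D) is 0; the squared factors always sum to 1, which is the constant power sum the text describes.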

The anti-dominant signal is computed and panned to all non-dominant channels with appropriately scaled gains. The anti-dominant signal is a matrixed signal devoid of any dominant signal. If the inputs to the decoding module are (x(t), y(t)) with normalized amplitudes (X, Y), the dominant signal is Xx(t) + Yy(t) and the anti-dominant signal is Yx(t) - Xy(t), regardless of the positions of the non-dominant output channels.
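A small sketch can illustrate why the anti-dominant signal carries none of the dominant signal. It pans a single source with amplitudes (X, Y) and then applies the two matrix operations from the text; the function name and sample values are illustrative only.

```python
def split_dominant(x_sig, y_sig, big_x, big_y):
    """Return (dominant, anti_dominant) sample lists for module inputs
    x_sig, y_sig with normalized amplitudes (big_x, big_y)."""
    dominant = [big_x * x + big_y * y for x, y in zip(x_sig, y_sig)]
    anti = [big_y * x - big_x * y for x, y in zip(x_sig, y_sig)]
    return dominant, anti

# A single source panned with amplitudes (0.6, 0.8), whose RMS sum is 1.
source = [1.0, -0.5, 0.25]
big_x, big_y = 0.6, 0.8
x_sig = [big_x * s for s in source]
y_sig = [big_y * s for s in source]

dominant, anti = split_dominant(x_sig, y_sig, big_x, big_y)
print(dominant)  # approximately the original source
print(anti)      # approximately zero
```

For a single panned source the dominant output recovers the source, because X*X + Y*Y = 1, while the anti-dominant output cancels, because Y*X - X*Y = 0.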

In addition to the dominant/anti-dominant signal distribution, a second signal distribution is computed using a "passive" matrix, based on the output channel matrix scale factors already discussed, scaled to preserve power.

The cross-correlation of the decoding module input signals is computed as the averaged cross product of the input signals divided by the square root of the product of the normalized input levels.
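The correlation measure can be sketched over a block of samples as follows. This sketch assumes that the "normalized input levels" are mean power levels and that the average is taken over one block; a real decoder would more likely use running, smoothed averages. The function name is hypothetical.

```python
import math

def cross_correlation(x, y):
    """Averaged cross product of x and y divided by the square root of the
    product of their mean power levels. Returns 0.0 if either input is silent."""
    n = len(x)
    cross = sum(a * b for a, b in zip(x, y)) / n
    power_x = sum(a * a for a in x) / n
    power_y = sum(b * b for b in y) / n
    if power_x == 0.0 or power_y == 0.0:
        return 0.0
    return cross / math.sqrt(power_x * power_y)

x = [1.0, -2.0, 3.0]
print(cross_correlation(x, [0.5 * v for v in x]))   # same signal, scaled
print(cross_correlation(x, [-v for v in x]))        # inverted signal
print(cross_correlation([1.0, 0.0], [0.0, 1.0]))    # no overlap
```

A scaled copy gives a value near 1, an inverted copy a value near -1, and non-overlapping signals give 0, so the measure is independent of the input levels.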

Returning to the description of the generation process, the final output is computed as a weighted crossfade sum of the dominant and passive signal distributions, with the crossfade factor derived from the cross-correlation of the decoding module's input signals. For a correlation value of 1, only the dominant/anti-dominant distribution is used. As the correlation decreases, the output signal array is broadened by crossfading toward the passive distribution, which is reached at a low positive correlation value, typically 0.2 to 0.4 depending on the number of output channels connected to the decoding module. As the correlation decreases further toward zero, the passive amplitude output distribution is progressively bowed outward, reducing the output signal levels, to emulate the response of the human ear to such signals.
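The crossfade can be pictured with the sketch below. The text does not give the shape of the crossfade curve, so a linear ramp is assumed purely for illustration, with the passive distribution reached at a hypothetical `passive_point` of 0.3 (inside the 0.2 to 0.4 range mentioned). The outward bowing of the passive distribution below that point is not modeled.

```python
def crossfade_weight(correlation, passive_point=0.3):
    """Weight of the dominant distribution: 1 at correlation 1, falling
    linearly (an assumption) to 0 at passive_point and below."""
    w = (correlation - passive_point) / (1.0 - passive_point)
    return min(1.0, max(0.0, w))

def blend(dominant_gains, passive_gains, correlation, passive_point=0.3):
    """Crossfade per-channel gains between the two distributions."""
    w = crossfade_weight(correlation, passive_point)
    return [w * d + (1.0 - w) * p for d, p in zip(dominant_gains, passive_gains)]

dominant_gains = [0.0, 0.45, 0.89, 0.0, 0.0]   # illustrative values only
passive_gains = [0.45, 0.45, 0.45, 0.45, 0.45]  # illustrative values only
print(blend(dominant_gains, passive_gains, 1.0))
print(blend(dominant_gains, passive_gains, 0.65))
print(blend(dominant_gains, passive_gains, 0.2))
```

At a correlation of 1 the output equals the dominant distribution, and at or below the passive point it equals the passive distribution, which matches the two end conditions stated in the text.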

Vertical Processing

Most of the processing described so far for generating output channel signals from adjacent cardinal channels is independent of the directions of the output and cardinal channels. However, because of the horizontal orientation of the ears, human auditory localization tends to be less sensitive to inter-channel correlation in the vertical direction than in the horizontal direction. To remain faithful to how the ear works, it may be desirable to relax the correlation constraint in interpolation processors fed by vertically oriented input channels, for example by processing the correlation signal with a warping function before using it. However, it may turn out that using the same processing as for the horizontal channels causes no audible degradation, which would simplify the overall decoder structure.

Strictly speaking, vertical information includes sounds from both above and below, and the decoder structure described should work equally well for either. In practice, however, there is usually no natural sound from below, so the corresponding channels and their processing can be omitted without impairing the perceived spatial fidelity of the system.

This concept may have practical significance when channel translation is applied to existing 5.1-channel surround material, which of course has no vertical channel. Such material may nonetheless contain vertical information, for example sounds passing overhead, recorded across several or all of the horizontal channels. It should therefore be possible to extract a virtual vertical channel from such source material by considering the correlations between non-adjacent channels or groups of channels. Where such correlations exist, they will usually indicate vertical information from above the listener rather than from below. In some cases, virtual vertical information might also be derived from a reverberation generator, possibly keyed to a model of the listening environment. Once a virtual vertical channel has been extracted or derived from the 5.1-channel source, expansion to a larger number of channels, such as the 24-channel arrangement described earlier, can proceed as though a real vertical channel had been supplied.

Directional Memory

With regard to the control-generation operation of the decoding module, which, as described above, resembles the operation of a 2:N active decoder such as a Pro Logic decoder, one aspect of the present invention is that the only "memory" in the process is in the smoothing networks that produce the basic control signals. At any instant there is only one dominant direction and one input correlation value, and signal generation proceeds directly from these signals.

However, particularly in complex acoustic environments (the archetypal cocktail party, for example), the human ear exhibits a degree of positional memory, or inertia: a brief, clearly localized dominant sound from a given direction will cause other, less clearly localized sounds from nonspecific directions to be perceived as coming from the same source.

This effect can be emulated in decoding modules (and indeed in Pro Logic decoding as well) by adding an explicit mechanism that keeps track of the most recent dominant direction and, during directionally ambiguous signal conditions, weights the output signal distribution toward that most recent dominant direction. This can improve the perceived discreteness and stability of the reproduction of complex signal arrays.

Modified Correlation and Selective Channel Mixing

As described above, the output distribution of each decoding module is determined from the simultaneous cross-correlation of its input signals, which may under some conditions underestimate the amount of output signal content. This occurs, for example, with a naturally recorded signal in which non-centered directions have slightly different arrival times as well as unequal amplitudes, resulting in a reduced correlation value. The effect can be more severe if widely spaced microphones are used, with correspondingly larger inter-channel delays. To compensate, the correlation calculation can be extended to cover a range of inter-channel delays, at the cost of slightly higher processing MIPS requirements. Because auditory nerve cells have an effective time constant of about 1 millisecond, more realistic correlation values can be obtained by first smoothing the detected sound with a smoother that has a 1 millisecond time constant.
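Extending the correlation over a range of inter-channel delays can be sketched as a search for the largest normalized cross-correlation across a set of lags. The lag range in samples, the block-based energy normalization, and the function name are assumptions for illustration; the 1 millisecond smoother mentioned in the text is not included.

```python
import math

def lagged_correlation(x, y, max_lag):
    """Largest normalized cross-correlation of x and y over integer lags
    from -max_lag to +max_lag samples. Returns 0.0 for silent input."""
    energy = math.sqrt(sum(a * a for a in x) * sum(b * b for b in y))
    if energy == 0.0:
        return 0.0
    best = -1.0
    for lag in range(-max_lag, max_lag + 1):
        cross = sum(
            x[i] * y[i + lag] for i in range(len(x)) if 0 <= i + lag < len(y)
        )
        best = max(best, cross / energy)
    return best

# The second channel is the first delayed by two samples.
x = [0.0, 1.0, 0.0, -1.0, 0.0, 0.0, 0.0, 0.0]
y = [0.0, 0.0, 0.0, 1.0, 0.0, -1.0, 0.0, 0.0]
print(lagged_correlation(x, y, 0))  # zero lag only
print(lagged_correlation(x, y, 3))  # lags up to three samples
```

With zero lag only, the delayed copy looks poorly (here negatively) correlated; once the search covers the two-sample delay, the correlation rises to 1. The cost is one cross product per lag, which reflects the extra processing the text mentions.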

In addition, if a content producer has an existing 5.1-channel program with strongly uncorrelated channels, the uniformity of the distribution when it is processed by a channel translation decoder can be improved by slightly mixing adjacent channels, thereby increasing their correlation. This causes the channel translation decoding modules to provide a more even distribution among their intermediate output channels. Such mixing can also be made selective, for example by leaving the center front channel signal unmixed in order to keep the dialog track compact.

Volume Compression/Expansion

When the encoding process involves mixing a larger number of channels down to a smaller number, the encoded signal may clip unless some form of gain compensation is provided. The same problem exists for conventional matrix encoding, but it is more likely to occur with channel translation because more channels are mixed into a given output channel. To avoid clipping in such cases, the encoder produces an overall gain scale factor and conveys it to the decoder in the encoded bitstream. Normally this value is 0 dB, but the encoder can set it to a nonzero attenuation value to avoid clipping, and the decoder provides an equivalent amount of compensating gain.
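The gain scale factor round trip can be sketched as below. How the encoder chooses the attenuation is not specified in the text; this sketch assumes a simple peak-based rule against a full-scale value of 1.0, and all names are hypothetical.

```python
import math

def encoder_gain_db(peak, full_scale=1.0):
    """Attenuation in dB (0 or negative) that keeps peak within full scale."""
    if peak <= full_scale:
        return 0.0
    return 20.0 * math.log10(full_scale / peak)

def db_to_linear(gain_db):
    """Convert a gain in dB to a linear amplitude factor."""
    return 10.0 ** (gain_db / 20.0)

# A downmix that would clip at full scale 1.0.
mixed = [0.4, 1.6, -2.0]
gain_db = encoder_gain_db(max(abs(s) for s in mixed))

# Encoder attenuates and sends gain_db; decoder applies the inverse gain.
encoded = [s * db_to_linear(gain_db) for s in mixed]
decoded = [s * db_to_linear(-gain_db) for s in encoded]
print(gain_db, encoded, decoded)
```

The encoded samples stay within full scale, and applying the opposite gain at the decoder restores the original levels, which is the compensation behavior described in the text.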

If the decoder is used to process an existing multichannel program that has no such scale factor (for example, an existing 5.1-channel soundtrack), it can use a fixed scale factor with an assumed value (about 0 dB), apply an expansion function based on signal level and/or dynamic range, or use whatever metadata may be available, such as a dialog normalization value, to adjust the decoder gain.

The present invention and its various aspects may be implemented in analog circuitry or, more probably, as software functions performed in digital signal processors, programmed general-purpose digital computers, and/or special-purpose digital computers. Interfaces between analog and digital signal streams may be implemented in appropriate hardware and/or as functions in software and/or firmware.

Claims (8)

1. A method for translating M input channels representing a sound field into N output channels representing the same sound field, wherein each channel is a single audio stream representing sound arriving from one direction, M and N are positive integers, and M is at least 2, the method comprising: generating one or more sets of output channels, each set having one or more output channels, wherein each set is associated with two or more spatially adjacent input channels, and each output channel in a set is generated by a process that includes determining a measure of the correlation of the two or more input channels and the level interrelationship of the two or more input channels.
2. The method of claim 1, wherein one set of output channels is associated with two input channels.
3. The method of claim 1, wherein one or more of said sets of output channels is associated with more than two input channels.
4. The method of claim 1, wherein one or more sets of output channels are associated with more input channels than one or more other sets of output channels, and said process determines the correlation of the input channels associated with each set of output channels in a hierarchical order, such that each set or sets is ranked according to the number of input channels with which its output channels are associated, the greatest number of input channels having the highest rank, and said process processes the sets in order according to their hierarchical order.
5. The method of claim 4, wherein said process takes into account the results of processing higher-ranked sets.
6. The method of claim 1, wherein said determining of a measure of the correlation of the two or more input channels and the level interrelationship of the two or more input channels is performed in the frequency domain.
7. The method of claim 1, wherein said process employs nonlinear time constants.
8. The method of any one of claims 1 or 3 to 8, wherein three or more input channels represent directions that are not in a straight line.
CN 02804662 2001-02-07 2002-02-07 Audio channel translation CN1275498C (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US26728401P true 2001-02-07 2001-02-07

Publications (2)

Publication Number Publication Date
CN1524399A true CN1524399A (en) 2004-08-25
CN1275498C CN1275498C (en) 2006-09-13



Family Applications (1)

Application Number Title Priority Date Filing Date
CN 02804662 CN1275498C (en) 2001-02-07 2002-02-07 Audio channel translation

Country Status (11)

Country Link
EP (1) EP1410686B1 (en)
JP (1) JP2004526355A (en)
KR (1) KR100904985B1 (en)
CN (1) CN1275498C (en)
AT (1) AT390823T (en)
AU (1) AU2002251896B2 (en)
CA (1) CA2437764C (en)
DE (1) DE60225806T2 (en)
HK (1) HK1066966A1 (en)
MX (1) MXPA03007064A (en)
WO (1) WO2002063925A2 (en)

Families Citing this family (31)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7660424B2 (en) 2001-02-07 2010-02-09 Dolby Laboratories Licensing Corporation Audio channel spatial translation
US7551745B2 (en) * 2003-04-24 2009-06-23 Dolby Laboratories Licensing Corporation Volume and compression control in movie theaters
US9977561B2 (en) 2004-04-01 2018-05-22 Sonos, Inc. Systems, methods, apparatus, and articles of manufacture to provide guest access
US8234395B2 (en) 2003-07-28 2012-07-31 Sonos, Inc. System and method for synchronizing operations among a plurality of independently clocked digital data processing devices
ITRM20030559A1 (en) * 2003-12-03 2005-06-04 Fond Scuola Di San Giorgio Apparatus for the acquisition and measurement data and
US7394903B2 (en) * 2004-01-20 2008-07-01 Fraunhofer-Gesellschaft Zur Forderung Der Angewandten Forschung E.V. Apparatus and method for constructing a multi-channel output signal or for generating a downmix signal
US8983834B2 (en) 2004-03-01 2015-03-17 Dolby Laboratories Licensing Corporation Multichannel audio coding
US8290603B1 (en) 2004-06-05 2012-10-16 Sonos, Inc. User interfaces for controlling and manipulating groupings in a multi-zone media system
US8024055B1 (en) 2004-05-15 2011-09-20 Sonos, Inc. Method and system for controlling amplifiers
US8868698B2 (en) 2004-06-05 2014-10-21 Sonos, Inc. Establishing a secure wireless network with minimum human intervention
WO2006011367A1 (en) * 2004-07-30 2006-02-02 Matsushita Electric Industrial Co., Ltd. Audio signal encoder and decoder
US7508947B2 (en) * 2004-08-03 2009-03-24 Dolby Laboratories Licensing Corporation Method for combining audio signals using auditory scene analysis
US7283634B2 (en) * 2004-08-31 2007-10-16 Dts, Inc. Method of mixing audio channels using correlated outputs
JP4997781B2 (en) * 2006-02-14 2012-08-08 沖電気工業株式会社 Mixdown method and mixdown apparatus
KR100763919B1 (en) 2006-08-03 2007-10-05 삼성전자주식회사 Method and apparatus for decoding input signal which encoding multi-channel to mono or stereo signal to 2 channel binaural signal
US8788080B1 (en) 2006-09-12 2014-07-22 Sonos, Inc. Multi-channel pairing in a media system
US8483853B1 (en) 2006-09-12 2013-07-09 Sonos, Inc. Controlling and manipulating groupings in a multi-zone media system
US9202509B2 (en) 2006-09-12 2015-12-01 Sonos, Inc. Controlling and grouping in a multi-zone media system
US8086752B2 (en) 2006-11-22 2011-12-27 Sonos, Inc. Systems and methods for synchronizing operations among a plurality of independently clocked digital data processing devices that independently source digital data
US8290782B2 (en) * 2008-07-24 2012-10-16 Dts, Inc. Compression of audio scale-factors by two-dimensional transformation
US8023660B2 (en) 2008-09-11 2011-09-20 Fraunhofer-Gesellschaft Zur Foerderung Der Angewandten Forschung E.V. Apparatus, method and computer program for providing a set of spatial cues on the basis of a microphone signal and apparatus for providing a two-channel audio signal and a set of spatial cues
CN102209988B (en) * 2008-09-11 2014-01-08 弗劳恩霍夫应用研究促进协会 Apparatus, method and computer program for providing a set of spatial cues on the basis of a microphone signal and apparatus for providing a two-channel audio signal and a set of spatial cues
WO2010113434A1 (en) 2009-03-31 2010-10-07 パナソニック株式会社 Sound reproduction system and method
WO2012042905A1 (en) 2010-09-30 2012-04-05 パナソニック株式会社 Sound reproduction device and sound reproduction method
US8938312B2 (en) 2011-04-18 2015-01-20 Sonos, Inc. Smart line-in processing
US9042556B2 (en) 2011-07-19 2015-05-26 Sonos, Inc Shaping sound responsive to speaker orientation
US9729115B2 (en) 2012-04-27 2017-08-08 Sonos, Inc. Intelligently increasing the sound level of player
US9008330B2 (en) 2012-09-28 2015-04-14 Sonos, Inc. Crossover frequency adjustments for audio speakers
US9244516B2 (en) 2013-09-30 2016-01-26 Sonos, Inc. Media playback system using standby mode in a mesh network
US9226087B2 (en) 2014-02-06 2015-12-29 Sonos, Inc. Audio output balancing during synchronized playback
US9226073B2 (en) 2014-02-06 2015-12-29 Sonos, Inc. Audio output balancing during synchronized playback

Family Cites Families (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP0054575B1 (en) * 1980-12-18 1985-05-22 Kroy Inc. Printing apparatus and tape-ribbon cartridge therefor
US6198827B1 (en) * 1995-12-26 2001-03-06 Rocktron Corporation 5-2-5 Matrix system
JPH10174199A (en) 1996-12-11 1998-06-26 Fujitsu Ltd Speaker sound image controller
US6009179A (en) 1997-01-24 1999-12-28 Sony Corporation Method and apparatus for electronically embedding directional cues in two channels of sound
EP1054575A3 (en) * 1999-05-17 2002-09-18 Bose Corporation Directional decoding
AUPP271598A0 (en) * 1998-03-31 1998-04-23 Lake Dsp Pty Limited Headtracked processing for headtracked playback of audio signals
US6757659B1 (en) * 1998-11-16 2004-06-29 Victor Company Of Japan, Ltd. Audio signal processing apparatus

Cited By (16)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101065797B (en) 2004-10-28 2011-07-27 Dts(英属维尔京群岛)有限公司 Dynamic down-mixer system
CN102833665A (en) * 2004-10-28 2012-12-19 Dts(英属维尔京群岛)有限公司 Audio spatial environment engine
CN102117617B (en) 2004-10-28 2013-01-30 Dts(英属维尔京群岛)有限公司 Audio spatial environment engine
CN102833665B (en) * 2004-10-28 2015-03-04 Dts(英属维尔京群岛)有限公司 Audio spatial environment engine
US10104488B2 (en) 2008-12-18 2018-10-16 Dolby Laboratories Licensing Corporation Audio channel spatial translation
US9628934B2 (en) 2008-12-18 2017-04-18 Dolby Laboratories Licensing Corporation Audio channel spatial translation
CN104837107B (en) * 2008-12-18 2017-05-10 杜比实验室特许公司 Audio channel spatial translation
US10469970B2 (en) 2008-12-18 2019-11-05 Dolby Laboratories Licensing Corporation Audio channel spatial translation
CN107690123A (en) * 2012-12-04 2018-02-13 三星电子株式会社 Audio provides method
US10341800B2 (en) 2012-12-04 2019-07-02 Samsung Electronics Co., Ltd. Audio providing apparatus and audio providing method
CN107623894B (en) * 2013-03-29 2019-10-15 三星电子株式会社 The method for rendering audio signal
US10405124B2 (en) 2013-03-29 2019-09-03 Samsung Electronics Co., Ltd. Audio apparatus and audio providing method thereof
CN107623894A (en) * 2013-03-29 2018-01-23 三星电子株式会社 The method of rendering audio signal
US10021500B2 (en) 2013-09-02 2018-07-10 Huawei Technologies Co., Ltd. Audio file playing method and apparatus
CN104424971A (en) * 2013-09-02 2015-03-18 华为技术有限公司 Audio file playing method and audio file playing device
CN104424971B (en) * 2013-09-02 2017-09-29 华为技术有限公司 A kind of audio file play method and device

Also Published As

Publication number Publication date
AT390823T (en) 2008-04-15
WO2002063925A8 (en) 2004-03-25
DE60225806D1 (en) 2008-05-08
AU2002251896A2 (en) 2002-08-19
MXPA03007064A (en) 2004-05-24
EP1410686B1 (en) 2008-03-26
WO2002063925A3 (en) 2004-02-19
DE60225806T2 (en) 2009-04-30
HK1066966A1 (en) 2007-04-13
CA2437764C (en) 2012-04-10
CN1275498C (en) 2006-09-13
KR100904985B1 (en) 2009-06-26
AU2002251896B2 (en) 2007-03-22
WO2002063925A2 (en) 2002-08-15
JP2004526355A (en) 2004-08-26
CA2437764A1 (en) 2002-08-15
KR20030079980A (en) 2003-10-10
EP1410686A2 (en) 2004-04-21

Similar Documents

Publication Publication Date Title
US7787631B2 (en) Parametric coding of spatial audio with cues based on transmitted channels
KR101202368B1 (en) Improved head related transfer functions for panned stereo audio content
KR101909573B1 (en) Method and apparatus for encoding and decoding successive frames of an ambisonics representation of a 2- or 3-dimensional sound field
CN101160618B (en) Compact side information for parametric coding of spatial audio
CA2707761C (en) Parametric joint-coding of audio sources
US8295493B2 (en) Method to generate multi-channel audio signal from stereo signals
US8019093B2 (en) Stream segregation for stereo signals
JP5698189B2 (en) Audio encoding
KR100644617B1 (en) Apparatus and method for reproducing 7.1 channel audio
TWI424754B (en) Channel reconfiguration with side information
TWI305639B (en) Apparatus and method for generating a multi-channel output signal
US20050157883A1 (en) Apparatus and method for constructing a multi-channel output signal or for generating a downmix signal
CN101884065B (en) Spatial audio analysis and synthesis for binaural reproduction and format conversion
JP4993227B2 (en) Method and apparatus for conversion between multi-channel audio formats
JP5285626B2 (en) Speech spatialization and environmental simulation
JP2005354695A (en) Audio signal processing
US20070223751A1 (en) Utilization of filtering effects in stereo headphone devices to enhance spatialization of source around a listener
US8885834B2 (en) Methods and devices for reproducing surround audio signals
US8064624B2 (en) Method and apparatus for generating a stereo signal with enhanced perceptual quality
US20040212320A1 (en) Systems and methods of generating control signals
KR20110002491A (en) Decoding of binaural audio signals
CN1860826B (en) Apparatus and method of reproducing wide stereo sound
US8712061B2 (en) Phase-amplitude 3-D stereo encoder and decoder
US8442237B2 (en) Apparatus and method of reproducing virtual sound of two channels
US9009057B2 (en) Audio encoding and decoding to generate binaural virtual spatial signals

Legal Events

Date Code Title Description
C06 Publication
C10 Entry into substantive examination
REG Reference to a national code

Ref country code: HK

Ref legal event code: DE

Ref document number: 1066966

Country of ref document: HK

C14 Grant of patent or utility model