US20040247134A1 - System and method for compatible 2D/3D (full sphere with height) surround sound reproduction - Google Patents

Info

Publication number
US20040247134A1
Authority
US
United States
Prior art keywords
channel
signal
speakers
itu
compatible
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
US10/802,924
Other versions
US7558393B2 (en
Inventor
Robert Miller
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Individual
Original Assignee
Individual
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Individual
Priority to US10/802,924 (patent US7558393B2)
Publication of US20040247134A1
Application granted
Publication of US7558393B2
Legal status: Expired - Fee Related
Adjusted expiration legal-status Critical

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04S STEREOPHONIC SYSTEMS
    • H04S3/00 Systems employing more than two channels, e.g. quadraphonic
    • H04S3/002 Non-adaptive circuits, e.g. manually adjustable or static, for enhancing the sound image or the spatial distribution
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04S STEREOPHONIC SYSTEMS
    • H04S2400/00 Details of stereophonic systems covered by H04S but not provided for in its groups
    • H04S2400/15 Aspects of sound capture and related signal processing for recording or reproduction
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04S STEREOPHONIC SYSTEMS
    • H04S2420/00 Techniques used in stereophonic systems covered by H04S but not provided for in its groups
    • H04S2420/11 Application of ambisonics in stereophonic audio systems

Landscapes

  • Physics & Mathematics (AREA)
  • Engineering & Computer Science (AREA)
  • Acoustics & Sound (AREA)
  • Signal Processing (AREA)
  • Stereophonic System (AREA)

Abstract

A system and method of producing an output sound field that is representative of an input sound field compatible with both existing prior art sound reproduction systems, for example ITU 5.1/6.1, and with a three-dimensional reproduction system unique to this disclosure. One embodiment of the disclosed system is comprised of a microphone array, an encoder, a decoder, and a plurality of speakers, some of which may not be located in the plane of the listener. A further embodiment discloses matrices to encode and decode the signals representative of the input and output sound fields respectively.

Description

    BACKGROUND OF THE INVENTION
  • This application claims the priority of provisional application 60/455,497, filed 18 Mar. 2003, which is hereby incorporated herein by reference. The inventor's paper entitled “Scalable Tri-play Recording for Stereo, ITU 5.1/6.1 2D, and Periphonic 3D (with Height) Compatible Surround Sound Reproduction” presented at the 115th convention of the Audio Engineering Society in October of 2003 is hereby incorporated herein by reference in its entirety. [0001]
  • Lifelike reproduction of sound has long been a subject of scientific exploration and experimentation. While we may not have completed this exploration, we now know enough to record and reproduce a very good approximation of the lifelike sounds of, for example, musical performance in an acoustic space, and other applications. We do know that it is essential to preserve true three-dimensionality of the arrivals at the ear of both direct and reflected sounds, or close approximations of their directions of arrival. We say “true three-dimensionality” (“3D”) because the term is much misused. For example, methods are often termed 3D where reproducers (e.g., loudspeakers) are arranged only in the horizontal plane. These methods can only reliably preserve horizontal angles of sound arrivals where the listener is at the center of a horizontal circle. However, in live listening in an acoustic space, reflections also arrive from above and below, at vertical angles of elevation, referred to as “height”, and resulting in truly natural “periphonic” hearing. [0002]
  • For lifelike reproduction, there are both (a) important reasons why the most reliable way to reproduce height is by locating loudspeakers above and below the listener, who is now at the center of a sphere, not just a circle, and (b) important reasons why height must also be preserved in the first place. [0003]
  • Regarding point (a) above, in the past, less reliable methods have attempted to generalize an important aspect of human Head-Related Transfer Functions (“HRTF”) using generalized filters or so-called “dummy-head” microphones, intended to deliver into the two ear canals of the listener what was recorded at the two ear canals of the dummy head. The problem is that the human mechanism for determining sound arrivals from above or below is the pinna, or outer ear. Folds of the pinna cause reflections of higher frequency sounds either partially to reinforce or partially to cancel, or attenuate, depending on both the frequency and the direction of the sound, both horizontal and vertical. But each human individual's pinnae are as unique as a fingerprint, so generalized filters or generalized “dummy pinnae” work more or less poorly for each listener. Miniature microphones placed within the ear canals of the recordist/listener result in more lifelike reproduction, but only with that one person doing the recording and/or listening. [0004]
  • For lifelike reproduction by a group of listeners (such as in listening to recorded music in a home theater, training in a simulator, virtual reality for computer multi-media, or riding an amusement ride), loudspeakers must be located above and below as well as around the listeners. Each listener's pinnae, in “agreement” with other aspects of their individual HRTF, will determine for them both the azimuth and elevation of each sound, just as they have learned these complex relationships for themselves since childhood. [0005]
  • Regarding point (b) above, why must true 3D (i.e., with height) be preserved in the first place? The reason is that humans learn sound directionality by relating seeing sources of sound with the hearing mechanisms described above. Through a complex ear-brain response the listener knows the direction of a sound—above or below as well as horizontally—even when facing another way or with eyes closed. In acoustic spaces, unseen reflections arrive at different times, building up to steady state, then collapse in the same order when the source of the sound stops. Each arrival and “departure” from each direction is tonally “colored” by the pinna. Musicians hear this same complex interplay and form each note, phrase, even pause, to be “musically correct”, playing the acoustic as an extension of their instrument. The “tonality” or timbre of their guitar, piano, or violin would sound very different in a different space. They will play differently in a different hall to be musically correct in that hall, such as playing faster or more legato in a small space and slower and more pizzicato in a large one. Listeners in the same space learn this “musical language” and appreciate the music more when they agree it is correct. But take away height reflections from the ceiling or acoustic clouds above the stage and the timbre changes dramatically. [0006]
  • So for lifelike reproduction of natural sounds such as music, spherically positioned reproducers of sound are a requirement. [0007]
  • Numerous approaches termed “three-dimensional” are in fact only two-dimensional since they use speakers only in the horizontal plane. If the listener perceives any height sounds, they can only be due to the acoustics of the listening environment, which are invalid in reproducing the space where the music was recorded. Other approaches attempt to simulate height auditory “cues”, or signals, to the ear-brain system; however, these cannot be generalized reliably to a lifelike degree for all listeners because their pinnae are as individual as their fingerprints, as described above. If the goal is to believably reproduce the recorded space, then the listener will believe he has been “transported” to that space and is no longer in the listening space. If the recorded space is an acoustic one with reflective ceiling and floor elements, lifelike believability requires vertically-arriving sounds to be preserved. Since we cannot successfully generalize pinna colorations (e.g., by using filters and/or dummy heads) that connote height, we can best reproduce height cues by using loudspeakers above and below the listeners. But an infinite number of loudspeakers and channels as in real life would be infinitely impractical. [0008]
  • Prior art systems, such as 1st Order Ambisonics, create a reasonable approximation of three-dimensionality using four channels and a minimum of eight loudspeakers. Ambisonics has not succeeded in the marketplace for a variety of reasons, not the least of which is the fact that Ambisonics does not produce a lifelike reproduction of sound in front of the listener, where the ear-brain “perceptualization” is most acute. [0009]
  • Another prior art system, called Ambiophonics, uses a two-channel binaural-based approach that precisely positions sounds across a 120 degree arc in front of a listener where such localization is most important for lifelike hearing. In order to localize frontal sounds widely yet accurately, Ambiophonics uses two closely-spaced speakers, called a “stereo dipole” or “Ambiopole”, and transaural crosstalk cancellation. However, Ambiophonics is inherently two-dimensional and incapable of producing three-dimensional sound with height. [0010]
  • Prior art monaural systems sounded correct tonally but had a “stage door” effect: the sound was localized at a point in 2D, as if coming through a narrow opening, say, in an orchestra shell wall. Prior art stereo systems, while providing spaciousness in sound in two dimensions, suffer from lack of localization as the speakers are typically placed at the front left and front right positions, thereby leaving a large gap between the speakers. Other prior art systems, such as ITU 5.1/6.1 and stereo, favor spaciousness and simulating tonality at the price of accurate localization, as though the two were mutually exclusive. ITU 5.1/6.1 systems extend the stereo concept to envelop listeners but only in two dimensions. A height component is lacking. [0011]
  • Another prior art system is WaveField Synthesis (“WFS”). The WFS system is limited to two dimensions and therefore lacks the directionality of height and the natural timbral quality achievable by systems and methods exercising the present invention. Furthermore, WFS requires upwards of 36 speakers and is impractical at present in needing as many channels for distribution and digital signal processing as for reproduction. [0012]
  • Yet other prior art systems, known collectively as Higher Order Ambisonics (“HOA”), likewise have deficiencies. Along with the deficiencies previously noted for Ambisonic systems, HOA systems require nine or more channels for Ambisonic components for a total of 11 or more distribution channels. Currently, distribution media such as DVD-A, SACD, and DTS-CD are limited to six full-range channels. [0013]
  • No prior art systems have yet been able to reproduce accurate 3D sound, with height and accurate spaciousness, tonality, and localization. The present invention produces life-like 3D sound with correct spatial impression, timbre (tonality), and localization. Furthermore, embodiments of the present invention play compatibly in stereo, ITU 5.1/6.1, full 3D using available 6-channel media, and full 3D using 10 or more speakers in a home theater or height-modified cinema. [0014]
  • It is an object of the present disclosure to provide a novel system and method for accurately reproducing a 3D sound field. [0015]
  • It is another object of the present disclosure to provide a novel system and method for combining accurate reproduction of “front stage sound” with accurate three-dimensional localization of sound to produce a sound field with height and accurate spaciousness, tonality, and localization. [0016]
  • It is yet another object of the present disclosure to provide a novel system and method for producing a signal which accurately reproduces a 3D sound field and which is also capable of playback on current 2D surround sound systems without the use of a decoder or the need to add additional speakers. [0017]
  • It is still another object of the present disclosure to provide a novel system and method for providing a transformation matrix for mapping a 3D sound field into a signal for providing a 2D sound field without the need for a decoder. [0018]
  • It is still yet another object of the present disclosure to provide a novel system and method for providing a reconstitution matrix for accurately reproducing a 3D sound field. [0019]
  • It is a further object of the present disclosure to provide a novel system and method for a microphone array capable of capturing a sound field in three dimensions. [0020]
  • BRIEF DESCRIPTION OF THE DRAWINGS
  • FIG. 1A is a high level block diagram illustrating the flow of information from a microphone array through an encoder, a decoder, to a set of 3D speakers according to embodiments of the present disclosure. [0021]
  • FIG. 1B is a high level block diagram illustrating the flow of information from a microphone array through an encoder to a set of 2D speakers according to embodiments of the present disclosure. [0022]
  • FIGS. 2A-2C are a depiction of the top, front, and side views of an embodiment of a hybrid microphone array according to an aspect of the present disclosure. [0023]
  • FIGS. 3A-3F each depict one of six transform modes according to aspects of the present disclosure. [0024]
  • FIGS. 4A-4F each depict one of the six 3D transform mode matrices of FIGS. 3A-3F, respectively. [0025]
  • FIGS. 5A-5F each depict the reconstitution matrix corresponding to one of the six transform mode matrices of FIGS. 4A-4F, respectively. [0026]
  • FIG. 6 is an illustration of a speaker layout for an embodiment of the present disclosure. [0027]
  • DETAILED DESCRIPTION OF PREFERRED EMBODIMENTS
  • An embodiment of the present disclosure may comprise (a) a microphone array capable of capturing sounds in three dimensions and using, perhaps, six recording channels; (b) an encoder for “transformation” of recordings from the microphone array so that the captured sounds may be encoded on standard media such as compact discs (“CDs”) or digital video discs (“DVDs”) such that playing the media requires no decoder for replay on, for example, ITU 5.1/6.1 systems; (c) a decoder for lossless “reconstituting” of 3D information of the captured sounds for use with a 3D speaker layout; and (d) a speaker layout for 3D reproduction of the captured sounds, or a standard ITU 5.1/6.1 speaker layout. It shall be understood by those of skill in the art that an ITU 5.1/6.1 system does not require a 3D speaker layout. The novel system and method are sometimes referred to herein as “PerAmbio 3D/2D” or simply “PerAmbio”. [0028]
  • For example, FIG. 1A is an overall, high-level block diagram of an embodiment of the present disclosure illustrating the flow of information from a microphone array 10 through an encoder 12, a decoder 14, to a 3D speaker arrangement 16. Sound field 2 impinges on the microphone array 10 which produces a microphone signal (“Pin”). The microphone signal may be a six channel signal. The encoder 12 converts Pin to an encoded signal (“Sout”). The encoded signal is sent to the decoder 14 which produces a decoded signal (“Pout”). Pout is applied to the 3D speaker arrangement 16 to produce a 3D sound field that is an accurate reproduction of the sound field 2. [0029]
  • FIG. 1B is an overall, high-level block diagram of an embodiment of the present disclosure illustrating the flow of information from a microphone array 10 through an encoder 12 to a 2D speaker arrangement 18. Sound field 2 impinges on the microphone array 10 which produces a microphone signal (“Pin”). The microphone signal may be a six channel signal. The encoder 12 converts Pin to an encoded signal (“Sout”). The encoded signal is applied to the 2D speaker arrangement 18 to produce a 2D sound field that is a representation of the sound field 2. [0030]
  • The details of the components of the systems in FIGS. 1A and 1B will be discussed below. [0031]
  • Microphone Array
  • Embodiments of the present invention may include a specialized microphone array for recording the necessary information of the sound field 2 so as to accurately reproduce the sound field with a speaker arrangement. [0032]
  • FIGS. 2A-2C depict a novel microphone array according to embodiments of the present disclosure. The microphone array, sometimes referred to as the “PerAmbio 3D/2D microphone array”, is a hybrid array comprising a “soundfield” array for four Ambisonic signals (W, X, Y, Z), also known as B-Format channels, and a baffled, substantially ellipsoidal array for Ambiophonic signals (FL, FR, BL, BR). [0033]
  • 1st order so-called “B-format” Ambisonic signals, called W, X, Y, and Z, represent the outputs of pressure (omni-directional), and forward-, leftward-, and upward-facing pressure-gradient (velocity) microphone elements, respectively, as is known in prior art. The B-format signals in combination can approximately represent the sound of plane waves arriving at a listener from any direction in 3-dimensions. They contribute the “ambience” component of PerAmbio 3D/2D. [0034]
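  • The relationship between an arrival direction and the four B-format signals can be illustrated with a minimal sketch. It assumes the traditional first-order convention (W weighted by 1/√2, azimuth positive to the left, elevation positive upward); the function name `encode_b_format` and that weighting are assumptions for illustration and are not taken from this disclosure.

```python
import math


def encode_b_format(sample, azimuth_deg, elevation_deg):
    """Encode one mono sample arriving as a plane wave into (W, X, Y, Z).

    Azimuth is measured from straight ahead, positive toward the left;
    elevation is measured upward from the horizontal plane. The 1/sqrt(2)
    weight on W is the traditional B-format convention (an assumption here).
    """
    az = math.radians(azimuth_deg)
    el = math.radians(elevation_deg)
    w = sample / math.sqrt(2.0)               # omni-directional pressure
    x = sample * math.cos(az) * math.cos(el)  # forward-facing gradient
    y = sample * math.sin(az) * math.cos(el)  # leftward-facing gradient
    z = sample * math.sin(el)                 # upward-facing gradient
    return w, x, y, z


front = encode_b_format(1.0, 0.0, 0.0)    # source straight ahead
left = encode_b_format(1.0, 90.0, 0.0)    # source directly to the left
above = encode_b_format(1.0, 0.0, 90.0)   # source directly overhead
print(front, left, above)
```

  • In this sketch a source straight ahead excites only W and X, while a source overhead excites only W and Z, which is how the Z channel carries the height information discussed above.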
  • An ellipsoid 20 is approximately head-shaped and contributes that portion of the human HRTF (head-related transfer function) that can be successfully generalized: the human head spacing and “shadowing” between the ears. Head spacing causes the interaural time delay (“ITD”), while head shadowing describes the loss of level at frequencies greater than approximately 700 Hz, known as interaural level difference (“ILD”), of sounds originating from the side of the head opposite each ear. The inventive microphone array is designed to imprint these aspects of HRTF on the recording because they are similar in nearly all individuals. They contribute a great deal to horizontal localization of sounds, but not all. As discussed above, a listener's individual pinna cues, learned through experience, must agree with head size and shadowing cues, or the listener is confused and deems the sound not lifelike. The pinnae are highly individual, so, unlike prior art microphone arrays that use a dummy head with a “standard” pinna configuration, the inventive microphone array is pinnaless: the only “pinnae” in the system are the listener's. [0035]
  • The microphone baffling 22 attenuates sounds from behind and above in order to avoid interference with the soundfield array that might otherwise cause undesirable ambiguous images and comb filtering for critical frontal sounds. FIGS. 2A-2C show a horizontal and vertical frontal acceptance angle. In one preferred embodiment, the horizontal frontal acceptance angle is 120° and the vertical frontal acceptance angle is 150°. Side and top baffles use the boundary-layer effect with small microphone diaphragms located at the intersection of these planes and the “plane” tangent to the ellipsoid. This avoids high frequency reflections that otherwise would cause undesirable comb filtering and smearing of the microphone's impulse response, which is critically important in this application. The baffles provide 6 dB of acoustic gain above 500 Hz, which, when compensated with equalization, results in a 6 dB increase in signal-to-noise above that frequency and makes possible the use of small diaphragm microphone elements. The microphone may weigh approximately 7 kg (15 lb) and can be mounted on a stand or suspended and tilted as needed. [0036]
  • Microphone positions are designated on FIGS. 2A-2C as FL (front left) 24, FR (front right) 25, BL (back left) 26, and BR (back right) 27. The vectors associated with FL, FR, BL, and BR indicate the general direction of sound which impinges on each of the microphones. In embodiments of the microphone array which use 6 channels, either the FL, FR microphone pair or a mix adding the FL, FR pair to the BL, BR microphone pair, is used. When all four microphones are in use, an additional pair of channels is needed. [0037]
  • For compatibility with ITU-R BS.775-1 two dimensional surround systems, the microphone array may be fitted with the BL, BR microphone pair on the back of the baffle, positioned in coincidence with (within approximately 25 mm or less in 3-dimensional space of) the frontal pair (FL, FR). For anechoic recordings such as out of doors, the baffle is typically flat and the horizontal and vertical acceptance angles are therefore 180° in front or back. Recordings made with the FL, FR, BL, BR microphones are compatible with standard ITU 5.1/6.1 systems. Playback in home theaters with ITU 5.1/6.1 systems, as discussed previously, results in two dimensional surround sound accurate over 360° when played using two cross-talk cancelled stereo dipoles (front and back). Playback can be three dimensional, with an appropriate speaker arrangement, if the B-format microphone signals are captured as well. PerAmbio three dimensional B-format signals may also be generated post-production using hall impulse responses and convolution of the front Ambiophone channels. The PerAmbio outputs of the present invention may be augmented with “spot” microphones highlighting individual instruments as desired by the recording or mixing engineer using methods specific to the present invention. [0038]
  • 2D/3D Playback System
  • The present disclosure describes an encoder for “transformation” processing of 3D recordings in a form compatible with standard ITU 5.1/6.1 systems such that no decoder is needed. In doing so, the mastering engineer may select a useful “mode” that mathematically maps the height information in a way that most suits the performance or venue, e.g., opera, recital, arena concert, movie scene, etc. Eighty combinations of transformation modes are possible, but only a dozen or so are useful to the experienced recording engineer. The transformation mode selected by the recording engineer is reversible and changeable by the mastering engineer during preparations for mass distribution on CD or DVD media, for example. Transformation makes possible not just uncompromised, but potentially improved, 5.1/6.1, CD, DVD, etc. two dimensional media that contains embedded information for lossless 3D “reconstitution”, described below, for example, when a listener adds a 3D decoder and 3D speaker arrangement. [0039]
  • When the user elects to expand to three dimensional sound from a prior art two dimensional system, he adds a “reconstitution” decoder 14 of the present invention, or a receiver/audio controller so-equipped. The reconstitution decoder 14 both: (a) recovers the three dimensional information according to the mode selected by the recording engineer; and (b) develops outputs for feeding, for example, 10, 14, or 26 loudspeakers, including four or more above and below the horizontal plane, depending on the user's resources. In DVD-A, the transformation mode selected by the recording engineer could be encoded in meta-data such that the user's receiver/decoder 14 could automatically select the mode for reconstitution. In addition, the transformation “mode” selected by the recording engineer or mastering engineer, is reversible and changeable by the advanced user as desired in order to enhance reproduction in two dimensional ITU 5.1/6.1 systems. The reconstitution decoder 14 of the present invention has been realized in DSP (digital signal processing) prototype form, has been demonstrated, and is ready as software for a programmable DSP chip ready for manufacture of consumer receivers and professional decoders. [0040]
  • In addition to adding a reconstitution decoder 14, in order to get true 3D reproduction, the user must add, for example, four or five or more speakers (and power amplifiers) for a total of 10, 14, or 26 depending on the user's resources. Ten speakers is the experimentally determined minimum for lifelike results. Referring now to FIG. 6, which is a depiction of a twelve speaker arrangement according to an embodiment of the present disclosure, the two frontal speakers (41, 42) typically are of higher quality and power than the eight ambience speakers (43, 44, 45, 46, 47, 48, 49, 50) and two back speakers (51, 52) which may be of “satellite-quality” and lower in power. Speaker locations are somewhat flexible with decreasing quality of results if varied from recommended positions of the present invention. Whether in the recommended positions or not, the reconstitution decoder 14 of the present invention may be programmed by the user to reflect the exact loudspeaker locations during setup. The “Listening Area” (“Sweet Spot”) is enlarged due to the hybrid nature of the present invention to accommodate 6 persons or more in a space of size commonly used for home theaters. [0041]
  • Encoder
  • FIGS. 3A-3F depict six possible transform modes the inventor has identified as useful. If metadata permitted, the recording engineer could have available all 80 combinations ($3^4 - 1$) considered for encoding 3D directionality into 6 full-range ITU compatible media channels for direct replay in 5.1/6.1. For 3D replay, decoding corresponding to the recording mode is implemented preferably in a DSP chip, but other implementations are contemplated. It may also be possible for users to download new matrices via the Internet. [0042]
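  • The disclosure does not spell out how the 80 combinations are counted. One reading consistent with 3^4 − 1 is that each of the four channels C, SC, SL, and SR may point up, stay level, or point down, with the single all-level case (no height at all) excluded. The sketch below enumerates that reading; the state labels and the exclusion rule are assumptions made only for illustration.

```python
from itertools import product

CHANNELS = ("C", "SC", "SL", "SR")   # channels assumed able to carry height
STATES = ("up", "level", "down")     # hypothetical labels for this sketch


def height_mappings():
    """Yield every assignment of a state to each channel except all-level."""
    for states in product(STATES, repeat=len(CHANNELS)):
        if all(state == "level" for state in states):
            continue  # no channel carries height: the plain 2D case
        yield dict(zip(CHANNELS, states))


mappings = list(height_mappings())
print(len(mappings))  # 3**4 - 1
```

  • Under this reading, an assignment with C and SC up and SL and SR down would be one of the enumerated entries.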
  • The inventor has identified six useful “modes” for use in situations such as music recording, cinema ambience, multi-channel broadcast, etc. A mode chosen during recording may be changed in post-production, or by a user with a “smart decoder” reconstituting original channels and making a new transformation. Changing the tilt of a raised (suspended) microphone is also easily done. For example, in DVD-A mastering, a flag is set in meta data of the tri-play 3D/2D disc for automatic selection by replay equipment. [0043]
  • For ease of use, mnemonics describe the three basic modes, i (FIG. 3A), j (FIG. 3B), & k (FIG. 3C), in terms of ITU 5.1/6.1 channels C (center), SC (surround center), SL (surround left), SR (surround right), L (left), and R (right), illustrated as follows with the source of sound to the right: [0044]
  • FIG. 3A: “i” represents C and SC “inclined” upward while SL and SR incline downward. [0045]
  • FIG. 3B: “j” “juxtaposes” the C, SC, SL, and SR channels from “i”. [0046]
  • FIG. 3C: “k” is lying on its back with C and SC angling upward from the corner channels (L, R, SL, SR) which lie flat. [0047]
  • Three tilted variants i′ (FIG. 3D), j′ (FIG. 3E), and k′ (FIG. 3F) rotate C, SC, SL, and SR with respect to L, R by any practical angle, e.g. −30°, in order to raise the microphone (suspended or on a high stand). The output of the baffled ambiophone varies only slightly with height incidence, so physical tilting is inconsequential for the FL, FR or BL, BR channels. [0048]
  • From experience, recording engineers might identify applications described below for each of the six modes (keeping in mind they can be changed in post or replay): [0049]
  • FIG. 3A (“i”): the microphone array is placed at source level (L, R), below acoustic shell reflections (C), e.g. an outdoor amphitheater event, with audience. [0050]
  • FIG. 3D (“i′”): the array is on a high stand or hanging in an opera house or symphony hall, the orchestra widely spaced in a pit or strings downstage (L, R), singers or winds upstage (C), hall ambience back (SL, SR) & up (SC). [0051]
  • FIG. 3B (“j”): the array is more closely placed before a small ensemble at source level for direct sound and early floor and sidewall reflections (L, R), higher direct solo and ceiling reflections (C), and hall ambience from back-up (SL, SR) and back-down (SC). [0052]
  • FIG. 3E (“j′”): the array hangs closer to a proscenium to pick up downstage sounds (L, R), upstage drama (C), highback ambience (SL, SR), and audience (SC). [0053]
  • FIG. 3C (“k”): the microphone array is in an arena with sports play-action or musical instruments at microphone level (L, R), and with good high-front (C) and back (SC) crowd sounds or ceiling ambience. [0054]
  • FIG. 3F (“k′”): the array is suspended in a cathedral with upstage choir (C) and front-of-church organ divisions and floor reflections (L, R), antiphonal and congregation in back (SL, SR), and organ trumpet overhead (SC). [0055]
  • After recording six PerAmbio 3D channels, given as {Pin} in 6×1 matrix form, a “transformation” matrix {S}: [0056]

$$
\{S\} =
\begin{pmatrix}
s(L,FL) & s(L,FR) & s(L,W) & s(L,X) & s(L,Y) & s(L,Z) \\
s(R,FL) & s(R,FR) & s(R,W) & s(R,X) & s(R,Y) & s(R,Z) \\
s(C,FL) & s(C,FR) & s(C,W) & s(C,X) & s(C,Y) & s(C,Z) \\
s(SC,FL) & s(SC,FR) & s(SC,W) & s(SC,X) & s(SC,Y) & s(SC,Z) \\
s(SL,FL) & s(SL,FR) & s(SL,W) & s(SL,X) & s(SL,Y) & s(SL,Z) \\
s(SR,FL) & s(SR,FR) & s(SR,W) & s(SR,X) & s(SR,Y) & s(SR,Z)
\end{pmatrix}
$$

  • is applied to obtain the six ITU-compatible media channels {Sout} as follows: [0057]

$$
\{S_{\mathrm{out}}\} = \{S\} \cdot \{P_{\mathrm{in}}\}
$$

    where {S} is defined above, and

$$
\{S_{\mathrm{out}}\} = \begin{pmatrix} L \\ R \\ C \\ SC \\ SL \\ SR \end{pmatrix},
\qquad
\{P_{\mathrm{in}}\} = \begin{pmatrix} FL \\ FR \\ W \\ X \\ Y \\ Z \end{pmatrix}
$$
  • For a standard ITU home theater surround system, a multi-channel disc (6 discrete channel DVD-A, SACD, or DTS-CD/DVD-V) plays {Sout} directly in 5.1/6.1. If the speaker layout is 5.1, current implementations sum SC information into SL and SR speaker feeds at −3 dB. [0058]
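  • The −3 dB fold-down of SC for a 5.1 layout can be sketched as follows. The function name `fold_sc_into_surrounds` and the dictionary-based frame are hypothetical choices for illustration; only the rule itself (SC summed into SL and SR at −3 dB) comes from the text above.

```python
SC_GAIN = 10.0 ** (-3.0 / 20.0)  # -3 dB as a linear amplitude factor, about 0.708


def fold_sc_into_surrounds(frame):
    """Return 5.1-style feeds from one frame of {L, R, C, SC, SL, SR}.

    SC is added to both SL and SR at -3 dB; L, R and C pass through unchanged.
    """
    return {
        "L": frame["L"],
        "R": frame["R"],
        "C": frame["C"],
        "SL": frame["SL"] + SC_GAIN * frame["SC"],
        "SR": frame["SR"] + SC_GAIN * frame["SC"],
    }


feeds = fold_sc_into_surrounds(
    {"L": 0.1, "R": 0.2, "C": 0.3, "SC": 1.0, "SL": 0.0, "SR": 0.0}
)
print(feeds)
```

  • With a gain of about 0.708 in each surround, the two feeds together carry roughly the same acoustic power as the original SC signal.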
  • When the user augments his system for 3D, a “reconstitution” matrix {P} is applied, which may be implemented in DSP, in response to flags in meta data that select one of six recording modes to recover losslessly PerAmbio 3D, in matrix form {Pout}, as follows: [0059]
  • $\{P_{\mathrm{out}}\} = \{P\} \cdot \{S_{\mathrm{out}}\}$
  • Since matrix {P} is the inverse of matrix {S},
  • $\{P_{\mathrm{out}}\} = \{S\}^{-1} \cdot \{S_{\mathrm{out}}\}$
  • PerAmbio 3D reconstitution is lossless if
  • $\{P_{\mathrm{out}}\} = \{P_{\mathrm{in}}\}$.
  • Experiments have led to improved matrices for the six transformation modes depicted in FIGS. 3A-3F. These matrices are shown in FIGS. 4A-4F, respectively. [0060]
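  • The encode and reconstitute round trip can be checked numerically with a minimal sketch. The matrix `S` below is a hypothetical invertible example, loosely patterned on mode “i” (C and SC inclined upward, SL and SR downward); it is not one of the matrices of FIGS. 4A-4F, and the helper names `mat_vec` and `invert` are illustrative only.

```python
def mat_vec(matrix, vector):
    """Multiply a square matrix (list of rows) by a column vector (list)."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


def invert(matrix):
    """Invert a square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(matrix)
    # Augment each row with the matching row of the identity matrix.
    aug = [[float(x) for x in row] + [float(i == j) for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular, so it cannot be reconstituted")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [x / scale for x in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


# HYPOTHETICAL transformation matrix, NOT one of the matrices of FIGS. 4A-4F.
# Rows: L, R, C, SC, SL, SR. Columns: FL, FR, W, X, Y, Z.
S = [
    [1.0, 0.0, 0.0,  0.0,  0.0,  0.0],   # L  carries FL
    [0.0, 1.0, 0.0,  0.0,  0.0,  0.0],   # R  carries FR
    [0.0, 0.0, 0.5,  0.5,  0.0,  0.5],   # C  combines W, +X, +Z
    [0.0, 0.0, 0.5, -0.5,  0.0,  0.5],   # SC combines W, -X, +Z
    [0.0, 0.0, 0.5,  0.0,  0.5, -0.5],   # SL combines W, +Y, -Z
    [0.0, 0.0, 0.5,  0.0, -0.5, -0.5],   # SR combines W, -Y, -Z
]

p_in = [0.5, -0.25, 0.1, 0.2, -0.3, 0.4]   # one frame of FL, FR, W, X, Y, Z
s_out = mat_vec(S, p_in)                   # encoder: {Sout} = {S}.{Pin}
P = invert(S)                              # reconstitution matrix {P} = {S}^-1
p_out = mat_vec(P, s_out)                  # decoder: {Pout} = {P}.{Sout}

error = max(abs(a - b) for a, b in zip(p_out, p_in))
print("largest reconstitution error:", error)
```

  • The round trip recovers {Pin} up to floating-point rounding, which is possible only because the example matrix is invertible; a singular {S} would make lossless reconstitution impossible.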
  • Decoder
  • In order to play back the encoded channels in 3D, the encoded signals must be decoded. For example, if a user chooses to install 3D speakers, power amplifiers, etc., in order to reproduce the 3D sound field, a “reconstitution” decoder must also be added as shown in FIG. 1A. The decoder applies the inverse of the transformation matrix, or “reconstitution matrix” chosen for the recording. The reconstitution matrices for the transformation matrices in FIGS. 4A-4F are shown in FIGS. 5A-5F, respectively. [0061]
  • Speaker Arrangements
  • FIG. 6 depicts recommended loudspeaker positions for a preferred embodiment of the inventive system using 12 speakers. Another preferred embodiment uses ten speakers comprising all the speakers in FIG. 6 with the exception of the BL and BR speakers. In the loudspeaker positions of the depicted embodiment, the present inventive system is compatible with existing two dimensional recordings made in ITU 5.1 or 6.1 format: when the listener moves backward by 26% of the diameter of the speaker layout, the relative positions of the two dimensional speakers to the listener are in full compliance with standard ITU-R BS.775. Best results also require changing levels and delays of the four to six speakers affected, which could be a programmable function of DSP in the receiver/audio controller. Thus, the present invention offers full forward as well as backward compatibility between two dimensional and three dimensional recordings for all home theater users both before they expand their systems to three dimensions and thereafter. [0062]
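  • The 26% figure can be checked with simple geometry. The sketch below assumes the L, R speakers sit at ±45° and the SL, SR speakers at ±135° on a common circle centred on the original listening position (the azimuths given for the preferred arrangement), and that it is the listener who moves straight backward; the function name `azimuth_from` is illustrative only.

```python
import math


def azimuth_from(listener_y, speaker_az_deg, radius=1.0):
    """Azimuth in degrees from straight ahead of a speaker on a circle of the
    given radius, seen by a listener at (0, listener_y).

    The circle is centred on the origin; +y is forward and +x is to the right.
    """
    az = math.radians(speaker_az_deg)
    x = radius * math.sin(az)
    y = radius * math.cos(az) - listener_y
    return math.degrees(math.atan2(x, y))


RADIUS = 1.0
shift = 0.26 * (2.0 * RADIUS)   # 26% of the layout diameter

# A negative listener_y means the listener has moved backward.
front = azimuth_from(-shift, 45.0, RADIUS)      # speaker originally at 45 degrees
surround = azimuth_from(-shift, 135.0, RADIUS)  # speaker originally at 135 degrees
print(round(front, 1), round(surround, 1))
```

  • Under these assumptions the front pair appears at roughly ±30° and the surround pair at roughly ±105°, which falls within the ±30° front and 100° to 120° surround placements of ITU-R BS.775.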
  • In a preferred 10-speaker arrangement, the speakers are arranged as follows: [0063]
  • The FL, FR speakers are positioned so that: [0064]
  • azimuthally, one is approximately 8 degrees to the left of and the other is approximately 8 degrees to the right of the 12 o'clock position (i.e., directly in front) of a listener; and [0065]
  • elevationally, both are positioned substantially on a horizontal plane that intersects the listener's ears. [0066]
  • The L, R speakers are positioned so that: [0067]
  • azimuthally, one is approximately 45 degrees to the left of and the other is approximately 45 degrees to the right of the 12 o'clock position of the listener; and [0068]
  • elevationally, both are positioned substantially on said horizontal plane. [0069]
  • The SL, SR speakers are positioned so that: [0070]
  • azimuthally, one is approximately 135 degrees to the left of and the other is approximately 135 degrees to the right of the 12 o'clock position of the listener; and [0071]
  • elevationally, both are positioned substantially on said horizontal plane. [0072]
  • The UL, UR speakers are positioned so that: [0073]
  • azimuthally, one is approximately 90 degrees to the left of and the other is approximately 90 degrees to the right of the 12 o'clock position of the listener; and [0074]
  • elevationally, both are positioned above said horizontal plane. [0075]
  • The DL, DR speakers are positioned so that: [0076]
  • azimuthally, one is approximately 90 degrees to the left of and the other is approximately 90 degrees to the right of the 12 o'clock position of the listener; and [0077]
  • elevationally, both are positioned below said horizontal plane. [0078]
  • In a preferred 12-speaker arrangement, two speakers are added to the above arrangement as follows: [0079]
  • The BL, BR speakers are positioned so that: [0080]
  • azimuthally, one is approximately 172 degrees to the left of and the other is approximately 172 degrees to the right of the 12 o'clock position of a listener; and [0081]
  • elevationally, both are positioned substantially on a horizontal plane that intersects the listener's ears. [0082]
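The arrangement above can be written down as a small table of angles and converted to direction vectors, for example as input to a DSP configuration. The following is a minimal sketch: the azimuths come from the text, but the text only says that the UL, UR, DL and DR speakers are above or below the horizontal plane, so the 45 degree elevation used here is a hypothetical placeholder. The axis convention (x forward, y left, z up) and the function name are likewise assumptions.

```python
import math

ASSUMED_HEIGHT_ELEVATION = 45.0  # degrees; hypothetical, not given in the text

# name: (azimuth, elevation) in degrees; positive azimuth is to the listener's left
SPEAKERS = {
    "FL": (8.0, 0.0), "FR": (-8.0, 0.0),
    "L": (45.0, 0.0), "R": (-45.0, 0.0),
    "SL": (135.0, 0.0), "SR": (-135.0, 0.0),
    "UL": (90.0, ASSUMED_HEIGHT_ELEVATION), "UR": (-90.0, ASSUMED_HEIGHT_ELEVATION),
    "DL": (90.0, -ASSUMED_HEIGHT_ELEVATION), "DR": (-90.0, -ASSUMED_HEIGHT_ELEVATION),
    "BL": (172.0, 0.0), "BR": (-172.0, 0.0),  # 12-speaker arrangement only
}


def direction(azimuth_deg, elevation_deg):
    """Unit vector (x forward, y left, z up) pointing from listener to speaker."""
    az = math.radians(azimuth_deg)
    el = math.radians(elevation_deg)
    return (math.cos(el) * math.cos(az),
            math.cos(el) * math.sin(az),
            math.sin(el))


DIRECTIONS = {name: direction(az, el) for name, (az, el) in SPEAKERS.items()}
```

Dropping the BL and BR entries gives the 10-speaker arrangement.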
  • Although the various aspects of the present invention have been described with respect to their preferred embodiments, it will be understood that the present invention is entitled to protection within the full scope of the appended claims. [0083]

Claims (64)

I claim:
1. A system for producing an output sound field that is representative of an input sound field, comprising:
a microphone array for receiving the input sound field and producing therefrom a microphone signal (“Pin”) representative of the input sound field wherein Pin comprises B-format channels, an FL (front left) channel, and an FR (front right) channel;
an encoder for producing an encoded signal (“Sout”) from Pin wherein Sout comprises an ITU-compatible six channel signal;
a decoder for producing a decoded signal (“Pout”) from Sout wherein Pout comprises B-format channels, an FL channel, and an FR channel; and
a plurality of speakers for producing the output sound field from Pout.
2. The system of claim 1 wherein the hybrid microphone array is comprised of:
at least 6 microphones; and
a baffle including a substantially ellipsoidal structure.
3. The system of claim 2 wherein four of said microphones are arranged in a tetrahedron.
4. The system of claim 3 wherein the plurality of speakers produces the output sound field from Sout.
5. The system of claim 4 wherein the plurality of speakers are arranged in a 2D arrangement.
6. The system of claim 1 wherein Pin and Sout are each a 6×1 matrix and the encoder produces Sout by multiplying Pin by a 6×6 transformation matrix (“S”).
7. The system of claim 1 wherein S comprises the quantities:
$$
\begin{bmatrix}
s(L, FL) & s(L, FR) & s(L, W) & s(L, X) & s(L, Y) & s(L, Z) \\
s(R, FL) & s(R, FR) & s(R, W) & s(R, X) & s(R, Y) & s(R, Z) \\
s(C, FL) & s(C, FR) & s(C, W) & s(C, X) & s(C, Y) & s(C, Z) \\
s(SC, FL) & s(SC, FR) & s(SC, W) & s(SC, X) & s(SC, Y) & s(SC, Z) \\
s(SL, FL) & s(SL, FR) & s(SL, W) & s(SL, X) & s(SL, Y) & s(SL, Z) \\
s(SR, FL) & s(SR, FR) & s(SR, W) & s(SR, X) & s(SR, Y) & s(SR, Z)
\end{bmatrix}
$$
wherein:
L represents a left speaker channel for an ITU-compatible six channel signal;
R represents a right speaker channel for an ITU-compatible six channel signal;
C represents a center speaker channel for an ITU-compatible six channel signal;
SC represents a surround center speaker channel for an ITU-compatible six channel signal;
SL represents a surround left speaker channel for an ITU-compatible six channel signal;
SR represents a surround right speaker channel for an ITU-compatible six channel signal;
FL represents the front left speaker channel;
FR represents the front right speaker channel;
W represents a B-format channel;
X represents a B-format channel;
Y represents a B-format channel;
Z represents a B-format channel;
and wherein
s(α, β) represents a transformation quantity relating the respective α and β channels.
8. The system of claim 7 wherein S comprises the following approximate quantities:
$$
\begin{bmatrix}
0.850 & 0 & 0 & 0 & 0 & 0 \\
0 & 0.850 & 0 & 0 & 0 & 0 \\
0 & 0 & 0.601 & 0.736 & 0 & 0.425 \\
0 & 0 & 0.601 & -0.736 & 0 & 0.425 \\
0 & 0 & 0.601 & -0.368 & 0.638 & -0.425 \\
0 & 0 & 0.601 & -0.368 & -0.638 & -0.425
\end{bmatrix}
$$
9. The system of claim 7 wherein S comprises the following approximate quantities:
$$
\begin{bmatrix}
0.850 & 0 & 0 & 0 & 0 & 0 \\
0 & 0.850 & 0 & 0 & 0 & 0 \\
0 & 0 & 0.601 & 0.736 & 0 & -0.425 \\
0 & 0 & 0.601 & -0.736 & 0 & -0.425 \\
0 & 0 & 0.601 & -0.368 & 0.638 & 0.425 \\
0 & 0 & 0.601 & -0.368 & -0.638 & 0.425
\end{bmatrix}
$$
10. The system of claim 7 wherein S comprises the following approximate quantities:
$$
\begin{bmatrix}
0.850 & 0 & 0 & 0 & 0 & 0 \\
0 & 0.850 & 0 & 0 & 0 & 0 \\
0 & 0 & 0.601 & 0.736 & 0 & 0.425 \\
0 & 0 & 0.601 & -0.425 & 0 & 0.736 \\
0 & 0 & 0.601 & -0.425 & 0.736 & 0 \\
0 & 0 & 0.601 & -0.425 & -0.736 & 0
\end{bmatrix}
$$
11. The system of claim 7 wherein S comprises the following approximate quantities:
$$
\begin{bmatrix}
0.850 & 0 & 0 & 0 & 0 & 0 \\
0 & 0.850 & 0 & 0 & 0 & 0 \\
0 & 0 & 0.601 & 0.850 & 0 & 0 \\
0 & 0 & 0.601 & -0.425 & 0 & 0.736 \\
0 & 0 & 0.601 & -0.531 & 0.638 & -0.184 \\
0 & 0 & 0.601 & -0.531 & -0.638 & -0.184
\end{bmatrix}
$$
12. The system of claim 7 wherein S comprises the following approximate quantities:
$$
\begin{bmatrix}
0.850 & 0 & 0 & 0 & 0 & 0 \\
0 & 0.850 & 0 & 0 & 0 & 0 \\
0 & 0 & 0.601 & 0.425 & 0 & -0.736 \\
0 & 0 & 0.601 & -0.850 & 0 & 0 \\
0 & 0 & 0.601 & -0.106 & 0.638 & 0.552 \\
0 & 0 & 0.601 & -0.106 & -0.638 & 0.552
\end{bmatrix}
$$
13. The system of claim 7 wherein S comprises the following approximate quantities:
$$
\begin{bmatrix}
0.850 & 0 & 0 & 0 & 0 & 0 \\
0 & 0.850 & 0 & 0 & 0 & 0 \\
0 & 0 & 0.601 & 0.850 & 0 & 0 \\
0 & 0 & 0.601 & 0 & 0 & 0.850 \\
0 & 0 & 0.601 & -0.368 & 0.736 & 0.213 \\
0 & 0 & 0.601 & -0.368 & -0.736 & 0.213
\end{bmatrix}
$$
14. The system of claim 6 wherein Pout is a 6×1 matrix and the decoder produces Pout by multiplying Sout by the inverse of S.
15. The system of claim 1 wherein the plurality of speakers are arranged in a 3D arrangement.
16. The system of claim 15 wherein the plurality of speakers is ten.
17. The system of claim 16 wherein:
a first two of said speakers are positioned so that:
azimuthally, one is approximately 8 degrees to the left of and the other is approximately 8 degrees to the right of the 12 o'clock position of a listener; and
elevationally, both are positioned substantially on a horizontal plane that intersects the listener's ears;
a second two of said speakers are positioned so that:
azimuthally, one is approximately 45 degrees to the left of and the other is approximately 45 degrees to the right of the 12 o'clock position of the listener; and
elevationally, both are positioned substantially on said horizontal plane;
a third two of said speakers are positioned so that:
azimuthally, one is approximately 135 degrees to the left of and the other is approximately 135 degrees to the right of the 12 o'clock position of the listener; and
elevationally, both are positioned substantially on said horizontal plane;
a fourth two of said speakers are positioned so that:
azimuthally, one is approximately 90 degrees to the left of and the other is approximately 90 degrees to the right of the 12 o'clock position of the listener; and
elevationally, both are positioned above said horizontal plane; and
a fifth two of said speakers are positioned so that:
azimuthally, one is approximately 90 degrees to the left of and the other is approximately 90 degrees to the right of the 12 o'clock position of the listener; and
elevationally, both are positioned below said horizontal plane.
18. The system of claim 17 further comprising at least two additional speakers.
19. The system of claim 18 wherein:
a sixth two of said speakers are positioned so that:
azimuthally, one is approximately 172 degrees to the left of and the other is approximately 172 degrees to the right of the 12 o'clock position of a listener; and
elevationally, both are positioned substantially on a horizontal plane that intersects the listener's ears.
20. A system for providing an encoded signal (“Sout”) representative of an input sound field, comprising:
a microphone array for receiving the input sound field and producing therefrom a microphone signal (“Pin”) representative of the input sound field wherein Pin comprises B-format channels, an FL (front left) channel, and an FR (front right) channel;
an encoder for producing Sout from Pin wherein Sout comprises an ITU-compatible six channel signal.
21. The system of claim 20 wherein the hybrid microphone array is comprised of:
at least 6 microphones; and
a baffle including a substantially ellipsoidal structure.
22. The system of claim 21 wherein four of said microphones are arranged in a tetrahedron.
23. The system of claim 20 wherein Pin and Sout are each a 6×1 matrix and the encoder produces Sout by multiplying Pin by a 6×6 transformation matrix (“S”).
24. The system of claim 20 wherein S comprises the quantities:
$$
\begin{bmatrix}
s(L, FL) & s(L, FR) & s(L, W) & s(L, X) & s(L, Y) & s(L, Z) \\
s(R, FL) & s(R, FR) & s(R, W) & s(R, X) & s(R, Y) & s(R, Z) \\
s(C, FL) & s(C, FR) & s(C, W) & s(C, X) & s(C, Y) & s(C, Z) \\
s(SC, FL) & s(SC, FR) & s(SC, W) & s(SC, X) & s(SC, Y) & s(SC, Z) \\
s(SL, FL) & s(SL, FR) & s(SL, W) & s(SL, X) & s(SL, Y) & s(SL, Z) \\
s(SR, FL) & s(SR, FR) & s(SR, W) & s(SR, X) & s(SR, Y) & s(SR, Z)
\end{bmatrix}
$$
wherein:
L represents a left speaker channel for an ITU-compatible six channel signal;
R represents a right speaker channel for an ITU-compatible six channel signal;
C represents a center speaker channel for an ITU-compatible six channel signal;
SC represents a surround center speaker channel for an ITU-compatible six channel signal;
SL represents a surround left speaker channel for an ITU-compatible six channel signal;
SR represents a surround right speaker channel for an ITU-compatible six channel signal;
FL represents the front left speaker channel;
FR represents the front right speaker channel;
W represents a B-format channel;
X represents a B-format channel;
Y represents a B-format channel;
Z represents a B-format channel;
and wherein
s(α, β) represents a transformation quantity relating the respective α and β channels.
25. The system of claim 24 wherein S comprises the following approximate quantities:
$$
\begin{bmatrix}
0.850 & 0 & 0 & 0 & 0 & 0 \\
0 & 0.850 & 0 & 0 & 0 & 0 \\
0 & 0 & 0.601 & 0.736 & 0 & 0.425 \\
0 & 0 & 0.601 & -0.736 & 0 & 0.425 \\
0 & 0 & 0.601 & -0.368 & 0.638 & -0.425 \\
0 & 0 & 0.601 & -0.368 & -0.638 & -0.425
\end{bmatrix}
$$
26. The system of claim 24 wherein S comprises the following approximate quantities:
$$
\begin{bmatrix}
0.850 & 0 & 0 & 0 & 0 & 0 \\
0 & 0.850 & 0 & 0 & 0 & 0 \\
0 & 0 & 0.601 & 0.736 & 0 & -0.425 \\
0 & 0 & 0.601 & -0.736 & 0 & -0.425 \\
0 & 0 & 0.601 & -0.368 & 0.638 & 0.425 \\
0 & 0 & 0.601 & -0.368 & -0.638 & 0.425
\end{bmatrix}
$$
27. The system of claim 24 wherein S comprises the following approximate quantities:
$$
\begin{bmatrix}
0.850 & 0 & 0 & 0 & 0 & 0 \\
0 & 0.850 & 0 & 0 & 0 & 0 \\
0 & 0 & 0.601 & 0.736 & 0 & 0.425 \\
0 & 0 & 0.601 & -0.425 & 0 & 0.736 \\
0 & 0 & 0.601 & -0.425 & 0.736 & 0 \\
0 & 0 & 0.601 & -0.425 & -0.736 & 0
\end{bmatrix}
$$
28. The system of claim 24 wherein S comprises the following approximate quantities:
$$
\begin{bmatrix}
0.850 & 0 & 0 & 0 & 0 & 0 \\
0 & 0.850 & 0 & 0 & 0 & 0 \\
0 & 0 & 0.601 & 0.850 & 0 & 0 \\
0 & 0 & 0.601 & -0.425 & 0 & 0.736 \\
0 & 0 & 0.601 & -0.531 & 0.638 & -0.184 \\
0 & 0 & 0.601 & -0.531 & -0.638 & -0.184
\end{bmatrix}
$$
29. The system of claim 24 wherein S comprises the following approximate quantities:
$$
\begin{bmatrix}
0.850 & 0 & 0 & 0 & 0 & 0 \\
0 & 0.850 & 0 & 0 & 0 & 0 \\
0 & 0 & 0.601 & 0.425 & 0 & -0.736 \\
0 & 0 & 0.601 & -0.850 & 0 & 0 \\
0 & 0 & 0.601 & -0.106 & 0.638 & 0.552 \\
0 & 0 & 0.601 & -0.106 & -0.638 & 0.552
\end{bmatrix}
$$
30. The system of claim 24 wherein S comprises the following approximate quantities:
$$
\begin{bmatrix}
0.850 & 0 & 0 & 0 & 0 & 0 \\
0 & 0.850 & 0 & 0 & 0 & 0 \\
0 & 0 & 0.601 & 0.850 & 0 & 0 \\
0 & 0 & 0.601 & 0 & 0 & 0.850 \\
0 & 0 & 0.601 & -0.368 & 0.736 & 0.213 \\
0 & 0 & 0.601 & -0.368 & -0.736 & 0.213
\end{bmatrix}
$$
31. The system of claim 23 wherein Pout is a 6×1 matrix and the decoder produces Pout by multiplying Sout by the inverse of S.
32. A method for producing an output sound field that is representative of an input sound field, comprising the steps of:
providing a microphone array for receiving the input sound field and producing therefrom a microphone signal (“Pin”) representative of the input sound field wherein Pin comprises B-format channels, an FL channel, and an FR channel;
producing an encoded signal (“Sout”) from Pin wherein Sout comprises an ITU-compatible six channel signal;
producing a decoded signal (“Pout”) from Sout wherein Pout comprises B-format channels, an FL channel, and an FR channel; and
providing a plurality of speakers for producing the output sound field from Pout to thereby represent the input sound field.
33. The method of claim 32 wherein the hybrid microphone array is comprised of:
at least 6 microphones; and
a substantially ellipsoidal baffle.
34. The method of claim 33 wherein four of said microphones are arranged in a tetrahedron.
35. The method of claim 34 wherein the plurality of speakers produces the output sound field from Sout.
36. The method of claim 35 wherein the plurality of speakers are provided in a 2D arrangement.
37. The method of claim 32 wherein Pin and Sout are each a 6×1 matrix and the encoder produces Sout by multiplying Pin by a 6×6 transformation matrix (“S”).
38. The method of claim 32 wherein S comprises the quantities:
$$
\begin{bmatrix}
s(L, FL) & s(L, FR) & s(L, W) & s(L, X) & s(L, Y) & s(L, Z) \\
s(R, FL) & s(R, FR) & s(R, W) & s(R, X) & s(R, Y) & s(R, Z) \\
s(C, FL) & s(C, FR) & s(C, W) & s(C, X) & s(C, Y) & s(C, Z) \\
s(SC, FL) & s(SC, FR) & s(SC, W) & s(SC, X) & s(SC, Y) & s(SC, Z) \\
s(SL, FL) & s(SL, FR) & s(SL, W) & s(SL, X) & s(SL, Y) & s(SL, Z) \\
s(SR, FL) & s(SR, FR) & s(SR, W) & s(SR, X) & s(SR, Y) & s(SR, Z)
\end{bmatrix}
$$
wherein:
L represents a left speaker channel for an ITU-compatible six channel signal;
R represents a right speaker channel for an ITU-compatible six channel signal;
C represents a center speaker channel for an ITU-compatible six channel signal;
SC represents a surround center speaker channel for an ITU-compatible six channel signal;
SL represents a surround left speaker channel for an ITU-compatible six channel signal;
SR represents a surround right speaker channel for an ITU-compatible six channel signal;
FL represents the front left speaker channel;
FR represents the front right speaker channel;
W represents a B-format channel;
X represents a B-format channel;
Y represents a B-format channel;
Z represents a B-format channel;
and wherein
s(α, β) represents a transformation quantity relating the respective α and β channels.
39. The system of claim 38 wherein S comprises the following approximate quantities:
$$
\begin{bmatrix}
0.850 & 0 & 0 & 0 & 0 & 0 \\
0 & 0.850 & 0 & 0 & 0 & 0 \\
0 & 0 & 0.601 & 0.736 & 0 & 0.425 \\
0 & 0 & 0.601 & -0.736 & 0 & 0.425 \\
0 & 0 & 0.601 & -0.368 & 0.638 & -0.425 \\
0 & 0 & 0.601 & -0.368 & -0.638 & -0.425
\end{bmatrix}
$$
40. The system of claim 38 wherein S comprises the following approximate quantities:
$$
\begin{bmatrix}
0.850 & 0 & 0 & 0 & 0 & 0 \\
0 & 0.850 & 0 & 0 & 0 & 0 \\
0 & 0 & 0.601 & 0.736 & 0 & -0.425 \\
0 & 0 & 0.601 & -0.736 & 0 & -0.425 \\
0 & 0 & 0.601 & -0.368 & 0.638 & 0.425 \\
0 & 0 & 0.601 & -0.368 & -0.638 & 0.425
\end{bmatrix}
$$
41. The system of claim 38 wherein S comprises the following approximate quantities:
$$
\begin{bmatrix}
0.850 & 0 & 0 & 0 & 0 & 0 \\
0 & 0.850 & 0 & 0 & 0 & 0 \\
0 & 0 & 0.601 & 0.736 & 0 & 0.425 \\
0 & 0 & 0.601 & -0.425 & 0 & 0.736 \\
0 & 0 & 0.601 & -0.425 & 0.736 & 0 \\
0 & 0 & 0.601 & -0.425 & -0.736 & 0
\end{bmatrix}
$$
42. The system of claim 38 wherein S comprises the following approximate quantities:
$$
\begin{bmatrix}
0.850 & 0 & 0 & 0 & 0 & 0 \\
0 & 0.850 & 0 & 0 & 0 & 0 \\
0 & 0 & 0.601 & 0.850 & 0 & 0 \\
0 & 0 & 0.601 & -0.425 & 0 & 0.736 \\
0 & 0 & 0.601 & -0.531 & 0.638 & -0.184 \\
0 & 0 & 0.601 & -0.531 & -0.638 & -0.184
\end{bmatrix}
$$
43. The system of claim 38 wherein S comprises the following approximate quantities:
$$
\begin{bmatrix}
0.850 & 0 & 0 & 0 & 0 & 0 \\
0 & 0.850 & 0 & 0 & 0 & 0 \\
0 & 0 & 0.601 & 0.425 & 0 & -0.736 \\
0 & 0 & 0.601 & -0.850 & 0 & 0 \\
0 & 0 & 0.601 & -0.106 & 0.638 & 0.552 \\
0 & 0 & 0.601 & -0.106 & -0.638 & 0.552
\end{bmatrix}
$$
44. The system of claim 38 wherein S comprises the following approximate quantities:
$$
\begin{bmatrix}
0.850 & 0 & 0 & 0 & 0 & 0 \\
0 & 0.850 & 0 & 0 & 0 & 0 \\
0 & 0 & 0.601 & 0.850 & 0 & 0 \\
0 & 0 & 0.601 & 0 & 0 & 0.850 \\
0 & 0 & 0.601 & -0.368 & 0.736 & 0.213 \\
0 & 0 & 0.601 & -0.368 & -0.736 & 0.213
\end{bmatrix}
$$
45. The method of claim 37 wherein Pout is a 6×1 matrix and the decoder produces Pout by multiplying Sout by the inverse of S.
46. The method of claim 32 wherein the plurality of speakers are arranged in a 3D arrangement.
47. The method of claim 46 wherein the plurality of speakers is ten.
48. The method of claim 47 wherein:
a first two of said speakers are positioned so that:
azimuthally, one is approximately 8 degrees to the left of and the other is approximately 8 degrees to the right of the 12 o'clock position of a listener; and
elevationally, both are positioned substantially on a horizontal plane that intersects the listener's ears;
a second two of said speakers are positioned so that:
azimuthally, one is approximately 45 degrees to the left of and the other is approximately 45 degrees to the right of the 12 o'clock position of the listener; and
elevationally, both are positioned substantially on said horizontal plane;
a third two of said speakers are positioned so that:
azimuthally, one is approximately 135 degrees to the left of and the other is approximately 135 degrees to the right of the 12 o'clock position of the listener; and
elevationally, both are positioned substantially on said horizontal plane;
a fourth two of said speakers are positioned so that:
azimuthally, one is approximately 90 degrees to the left of and the other is approximately 90 degrees to the right of the 12 o'clock position of the listener; and
elevationally, both are positioned above said horizontal plane; and
a fifth two of said speakers are positioned so that:
azimuthally, one is approximately 90 degrees to the left of and the other is approximately 90 degrees to the right of the 12 o'clock position of the listener; and
elevationally, both are positioned below said horizontal plane.
49. The method of claim 48 further comprising at least two additional speakers.
50. The method of claim 49 wherein:
a sixth two of said speakers are positioned so that:
azimuthally, one is approximately 172 degrees to the left of and the other is approximately 172 degrees to the right of the 12 o'clock position of a listener; and
elevationally, both are positioned substantially on a horizontal plane that intersects the listener's ears.
51. A method for producing an encoded signal (“Sout”) representative of an input sound field, comprising the steps:
providing a microphone array for receiving the input sound field and producing therefrom a microphone signal (“Pin”) representative of the input sound field wherein Pin comprises B-format channels, an FL (front left) channel, and an FR (front right) channel;
producing Sout from Pin wherein Sout comprises an ITU-compatible six channel signal.
52. The method of claim 51 wherein the hybrid microphone array is comprised of:
at least 6 microphones; and
a substantially ellipsoidal shaped baffle.
53. The method of claim 52 wherein four of said microphones are arranged in a tetrahedron.
54. The method of claim 51 wherein Pin and Sout are each a 6×1 matrix and the encoder produces Sout by multiplying Pin by a 6×6 transformation matrix (“S”).
55. The method of claim 51 wherein S comprises the quantities:
$$
\begin{bmatrix}
s(L, FL) & s(L, FR) & s(L, W) & s(L, X) & s(L, Y) & s(L, Z) \\
s(R, FL) & s(R, FR) & s(R, W) & s(R, X) & s(R, Y) & s(R, Z) \\
s(C, FL) & s(C, FR) & s(C, W) & s(C, X) & s(C, Y) & s(C, Z) \\
s(SC, FL) & s(SC, FR) & s(SC, W) & s(SC, X) & s(SC, Y) & s(SC, Z) \\
s(SL, FL) & s(SL, FR) & s(SL, W) & s(SL, X) & s(SL, Y) & s(SL, Z) \\
s(SR, FL) & s(SR, FR) & s(SR, W) & s(SR, X) & s(SR, Y) & s(SR, Z)
\end{bmatrix}
$$
wherein:
L represents a left speaker channel for an ITU-compatible six channel signal;
R represents a right speaker channel for an ITU-compatible six channel signal;
C represents a center speaker channel for an ITU-compatible six channel signal;
SC represents a surround center speaker channel for an ITU-compatible six channel signal;
SL represents a surround left speaker channel for an ITU-compatible six channel signal;
SR represents a surround right speaker channel for an ITU-compatible six channel signal;
FL represents the front left speaker channel;
FR represents the front right speaker channel;
W represents a B-format channel;
X represents a B-format channel;
Y represents a B-format channel;
Z represents a B-format channel;
and wherein
s(α, β) represents a transformation quantity relating the respective α and β channels.
56. The system of claim 55 wherein S comprises the following approximate quantities:
$$
\begin{bmatrix}
0.850 & 0 & 0 & 0 & 0 & 0 \\
0 & 0.850 & 0 & 0 & 0 & 0 \\
0 & 0 & 0.601 & 0.736 & 0 & 0.425 \\
0 & 0 & 0.601 & -0.736 & 0 & 0.425 \\
0 & 0 & 0.601 & -0.368 & 0.638 & -0.425 \\
0 & 0 & 0.601 & -0.368 & -0.638 & -0.425
\end{bmatrix}
$$
57. The system of claim 55 wherein S comprises the following approximate quantities:
$$
\begin{bmatrix}
0.850 & 0 & 0 & 0 & 0 & 0 \\
0 & 0.850 & 0 & 0 & 0 & 0 \\
0 & 0 & 0.601 & 0.736 & 0 & -0.425 \\
0 & 0 & 0.601 & -0.736 & 0 & -0.425 \\
0 & 0 & 0.601 & -0.368 & 0.638 & 0.425 \\
0 & 0 & 0.601 & -0.368 & -0.638 & 0.425
\end{bmatrix}
$$
58. The system of claim 55 wherein S comprises the following approximate quantities:
$$
\begin{bmatrix}
0.850 & 0 & 0 & 0 & 0 & 0 \\
0 & 0.850 & 0 & 0 & 0 & 0 \\
0 & 0 & 0.601 & 0.736 & 0 & 0.425 \\
0 & 0 & 0.601 & -0.425 & 0 & 0.736 \\
0 & 0 & 0.601 & -0.425 & 0.736 & 0 \\
0 & 0 & 0.601 & -0.425 & -0.736 & 0
\end{bmatrix}
$$
59. The system of claim 55 wherein S comprises the following approximate quantities:
$$
\begin{bmatrix}
0.850 & 0 & 0 & 0 & 0 & 0 \\
0 & 0.850 & 0 & 0 & 0 & 0 \\
0 & 0 & 0.601 & 0.850 & 0 & 0 \\
0 & 0 & 0.601 & -0.425 & 0 & 0.736 \\
0 & 0 & 0.601 & -0.531 & 0.638 & -0.184 \\
0 & 0 & 0.601 & -0.531 & -0.638 & -0.184
\end{bmatrix}
$$
60. The system of claim 55 wherein S comprises the following approximate quantities:
$$
\begin{bmatrix}
0.850 & 0 & 0 & 0 & 0 & 0 \\
0 & 0.850 & 0 & 0 & 0 & 0 \\
0 & 0 & 0.601 & 0.425 & 0 & -0.736 \\
0 & 0 & 0.601 & -0.850 & 0 & 0 \\
0 & 0 & 0.601 & -0.106 & 0.638 & 0.552 \\
0 & 0 & 0.601 & -0.106 & -0.638 & 0.552
\end{bmatrix}
$$
61. The system of claim 55 wherein S comprises the following approximate quantities:
$$
\begin{bmatrix}
0.850 & 0 & 0 & 0 & 0 & 0 \\
0 & 0.850 & 0 & 0 & 0 & 0 \\
0 & 0 & 0.601 & 0.850 & 0 & 0 \\
0 & 0 & 0.601 & 0 & 0 & 0.850 \\
0 & 0 & 0.601 & -0.368 & 0.736 & 0.213 \\
0 & 0 & 0.601 & -0.368 & -0.736 & 0.213
\end{bmatrix}
$$
62. The method of claim 54 wherein Pout is a 6×1 matrix and the decoder produces Pout by multiplying Sout by the inverse of S.
63. In a system for producing a 2D output sound field that is a function of an input sound field, where the system includes a microphone for receiving the input sound field and producing therefrom a microphone signal comprising B-format channels, an encoder for receiving the microphone signal and producing therefrom an encoded signal comprising an ITU-compatible six channel signal, and a first plurality of speakers arranged in a 2D configuration for receiving the encoded signal and producing therefrom the 2D output sound field, the improvement comprising:
a microphone array in place of said microphone wherein said microphone array receives the input sound field and produces therefrom a microphone array signal representative of the input sound field wherein the microphone array signal comprises B-format channels, an FL channel, and an FR channel;
said encoder further comprising circuitry for providing said encoded signal from said microphone array signal;
a decoder for producing a decoded signal from said encoded signal wherein said decoded signal comprises B-format channels, an FL channel, and an FR channel; and
a second plurality of speakers in addition to the first plurality of speakers, said first and second plurality of speakers arranged in a 3D configuration and receiving said decoded signal and producing therefrom a 3D output sound field.
64. The system of claim 63 wherein the hybrid microphone array is comprised of:
at least 6 microphones; and
a baffle including a substantially ellipsoidal structure.
US10/802,924 2003-03-18 2004-03-18 System and method for compatible 2D/3D (full sphere with height) surround sound reproduction Expired - Fee Related US7558393B2 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US10/802,924 US7558393B2 (en) 2003-03-18 2004-03-18 System and method for compatible 2D/3D (full sphere with height) surround sound reproduction

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US45549703P 2003-03-18 2003-03-18
US10/802,924 US7558393B2 (en) 2003-03-18 2004-03-18 System and method for compatible 2D/3D (full sphere with height) surround sound reproduction

Publications (2)

Publication Number Publication Date
US20040247134A1 true US20040247134A1 (en) 2004-12-09
US7558393B2 US7558393B2 (en) 2009-07-07

Family

ID=33493099

Family Applications (1)

Application Number Title Priority Date Filing Date
US10/802,924 Expired - Fee Related US7558393B2 (en) 2003-03-18 2004-03-18 System and method for compatible 2D/3D (full sphere with height) surround sound reproduction

Country Status (1)

Country Link
US (1) US7558393B2 (en)

Cited By (24)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20070253555A1 (en) * 2006-04-19 2007-11-01 Christopher David Vernon Processing audio input signals
US20100135509A1 (en) * 2008-12-01 2010-06-03 Charles Timberlake Zeleny Zeleny sonosphere
US20100302441A1 (en) * 2009-06-02 2010-12-02 Canon Kabushiki Kaisha Information processing apparatus, information processing method, and program
CN102231868A (en) * 2011-05-18 2011-11-02 上海大学 High-order-recording-way-based three-dimensional (3D) sound reproducing system
WO2013142657A1 (en) * 2012-03-23 2013-09-26 Dolby Laboratories Licensing Corporation System and method of speaker cluster design and rendering
WO2014014600A1 (en) * 2012-07-15 2014-01-23 Qualcomm Incorporated Systems, methods, apparatus, and computer-readable media for backward-compatible audio coding
US20140286493A1 (en) * 2011-11-11 2014-09-25 Thomson Licensing Method and apparatus for processing signals of a spherical microphone array on a rigid sphere used for generating an ambisonics representation of the sound field
US8855340B2 (en) * 2009-12-09 2014-10-07 Electronics And Telecommunications Research Institute Apparatus for reproducting wave field using loudspeaker array and the method thereof
US20140307894A1 (en) * 2011-11-11 2014-10-16 Thomson Licensing A Corporation Method and apparatus for processing signals of a spherical microphone array on a rigid sphere used for generating an ambisonics representation of the sound field
US20150146873A1 (en) * 2012-06-19 2015-05-28 Dolby Laboratories Licensing Corporation Rendering and Playback of Spatial Audio Using Channel-Based Audio Systems
US20150302520A1 (en) * 2011-08-15 2015-10-22 Velocity International, Inc Method of conducting audience guided events and auctions utilizing closed network satellite broadcasts to multiple location digital theater environments with integrated real time audience interaction
US20160044430A1 (en) * 2012-03-23 2016-02-11 Dolby Laboratories Licensing Corporation Method and system for head-related transfer function generation by linear mixing of head-related transfer functions
US20160064003A1 (en) * 2013-04-03 2016-03-03 Dolby Laboratories Licensing Corporation Methods and Systems for Generating and Rendering Object Based Audio with Conditional Rendering Metadata
US20160227340A1 (en) * 2015-02-03 2016-08-04 Qualcomm Incorporated Coding higher-order ambisonic audio data with motion stabilization
US20160269845A1 (en) * 2013-10-25 2016-09-15 Samsung Electronics Co., Ltd. Stereophonic sound reproduction method and apparatus
US9473870B2 (en) 2012-07-16 2016-10-18 Qualcomm Incorporated Loudspeaker position compensation with 3D-audio hierarchical coding
US20160381482A1 (en) * 2013-05-29 2016-12-29 Qualcomm Incorporated Extracting decomposed representations of a sound field based on a first configuration mode
US20180077511A1 (en) * 2012-08-31 2018-03-15 Dolby Laboratories Licensing Corporation System for Rendering and Playback of Object Based Audio in Various Listening Environments
US9922656B2 (en) 2014-01-30 2018-03-20 Qualcomm Incorporated Transitioning of ambient higher-order ambisonic coefficients
US10021508B2 (en) 2011-11-11 2018-07-10 Dolby Laboratories Licensing Corporation Method and apparatus for processing signals of a spherical microphone array on a rigid sphere used for generating an ambisonics representation of the sound field
US10327067B2 (en) * 2015-05-08 2019-06-18 Samsung Electronics Co., Ltd. Three-dimensional sound reproduction method and device
WO2019133942A1 (en) * 2017-12-29 2019-07-04 Polk Audio, Llc Voice-control soundbar loudspeaker system with dedicated dsp settings for voice assistant output signal and mode switching method
CN111343556A (en) * 2020-03-11 2020-06-26 费迪曼逊多媒体科技(上海)有限公司 Sound system combining traditional sound reinforcement, holographic sound reinforcement and electronic sound cover and using method thereof
US10770087B2 (en) 2014-05-16 2020-09-08 Qualcomm Incorporated Selecting codebooks for coding vectors decomposed from higher-order ambisonic audio signals

Families Citing this family (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
TWI559786B (en) 2008-09-03 2016-11-21 杜比實驗室特許公司 Enhancing the reproduction of multiple audio channels
US8442244B1 (en) 2009-08-22 2013-05-14 Marshall Long, Jr. Surround sound system
KR101953279B1 (en) 2010-03-26 2019-02-28 돌비 인터네셔널 에이비 Method and device for decoding an audio soundfield representation for audio playback
WO2012145176A1 (en) 2011-04-18 2012-10-26 Dolby Laboratories Licensing Corporation Method and system for upmixing audio to generate 3d audio
AU2012279349B2 (en) 2011-07-01 2016-02-18 Dolby Laboratories Licensing Corporation System and tools for enhanced 3D audio authoring and rendering
KR101285982B1 (en) 2011-10-25 2013-07-15 강릉원주대학교산학협력단 Sound system, sound transmitter, transmitting method, and computer-readable recording medium for creating sound information compatible with both 2d picture and 3d picture
JP6486833B2 (en) 2012-12-20 2019-03-20 ストラブワークス エルエルシー System and method for providing three-dimensional extended audio
WO2015000819A1 (en) 2013-07-05 2015-01-08 Dolby International Ab Enhanced soundfield coding using parametric component generation
US9916836B2 (en) 2015-03-23 2018-03-13 Microsoft Technology Licensing, Llc Replacing an encoded audio output signal

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5594800A (en) * 1991-02-15 1997-01-14 Trifield Productions Limited Sound reproduction system having a matrix converter
US20020172370A1 (en) * 2001-05-15 2002-11-21 Akitaka Ito Surround sound field reproduction system and surround sound field reproduction method
US7231054B1 (en) * 1999-09-24 2007-06-12 Creative Technology Ltd Method and apparatus for three-dimensional audio display

Cited By (56)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20070253555A1 (en) * 2006-04-19 2007-11-01 Christopher David Vernon Processing audio input signals
US8688249B2 (en) * 2006-04-19 2014-04-01 Sonita Logic Limited Processing audio input signals
US20100135509A1 (en) * 2008-12-01 2010-06-03 Charles Timberlake Zeleny Zeleny sonosphere
US20100302441A1 (en) * 2009-06-02 2010-12-02 Canon Kabushiki Kaisha Information processing apparatus, information processing method, and program
US8855340B2 (en) * 2009-12-09 2014-10-07 Electronics And Telecommunications Research Institute Apparatus for reproducting wave field using loudspeaker array and the method thereof
CN102231868A (en) * 2011-05-18 2011-11-02 上海大学 High-order-recording-way-based three-dimensional (3D) sound reproducing system
US20150302520A1 (en) * 2011-08-15 2015-10-22 Velocity International, Inc Method of conducting audience guided events and auctions utilizing closed network satellite broadcasts to multiple location digital theater environments with integrated real time audience interaction
US20140286493A1 (en) * 2011-11-11 2014-09-25 Thomson Licensing Method and apparatus for processing signals of a spherical microphone array on a rigid sphere used for generating an ambisonics representation of the sound field
US9420372B2 (en) * 2011-11-11 2016-08-16 Dolby Laboratories Licensing Corporation Method and apparatus for processing signals of a spherical microphone array on a rigid sphere used for generating an ambisonics representation of the sound field
US20140307894A1 (en) * 2011-11-11 2014-10-16 Thomson Licensing A Corporation Method and apparatus for processing signals of a spherical microphone array on a rigid sphere used for generating an ambisonics representation of the sound field
US10021508B2 (en) 2011-11-11 2018-07-10 Dolby Laboratories Licensing Corporation Method and apparatus for processing signals of a spherical microphone array on a rigid sphere used for generating an ambisonics representation of the sound field
US9503818B2 (en) * 2011-11-11 2016-11-22 Dolby Laboratories Licensing Corporation Method and apparatus for processing signals of a spherical microphone array on a rigid sphere used for generating an ambisonics representation of the sound field
US10051400B2 (en) 2012-03-23 2018-08-14 Dolby Laboratories Licensing Corporation System and method of speaker cluster design and rendering
US9622006B2 (en) * 2012-03-23 2017-04-11 Dolby Laboratories Licensing Corporation Method and system for head-related transfer function generation by linear mixing of head-related transfer functions
US20160044430A1 (en) * 2012-03-23 2016-02-11 Dolby Laboratories Licensing Corporation Method and system for head-related transfer function generation by linear mixing of head-related transfer functions
WO2013142657A1 (en) * 2012-03-23 2013-09-26 Dolby Laboratories Licensing Corporation System and method of speaker cluster design and rendering
US9622014B2 (en) * 2012-06-19 2017-04-11 Dolby Laboratories Licensing Corporation Rendering and playback of spatial audio using channel-based audio systems
US20150146873A1 (en) * 2012-06-19 2015-05-28 Dolby Laboratories Licensing Corporation Rendering and Playback of Spatial Audio Using Channel-Based Audio Systems
KR101993587B1 (en) 2012-07-15 2019-06-27 퀄컴 인코포레이티드 Systems, methods, apparatus, and computer-readable media for backward-compatible audio coding
US9288603B2 (en) 2012-07-15 2016-03-15 Qualcomm Incorporated Systems, methods, apparatus, and computer-readable media for backward-compatible audio coding
WO2014014600A1 (en) * 2012-07-15 2014-01-23 Qualcomm Incorporated Systems, methods, apparatus, and computer-readable media for backward-compatible audio coding
KR101751241B1 (en) 2012-07-15 2017-06-27 퀄컴 인코포레이티드 Systems, methods, apparatus, and computer-readable media for backward-compatible audio coding
KR20170075025A (en) * 2012-07-15 2017-06-30 퀄컴 인코포레이티드 Systems, methods, apparatus, and computer-readable media for backward-compatible audio coding
US9788133B2 (en) 2012-07-15 2017-10-10 Qualcomm Incorporated Systems, methods, apparatus, and computer-readable media for backward-compatible audio coding
CN104471960A (en) * 2012-07-15 2015-03-25 高通股份有限公司 Systems, methods, apparatus, and computer-readable media for backward-compatible audio coding
US9473870B2 (en) 2012-07-16 2016-10-18 Qualcomm Incorporated Loudspeaker position compensation with 3D-audio hierarchical coding
US11178503B2 (en) 2012-08-31 2021-11-16 Dolby Laboratories Licensing Corporation System for rendering and playback of object based audio in various listening environments
US20180077511A1 (en) * 2012-08-31 2018-03-15 Dolby Laboratories Licensing Corporation System for Rendering and Playback of Object Based Audio in Various Listening Environments
US10959033B2 (en) 2012-08-31 2021-03-23 Dolby Laboratories Licensing Corporation System for rendering and playback of object based audio in various listening environments
US10412523B2 (en) * 2012-08-31 2019-09-10 Dolby Laboratories Licensing Corporation System for rendering and playback of object based audio in various listening environments
US10388291B2 (en) 2013-04-03 2019-08-20 Dolby Laboratories Licensing Corporation Methods and systems for generating and rendering object based audio with conditional rendering metadata
US11568881B2 (en) 2013-04-03 2023-01-31 Dolby Laboratories Licensing Corporation Methods and systems for generating and rendering object based audio with conditional rendering metadata
US10748547B2 (en) 2013-04-03 2020-08-18 Dolby Laboratories Licensing Corporation Methods and systems for generating and rendering object based audio with conditional rendering metadata
US9881622B2 (en) * 2013-04-03 2018-01-30 Dolby Laboratories Licensing Corporation Methods and systems for generating and rendering object based audio with conditional rendering metadata
US11948586B2 (en) 2013-04-03 2024-04-02 Dolby Laboratories Licensing Corporation Methods and systems for generating and rendering object based audio with conditional rendering metadata
US20160064003A1 (en) * 2013-04-03 2016-03-03 Dolby Laboratories Licensing Corporation Methods and Systems for Generating and Rendering Object Based Audio with Conditional Rendering Metadata
US9749768B2 (en) * 2013-05-29 2017-08-29 Qualcomm Incorporated Extracting decomposed representations of a sound field based on a first configuration mode
US10499176B2 (en) 2013-05-29 2019-12-03 Qualcomm Incorporated Identifying codebooks to use when coding spatial components of a sound field
US9854377B2 (en) 2013-05-29 2017-12-26 Qualcomm Incorporated Interpolation for decomposed representations of a sound field
US9980074B2 (en) 2013-05-29 2018-05-22 Qualcomm Incorporated Quantization step sizes for compression of spatial components of a sound field
US11146903B2 (en) 2013-05-29 2021-10-12 Qualcomm Incorporated Compression of decomposed representations of a sound field
US11962990B2 (en) 2013-05-29 2024-04-16 Qualcomm Incorporated Reordering of foreground audio objects in the ambisonics domain
US20160381482A1 (en) * 2013-05-29 2016-12-29 Qualcomm Incorporated Extracting decomposed representations of a sound field based on a first configuration mode
US9883312B2 (en) 2013-05-29 2018-01-30 Qualcomm Incorporated Transformed higher order ambisonics audio data
US20160269845A1 (en) * 2013-10-25 2016-09-15 Samsung Electronics Co., Ltd. Stereophonic sound reproduction method and apparatus
US10645513B2 (en) 2013-10-25 2020-05-05 Samsung Electronics Co., Ltd. Stereophonic sound reproduction method and apparatus
US11051119B2 (en) 2013-10-25 2021-06-29 Samsung Electronics Co., Ltd. Stereophonic sound reproduction method and apparatus
US10091600B2 (en) * 2013-10-25 2018-10-02 Samsung Electronics Co., Ltd. Stereophonic sound reproduction method and apparatus
CN107734445A (en) * 2013-10-25 2018-02-23 三星电子株式会社 Stereophonic sound reproduction method and apparatus
US9922656B2 (en) 2014-01-30 2018-03-20 Qualcomm Incorporated Transitioning of ambient higher-order ambisonic coefficients
US10770087B2 (en) 2014-05-16 2020-09-08 Qualcomm Incorporated Selecting codebooks for coding vectors decomposed from higher-order ambisonic audio signals
US20160227340A1 (en) * 2015-02-03 2016-08-04 Qualcomm Incorporated Coding higher-order ambisonic audio data with motion stabilization
US9712936B2 (en) * 2015-02-03 2017-07-18 Qualcomm Incorporated Coding higher-order ambisonic audio data with motion stabilization
US10327067B2 (en) * 2015-05-08 2019-06-18 Samsung Electronics Co., Ltd. Three-dimensional sound reproduction method and device
WO2019133942A1 (en) * 2017-12-29 2019-07-04 Polk Audio, Llc Voice-control soundbar loudspeaker system with dedicated dsp settings for voice assistant output signal and mode switching method
CN111343556A (en) * 2020-03-11 2020-06-26 费迪曼逊多媒体科技(上海)有限公司 Sound system combining traditional sound reinforcement, holographic sound reinforcement and electronic sound cover and using method thereof

Also Published As

Publication number Publication date
US7558393B2 (en) 2009-07-07

Similar Documents

Publication Publication Date Title
US7558393B2 (en) System and method for compatible 2D/3D (full sphere with height) surround sound reproduction
Rumsey Spatial audio
CN104604257B (en) System for rendering and playback of object-based audio in various listening environments
US9154896B2 (en) Audio spatialization and environment simulation
JP5431249B2 (en) Method and apparatus for reproducing a natural or modified spatial impression in multi-channel listening, and a computer program executing the method
US20060165247A1 (en) Ambient and direct surround sound system
WO1996033591A1 (en) An acoustical audio system for producing three dimensional sound image
Wiggins An investigation into the real-time manipulation and control of three-dimensional sound fields
Lee Multichannel 3D microphone arrays: A review
Griesinger The psychoacoustics of listening area, depth, and envelopment in surround recordings, and their relationship to microphone technique
Hollerweger Periphonic sound spatialization in multi-user virtual environments
KR100955328B1 (en) Apparatus and method for surround soundfield reproductioin for reproducing reflection
Kim Height channels
Malham Toward reality equivalence in spatial sound diffusion
Glasgal Ambiophonics. Achieving physiological realism in music recording and reproduction
Klepko 5-channel microphone array with binaural-head for multichannel reproduction
Griesinger Surround: The current technological situation
Bartlett et al. An improved Stereo Microphone array using boundary technology: theoretical aspects
Pfanzagl-Cardone The Art and Science of 3D Audio Recording
Miller III Scalable Tri-play Recording for Stereo, ITU 5.1/6.1 2D, and Periphonic 3D (with Height) Compatible Surround Sound Reproduction
Miller III Transforming Ambiophonic+ Ambisonic 3D Surround Sound to & from ITU 5.1/6.1
Miller III Spatial Definition and the PanAmbiophone microphone array for 2D surround & 3D fully periphonic recording
Barbour Applying aural research: the aesthetics of 5.1 surround
Miller III Recording immersive 5.1/6.1/7.1 surround sound, compatible stereo, and future 3D (with height)
Hietala Perceived differences in recordings produced with four 5.0 surround microphone techniques

Legal Events

Date Code Title Description
REMI Maintenance fee reminder mailed
FPAY Fee payment (Year of fee payment: 4)
SULP Surcharge for late payment
REMI Maintenance fee reminder mailed
LAPS Lapse for failure to pay maintenance fees
STCH Information on status: patent discontinuation (Free format text: PATENT EXPIRED DUE TO NONPAYMENT OF MAINTENANCE FEES UNDER 37 CFR 1.362)
FP Lapsed due to failure to pay maintenance fee (Effective date: 20170707)