MXPA05001413A - Audio channel spatial translation. - Google Patents

Audio channel spatial translation.

Info

Publication number
MXPA05001413A
Authority
MX
Mexico
Prior art keywords
input
variable
input signals
correlation
matrix
Prior art date
Application number
MXPA05001413A
Other languages
Spanish (es)
Inventor
Mark Franklin Davis
Original Assignee
Dolby Lab Licensing Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Dolby Lab Licensing Corp
Priority claimed from PCT/US2003/024570 (WO2004019656A2)
Publication of MXPA05001413A

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04S: STEREOPHONIC SYSTEMS
    • H04S 3/00: Systems employing more than two channels, e.g. quadraphonic
    • H04S 3/02: Systems employing more than two channels, of the matrix type, i.e. in which input signals are combined algebraically, e.g. after having been phase shifted with respect to each other
    • H04S 5/00: Pseudo-stereo systems, e.g. in which additional channel signals are derived from monophonic signals by means of phase shifting, time delay or reverberation
    • H04S 5/02: Pseudo-stereo systems of the pseudo four-channel type, e.g. in which rear channel signals are derived from two-channel stereo signals

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Signal Processing (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Optimization (AREA)
  • Mathematical Physics (AREA)
  • Pure & Applied Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Mathematical Analysis (AREA)
  • Algebra (AREA)
  • Stereophonic System (AREA)
  • Machine Translation (AREA)
  • Stereo-Broadcasting Methods (AREA)
  • Input Circuits Of Receivers And Coupling Of Receivers And Audio Equipment (AREA)

Abstract

Using a variable matrix, M audio input signals, each associated with a direction, are translated to N audio output signals, each associated with a direction, wherein N is larger than M, M is two or more and N is a positive integer equal to three or more. The variable matrix is controlled in response to measures of (1) the relative levels of the input signals and (2) the cross-correlation of the input signals, so that a soundfield generated by the output signals has a compact sound image in the nominal ongoing primary direction of the input signals when the input signals are highly correlated, the image spreading from compact to broad as the correlation decreases and progressively splitting into multiple compact sound images, each in a direction associated with an input signal, as the correlation continues to decrease to highly uncorrelated.

Description

SPATIAL TRANSLATION OF AUDIO CHANNELS

FIELD OF THE INVENTION

The invention relates to the processing of audio signals. More particularly, it relates to the translation of M audio input channels representing an acoustic field into N audio output channels representing the same acoustic field, wherein each channel is a single audio stream representing audio arriving from one direction, M and N are positive integers, M is at least 2, N is at least 3, and N is greater than M. A spatial translator in which N is greater than M is usually characterized as a "decoder".
BACKGROUND OF THE INVENTION

Although humans have only two ears, we hear sound as a three-dimensional entity, relying on a number of localization cues such as head-related transfer functions (HRTFs) and head movement. Reproducing sound with full fidelity would therefore require retaining and reproducing the complete three-dimensional acoustic field, or at least its perceptual cues. Unfortunately, sound-recording technology is not directed toward capturing a three-dimensional acoustic field, nor even a two-dimensional plane or a one-dimensional line of sound. Current sound-recording technology is oriented strictly toward the capture, preservation and presentation of discrete, zero-dimensional audio channels. Most of the effort to improve fidelity since Edison's original invention of sound recording has gone into remedying the imperfections of his original analog, grooved cylinder/disc media: non-uniform and limited frequency response, noise, distortion, wow, flutter, speed inaccuracy, wear, dirt, and loss across copy generations. Although there were a number of piecemeal improvements, including electronic amplification, tape recording, noise reduction, and record players costing more than some cars, the traditional quality problems of individual channels were arguably not resolved until the development of digital recording in general, and the introduction of the audio compact disc in particular. Since then, apart from the effort to extend the quality of digital recording further, to 24-bit/96 kHz sampling, the main research efforts in audio reproduction have focused on reducing the amount of data needed to maintain the quality of individual channels, mainly using perceptual coders, and on increasing spatial fidelity. The latter problem is the subject of this document.

Efforts to improve spatial fidelity have proceeded along two fronts: attempting to carry the perceptual cues of a complete acoustic field, and attempting to carry an approximation of the original, actual acoustic field. Examples of systems employing the former approach include binaural recording and two-speaker virtual surround systems. These systems exhibit a number of unfortunate shortcomings, especially unreliable localization of sounds in certain directions and the requirement of headphones or a fixed position for a single listener. For the presentation of spatial sound to multiple listeners, whether in a room or in a commercial venue such as a cinema, the only viable alternative has been to approximate the actual original acoustic field. Given the discrete-channel nature of sound recording, it is not surprising that most efforts to date have involved what might be termed conservative increases in the number of presentation channels. Representative systems include the panned-mono three-speaker film soundtracks of the 1950s, conventional stereo sound, the quadraphonic systems of the 1960s, the discrete five-channel magnetic soundtracks of 70 mm films, Dolby matrix surround sound in the 1970s, AC-3 5.1-channel sound of the 1990s, and, recently, Surround EX 6.1-channel sound. "Dolby", "Pro Logic" and "Surround EX" are trademarks of Dolby Laboratories Licensing Corporation.
To one degree or another, these systems provide improved spatial reproduction compared with monophonic presentation. However, mixing a larger number of channels imposes greater time and cost penalties on content producers, and the resulting percept is typically one of a few discrete, diffuse channels rather than a continuous acoustic field. Aspects of Dolby Pro Logic decoding are described in U.S. Patent 4,799,260, which patent is incorporated herein by reference in its entirety. Details of AC-3 are presented in "Digital Audio Compression Standard (AC-3)", Advanced Television Systems Committee (ATSC), Document A/52, December 20, 1995 (available on the World Wide Web at www.atsc.org/Standards/A52/a_52.doc). See also the Errata Sheet of July 22, 1999 (available on the World Wide Web at www.dolby.com/tech/ATSC_er.df).

Once the acoustic field is characterized, it is possible in principle for a decoder to derive the optimum signal feed for any output loudspeaker. The channels supplied to such a decoder will be referred to herein, variously, as "cardinal", "transmitted" and "input" channels, and any output channel whose location does not correspond to the position of one of the input channels will be referred to as an "intermediate" channel. An output channel may also have a location coincident with the position of an input channel.
DESCRIPTION OF THE INVENTION

In accordance with a first aspect of the invention, a process for translating M audio input signals, each associated with a direction, into N audio output signals, each associated with a direction, wherein N is greater than M, M is two or more and N is a positive integer equal to three or more, comprises providing a variable matrix M:N, applying the M audio input signals to the variable matrix, deriving the N audio output signals from the variable matrix, and controlling the variable matrix in response to the input signals such that an acoustic field generated by the output signals has a compact sound image in the nominal ongoing primary direction of the input signals when the input signals are highly correlated, the image spreading from compact to broad as the correlation decreases and progressively splitting into multiple compact sound images, each in a direction associated with an input signal, as the correlation continues to decrease toward highly uncorrelated. In accordance with this first aspect of the invention, the variable matrix may be controlled in response to measures of (1) the relative levels of the input signals and (2) the cross-correlation of the input signals. In that case, for a cross-correlation measure of the input signals having values in a first range bounded by a maximum value and a reference value, the acoustic field may have a compact sound image when the cross-correlation measure is at the maximum value and a broadly dispersed image when the cross-correlation measure is at the reference value; and, for a cross-correlation measure of the input signals having values in a second range bounded by the reference value and a minimum value, the acoustic field may have the broadly dispersed image when the cross-correlation measure is at the reference value and a plurality of compact sound images, each in a direction associated with an input signal, when the cross-correlation measure is at the minimum value.

In accordance with a further aspect of the present invention, a process for translating M audio input signals, each associated with a direction, into N audio output signals, each associated with a direction, wherein N is greater than M and M is three or more, comprises providing a plurality of variable matrices m:n, where m is a subset of M and n is a subset of N; applying a respective subset of the M audio input signals to each of the variable matrices; deriving a respective subset of the N audio output signals from each of the variable matrices; controlling each of the variable matrices, in response to the subset of input signals applied to it, such that an acoustic field generated by the respective subset of output signals derived from it has a compact sound image in the nominal ongoing primary direction of the subset of input signals applied to it when those input signals are highly correlated, the image spreading from compact to broad as the correlation decreases and progressively splitting into multiple compact sound images, each in a direction associated with an input signal applied to it, as the correlation continues to decrease toward highly uncorrelated; and deriving the N audio output signals from the subsets of the N audio output channels.
In accordance with this further aspect of the present invention, the variable matrices may also be controlled in response to information that compensates for the effect of one or more other variable matrices receiving the same input signal. In addition, deriving the N audio output signals from the subsets of the N audio output channels may include compensating for multiple variable matrices that produce the same output signal. In accordance with these further aspects of the present invention, each of the variable matrices may be controlled in response to measures of (a) the relative levels of the input signals applied to it and (b) the cross-correlation of those input signals.

In accordance with yet a further aspect of the present invention, a process for translating M audio input signals, each associated with a direction, into N audio output signals, each associated with a direction, wherein N is greater than M and M is three or more, comprises providing a variable matrix M:N responsive to scale factors that control the matrix coefficients or that control the outputs of the matrix; applying the M audio input signals to the variable matrix; providing a plurality of variable-matrix scale-factor generators m:n, where m is a subset of M and n is a subset of N; applying a respective subset of the M audio input signals to each of the variable-matrix scale-factor generators; deriving a set of variable-matrix scale factors for the respective subsets of the N audio output signals from each of the variable-matrix scale-factor generators; controlling each of the variable-matrix scale-factor generators, in response to the subset of input signals applied to it, such that, when the scale factors it generates are applied to the variable matrix M:N, an acoustic field generated by the respective subset of output signals produced has a compact sound image in the nominal ongoing primary direction of the subset of input signals that produced the applied scale factors when those input signals are highly correlated, the image spreading from compact to broad as the correlation decreases and progressively splitting into multiple compact sound images, each in a direction associated with an input signal that produced the applied scale factors, as the correlation continues to decrease toward highly uncorrelated; and deriving the N audio output signals from the variable matrix. In accordance with this yet further aspect of the present invention, the variable-matrix scale-factor generators may also be controlled in response to information that compensates for the effect of one or more other variable-matrix scale-factor generators receiving the same input signal. In addition, deriving the N audio output signals from the variable matrix may include compensating for multiple variable-matrix scale-factor generators that produce scale factors for the same output signal. In accordance with these still further aspects of the present invention, each of the variable-matrix scale-factor generators may be controlled in response to measures of (a) the relative levels of the input signals applied to it and (b) the cross-correlation of those input signals.
In accordance with the present invention, M audio input channels representing an acoustic field are translated into N audio output channels representing the same acoustic field, wherein each channel is a single audio stream representing audio arriving from one direction, M and N are positive integers, M is at least 2, N is at least 3, and N is greater than M. Each input and output channel has an associated direction (for example azimuth, elevation and, optionally, distance, to allow a virtual or projected channel, nearer or more distant). One or more sets of output channels are generated, each set having one or more output channels. Each set is usually associated with two or more spatially adjacent input channels, and each output channel in a set is generated by determining a measure of the cross-correlation of those two or more input channels and a measure of the level interrelationships of those two or more input channels. The cross-correlation measure is preferably a zero-time-offset cross-correlation, which is the ratio of the common energy level to the geometric mean of the energy levels of the input signals. The common energy level is preferably a smoothed or averaged common energy level, and the input-signal energy levels are preferably smoothed or averaged as well. In one aspect of the present invention, multiple sets of output channels may be associated with more than two input channels, and a process may determine the correlation of the input channels with which each set of output channels is associated according to a hierarchical order, such that each set is ranked according to the number of input channels with which its output channel or channels are associated, the largest number of input channels receiving the highest rank, and the processing handles the sets in order according to that hierarchy (a sketch of this ordering appears below). Further, in accordance with one aspect of the present invention, the processing takes into account the results of the processing of higher-order sets.
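The hierarchical ordering just described can be illustrated with a short sketch. The code below is not taken from the patent; the module structure and function names are hypothetical, and it only shows the ranking of modules by the number of associated inputs and the passing of higher-order results to lower-order processing.

```python
from dataclasses import dataclass, field

@dataclass
class DecodingModule:
    name: str
    inputs: tuple                      # input channels associated with this module
    outputs: tuple                     # output channels it derives
    results: dict = field(default_factory=dict)

def process_in_hierarchical_order(modules, process_fn):
    """Rank modules by their number of inputs (largest first) and process them
    in that order, letting lower-order modules see higher-order results."""
    higher_order_results = {}
    for m in sorted(modules, key=lambda mod: len(mod.inputs), reverse=True):
        m.results = process_fn(m, higher_order_results)
        higher_order_results[m.name] = m.results
    return modules
```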
The reproduction or decoding aspects assume that each of the M audio input channels, representing audio arriving from one direction, was generated by nearest-neighbor, passive-matrix amplitude panning of each source direction (that is, a source direction is assumed to correlate primarily with the nearest input channel or channels), without requiring additional sidechain information (the use of sidechain or auxiliary information is optional), making the approach compatible with existing mixing techniques, consoles and formats. Although such source signals can be generated by explicitly employing a passive encoding matrix, most conventional recording techniques inherently generate them (thus constituting an "effective coding matrix"). The reproduction or decoding aspects of the present invention are also largely compatible with naturally recorded source signals, such as might be made with five real directional microphones, since, allowing for some possible time delay, sounds arriving from intermediate directions tend to correlate mainly at the nearest microphones (in a horizontal arrangement, specifically at the nearest pair of microphones).

A decoder or decoding process in accordance with aspects of the present invention can be implemented as a grid of processing modules or modular functions (hereinafter "modules" or "decoding modules"), each of which is used to generate one or more output channels (or, alternatively, control signals that can be used to generate one or more output channels), typically from the two or more nearest, spatially adjacent input channels associated with the decoding module. The output channels typically represent relative proportions of the audio signals in the nearest, spatially adjacent input channels associated with the particular decoding module. As explained in greater detail below, the decoding modules are loosely coupled with one another in the sense that the modules share inputs and there is a hierarchy of decoding modules. The modules are arranged in the hierarchy according to the number of input channels with which they are associated (the module or modules with the largest number of associated input channels ranking highest). A supervisor or supervisory function presides over the modules so that common input signals are shared equitably among the modules and so that higher-order decoder modules can affect the output of lower-order modules. Each decoder module may, in effect, include a matrix such that it directly generates output channels, or each decoder module may generate control signals that are used, together with the control signals generated by other decoder modules, to vary the coefficients of a variable matrix or the scale factors applied to the inputs or outputs of a fixed matrix in order to generate all the output signals. The decoder modules mimic the operation of the human ear in an attempt to provide perceptually transparent reproduction. The signal translation according to the present invention, of which the decoder modules and module functions are one aspect, can be applied either to wideband signals or to each frequency band of a multiband processor and, depending on the implementation, can be carried out once per sample or once per block of samples.
A multiband embodiment can employ either a filterbank, such as a discrete critical-band filterbank or a filterbank having a band structure compatible with an associated decoder, or a transform configuration, such as an FFT (Fast Fourier Transform) or MDCT (Modified Discrete Cosine Transform) linear filterbank (a brief sketch of per-band analysis appears below).

Another aspect of this invention is that the number of loudspeakers receiving the N output channels can be reduced to a judiciously practical number by relying on virtual imaging, which is the creation of sound images perceived at positions in space other than the positions at which loudspeakers are located. Although the most common use of virtual imaging is the stereo reproduction of an image at a position between two speakers by amplitude-panning a monophonic signal between the channels, virtual imaging as contemplated by an aspect of this invention can include the production of projected phantom images that give the auditory impression of lying beyond the walls of a room or within the walls of a room. Virtual imaging is not considered a viable technique for presentation to groups when only a few scattered channels are available, because it requires the listener to be equidistant, or nearly so, from the two speakers. In cinemas, for example, the front left and right speakers are too far apart to provide a useful phantom image of a central image for much of the audience, so, given the importance of the center channel as the source of much of the dialogue, a physical center speaker is used instead. As speaker density increases, a point is reached at which virtual imaging becomes viable between any pair of speakers for a large part of the audience, at least to the extent that panning is smooth; with enough speakers, the gaps between them are no longer perceived as such.
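As a minimal illustration of per-band operation, the sketch below splits a signal into overlapping FFT blocks and groups the bins into subbands, producing the band energies on which control measurements such as levels and cross-correlation could be computed. The block size, hop and band count are arbitrary illustrative choices, not values taken from the patent.

```python
import numpy as np

def band_energies(x, block=1024, hop=512, n_bands=20):
    """Return an array of shape (n_frames, n_bands) of subband energies,
    computed from overlapping windowed FFT frames of the 1-D signal x.
    Assumes len(x) >= block."""
    win = np.hanning(block)
    frames = [win * x[i:i + block] for i in range(0, len(x) - block + 1, hop)]
    spec = np.abs(np.fft.rfft(np.array(frames), axis=1)) ** 2   # (n_frames, n_bins)
    edges = np.linspace(0, spec.shape[1], n_bands + 1, dtype=int)
    return np.stack([spec[:, a:b].sum(axis=1)
                     for a, b in zip(edges[:-1], edges[1:])], axis=1)
```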
Signal Distribution

As mentioned above, a cross-correlation measure determines the ratio of dominant energy (common signal components) to non-dominant energy (non-common signal components) in a module and the degree of dispersion of the non-dominant signal components among the module's output channels. This can best be understood by considering how signals are distributed among the output channels of a module under different signal conditions, in the case of a two-input module. Unless noted otherwise, the principles presented extend directly to modules of higher order. The difficulty with signal distribution is that there is often very little information from which to recover the amplitude distribution of the original signals, much less the signals themselves. The basic information available is the signal level at each module input and the averaged cross product of the input signals, the common energy level. The zero-time-offset cross-correlation is the ratio of the common energy level to the geometric mean of the energy levels of the input signals; it functions as a measure of the net amplitude of the signal components common to all inputs (a sketch of this measure appears below). If there is a single panned signal anywhere between the inputs of the module (an "interior" or "intermediate" signal), all the inputs will carry the same waveform, though possibly with different amplitudes, and under these conditions the correlation will be 1.0. At the other extreme, if all the input signals are independent, meaning that there is no common signal component, the correlation will be zero. Intermediate correlation values between 0 and 1.0 can be regarded as corresponding to intermediate balances of a single common signal component and independent signal components at the inputs. Consequently, any input signal condition can be divided into a common, "dominant" signal and the remaining input signal components after the contributions of the common signal are subtracted, which together comprise an "everything else" signal component (the "non-dominant" or residue signal energy). As mentioned above, the amplitude of the common or "dominant" signal is not necessarily higher than the levels of the residual or non-dominant signals. For example, consider the case of a five-channel arc (L (Left), MidL (Mid-Left), C (Center), MidR (Mid-Right), R (Right)) mapped to a single Lt/Rt pair (left total and right total), from which one wishes to recover the five original channels. If all five channels carry independent signals of equal amplitude, then Lt and Rt will be of equal amplitude, with an intermediate value of common energy corresponding to an intermediate value of cross-correlation between zero and one (because Lt and Rt are not independent signals). The same levels could be produced by appropriately chosen levels of L, C and R with no MidL and MidR signals. Thus a module with two inputs and five outputs could simply feed the output channel corresponding to the dominant direction (C in this case) and the output channels corresponding to the residues of the input signals (L, R) after removing the C energy from the Lt and Rt inputs, giving no signal to the MidL and MidR output channels.
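A minimal sketch of the correlation measure just described, using a plain block average as the "smoothing or averaging"; in an actual implementation the averaging would presumably be a running, per-band smoother.

```python
import numpy as np

def zero_lag_correlation(a, b, eps=1e-12):
    """Zero-time-offset cross-correlation of two input blocks: the averaged
    cross product (common energy level) divided by the geometric mean of the
    averaged input energy levels."""
    common = np.mean(a * b)                  # averaged cross product
    ea, eb = np.mean(a * a), np.mean(b * b)  # averaged input energies
    return common / (np.sqrt(ea * eb) + eps)
```

For a single signal panned between the two inputs (a = g1·s, b = g2·s) this returns 1.0; for independent inputs it tends toward zero, matching the two extremes described above.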
That result is not desirable; switching a channel off unnecessarily is almost always a bad choice, because small disturbances in the signal conditions will cause the "off" channel to switch between on and off, producing an annoying "chirping" artifact (a channel turning on and off rapidly), especially when the "off" channel is heard in isolation. Consequently, when there are multiple possible distributions of the output signals for a given set of input signal values at the module, the conservative approach from the standpoint of individual channel quality is to disperse the non-dominant signal components as uniformly as possible among the module's output channels, consistent with the signal conditions. One aspect of the present invention is to disperse the available signal energy uniformly, subject to the signal conditions, according to a three-way division instead of a two-way division of "dominant" versus "everything else". Preferably the three-way division comprises the dominant (common) signal components, the fill (uniformly dispersed) signal components, and the residue of the input signal components. Unfortunately there is only enough information to perform a two-way division (the dominant signal components and the components of everything else). An appropriate approach to performing a three-way division is described herein, in which, for correlation values above a particular value, the two-way division comprises the dominant signal components and the dispersed non-dominant signal; for correlation values below that value, the two-way division comprises the dispersed non-dominant signal components and the residue. The common signal energy is divided between "dominant" and "uniformly dispersed". The "uniformly dispersed" component includes both "common" and "residue" signal components; "dispersion" therefore involves a mixture of common (correlated) and residue (uncorrelated) signal components. Before processing, for a given input/output channel configuration of a given module, a correlation value is calculated that corresponds to all the output channels receiving the same signal amplitude. This correlation value may be referred to as the "random_xcor" value. For a single intermediate output channel, centered between the input channels, the random_xcor value can be calculated to be 0.333. For three equally spaced intermediate channels between the two input channels, the random_xcor value can be calculated to be 0.483. Although these values have been found to give satisfactory results, they are not critical; values of approximately 0.3 and 0.5, respectively, are also useful. In other words, for a module with M inputs and N outputs, there is a particular degree of correlation of the M inputs that can be considered to represent equal energies in all N outputs. It can be obtained by treating the M inputs as if they were derived, using a passive N-to-M matrix, from N independent signals of equal energy, although of course the actual inputs may be derived by other means. This threshold correlation value is "random_xcor" and can be regarded as the dividing line between two operating regimes (a sketch of its calculation appears below).
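The random_xcor threshold can be reproduced numerically from the definition given above: feed N independent, equal-energy output-position signals through a passive N-to-2 encoding matrix built from the sine/cosine panning law described later, and measure the correlation of the two resulting inputs. This sketch is illustrative; only the 0.333 and 0.483 values come from the text.

```python
import numpy as np

def random_xcor(encode):
    """Correlation at a module's two inputs when N independent, unit-energy
    sources pass through the passive N->2 encoding matrix 'encode' (shape (2, N))."""
    common = np.dot(encode[0], encode[1])
    return common / np.sqrt(np.dot(encode[0], encode[0]) * np.dot(encode[1], encode[1]))

# One centered intermediate channel (2:3 module): approximately 0.333
theta3 = np.deg2rad([0, 45, 90])
print(random_xcor(np.array([np.cos(theta3), np.sin(theta3)])))

# Three equally spaced intermediate channels (2:5 module): approximately 0.483
theta5 = np.deg2rad([0, 22.5, 45, 67.5, 90])
print(random_xcor(np.array([np.cos(theta5), np.sin(theta5)])))
```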
Later, during processing, if a module's cross-correlation value is greater than or equal to the random_xcor value, it is scaled to a range of 1.0 to 0:

scaled_xcor = (correlation - random_xcor) / (1 - random_xcor)

The value scaled_xcor represents the amount of dominant signal above the uniformly dispersed level. Whatever remains can be distributed among the other output channels of the module. However, an additional factor must be taken into account: as the nominal ongoing primary direction of the input signals becomes progressively more displaced from the center, the amount of dispersed energy must be progressively reduced if an equal distribution over all output channels is maintained or, alternatively, the amount of dispersed energy can be maintained but the energy distributed to the output channels must be reduced in relation to their displacement from the "center of gravity" of the dominant energy, in other words a taper of the energy along the output channels; in the latter case, additional processing complexity may be required to keep the output power equal to the input power. If, on the other hand, the current correlation value is less than the random_xcor value, the dominant energy is taken to be zero, the uniformly dispersed energy is progressively reduced, and the residual signal, whatever remains, is allowed to accumulate at the inputs. At correlation = zero there is no interior signal; there are only independent input signals, which are mapped directly to the corresponding output channels. The operation of this aspect of the invention can be further explained as follows: (a) when the actual correlation is greater than random_xcor, there is enough common energy to treat it as a dominant signal that should be steered (panned) between two adjacent outputs (or, of course, fed to one output if its direction coincides with that output's direction); the energy assigned to it is subtracted from the inputs to give residues that are distributed (preferably uniformly) among all outputs. (b) When the actual correlation is precisely random_xcor, the input energy (which can be considered as a whole) is distributed evenly among all outputs (this is the definition of random_xcor). (c) When the actual correlation is less than random_xcor, there is not enough common energy for a dominant signal, so the energy of the inputs is distributed among the outputs in proportions depending on how much less it is. It is as if the correlated part were treated as the residue, being distributed evenly among all outputs, and the uncorrelated part as a set of dominant-like signals sent to the outputs corresponding to the directions of the inputs. At the limit of zero correlation, each input is fed to one output position only (usually one of the outputs, but it could be a position panned between two). In this way there is a continuum from full correlation, with a single signal panned between two outputs according to the relative energies of the inputs, through random_xcor with the input energy evenly distributed among all outputs, to zero correlation with the M inputs independently fed to their corresponding output positions.
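The two regimes around random_xcor can be summarized in a small sketch. The exact distribution and tapering rules in the patent are more elaborate; the fractions returned here only mirror the qualitative behaviour described in (a) to (c) above, and the function name is hypothetical.

```python
def three_way_split(correlation, random_xcor):
    """Return illustrative fractions (dominant, evenly_spread, residue) of the
    input energy for a module, given its correlation and random_xcor threshold."""
    if correlation >= random_xcor:
        scaled_xcor = (correlation - random_xcor) / (1.0 - random_xcor)
        return scaled_xcor, 1.0 - scaled_xcor, 0.0      # dominant plus uniform spread
    scaled = correlation / random_xcor
    return 0.0, scaled, 1.0 - scaled                    # spread shrinks, residue stays at the inputs
```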
Interaction Compensation

As mentioned above, channel translation in accordance with an aspect of the present invention can be regarded as involving a grid of "modules". Because multiple modules can share a particular input channel, interactions between modules are possible and can degrade performance unless some compensation is applied. Although it is generally not possible to separate the signals at an input according to the module to which they "belong", estimating the amount of an input signal used by each connected module can improve the resulting correlation and direction estimates, yielding improved overall performance. As mentioned above, there are two kinds of module interaction: those involving modules at a common or lower hierarchy level (i.e., modules with a similar number of inputs, or fewer), which are referred to as "neighbors", and those involving modules at a higher hierarchy level (having more inputs) than a given module but sharing one or more inputs with it, referred to as "higher-order neighbors". Consider first compensation among neighbors at a common hierarchy level. To understand the problems caused by neighbor interaction, consider an isolated two-input module with identical L/R (left and right) input signals, A. This corresponds to a single dominant (common) signal equidistant between the inputs. The common energy is A² and the correlation is 1.0. Assume a second two-input module with a common signal B at its L/R inputs, a common energy B², and likewise a correlation of 1.0. If the two modules are connected at a common input, the signal at that input will be A + B. Assuming that signals A and B are independent, the averaged product AB will be zero, so the common energy of the first module will be A(A + B) = A² + AB = A², and the common energy of the second module will be B(A + B) = B² + AB = B². Thus the common energy is not affected by neighboring modules, as long as they are processing independent signals, which is generally a valid assumption. If the signals are not independent, are the same, or at least substantially share common signal components, the system will react in a manner consistent with the response of the human ear: the common input will cause the resulting audio image to drift toward the common input. In that case the amplitude relationships of each module's L/R inputs are skewed, because the common input carries more signal amplitude (A + B) than either outer input, which biases the direction estimate toward the common input. The correlation value of both modules is then somewhat less than 1.0, because the waveforms at each pair of inputs differ. Because the correlation value determines the degree of dispersion of the non-common signal components and the ratio of dominant energy (common signal component) to non-dominant energy (non-common signal component), the uncompensated common input signal causes the distribution of each module's non-common signal to be dispersed. To obtain compensation, a measure of the "common input level" attributable to each input of each module is estimated, and each module is then informed of the total of that energy from the common input levels of all neighboring modules at the same hierarchy level at each of its inputs.
Two ways of calculating the measure of the common input level attributable to each input of a module are described here: one based on the common energy of the module's inputs (described generally in the next paragraph), and another, more accurate but computationally more demanding, based on the total energy of the module's interior outputs (described later in connection with the arrangement of Figure 6A). According to the first way of calculating the measure of the common input level attributable to each module input, analysis of the input signals to a module does not directly yield the common input level at each input, only a proportion of the total common energy, which is a geometric mean of the common input energy levels. Because the common input energy level at each input cannot exceed the total energy level at that input, which is measured and known, the total common energy is factored into estimated common input levels proportional to the observed input levels, subject to subsequent qualification. Once the set of common input levels has been calculated for all the modules in the grid (whether the measure of common input levels is based on the first or the second form of calculation), each module is informed of the total of the common input levels of all neighboring modules at each input, a quantity referred to as the module's "neighbor level" at each of its inputs. The module then subtracts the neighbor level from the input level at each of its inputs to derive compensated input levels, which are used to calculate the correlation and the direction (the nominal primary direction of the input signals); a sketch of this subtraction appears at the end of this passage. For the example cited above, the neighbor levels are initially zero; because the common input carries more signal than either outer input, the first module claims a common input energy level at the shared input greater than A², and the second module claims a common input level at the same input greater than B². Since the total claims exceed the energy level available at that input, the claims are limited to approximately A² and B², respectively. Because no other modules are connected to the common input, each common input level corresponds to the neighbor level of the other module. Consequently, the compensated input energy level observed by the first module is (A² + B²) - B² = A², and the compensated input energy level observed by the second module is (A² + B²) - A² = B². These are just the levels that would have been observed with the modules isolated. Consequently, the resulting correlation values will be 1.0, and the dominant directions will be centered, at the appropriate amplitudes, as desired. The recovered signals themselves will not, however, be completely isolated: the output of the first module will contain some B signal component, and vice versa; but this is a limitation of any matrix system, and if the processing is carried out on a multiband basis, the mixed signal components will be of similar frequency, making them somewhat difficult to distinguish anyway. In more complex situations the compensation will generally not be as exact, but experience with the system indicates that, in practice, the compensation mitigates most of the interaction effects of neighboring modules. Having established the principles and signals used in level compensation among neighbors, the extension to level compensation among higher-order neighbors is fairly direct.
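The same-order neighbor compensation can be sketched as follows, using the worked A/B example above. The data layout and function name are hypothetical; the essential step is subtracting, at each input, the sum of the other modules' estimated common input levels (the "neighbor level") from the observed input level.

```python
def compensated_input_levels(input_levels, common_levels):
    """input_levels: {input: observed energy}.  common_levels: {module: {input:
    estimated common input level}}.  Returns {module: {input: compensated level}}."""
    comp = {}
    for m, own in common_levels.items():
        comp[m] = {}
        for i in own:                                   # only this module's inputs
            neighbor = sum(other[i] for o, other in common_levels.items()
                           if o != m and i in other)
            comp[m][i] = max(input_levels[i] - neighbor, 0.0)
    return comp

# Two 2-input modules sharing a center input; A^2 = 1, B^2 = 4.
levels = {"L": 1.0, "center": 5.0, "R": 4.0}
commons = {"mod1": {"L": 1.0, "center": 1.0}, "mod2": {"center": 4.0, "R": 4.0}}
print(compensated_input_levels(levels, commons))
# mod1 sees (A^2 + B^2) - B^2 = A^2 at the shared input; mod2 sees B^2.
```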
This applies to situations in which two or more modules at different hierarchy levels share more than one common input channel. For example, there could be a three-input module that shares two inputs with a two-input module. A signal component common to all three inputs will also be common to both inputs of the two-input module and, without compensation, will be rendered at different positions by each module. More generally, there may be one signal component common to all three inputs and a second component common only to the inputs of the two-input module, requiring their effects to be separated as far as possible in order to render the output acoustic field properly. Consequently, the effects of the three-input common signals, as embodied in the common input levels described above, should be subtracted from the inputs before the two-input calculation is carried out. In effect, the higher-order common signal elements must be subtracted not only from the input levels of the lower-level module but also from its observed common energy level measure itself, before the lower-level calculation proceeds. This differs from the effect of common input levels of modules at the same hierarchy level, which do not affect the common energy level measure of a neighboring module. Thus, higher-order neighbor levels must be tracked and used separately from same-order neighbor levels. At the same time that higher-order neighbor levels are passed down to modules lower in the hierarchy, the remaining common levels of lower-level modules should also be passed up the hierarchy because, as mentioned above, lower-level modules act much like ordinary neighbors toward higher-level modules. Some of these quantities are interdependent and difficult to solve for simultaneously. To avoid computationally expensive simultaneous solutions, previously calculated values can be passed to the relevant modules. A potential interdependence of common input levels of modules at different hierarchy levels can be resolved either by using the previous value, as above, or by performing the calculations in an iterative sequence (i.e., a loop) from the highest to the lowest hierarchy level. Alternatively, a simultaneous-equation solution may also be possible, although it may involve non-trivial computational requirements. Although the interaction compensation techniques described provide only approximately correct values for complex signal distributions, they are believed to provide an improvement over a grid arrangement that fails to take the interactions between modules into account.
BRIEF DESCRIPTION OF THE DRAWINGS

Figure 1 is a top plan view schematically showing an idealized decoding arrangement in the manner of a test arrangement employing a horizontal array of sixteen channels around the walls of a room, a six-channel array arranged in a circle above the horizontal array, and a single channel overhead. Figure 2 is a functional block diagram providing an overview of a multiband transform embodiment, a plurality of modules working with a central supervisor, implementing the example of Figure 1. Figure 3 is a functional block diagram useful for understanding the manner in which a supervisor, such as supervisor 201 of Figure 2, may determine an endpoint scale factor. Figures 4A-4C show a functional block diagram of a module in accordance with an aspect of the present invention. Figure 5 is a schematic view showing a hypothetical arrangement of a three-input module fed by a triangle of input channels, three interior output channels, and a dominant direction; the view is useful for understanding the distribution of the dominant signal components. Figures 6A and 6B are functional block diagrams showing, respectively, an arrangement suitable for (1) generating the total estimated energy for each input of a module in response to the total energy at each input, and (2) generating, in response to a cross-correlation measure of the input signals, an excess endpoint energy scale factor component for each of the module's endpoints.
Figure 7 is a functional block diagram showing a preferred function of the "sum and/or greater of" block 367 of Figure 4C. Figure 8 is an idealized representation of the manner in which an aspect of the present invention generates scale factor components in response to a cross-correlation measure. Figures 9A and 9B through 16A and 16B are series of idealized representations illustrating the output scale factors of a module resulting from various example input signal conditions.
MODES FOR CARRYING OUT THE INVENTION

To test aspects of the present invention, an array was deployed having a horizontal arrangement of 5 speakers on each wall of a four-walled room (one speaker at each corner, with three speakers uniformly spaced between each pair of corners), 16 speakers in all, the corner speakers being shared, plus a circle of 6 speakers elevated at a vertical angle of approximately 45 degrees relative to a centrally located listener, plus a single speaker directly overhead, a total of 23 speakers, plus a speaker for the low-frequency/LFE (low-frequency effects) channel, 24 speakers in all, all driven from a personal computer arranged for 24-channel playback. Although in the language common to such systems it could be referred to as a 23.1-channel system, for simplicity it will be referred to herein as a 24-channel system. Figure 1 is a top plan view schematically showing an idealized decoding arrangement in the manner of the test arrangement just described. Five horizontal, wide-range input channels are shown as boxes 1', 3', 5', 9' and 13' on the outer circle. A vertical channel, which can be derived from the five wide-range inputs by generated decorrelation or reverberation, or supplied separately (as in Figure 2), is shown as the dashed square 23' in the center. The twenty-three wide-range output channels are shown as filled circles numbered 1-23. The outer circle of sixteen output channels lies in the horizontal plane; the inner circle of six output channels lies forty-five degrees above the horizontal plane. Output channel 23 is directly above the listener or listeners. Five two-input decoding modules are delineated by brackets 24-28 around the outer circle, connected between each pair of horizontal input channels. Five further vertical two-input decoding modules are delineated by brackets 29-33, connecting the vertical channel to each of the horizontal inputs. Output channel 21, the elevated rear center channel, is derived from a three-input decoding module 34, illustrated as arrows between output channel 21 and input channels 9, 13 and 23. The three-input module 34 is thus higher in the hierarchy than its lower-ranked two-input neighbor modules 27, 32 and 33. In this example each module is associated with a respective pair or trio of spatially adjacent input channels. Each module in this example has at least three neighbors at the same level; for example, modules 25, 28 and 29 are neighbors of module 24. (A data-structure sketch of this module grid appears below.) Although the decoding modules shown in Figure 1 have, variously, three, four or five output channels, a decoding module can have any reasonable number of output channels. An output channel may be located between two or more input channels or at the position of an input channel. Thus, in the example of Figure 1, each of the input channel locations is also an output channel location. Two or three decoding modules share each input channel. Although the arrangement of Figure 1 employs five modules (24-28), each with two inputs, and five inputs (1', 3', 5', 9' and 13') to derive sixteen horizontal outputs (1-16) representing locations around the four walls of a room, similar results can be obtained with a minimum of three inputs and three modules (each having two inputs and each sharing one input with another module).
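For concreteness, the module grid of Figure 1 can be written down as a simple data structure. The input pairings below are inferred from the figure description (they are consistent with the stated neighbor relationships, e.g. modules 27, 32 and 33 sharing inputs with module 34), but the output-channel assignments of the two-input modules are not reproduced here.

```python
# module number -> the input channels it is connected to (Figure 1 numbering)
FIGURE_1_MODULES = {
    24: (1, 3), 25: (3, 5), 26: (5, 9), 27: (9, 13), 28: (13, 1),      # horizontal pairs
    29: (23, 1), 30: (23, 3), 31: (23, 5), 32: (23, 9), 33: (23, 13),  # top channel to each horizontal input
    34: (9, 13, 23),                                                   # three-input module (derives output 21)
}

def same_level_neighbors(grid):
    """Modules sharing at least one input channel with a given module."""
    return {m: sorted(o for o in grid if o != m and set(grid[m]) & set(grid[o]))
            for m in grid}
```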
By using multiple modules in which each module has output channels along an arc or a line (as in the example of Figures 1 and 2), decoding ambiguities found in prior art decoders, in which correlations less than zero are decoded as indicating rearward directions, can be avoided. Although the input and output channels can be characterized by their physical position, or at least by their direction, characterizing them with a matrix is useful because it provides a well-defined signal relationship. Each matrix element (row i, column j) is a transfer function relating input channel i to output channel j. The elements of a matrix are usually simple multiplicative coefficients, but they may also include phase or delay terms (in principle any filter) and may be functions of frequency (in discrete-frequency terms, a different matrix at each frequency). This is straightforward in the case of dynamic scale factors applied to the outputs of a fixed matrix, but it also lends itself to operation with variable matrices, either by having a separate scale factor for each matrix element or by using matrix elements more elaborate than simple scalar scale factors, in which the matrix elements themselves are variable, for example a variable delay. There is some flexibility in mapping physical positions onto matrix elements; in principle, embodiments of aspects of the present invention can handle the mapping of an input channel to any number of output channels, and vice versa, but the most common situation is to assume signals that have been mapped only to the nearest output channels through simple scalar factors whose sum of squares is 1.0, so as to conserve power. This mapping is often done through a sine/cosine panning function.
For example, with two input channels and three interior output channels on a line between them, plus the two endpoint output channels coinciding with the input positions (i.e., a module M:N in which M is 2 and N is 5), the span can be taken to represent 90 degrees of arc (the interval over which the sine or cosine goes from 0 to 1, or vice versa), so that the channels are 90 degrees / 4 intervals = 22.5 degrees apart, giving channel matrix coefficients of (cos(angle), sin(angle)):

Lout coefficients = cos(0), sin(0) = (1, 0)
MidLout coefficients = cos(22.5), sin(22.5) = (.92, .38)
Cout coefficients = cos(45), sin(45) = (.71, .71)
MidRout coefficients = cos(67.5), sin(67.5) = (.38, .92)
Rout coefficients = cos(90), sin(90) = (0, 1)

Thus, for the case of a matrix with fixed coefficients and a variable gain controlled by a scale factor at each matrix output, the signal output at each of the five output channels is (where "SF" is the scale factor for the particular output identified by the subscript):

Lout = Lt (SFL)
MidLout = ((.92) Lt + (.38) Rt) (SFMidL)
Cout = ((.71) Lt + (.71) Rt) (SFC)
MidRout = ((.38) Lt + (.92) Rt) (SFMidR)
Rout = Rt (SFR)

Generally, given an array of input channels, one can conceptually join the nearest inputs with straight lines representing potential decoder modules (they are "potential" because, if no output channels are to be derived from a module, the module is not needed). For typical arrays, any output channel on a line between two input channels can be derived from a two-input module (if the source and transmission channels are in a common plane, then any source appears in at most two input channels, in which case there is no advantage in using more than two inputs). An output channel in the same position as an input channel is an endpoint channel, possibly of more than one module. An output channel that is not on a line between inputs, nor at the same position as an input (for example inside or outside a triangle formed by three input channels), requires a module having more than two inputs.
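The fixed 2:5 local matrix above, with per-output scale factors applied to its outputs, can be written directly as a short sketch (function and variable names are illustrative):

```python
import numpy as np

angles = np.deg2rad([0.0, 22.5, 45.0, 67.5, 90.0])
DECODE_2_TO_5 = np.stack([np.cos(angles), np.sin(angles)], axis=1)  # rows: L, MidL, C, MidR, R

def decode_2_to_5(lt_rt, scale_factors):
    """lt_rt: array of shape (2, n_samples) holding Lt and Rt.
    scale_factors: five per-output gains SF_L .. SF_R from the control path.
    Each row of the fixed matrix has unit sum of squares, as required above."""
    return np.asarray(scale_factors)[:, None] * (DECODE_2_TO_5 @ lt_rt)
```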
Decoder modules with more than two inputs are useful when a common signal occupies more than two input channels. This can happen, for example, when the source channels and the input channels are not coplanar: a source channel may then map to more than two input channels. This occurs in the example of Figure 1 when 24 channels (16 channels in the horizontal circle, six channels in an elevated circle, a vertical channel, plus LFE) are mapped to 6.1 channels (including a composite vertical channel). In that case the rear center channel of the elevated circle does not lie on a direct line between two of the source channels; it lies at the center of a triangle formed by the Ls (13), Rs (9) and top (23) channels, so that a three-input module is required to extract it. One way of mapping elevated channels onto a horizontal arrangement is to map each of them to more than two input channels. This allows the 24 channels of the example of Figure 1 to be mapped onto a conventional 5.1-channel arrangement. In that alternative, a plurality of three-input modules can extract the elevated channels, and the remaining signal components can be processed by two-input modules to extract the main horizontal circle of channels. In general it is not necessary to check every possible combination of signal commonality among the input channels. With planar channel arrays (e.g., channels representing horizontally arrayed directions), it is usually sufficient to perform a similarity comparison of pairs of spatially adjacent channels. For channels arranged on a dome or on the surface of a sphere, the signal commonality checks can be extended to three or more channels. The use and detection of signal commonality can also be used to convey additional signal information. For example, a vertical signal component can be represented by mapping it to all five full-range channels of a five-channel horizontal array. Decisions about which combinations of input channels should be analyzed for commonality, together with a default input/output mapping matrix, need be made only once per translator or channel translation function array, at the configuration of the translator or translation function. From this initial (pre-processing) mapping is derived a passive "master" matrix that relates the input/output channel configurations to the spatial orientation of the channels. The processor or processing portion of the invention can generate time-varying scale factors, one per output channel, which modify either the output signal levels of what would otherwise be a simple passive matrix or, alternatively, the matrix coefficients themselves. The scale factors are in turn derived from a combination of (a) dominant signal components, (b) uniformly dispersed (fill) components, and (c) residue (endpoint) components, as described below. A master matrix is useful in the configuration of a module array such as the one shown in the example of Figure 1 and is described further below in connection with Figure 2. By examining the master matrix it can be deduced, for example, how many decoder modules are needed, how they are connected, how many input and output channels each has, and the matrix coefficients relating each of the module inputs and outputs.
These coefficients can be taken from the master matrix; only non-zero values are needed, unless an input channel is also an output channel (i.e., an endpoint).
Each module preferably has a "local" matrix, which is the portion of the master matrix applicable to that particular module. In the case of an array of multiple modules, such as the example of Figures 1 and 2, the module can use the local matrix either to produce scale factors (or matrix coefficients) for controlling the master matrix, as described below with reference to Figures 2 and 4A-4C, or to produce a subset of output signals, in which case the output signals are assembled by a central process, such as a supervisor like the one described in connection with Figure 2. In the latter case the supervisor compensates for multiple versions of the same output signal produced by modules having a common output signal, in a manner analogous to the way supervisor 201 of Figure 2 determines a final scale factor to replace the preliminary scale factors produced by modules generating preliminary scale factors for the same output channel. In the case of multiple modules that produce scale factors rather than output signals, those modules could continually obtain the matrix information relevant to them from a master matrix via the supervisor instead of having a local matrix; however, fewer computational resources are required if the module has its own local matrix. In the case of a single, stand-alone module, the module has a local matrix that is the only matrix required (in effect, the local matrix is the master matrix), and that local matrix is used to produce output signals. Unless otherwise indicated, descriptions of embodiments of the invention having multiple modules refer to the alternative in which the modules produce scale factors. Any decoder module output channel with only one non-zero coefficient in the module's local matrix (that coefficient being 1.0, since the sum of squares of the coefficients is 1.0) is an endpoint channel. Output channels with more than one non-zero coefficient are interior output channels. Consider a simple example. If output channels O1 and O2 are both derived from input channels I1 and I2 (each with different coefficient values), then a two-input module connected between I1 and I2 is required, generating outputs O1 and O2, possibly among others. In a more complex case, if there are 5 inputs and 16 outputs, and one of the decoder modules has inputs I1 and I2 and outputs O1 and O2 such that O1 = A I1 + B I2 + 0 I3 + 0 I4 + 0 I5 (note that there is no contribution from input channels I3, I4 or I5) and O2 = C I1 + D I2 + 0 I3 + 0 I4 + 0 I5 (again no contribution from input channels I3, I4 or I5), then the decoder module can have two inputs (I1 and I2), two outputs, and the coefficients relating them are O1 = A I1 + B I2 and O2 = C I1 + D I2. Either the master matrix or, in the case of a single autonomous module, the local matrix can have matrix elements that do more than multiply. For example, as mentioned above, the matrix elements may include a filter function, such as a phase or delay term, and/or a filter that is a function of frequency. One example of filtering that can be applied is a pure-delay matrix that can produce projected phantom images. In practice, such a master or local matrix can be divided, for example, into two functions: one that uses coefficients to derive the output channels, and a second that applies a filtering function.
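The relationship between the master matrix, a module's local matrix and endpoint channels can be illustrated with a small sketch. The coefficient values and channel indexing are hypothetical; the selection rule simply keeps the master-matrix rows whose non-zero coefficients fall entirely on the module's own inputs, and flags single-coefficient rows as endpoints.

```python
import numpy as np

def local_matrix(master, module_inputs):
    """master: one row per output channel, one column per input channel.
    Returns (local matrix, indices of the module's outputs, endpoint outputs)."""
    cols = set(module_inputs)
    rows, endpoints = [], []
    for out_idx, row in enumerate(master):
        nz = np.flatnonzero(row)
        if len(nz) and set(int(i) for i in nz) <= cols:
            rows.append(out_idx)
            if len(nz) == 1:              # single coefficient (1.0): endpoint channel
                endpoints.append(out_idx)
    return master[np.ix_(rows, sorted(cols))], rows, endpoints

# O1 = A*I1 + B*I2, O2 = C*I1 + D*I2; the third output is an endpoint of I3
A, B, C, D = 0.92, 0.38, 0.38, 0.92       # illustrative values
master = np.array([[A, B, 0, 0, 0],
                   [C, D, 0, 0, 0],
                   [0, 0, 1.0, 0, 0]])
print(local_matrix(master, module_inputs=[0, 1]))
```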
Figure 2 is a functional block diagram that provides an overview of a multi-band transform embodiment implementing the example of Figure 1. A PCM audio input, for example carrying multiple channels of interleaved audio signals, is applied to a supervisor or supervisory function 201 (hereinafter "supervisor 201"), which includes a de-interleaver that recovers separate streams for each of the six audio signal channels (1', 3', 5', 9', 13' and 23') carried by the interleaved input and applies each to a time-domain-to-frequency-domain transform or transform function (hereinafter "forward transform"). Alternatively, the audio channels may be received as separate streams, in which case a de-interleaver is not required. As mentioned above, signal conversion according to the present invention may be applied either to broadband signals or to each frequency band of a multi-band processor, which may employ either a filter bank, such as a discrete critical-band filter bank or a filter bank having a band structure compatible with an associated decoder, or a transform configuration, such as an FFT (Fast Fourier Transform) or MDCT (Modified Discrete Cosine Transform) linear filter bank. Figures 2, 4A-4C and the other figures are described in the context of a multi-band transform configuration. Not shown in Figures 1 and 2 and in the other figures, for simplicity, are an optional LFE input channel (a seventh input channel in Figures 1 and 2) and an LFE output channel (a 24th output channel in Figures 1 and 2). The LFE channel may generally be treated in the same way as the other input and output channels, but with its own scale factor set to "1" and its own matrix coefficient, also set to "1". In cases where the source channels have no LFE channel but the output channels do (for example, a 2:5.1 upmix), an LFE channel can be derived by applying a low-pass filter (for example, a fifth-order Butterworth filter with a corner frequency of 120 Hz) to the sum of the channels, or, to avoid cancellation during the summing of the channels, a phase-corrected sum of the channels may be used. In cases where the input has an LFE channel but the output does not, the LFE channel can be added to one or more of the output channels. Continuing with the description of Figure 2, the modules 24-34 receive appropriate inputs selected from the six inputs 1', 3', 5', 9', 13' and 23' in the manner shown in Figure 1. Each module generates a preliminary scale factor ("PSF") output for each of the audio output channels associated with it, as shown in Figure 1. Thus, for example, module 24 receives inputs 1' and 3' and generates preliminary scale factor outputs PSF1, PSF2 and PSF3. Alternatively, as mentioned above, each module may generate a preliminary set of audio outputs for each of the audio output channels associated with it. Each module may also communicate with supervisor 201, as further explained below. Information sent from supervisor 201 to the various modules may include neighbor level information and higher-order neighbor level information, if any. The information sent to the supervisor by each module may include the total estimated energy of the interior outputs attributable to each of the module's inputs. The modules may be considered part of a control signal generating portion of the overall system of Figure 2.
A supervisor, such as supervisor 201 of Figure 2, may perform a number of different functions. A supervisor, for example, may determine whether more than one module is in use; if not, the supervisor need not perform any functions related to neighbor levels. During initialization, the supervisor may inform each module of the number of inputs and outputs it has, the related matrix coefficients, and the signal sampling rate. As already mentioned, it may read blocks of interleaved PCM samples and de-interleave them to produce separate channels. It may apply an unlimiting action in the time domain, for example in response to additional information indicating that the source signal was limited in amplitude and the degree of that limiting. If the system is operating in a multi-band mode, it may apply windowing and a filter bank (for example, FFT, MDCT, etc.) to each channel (so that multiple modules do not execute redundant transforms that would substantially increase the processing load) and pass streams of transform values to each module for processing. Each module passes back to the supervisor either a two-dimensional array of scale factors — one scale factor for all the transform bins of each subband of each output channel (when in a multi-band transform configuration, or otherwise one scale factor per output channel) — or, alternatively, a two-dimensional array of output signals — a set of complex transform bins for each subband of each output channel (when in a multi-band transform configuration, or otherwise one output signal per output channel). The supervisor may smooth the scale factors and apply them to the signal-path matrix operation (matrix 203, described below) to produce (in a multi-band transform configuration) the complex spectra of the output channels. Alternatively, when the modules produce output signals, the supervisor may derive the output channels (complex output channel spectra, in a multi-band transform configuration), compensating among local matrices that produce the same output signal. It may then execute an inverse transform, plus windowing and overlap-add in the case of the MDCT, for each output channel, interleave the output samples to form a composite multi-channel output stream (or optionally omit the interleaving in order to provide multiple output streams), and send them to an output file, sound card, or other final destination. Although various functions may be performed by a supervisor, as described herein, or by multiple supervisors, a person of ordinary skill in the art will appreciate that some or all of these functions may be executed in the modules themselves rather than by a supervisor common to all or some of the modules. For example, if there is only one stand-alone module, there is no need to distinguish between the functions of the module and the functions of the supervisor. Although, in the case of multiple modules, a common supervisor can reduce the total processing power required by eliminating or reducing redundant processing tasks, eliminating or simplifying a common supervisor can make it easier to combine modules with one another, for example to scale to more output channels. Returning to the description of Figure 2, the six inputs 1', 3', 5', 9', 13' and 23' are also applied to a variable matrix or variable matrix operation function 203 (hereinafter "matrix 203").
Matrix 203 may be considered part of the signal path of the system of Figure 2. Matrix 203 also receives as inputs from supervisor 201 a set of final scale factors, SF1 through SF23, one for each of the 23 output channels of the example of Figure 1. The final scale factors may be considered the output of the control signal portion of the system of Figure 2. As further explained below, supervisor 201 preferably passes to the matrix, as final scale factors, the preliminary scale factors for each "inner" output channel, but the supervisor determines the final scale factors for each endpoint output channel in response to the information it receives from the modules. An "inner" output channel is one that lies between two or more "endpoint" output channels of a module. Alternatively, if the modules produce output channels instead of scale factors, no matrix 203 is required; the supervisor itself produces the output signals. In the example of Figure 1 it is assumed that the endpoint output channels coincide with the locations of the input channels, although they need not, as discussed further elsewhere herein. Thus the output channels 2, 4, 6-8, 10-12, 14-16, 17, 18, 19, 20, 21 and 22 are inner output channels. Inner output channel 21 lies between, or is enclosed by, three input channels (input channels 9', 13' and 23'), while each of the other inner channels lies between (is enclosed by) two input channels. Because there are multiple preliminary scale factors for the endpoint output channels that are shared between modules (i.e., output channels 1, 3, 5, 9, 13 and 23), supervisor 201 determines the final endpoint scale factors (SF1, SF3, etc.) among the scale factors SF1 through SF23. The final inner-output scale factors (SF2, SF4, SF6, etc.) are the same as the preliminary scale factors. Figure 3 is a functional block diagram useful for understanding how a supervisor, such as supervisor 201 of Figure 2, may determine an endpoint scale factor. The supervisor does not add up all the outputs of the modules that share an input in order to obtain an endpoint scale factor. Instead, it combines additively, as in combiner 301, the total estimated interior energy for an input from each module that shares that input, such as input 9', which is shared by modules 26 and 27 of Figure 2. This sum represents the total energy level at the input demanded by the interior outputs of all connected modules. It then subtracts this sum from the smoothed input energy level at that input (for example, the output of smoother 425 or 427 of Figure 4B, described below) of any of the modules that share the input (module 26 or module 27 in this example), as in combiner 303. It is sufficient to select the smoothed input of any of the modules at the common input, even though the levels may differ slightly from module to module, because each module adjusts its time constants independently of the others. The difference, at the output of combiner 303, is the energy level of the desired output signal at that input, an energy level that is not allowed to fall below zero. Dividing that desired output signal level by the smoothed input level at that input, as in divider 305, and performing a square root operation, as in block 307, yields the final scale factor (SF9, in this example) for that output. Note that the supervisor derives a single final scale factor for each such shared input, regardless of how many modules share the input.
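A minimal sketch of the endpoint final scale factor derivation of Figure 3, assuming per-subband energies that have already been smoothed (variable names are illustrative only):

    import numpy as np

    def endpoint_scale_factor(interior_energies, smoothed_input_energy, eps=1e-12):
        """interior_energies: total estimated interior-output energy attributable
        to the shared input, one value per module sharing that input.
        smoothed_input_energy: smoothed energy at that input, taken from any one
        of the sharing modules."""
        demanded = float(np.sum(interior_energies))             # combiner 301
        residual = max(smoothed_input_energy - demanded, 0.0)   # combiner 303, not allowed below zero
        ratio = residual / max(smoothed_input_energy, eps)      # divider 305
        return float(np.sqrt(ratio))                            # block 307: energy ratio -> amplitude scale factor

    # Example: two modules (26 and 27) share input 9' and together demand 0.55
    # of its smoothed energy of 1.0, leaving SF9 = sqrt(0.45) ~= 0.67.
    print(endpoint_scale_factor([0.30, 0.25], 1.0))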
An arrangement for determining the total estimated energy of the interior outputs attributable to each of the module's inputs is described below with reference to Figure 6A. Because the levels are energy levels (a second-order quantity) rather than amplitudes (a first-order quantity), a square root operation is applied after the division operation in order to obtain the final scale factor (scale factors are associated with first-order quantities). The addition of the interior levels and the subtraction from the total input level are all performed in a purely energy sense, because the interior outputs of different interior modules are assumed to be independent (uncorrelated). If, in an unusual situation, this assumption does not hold, the calculation may leave more signal at the input than it should, which can cause a slight spatial distortion in the reproduced sound field (for example, a slight pulling of other nearby inner images toward that input), but in the same situation the human ear probably reacts similarly. The interior output channel scale factors, such as PSF6 through PSF8 of module 26, are passed by the supervisor as final scale factors, unmodified. For simplicity, Figure 3 shows the generation of only one of the final endpoint scale factors. The other final endpoint scale factors may be derived in a similar manner. Returning to the description of Figure 2, as mentioned above, in the variable matrix 203 the variability may be complex (all coefficients variable) or simple (coefficients that vary in groups, such as those applied to the inputs or outputs of a fixed matrix). Although any of these approaches may be used to produce substantially the same results, it has been found that one of the simplest approaches, namely a fixed matrix followed by a variable gain for each output (the gain of each output controlled by scale factors), produces satisfactory results and is used in the embodiments described here. Although a variable matrix in which every coefficient is variable may be used, it has the disadvantage of involving more variables and requiring more processing power. Supervisor 201 also performs an optional smoothing in the time domain of the final scale factors before they are applied to variable matrix 203. In a fully variable matrix system the output channels are never "turned off"; the coefficients are arranged to reinforce some signals and cancel others. A variable-gain, fixed-matrix system, as in the described embodiments of the present invention, does however turn channels on and off, and is more susceptible to undesirable "chirping" artifacts. This can occur despite the two-step smoothing described later (for example, smoothers 419/425, etc.). For example, when a scale factor is close to zero, because only a small change is needed to move from "small" to "none" and back, transitions to and from zero can cause an audible chirp. The optional smoothing performed by supervisor 201 preferably smooths the output scale factors with varying time constants that depend on the size of the absolute difference ("abs-diff") between the instantaneous, newly derived scale factor values and a running value of the smoothed scale factor. For example, if the abs-diff is greater than 0.4 (and, of course, less than 1.0), little or no smoothing is applied; a small additional amount of smoothing is applied for abs-diff values between 0.2 and 0.4; and below values of 0.2 the time constant is a continuous inverse function of the abs-diff.
Although these values are not critical, they have been found to reduce audible chirping artifacts. Optionally, in a multi-band version of a module, the time constants used for smoothing the scale factors may also be scaled with frequency and over time, in the manner of the frequency smoothers 413, 415 and 417 of Figure 4A, described below. As mentioned above, variable matrix 203 is preferably a fixed decoding matrix with variable scale factors (gains) at the outputs of the matrix. Each output channel of the matrix may have the (fixed) matrix coefficients that would have been the downmix encoding coefficients for that channel had there been an encoder with discrete inputs (instead of source channels mixed directly to the downmix array, which avoids the need for a discrete encoder). The sum of squares of the coefficients is preferably 1.0 for each output channel. The matrix coefficients are fixed once it is known where the output channels are located (as previously discussed with respect to the "master" matrix), while the scale factors, which control the output gain of each channel, are dynamic. The inputs, comprising frequency-domain transform bins, applied to modules 24-34 of Figure 2 may be grouped into frequency subbands within each module, after the initial energies and common energies are calculated at the bin level, as explained further below. Thus there is one preliminary scale factor (PSF in Figure 2) and one final scale factor (SF in Figure 2) for each frequency subband. The frequency-domain output channels 1-23 produced by matrix 203 each comprise a set of transform bins (the subband-sized groups of transform bins are treated with the same scale factor). The sets of frequency-domain transform bins are converted into a set of PCM output channels 1-23, respectively, by a frequency-domain-to-time-domain transform or transform function 205 (hereinafter "inverse transform"), which may be a function of supervisor 201 but is shown separately for clarity. Supervisor 201 may interleave the resulting PCM channels 1-23 to provide a single interleaved PCM output stream, or leave the PCM output channels as separate streams. Figures 4A-4C show a functional block diagram of a module in accordance with an aspect of the present invention. The module receives two or more streams of input signals from a supervisor, such as supervisor 201 of Figure 2. Each input comprises a set of complex-valued frequency-domain transform bins. Each input, 1 through m, is applied to a function or device (such as function or device 401 for input 1 and function or device 403 for input m) that calculates the energy of each bin, which is the sum of the squares of the real and imaginary values of each transform bin (only the paths for two inputs, 1 and m, are shown to simplify the drawing). Each of the inputs is also applied to a function or device 405 that calculates the common energy of each bin across the input channels of the module. In an FFT embodiment this can be calculated by taking the cross product of the input samples (in the case of two inputs, L and R, for example, the real part of the complex product of the complex L bin value and the complex conjugate of the complex R bin value). Embodiments using only real values need only cross-multiply the real values of each input.
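For a two-input, FFT-based embodiment, the per-bin energy and common-energy calculations just described might look as follows (a simplified illustration, not the patent's code):

    import numpy as np

    def bin_energies(X):
        """Blocks 401/403: energy of each complex transform bin, i.e. the sum
        of the squares of its real and imaginary parts."""
        return X.real**2 + X.imag**2

    def pairwise_common_energy(L, R):
        """Block 405 for two inputs: the real part of the complex product of
        the L bin value and the complex conjugate of the R bin value."""
        return (L * np.conj(R)).real

    # Example: two spectra sharing a common component X plus independent parts.
    rng = np.random.default_rng(0)
    X, Y, Z = (rng.standard_normal(8) + 1j * rng.standard_normal(8) for _ in range(3))
    L, R = 0.707 * X + Y, 0.707 * X + Z
    print(bin_energies(L)[:3])
    print(pairwise_common_energy(L, R)[:3])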
For more than two inputs, the special cross-multiplication technique described below can be used, in which, if all the signs are equal, the product is given a positive sign; otherwise it is given a negative sign and is scaled by the ratio of the number of possible positive outcomes (always two: either all positive or all negative) to the number of possible negative outcomes.
Pairwise Calculation of the Common Energy

For example, suppose that a pair of input channels A/B contains a common signal X together with uncorrelated individual signals Y and Z:

    A = 0.707X + Y
    B = 0.707X + Z

where the scale factors of 0.707 = (0.5)^(1/2) provide power preservation with respect to the nearest input channels.
The mean (time-averaged) energy of A is

    mean(A^2) = mean((0.707X + Y)^2) = 0.5*mean(X^2) + 2*(0.707)*mean(XY) + mean(Y^2)

Because X and Y are uncorrelated, mean(XY) = 0, so that

    mean(A^2) = 0.5*mean(X^2) + mean(Y^2)

that is, because X and Y are uncorrelated, the total energy in input channel A is the sum of the energies of the X and Y signals. Similarly,

    mean(B^2) = 0.5*mean(X^2) + mean(Z^2)

Since X, Y and Z are uncorrelated, the average cross product of A and B is

    mean(AB) = 0.5*mean(X^2)

so that, in the case of an output signal shared equally by two neighboring input channels that may also contain uncorrelated, independent signals, the average cross product of the signals is equal to the energy of the common signal component in each signal. If the common signal is not shared equally, that is, if it is panned toward one of the inputs, the average cross product will be the geometric mean of the energies of the common components in A and B, from which estimates of the common energy in the individual channels can be derived by normalizing by the square root of the ratio of the channel amplitudes. In practice, the time averages are calculated by the subsequent smoothing steps, as described below.
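A quick numerical check of these identities, using long random zero-mean signals (purely illustrative), shows the average cross product of A and B approaching 0.5 times the average energy of X:

    import numpy as np

    rng = np.random.default_rng(1)
    n = 200_000
    X, Y, Z = rng.standard_normal((3, n))        # mutually uncorrelated, zero-mean signals
    A = 0.707 * X + Y
    B = 0.707 * X + Z
    print(np.mean(A * B), 0.5 * np.mean(X**2))   # both values are close to 0.5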
Higher-Order Calculation of the Common Energy

In order to derive the common energy for decoding modules with three or more inputs, it is necessary to form averaged cross products of all the input signals. Simply performing the pairwise processing of the inputs fails to differentiate between separate signals shared between each pair of inputs and a signal common to all of them.
Consider, for example, three input channels A, B and C, composed of uncorrelated signals W, Y, Z and a common signal X:

    A = X + W
    B = X + Y
    C = X + Z

In the average cross product, all the terms involving combinations of W, Y and Z cancel, as in the second-order calculation, leaving the average of X^3: mean(ABC) = mean(X^3). Unfortunately, if X is a zero-mean signal, as expected, then the average of its cube is also zero. Unlike the average of X^2, which is positive for any non-zero value of X, X^3 has the same sign as X, so positive and negative contributions tend to cancel. The same obviously applies to any odd power of X, corresponding to an odd number of module inputs, but even exponents greater than two can also lead to erroneous results; for example, four inputs with components (X, X, -X, -X) will have the same average product as (X, X, X, X). This problem can be solved by using a variant of the averaged-product technique. Before being averaged, the sign of each product is discarded by taking the absolute value of the product, and the signs of the individual terms of the product are examined. If they are all the same, the absolute value of the product is applied to the averager. If some of the signs differ from the others, the negative of the absolute value of the product is averaged. Since the number of possible same-sign combinations is not equal to the number of possible different-sign combinations, a weighting factor consisting of the ratio of the number of same-sign combinations to the number of different-sign combinations is applied to the negated absolute-value products to compensate. For example, a three-input module has two ways for the signs to be equal out of eight possibilities, leaving six possible ways for the signs to differ, resulting in a scale factor of 2/6 = 1/3. This compensation causes the integrated or summed product to grow in a positive direction if and only if there is a signal component common to all the inputs of a decoding module. However, in order for the averages of modules of different orders to be comparable, they must all have the same dimensions. A conventional second-order correlation involves averages of products of two inputs, and therefore of quantities with the dimensions of energy or power. Thus the terms to be averaged in higher-order correlations must also be modified so that they have the dimensions of power. For a k-th-order correlation, the absolute values of the individual products must therefore be raised to the 2/k power before being averaged (a sketch of this calculation follows this paragraph). Of course, regardless of the order, the individual input energies of a module, if needed, can be calculated as the average of the square of the corresponding input signal, and need not first be raised to the k-th power and then reduced to a second-order quantity. Returning to the description of Figure 4A, the transform bin outputs of each of the blocks may be grouped into subbands by a respective function or device 407, 409 and 411. The subbands may approximate the critical bands of the human ear, for example. The remainder of the module embodiment of Figures 4A-4C operates separately and independently on each subband. To simplify the drawing, the processing of only one subband is shown. Each subband output of blocks 407, 409 and 411 is applied to a frequency smoother or frequency smoothing function 413, 415 and 417 (hereinafter "frequency smoothers"), respectively.
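Returning briefly to the higher-order common-energy calculation above, the sign rule and 2/k-power normalization might be sketched as follows (a sketch under stated assumptions; in particular, the order in which the weighting factor and the 2/k power are applied is one reasonable reading of the text):

    import numpy as np

    def higher_order_common_energy(inputs):
        """inputs: array of shape (k, n) holding k module input signals.
        Products whose terms all share a sign contribute +|product|^(2/k);
        mixed-sign products contribute -weight*|product|^(2/k), where weight
        is the ratio of same-sign to different-sign combinations."""
        k, _ = inputs.shape
        prod = np.prod(inputs, axis=0)                                  # per-sample product of all k inputs
        same_sign = np.all(inputs > 0, axis=0) | np.all(inputs < 0, axis=0)
        weight = 2.0 / (2**k - 2)                                       # e.g. 2/6 = 1/3 for k = 3
        term = np.abs(prod) ** (2.0 / k)                                # restore energy (second-order) dimensions
        return float(np.where(same_sign, term, -weight * term).mean())

    # Inputs sharing a common component X integrate positive; independent inputs average near zero.
    rng = np.random.default_rng(2)
    X, W, Y, Z = rng.standard_normal((4, 100_000))
    print(higher_order_common_energy(np.stack([X + W, X + Y, X + Z])))   # clearly positive
    print(higher_order_common_energy(np.stack([W, Y, Z])))               # near zero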
The purpose of the frequency smoothing is explained later. Each frequency-smoothed subband from a frequency smoother is applied to optional "fast" smoothers or smoothing functions 419, 421 and 423 (hereinafter "fast smoothers"), respectively, which provide smoothing in the time domain. Although preferred, the fast smoothers may be omitted when their time constant is close to the block duration of the forward transform that generated the input bins (for example, a forward transform in supervisor 201 of Figure 2). The fast smoothers are "fast" in relation to the "slow" variable-time-constant smoothers or smoothing functions 425, 427 and 429 (hereinafter "slow smoothers"), which receive the respective outputs of the fast smoothers. Example time constant values for the fast and slow smoothers are given below. Thus, whether fast smoothing is provided by the inherent operation of a forward transform or by a fast smoother, a two-stage smoothing action is preferred, in which the second, slower stage is variable. However, a single smoothing stage may provide acceptable results. The time constants of the slow smoothers are preferably kept in synchronism with one another within a module. This can be achieved, for example, by applying the same control information to each slow smoother and configuring each slow smoother to respond to the applied control information in the same way. The derivation of the information that controls the slow smoothers is described later. Preferably each pair of smoothers is arranged in series, in the manner of pairs 419/425, 421/427 and 423/429 as shown in Figures 4A and 4B, in which a fast smoother feeds a slow smoother. A series arrangement has the advantage that the second stage is resistant to fast, short signal peaks at the input of the pair. However, similar results may be obtained by configuring the smoother pairs in parallel. For example, in a parallel arrangement, the resistance that the second stage of a series arrangement has to fast, short signal peaks can instead be handled in the logic of the time constant controller. Each stage of the two-stage smoothers may be implemented by a single-pole low-pass filter (a "leaky integrator"), such as an RC low-pass filter (in an analog embodiment) or, equivalently, a first-order low-pass filter (in a digital embodiment). For example, in a digital embodiment, the first-order filters may each be implemented as a "biquadratic" filter, a general second-order IIR filter in which some of the coefficients are set to zero so that the filter operates as a first-order filter. Alternatively, the two smoothers may be combined into a single second-order biquadratic stage, although it is simpler to calculate coefficient values for the (variable) second stage if it is kept separate from the (fixed) first stage. It should be noted that in the embodiment of Figures 4A, 4B and 4C, all signal levels are expressed as energy (squared) levels, unless an amplitude is required, in which case a square root is taken. The smoothing is applied to the energy levels of the applied signals, making the smoothers RMS-sensing rather than average-sensing (average-sensing smoothers are fed with linear amplitudes). Because the signals applied to the smoothers are squared levels, the smoothers react to sudden increases in signal level more quickly than average-sensing smoothers would, since the increases are magnified by the squaring function.
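A minimal sketch of one such two-stage smoother, with a fixed fast stage feeding a variable slow stage, each a first-order ("leaky integrator") low-pass filter (the coefficient formula is illustrative; only the example time constants quoted later in the text are taken from it):

    import numpy as np

    def one_pole_coeff(time_constant_s, rate_hz):
        """Feedback coefficient of a first-order low-pass smoother."""
        return float(np.exp(-1.0 / (time_constant_s * rate_hz)))

    class TwoStageSmoother:
        """Fast fixed stage feeding a slow variable stage (e.g. pair 419/425).
        Inputs are energy (squared) levels, so the smoother is RMS-sensing."""
        def __init__(self, rate_hz, fast_tc=0.001):                 # ~1 ms fast stage
            self.rate = rate_hz
            self.a_fast = one_pole_coeff(fast_tc, rate_hz)
            self.y_fast = 0.0
            self.y_slow = 0.0

        def step(self, energy, slow_tc):
            """slow_tc: the currently selected slow time constant (for example
            0.010, 0.030 or 0.150 s), kept in step across the module's smoothers."""
            a = self.a_fast
            self.y_fast = a * self.y_fast + (1.0 - a) * energy
            b = one_pole_coeff(slow_tc, self.rate)
            self.y_slow = b * self.y_slow + (1.0 - b) * self.y_fast
            return self.y_fast, self.y_slow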
The two-stage smoothers thus provide, for each subband, a time average of each input channel energy (that of the 1st channel provided by slow smoother 425 and that of the m-th channel by slow smoother 427) and a time average, for each subband, of the common energy of the input channels (provided by slow smoother 429). The averaged energy outputs of the slow smoothers (425, 427, 429) are applied to combiners 431, 433 and 435, respectively, in which (1) the neighbor energy levels, if any (from supervisor 201 of Figure 2, for example), are subtracted from the smoothed energy level of each of the input channels, and (2) the higher-order neighbor energy levels, if any (from supervisor 201 of Figure 2, for example), are subtracted from the averaged common energy output of the slow smoothers. For example, each module receiving input 3' (Figures 1 and 2) has two neighboring modules and receives neighbor energy level information that compensates for the effect of those two neighboring modules; however, none of those modules is a "higher-order" module (that is, all the modules that share input channel 3' are two-input modules). In contrast, module 28 (Figures 1 and 2) is an example of a module having a higher-order module that shares one of its inputs. Thus, for example, in module 28 the averaged energy output of the slow smoother for input 13' receives the higher-order neighbor level compensation. The resulting "neighbor-compensated" energy levels for each subband of each of the module's inputs are applied to a function or device 437 that calculates a nominal ongoing primary direction from those energy levels. The direction indication may be calculated as the energy-weighted vector sum of the inputs. For a two-input module this simplifies to the L/R ratio of the smoothed, neighbor-compensated input signal energy levels. Assume, for example, a flat perimeter arrangement in which the channel positions are given as 2-tuples representing x and y coordinates for the case of two inputs. The listener is assumed to be at the center, say at (0, 0). The left front channel, in normalized spatial coordinates, is at (1, 1). The right front channel is at (-1, 1). If the left input amplitude (Lt) is 4 and the right input amplitude (Rt) is 3, then, using those amplitudes as weighting factors, the nominal ongoing primary direction is (4*(1, 1) + 3*(-1, 1)) / (4 + 3) = (0.143, 1), or slightly to the left of center on a horizontal line connecting Left and Right. Alternatively, once a master matrix is defined, the spatial direction may be expressed in matrix coordinates rather than in physical coordinates. In this case the input amplitudes, normalized so that their sum of squares is one, are the effective matrix coordinates of the direction. In the previous example the left and right levels are 4 and 3, which normalize to 0.8 and 0.6; consequently the "direction" is (0.8, 0.6). In other words, the nominal ongoing primary direction is a version of the square roots of the neighbor-compensated smoothed input energy levels, normalized so that the sum of squares is one. Block 437 produces as many outputs indicating a spatial direction as there are inputs to the module (two in this example).
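The direction calculation just described can be reproduced in a few lines; the example below uses the numbers from the text (a sketch only):

    import numpy as np

    def direction_physical(amplitudes, positions):
        """Weighted average of channel positions, using input amplitudes as weights."""
        a = np.asarray(amplitudes, dtype=float)
        p = np.asarray(positions, dtype=float)
        return (a[:, None] * p).sum(axis=0) / a.sum()

    def direction_matrix(amplitudes):
        """Matrix-coordinate form: amplitudes normalized to a unit sum of squares."""
        a = np.asarray(amplitudes, dtype=float)
        return a / np.sqrt((a**2).sum())

    print(direction_physical([4, 3], [(1, 1), (-1, 1)]))   # ~[0.143, 1.0]
    print(direction_matrix([4, 3]))                        # [0.8, 0.6]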
The neighbor-compensated smoothed energy levels for each subband of each of the module's inputs, applied to the direction-determining function or device 437, are also applied to a function or device 439 that calculates the neighbor-compensated cross-correlation ("neighbor-compensated_xcor"). Block 439 also receives as an input the averaged common energy of the module inputs, for each subband, from the slow variable smoother 429, as compensated in combiner 435 by the energy levels of higher-order neighbors, if any. The neighbor-compensated cross-correlation is calculated in block 439 as the higher-order-compensated smoothed common energy divided by the M-th root of the product of the neighbor-compensated smoothed energy levels of each of the module's input channels, where M is the number of inputs, to derive a true mathematical correlation value in the range of 1.0 to -1.0. Preferably, values from 0 down to -1.0 are taken as zero. The neighbor-compensated_xcor provides an estimate of the cross-correlation that would exist in the absence of other modules. The neighbor-compensated_xcor from block 439 is then applied to a weighting function or device 441 that weights the neighbor-compensated_xcor with the neighbor-compensated direction information to produce a neighbor-compensated, direction-weighted cross-correlation ("direction-weighted_xcor"). The weighting increases as the nominal ongoing primary direction deviates from a centered condition. In other words, unequal input amplitudes (and therefore energies) cause a proportional increase in the direction-weighted_xcor. The direction-weighted_xcor provides an estimate of the compactness of the image. Thus, in the case of a two-input module having, for example, left L and right R inputs, the weighting increases as the direction moves away from the center toward either the left or the right (i.e., the weighting is the same in either direction for the same degree of departure from the center). For example, in the case of a two-input module, the value of the neighbor-compensated_xcor is weighted by a ratio L/R or R/L, such that a non-uniform signal distribution drives the direction-weighted_xcor toward 1.0. For that two-input module, when R >= L, direction-weighted_xcor = 1 - ((1 - neighbor-compensated_xcor) * (L/R)), and when R < L, direction-weighted_xcor = 1 - ((1 - neighbor-compensated_xcor) * (R/L)). For modules with more than two inputs, calculating the direction-weighted_xcor from the neighbor-compensated_xcor requires, for example, replacing the L/R or R/L ratio above with a measure of "uniformity" that varies between 1.0 and 0. For example, to calculate the uniformity measure for any number of inputs, the input signal levels are normalized by the total input power, resulting in normalized input levels that sum to 1.0 in an energy (squared) sense. Each normalized input level is then divided by the similarly normalized input level of a signal centered in the array. The smallest such ratio becomes the uniformity measure. Therefore, for example, for a three-input module with one input having a level of zero, the uniformity measure is zero and the direction-weighted_xcor is equal to one.
(In this case the signal is at the edge of the three-input module, on a line between two of its inputs, and a lower-order two-input module decides where the nominal ongoing primary direction lies on that line and how widely the output signal should be spread along it.) Returning to the description of Figure 4B, the direction-weighted_xcor is weighted further by applying it to a function or device 443 that applies a "random_xcor" weighting to produce an "effective_xcor". The effective_xcor provides an estimate of the distribution form of the input signals. The random_xcor is the average cross product of the input magnitudes divided by the square root of the average input energies. The value of random_xcor can be calculated by assuming that the output channels were originally input channels to the module (as to an encoder), and calculating the xcor value that results when all those channels carry independent signals of the same level and are passively mixed down. According to this approach, for the case of a two-input module with three outputs, random_xcor is calculated as 0.333, and for the case of a two-input module with five outputs (three inner outputs), random_xcor is calculated as 0.483. The value of random_xcor need be calculated only once for each module. Although these random_xcor values have been found to provide satisfactory results, the values are not critical and other values may be used at the discretion of the system designer. A change in the value of random_xcor affects the dividing line between the two operating regimes of the signal distribution system, as described below. The precise location of that dividing line is not critical.
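For a two-input module, the neighbor-compensated cross-correlation of block 439 and the direction weighting of block 441 described above reduce to a few lines (a sketch; whether the L/R ratio is taken over energies or amplitudes is left as an assumption here, and negative correlations are clipped to zero as stated):

    import numpy as np

    def neighbor_compensated_xcor(common_energy, input_energies):
        """common_energy: higher-order-compensated smoothed common energy.
        input_energies: neighbor-compensated smoothed energies of the M inputs."""
        m = len(input_energies)
        denom = float(np.prod(input_energies)) ** (1.0 / m)
        xcor = common_energy / max(denom, 1e-12)
        return float(np.clip(xcor, 0.0, 1.0))          # values from 0 down to -1 taken as zero

    def direction_weighted_xcor_2in(xcor, level_L, level_R):
        """Unequal input levels drive the result toward 1.0."""
        ratio = level_L / level_R if level_R >= level_L else level_R / level_L
        return 1.0 - (1.0 - xcor) * ratio

    x = neighbor_compensated_xcor(0.3, [1.0, 0.5])     # ~0.42
    print(direction_weighted_xcor_2in(x, 1.0, 0.5))    # ~0.71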
The random_xcor weighting performed by function or device 443 can be considered a renormalization of the direction-weighted_xcor value, such that an effective_xcor is obtained:

    effective_xcor = (direction-weighted_xcor - random_xcor) / (1 - random_xcor), if direction-weighted_xcor >= random_xcor;
    otherwise, effective_xcor = 0

The random_xcor weighting accelerates the reduction of the result as direction-weighted_xcor decreases below 1.0, so that when direction-weighted_xcor equals random_xcor, the value of effective_xcor is zero. Because the outputs of a module represent directions along an arc or a line, effective_xcor values less than zero are treated as zero. The information for controlling the slow smoothers 425, 427 and 429 is derived from the fast- and slow-smoothed, non-neighbor-compensated input channel energies and from the fast- and slow-smoothed common energy of the input channels. In particular, a function or device 445 calculates a fast non-neighbor-compensated cross-correlation in response to the fast-smoothed input channel energies and the fast-smoothed common energy of the input channels. A function or device 447 calculates a fast non-neighbor-compensated direction (a ratio or vector, as discussed above in connection with block 437) in response to the fast-smoothed input channel energies. A function or device 449 calculates a slow non-neighbor-compensated cross-correlation in response to the slow-smoothed input channel energies and the slow-smoothed common energy of the input channels. A function or device 451 calculates a slow non-neighbor-compensated direction (a ratio or vector, as discussed above) in response to the slow-smoothed input channel energies. The fast non-neighbor-compensated cross-correlation, the fast non-neighbor-compensated direction, the slow non-neighbor-compensated cross-correlation and the slow non-neighbor-compensated direction, together with the direction-weighted_xcor from block 441, are applied to a device or function 453 that provides the information controlling the variable slow smoothers 425, 427 and 429 so as to adjust their time constants (hereinafter "time constant adjuster 453"). Preferably, the same control information is applied to each variable slow smoother. Unlike the other quantities fed to the time constant selection, which compare a fast measurement with a slow one, the direction-weighted_xcor is preferably used without reference to any fast value, so that if the absolute value of the direction-weighted_xcor is greater than a threshold, the time constant adjuster 453 selects a faster time constant. The rules for the operation of the time constant adjuster 453 are presented below. Generally speaking, in a dynamic audio system one wants to use time constants as slow as possible, remaining at a quiescent value, to minimize audible disruption of the reproduced sound field, unless a "new event" occurs in the audio signal, in which case it is desirable for a control signal to change rapidly to a new quiescent value and then remain at that value until another "new event" occurs. Typically, audio processing systems have equated changes in amplitude with a "new event."
However, when it comes to cross products or cross-correlation, novelty and amplitude do not always coincide: a new event may cause a reduction in the cross-correlation. By detecting changes in the parameters relevant to the operation of the module, notably cross-correlation and direction, the module's time constants can be accelerated so that it quickly assumes a new control state, as desired. The consequences of inadequate dynamic behavior include wandering behavior, chirping (a channel turning off and on rapidly), pumping (unnatural changes in level) and, in a multi-band embodiment, chirping and pumping on a band-by-band basis. Some of these effects are especially critical to the quality of isolated channels. An embodiment such as that of Figures 1 and 2 employs a grid of decoding modules. That configuration gives rise to two kinds of dynamics problems: inter-module dynamics and intra-module dynamics. In addition, the different ways of implementing the audio processing (for example, broadband, or multi-band using an FFT or MDCT linear filter bank, or a discrete, critical-band or other filter bank) each require their own optimization of the dynamic behavior. The basic decoding process within each module depends on a measure of the energy relationships of the input signals and a cross-correlation measure of the input signals (in particular the direction-weighted cross-correlation (direction-weighted_xcor) described above, the output of block 441 in Figure 4B), which together control the distribution of signals among the outputs of a module. The derivation of these basic quantities requires smoothing, which, in the time domain, requires calculating a time-weighted average of the instantaneous values of those quantities. The required range of time constants is quite large: very short (1 ms, for example) for fast transient changes in signal conditions, up to very long (150 ms, for example) for low correlation values, where the instantaneous variation is very likely to be much greater than the true averaged value. A common way to implement variable time constant behavior is, in analog terms, the use of an "accelerator" diode: when the instantaneous level exceeds the averaged level by a threshold amount, the diode conducts, resulting in a shorter effective time constant. A disadvantage of this technique is that a momentary peak in an otherwise steady-state input may cause a large change in the smoothed level, which then decays very slowly, giving an unnatural emphasis to isolated peaks that would otherwise have little audible consequence. The correlation calculation described in connection with the embodiment of Figures 4A-4C makes the use of accelerator diodes (or their DSP equivalent) problematic. For example, all smoothers within a particular module preferably have synchronized time constants, so that their smoothed levels are comparable. Therefore a mechanism for changing the time constants globally (in tandem) is preferred. Additionally, a rapid change in signal conditions is not necessarily associated with an increase in the common energy level; the use of an accelerator diode on this level would likely produce inaccurate and erratic estimates of the correlation. Therefore, embodiments of aspects of the present invention preferably use two-stage smoothing without a diode-accelerator equivalent. Estimates of the correlation and direction may be derived from both the first and second stages of the smoothers in order to set the time constant of the second stage.
For each pair of smoothers (e.g., 419/425), the time constant of the first, fast stage can be set to a fixed value, such as 1 ms. The time constant of the second, variable slow stage can be, for example, selectable among 10 ms (fast), 30 ms (medium) and 150 ms (slow). Although these time constants have been found to provide satisfactory results, their values are not critical and other values may be employed at the discretion of the system designer. In addition, the second-stage time constant values may be continuously variable rather than discrete. The selection of time constants may be based not only on the signal conditions described above but also on a hysteresis mechanism using a "fast flag", which is used to ensure that once a genuine fast transition is found, the system remains in the fast mode, avoiding use of the medium time constant, until the signal conditions re-enable the slow time constant. This can help ensure rapid adaptation to new signal conditions. The selection of which of the three possible second-stage time constants to use can be made by the time constant adjuster 453 according to the following rules for the two-input case. If the absolute value of direction-weighted_xcor is less than a first reference value (for example 0.5), the absolute difference between the fast non-neighbor-compensated_xcor and the slow non-neighbor-compensated_xcor is less than the same first reference value, and the absolute difference between the fast and slow direction ratios (each of which has a range of +1 to -1) is less than the first reference value, then the slow time constant of the second stage is selected and the fast flag is set to True, enabling the subsequent selection of the medium time constant. Otherwise, if the fast flag is True, the absolute difference between the fast and slow non-neighbor-compensated_xcor is greater than the first reference value and less than a second reference value (for example 0.75), the absolute difference between the fast and slow L/R ratios is greater than the first reference value and less than the second reference value, and the absolute value of direction-weighted_xcor is greater than the first reference value and less than the second reference value, then the medium time constant of the second stage is selected. Otherwise, the fast time constant of the second stage is used and the fast flag is set to False, disabling subsequent use of the medium time constant until the slow time constant is again selected. In other words, the slow time constant is selected when all three conditions are less than the first reference value; the medium time constant is selected when all the conditions lie between the first and second reference values and the preceding condition was the slow time constant; and the fast time constant is selected when any of the conditions is greater than the second reference value. Although the foregoing rules and reference values have been found to produce satisfactory results, they are not critical, and variations of the rules, or other rules that take into account a fast and slow cross-correlation and a fast and slow direction, may be used at the discretion of the system designer. In addition, other changes may be made.
For example, it may be simpler but equally effective to use accelerator-diode-type processing, but with ganged operation such that if any smoother in a module is in a fast mode, all the other smoothers are also switched to the fast mode. It may also be desirable to have separate smoothers for time constant determination and for signal distribution, where the smoothers used for time constant determination are kept at fixed time constants and only the time constants of the signal-distribution smoothers vary. Because, even in the fast mode, the smoothed signal levels require several milliseconds to adapt, a time delay may be introduced in the system to allow the control signals to adapt before they are applied to the signal path. In a broadband embodiment this delay can be implemented as a discrete delay (for example 5 ms) in the signal path. In multi-band (transform) versions, the delay is a natural consequence of the block processing, and if the analysis of a block is performed before the signal-path matrix operation on that block, no explicit delay is required. Multi-band embodiments of aspects of the invention may use the same time constants and rules as the broadband versions, except that the sample rate of the smoothers may be set to the signal sampling rate divided by the block size (i.e., the block rate), so that the coefficients used in the smoothers are scaled properly. For frequencies below 400 Hz, in multi-band embodiments, the time constants are preferably scaled inversely with frequency. In the broadband version this is not possible, since there are no separate smoothers at different frequencies; as a partial compensation, a gentle bandpass/pre-emphasis filter can be applied to the input signal in the control path to emphasize mid and higher frequencies. This filter may have, for example, a two-pole high-pass characteristic with a corner frequency of 200 Hz, plus a two-pole low-pass characteristic with a corner frequency of 8000 Hz, plus a pre-emphasis network that applies 6 dB of boost from 400 Hz to 800 Hz and another 6 dB of boost from 1600 Hz to 3200 Hz. Although that filter has been found to be convenient, the filter characteristics are not critical and other parameters may be employed at the discretion of the system designer.
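For reference, the three-way time constant selection and "fast flag" hysteresis described two paragraphs above might be expressed as follows for a two-input module, using the example reference values 0.5 and 0.75 (an illustrative sketch, not the patent's code):

    def select_time_constant(dw_xcor, fast_xcor, slow_xcor, fast_dir, slow_dir,
                             fast_flag, ref1=0.5, ref2=0.75):
        """Returns (slow_stage_time_constant_s, fast_flag).
        dw_xcor: direction-weighted_xcor; fast_/slow_xcor: fast and slow
        non-neighbor-compensated cross-correlations; fast_/slow_dir: fast and
        slow direction measures (range +1 to -1)."""
        d_xcor = abs(fast_xcor - slow_xcor)
        d_dir = abs(fast_dir - slow_dir)
        if abs(dw_xcor) < ref1 and d_xcor < ref1 and d_dir < ref1:
            return 0.150, True                    # slow; re-arm the fast flag
        if fast_flag and all(ref1 < v < ref2 for v in (d_xcor, d_dir, abs(dw_xcor))):
            return 0.030, fast_flag               # medium
        return 0.010, False                       # fast; medium disabled until slow is selected again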
In addition to the smoothing in the time domain, the multi-band versions of aspects of the invention also preferably employ smoothing in the frequency domain, as described above in connection with Figure 4A (frequency smoothers 413, 415 and 417). For each block, the non-neighbor-compensated energy levels can be averaged with a sliding frequency window adjusted to approximate a 1/3-octave (critical-band) bandwidth before being applied to the subsequent time-domain processing described above. Since transform-based filter banks have an intrinsically linear frequency resolution, the width of this window (in number of transform coefficients) increases with increasing frequency, and is usually only one transform coefficient wide at low frequencies (below approximately 400 Hz). The total smoothing applied in multi-band processing therefore relies more on time-domain smoothing at low frequencies and on frequency-domain smoothing at higher frequencies, where a rapid response in time is more likely to be needed at certain times. Returning to the description of Figure 4C, the preliminary scale factors (shown as "PSFs" in Figure 2), which ultimately affect the distribution of dominant/fill/endpoint signals, can be produced by a combination of devices or functions 455, 457 and 459, which calculate the "dominant" scale factor components, the "fill" scale factor components and the "excess endpoint energy" scale factor components, respectively, their respective normalizers 461, 463 and 465, and a device or function 467 that takes the larger of the dominant and fill scale factor components and/or the additive combination with the excess endpoint energy scale factor components. The preliminary scale factors may be sent to a supervisor, such as supervisor 201 of Figure 2, if the module is one of a plurality of modules. The preliminary scale factors may each have a range of zero to one.

Components of the Dominant Scale Factors

In addition to the effective_xcor, the device or function 455 ("calculate dominant scale factor components") receives the neighbor-compensated direction information from block 437 and the information concerning the local matrix coefficients from a local matrix 469, so that it can determine the N nearest output channels (where N = number of inputs) that can be summed with weights to produce the coordinates of the nominal ongoing primary direction, and apply the "dominant" scale factor components to them to produce the dominant coordinates. The output of block 455 is either a single scale factor component (per subband), if the nominal ongoing primary direction happens to coincide with an output direction, or otherwise multiple scale factor components (one per input, per subband) for the channels that bracket the nominal ongoing primary direction, applied in appropriate proportions so as to pan or steer the dominant signal to the correct virtual location in a power-conserving sense (i.e., for N = 2, the two assigned dominant-channel scale factor components must have a sum of squares equal to effective_xcor). For a two-input module, all the output channels lie on a line or arc, so there is a natural ordering (from "left" to "right") and it is readily apparent which channels are adjacent to one another.
For the hypothetical case analyzed above, which has two input channels and five output channels with sin/cos coefficients as shown, the nominal ongoing primary direction can be taken as (0.8, 0.6), between the mid-left channel ML (.92, .38) and the center channel C (.71, .71). This can be found by locating two consecutive channels such that one channel's L coefficient is greater than the L coordinate of the nominal ongoing primary direction and the channel to its right has an L coefficient less than the dominant L coordinate. The dominant scale factor components are assigned to the two nearest channels in a constant-power sense. To do this, a system of two equations in two unknowns is solved, where the unknowns are the dominant scale factor component of the channel to the left of the dominant direction (SFL) and the corresponding scale factor component of the channel to the right of the nominal ongoing primary direction (SFR) (these equations are solved for SFL and SFR):

    first_dominant_coord = SFL * matrix value 1 of the left channel + SFR * matrix value 1 of the right channel
    second_dominant_coord = SFL * matrix value 2 of the left channel + SFR * matrix value 2 of the right channel

Note that "left" and "right" channels here mean the channels that bracket the nominal ongoing primary direction, not the input channels L and R of the module. The solution is obtained by calculating the antidominant level of each channel, normalizing the pair so that the sum of squares is 1.0, and using the results as the dominant distribution scale factor components (SFL, SFR), each assigned to the other channel. In other words, the antidominant value of an output channel with coefficients A, B, for a signal with coordinates C, D, is the absolute value of A*D - B*C. For the numerical example under consideration:

    Antidom(ML channel) = abs(.92*.6 - .38*.8) = .248
    Antidom(C channel) = abs(.71*.6 - .71*.8) = .142

(where "abs" indicates that the absolute value is taken). Normalizing these two numbers so that their sum of squares is 1.0 yields values of .8678 and .4969, respectively. Swapping these values between the two channels then gives the dominant scale factor components (note that the value of the dominant scale factor, before the direction weighting, is the square root of effective_xcor):

    ML dom sf = .4969 * square_root(effective_xcor)
    C dom sf = .8678 * square_root(effective_xcor)

(The dominant signal is closer to the C output than to the ML output.) The use of a channel's normalized antidominant value as the other channel's dominant scale factor component can be better understood by considering what happens if the nominal ongoing primary direction points exactly at one of the two selected channels. Assume that one channel's coefficients are [A, B], the other channel's coefficients are [C, D] and the coordinates of the nominal ongoing primary direction are [A, B] (pointing at the first channel); then:

    Antidom(first channel) = abs(A*B - B*A)
    Antidom(second channel) = abs(C*B - D*A)

Note that the first antidominant value is zero. When the two antidominant values are normalized so that their sum of squares is 1.0, the second antidominant value is 1.0. After the swap, the first channel receives a dominant scale factor component of 1.0 (multiplied by the square root of effective_xcor) and the second channel receives 0.0, as desired.
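The antidominant calculation in the numerical example above can be reproduced directly (a sketch; the coefficients are those of the hypothetical two-input, five-output module):

    import numpy as np

    def dominant_scale_factors(left_ch, right_ch, dom, effective_xcor):
        """left_ch, right_ch: matrix coefficients of the two output channels
        bracketing the nominal ongoing primary direction; dom: its coordinates."""
        anti_left = abs(left_ch[0] * dom[1] - left_ch[1] * dom[0])     # |A*D - B*C|
        anti_right = abs(right_ch[0] * dom[1] - right_ch[1] * dom[0])
        norm = np.hypot(anti_left, anti_right)                         # normalize sum of squares to 1.0
        # Each channel takes the *other* channel's normalized antidominant value.
        sf_left = (anti_right / norm) * np.sqrt(effective_xcor)
        sf_right = (anti_left / norm) * np.sqrt(effective_xcor)
        return sf_left, sf_right

    # ML = (.92, .38), C = (.71, .71), direction (.8, .6): yields ~(.4969, .8678).
    print(dominant_scale_factors((0.92, 0.38), (0.71, 0.71), (0.8, 0.6), 1.0))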
When this approach is extended to modules with more than two inputs, there is no longer the natural ordering that occurs when the channels lie on a line or arc. Again, block 437 of Figure 4B, for example, calculates the coordinates of the nominal ongoing primary direction by taking the amplitudes of the inputs, after neighbor compensation, and normalizing them to a sum of squares of one. Block 455 of Figure 4C, for example, then identifies the N nearest channels (where N = number of inputs) that can be summed with weights to produce the dominant coordinates. (Note: distance or proximity can be calculated as the sum of the squared coordinate differences, as if they were spatial (x, y, z) coordinates.) The N nearest channels are not always the ones chosen, because the chosen channels must sum, in a weighted manner, to the nominal ongoing primary direction. For example, suppose a three-input module is fed by a triangle of channels Ls, Rs and Top, as in Figure 5. Assume that there are three inner output channels near the bottom of the triangle, where the local matrix coefficients of the module are [.71, .69, .01], [.70, .70, .01] and [.69, .71, .01], respectively. Assume that the nominal ongoing primary direction is slightly below the center of the triangle, with the coordinates [.6, .6, .53]. (Note: the center of the triangle has the coordinates [.5, .5, .707].) The three channels nearest the nominal ongoing primary direction are those three inner channels at the bottom, but they do not sum to the dominant coordinates using scale factors between 0 and 1, so instead two of the lower channels and the Top endpoint channel are selected to distribute the dominant signal, and the three equations are solved for the three weighting factors in order to complete the dominant calculation and proceed to the fill and endpoint calculations. In the examples of Figures 1 and 2 there is only one three-input module, and it is used to derive only one inner channel, which simplifies the calculations.

Components of the Fill Scale Factor

In addition to the effective_xcor, the device or function 457 ("calculate fill scale factor components") receives random_xcor, the direction-weighted_xcor from block 441, "EQUIAMPL" ("EQUIAMPL" is defined and explained below), and the information concerning the local matrix coefficients of the local matrix (in the case that the same fill scale factor component is applied to all the outputs, as explained below in connection with Figure 14B). The output of block 457 is one scale factor component for each module output (per subband).
As explained above, effective_xcor is zero when direction-weighted_xcor is less than or equal to random_xcor. When direction-weighted_xcor >= random_xcor, the fill scale factor component for all the output channels is:

fill scale factor component = sqrt(1 - effective_xcor) * EQUIAMPL

Thus, when direction-weighted_xcor = random_xcor, effective_xcor is 0, so (1 - effective_xcor) is 1.0 and the fill scale factor component equals EQUIAMPL (assuring that output power = input power in that condition). That is the maximum value the fill scale factor components reach. When direction-weighted_xcor is less than random_xcor, the dominant scale factor component(s) are zero and the fill scale factor components are reduced toward zero as direction-weighted_xcor approaches zero:

fill scale factor component = sqrt(direction-weighted_xcor / random_xcor) * EQUIAMPL

Thus, at the limit where direction-weighted_xcor = random_xcor, the preliminary fill scale factor components again equal EQUIAMPL, ensuring continuity with the result of the previous equation for the case of direction-weighted_xcor greater than random_xcor. Associated with each decoder module is not only a value of random_xcor but also a value of "EQUIAMPL", which is the scale factor value that all scale factors should have if the signals are equally distributed, in such a way that power is preserved, that is:

EQUIAMPL = sqrt(number of input channels of the decoding module / number of output channels of the decoding module)

For example, for a two-input module with three outputs, EQUIAMPL = sqrt(2/3) = .8165; for a two-input module with four outputs, EQUIAMPL = sqrt(2/4) = .7071; and for a two-input module with five outputs, EQUIAMPL = sqrt(2/5) = .6325. Although these EQUIAMPL values have been found to provide satisfactory results, the values are not critical and other values may be used at the discretion of the system designer. Changes in the value of EQUIAMPL affect the levels of the output channels for the "fill" condition (intermediate correlation of the input signals) relative to the levels of the output channels for the "dominant" condition (maximum correlation of the input signals) and the "all endpoints" condition (minimum correlation of the input signals).
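The two fill formulas and EQUIAMPL can be sketched as follows (an illustrative Python fragment, not the implementation; the function names are assumptions, and the random_xcor value used in the example calls is an arbitrary illustration rather than a value from the text):

```python
import math

def equiampl(num_inputs, num_outputs):
    # Scale factor that preserves power when the signal is spread equally
    # over all outputs, e.g. sqrt(2/5) = .6325 for a 2-in / 5-out module.
    return math.sqrt(num_inputs / num_outputs)

def fill_sf_component(direction_weighted_xcor, random_xcor, effective_xcor, equiampl_value):
    if direction_weighted_xcor >= random_xcor:
        # Region 1: the fill component shrinks as the dominant component grows.
        return math.sqrt(1.0 - effective_xcor) * equiampl_value
    # Region 2: no dominant component; the fill component shrinks toward zero
    # as the correlation falls toward zero (the all-endpoints condition).
    return math.sqrt(direction_weighted_xcor / random_xcor) * equiampl_value

eq = equiampl(2, 5)                           # .6325, as in the text
print(fill_sf_component(0.5, 0.5, 0.0, eq))   # boundary case: equals EQUIAMPL
print(fill_sf_component(0.25, 0.5, 0.0, eq))  # Region 2: reduced fill
```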
Components of the Endpoint Scale Factors

In addition to neighbors-compensated_xcor (from block 439, Figure 4B), device or function 359 ("calculate the excess endpoint energy scale factor components") receives the smoothed, non-neighbor-compensated energies of the 1st through mth inputs (from blocks 325 and 327) and, optionally, information concerning the local matrix coefficients (for the case in which one or both of the module's endpoint outputs do not coincide with an input, and the module applies the excess endpoint energy to the two outputs whose directions are closest to the input direction, as discussed further below). The output of block 359 is one scale factor component for each endpoint output whose direction matches an input direction; otherwise it is two scale factor components, one for each of the two outputs closest to the endpoint, as explained below.

However, the excess endpoint energy scale factor components produced by block 359 are not the only "endpoint" scale factor components. There are three other sources of endpoint scale factor components (two in the case of a single stand-alone module). First, within the preliminary scale factor calculations of a particular module, endpoints are possible candidates for dominant signal scale factor components from block 355 (and normalizer 361). Second, in the "fill" calculation of block 357 (and normalizer 363) of Figure 4C, endpoints are treated as possible fill candidates along with all the interior channels; any non-zero fill scale factor component may be applied to all outputs, including the selected endpoint and dominant outputs. Third, if there is an array of multiple modules, a supervisor (such as supervisor 201 of the example of Figure 2) makes a fourth, final assignment of the "endpoint" channels, as described above in relation to Figures 2 and 3.
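To make the interplay of these sources concrete, the following hedged sketch combines per-output components in the manner elaborated below in connection with Figure 7 (the larger of the dominant and fill components, plus the endpoint component); all names and the example numbers are illustrative only:

```python
def combine_sf_components(dominant, fill, endpoint):
    """Combine per-output scale factor components from the three sources.

    dominant, fill, endpoint: lists with one component per module output
    (endpoint entries are zero for interior outputs).  Returns one
    preliminary scale factor per output: max(dominant, fill) + endpoint,
    per the preferred "sum and/or greatest of" form described later.
    """
    return [max(d, f) + e for d, f, e in zip(dominant, fill, endpoint)]

# Illustrative 5-output example (L, ML, C, MR, R): a dominant component at C,
# a uniform fill component everywhere, and no endpoint components.
print(combine_sf_components(
    dominant=[0.0, 0.0, 0.7, 0.0, 0.0],
    fill=[0.3, 0.3, 0.3, 0.3, 0.3],
    endpoint=[0.0, 0.0, 0.0, 0.0, 0.0]))
```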
In order for block 459 to calculate the "excess endpoint energy" scale factor components, the total energy in all the interior outputs is reflected back to the module inputs, based on neighbors-compensated_xcor, to estimate how much of the interior output energy is contributed by each input ("interior energy at input i"), and that energy is used to calculate the excess endpoint energy scale factor component at each module output that coincides with an input (that is, an endpoint). Reflecting the interior energy back to the inputs is also required in order to provide the information that a supervisor, such as supervisor 201 of Figure 2, needs to calculate neighbor levels and higher-order neighbor levels. One way to calculate the interior energy contribution at each module input and to determine the excess endpoint scale factor component for each endpoint output is shown in Figures 6A and 6B. Figures 6A and 6B are functional block diagrams showing, respectively, in a module such as any of modules 24-34 of Figure 2, an arrangement suitable for (1) generating the estimated total interior energy for each module input, 1 through m, in response to the total energy of each input, 1 through m, and (2) generating, in response to neighbors-compensated_xcor (see Figure 4B, the output of block 439), an excess endpoint energy scale factor component for each of the module endpoints. The estimated total interior energy for each module input (Figure 6A) is required by the supervisor in the case of an array of multiple modules and, in any case, by the module itself in order to generate the excess endpoint energy scale factor components. Using the scale factor components derived in blocks 455 and 457 of Figure 4C, along with other information, the arrangement of Figure 6A calculates the total energy in each interior output (but not in the endpoint outputs). Using the calculated interior output energy levels, each output level is multiplied by the matrix coefficient that relates that output to each input ("m" inputs, "m" multipliers), which gives the energy contribution of that input to that output. For each input, all the energy contributions across all the interior output channels are added to obtain the total interior energy contribution of that input. The total interior energy contribution of each input is reported to the supervisor and is used by the module to calculate the excess endpoint energy scale factor component for each endpoint output. Referring to Figure 6A in detail, the smoothed total energy level for each module input (preferably non-neighbor-compensated) is applied to a set of multipliers, one multiplier for each of the module's interior outputs. For simplicity of presentation, Figure 6A shows two inputs, "1" and "m", and two interior outputs, "X" and "Z". The smoothed total energy level for each module input is multiplied by a matrix coefficient (from the module's local matrix) that relates the particular input to one of the module's interior outputs (note that the matrix coefficients are their own inverses because the sum of squares of the matrix coefficients equals one). This is done for each combination of input and interior output. In this manner, as shown in Figure 6A, the smoothed total energy level at input 1 (which can be obtained, for example, at the output of slow smoother 425 of Figure
4B) is applied to a multiplier 601 that multiplies that energy level by a matrix coefficient relating interior output X to input 1, providing a scaled energy level component X1 at output X. Similarly, multipliers 603, 605 and 607 provide scaled energy level components Xm, Z1 and Zm. The energy level components for each interior output (for example, X1 and Xm; Z1 and Zm) are summed in combiners 611 and 613 in an amplitude/power manner according to neighbors-compensated_xcor. If the inputs to a combiner are in phase, which is indicated by a neighbor-compensated cross-correlation of 1.0, their linear amplitudes add. If they are uncorrelated, which is indicated by a neighbor-compensated cross-correlation of zero, their energy levels add. If the cross-correlation is between one and zero, the sum is partly a sum of amplitudes and partly a sum of powers. In order to add the inputs to each combiner appropriately, both the sum of amplitudes and the sum of powers are calculated and weighted by neighbors-compensated_xcor and (1 - neighbors-compensated_xcor), respectively. To form the weighted sum, either the square root of the power sum is taken to obtain an equivalent amplitude, or the sum of the linear amplitudes is squared to obtain its power level, before the weighted sum is taken. For example, using the latter approach (a weighted sum of powers), if the amplitude levels are 3 and 4 and neighbors-compensated_xcor is 0.7, the sum of amplitudes is 3 + 4 = 7, corresponding to a power level of 49, and the sum of powers is 9 + 16 = 25. The weighted sum is then 0.7 * 49 + (1 - 0.7) * 25 = 41.8 (a power level) or, taking the square root, 6.47 (an amplitude).
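The amplitude/power combining rule used by combiners 611 and 613 (and again by combiners 625 and 627 below) can be sketched as follows, reproducing the 3-and-4 worked example; the function name is an assumption:

```python
import math

def correlation_weighted_sum(amplitudes, xcor):
    """Blend an amplitude-domain sum and a power-domain sum according to a
    cross-correlation value in [0, 1]; returns (power level, amplitude)."""
    amp_sum_as_power = sum(amplitudes) ** 2          # fully correlated (in-phase) case
    power_sum = sum(a * a for a in amplitudes)       # uncorrelated case
    power = xcor * amp_sum_as_power + (1.0 - xcor) * power_sum
    return power, math.sqrt(power)

print(correlation_weighted_sum([3.0, 4.0], 0.7))     # (41.8, about 6.47), as in the text
```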
The outputs of the combiners (X1 + Xm; Z1 + Zm) are multiplied by the scale factor components for each of the outputs X and Z, in multipliers 613 and 615, to produce the total energy level at each interior output, which may be identified as X' and Z'. The scale factor component for each interior output is obtained from block 467 (Figure 4C). Note that the "excess endpoint energy scale factor components" of block 459 (Figure 4C) do not affect the interior outputs and are not involved in the calculations made by the arrangement of Figure 6A. The total energy level at each interior output, X' and Z', is reflected back to the respective module inputs by multiplying each by a matrix coefficient (from the module's local matrix) that relates the particular output to each of the module's inputs. This is done for each combination of interior output and input. In this way, as shown in Figure 6A, the total energy level X' at interior output X is applied to a multiplier 617 that multiplies that energy level by a matrix coefficient relating interior output X to input 1 (which is the same as its inverse, as mentioned above), providing a scaled energy level component X1' at input 1. It should be noted that when a second-order value, such as the total energy level X', is weighted by a first-order value, such as a matrix coefficient, a second-order weighting is required. This is equivalent to taking the square root of the energy to obtain an amplitude, multiplying that amplitude by the matrix coefficient, and squaring the result to obtain an energy value again. Similarly, multipliers 619, 621 and 623 provide scaled energy levels Xm', Z1' and Zm'. The energy components related to each input (for example, X1' and Z1'; Xm' and Zm') are summed in combiners 625 and 627 in an amplitude/power manner, as described above in relation to combiners 611 and 613, according to neighbors-compensated_xcor. The outputs of combiners 625 and 627 represent the estimated total interior energy for inputs 1 and m, respectively. In the case of an array of multiple modules, that information is sent to the supervisor, such as supervisor 201 of Figure 2, so that the supervisor can calculate neighbor levels. The supervisor collects the interior energy contributions at each input from all the modules connected to that input, and then informs each module, for each of its inputs, of the sum of all the other total interior energy contributions from all the other modules connected to that input. The result is the neighbor level for that input of that module. The generation of neighbor level information is described further below. The estimated total interior energy contributed by each of inputs 1 and m is also required by the module itself in order to calculate the excess endpoint energy scale factor component for each endpoint output. Figure 6B shows how this scale factor component information can be calculated. For simplicity of presentation, only the calculation of the scale factor component information for one endpoint is shown; it is understood that a similar calculation is made for each endpoint output. The estimated total interior energy contributed by an input, such as input 1, is subtracted in a combiner or combining function 629 from the smoothed total input energy for the same input, input 1 in this example (the same smoothed total energy level at input 1 that is obtained, for example, at the output of slow smoother 425 of Figure 4B and applied to multiplier 601). The result of the subtraction is divided, in a divider or dividing function 631, by the smoothed total energy level for the same input 1. The square root of the result of the division is taken in a square root extractor or square-root extraction function 633. It should be noted that the operation of divider or dividing function 631 (and of the other dividers described herein) must include a test for a zero denominator, in which case the quotient may be set to zero.
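A minimal sketch of the Figure 6B calculation (combiner 629, divider 631 and square-root extractor 633) for one endpoint output, including the zero-denominator test just noted; the names are illustrative, and the clamp of the excess to zero is an added safeguard, not something stated in the text:

```python
import math

def excess_endpoint_sf_component(total_input_energy, estimated_interior_energy):
    """Excess endpoint energy scale factor component for one endpoint output:
    the fraction of the input's smoothed energy not accounted for by the
    interior outputs, expressed as an amplitude scale factor."""
    if total_input_energy <= 0.0:          # zero-denominator test
        return 0.0
    # Clamp to zero (added safeguard) in case the estimate exceeds the input energy.
    excess = max(total_input_energy - estimated_interior_energy, 0.0)
    return math.sqrt(excess / total_input_energy)

print(excess_endpoint_sf_component(8.0, 6.0))   # sqrt(0.25) = 0.5
print(excess_endpoint_sf_component(0.0, 0.0))   # 0.0 (guarded)
```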
If there is only one stand-alone module, the preliminary endpoint scale factor components are then determined from the dominant, fill and excess endpoint energy scale factor components. In this way all the output channels, including the endpoints, have assigned scale factors, which can then be used to carry out the matrix operation on the signal paths. However, if there is an array of multiple modules, each module has an assigned endpoint scale factor for each input that feeds it, so that each input having more than one module connected to it has multiple scale factor assignments, one from each connected module. In this case the supervisor (such as supervisor 201 of the example of Figure 2) makes a fourth, final assignment of the "endpoint" channels, as described above in relation to Figures 2 and 3. The supervisor determines endpoint scale factors that override all the endpoint scale factor assignments made by the individual modules.

In practical arrangements there is no certainty that an output channel direction actually exists that corresponds to an endpoint position, although this is often the case. If there is no physical endpoint channel, but there is at least one physical channel beyond the endpoint, the endpoint energy is panned to the physical channels nearest the endpoint, as if it were a dominant signal component. In a horizontal array these are the two channels nearest the endpoint position, preferably using a constant-power distribution (the sum of the squares of the two scale factors is 1.0). In other words, when a sound direction does not correspond to the position of a real sound channel, even if that direction is an endpoint signal, it is preferred to pan it to the pair of nearest available real channels, because if the sound moves slowly it would otherwise jump suddenly from one output channel to another. Thus, when there is no physical endpoint sound channel, it is not appropriate to pan an endpoint signal to the single sound channel nearest the endpoint location, unless there is no physical channel beyond the endpoint, in which case there is no choice other than the sound channel nearest the endpoint location. Such panned spreading can be implemented by having a supervisor, such as supervisor 201 of Figure 2, generate "final" scale factors based on the assumption that each input also has a corresponding output channel (that is, each input has a matching output representing the same location); an output matrix, such as variable matrix 203 of Figure 2, can then map such an output channel to one or more appropriate output channels if there is no real output channel corresponding directly to an input channel.

As mentioned above, the outputs of each of the "calculate scale factor component" devices or functions 455, 457 and 459 are applied to respective normalization devices or functions 461, 463 and 465. These normalizations are desirable because the scale factor components calculated by blocks 455, 457 and 459 are based on neighbor-compensated levels, while the final matrix operation on the signal path (in the master matrix, in the case of multiple modules, or in the local matrix, in the case of a stand-alone module) involves non-neighbor-compensated levels (the input signals applied to the matrix are not neighbor compensated). Typically, the scale factor components are reduced in value by a normalizer. An appropriate way to implement the normalizers is as follows: each normalizer receives the smoothed, neighbor-compensated input power for each of the module inputs (as from combiners 331 and 333), the smoothed, non-neighbor-compensated input power for each of the module inputs (as from blocks 325 and 327), the local matrix coefficient information, and the respective outputs of blocks 355, 357 and 359. Each normalizer calculates a desired output level for each output channel and an actual output level for each output channel assuming a scale factor of 1. It then divides the desired output calculated for each output channel by the calculated actual output level for that channel and takes the square root of the quotient to provide a potential preliminary scale factor for application to the "sum and/or greatest of" 367. Consider the following example.
Assume that the smoothed, non-neighbor-compensated input energy levels of a two-input module are 6 and 8, and that the corresponding neighbor-compensated energy levels are 3 and 4. Also assume a central interior output channel having matrix coefficients (.71, .71), or, squared, (0.5, 0.5). If the module selects an initial scale factor for this channel (based on neighbor-compensated levels) of 0.5, or, squared, 0.25, then the desired output level of this channel (assuming a pure energy sum for simplicity and using the neighbor-compensated levels) is .25 * (3 * .5 + 4 * .5) = 0.875. Because the actual input levels are 6 and 8, if the previous squared scale factor of 0.25 were used for the final matrix operation of the signal path, the output level would be .25 * (6 * .5 + 8 * .5) = 1.75 instead of the desired output level of 0.875. The normalizer adjusts the scale factor to obtain the desired output level when non-neighbor-compensated levels are used. Actual output assuming SF = 1: (6 * .5 + 8 * .5) = 7. (Desired output level) / (actual output assuming SF = 1) = 0.875 / 7.0 = 0.125 = final scale factor, squared. Final scale factor for that output channel = sqrt(0.125) = 0.354, instead of the initially calculated value of 0.5.
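The normalization just worked through can be sketched as follows (a pure energy sum is assumed, as in the example; the function name and argument layout are assumptions):

```python
import math

def normalize_sf(initial_sf, coeffs, neighbor_comp_energies, non_neighbor_comp_energies):
    """Rescale a preliminary scale factor, computed from neighbor-compensated
    levels, so that the desired output level is obtained when the matrix is
    driven by non-neighbor-compensated levels (pure energy sum assumed)."""
    coeffs_sq = [c * c for c in coeffs]
    desired = initial_sf ** 2 * sum(c * e for c, e in zip(coeffs_sq, neighbor_comp_energies))
    actual_sf1 = sum(c * e for c, e in zip(coeffs_sq, non_neighbor_comp_energies))
    if actual_sf1 <= 0.0:                  # zero-denominator test, as noted above
        return 0.0
    return math.sqrt(desired / actual_sf1)

# Worked example from the text: coefficients whose squares are (0.5, 0.5),
# neighbor-compensated energies 3 and 4, non-neighbor-compensated energies 6 and 8,
# initial scale factor 0.5 -> final scale factor of about 0.354.
print(normalize_sf(0.5, (math.sqrt(0.5), math.sqrt(0.5)), (3.0, 4.0), (6.0, 8.0)))
```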
The "sum and/or greatest of" 367 preferably sums the corresponding fill and endpoint scale factor components for each output channel per subband, and selects the larger of the dominant and fill scale factor components for each output channel per subband. The function of the "sum and/or greatest of" block 367 in its preferred form can be characterized as shown in Figure 7. That is, the dominant scale factor components and the fill scale factor components are applied to a device or function 701 that selects the larger scale factor component for each output ("greatest of" 701), and an additive combiner or combining function 703 adds the outputs of greatest-of 701 to the endpoint energy scale factors for each output. Alternatively, acceptable results can be obtained when the "sum and/or greatest of" 467: (1) sums in both Region 1 and Region 2, (2) takes the greatest value present in Region 1 and in Region 2, or (3) selects the greatest of all the values present in Region 1 and sums in Region 2.

Figure 8 is an idealized representation of the manner in which an aspect of the present invention generates scale factor components in response to a cross-correlation measure. The figure is particularly useful with reference to the examples of Figures 9A and 9B through 16A and 16B. As mentioned above, the generation of the scale factor components can be considered as having two regions or operating regimes: a first region, Region 1, bounded by "all dominant" and "uniformly filled", in which the available scale factor components are a mixture of dominant and fill scale factor components, and a second region, Region 2, bounded by "uniformly filled" and "all endpoints", in which the available scale factor components are a mixture of fill and excess endpoint energy scale factor components. The "all dominant" boundary condition occurs when direction-weighted_xcor is one. Region 1 (dominant plus fill) extends from that boundary to the point where direction-weighted_xcor equals random_xcor, the "uniformly filled" condition. The "all endpoints" boundary condition occurs when direction-weighted_xcor is zero. Region 2 (fill plus endpoint) extends from the "uniformly filled" boundary condition to the "all endpoints" boundary condition. The "uniformly filled" boundary point may be considered as located in either Region 1 or Region 2; as mentioned below, the precise placement of the boundary is not critical. As illustrated in Figure 8, as the value of the dominant scale factor component(s) declines, the value of the fill scale factor components increases, reaching a maximum when the dominant scale factor component(s) reach a value of zero, the point at which, as the value of the fill scale factor components in turn declines, the value of the excess endpoint energy scale factor components increases. The result, when applied to an appropriate matrix receiving the module's input signals, is a distribution of output signals that provides a compact sound image when the input signals are highly correlated, spreading (widening) from compact to broad as the correlation is reduced, and progressively splitting outward from broad into multiple sound images, one at each endpoint, as the correlation continues to decrease to highly uncorrelated. Although it is desirable that there be a single spatially compact sound image (in the nominal ongoing primary direction of the input signals) for the case of full correlation, and a plurality of spatially compact sound images (one at each endpoint) for the case of full non-correlation, the spatially dispersed sound images between those extremes can be achieved in ways other than those shown in the illustration of Figure 8. It is not critical, for example, that the fill scale factor values reach a maximum for the case of direction-weighted_xcor equal to random_xcor, nor that the values of the three scale factor components change linearly as shown. Modifications of the relationships of Figure 8 (and of the equations contained in the figure), and other relationships between an appropriate cross-correlation measure and scale factor values capable of producing a distribution of signals that goes from compact, through widely dispersed, to compact at the endpoints as the cross-correlation measure goes from highly correlated to highly uncorrelated, are also contemplated by the present invention. For example, instead of obtaining such a distribution using the two-region approach described above, those results may be obtained through a mathematical approach, such as one that uses a solution of equations based on the pseudo-inverse.

Examples of Output Scale Factors

A series of idealized representations, Figures 9A and 9B through Figures 16A and 16B, illustrates the output scale factors of a module for several examples of input signal conditions. For simplicity, a single stand-alone module is assumed, such that the scale factors it produces for a variable matrix are the final scale factors. The module and an associated variable matrix have two input channels (such as left L and right R) that coincide with two endpoint output channels (which may also be designated L and R). In this series of examples there are three interior output channels (such as middle left LM, center C and middle right RM).
The meanings of "all dominant", "mixed dominant and fill", "uniformly filled", "mixed fill and endpoints", and "all endpoints" are further illustrated in relation to the examples of Figures 9A and 9B through 16A and 16B. In each pair of figures (for example, 9A and 9B), figure "A" shows the energy levels of the two inputs, left L and right R, and figure "B" shows the scale factor components for the five outputs: left L, middle left LM, center C, middle right RM and right R. The figures are not to scale. In Figure 9A the input energy levels, shown as two vertical arrows, are equal. In addition, both direction-weighted_xcor and effective_xcor have a value of 1.0 (full correlation). In this example there is only one non-zero scale factor, shown in Figure 9B as a single vertical arrow at C, which is applied to the output of the interior center channel C, resulting in a spatially compact dominant signal. In this example the output is centered (L/R = 1) and therefore happens to coincide with the center interior output channel C. If there were no coinciding output channel, the dominant signal would be applied in appropriate proportions to the nearest output channels in order to pan the dominant signal to the correct virtual location between them. If, for example, there were no center output channel C, the middle left and middle right output channels would have non-zero scale factors, causing the dominant signal to be applied equally to the LM and RM outputs. In this case of full correlation (all dominant signal), there are no fill or endpoint components. Thus, the preliminary scale factors produced by block 467 (Figure 4C) are the same as the normalized dominant scale factor components produced by block 361.

In Figure 10A the input energy levels are equal, but direction-weighted_xcor is less than 1.0 and greater than random_xcor. Consequently, the scale factors are those of Region 1, that is, mixed dominant and fill scale factor components. The larger of the normalized dominant scale factor component (from block 361) and the normalized fill scale factor component (from block 363) is applied to each output channel (by block 367), such that the dominant scale factor is located at the same center output channel C as in Figure 9B, but is smaller, and fill scale factors appear at each of the other output channels, L, LM, RM and R (including the L and R endpoints).

In Figure 11A the input energy levels remain the same, but direction-weighted_xcor = random_xcor. Consequently the scale factors, in Figure 11B, are those of the boundary condition between Regions 1 and 2, the uniformly filled condition, in which there are no dominant or endpoint scale factors, only fill scale factors having the same value at each output (hence "uniformly filled"), which is indicated by the identical arrows at each output. The fill scale factor levels reach their highest value in this example. As discussed below, the fill scale factors may also be applied non-uniformly, such as in a tapered manner, depending on the conditions of the input signals.
In Figure 12A the input energy levels remain the same, but direction-weighted_xcor is less than random_xcor and greater than zero (Region 2). Consequently, as shown in Figure 12B, there are fill and endpoint scale factors, but no dominant scale factors. In Figure 13A the input energy levels remain the same, but direction-weighted_xcor is zero. Consequently the scale factors, shown in Figure 13B, are those of the all-endpoints boundary condition: there are no interior output scale factors, only endpoint scale factors. In the examples of Figures 9A/9B through 13A/13B, because the energy levels of the two inputs are equal, direction-weighted_xcor (such as that produced by block 441 of Figure 4B) is the same as neighbors-compensated_xcor (such as that produced by block 439 of Figure 4B). In Figure 14A, however, the input energy levels are not equal (L is greater than R). Although neighbors-compensated_xcor equals random_xcor in this example, the resulting scale factors, shown in Figure 14B, are not fill scale factors applied uniformly to all channels as in the example of Figures 11A and 11B. Instead, the unequal input energy levels cause a proportional increase in direction-weighted_xcor (proportional to the degree to which the nominal ongoing primary direction deviates from the central position), so that it becomes greater than neighbors-compensated_xcor, thereby causing the scale factors to be weighted more toward all dominant (as illustrated in Figure 8). This is a desired result, because signals strongly weighted toward L or R should not have a wide width; they should have a compact width near the L or R channel endpoint. The resulting output shown in Figure 14B is a non-zero dominant scale factor located closer to L than to R (the neighbor-compensated direction information, in this case, places the dominant component precisely at the middle left position LM), reduced fill scale factor amplitudes, and no endpoint scale factors (the direction weighting pushes the operation toward Region 1 of Figure 8, mixed dominant and fill). For the five outputs corresponding to the scale factors of Figure 14B, the outputs can be expressed as:

Lout = Lt * (SF_L)
LMout = ((.92) Lt + (.38) Rt) * (SF_LM)
Cout = ((.71) Lt + (.71) Rt) * (SF_C)
RMout = ((.38) Lt + (.92) Rt) * (SF_RM)
Rout = Rt * (SF_R)

Thus, in the example of Figure 14B, even though the scale factors (SF) for each of the four outputs other than LMout are equal (fill), the corresponding signal outputs are not equal, because Lt is greater than Rt (resulting in more signal output toward the left), and the dominant output at LM is greater than its scale factor alone indicates. Because the nominal ongoing primary direction coincides with the LM output channel, the ratio of Lt to Rt is the same as the ratio of the matrix coefficients for the LM output channel, that is, 0.92 to 0.38. Assume these are the actual amplitudes of Lt and Rt. To calculate the output levels, these levels are multiplied by the corresponding matrix coefficients, summed, and scaled by the respective scale factors:

output amplitude(output_channel_i) = sf(i) * (Lt_Coef(i) * Lt + Rt_Coef(i) * Rt)

Although the summation preferably takes the blend between amplitude and energy addition into account (as in the calculations related to Figure
6A), in this example the cross-correlation is quite high (large dominant scale factor) and an ordinary amplitude sum can be used:

Lout = 0.1 * (1 * 0.92 + 0 * 0.38) = 0.092
LMout = 0.9 * (0.92 * 0.92 + 0.38 * 0.38) = 0.900
Cout = 0.1 * (0.71 * 0.92 + 0.71 * 0.38) = 0.092
RMout = 0.1 * (0.38 * 0.92 + 0.92 * 0.38) = 0.070
Rout = 0.1 * (0 * 0.92 + 1 * 0.38) = 0.038

This example thus shows that the signal outputs at Lout, Cout, RMout and Rout are not equal, even though the scale factors for those outputs are equal, because Lt is greater than Rt. The fill scale factors can be distributed among the output channels as shown in the examples of Figures 10B, 11B, 12B and 14B. Alternatively, the fill scale factor components, instead of being uniform, may be varied with position in some other way as a function of the dominant (correlated) and/or endpoint (uncorrelated) input signal components (or, equivalently, as a function of the value of direction-weighted_xcor). For example, for moderately high values of direction-weighted_xcor, the fill scale factor component amplitudes may be convexly curved, so that output channels close to the nominal ongoing primary direction receive more signal level than channels that are farther away. For direction-weighted_xcor = random_xcor, the fill scale factor component amplitudes may be flattened to a flat distribution, and for direction-weighted_xcor < random_xcor the amplitudes may be curved concavely, favoring the channels near the endpoint directions. Examples of such curved fill scale factor amplitudes are shown in Figure 15B and Figure 16B. The output of Figure 15B results from an input (Figure 15A) that is the same as that of Figure 10A, described above. The output of Figure 16B results from an input (Figure 16A) that is the same as that of Figure 12A, described above.

Communication between the Module and the Supervisor with Respect to Neighbor Levels and Higher-Order Neighbor Levels

Each module in an array of multiple modules, such as the example of Figures 1 and 2, requires two mechanisms in order to support communication between itself and a supervisor, such as supervisor 201 of Figure 2: (a) one is to collect and report the information required by the supervisor to calculate neighbor levels and higher-order neighbor levels (if any); the information required by the supervisor is the estimated total interior energy attributable to each of the module's inputs, generated, for example, by the arrangement of Figure 6A; (b) the other is to receive and apply the neighbor levels (if any) and the higher-order neighbor levels (if any) from the supervisor. In the example of Figure 4B, neighbor levels are subtracted in respective combiners 431 and 433 from the smoothed energy levels of each input, and higher-order neighbor levels (if any) are subtracted in respective combiners 431, 433 and 435 from the smoothed energy levels of each input and from the common energy across the channels. Once a supervisor knows all the estimated total interior energy contributions from each input of each module: (1) it determines whether the estimated total interior energy contributions at each input (summed over all the modules connected to that input) exceed the level of the total available signal at that input.
If the sum exceeds the total available, the supervisor rescales each interior energy reported by each module connected to that input so that they sum to the total input level. (2) It then informs each module, for each of its inputs, of its neighbor level at that input: the sum of all the other interior energy contributions at that input (if any). Higher-order (HO) neighbor levels are the neighbor levels of one or more higher-order modules that share the inputs of a lower-order module. The preceding calculation of neighbor levels relates only to modules at a particular input that have the same hierarchy: all three-input modules (if any), then likewise all two-input modules, and so on. The higher-order neighbor level of a module is the sum of the neighbor levels of all higher-order modules at that input (that is, the higher-order neighbor level at an input of a two-input module is the sum over all third-order, fourth-order and higher-order modules, if any, that share the node with the two-input module). Once a module knows its higher-order neighbor levels at a particular one of its inputs, it subtracts them, together with the neighbor levels of the same hierarchy level, from the total input energy level of that input, to obtain the neighbor-compensated level at that input node. This is shown in Figure 4B, where the neighbor levels for input 1 and for input m are subtracted in combiners 431 and 433, respectively, from the outputs of the variable slow smoothers 425 and 427, and the higher-order neighbor levels for input 1, input m and the common energy are subtracted in combiners 431, 433 and 435, respectively, from the outputs of the variable slow smoothers 425, 427 and 429. A difference between the use of neighbor levels and of higher-order neighbor levels for compensation is that higher-order neighbor levels are also used to compensate the common energy across the input channels (for example, accomplished by subtracting a higher-order neighbor level in combiner 435). The fundamental reason for this difference is that the common level of a module is not affected by adjacent modules of the same hierarchy, but it can be affected by a higher-order module that shares all of the module's inputs. For example, assume input channels Ls (left surround), Rs (right surround) and Top, with one interior output channel at the center of the triangle between them (high rear center), plus an interior output channel on a line between Ls and Rs (horizontal rear center); the former output channel needs a three-input module to recover the signal common to all three inputs, while the latter output channel, the one on a line between the two inputs (Ls and Rs), needs a two-input module. However, the total common signal level observed by the two-input module includes common elements of the three-input module that do not belong to the latter output channel, so the square root of the pairwise products of the higher-order neighbor levels is subtracted from the common energy of the two-input module to determine how much common energy is due only to its interior channel (the latter one mentioned). Thus, in Figure 4B, the derived higher-order common level has been subtracted from the smoothed common energy level (from block 429) to produce a neighbor-compensated common energy level (from combiner 435), which is used by the module to calculate (in block 439) neighbors-compensated_xcor.
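A hedged sketch of the supervisor's neighbor-level bookkeeping for one input node, following the two steps described above (rescaling when the reported contributions exceed the available input energy, then reporting to each module the sum of the other modules' contributions); the data layout and module identifiers are assumptions, and the sketch ignores the distinction between same-hierarchy and higher-order modules:

```python
def neighbor_levels_for_input(total_input_energy, contributions):
    """contributions: {module_id: estimated total interior energy attributed
    to this input by that module}.  Returns {module_id: neighbor level}."""
    total_reported = sum(contributions.values())
    # (1) If the modules together claim more than is available at this input, rescale.
    if total_reported > total_input_energy and total_reported > 0.0:
        scale = total_input_energy / total_reported
        contributions = {m: e * scale for m, e in contributions.items()}
        total_reported = total_input_energy
    # (2) Each module's neighbor level at this input is the sum of the other
    # modules' (rescaled) contributions.
    return {m: total_reported - e for m, e in contributions.items()}

print(neighbor_levels_for_input(10.0, {"mod24": 6.0, "mod26": 6.0}))
# -> {'mod24': 5.0, 'mod26': 5.0} after rescaling the reported 12 down to 10
```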
The present invention and its various aspects may be implemented in analog circuitry, or, more likely, as software functions performed in digital signal processors, in programmed general-purpose digital computers, and/or in special-purpose digital computers. Interfaces between analog and digital signal streams may be performed in appropriate hardware and/or as functions in software and/or firmware. Although the present invention and its various aspects may involve analog or digital signals, in practical applications it is likely that most or all of the processing functions will be performed in the digital domain on digital signal streams in which the audio signals are represented by samples. It should be understood that other variations and modifications of the invention and its various aspects will be apparent to those skilled in the art, and that the invention is not limited by the specific embodiments described. It is therefore contemplated to cover by the present invention any and all modifications, variations or equivalents that fall within the true spirit and scope of the basic underlying principles described and claimed herein.

Claims (1)

CLAIMS

1. A process for converting M audio input signals, each associated with a direction, into N audio output signals, each associated with a direction, wherein N is greater than M, M is two or more and N is a positive integer equal to three or more, characterized in that it comprises: providing a variable M:N matrix, applying the M audio input signals to the variable matrix, deriving the N audio output signals from the variable matrix, and controlling the variable matrix in response to the input signals such that a soundfield generated by the output signals has a compact sound image in the nominal ongoing primary direction of the input signals when the input signals are highly correlated, the image spreading from compact to broad as the correlation decreases and progressively splitting into multiple compact sound images, each in a direction associated with an input signal, as the correlation continues to decrease to highly uncorrelated.

2. A process according to claim 1, characterized in that the variable M:N matrix is a variable matrix having variable coefficients or a variable matrix having fixed coefficients and variable outputs, and the variable matrix is controlled by varying the variable coefficients or by varying the variable outputs.

3. A process according to claim 1, characterized in that the variable matrix is controlled in response to measures of: (1) the relative levels of the input signals, and (2) the cross-correlation of the input signals.

4. A process according to claim 3, characterized in that, for a cross-correlation measure of the input signals having values in a first range, bounded by a maximum value and a reference value, the soundfield has a compact sound image when the cross-correlation measure is the maximum value and has a widely dispersed image when the cross-correlation measure is the reference value, and, for a cross-correlation measure of the input signals having values in a second range, bounded by the reference value and a minimum value, the soundfield has the widely dispersed image when the cross-correlation measure is the reference value and has a plurality of compact sound images, each in a direction associated with an input signal, when the cross-correlation measure is the minimum value.

5. A process according to claim 4, characterized in that the reference value is approximately the value of a measure of the cross-correlation of the input signals for the case of equal energy in each of the outputs.

6. A process according to claim 3, characterized in that a measure of the relative levels of the input signals is responsive to a smoothed energy level of each input signal.

7. A process according to claim 3 or claim 6, characterized in that a measure of the relative levels of the input signals is a nominal ongoing primary direction of the input signals.

8. A process according to claim 3, characterized in that a measure of the cross-correlation of the input signals is responsive to a smoothed common energy of the input signals divided by the Mth root of the product of the smoothed energy levels of the input signals, where M is the number of inputs.

9. A process according to any of claims 6, 7 or 8, characterized in that the smoothed energy level of each input signal is obtained by constant and variable time domain smoothing.
10. A process according to any of claims 6, 7 or 8, characterized in that the smoothed energy level of each input signal is obtained by frequency domain smoothing and constant and variable time domain smoothing.

11. A process according to claim 8, characterized in that the common energy of the input signals is obtained by cross-multiplying the input amplitude levels.

12. A process according to claim 11, characterized in that the smoothed common energy of the input signals is obtained by constant and variable time domain smoothing of the common energy of the input signals.

13. A process according to claim 12, characterized in that the smoothed energy level of each input signal is obtained by constant and variable time domain smoothing.

14. A process according to claim 11, characterized in that the smoothed common energy of the input signals is obtained by frequency domain smoothing and constant and variable time domain smoothing of the common energy of the input signals.

15. A process according to claim 14, characterized in that the smoothed energy level of each input signal is obtained by frequency domain smoothing and constant and variable time domain smoothing.

16. A process according to any of claims 9, 10, 12, 13, 14 and 15, characterized in that the constant and variable time domain smoothing is carried out by a smoother having both a fixed time constant and a variable time constant.

17. A process according to any of claims 9, 10, 12, 13, 14 and 15, characterized in that the constant and variable time domain smoothing is carried out by a smoother having only a variable time constant.

18. A process according to claim 16 or claim 17, characterized in that the variable time constant is variable in steps.

19. A process according to claim 16 or claim 17, characterized in that the variable time constant is continuously variable.

20. A process according to claim 16 or claim 17, characterized in that the variable time constant is controlled in response to measures of the relative levels of the input signals and their cross-correlation.

21. A process according to claim 6, characterized in that the smoothed energy level of each input signal is obtained by constant and variable time domain smoothing, the energy levels of each input signal being smoothed with substantially the same time constant.

22. A process according to claim 3, characterized in that the measures of the relative levels of the input signals and of their cross-correlation are each obtained by constant and variable time domain smoothing in which the same time constant is applied to each smoothing.

23. A process according to claim 8, characterized in that the cross-correlation measure is a first cross-correlation measure of the input signals, and a further cross-correlation measure is obtained by applying a measure of the relative levels of the input signals to the first cross-correlation measure to produce a direction-weighted cross-correlation measure.

24. A process according to claim 23, characterized in that yet a further cross-correlation measure of the input signals is obtained by applying a scale factor approximately equal to a value of a cross-correlation measure of the input signals for the case of equal energy in each of the outputs.
25. A process for converting M audio input signals, each associated with a direction, into N audio output signals, each associated with a direction, wherein N is greater than M and M is three or more, characterized in that it comprises: providing a plurality of variable m:n matrices, where m is a subset of M and n is a subset of N, applying a respective subset of the M audio input signals to each of the variable matrices, deriving a respective subset of the N audio output signals from each of the variable matrices, controlling each of the variable matrices in response to the subset of input signals applied to it, such that a soundfield generated by the respective subset of output signals derived from it has a compact sound image in the nominal ongoing primary direction of the subset of input signals applied to it when those input signals are highly correlated, the image spreading from compact to broad as the correlation decreases and progressively splitting into multiple compact sound images, each in a direction associated with an input signal applied to it, as the correlation continues to decrease to highly uncorrelated, and deriving the N audio output signals from the subsets of N audio output channels.

26. A process according to claim 25, characterized in that the variable matrices are also controlled in response to information that compensates for the effect of one or more other variable matrices receiving the same input signal.

27. A process according to claim 25 or claim 26, characterized in that the derivation of the N audio output signals from the subsets of N audio output channels includes compensating for multiple variable matrices producing the same output signal.

28. A process according to any of claims 25-27, characterized in that each of the variable matrices is controlled in response to measures of: (a) the relative levels of the input signals applied to it, and (b) the cross-correlation of those input signals.
29. A process for converting M audio input signals, each associated with a direction, into N audio output signals, each associated with a direction, wherein N is greater than M and M is three or more, characterized in that it comprises: providing a variable M:N matrix responsive to scale factors that control the matrix coefficients or that control the outputs of the matrix; applying the M audio input signals to the variable matrix; providing a plurality of m:n variable matrix scale factor generators, where m is a subset of M and n is a subset of N; applying a respective subset of the M audio input signals to each of the variable matrix scale factor generators; deriving a set of variable matrix scale factors, for respective subsets of the N audio output signals, from each of the variable matrix scale factor generators; controlling each of the variable matrix scale factor generators in response to the subset of input signals applied to it, such that, when the scale factors it generates are applied to the variable M:N matrix, a soundfield generated by the respective subset of output signals produced has a compact sound image in the nominal ongoing primary direction of the subset of input signals that produced the applied scale factors when those input signals are highly correlated, the image spreading from compact to broad as the correlation decreases and progressively splitting into multiple compact sound images, each in a direction associated with an input signal that produced the applied scale factors, as the correlation continues to decrease to highly uncorrelated; and deriving the N audio output signals from the variable matrix.

30. A process according to claim 29, characterized in that the variable matrix scale factor generators are also controlled in response to information that compensates for the effect of one or more other variable matrix scale factor generators receiving the same input signal.

31. A process according to claim 29 or claim 30, characterized in that the derivation of the N audio output signals from the variable matrix includes compensating for multiple variable matrix scale factor generators producing scale factors for the same output signal.

32. A process according to any of claims 29-31, characterized in that each of the variable matrix scale factor generators is controlled in response to measures of: (a) the relative levels of the input signals applied to it, and (b) the cross-correlation of those input signals.
MXPA05001413A 2002-08-07 2003-08-06 Audio channel spatial translation. MXPA05001413A (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US40198302P 2002-08-07 2002-08-07
PCT/US2003/024570 WO2004019656A2 (en) 2001-02-07 2003-08-06 Audio channel spatial translation

Publications (1)

Publication Number Publication Date
MXPA05001413A true MXPA05001413A (en) 2005-06-06

Family

ID=33489220

Family Applications (1)

Application Number Title Priority Date Filing Date
MXPA05001413A MXPA05001413A (en) 2002-08-07 2003-08-06 Audio channel spatial translation.

Country Status (17)

Country Link
EP (1) EP1527655B1 (en)
JP (1) JP4434951B2 (en)
KR (1) KR100988293B1 (en)
CN (1) CN1672464B (en)
AT (1) ATE341923T1 (en)
AU (1) AU2003278704B2 (en)
BR (2) BR0305746A (en)
CA (1) CA2494454C (en)
DE (1) DE60308876T2 (en)
DK (1) DK1527655T3 (en)
ES (1) ES2271654T3 (en)
HK (1) HK1073963A1 (en)
IL (1) IL165941A (en)
MX (1) MXPA05001413A (en)
MY (1) MY139849A (en)
PL (1) PL373120A1 (en)
TW (1) TWI315828B (en)

Families Citing this family (36)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7508947B2 (en) * 2004-08-03 2009-03-24 Dolby Laboratories Licensing Corporation Method for combining audio signals using auditory scene analysis
JP5101292B2 (en) 2004-10-26 2012-12-19 ドルビー ラボラトリーズ ライセンシング コーポレイション Calculation and adjustment of audio signal's perceived volume and / or perceived spectral balance
ATE406075T1 (en) * 2004-11-23 2008-09-15 Koninkl Philips Electronics Nv DEVICE AND METHOD FOR PROCESSING AUDIO DATA, COMPUTER PROGRAM ELEMENT AND COMPUTER READABLE MEDIUM
TWI397901B (en) * 2004-12-21 2013-06-01 Dolby Lab Licensing Corp Method for controlling a particular loudness characteristic of an audio signal, and apparatus and computer program associated therewith
JP5461835B2 (en) 2005-05-26 2014-04-02 エルジー エレクトロニクス インコーポレイティド Audio signal encoding / decoding method and encoding / decoding device
JP2009500656A (en) 2005-06-30 2009-01-08 エルジー エレクトロニクス インコーポレイティド Apparatus and method for encoding and decoding audio signals
JP5006315B2 (en) 2005-06-30 2012-08-22 エルジー エレクトロニクス インコーポレイティド Audio signal encoding and decoding method and apparatus
WO2007004830A1 (en) 2005-06-30 2007-01-11 Lg Electronics Inc. Apparatus for encoding and decoding audio signal and method thereof
KR100880642B1 (en) 2005-08-30 2009-01-30 엘지전자 주식회사 Method and apparatus for decoding an audio signal
JP4859925B2 (en) 2005-08-30 2012-01-25 エルジー エレクトロニクス インコーポレイティド Audio signal decoding method and apparatus
US7788107B2 (en) 2005-08-30 2010-08-31 Lg Electronics Inc. Method for decoding an audio signal
WO2007027050A1 (en) 2005-08-30 2007-03-08 Lg Electronics Inc. Apparatus for encoding and decoding audio signal and method thereof
US8068569B2 (en) 2005-10-05 2011-11-29 Lg Electronics, Inc. Method and apparatus for signal processing and encoding and decoding
KR100857115B1 (en) 2005-10-05 2008-09-05 엘지전자 주식회사 Method and apparatus for signal processing and encoding and decoding method, and apparatus therefor
US7696907B2 (en) 2005-10-05 2010-04-13 Lg Electronics Inc. Method and apparatus for signal processing and encoding and decoding method, and apparatus therefor
US7751485B2 (en) 2005-10-05 2010-07-06 Lg Electronics Inc. Signal processing using pilot based coding
WO2007040360A1 (en) 2005-10-05 2007-04-12 Lg Electronics Inc. Method and apparatus for signal processing and encoding and decoding method, and apparatus therefor
US7646319B2 (en) 2005-10-05 2010-01-12 Lg Electronics Inc. Method and apparatus for signal processing and encoding and decoding method, and apparatus therefor
US7672379B2 (en) 2005-10-05 2010-03-02 Lg Electronics Inc. Audio signal processing, encoding, and decoding
US7653533B2 (en) 2005-10-24 2010-01-26 Lg Electronics Inc. Removing time delays in signal paths
US7965848B2 (en) 2006-03-29 2011-06-21 Dolby International Ab Reduced number of channels decoding
US8583424B2 (en) * 2008-06-26 2013-11-12 France Telecom Spatial synthesis of multichannel audio signals
CN104837107B (en) 2008-12-18 2017-05-10 杜比实验室特许公司 Audio channel spatial translation
JP5314129B2 (en) * 2009-03-31 2013-10-16 パナソニック株式会社 Sound reproducing apparatus and sound reproducing method
CN101527874B (en) * 2009-04-28 2011-03-23 张勤 Dynamic sound field system
TWI444989B (en) 2010-01-22 2014-07-11 Dolby Lab Licensing Corp Using multichannel decorrelation for improved multichannel upmixing
US9008338B2 (en) * 2010-09-30 2015-04-14 Panasonic Intellectual Property Management Co., Ltd. Audio reproduction apparatus and audio reproduction method
US9781510B2 (en) * 2012-03-22 2017-10-03 Dirac Research Ab Audio precompensation controller design using a variable set of support loudspeakers
EP2830051A3 (en) 2013-07-22 2015-03-04 Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. Audio encoder, audio decoder, methods and computer program using jointly encoded residual signals
EP2830332A3 (en) 2013-07-22 2015-03-11 Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. Method, signal processing unit, and computer program for mapping a plurality of input channels of an input channel configuration to output channels of an output channel configuration
US10567903B2 (en) * 2015-06-24 2020-02-18 Sony Corporation Audio processing apparatus and method, and program
CN106604199B (en) * 2016-12-23 2018-09-18 湖南国科微电子股份有限公司 A kind of matrix disposal method and device of digital audio and video signals
US11019449B2 (en) * 2018-10-06 2021-05-25 Qualcomm Incorporated Six degrees of freedom and three degrees of freedom backward compatibility
CN117953905A (en) 2018-12-07 2024-04-30 弗劳恩霍夫应用研究促进协会 Apparatus, method for generating sound field description from signal comprising at least one channel
TWI740206B (en) * 2019-09-16 2021-09-21 宏碁股份有限公司 Correction system and correction method of signal measurement
CN114327040A (en) * 2021-11-25 2022-04-12 歌尔股份有限公司 Vibration signal generation method, device, electronic device and storage medium

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5659619A (en) * 1994-05-11 1997-08-19 Aureal Semiconductor, Inc. Three-dimensional virtual audio display employing reduced complexity imaging filters
US6009179A (en) * 1997-01-24 1999-12-28 Sony Corporation Method and apparatus for electronically embedding directional cues in two channels of sound
US6072878A (en) * 1997-09-24 2000-06-06 Sonic Solutions Multi-channel surround sound mastering and reproduction techniques that preserve spatial harmonics
AUPP271598A0 (en) * 1998-03-31 1998-04-23 Lake Dsp Pty Limited Headtracked processing for headtracked playback of audio signals
EP1054575A3 (en) * 1999-05-17 2002-09-18 Bose Corporation Directional decoding

Also Published As

Publication number Publication date
JP4434951B2 (en) 2010-03-17
CA2494454A1 (en) 2004-03-04
CN1672464B (en) 2010-07-28
ES2271654T3 (en) 2007-04-16
CA2494454C (en) 2013-10-01
PL373120A1 (en) 2005-08-08
DE60308876D1 (en) 2006-11-16
CN1672464A (en) 2005-09-21
JP2005535266A (en) 2005-11-17
TW200404222A (en) 2004-03-16
BRPI0305746B1 (en) 2018-03-20
EP1527655A2 (en) 2005-05-04
DE60308876T2 (en) 2007-03-01
KR20050035878A (en) 2005-04-19
EP1527655B1 (en) 2006-10-04
AU2003278704A1 (en) 2004-03-11
TWI315828B (en) 2009-10-11
IL165941A (en) 2010-06-30
AU2003278704B2 (en) 2009-04-23
DK1527655T3 (en) 2007-01-29
IL165941A0 (en) 2006-01-15
BR0305746A (en) 2004-12-07
ATE341923T1 (en) 2006-10-15
MY139849A (en) 2009-11-30
HK1073963A1 (en) 2005-10-21
KR100988293B1 (en) 2010-10-18

Similar Documents

Publication Publication Date Title
US11805379B2 (en) Audio channel spatial translation
MXPA05001413A (en) Audio channel spatial translation.
US7660424B2 (en) Audio channel spatial translation
JP6818841B2 (en) Generation of binaural audio in response to multi-channel audio using at least one feedback delay network
US11582574B2 (en) Generating binaural audio in response to multi-channel audio using at least one feedback delay network
WO2004019656A2 (en) Audio channel spatial translation
KR101341523B1 (en) Method to generate multi-channel audio signals from stereo signals
KR20240017091A (en) Method for and apparatus for decoding an ambisonics audio soundfield representation for audio playback using 2d setups
WO2015102920A1 (en) Generating binaural audio in response to multi-channel audio using at least one feedback delay network
US7330552B1 (en) Multiple positional channels from a conventional stereo signal pair
EP3488623A1 (en) Audio object clustering based on renderer-aware perceptual difference
WO2018017394A1 (en) Audio object clustering based on renderer-aware perceptual difference

Legal Events

Date Code Title Description
FG Grant or registration