WO2018194501A1 - Stereo unfold with psychoacoustic grouping phenomenon - Google Patents
Stereo unfold with psychoacoustic grouping phenomenon
- Publication number
- WO2018194501A1 (application PCT/SE2018/050300)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- sound
- stereo
- feeds
- unfolded
- unfold
- Prior art date
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04S—STEREOPHONIC SYSTEMS
- H04S1/00—Two-channel systems
- H04S1/007—Two-channel systems in which the audio signals are in digital form
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04S—STEREOPHONIC SYSTEMS
- H04S3/00—Systems employing more than two channels, e.g. quadraphonic
- H04S3/002—Non-adaptive circuits, e.g. manually adjustable or static, for enhancing the sound image or the spatial distribution
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
- G10L19/02—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using spectral analysis, e.g. transform vocoders or subband vocoders
- G10L19/022—Blocking, i.e. grouping of samples in time; Choice of analysis windows; Overlap factoring
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/04—Time compression or expansion
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04S—STEREOPHONIC SYSTEMS
- H04S7/00—Indicating arrangements; Control arrangements, e.g. balance control
- H04S7/30—Control circuits for electronic adaptation of the sound field
- H04S7/307—Frequency adjustment, e.g. tone control
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04S—STEREOPHONIC SYSTEMS
- H04S2400/00—Details of stereophonic systems covered by H04S but not provided for in its groups
- H04S2400/11—Positioning of individual sound objects, e.g. moving airplane, within a sound field
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04S—STEREOPHONIC SYSTEMS
- H04S2400/00—Details of stereophonic systems covered by H04S but not provided for in its groups
- H04S2400/13—Aspects of volume control, not necessarily automatic, in stereophonic sound systems
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04S—STEREOPHONIC SYSTEMS
- H04S2400/00—Details of stereophonic systems covered by H04S but not provided for in its groups
- H04S2400/15—Aspects of sound capture and related signal processing for recording or reproduction
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04S—STEREOPHONIC SYSTEMS
- H04S3/00—Systems employing more than two channels, e.g. quadraphonic
- H04S3/02—Systems employing more than two channels, e.g. quadraphonic of the matrix type, i.e. in which input signals are combined algebraically, e.g. after having been phase shifted with respect to each other
Definitions
- the original Stereo Unfold Technology improved upon ordinary stereo reproduction of sound by the use of DSP algorithms to extract information from normal stereo recordings and playback of the additional information layered in time through speaker drivers aiming not only forward but also in other directions.
- the Stereo Unfold Technology creates a real believable three-dimensional soundstage populated with three-dimensional sound sources generating sound in a continuous real sounding acoustic environment and it constitutes a significant improvement compared to ordinary stereo reproduction.
- the new Enhanced Version of Stereo Unfold can now be used both with additional drivers that aim in other directions than forward towards the listener and without.
- the new Enhanced Version of Stereo Unfold is applicable on all types of existing standard loudspeakers and also on headphone listening.
- With forward-only speaker drivers it now delivers at least the same amount of improvement as the old method; with additional drivers it improves even further.
- On headphones it manages to move the perceived soundstage from within the listener's head, on a string between the ears, to the outside of the head. It does so without any prior information about the listener's physical properties, i.e. shape and size of ears, head and shoulders.
- the first group deals with the technical problem of having two speakers closely spaced and tries to achieve a result similar to having widely spaced stereo speakers.
- the second group tries to replicate a surround sound field in the listening room using only one speaker instead of several.
- the third group tries to improve the perceived ambiance when listening to stereo but is unsuccessful due to inadequate processing and does not address the psychoacoustic problems inherent to stereo. None of the above prior art groups deal with the general shortcomings of stereo, why stereo as a method is flawed and how the stereo technique can be improved.
- the Stereo Unfold Technology aims to solve these inherent problems within the stereo technique.
- the Stereo Unfold Technology recreates a continuous spatial 3D sound field that is similar to a real acoustic event. Ordinary stereo reproduction can at best project a sound stage but the sound sources within that soundstage sound like they are paper cutouts of performers without any individual extension in depth and the paper cutouts perform in solitude without being in an acoustic space, much like flashlights suspended in a black room.
- a surround sound system is at its core an extension of stereo with the same limitations as stereo. With the use of additional speakers located around the room it can create position information not only from the front between the left and right speakers but also other locations around in the room.
- Stereo Unfold has specifically been achieved through an understanding of the psychoacoustic grouping phenomenon and the spatial sound processing in the human brain. It is an entirely different method, and the result is a spatial 3D sound field that is audibly like a live acoustic event.
- Stereo Unfold processing increases the soundstage size but does so by adding back the missing ambient information from the acoustic environment in which the recording took place, or the artificially created ambience if it is computationally or otherwise added to a recording.
- In US5671287 there is disclosed a method to create a directionally spread sound source which is predominantly aimed at processing mono signal sources to create a pseudo stereo signal.
- the method disclosed in US5671287 is not at all the same as the Stereo Unfold method according to the present invention and further disclosed below. Moreover, the targets of the present invention and US5671287 are entirely unrelated.
- EP0276159 discloses a method to create artificial localization cues to improve sound immersion with headphones.
- the disclosed method uses common Head Related Transfer functions to create the directional cues and mentions addition of early and later reflections.
- the Stereo Unfold method according to the present invention restores the naturally occurring ambient to direct sound ratio in recordings by extracting the ambient information from the recording and then adding it back using a signal processing method that facilitates psychoacoustic grouping.
- both the targets and methods of the present invention and EP0276159 are completely different.
- Stereo Unfold according to the present invention is not targeting improvements of localization or widening of the stereo playback soundstage.
- the individual signal sources (performers) in the reproduced recording do not, after Stereo Unfold processing, change localization within the soundstage to a great degree.
- the relatively small change of localization that does occur is a byproduct of the processing but not the goal.
- the goal is to recreate the direct to ambient sound ratio to achieve a more natural sounding recording.
- the mono process can at best provide some perceived depth and height of the soundstage projected in front of the listener but it is basically unable to convey any localization clues for the individual sound sources in the recording.
- the limited soundstage that is available is created by reflections from surfaces in the listening room. The reflections create an illusion of a cloud of sound around the single loudspeaker source. This can easily be verified by listening to mono in an anechoic environment where the cloud disappears.
- Stereo was an unfolded version of mono, unfolded in the physical horizontal plane by the use of two loudspeakers. It allowed localization of sound sources horizontally anywhere between the loudspeakers. When stereo is properly recorded and played back on loudspeakers it manages to create a relatively continuous horizontal plane of sound in front of the listener with some height and depth present. The listener's brain is fooled by the process into believing that there are multiple sound sources in front of him/her despite the fact that all sound only emanates from two speakers. Stereo played back through loudspeakers makes use of psychoacoustics to create an illusion of a soundstage populated by multiple sound sources at different horizontal locations in front of the listener. As with mono, reflected sound from the loudspeakers, reflected by the surfaces within the listening room, creates the illusion of a soundstage in front of the listener. Without these reflections, the sound would be perceived as emanating from inside of the listener's head.
- the reason for this phenomenon is that stereo recordings only contain left to right localization clues and are missing all additional spatial information [5].
- the stereo process doesn't provide any psychoacoustic clues that would enable a human brain to figure out any other spatial information than left to right localization. This is easy to test by listening to a stereo recording using headphones: the sound is invariably located inside the listener's head between the ears. With a pair of highly directional speakers, parabolic speakers or speakers in an anechoic room, the sound stage is similarly located within the listener's head.
- stereo reproduction can project a sound stage with depth, width and height.
- the sound sources within that soundstage unfortunately sound like they were paper cutouts of performers without any individual extension in depth.
- the paper cutouts perform in solitude without being in an acoustic space, almost like flashlights suspended in a black room projecting their sound only straight forward towards the listener.
- There is some ambience information present in stereo reproduction that allows us to hear the acoustic surroundings in which the recording was made but it is not anything remotely resembling the acoustics of a real space.
- Picture 1 of a symphony orchestra and two speakers tries to visually illustrate the sound from stereo. Most of the soundstage is perceived as being in between the two speakers with a little bit of height and depth and virtually no acoustic surrounding.
- the Stereo Unfold Technology creates a real believable three-dimensional soundstage populated with three-dimensional sound sources generating sound in a continuous real sounding acoustic environment.
- Picture 2 tries to visually illustrate the perceived soundstage from Stereo Unfold and it should be compared to Picture 1 that is illustrating ordinary stereo.
- the performers are located at approximately the same locations, somewhat enlarged in size; the hall and ambience are added, providing the predominant enlargement as well as a 3D quality to the sound.
- Stereo Unfold is unfolding the ordinary stereo recording much like mono once was unfolded physically into left/right stereo but this time stereo is unfolded in the dimension of time.
- the jump from stereo to Stereo Unfold is psychoacoustically actually not much different from physically unfolding mono into stereo. This might sound inexplicable but let's take a closer look at stereo and how it works psychoacoustically and it will become apparent that it's not.
- the localization of sound sources from left to right in stereo playback works through two main psychoacoustic phenomena.
- Our ear-brain system judges horizontal localization of a sound source based on interaural time differences and perceived level differences between the left and right ear. It is possible to pan a sound source from left to right by adjusting the level from the source in the right and left ear respectively. This is usually referred to as level panning. It is also possible to adjust the localization by changing the arrival time to the left and right ear, and this panning method is the more effective of the two. It is easy to test the effectiveness of panning through interaural time difference. Set up a stereo speaker pair in front of a listener and allow the listener to move away to the left or right from the centrally located position between the speakers.
- the perceived soundstage rather quickly collapses towards one of the stereo speakers because the interaural time difference psychoacoustically tells us that the closer speaker is the source.
- the same can be illustrated using headphones: by delaying the stereo signal to one of the ears, the whole soundstage collapses towards the non-delayed ear without any change in level.
- Localization in the horizontal plane in stereo is actually predominantly caused by the interaural time difference between the left and right signals, i.e. stereo is a mono signal unfolded in time to generate psychoacoustic horizontal localization clues based on time differences between the ears.
- Blumlein used the physical separation of two speakers to be able to create the necessary inter aural time difference for the creation of left to right localization.
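The headphone experiment described above (delaying one ear without changing level) can be sketched in a few lines. This is a minimal illustration, not the patent's processing: the function names `delay_channel` and `delay_one_ear`, the 48 kHz sample rate, the 1 ms delay and the click test signal are all assumptions chosen for the example.

```python
def delay_channel(samples, delay_ms, sample_rate):
    """Delay a channel by delay_ms, keeping its length and its level unchanged."""
    n = round(delay_ms * sample_rate / 1000.0)
    if n <= 0:
        return list(samples)
    # Pad with leading silence, then trim so the length stays the same.
    return ([0.0] * n + list(samples))[:len(samples)]


def delay_one_ear(left, right, delay_ms, sample_rate, ear="right"):
    """Return (left, right) with one ear delayed; no gain is applied."""
    if ear == "right":
        return list(left), delay_channel(right, delay_ms, sample_rate)
    if ear == "left":
        return delay_channel(left, delay_ms, sample_rate), list(right)
    raise ValueError("ear must be 'left' or 'right'")


# Hypothetical test signal: an identical click in both channels.
SAMPLE_RATE = 48000
left = [1.0] + [0.0] * 99
right = [1.0] + [0.0] * 99

# Delay only the right ear by 1 ms (an assumed value for illustration).
left_out, right_out = delay_one_ear(left, right, delay_ms=1.0,
                                    sample_rate=SAMPLE_RATE)
```

Both outputs keep the same peak level; only the arrival time of the right channel changes, which is the condition under which the text says the soundstage collapses towards the non-delayed ear.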
- Picture 3 shows one channel of an ordinary digital stereo sound recording. Along the axis starting to the left and ending in the middle of the picture we have sound samples on the real time domain axis.
- the graph displays the absolute value of the sound signal at each instant in time, height corresponding to level.
- Along the axis from the right to the middle of the picture we have the second dimension of time. In the original stereo recording there is no additional information in this dimension since stereo is just a two-dimensional process only containing left and right signals.
- Picture 4 shows the same digital stereo sound recording as Picture 3. The difference is that it has now been Stereo Unfold processed. It has been unfolded in time, and along the axis from right to center we can now see how the signal at each instant in time is unfolded into the secondary time dimension. In the diagram it can be observed that the signal is unfolded by an unfold process using 20 discrete unfold signal feeds along the secondary time axis.
- the concept of the 3D-graph in Picture 4 is perhaps somewhat strange on first look but it very much resembles how the human brain interprets sound. A sound heard at a certain point in time is tracked by the brain along the secondary time axis and all information from the onset of the original signal up to the end in the diagram is used by the brain to obtain information about the sound.
- the brain tries to make sense of our sound environment in much the same way as our vision. It simplifies the sound environment by creating objects and assigning particular sounds to each object [2]. We hear the doorbell as an object together with the attendant reverberation, when a person walks across the room we assign all the sounds from the movement to the person etc.
- An example from our visual perception and grouping perhaps makes the details easier to understand.
- the graph in Picture 4 has two time dimensions; during processing, the additional second time dimension in the matrix is folded into the real time dimension.
- stereo has its roots in the lack of spatial information within the recording and reproduction chain.
- a recording engineer would not place a recording microphone at a typical listening position in a concert hall. He would invariably move the microphone much closer to the performers. If the microphone was located out in the hall where the audience usually sits, the recording would sound excessively reverberant and unnatural. This happens because the stereo recording fails to capture the spatial information properties from the sound field in the hall. It only captures the sound pressure level.
- a human listener in the hall would capture all of the information, both sound pressure and spatial, and would automatically use the spatial information to focus his/her attention to the performers on stage and as input to the psychoacoustic grouping process discussed later.
- the ambient sound field is reaching the listener from other directions and is perceptibly observed differently by the brain compared to the sound from the stage. Since spatial information is missing from stereo recordings the listener can't use spatial information to decode the sound and therefore, if the recording was made at the listening position in the hall, it would be perceived as having copious amounts of reverberant energy.
- the human brain uses both the spatial domain as well as the sound pressure domain to understand and process the sound environment.
- Picture 5 shows two cross sections of two rooms.
- the larger room is a typical concert hall with a stage section to the left and the audience space to the right.
- the sound emanates from the performer on stage travelling along a number of imaginable paths illustrated in the picture.
- the direct sound travels directly from the performer to the listener without reflecting off any surfaces within the hall. As can be seen, the path of the direct sound is much shorter than the path of the first reflection reaching the listener, which creates an appreciable arrival time difference.
- the smaller room at the bottom in Picture 5 is a typical listening room with a loudspeaker to the left and a listener to the right. Again soundwave paths are illustrated in the picture with a direct path and reflected paths. In the smaller room the path length difference between the direct sound and the first reflection is smaller than in the larger hall which translates into a smaller arrival time difference.
- the larger hall has a much longer reverberation time than the small room.
- sound has to travel a longer distance before reaching the next reflecting surface that absorbs energy from the sound field and thus the sound lingers for a longer period of time in the larger space.
- Picture 6 illustrates the arrival of sound at the listener's ears in five different diagrams. Along the X-axis is time and on the Y-axis is level. The five diagrams show reverberant decay spectra from an impulse sound.
- Diagram 1 is from the concert hall in Picture 5.
- Diagram 2 is from the listening room in Picture 5.
- Diagram 3 is a stereo recording made in the concert hall shown in diagram 1.
- Diagram 4 is the stereo recording played back in the listening room.
- Diagram 5 shows the stereo recording played back in the listening room after it has been Stereo Unfold processed.
- the first peak to the left is the direct sound arriving from the performer to the listener.
- the next peak is the first reflection arriving after a certain time delay.
- Then follow the later reflections: first those that have bounced off only one surface, sparsely spaced, followed by an increasingly dense array of reflections from multiple bounces. This is a typical impulse response decay observable in many halls.
- the second diagram in Picture 6 shows the same kind of sound arrival as the first diagram, but now it is shown from the typical listening room in Picture 5. Again we have the direct sound, the first peak, followed by the early, somewhat sparsely spaced reflections and the denser multiple reflection paths that follow. The sound in the small room is absorbed more quickly than in the hall, which is clearly illustrated by comparing the sound decay in diagrams one and two in Picture 6.
- the most critical difference between the hall and the room is the timing of the first reflection in relation to the direct sound. It's well known from concert hall acoustics that there should be about 25 ms to 35 ms between the direct sound arrival and the first reflection to maintain clarity and intelligibility of the sound in the hall. If this time is reduced the sound becomes less clear and imprecise, even to a point where it becomes fatiguing. The small room isn't physically large enough to provide us with this amount of delay, and therefore added ambient energy in the room invariably makes the sound less clear.
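The relation between path length and arrival time described above can be worked through numerically. The sketch below is only an illustration: the speed of sound (343 m/s, roughly dry air at 20 degrees C), the function names and the example path lengths for the hall and the room are assumptions, not figures from the text.

```python
SPEED_OF_SOUND_M_PER_S = 343.0  # assumption: dry air at about 20 degrees C


def first_reflection_gap_ms(direct_path_m, reflected_path_m,
                            c=SPEED_OF_SOUND_M_PER_S):
    """Arrival time difference between direct sound and a reflection, in ms."""
    if reflected_path_m < direct_path_m:
        raise ValueError("a reflected path cannot be shorter than the direct path")
    return (reflected_path_m - direct_path_m) / c * 1000.0


def path_difference_for_gap_m(gap_ms, c=SPEED_OF_SOUND_M_PER_S):
    """Extra path length a reflection must travel to arrive gap_ms late."""
    return gap_ms / 1000.0 * c


# Hypothetical geometries, chosen only to show the order of magnitude.
hall_gap_ms = first_reflection_gap_ms(direct_path_m=20.0, reflected_path_m=30.0)
room_gap_ms = first_reflection_gap_ms(direct_path_m=3.0, reflected_path_m=4.5)

# Extra path needed for the 25 ms and 35 ms gaps quoted in the text.
extra_path_25ms_m = path_difference_for_gap_m(25.0)
extra_path_35ms_m = path_difference_for_gap_m(35.0)
```

Under these assumptions a 25 ms to 35 ms gap needs a reflected path roughly 8.6 m to 12 m longer than the direct path, which is why a small listening room cannot provide it by geometry alone.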
- Picture 6 diagram 3 illustrates the reverberant decay in a stereo recording captured in the hall illustrated in Picture 5.
- Picture 6 diagram 4 shows what happens when the recording shown in Picture 6 diagram 3 is played back by the speakers and room with a reverberant decay illustrated in Picture 6 diagram 2.
- the recorded reverberant decay becomes superimposed on the room reverberant decay resulting in the composite reverberant decay in Picture 6 diagram 4. This still doesn't at all look like the reverberant decay of the hall in Picture 6 diagram 1 but it's the decay typically found in a listening room upon playback of a stereo recording.
- Given that stereo sound lacks all spatial information, that the spatial sound field is only created within the listening room by the speakers and the room together, and that the decay pattern looks very dissimilar to what naturally occurs in a music hall, missing about 12 dB of reverberant energy, it's not very surprising that stereo sounds artificial.
- Stereo Unfold addresses the two fundamental limitations in stereo by regenerating a psychoacoustically based spatial 3D sound field that the human brain can easily interpret and by utilizing the psychoacoustic effect called psychoacoustic grouping.
- Stereo Unfold creates a spatial 3D sound field in the listening room through the use of additional drivers in other directions than forward together with basic grouping of the spatial field and the direct sound.
- Stereo Unfold uses the disclosed enhanced grouping method together with ordinary loudspeakers.
- the forward radiating loudspeaker essentially first plays back the stereo information and then later the grouped spatial information to recreate the spatial field without the use of additional drivers aimed in other directions than forward. This is possible through the use of the enhanced grouping process that uses the later described sympathetic grouping method.
- Stereo Unfold creates a spatial 3D sound field in the listening room through the use of additional drivers in other directions than forward together with enhanced grouping of the spatial field and the direct sound. This implementation recreates the best illusion but needs additional drivers and is thus somewhat limited in its applicability compared to the second implementation.
- Stereo Unfold processing creates a spatial 3D sound field with headphones using the enhanced grouping process. The direct and ambient sound fields are connected through enhanced grouping which moves the sound experience from the common within the listener's head to outside of the listener's head. It does so without any prior information about the listener's physical properties, i.e. shape and size of ears, head and shoulders.
- the Stereo Unfold EV DSP extraction process creates additional basic L+R, L-R and R-L feeds that are used as building blocks together with the original L and R channels in the unfold processing.
- the equations for the basic feeds (Fx) are shown below. Gx, Dx and Frx denote gain, delay and frequency shaping respectively. Gfx are gain multipliers that adjust the level of the forward main output to maintain the same perceived output level after the Stereo Unfold EV processing, and Frfx are frequency shaping filters that can be modified to maintain the overall tonal balance of the forward direct sound.
- the Gx gain multipliers can be any number between 0 and infinity.
- the frequency shaping, Frx, predominantly limits the frequency range to above 50 Hz and rolls off frequencies above 7 kHz to emulate typical reverberant field energy in a concert hall and naturally occurring absorption of higher frequencies in air.
- the preferred frequency range being 100 Hz to 4 kHz. It also contours the response to follow the roll-off in an ambient sound field similar to what's naturally occurring in concert halls.
- the delays D1 and D2 are between 0 ms and 3 ms; the rest of the Dx are at least 5 ms and up to 50 ms, with a preferred range of 10 ms to 40 ms and a further preferred range of 15 ms to 35 ms.
- the shown basic feeds F3-F9 can each become several input feeds to the processing with different Gx, Frx and Dx settings.
- a reference to any of the feeds F3 to F9 denotes at least one but can also be two, three, four, five or several more of the same basic feed with different Gx, Frx, and Dx in each instance.
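The building blocks described above (sum and difference signals, each with its own gain, delay and frequency shaping) can be sketched as follows. This is not a reproduction of the patent's feed equations, which are not included in this text. All function names are hypothetical, the one-pole filters are a stand-in chosen for simplicity rather than the actual Frx shaping, and the default band (100 Hz to 4 kHz) and the 20 ms delay are simply picked from the preferred ranges stated above.

```python
import math


def sum_and_difference(left, right):
    """Return the basic L+R, L-R and R-L signals."""
    l_plus_r = [l + r for l, r in zip(left, right)]
    l_minus_r = [l - r for l, r in zip(left, right)]
    r_minus_l = [r - l for l, r in zip(left, right)]
    return l_plus_r, l_minus_r, r_minus_l


def delay(samples, delay_ms, sample_rate):
    """Delay by delay_ms with leading silence, keeping the original length."""
    n = round(delay_ms * sample_rate / 1000.0)
    return ([0.0] * n + list(samples))[:len(samples)]


def one_pole_lowpass(samples, cutoff_hz, sample_rate):
    """Simple one-pole low-pass (assumed stand-in for the high roll-off)."""
    a = math.exp(-2.0 * math.pi * cutoff_hz / sample_rate)
    y, out = 0.0, []
    for x in samples:
        y = (1.0 - a) * x + a * y
        out.append(y)
    return out


def one_pole_highpass(samples, cutoff_hz, sample_rate):
    """Simple one-pole high-pass: the input minus its low-passed version."""
    low = one_pole_lowpass(samples, cutoff_hz, sample_rate)
    return [x - l for x, l in zip(samples, low)]


def make_feed(source, gain, delay_ms, sample_rate,
              low_hz=100.0, high_hz=4000.0):
    """Hypothetical feed: band-limit, then delay, then scale by gain."""
    shaped = one_pole_lowpass(
        one_pole_highpass(source, low_hz, sample_rate), high_hz, sample_rate)
    return [gain * s for s in delay(shaped, delay_ms, sample_rate)]


# Hypothetical input: a click in the left channel only.
SAMPLE_RATE = 48000
left = [1.0] + [0.0] * 1999
right = [0.0] * 2000
l_plus_r, l_minus_r, r_minus_l = sum_and_difference(left, right)

# One instance of an L-R based feed, delayed 20 ms (assumed setting).
feed_lr = make_feed(l_minus_r, gain=0.5, delay_ms=20.0, sample_rate=SAMPLE_RATE)
feed_rl = make_feed(r_minus_l, gain=0.5, delay_ms=20.0, sample_rate=SAMPLE_RATE)
```

Because the text allows several instances of the same basic feed with different settings, `make_feed` can be called repeatedly on one source with different `gain`, `delay_ms` and band limits.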
- the following signals are played back according to the equations.
- the Stereo Unfold EV feeds without the F1 and F2 components can also be sent to drivers aimed in other directions than directly towards the listener. Additional feeds can be sent in one or all possible extra directions (in, out, up, back and down), using any type of loudspeaker drivers or arrays thereof. Basically any type of constellation that generates a diffuse widespread sound field will work. Additional separate loudspeakers can also be used for the additional feeds, located close to or even possibly attached to the main speakers. Separate loudspeakers can also be located around the room similar to a surround setup, or integrated into the walls and ceiling. Any type of combination of the above is also possible and will work.
- the psychoacoustic grouping phenomenon is core to the Stereo Unfold EV process. Without grouping the brain would not connect the time layered feeds together and they would not provide additional information to the brain, rather the opposite, they would provide confusion and would make the sound less clear and less intelligible. Grouping is easier to describe in an uncomplicated example so let's take a closer look at the Left channel signals in the 3 Unfold feed example above with the output equation;
- grouping occurs at other multiples than 6, i.e. it is possible to use different multiples to create varying audible results.
- a larger multiple would be perceived as creating a more spacious sound up to a point where the sound starts to be perceived as an echo at delays larger than 50ms.
- a lower multiple creates a less spacious sound and if the total delay time is less than 10ms the sound starts to become unclear and difficult for the human brain to separate from the direct sound.
- the F3 feed is needed to group together with F1 and F6 in order to provide phase stabilization to the sound.
- the F6 feed is essentially an L-R feed and as such, if added in significant amounts, it causes a somewhat unpleasant phasiness in the sound, to a certain degree similar to what happens when stereo content is played back with one of the speakers out of phase.
- the F3 feed is provided as a stabilizing element that removes the phasiness; when it is grouped with the F1 and F6 feeds, no phasiness remains.
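- The grouping of F1, F3 and F6 in the Left channel can be illustrated with a minimal sketch. The output equation itself is not reproduced here, so everything specific in the code is an assumption made for illustration: that F3 is derived from L, the gain values, the 12 ms and 24 ms delays, and the helper names `delay` and `left_output`. Frequency shaping (Frx) is left out entirely to keep the example short.

```python
def delay(signal, n_samples):
    """Delay a signal by a whole number of samples, keeping its length."""
    if n_samples <= 0:
        return list(signal)
    return ([0.0] * n_samples + list(signal))[:len(signal)]


def left_output(left, right, fs, g1=1.0, g3=0.5, g6=0.5,
                d3_ms=12.0, d6_ms=24.0):
    """Sum three hypothetical Left-channel feeds.

    F1: direct Left signal (gain g1, no delay).
    F3: delayed Left signal (assumed here to be derived from L).
    F6: delayed L-R difference signal.
    All gains and delays are illustrative values, not taken from the source.
    """
    n3 = round(d3_ms * fs / 1000.0)
    n6 = round(d6_ms * fs / 1000.0)
    f1 = [g1 * x for x in left]
    f3 = delay([g3 * x for x in left], n3)
    f6 = delay([g6 * (l - r) for l, r in zip(left, right)], n6)
    return [a + b + c for a, b, c in zip(f1, f3, f6)]


# An impulse in the Left channel only: three time-layered copies appear.
impulse = [1.0] + [0.0] * 29
silence = [0.0] * 30
out = left_output(impulse, silence, fs=1000)
```

- With identical Left and Right inputs (a mono source), the L-R feed vanishes in this sketch and only the F1 and F3 layers remain.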
- the human brain uses both spatial information and sound pressure information to decode, group and in general make sense of the acoustic environment. If the spatial information is removed by the stereo recording method, the natural grouping process stops working. Normally the ambient sound energy is significantly larger than the direct sound energy, and when the spatial information is lost the brain cannot suppress and handle the ambient sound information in the way it does when it has access to the spatial information.
- the phase relationship between direct and reflected sound is random and depends on the location of the sound source and the listener in relation to surfaces in the environment.
- the brain is able to sort out what is direct and what is reflected sound and to decode them differently. It also adds the different contributing parts of the sound, direct and reflected, together so that they are still perceived as sympathetically grouped, i.e. in phase.
- Live sound from performers and instruments is perceived as full bodied and rich in comparison to stereo recordings made at listening positions. The reason is that with live sound the brain has access to spatial information and adds the grouped sounds together so that they perceptually sound as if they are in phase. When the spatial information is removed it can no longer do that, and the summation of the sounds becomes random in phase. The summation then takes place in the same way as a simple energy addition of sounds with random phase relationships.
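- The difference between in-phase addition and energy addition with random phase can be checked numerically. The sketch below only illustrates the arithmetic of the summation, not the psychoacoustic grouping itself; the number of components, the trial count and the function names are assumptions chosen for the example.

```python
import cmath
import math
import random


def summed_power(phases):
    """Power of the sum of unit-amplitude components with the given phases."""
    total = sum(cmath.exp(1j * p) for p in phases)
    return abs(total) ** 2


def mean_random_phase_power(n_sources, trials=20000, seed=0):
    """Average power of n_sources unit components with uniformly random phases."""
    rng = random.Random(seed)
    acc = 0.0
    for _ in range(trials):
        phases = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(n_sources)]
        acc += summed_power(phases)
    return acc / trials


n = 8
in_phase = summed_power([0.0] * n)        # amplitudes add: power is n squared
random_avg = mean_random_phase_power(n)   # energies add: mean power is close to n
print(10 * math.log10(in_phase), 10 * math.log10(random_avg))
```

- For n equal components, in-phase addition gives 20·log10(n) dB over a single component, whereas random-phase addition gives on average only 10·log10(n) dB.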
- Picture 7 shows a complex summation of sound pressure from multiple sources with random phase relationships similar to what typically occurs in a room.
- the trace in the diagram is one octave smoothed to take away the local cancellation dips and peaks caused by the random summation and show the overall average level at a particular frequency.
- the random summation causes a broad dip in the frequency response in the fundamental frequency range between approximately 120 Hz and 400 Hz. It also creates a broad peak between approximately 400 Hz and 2 kHz. This corresponds very well with the perceived tonal balance of a recording made at the listening position. Normally such a recording sounds as if it were made in a tiled, very reverberant space, lacking fundamental energy and with emphasis in the low to upper midrange.
- Picture 8 shows the same one octave smoothed frequency response with sympathetic grouping applied instead of random phase summation.
- the frequency response is now very even throughout the entire frequency spectrum and there is very little change of tonal balance.
- the response only shows some very small wiggles in the 120 Hz-400 Hz range that will not perceptibly change the tonal balance.
- Picture 9 displays the different sound components in the sympathetic grouping.
- Trace 1 is the direct sound and trace 2 is the ambient sound feed.
- the lower cutoff frequency of the ambient sound feed is about 250 Hz and it is delayed 24 ms, as described in the example earlier.
- the ambient level is brought up to restore the ambient-to-direct sound ratio to the levels that normally occur in an acoustic space.
- the ambient sound is also attenuated at higher frequencies, similarly to the way it normally would be in an acoustic space.
- the direct sound's frequency balance (trace 1) is modified so that the summation of the restored ambient sound and the direct sound becomes even across the whole frequency spectrum.
- Picture 10 again shows trace 1 (direct sound) and trace 2 (ambient information), together with trace 3, which is the complex summation of the two. Trace 3 in Picture 10 was shown individually in Picture 8 above.
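- The complex summation of a direct sound and a delayed ambient feed, and the effect of one-octave smoothing on it, can be sketched as follows. This is a simplified physical model under stated assumptions: the ambient feed is treated as a flat copy of the direct sound with a hypothetical gain of 0.5 and the 24 ms delay from the example, without the 250 Hz cutoff or the high-frequency attenuation described above. The function names are illustrative.

```python
import cmath
import math


def two_path_power(freq_hz, delay_s, ambient_gain):
    """Power of direct sound (unit gain) plus one delayed ambient copy."""
    h = 1.0 + ambient_gain * cmath.exp(-2j * math.pi * freq_hz * delay_s)
    return abs(h) ** 2


def octave_smoothed_power(center_hz, delay_s, ambient_gain, points=2001):
    """Average the power over one octave centred (logarithmically) on center_hz."""
    low = center_hz / math.sqrt(2.0)
    total = 0.0
    for i in range(points):
        freq = low * 2.0 ** (i / (points - 1))   # log-spaced across the octave
        total += two_path_power(freq, delay_s, ambient_gain)
    return total / points


DELAY_S = 0.024   # 24 ms ambient delay, as in the example
GAIN = 0.5        # hypothetical ambient gain

peak = two_path_power(1000.0, DELAY_S, GAIN)            # paths add in phase
dip = two_path_power(24.5 / DELAY_S, DELAY_S, GAIN)     # paths cancel
smoothed = octave_smoothed_power(1000.0, DELAY_S, GAIN)
```

- The unsmoothed response alternates between narrow peaks and dips, while the one-octave average stays close to the sum of the two powers, which is why the smoothed traces show the overall level rather than the local cancellations.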
- Stereo Unfold EV can be applied to a sound recording at any stage. It can be applied on old recordings or it can be applied in the process of making new ones. It can be applied off line as a preprocess that adds the Stereo Unfold EV information to recordings or it can be applied whilst the sound recording is played back.
- the Stereo Unfold EV can then be implemented in any type of preprocessing or playback device imaginable either as hardware, software or firmware as described above.
- Some examples of such devices are active speakers, amplifiers, DA converters, PC music systems, TV sets, headphone amplifiers, smartphones, phones, pads, sound processing units for the mastering and recording industry, software plugins in professional mastering and mixing software, software plugins for media players, processing of streaming media in software players, and preprocessing software modules or hardware units for preprocessing of streaming contents or of any type of recording.
- Stereo Unfold EV reduces the difficulties by offering more information for the brain to decode, and more clues result in greater intelligibility. It is therefore highly likely that the technology would be of great benefit in devices for the hearing impaired, such as hearing aids, cochlear implants, conversation amplifiers etc. Stereo Unfold EV could likely also be applied in PA sound distribution systems to improve intelligibility for everyone in sonically difficult environments such as, but not limited to, train stations and airports. Stereo Unfold EV can offer benefits in all types of applications where the intelligibility of sound is of concern.
- Stereo Unfold EV is just as appropriate in PA systems for sound reinforcement to enhance the intelligibility and sound quality of typically music and speech. It could be used in any type of live or playback performances in stadia, auditoria, conference venues, concert halls, churches, cinemas, outdoor concerts etc.
- Stereo Unfold EV can be used to unfold mono sources in time with psychoacoustic grouping, in the same way as it does stereo sources, either to enhance intelligibility or to provide improved playback performance in general.
- the Stereo Unfold EV process is also not limited to a stereo playback system; it could be used in any surround sound setup as well, with the processing (unfolding in time and grouping) occurring in the individual surround channels.
- a method for reproduction of sound comprising:
- the method may also comprise a step of
- the present invention is directed to providing a method for stereo sound reproduction, meaning that the Left (L) and Right (R) channels are Left (L) and Right (R) stereo channels.
- stereo is only one of many possible technical applications where the present invention finds use.
- delay(s) (Dx) and/or frequency shaping(s) (Frx) are utilized in the processed algorithms.
- delay(s) (Dx) are utilized in the processed algorithms.
- delay(s) (Dx) and frequency shaping(s) (Frx) are utilized in the processed algorithms.
- gain(s) (Gx) are also utilized in the processed algorithms.
- the method may also involve frequency shaping(s) (Frx).
- frequency shaping(s) (Frx) are utilized and the frequency shaping(s) (Frx) predominantly limit the frequency range to above 50 Hz.
- frequency shaping(s) (Frx) are utilized and the frequency shaping(s) (Frx) are performed so that the higher frequency contents are rolled off above 7 kHz.
- frequency shaping(s) (Frx) are utilized and the frequency shaping(s) (Frx) are performed in a frequency range of from 100 Hz to 4 kHz.
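- A frequency shaping of this kind can be sketched with two first-order filters in series. The filter type is not specified above, so the one-pole high-pass and low-pass used here, the function names and the default 100 Hz and 4 kHz corner frequencies (taken from the range mentioned above) are assumptions for illustration only.

```python
import math


def one_pole_highpass(x, fs, cutoff_hz):
    """First-order high-pass (discretized RC filter)."""
    rc = 1.0 / (2.0 * math.pi * cutoff_hz)
    dt = 1.0 / fs
    a = rc / (rc + dt)
    y, prev_x, prev_y = [], 0.0, 0.0
    for sample in x:
        prev_y = a * (prev_y + sample - prev_x)
        prev_x = sample
        y.append(prev_y)
    return y


def one_pole_lowpass(x, fs, cutoff_hz):
    """First-order low-pass (exponential smoothing)."""
    rc = 1.0 / (2.0 * math.pi * cutoff_hz)
    dt = 1.0 / fs
    alpha = dt / (rc + dt)
    y, prev = [], 0.0
    for sample in x:
        prev += alpha * (sample - prev)
        y.append(prev)
    return y


def shape_band(x, fs, low_hz=100.0, high_hz=4000.0):
    """Limit a feed to roughly low_hz..high_hz with gentle first-order slopes."""
    return one_pole_lowpass(one_pole_highpass(x, fs, low_hz), fs, high_hz)


FS = 48000
tone = [math.sin(2.0 * math.pi * 1000.0 * n / FS) for n in range(FS)]
shaped_tone = shape_band(tone, FS)     # a 1 kHz tone passes nearly unchanged
shaped_dc = shape_band([1.0] * FS, FS)  # a constant offset is removed
```

- First-order slopes are very gentle; a practical Frx would likely use steeper or differently shaped filters, which this sketch does not attempt to reproduce.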
- the delay(s) is(are) of relevance.
- the first two delays D1 and D2 are in the range of 0-3 ms.
- all delays except D1 and D2 are at least 5 ms, such as in the range of 5-50 ms, preferably in the range of 10-40 ms, more preferably in the range of 15-35 ms.
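- The delay ranges above lend themselves to a simple configuration check. The sketch below is one possible way to express them; the dictionary layout, the function names and the tier labels are hypothetical and only the numeric limits come from the text.

```python
def check_delays(delays_ms):
    """Return the names of delays that violate the stated ranges.

    delays_ms maps names such as "D1", "D3" to a delay in milliseconds.
    D1 and D2 must lie in 0-3 ms; every other delay must be at least 5 ms.
    """
    violations = []
    for name, value in delays_ms.items():
        if name in ("D1", "D2"):
            if not 0.0 <= value <= 3.0:
                violations.append(name)
        elif value < 5.0:
            violations.append(name)
    return violations


def preference_tier(delay_ms):
    """Classify a delay other than D1/D2 against the nested ranges."""
    if 15.0 <= delay_ms <= 35.0:
        return "more preferred"
    if 10.0 <= delay_ms <= 40.0:
        return "preferred"
    if 5.0 <= delay_ms <= 50.0:
        return "exemplified"
    if delay_ms > 50.0:
        return "at least 5 ms, outside exemplified range"
    return "below minimum"


config = {"D1": 0.0, "D2": 0.0, "D3": 24.0}
print(check_delays(config), preference_tier(config["D3"]))
```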
- one or more feeds (Fx) are provided as a phase stabilizer.
- the feeds (Fx) are psychoacoustically grouped by means of using multiple(s) of the fundamental(s).
- feeds (Fx) may be modified to have similar frequency contents.
- the present invention is directed to grouping feeds (Fx). Therefore, according to one specific embodiment, the feeds (Fx) are psychoacoustically grouped in a Left (L) and a Right (R) stereo channel, respectively.
- the present invention is also directed to a device arranged to provide sound reproduction by a method comprising: - providing a number of unfolded feeds (Fx) which are processed algorithms of sound signal(s);
- the device may be any type of sound recording unit, such as any type of stereo unit, amplifier etc.
- the device is an integrated circuit on a chip, FPGA or processor. According to yet another embodiment, the device is implemented in a hardware platform. As understood from above, the method according to the present invention may also be utilized in software applications.
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Signal Processing (AREA)
- Acoustics & Sound (AREA)
- Multimedia (AREA)
- Computational Linguistics (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Human Computer Interaction (AREA)
- Spectroscopy & Molecular Physics (AREA)
- Quality & Reliability (AREA)
- Stereophonic System (AREA)
Priority Applications (6)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
EP18788470.5A EP3613222A4 (en) | 2017-04-18 | 2018-03-23 | Stereo unfold with psychoacoustic grouping phenomenon |
CN201880020404.3A CN110495189A (en) | 2017-04-18 | 2018-03-23 | Utilize the stereo expansion of psychologic acoustics grouping phenomenon |
BR112019021241-8A BR112019021241A2 (en) | 2017-04-18 | 2018-03-23 | METHOD FOR SOUND REPRODUCTION, AND, DEVICE. |
US16/605,009 US11197113B2 (en) | 2017-04-18 | 2018-03-23 | Stereo unfold with psychoacoustic grouping phenomenon |
JP2019556628A JP2020518159A (en) | 2017-04-18 | 2018-03-23 | Stereo expansion with psychoacoustic grouping phenomenon |
KR1020197033763A KR20190140976A (en) | 2017-04-18 | 2018-03-23 | Stereo Development Using Acoustic Psychological Grouping Phenomenon |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
SE1750448-1 | 2017-04-18 | ||
SE1750448 | 2017-04-18 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2018194501A1 (en) | 2018-10-25 |
Family
ID=63857120
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/SE2018/050300 WO2018194501A1 (en) | 2017-04-18 | 2018-03-23 | Stereo unfold with psychoacoustic grouping phenomenon |
Country Status (7)
Country | Link |
---|---|
US (1) | US11197113B2 (en) |
EP (1) | EP3613222A4 (en) |
JP (1) | JP2020518159A (en) |
KR (1) | KR20190140976A (en) |
CN (1) | CN110495189A (en) |
BR (1) | BR112019021241A2 (en) |
WO (1) | WO2018194501A1 (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2022053588A1 (en) | 2020-09-11 | 2022-03-17 | Siou Jean Marc | System for reproducing sounds with virtualization of the reverberated field |
Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
EP0276159A2 (en) * | 1987-01-22 | 1988-07-27 | American Natural Sound Development Company | Three-dimensional auditory display apparatus and method utilising enhanced bionic emulation of human binaural sound localisation |
US5671287A (en) * | 1992-06-03 | 1997-09-23 | Trifield Productions Limited | Stereophonic signal processor |
WO2003009639A1 (en) * | 2001-07-19 | 2003-01-30 | Vast Audio Pty Ltd | Recording a three dimensional auditory scene and reproducing it for the individual listener |
WO2007137232A2 (en) * | 2006-05-20 | 2007-11-29 | Personics Holdings Inc. | Method of modifying audio content |
GB2491722A (en) * | 2011-06-10 | 2012-12-12 | System Ltd X | A system for analysing sound or music using a predictive model of the response of regions of the brain and ear |
US20130077792A1 (en) * | 2011-09-26 | 2013-03-28 | Paul Bruney | Psychoacoustic interface |
WO2015184307A1 (en) * | 2014-05-30 | 2015-12-03 | Qualcomm Incorporated | Obtaining sparseness information for higher order ambisonic audio renderers |
Family Cites Families (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
GB9107011D0 (en) * | 1991-04-04 | 1991-05-22 | Gerzon Michael A | Illusory sound distance control method |
JP2988289B2 (en) | 1994-11-15 | 1999-12-13 | ヤマハ株式会社 | Sound image sound field control device |
US6111958A (en) * | 1997-03-21 | 2000-08-29 | Euphonics, Incorporated | Audio spatial enhancement apparatus and methods |
CN102440003B (en) * | 2008-10-20 | 2016-01-27 | 吉诺迪奥公司 | Audio spatialization and environmental simulation |
US9286863B2 (en) | 2013-09-12 | 2016-03-15 | Nancy Diane Moon | Apparatus and method for a celeste in an electronically-orbited speaker |
US20160269846A1 (en) * | 2013-10-02 | 2016-09-15 | Stormingswiss Gmbh | Derivation of multichannel signals from two or more basic signals |
US9374640B2 (en) | 2013-12-06 | 2016-06-21 | Bradley M. Starobin | Method and system for optimizing center channel performance in a single enclosure multi-element loudspeaker line array |
-
2018
- 2018-03-23 CN CN201880020404.3A patent/CN110495189A/en active Pending
- 2018-03-23 WO PCT/SE2018/050300 patent/WO2018194501A1/en unknown
- 2018-03-23 BR BR112019021241-8A patent/BR112019021241A2/en not_active Application Discontinuation
- 2018-03-23 EP EP18788470.5A patent/EP3613222A4/en not_active Withdrawn
- 2018-03-23 JP JP2019556628A patent/JP2020518159A/en active Pending
- 2018-03-23 US US16/605,009 patent/US11197113B2/en active Active
- 2018-03-23 KR KR1020197033763A patent/KR20190140976A/en unknown
Non-Patent Citations (1)
Title |
---|
See also references of EP3613222A4 * |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2022053588A1 (en) | 2020-09-11 | 2022-03-17 | Siou Jean Marc | System for reproducing sounds with virtualization of the reverberated field |
FR3114209A1 (en) | 2020-09-11 | 2022-03-18 | Jean-Marc SIOU | SOUND REPRODUCTION SYSTEM WITH VIRTUALIZATION OF THE REVERBERE FIELD |
FR3114210A1 (en) | 2020-09-11 | 2022-03-18 | Jean-Marc SIOU | SOUND REPRODUCTION SYSTEM WITH VIRTUALIZATION OF THE REVERBERE FIELD |
Also Published As
Publication number | Publication date |
---|---|
US20200304929A1 (en) | 2020-09-24 |
EP3613222A4 (en) | 2021-01-20 |
EP3613222A1 (en) | 2020-02-26 |
CN110495189A (en) | 2019-11-22 |
JP2020518159A (en) | 2020-06-18 |
BR112019021241A2 (en) | 2020-05-12 |
KR20190140976A (en) | 2019-12-20 |
US11197113B2 (en) | 2021-12-07 |
Legal Events
Code | Title | Description |
---|---|---|
121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 18788470; Country of ref document: EP; Kind code of ref document: A1 |
ENP | Entry into the national phase | Ref document number: 2019556628; Country of ref document: JP; Kind code of ref document: A |
NENP | Non-entry into the national phase | Ref country code: DE |
REG | Reference to national code | Ref country code: BR; Ref legal event code: B01A; Ref document number: 112019021241; Country of ref document: BR |
ENP | Entry into the national phase | Ref document number: 20197033763; Country of ref document: KR; Kind code of ref document: A |
ENP | Entry into the national phase | Ref document number: 2018788470; Country of ref document: EP; Effective date: 20191118 |
ENP | Entry into the national phase | Ref document number: 112019021241; Country of ref document: BR; Kind code of ref document: A2; Effective date: 20191009 |