WO2014171706A1 - Audio signal processing method using virtual object generation - Google Patents
- Publication number
- WO2014171706A1 (PCT/KR2014/003250)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- sound source
- information
- speaker
- signal
- audio signal
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04S—STEREOPHONIC SYSTEMS
- H04S3/00—Systems employing more than two channels, e.g. quadraphonic
- H04S3/008—Systems employing more than two channels, e.g. quadraphonic in which the audio signals are in digital form, i.e. employing more than two discrete digital channels
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
- G10L19/008—Multichannel audio signal coding or decoding using interchannel correlation to reduce redundancy, e.g. joint-stereo, intensity-coding or matrixing
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04S—STEREOPHONIC SYSTEMS
- H04S7/00—Indicating arrangements; Control arrangements, e.g. balance control
- H04S7/30—Control circuits for electronic adaptation of the sound field
- H04S7/302—Electronic adaptation of stereophonic sound system to listener position or orientation
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04S—STEREOPHONIC SYSTEMS
- H04S2400/00—Details of stereophonic systems covered by H04S but not provided for in its groups
- H04S2400/03—Aspects of down-mixing multi-channel audio to configurations with lower numbers of playback channels, e.g. 7.1 -> 5.1
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04S—STEREOPHONIC SYSTEMS
- H04S2400/00—Details of stereophonic systems covered by H04S but not provided for in its groups
- H04S2400/07—Generation or adaptation of the Low Frequency Effect [LFE] channel, e.g. distribution or signal processing
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04S—STEREOPHONIC SYSTEMS
- H04S2400/00—Details of stereophonic systems covered by H04S but not provided for in its groups
- H04S2400/11—Positioning of individual sound objects, e.g. moving airplane, within a sound field
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04S—STEREOPHONIC SYSTEMS
- H04S2420/00—Techniques used in stereophonic systems covered by H04S but not provided for in its groups
- H04S2420/01—Enhancing the perception of the sound image or of the spatial distribution using head related transfer functions [HRTF's] or equivalents thereof, e.g. interaural time difference [ITD] or interaural level difference [ILD]
Definitions
- the present invention relates to an audio signal processing method, and more particularly, to a method for encoding and decoding an object audio signal or rendering in a three-dimensional space.
- This application claims the benefit of Korean Patent Application Nos. 10-2013-0040923, 10-2013-0040931, 10-2013-0040957, and 10-2013-0040960, all filed April 15, 2013, and Korean Patent Application No. 10-2013-0045502, filed April 24, 2013, all of which are hereby incorporated by reference.
- NHK proposed a method of setting up a multi-channel audio environment by adding upper and lower layers.
- In the top layer, a total of nine channels can be provided: three speakers in the front, three in the middle position, and three in the surround position. In the middle layer, a total of ten channels can be arranged: five speakers in the front, two in the middle position, and three in the surround position.
- In the bottom layer, a total of three channels and two LFE channels may be installed.
- VBAP (Vector-Based Amplitude Panning)
- FIG. 1 illustrates the concept of VBAP.
- Amplitude panning, which determines the direction of a sound source between two speakers based on the relative magnitudes of their signals, and VBAP, which is widely used to determine the direction of a sound source using three speakers in three-dimensional space, allow rendering to be implemented relatively conveniently.
- The virtual speaker 1 (140) can be generated using the three speakers 110, 120, and 130 of FIG. 1.
- VBAP is a method of rendering a sound source by selecting the speakers surrounding the position where a virtual source is to be created relative to the sweet spot, and calculating gain values that control the speaker position vectors. Therefore, in the case of object-based content, at least three speakers surrounding a target object (or virtual source) can be selected, and the VBAP gains can be computed from their relative positions to reproduce the object at the desired position.
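As a hedged sketch of the VBAP computation described above (not the patent's own implementation): the gains for the three selected speakers are obtained by solving a 3x3 linear system whose columns are the speaker direction vectors, then power-normalizing the result. The pure-Python Cramer's-rule solver and all function names are illustrative.

```python
import math

def normalize(v):
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v]

def det3(m):
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

def vbap_gains(speakers, source):
    """Gains g1..g3 such that g1*l1 + g2*l2 + g3*l3 points toward the
    source direction p (l_i = speaker unit vectors), power-normalized."""
    L = [normalize(s) for s in speakers]
    p = normalize(source)
    # Solve A g = p where the columns of A are the speaker unit vectors.
    A = [[L[j][i] for j in range(3)] for i in range(3)]
    d = det3(A)
    g = []
    for k in range(3):          # Cramer's rule, one column at a time
        Ak = [row[:] for row in A]
        for i in range(3):
            Ak[i][k] = p[i]
        g.append(det3(Ak) / d)
    n = math.sqrt(sum(x * x for x in g))
    return [x / n for x in g]   # normalize for constant loudness
```

A source exactly at one speaker gets all the gain; a source midway between speakers gets a balanced split.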
- In a 3D audio environment, there may be a situation in which a sound source should be heard from below the user's feet, that is, a major event in the content occurs at a position lower than the viewer.
- The effects obtainable by positioning a sound source at a very low position vary widely.
- For example, the main character falls from a high place, a subway passes underground, or a huge explosion occurs.
- It can also be a very useful effect in various other scenes, such as horror scenes where unknown monsters pass under the feet.
- By positioning the sound source at a position lower than the range of the installed speakers, the user can be given a realistic sound field that existing audio systems cannot provide in many dramatic situations.
- The line connecting BtFC, BtFL, SiL, BL, BC, BR, SiR, and BtFR becomes the speaker mesh, and the height of this mesh represents the minimum reproduction height. That is, at an azimuth of 45 degrees (BtFL), an object can be reproduced down to a height of 10 degrees, and an object at a lower height is automatically adjusted to the minimum reproduction height (10 degrees) and reproduced. In short, with the current setup, sound from below the user's position cannot be played.
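The minimum-reproduction-height behavior described above can be sketched in one line; `clamp_to_mesh` is a hypothetical helper, and the 10-degree default merely mirrors the BtFL example in the text.

```python
def clamp_to_mesh(elevation_deg, min_height_deg=10.0):
    """Snap an object requested below the speaker mesh up to the
    minimum reproduction height (illustrative helper, not the
    patent's method)."""
    return max(elevation_deg, min_height_deg)
```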
- The present invention relates to a virtual object creation technique addressing the new technical issue of rendering in the region outside the speaker mesh.
- The lower elevation may be the embodiment in which the need for sound extrapolation is highest and its effect is most dramatic.
- An audio signal processing method according to an embodiment of the present invention comprises the steps of: receiving an audio bit string including object sound source information and an object audio signal when playing an audio signal that includes an object signal; determining a first reproduction region object and a second reproduction region object based on the object sound source information or reproduction range information; and rendering the first reproduction region object in a first manner and the second reproduction region object in a second manner.
- the first reproduction region object may include an object sound source signal designed to be reproduced in an area outside the reproduction range based on the received speaker position information and the object sound source position information.
- the second reproduction region object may include an object sound source signal designed to be reproduced in an area within a reproduction range based on the received speaker position information and the object sound source position information.
- The object sound source information may include object sound source position information or exception object indication information.
- The exception object indication information may be additional information signaled with one bit per object.
- The exception object indication information may include one or more bits of additional information, separate from the object sound source header, according to the playback environment.
- the first method may be a method of generating a virtual speaker and then rendering the virtual speaker by a panning technique between the virtual speaker and the real speaker.
- the first method may be a mixture of a method for generating a low pass filtered signal and a method for generating a band pass filtered signal.
- In the first method, a downmixed signal may be generated from the sound source signals of the first reproduction region objects for a plurality of object signals, and a low-pass-filtered subwoofer signal may then be generated using the downmixed signal.
- the first method may be to generate a low pass filtered signal for the object audio signal.
- The second method may be a flexible rendering method for positioning the second reproduction region object at the position indicated in the object sound source information.
- The first method may be a virtual object generating method including a filtering step for positioning the first reproduction region object at the position indicated in the object sound source information.
- The second method may be a flexible rendering method for positioning the second reproduction region object at the position indicated in the object sound source information.
- The first method may construct filter coefficients based on human psychoacoustic characteristics, using the position (height, angle, distance) of the object from the object sound source position information and the relative position of the listener.
- The present invention is a technique for positioning an object signal at a position outside the playable range derived from externally received information, and can create additional value when used to generate object signals of the lower or rear layer. It is applicable between the decoder and the renderer, and as a result makes it possible to reproduce a high-quality audio signal by rendering the audio signal effectively.
- FIG. 1 is a diagram illustrating an example of a concept of a general rendering method (VBAP) using multiple speakers.
- FIG. 2 is a layout diagram of a 22.2-channel speaker arrangement as an example of the multi-channel environment.
- FIG. 3 is a diagram illustrating input and output of a renderer for describing a rendering system.
- FIG. 4 is a diagram illustrating an audio signal processing apparatus according to an embodiment of the present invention.
- FIG. 5 is a diagram briefly illustrating input and output of a virtual object generator for generating a subwoofer signal according to an embodiment of the present invention.
- FIG. 6 is another block diagram of a virtual object generator for generating a subwoofer signal according to an embodiment of the present invention.
- FIG. 7 is another block diagram of a virtual object generator for generating a subwoofer signal according to an embodiment of the present invention.
- FIG. 8 is a block diagram of an audio signal processing apparatus according to another embodiment of the present invention.
- FIG. 9 is a flowchart of an object sound source rendering technique according to an embodiment of the present invention.
- "Coding" can be interpreted as encoding or decoding depending on the case, and "information" is a term encompassing values, parameters, coefficients, elements, and the like; their meanings may be interpreted differently in some cases, but the present invention is not limited thereto.
- the present invention relates to a technique for positioning an object signal in an area outside a speaker mesh in playing an object sound source using only a limited number of speakers fixed at a defined position.
- FIG. 1 is a diagram illustrating an example of a concept of a general rendering method (VBAP) using multiple speakers.
- the existing technology may generate virtual speaker 1 (Virtual speaker 1, 140) by using three speakers 110, 120, and 130 that output real channel signals.
- FIG. 2 is a layout diagram of a 22.2-channel speaker arrangement as an example of the multi-channel environment.
- a speaker arrangement of 22.2 channels will be described as an example.
- However, the present invention is not limited thereto. That is, the present invention can be applied to a speaker arrangement different from that of FIG. 2, or to a number of speakers different from that of FIG. 2.
- 22.2 Channels may be an example of a multi-channel environment for enhancing the sound field, the present invention is not limited to a specific number of channels or a specific speaker arrangement.
- The speakers of the 22.2ch layout may be located in three layers 210, 220, and 230.
- The three layers 210, 220, and 230 are the top layer 210 at the highest position, the bottom layer 230 at the lowest position, and the middle layer 220 between the top layer 210 and the bottom layer 230.
- a total of nine channels may be provided in the top layer 210.
- The top layer 210 has three speakers (TpFL, TpFC, TpFR) from left to right in the front, three (TpL, TpC, TpR) from left to right in the middle position, and three from left to right in the surround position.
- the front surface may mean the screen side.
- A total of 10 channels (FL, FLC, FC, FRC, FR, L, R, BL, BC, BR) may be provided in the middle layer 220.
- The middle layer 220 has five speakers (FL, FLC, FC, FRC, FR) from left to right in the front, two (L, R) in the middle position, and three (BL, BC, BR) from left to right in the surround position. Of the five speakers in the front, the three at the center positions may be included in the TV screen.
- a total of three channels (BtFL, BtFC, BtFR) and two LFE channels 240 may be provided on the bottom layer 230.
- a speaker may be disposed in each channel of the bottom layer 230.
- FIG. 3 is a diagram illustrating input and output of a renderer for describing a rendering system.
- each input object sound source of the audio signal processing apparatus is rendered by the renderer 310 using respective object sound source information.
- Each rendered object signal is then added to produce a speaker output (ie a channel signal).
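The summation of rendered object signals into speaker feeds can be sketched as follows; the nested-list signal layout is an assumption for illustration, not the patent's data format.

```python
def mix_objects(rendered_objects, num_channels):
    """Sum per-object channel signals into final speaker feeds.
    rendered_objects: list of objects, each a list of num_channels
    equal-length sample lists (assumed layout)."""
    length = len(rendered_objects[0][0])
    out = [[0.0] * length for _ in range(num_channels)]
    for obj in rendered_objects:
        for ch in range(num_channels):
            for n in range(length):
                out[ch][n] += obj[ch][n]   # linear sum per channel
    return out
```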
- the audio signal processing apparatus may be a sound rendering system.
- FIG. 4 is a diagram illustrating an audio signal processing apparatus according to an embodiment of the present invention.
- An audio signal processing apparatus includes a sound source position determiner 410 for detecting a position of an input object sound source and a virtual object generator 430 for placing an object signal in an area outside the speaker range.
- the audio signal processing apparatus also includes a renderer 420.
- the renderer 420 according to the embodiment of the present invention may be the same as the renderer 310 described with reference to FIG. 3.
- The renderer 420 performs rendering in the conventional, general manner.
- Objects determined by the sound source position determiner 410 to be out of the range of the speakers are rendered by the virtual object generator 430, and objects that are not (i.e., coverable by the speaker range) are rendered by the renderer 420.
- In FIG. 4 according to an embodiment of the present invention, only the configuration corresponding to the processing of one object sound source is shown in detail.
- The overall structure of the present invention is composed of the sum of the structures shown in FIG. 4.
- The audio signal processing apparatus according to an embodiment of the present invention adds a sound source position determiner 410 and a virtual object generator 430 to the renderer 310 of FIG. 3. That is, the audio signal processing apparatus according to the embodiment of the present invention includes the sound source position determiner 410, the renderer 420, and the virtual object generator 430.
- the sound source position determiner 410 allocates the object sound source to one of the renderer 420 and the virtual object generator 430 based on the object sound source information.
- the allocated object sound source is rendered by the renderer 420 or the virtual object generator 430 to generate a speaker output.
- The sound source position determiner 410 identifies, from the header information of the object sound sources, objects designed for an area outside the range of the speakers.
- The range of the speakers may be the playable range.
- The playable range is a virtual range defined by the set of speakers required for positioning an object sound source.
- The playable range can generally consist of the lines connecting the speakers, based on the VBAP method of selecting the three speakers that form the smallest triangle containing the position where the sound source is to be located.
- At maximum, the playable range may correspond to a speaker configuration capable of positioning a sound source anywhere around the user; in general, however, it covers only a limited set of positions.
- For example, in a 5.1 speaker setup, the playable range is the 360-degree plane at the level of the user's ears.
- The installed speakers may not be arranged according to the installation regulations. Therefore, the speaker location information may be entered by the user directly (through a UI), selected from a given preset list, or obtained using a remote positioning technology.
- The sound source position determiner 410 compares the object sound source position information with the playable range and determines whether the position of each object sound source is inside or outside the playable range. An object sound source to be positioned outside the playable range is rendered by the virtual object generator 430; otherwise (that is, when the object can be played by a combination of the speakers), it is rendered by the existing renderer 420.
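A minimal sketch of the position determiner's routing decision, under the simplifying assumption that the playable range reduces to an elevation interval; in practice the test is against the full speaker mesh, and the names below are hypothetical.

```python
def assign_object(obj_elevation_deg, playable_range=(10.0, 90.0)):
    """Route one object: outside the playable range -> virtual object
    generator; inside -> conventional renderer. The elevation-interval
    range is a simplification of the speaker-mesh test."""
    lo, hi = playable_range
    if obj_elevation_deg < lo or obj_elevation_deg > hi:
        return "virtual_object_generator"
    return "renderer"
```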
- In addition to the method of identifying an out-of-range object using only the transmitted object information as described above, the sound source position determiner 410 may rely on a flag that the content creator adds as additional information of an object intended to be placed outside the speaker range in the standard setup.
- In the simplest case, the flag can be one bit of information indicating that the object is exceptional; more elaborately, it can carry a few additional bits needed to reproduce the object more realistically (for example, information for reproducing the object in different ways according to the speaker arrangement of a standard setup or of a specific setup).
- The flag indicating the exception object may be determined by the content producer at the time of generating the sound source.
- A producer who intends to position a specific object sound source at a location that a general speaker setup cannot cover (for example, under the user's feet) can configure the object sound source information with the flag of that object turned on.
- Content production may involve various stages such as mastering, releasing, or targeting, so even a flag that is initially set may be changed or extended several times through the production stages.
- the flag included in the object additional information may be configured with different information according to the user's environment.
- a speaker mesh suitable for a use environment may be reconfigured from time to time in consideration of a change in the speaker layout structure of the current user.
- the speaker range (or playable range) is initialized according to the screen of the installation environment and the layout of the speaker, and the rendering matrix once initialized can continue to be used without modification unless there is a change in the installation environment.
- If the user generates a special trigger or arbitrarily performs the initial calibration process again, the playable range that has been initialized can be modified.
- the location of the installed devices may be measured by the user directly inputting location information (using UI equipment) or by various methods (for example, automatic location tracking using an inter-device communication technology).
- The virtual object generator 430 according to an embodiment of the present invention may provide various virtual object creation methods for effectively rendering an object that needs to be positioned outside the playable range.
- One virtual object creation method is to perform a filtering process for positioning the corresponding object at the desired location.
- The filter coefficients are constructed based on human psychoacoustic characteristics, considering the position (height, angle, distance) of the object and the position of the listener.
- For example, the frequency cue corresponding to the position of the speaker itself may be removed from the signal output from that specific speaker, and the frequency cue corresponding to the position of the object sound source may be artificially inserted.
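A toy illustration of removing a spectral cue: a standard RBJ-cookbook biquad notch suppresses the band associated with the speaker's own position (inserting the target cue would use a peaking filter in the same fashion). The frequencies and Q below are arbitrary; this is not the patent's filter design.

```python
import math

def notch_coeffs(fs, f0, q):
    """RBJ Audio-EQ-Cookbook biquad notch centered at f0 (Hz)."""
    w0 = 2.0 * math.pi * f0 / fs
    alpha = math.sin(w0) / (2.0 * q)
    cw = math.cos(w0)
    a0 = 1.0 + alpha
    b = [1.0 / a0, -2.0 * cw / a0, 1.0 / a0]
    a = [1.0, -2.0 * cw / a0, (1.0 - alpha) / a0]
    return b, a

def biquad(x, b, a):
    """Direct-form-I filtering of a sample list."""
    y, x1, x2, y1, y2 = [], 0.0, 0.0, 0.0, 0.0
    for s in x:
        out = b[0] * s + b[1] * x1 + b[2] * x2 - a[1] * y1 - a[2] * y2
        y.append(out)
        x2, x1 = x1, s
        y2, y1 = y1, out
    return y
```

A sine at the notch frequency is almost completely suppressed once the transient dies out.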
- The filtering-based virtual object generation technique may also configure a modified filter in order to minimize the distortion that the filter introduces into the signal.
- For example, listeners can be prevented from perceiving distortion of the signal by slightly varying the null position of the filtering applied to each speaker.
- Since height cues differ slightly between individuals, they can be generalized to form nulls over a relatively wide frequency band, and the height cues can be generated by assigning different null positions to different speakers within this generalized null frequency range.
- Alternatively, the speakers can be divided into a group to which the filtering is applied and a group to which simple VBAP is applied, to prevent the listener from perceiving distortion of the signal.
- Another virtual object generating method of the virtual object generator 430 sets up a virtual speaker for the reproduction of an object signal outside the speaker playable range, and can be implemented by a panning method that uses the virtual speaker and the real speakers in combination.
- When an object outside the speaker playable range is formed at the virtual speaker by panning, the virtual object generator 430 may finally map the virtual speaker signal to the real speaker positions.
- The mapping to the actual speakers is implemented by a predetermined rule, and the filtering described above may be used in this process.
- The present invention may also relate to a virtual object generation technology of the virtual object generator 430 that uses a method of generating a subwoofer signal for the reproduction of an object signal outside the playable range of the speakers.
- The low frequency effect (LFE) channel signal corresponding to the .1 or .2 carries only low-frequency information (below 120 Hz) and serves to complement the overall low-frequency content of the audio scene or to relieve the low-frequency burden of the other channels.
- The LFE channel 240 of FIG. 2 is generally not the same as the subwoofer signal, and the encoding technique according to the embodiment of the present invention may not provide a subwoofer output at the time of encoding.
- Even so, a subwoofer output may be generated.
- The present invention encompasses a method of generating a subwoofer signal for reproducing an object signal that lies outside the playable range (e.g. under the user's feet).
- FIG. 5 is a diagram briefly illustrating input and output of a virtual object generator for generating a subwoofer signal according to an embodiment of the present invention.
- The virtual object generator 430 receives the playable range calculated based on the location information of the speakers, the object sound source signal determined to be outside the playable range, and the sound source information of the corresponding object, and outputs the subwoofer output signal.
- The subwoofer output signal may be a signal allocated to one or more subwoofers according to the speaker setup of the use environment.
- When more than one object sound source signal is reproduced, the virtual object generator 430 generates the final output signal as the linear sum of the subwoofer output signals generated from the individual object sound source signals.
- FIG. 6 is another block diagram of a virtual object generator using a subwoofer signal generation method.
- The virtual object generator 430 of FIG. 6 corresponds to an example of the virtual object generator 430 of FIG. 5; it represents a system that receives the object sound source signal determined to be outside the playback range and the sound source information of the object, and outputs a subwoofer output signal.
- The low pass filter 610 of the virtual object generator 430 extracts the low-band signal of the corresponding object sound source through low pass filtering (LPF).
- The decorrelator 620 generates two subwoofer output signals based on the output low-band signal.
- The virtual object generator 430 determines the cutoff frequency of the low-band filtering and the decorrelator coefficients using the playable range calculated from the speaker location information and the object sound source location information, so that different filtering is applied depending on the object position.
- The determined decorrelator coefficients give the final subwoofer outputs the gain and delay needed to localize the object at the target position.
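The low-pass-then-decorrelate path of FIG. 6 might be sketched as below; the one-pole filter and the plain gain/delay decorrelator are deliberate simplifications of blocks 610 and 620, and all coefficients are hypothetical.

```python
import math

def one_pole_lpf(x, fs, fc):
    """One-pole low-pass, a stand-in for the low pass filter 610."""
    a = math.exp(-2.0 * math.pi * fc / fs)
    y, state = [], 0.0
    for s in x:
        state = (1.0 - a) * s + a * state
        y.append(state)
    return y

def decorrelate(low, gain_a, gain_b, delay_b):
    """Derive two subwoofer feeds from one low-band signal: per-feed
    gain plus a small delay on the second feed. The coefficients would
    be chosen from the object position; values here are illustrative."""
    feed_a = [gain_a * s for s in low]
    feed_b = [0.0] * delay_b + [gain_b * s for s in low[:len(low) - delay_b]]
    return feed_a, feed_b
```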
- FIG. 7 is another block diagram of a virtual object generator for generating a subwoofer signal according to an embodiment of the present invention.
- FIG. 7 corresponds to an example of the virtual object generator 430 of FIG. 5.
- The virtual object generator 430 of FIG. 7 represents a system that receives the object sound source signals determined to be outside the playback range and the sound source information of the corresponding objects, and outputs a subwoofer output signal.
- the virtual object generator 430 first selects the down mixer 1 720 or the down mixer 2 740 using the LFE mapping unit 710.
- the LFE mapping unit 710 may select the down mixer 1 720 or the down mixer 2 740 based on the LFE mapping.
- the LFE mapping unit 710 determines a suitable downmixer using the playable range calculated based on the location information of the speaker and the corresponding object sound source location information.
- each downmixer 720 and 740 downmixes the input object signals.
- the low pass filters 730 and 750 extract the low band signal of the corresponding object sound source through low band filtering on the downmixed signal to generate two LFE channel signals.
- The virtual object generator 430 according to this embodiment of the present invention requires only as many downmixers and low-band filtering operations as there are subwoofers, which yields a complexity gain.
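The complexity gain comes from downmixing first and filtering once per subwoofer rather than once per object. A minimal downmix sketch, with gains assumed to be supplied by the LFE mapping unit (the function and its layout are hypothetical):

```python
def downmix(object_signals, gains):
    """Weighted sum of several object signals into one mono feed per
    subwoofer. Low-pass filtering is then applied once to this single
    feed instead of once per object, which is where the complexity
    gain described above comes from."""
    length = len(object_signals[0])
    return [sum(g * sig[i] for g, sig in zip(gains, object_signals))
            for i in range(length)]
```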
- FIG. 8 is a block diagram of an audio signal processing apparatus according to another embodiment of the present invention.
- the audio signal processing apparatus of FIG. 8 further includes an object to channel mapping unit 810, delay filters 820 and 840, and band pass filters 830 and 850 in the virtual object generator 430 of FIG. 7.
- However, the present invention is not limited thereto. That is, the present invention can also be applied to the case in which the virtual object generator 430 of FIGS. 5 and 6 further includes the object-to-channel mapping unit 810, the delay filters 820 and 840, and the band pass filters 830 and 850.
- In addition to generating a subwoofer output signal using the low-pass filters of FIGS. 5, 6, and 7, the audio signal processing apparatus of FIG. 8 reproduces sound for the sound source at the low position.
- The band pass filters 830 and 850 add a method of producing speaker output signals that reproduce the sound of the low-position sound source.
- The low-frequency signal is output through the subwoofer to provide the overall sound field of the low-position sound source, and in addition, the mid-band object sound source is output through the speakers to achieve more accurate sound source positioning.
- The position of the mid-band object sound source is implemented by assigning a delay value corresponding to the position, so that the sound source is positioned using the Haas effect.
- The core of this technique is that sound source localization can be maximized by outputting an additional mid-band signal in addition to the output signal of the subwoofer.
- the object-to-channel mapping unit 810 selects one or more required speaker channels using the object sound source information and assigns the object sound source to them.
- the object signal assigned to a speaker channel passes through the delay filters 820 and 840 and is delayed by an amount sufficient to produce the Haas effect.
- the band pass filters 830 and 850 take the signals passing through the delay filters 820 and 840 as input and pass only the mid band of the object signal to generate the speaker channel output.
- the delay filters 820 and 840 and the band pass filters 830 and 850 may also be applied in the reverse order; that is, for reasons of complexity or ease of implementation, the object signal allocated to a channel may first pass through the band pass filters 830 and 850 and then be delayed by the delay filters 820 and 840.
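The delay-then-band-pass speaker path can be sketched as below. The filter forms (an integer-sample delay and a crude band-pass built from two one-pole low-pass stages), the sample rate, and the cutoff coefficients are all illustrative assumptions; the patent specifies only the block order, not filter designs:

```python
# Sketch of the speaker path of FIG. 8: delay filter (820/840) followed by
# band pass filter (830/850).  FS and the alpha coefficients are assumed
# values for illustration only.

FS = 48000  # sample rate in Hz (assumed)

def delay_filter(signal, delay_samples):
    """Integer-sample delay with zero fill; output length equals input length."""
    return [0.0] * delay_samples + signal[: len(signal) - delay_samples]

def one_pole_lowpass(signal, alpha):
    """Simple one-pole low-pass: y[n] = y[n-1] + alpha * (x[n] - y[n-1])."""
    out, y = [], 0.0
    for x in signal:
        y += alpha * (x - y)
        out.append(y)
    return out

def band_pass(signal, alpha_lp=0.3, alpha_hp=0.01):
    """Crude band-pass: a fast low-pass minus a much slower low-pass,
    keeping roughly the mid band of the object signal."""
    lp = one_pole_lowpass(signal, alpha_lp)
    slow = one_pole_lowpass(signal, alpha_hp)
    return [a - b for a, b in zip(lp, slow)]

def speaker_path(obj_signal, delay_ms=10.0):
    """Delay filter (820/840) then band pass filter (830/850)."""
    delayed = delay_filter(obj_signal, int(FS * delay_ms / 1000))
    return band_pass(delayed)
```

Because both stages are linear and time-invariant, swapping their order produces the same output, which is why the text can leave the ordering to implementation convenience.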
- the subwoofer output generation method is not limited to that shown in the lower part of FIG. 9 and may use other methods described above according to the usage environment, the intention of the user and the content producer, or the characteristics of the object signal.
- FIG. 9 shows a flowchart of the object sound source rendering technique of the present invention.
- the sound source rendering technique relates to a method of calculating the playable range using the speaker position information of the playback environment.
- since the installed speakers may not be arranged according to any installation rule, the user may enter the speaker location information directly (through a UI), select it from a given set of preset layouts, or have it measured by a remote positioning technique.
- the playable range can generally be constructed from the lines connecting the speakers, based on a method such as VBAP that selects the three speakers forming the smallest triangle containing the position where the sound source is to be localized.
- at best, the playable range corresponds to a speaker configuration capable of localizing a sound source at every position around the user; in general, however, it covers only a limited set of positions (e.g., for a 5.1 speaker setup, the 360-degree horizontal plane at ear level).
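The smallest-triangle selection above can be sketched in simplified form. Real VBAP works with 3D unit vectors and inverts a 3x3 loudspeaker basis; the 2D (azimuth, elevation) representation and the area-based tie-breaking here are simplifying assumptions for illustration:

```python
# Simplified 2D sketch of the playable-range test: among all speaker
# triples, find the smallest triangle (by area) containing the desired
# source position, in the spirit of VBAP base selection.  Coordinates are
# illustrative (azimuth, elevation) pairs, not the patent's representation.
from itertools import combinations

def _signed_area(p, q, r):
    return 0.5 * ((q[0] - p[0]) * (r[1] - p[1]) - (r[0] - p[0]) * (q[1] - p[1]))

def _contains(tri, pos):
    a, b, c = tri
    d1 = _signed_area(pos, a, b)
    d2 = _signed_area(pos, b, c)
    d3 = _signed_area(pos, c, a)
    has_neg = d1 < 0 or d2 < 0 or d3 < 0
    has_pos = d1 > 0 or d2 > 0 or d3 > 0
    return not (has_neg and has_pos)  # all signs agree (or zero) -> inside

def smallest_enclosing_triple(speakers, source_pos):
    """Return the speaker triple forming the smallest triangle containing
    source_pos, or None if the position lies outside the playable range."""
    best = None
    for tri in combinations(speakers, 3):
        if _contains(tri, source_pos):
            area = abs(_signed_area(*tri))
            if best is None or area < best[0]:
                best = (area, tri)
    return None if best is None else best[1]
```

A `None` result corresponds to a position outside the playable range, i.e. exactly the case the document routes to the virtual object generating unit.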
- after the sound source position determining unit 410 configures the playable range based on the speaker arrangement information (S101), it acquires the position information of the object sound source and the object sound source signal from the sound source bit stream (S103). The sound source position determining unit 410 then compares the object sound source position information with the playable range and determines whether the object sound source should be localized at a position outside the playable range (S105). An object sound source to be localized outside the playable range is rendered by the virtual object generating unit 430 (S107), while an object sound source within the playable range is rendered by the existing renderer 420 (S109).
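The routing described by steps S101-S109 can be sketched as a simple dispatch loop. The predicate and renderer callables below are placeholders standing in for units 410, 420, and 430; their names and signatures are assumptions, not interfaces from the patent:

```python
# Hedged sketch of flow S101-S109: objects whose target position lies
# outside the playable range go to the virtual object generator (430),
# all others to the existing renderer (420).

def render_objects(objects, in_playable_range, virtual_object_gen, renderer):
    """objects: iterable of (position, signal) pairs from the bit stream (S103).
    in_playable_range: predicate built from the speaker layout (S101).
    virtual_object_gen / renderer: placeholder rendering callables."""
    outputs = []
    for position, signal in objects:
        if in_playable_range(position):                       # S105: inside
            outputs.append(renderer(position, signal))        # S109
        else:                                                 # S105: outside
            outputs.append(virtual_object_gen(position, signal))  # S107
    return outputs
```

The key design point is that the split happens per object, so a single scene can mix conventionally panned objects with virtual-object-rendered ones.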
- the audio signal processing method according to the present invention may be implemented as a program to be executed on a computer and stored in a computer-readable recording medium, and multimedia data having a data structure according to the present invention may likewise be stored in a computer-readable recording medium.
- the computer-readable recording medium includes all kinds of storage devices in which data readable by a computer system is stored. Examples of computer-readable recording media include ROM, RAM, CD-ROM, magnetic tape, floppy disks, and optical data storage devices, and the medium may also be implemented in the form of a carrier wave (for example, transmission over the Internet).
- the bitstream generated by the encoding method may be stored in a computer-readable recording medium or transmitted using a wired / wireless communication network.
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Signal Processing (AREA)
- Acoustics & Sound (AREA)
- Multimedia (AREA)
- Mathematical Physics (AREA)
- Computational Linguistics (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Human Computer Interaction (AREA)
- Stereophonic System (AREA)
Abstract
Description
Claims (15)
- In a method of reproducing an audio signal including an object signal: receiving an audio bit stream including object sound source information and an object audio signal; distinguishing a first reproduction-region object and a second reproduction-region object based on the object sound source information or on reproduction range information; and rendering the first reproduction-region object by a first method and the second reproduction-region object by a second method.
- The audio signal processing method of claim 1, further comprising: receiving speaker position information; and generating the reproduction range information using the speaker position information.
- The audio signal processing method of claim 2, wherein the first reproduction-region object comprises an object sound source signal designed to be reproduced in a region outside the reproduction range, as determined from the received speaker position information and the object sound source position information.
- The audio signal processing method of claim 2, wherein the second reproduction-region object comprises an object sound source signal designed to be reproduced in a region within the reproduction range, as determined from the received speaker position information and the object sound source position information.
- The audio signal processing method of claim 1, wherein the object sound source information includes object sound source position information or exception object indication information.
- The audio signal processing method of claim 5, wherein the exception object indication information is side information expressed as one bit per object.
- The audio signal processing method of claim 5, wherein the exception object indication information comprises one or more bits of side information, differing according to the reproduction environment, carried in the object sound source header.
- The audio signal processing method of claim 1, wherein the first method generates a virtual speaker and then renders by a panning technique between the virtual speaker and real speakers.
- The audio signal processing method of claim 1, wherein the first method combines a method of generating a low-pass-filtered signal with a method of generating a band-pass-filtered signal.
- The audio signal processing method of claim 1, wherein the first method generates a downmixed signal from the sound source signals of the first reproduction-region objects among the plurality of object signals, and then generates a low-pass-filtered subwoofer signal using the downmixed signal.
- The audio signal processing method of claim 1, wherein the first method generates a low-pass-filtered signal of the object audio signal.
- The audio signal processing method of claim 1, wherein the second method is a flexible rendering method for localizing the second reproduction-region object at the position indicated in the object sound source information.
- The audio signal processing method of claim 1, wherein the first method is a virtual object generation method including a filtering step for localizing the first reproduction-region object at the position indicated in the object sound source information.
- The audio signal processing method of claim 1, wherein the second method is a flexible rendering method for localizing the second reproduction-region object at the position indicated in the object sound source information.
- The audio signal processing method of claim 1, wherein the first method constructs filter coefficients based on human psychoacoustic characteristics, using the position of the object (height, angle, distance) from the object sound source position information and the relative position of the listener.
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US14/784,349 US20160066118A1 (en) | 2013-04-15 | 2014-04-15 | Audio signal processing method using generating virtual object |
CN201480021306.3A CN105144751A (zh) | 2013-04-15 | 2014-04-15 | 用于产生虚拟对象的音频信号处理方法 |
Applications Claiming Priority (10)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
KR1020130040931A KR20140123732A (ko) | 2013-04-15 | 2013-04-15 | 스피커 영역을 벗어난 음원의 공간정보기반 렌더링 방법 |
KR1020130040923A KR20140123730A (ko) | 2013-04-15 | 2013-04-15 | 스피커 영역을 벗어난 음원의 렌더링 방법 |
KR1020130040957A KR20140123744A (ko) | 2013-04-15 | 2013-04-15 | 스피커 영역을 벗어난 음원의 렌더링 방법에서의 가상 음원 생성 방법 |
KR10-2013-0040960 | 2013-04-15 | ||
KR20130040960A KR20140123746A (ko) | 2013-04-15 | 2013-04-15 | 서브 우퍼를 이용한 스피커 영역을 벗어난 객체 음원의 렌더링 방법 |
KR10-2013-0040931 | 2013-04-15 | ||
KR10-2013-0040957 | 2013-04-15 | ||
KR10-2013-0040923 | 2013-04-15 | ||
KR10-2013-0045502 | 2013-04-24 | ||
KR1020130045502A KR20140127022A (ko) | 2013-04-24 | 2013-04-24 | 가상 객체 생성을 이용한 오디오 신호처리 방법. |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2014171706A1 true WO2014171706A1 (ko) | 2014-10-23 |
Family
ID=51731580
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/KR2014/003250 WO2014171706A1 (ko) | 2013-04-15 | 2014-04-15 | 가상 객체 생성을 이용한 오디오 신호 처리 방법 |
Country Status (3)
Country | Link |
---|---|
US (1) | US20160066118A1 (ko) |
CN (1) | CN105144751A (ko) |
WO (1) | WO2014171706A1 (ko) |
Families Citing this family (12)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
KR102343330B1 (ko) | 2014-12-01 | 2021-12-24 | 삼성전자주식회사 | 스피커의 위치 정보에 기초하여, 오디오 신호를 출력하는 방법 및 디바이스 |
US10334387B2 (en) * | 2015-06-25 | 2019-06-25 | Dolby Laboratories Licensing Corporation | Audio panning transformation system and method |
ES2916342T3 (es) * | 2016-01-19 | 2022-06-30 | Sphereo Sound Ltd | Síntesis de señales para la reproducción de audio inmersiva |
US9956910B2 (en) * | 2016-07-18 | 2018-05-01 | Toyota Motor Engineering & Manufacturing North America, Inc. | Audible notification systems and methods for autonomous vehicles |
US9980078B2 (en) * | 2016-10-14 | 2018-05-22 | Nokia Technologies Oy | Audio object modification in free-viewpoint rendering |
EP3726859A4 (en) | 2017-12-12 | 2021-04-14 | Sony Corporation | SIGNAL PROCESSING DEVICE AND METHOD, AND PROGRAM |
KR20190083863A (ko) * | 2018-01-05 | 2019-07-15 | 가우디오랩 주식회사 | 오디오 신호 처리 방법 및 장치 |
US10999693B2 (en) * | 2018-06-25 | 2021-05-04 | Qualcomm Incorporated | Rendering different portions of audio data using different renderers |
WO2020016685A1 (en) | 2018-07-18 | 2020-01-23 | Sphereo Sound Ltd. | Detection of audio panning and synthesis of 3d audio from limited-channel surround sound |
US11122386B2 (en) * | 2019-06-20 | 2021-09-14 | Qualcomm Incorporated | Audio rendering for low frequency effects |
US12069464B2 (en) | 2019-07-09 | 2024-08-20 | Dolby Laboratories Licensing Corporation | Presentation independent mastering of audio content |
MX2023002587A (es) * | 2020-09-09 | 2023-03-22 | Sony Group Corp | Dispositivo y metodo de procesamiento acustico y programa. |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2009526467A (ja) * | 2006-02-09 | 2009-07-16 | エルジー エレクトロニクス インコーポレイティド | オブジェクトベースオーディオ信号の符号化及び復号化方法とその装置 |
KR20100062773A (ko) * | 2008-12-02 | 2010-06-10 | 한국전자통신연구원 | 오디오 컨텐츠 재생 장치 |
KR20120023826A (ko) * | 2009-06-24 | 2012-03-13 | 프라운호퍼 게젤샤프트 쭈르 푀르데룽 데어 안겐반텐 포르슝 에. 베. | 오디오 신호 디코더, 오디오 신호를 디코딩하는 방법 및 캐스케이드된 오디오 객체 처리 단계들을 이용한 컴퓨터 프로그램 |
KR20120062758A (ko) * | 2009-08-14 | 2012-06-14 | 에스알에스 랩스, 인크. | 오디오 객체들을 적응적으로 스트리밍하기 위한 시스템 |
US20120232910A1 (en) * | 2011-03-09 | 2012-09-13 | Srs Labs, Inc. | System for dynamically creating and rendering audio objects |
Family Cites Families (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US6175631B1 (en) * | 1999-07-09 | 2001-01-16 | Stephen A. Davis | Method and apparatus for decorrelating audio signals |
DE10355146A1 (de) * | 2003-11-26 | 2005-07-07 | Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. | Vorrichtung und Verfahren zum Erzeugen eines Tieftonkanals |
JP5114981B2 (ja) * | 2007-03-15 | 2013-01-09 | 沖電気工業株式会社 | 音像定位処理装置、方法及びプログラム |
US8315396B2 (en) * | 2008-07-17 | 2012-11-20 | Fraunhofer-Gesellschaft Zur Foerderung Der Angewandten Forschung E.V. | Apparatus and method for generating audio output signals using object based metadata |
EP2154911A1 (en) * | 2008-08-13 | 2010-02-17 | Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. | An apparatus for determining a spatial output multi-channel audio signal |
EP3913931B1 (en) * | 2011-07-01 | 2022-09-21 | Dolby Laboratories Licensing Corp. | Apparatus for rendering audio, method and storage means therefor. |
KR102003191B1 (ko) * | 2011-07-01 | 2019-07-24 | 돌비 레버러토리즈 라이쎈싱 코오포레이션 | 적응형 오디오 신호 생성, 코딩 및 렌더링을 위한 시스템 및 방법 |
US8996296B2 (en) * | 2011-12-15 | 2015-03-31 | Qualcomm Incorporated | Navigational soundscaping |
WO2013181272A2 (en) * | 2012-05-31 | 2013-12-05 | Dts Llc | Object-based audio system using vector base amplitude panning |
CA2893729C (en) * | 2012-12-04 | 2019-03-12 | Samsung Electronics Co., Ltd. | Audio providing apparatus and audio providing method |
-
2014
- 2014-04-15 WO PCT/KR2014/003250 patent/WO2014171706A1/ko active Application Filing
- 2014-04-15 US US14/784,349 patent/US20160066118A1/en not_active Abandoned
- 2014-04-15 CN CN201480021306.3A patent/CN105144751A/zh active Pending
Patent Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2009526467A (ja) * | 2006-02-09 | 2009-07-16 | エルジー エレクトロニクス インコーポレイティド | オブジェクトベースオーディオ信号の符号化及び復号化方法とその装置 |
KR20100062773A (ko) * | 2008-12-02 | 2010-06-10 | 한국전자통신연구원 | 오디오 컨텐츠 재생 장치 |
KR20120023826A (ko) * | 2009-06-24 | 2012-03-13 | 프라운호퍼 게젤샤프트 쭈르 푀르데룽 데어 안겐반텐 포르슝 에. 베. | 오디오 신호 디코더, 오디오 신호를 디코딩하는 방법 및 캐스케이드된 오디오 객체 처리 단계들을 이용한 컴퓨터 프로그램 |
KR20120062758A (ko) * | 2009-08-14 | 2012-06-14 | 에스알에스 랩스, 인크. | 오디오 객체들을 적응적으로 스트리밍하기 위한 시스템 |
US20120232910A1 (en) * | 2011-03-09 | 2012-09-13 | Srs Labs, Inc. | System for dynamically creating and rendering audio objects |
Also Published As
Publication number | Publication date |
---|---|
CN105144751A (zh) | 2015-12-09 |
US20160066118A1 (en) | 2016-03-03 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
WO2014171706A1 (ko) | 가상 객체 생성을 이용한 오디오 신호 처리 방법 | |
Holman | Surround sound: up and running | |
ES2871224T3 (es) | Sistema y método para la generación, codificación e interpretación informática (o renderización) de señales de audio adaptativo | |
WO2015037905A1 (ko) | 입체음향 조절기를 내포한 멀티 뷰어 영상 및 3d 입체음향 플레이어 시스템 및 그 방법 | |
JP5688030B2 (ja) | 三次元音場の符号化および最適な再現の方法および装置 | |
WO2009116800A9 (ko) | 오브젝트중심의 입체음향 좌표표시를 갖는 디스플레이장치 | |
US20060165247A1 (en) | Ambient and direct surround sound system | |
WO2015156654A1 (ko) | 음향 신호의 렌더링 방법, 장치 및 컴퓨터 판독 가능한 기록 매체 | |
WO2014175668A1 (ko) | 오디오 신호 처리 방법 | |
Lee | Multichannel 3D microphone arrays: A review | |
WO2014175591A1 (ko) | 오디오 신호처리 방법 | |
WO2015147435A1 (ko) | 오디오 신호 처리 시스템 및 방법 | |
JP2018110366A (ja) | 3dサウンド映像音響機器 | |
KR101235832B1 (ko) | 실감 멀티미디어 서비스 제공 방법 및 장치 | |
US10321252B2 (en) | Transaural synthesis method for sound spatialization | |
KR20090107453A (ko) | 오브젝트중심의 입체음향 좌표표시를 갖는 디스플레이장치 | |
Rumsey | Surround Sound 1 | |
CN114915874A (zh) | 音频处理方法、装置、设备、介质及程序产品 | |
Kim | Height channels | |
WO2018150774A1 (ja) | 音声信号処理装置及び音声信号処理システム | |
Fisher | Instant surround sound | |
Floros et al. | Spatial enhancement for immersive stereo audio applications | |
Kunchur | 3D imaging in two-channel stereo sound: Portrayal of elevation | |
Ando | Preface to the Special Issue on High-reality Audio: From High-fidelity Audio to High-reality Audio | |
KR20140123746A (ko) | 서브 우퍼를 이용한 스피커 영역을 벗어난 객체 음원의 렌더링 방법 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
WWE | Wipo information: entry into national phase |
Ref document number: 201480021306.3 Country of ref document: CN |
|
121 | Ep: the epo has been informed by wipo that ep was designated in this application |
Ref document number: 14785055 Country of ref document: EP Kind code of ref document: A1 |
|
WWE | Wipo information: entry into national phase |
Ref document number: 14784349 Country of ref document: US |
|
NENP | Non-entry into the national phase |
Ref country code: DE |
|
122 | Ep: pct application non-entry in european phase |
Ref document number: 14785055 Country of ref document: EP Kind code of ref document: A1 |