WO2019147040A1 - Method for upmixing stereo audio as binaural audio and apparatus therefor - Google Patents

Method for upmixing stereo audio as binaural audio and apparatus therefor

Info

Publication number
WO2019147040A1
Authority
WO
WIPO (PCT)
Prior art keywords
stereo
binaural
output
cubic
audio
Application number
PCT/KR2019/001018
Other languages
English (en)
Korean (ko)
Inventor
김동준
Original Assignee
김동준
Application filed by 김동준
Publication of WO2019147040A1

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04S STEREOPHONIC SYSTEMS
    • H04S7/00 Indicating arrangements; Control arrangements, e.g. balance control
    • H04S7/30 Control circuits for electronic adaptation of the sound field
    • H04S7/302 Electronic adaptation of stereophonic sound system to listener position or orientation
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/008 Multichannel audio signal coding or decoding using interchannel correlation to reduce redundancy, e.g. joint-stereo, intensity-coding or matrixing
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04S STEREOPHONIC SYSTEMS
    • H04S7/00 Indicating arrangements; Control arrangements, e.g. balance control
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04R LOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
    • H04R2420/00 Details of connection covered by H04R, not provided for in its groups
    • H04R2420/01 Input selection or mixing for amplifiers or loudspeakers
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04S STEREOPHONIC SYSTEMS
    • H04S2420/00 Techniques used in stereophonic systems covered by H04S but not provided for in its groups
    • H04S2420/03 Application of parametric coding in stereophonic audio systems

Definitions

  • the present invention relates to a technique for upmixing stereo audio to binaural audio, and more particularly, to a technique for upmixing stereo audio by combining a binaural output using treble and bass and a wide stereo output using a middle tone.
  • content that includes multichannel audio signals beyond 5.1 channels, such as 7.1, 10.2, 11.1, and 22.2 channels, is increasing.
  • since the user terminals on which this content is played, such as stereo speakers, headphones, and earphones, reproduce audio signals in stereo form, high-quality multichannel audio signals need to be converted into stereo audio signals.
  • Korean Patent Laid-Open Publication No. 10-2015-0013073 discloses a technique relating to a binaural rendering method and apparatus for multi-channel audio signals.
  • a method of upmixing stereo audio to binaural audio comprising: performing binaural encoding based on a treble region and a bass region separated from a stereo signal to generate a binaural output; performing stereo wide processing based on the middle region separated from the stereo signal to produce a wide stereo output; and summing the stereo signal, the binaural output, and the wide stereo output to produce an upmix stereo output.
  • the binaural output is generated corresponding to a three-dimensional vector for a binaural point located in an 8-channel-based three-dimensional cubic consisting of four up channels and four down channels,
  • the positions of the four up channels may be set based on the treble region, and the positions of the four down channels may be set based on the bass region.
  • the positions of the four up channels may be set using any one of the treble frequencies detected based on the size of the transient in the treble region, and the positions of the four down channels may be set using any one of the bass frequencies detected based on the size of the transient in the bass region.
  • the distance between the upper layer of the three-dimensional cubic, composed of the four up channels, and the lower layer of the three-dimensional cubic, composed of the four down channels, can be set based on the equalizer value of the stereo signal.
  • a wide stereo output is generated based on a wide stereo layer corresponding to the middle region, and the wide stereo layer may correspond to a stereo layer having an expanded image space corresponding to a reverberation value and a delay value.
  • the three-dimensional vector can be generated based on the reference listening point positioned inside the three-dimensional cubic.
  • the binaural output generating step may generate the binaural output by applying the direction information of the three-dimensional vector to the three-dimensional cubic rotated according to the head tracking information.
  • the three-dimensional cubic can be rotated according to a rotation parameter of at least one of pan, tilt, and roll.
  • the binaural output may include harmonics based on the fundamental frequency of the upper layer.
  • the upmixing method may further include separating the stereo signal into the treble region, the middle region, and the bass region by inputting the stereo signal to a treble pass filter, a mid-tone pass filter, and a low-pass filter, respectively.
  • the upmix apparatus includes a processor that performs binaural encoding based on the treble and bass regions separated from a stereo signal to generate a binaural output, performs stereo wide processing based on the middle region separated from the stereo signal to produce a wide stereo output, and sums the stereo signal, the binaural output, and the wide stereo output to produce an upmix stereo output; and
  • a memory for storing the stereo signal, the binaural output, and the wide stereo output.
  • the binaural output is generated corresponding to a three-dimensional vector for a binaural point located in an 8-channel-based three-dimensional cubic consisting of four up channels and four down channels,
  • the positions of the four up channels may be set based on the treble region, and the positions of the four down channels may be set based on the bass region.
  • the positions of the four up channels may be set using any one of the treble frequencies detected based on the size of the transient in the treble region, and the positions of the four down channels may be set using any one of the bass frequencies detected based on the size of the transient in the bass region.
  • the distance between the upper layer of the three-dimensional cubic, composed of the four up channels, and the lower layer of the three-dimensional cubic, composed of the four down channels, can be set based on the equalizer value of the stereo signal.
  • a wide stereo output is generated based on a wide stereo layer corresponding to the middle region, and the wide stereo layer may correspond to a stereo layer having an expanded image space corresponding to a reverberation value and a delay value.
  • the three-dimensional vector can be generated based on the reference listening point positioned inside the three-dimensional cubic.
  • the processor can generate the binaural output by applying the direction information of the three-dimensional vector to the three-dimensional cubic rotated according to the head tracking information.
  • the three-dimensional cubic can be rotated according to a rotation parameter of at least one of pan, tilt, and roll.
  • the binaural output may include harmonics based on the fundamental frequency of the upper layer.
  • the processor can separate the stereo signal into the treble region, the middle region, and the bass region by inputting the stereo signal to a treble pass filter, a mid-tone pass filter, and a low-pass filter, respectively.
  • the present invention can reduce the time and cost required for mixing a stereo file into an immersive file.
  • the present invention can improve compatibility with various kinds of contents based on natural upmix.
  • FIG. 1 is a diagram illustrating a stereo audio upmix structure according to an embodiment of the present invention.
  • FIG. 2 is a block diagram showing an upmix apparatus according to an embodiment of the present invention.
  • FIGS. 3 to 5 are views showing an example of a filter for separating a high-frequency region, a middle frequency region and a low frequency region of a stereo signal according to the present invention.
  • FIG. 6 illustrates a detailed structure for generating a binaural output according to an embodiment of the present invention.
  • FIG. 7 is a diagram illustrating an example of an upper layer and a lower layer in an 8-channel-based three-dimensional cubic according to the present invention.
  • FIG. 8 is a conceptual view illustrating an example of a stereo audio upmix effect according to the present invention.
  • FIG. 9 is a view showing a distance between a top layer and a bottom layer in 3D cubic according to the present invention.
  • FIG. 10 is a diagram showing an example of a three-dimensional vector according to the present invention.
  • FIG. 11 is a diagram illustrating an example in which direction information of a three-dimensional vector is applied to a three-dimensional cubic rotated according to head tracking information according to the present invention.
  • FIG. 12 is a diagram showing an example of a rotation parameter according to the present invention.
  • FIG. 13 illustrates a detailed structure for generating a wide stereo output according to an embodiment of the present invention.
  • FIG. 14 is a diagram illustrating an example of expanding a stereo image according to the present invention.
  • FIG. 15 is a diagram illustrating an example of a structure in which the upper and lower layers of the three-dimensional cubic according to the present invention are combined with a wide stereo layer.
  • FIG. 16 is a flowchart illustrating a method of upmixing stereo audio to binaural audio according to an embodiment of the present invention.
  • FIG. 1 is a diagram illustrating a stereo audio upmix structure according to an embodiment of the present invention.
  • in the stereo audio upmix structure, a stereo signal 110 corresponding to two channels is input to a treble pass filter 121, a mid-tone pass filter 122, and a low-pass filter 123, respectively.
  • of the stereo signal 110 input to the treble pass filter 121, only the treble region passes through the treble pass filter 121 and is input to the binaural encoder 130.
  • of the stereo signal 110 input to the mid-tone pass filter 122, only the middle region passes through the mid-tone pass filter 122 and is input to the stereo wider 140.
  • of the stereo signal 110 input to the low-pass filter 123, only the bass region passes through the low-pass filter 123 and can be input to the binaural encoder 130 together with the treble region.
  • the treble region can have directionality, so the treble region and the bass region are separated from the stereo signal and binaural encoding can be performed on them for the immersive effect.
  • the binaural encoder 130 may generate a three-dimensional layer using the two stereo channels corresponding to the treble region and the two stereo channels corresponding to the bass region, and can perform binaural encoding.
  • the middle region input to the stereo wider 140 can undergo stereo wide processing, which expands the stereo image region, without binaural encoding.
  • in the binaural mixer 150, the binaural output from the binaural encoder 130 and the wide stereo output from the stereo wider 140 can be combined with the stereo signal 110 to produce an upmix stereo output.
  • the upmix stereo output may correspond to a stereo signal or stereo audio including an immersive effect. That is, according to the present invention, it is possible to produce an immersive effect on a stereo audio or a stereo audio content without performing another immersive mixing.
  • FIG. 2 is a block diagram showing an upmix apparatus according to an embodiment of the present invention.
  • the upmix apparatus includes a communication unit 210, a processor 220, and a memory 230.
  • the communication unit 210 transmits and receives information required for generating upmix stereo audio through a communication network.
  • the communication unit 210 may receive a stereo signal or content corresponding to a source for upmix stereo audio generation, as well as head tracking information input through a head tracking module or a user interface for binaural encoding, and may provide upmix stereo audio corresponding to the upmix stereo output.
  • the processor 220 performs binaural encoding based on the high and low regions of the stereo signal to produce a binaural output.
  • the stereo signal can be input to the treble pass filter, the mid-tone pass filter, and the low-pass filter, respectively, so that it is separated into the treble region, the middle region, and the bass region.
  • the processor 220 may input a stereo signal corresponding to two channels to the treble pass filter 300, the mid-tone pass filter 400, and the low-pass filter 500, respectively.
  • the treble pass filter 300 corresponds to a filter that passes only the treble range of the input range of the stereo signal, and can output a stereo signal in a treble range as shown in FIG.
  • the mid-frequency pass filter 400 corresponds to a filter which passes only the middle region of the input band of the stereo signal, and can output the stereo signal of the middle region as shown in FIG.
  • the low-pass filter 500 corresponds to a filter which passes only the bass region of the input band of the stereo signal, and can output the low-frequency region stereo signal as shown in Fig.
  • the treble pass filter 300, the mid-tone pass filter 400, and the low-pass filter 500 used in the present invention are not limited to a specific filtering method.
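
The three-way separation described above can be sketched as follows. This is only an illustration under stated assumptions: the source does not specify a filtering method, so the first-order low-pass filter, the crossover frequencies (`bass_cut_hz`, `treble_cut_hz`), the sample rate, and all function names are hypothetical. The sketch is applied to one channel at a time, so a stereo signal would be split by calling it once for the left channel and once for the right.

```python
import math


def one_pole_lowpass(samples, cutoff_hz, sample_rate):
    """First-order low-pass filter (exponential smoothing)."""
    alpha = 1.0 - math.exp(-2.0 * math.pi * cutoff_hz / sample_rate)
    state = 0.0
    out = []
    for x in samples:
        state += alpha * (x - state)
        out.append(state)
    return out


def split_three_bands(samples, bass_cut_hz=200.0, treble_cut_hz=4000.0,
                      sample_rate=48000):
    """Split one channel into bass, middle, and treble regions.

    The crossover frequencies are assumptions, not values from the source.
    The bands are complementary: bass + middle + treble equals the input.
    """
    below_bass_cut = one_pole_lowpass(samples, bass_cut_hz, sample_rate)
    below_treble_cut = one_pole_lowpass(samples, treble_cut_hz, sample_rate)
    bass = below_bass_cut
    middle = [hi - lo for hi, lo in zip(below_treble_cut, below_bass_cut)]
    treble = [x - hi for x, hi in zip(samples, below_treble_cut)]
    return bass, middle, treble
```

Deriving the middle and treble regions by subtraction keeps the three bands complementary, which is convenient when the bands are later summed back with the source signal.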
  • the binaural output is generated corresponding to a three-dimensional vector for a binaural point located in an 8-channel-based three-dimensional cubic consisting of four up channels and four down channels,
  • the positions of the four up channels are set based on the treble region, and the positions of the four down channels can be set based on the bass region.
  • the 8-channel-based three-dimensional cubic corresponds to an element for creating a three-dimensional spatial image, and is a three-dimensional layer composed of an upper layer of four up channels and a lower layer of four down channels.
  • a binaural encoder 620 corresponding to the three-dimensional cubic may receive the two channels corresponding to the treble region 611 and the two channels corresponding to the bass region 612 separated from the stereo signal, and can perform binaural encoding.
  • the positions of the four up channels can be set using any one of the treble frequencies detected based on the size of the transient in the treble region 611, and the positions of the four down channels may be set using any one of the bass frequencies detected based on the size of the transient in the bass region 612.
  • the transient may be the initial amplitude rise that appears when the sound is first started in the waveform of the sound.
  • one transient is detected in each of the treble region 611 and the bass region 612, and the left and right channels generated by separating the detected frequencies in real time can be used to set the four up channels and the four down channels.
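
A minimal sketch of locating a transient as an initial amplitude rise is shown below. The source does not describe the detection algorithm, so the frame-peak envelope, the `frame_size` parameter, and the function names are assumptions made for illustration; estimating the frequency at the detected position is left out.

```python
def frame_peaks(samples, frame_size):
    """Peak absolute amplitude of each consecutive frame."""
    return [max(abs(s) for s in samples[i:i + frame_size])
            for i in range(0, len(samples), frame_size)]


def largest_transient(samples, frame_size=64):
    """Return (sample_index, rise) of the largest frame-to-frame amplitude rise.

    Hypothetical helper: treats the biggest jump in the peak envelope as
    the transient of the band it is given.
    """
    peaks = frame_peaks(samples, frame_size)
    best_frame, best_rise = 0, 0.0
    for i in range(1, len(peaks)):
        rise = peaks[i] - peaks[i - 1]
        if rise > best_rise:
            best_frame, best_rise = i, rise
    return best_frame * frame_size, best_rise
```

Under these assumptions the function would be called once on the treble region and once on the bass region, giving one transient per region.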
  • stereo enhancement may be applied to a panning value of an upper layer corresponding to the high-frequency region 611 to generate a natural sound.
  • the treble frequency detected in the treble region 611 is subjected to left/right separation processing to obtain positions corresponding to the left channel L and the right channel R; the speaker 711 can be disposed at the position of the left channel, and the speaker 712 can be disposed at the position of the right channel.
  • thereafter, the speaker 713 is disposed at a position where the left channel L and the right channel R are combined in accordance with 'L- (LR)', and the speaker 714 is disposed at a position where the left channel L and the right channel R are combined in accordance with 'R- (LR)', so that the upper layer 710 of the three-dimensional cubic can be constructed.
  • the bass frequency detected in the bass region 612 is subjected to left/right separation processing to obtain positions corresponding to the left channel L and the right channel R; the speaker 721 can be disposed at the position of the left channel, and the speaker 722 can be disposed at the position of the right channel.
  • thereafter, the speaker 723 is disposed at a position where the left channel L and the right channel R are combined in accordance with 'L- (LR)', and the speaker 724 is disposed at a position where the left channel L and the right channel R are combined in accordance with 'R- (LR)', so that the lower layer 720 of the three-dimensional cubic can be constructed.
  • the binaural output 630 shown in FIG. 6 is generated by binaurally encoding the 8-channel-based audio corresponding to the eight speakers 711 to 714 and 721 to 724 shown in FIG. 7, and may be output in a stereo format corresponding to two channels as shown in FIG. At this time, the two channels corresponding to the binaural output 630 may correspond to the left channel and the right channel, respectively.
  • by binaurally encoding the stereo signals of the treble and bass regions, which were only two channels 810, a binaural output can be generated.
  • the upmix apparatus may include another usable three-dimensional layer or a three-dimensional layer to be developed in the future.
  • the distance between the upper layer of the three-dimensional cubic, constituted by the four up channels, and the lower layer of the three-dimensional cubic, constituted by the four down channels, may be set based on the equalizer value of the stereo signal.
  • the equalizer (EQ) value of the stereo signal adjusts the sense of space of the sound by adjusting the frequency band.
  • the distance 910 between the upper layer and the lower layer of the three-dimensional cubic shown in FIG. 9 corresponds to the equalizer value. That is, the distance 910 between the upper layer and the lower layer can be adjusted by adjusting the frequency (Hz) of the treble band corresponding to the upper layer or the frequency of the bass band corresponding to the lower layer.
  • the three-dimensional vector can be generated based on the reference listening point located inside the cubic.
  • a reference listening point 1010, which is a virtual representation of the position of a user or a listener, may be located at the inner center of a three-dimensional cubic 1000 having eight dynamic speakers as its vertices.
  • when the binaural point 1020 is located on the upper layer of the three-dimensional cubic 1000 as shown in FIG. 10, the three-dimensional vector 1030 corresponding to the binaural output is generated in the direction from the reference listening point 1010 to the binaural point 1020, and the output sound may be formed above the listener.
  • also, when the binaural point 1020 is positioned lower than the reference listening point 1010 in the three-dimensional cubic 1000, the output sound may be formed below the listener.
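
The relationship between the reference listening point, the binaural point, and the three-dimensional vector can be illustrated with a short sketch. The coordinate system (z axis pointing up, listening point at the origin by default) and the function names are assumptions introduced here, not details from the source.

```python
import math


def direction_vector(listening_point, binaural_point):
    """Unit vector pointing from the listening point to the binaural point."""
    dx, dy, dz = (b - a for a, b in zip(listening_point, binaural_point))
    length = math.sqrt(dx * dx + dy * dy + dz * dz)
    if length == 0.0:
        raise ValueError("binaural point coincides with the listening point")
    return (dx / length, dy / length, dz / length)


def vertical_placement(listening_point, binaural_point):
    """Report whether the sound forms above or below the listener (z is up)."""
    z = direction_vector(listening_point, binaural_point)[2]
    if z > 0:
        return "above"
    if z < 0:
        return "below"
    return "level"
```

For example, with the listening point at the center of a cube whose upper layer lies at z = 1, a binaural point on the upper layer yields a vector with a positive z component, which corresponds to sound formed above the listener.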
  • the reference listening point 1010 may be located on the wide stereo layer corresponding to the middle region of the stereo signal, which is located inside the three-dimensional cubic 1000. That is, a wide stereo layer corresponding to a two-channel-based stereo layer may be located between the upper layer and the lower layer of the three-dimensional cubic 1000.
  • the binaural output can be generated by applying the direction information of the three-dimensional vector to the three-dimensional cubic rotated according to the head tracking information. That is, since the binaural point is set based on the listener's head corresponding to the reference listening point, the position of the binaural point on the three-dimensional cubic can also change if the listener's head position or angle changes.
  • the 3D cubic 1000 shown in FIG. 10 is rotated as shown in FIG. 11 corresponding to the head tracking information.
  • the direction information of the three-dimensional vector 1030 shown in FIG. 10 can be directly applied to the three-dimensional cubic as shown in FIG. 11 to detect the position of the changed binaural point according to the rotation.
  • the head tracking information corresponds to data obtained by tracking the head movement of the user or listener, and may be obtained corresponding to at least one of a tracking input based on a separate head tracking module and a user input based on the user interface.
  • the head tracking module can measure the distance or angle of movement of the user's head and generate and transmit the head tracking information.
  • the head tracking information may be artificially provided by the user or the listener through the user interface. That is, the user or the listener may input the head tracking information based on the user interface regardless of whether the head tracking information is received by the head tracking module in order to artificially rotate the spatial image. At this time, the user or the listener may input and modify head tracking information while listening to a mixing process of generating an upmix stereo output or an upmix stereo output varying according to input information.
  • the three-dimensional cubic can be rotated according to a rotation parameter of at least one of pan, tilt, and roll.
  • when the listener rotates the head in at least one of pan, tilt, and roll as shown in FIG. 12, the corresponding rotation can be applied to the three-dimensional cubic.
  • the effect of rotating the three-dimensional cubic, or moving it vertically and horizontally, according to the head tracking information can be mixed with a wide stereo output and a stereo signal to generate an upmix stereo output.
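
Rotating the eight vertices of the cubic by pan, tilt, and roll can be sketched with basic rotation formulas. The axis conventions (pan about the vertical z axis, tilt about the x axis, roll about the y axis), the rotation order, the unit cube coordinates, and the names below are all assumptions; the source does not define them.

```python
import math

# Hypothetical vertex layout: four upper-layer points (z = 1) followed by
# four lower-layer points (z = -1).
CUBE_VERTICES = [(x, y, z)
                 for z in (1.0, -1.0)
                 for y in (1.0, -1.0)
                 for x in (-1.0, 1.0)]


def rotate_point(point, pan=0.0, tilt=0.0, roll=0.0):
    """Rotate a point by pan (about z), then tilt (about x), then roll (about y).

    Angles are in radians.
    """
    x, y, z = point
    c, s = math.cos(pan), math.sin(pan)
    x, y = c * x - s * y, s * x + c * y
    c, s = math.cos(tilt), math.sin(tilt)
    y, z = c * y - s * z, s * y + c * z
    c, s = math.cos(roll), math.sin(roll)
    x, z = c * x + s * z, -s * x + c * z
    return (x, y, z)


def rotate_cube(vertices, pan=0.0, tilt=0.0, roll=0.0):
    """Apply the same head-tracking rotation to every vertex."""
    return [rotate_point(v, pan, tilt, roll) for v in vertices]
```

Because every vertex receives the same rotation, the shape of the cubic is preserved and only its orientation relative to the listener changes.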
  • the binaural output may include harmonics based on the fundamental frequency of the upper layer.
  • the harmonics correspond to overtones whose frequencies are integer multiples of the reference (fundamental) frequency, and are included in the binaural output used for mixing to provide musical naturalness.
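
As a small worked example of "integer multiples of the fundamental frequency", the sketch below lists harmonic frequencies. The function name, the `count` parameter, and the 20 kHz upper limit are assumptions for illustration only.

```python
def harmonic_frequencies(fundamental_hz, count, max_hz=20000.0):
    """Frequencies of the first `count` harmonics above the fundamental.

    Each harmonic is an integer multiple (2x, 3x, ...) of the fundamental;
    values above `max_hz` are dropped.
    """
    return [fundamental_hz * n
            for n in range(2, count + 2)
            if fundamental_hz * n <= max_hz]
```

For a fundamental of 440 Hz, the first three harmonics under this definition are 880 Hz, 1320 Hz, and 1760 Hz.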
  • the processor 220 performs stereo wide processing based on the middle region separated from the stereo signal to produce a wide stereo output.
  • the wide stereo output may be generated based on a wide stereo layer corresponding to the midrange region.
  • stereo wide processing may be performed corresponding to a wide stereo layer based on the middle region 1310 of the stereo signal input to the stereo wider 1320.
  • the wide stereo output 1330 may be output in a stereo format corresponding to two channels as shown in FIG.
  • the wide stereo layer corresponds to an element for producing a stereo image, which may correspond to an extended stereo layer corresponding to the reverb value and the delay value.
  • a stereo image region 1400 may be extended based on a Reverb 1410 and a Delay or Pan 1420.
  • the reverb 1410 corresponds to sound from the source that reaches the ear after being reflected two or more times off surfaces such as walls, the floor, or the ceiling, and the size of the space corresponding to the stereo image region 1400 can be adjusted in the front/back direction.
  • the value of the reverberation 1410 can be adjusted based on a pre-delay value corresponding to a time taken for the original sound to be heard and then the reverberation 1410 to be heard.
  • the delay in the delay or pan 1420 corresponds to the delay values for the left and right channels, and by setting these values to differ from each other, the size of the space corresponding to the stereo image region 1400 can be adjusted in the left/right direction.
  • by adjusting the delay or pan 1420, the present invention can adjust the left/right size of the space corresponding to the stereo image region 1400.
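
Giving the left and right channels different delay values can be sketched as below. This covers only the delay part of the wide stereo processing; the reverb is not modeled, and the function names and sample-based delay parameters are assumptions rather than the source's implementation.

```python
def delay_channel(samples, delay_samples):
    """Delay a channel by prepending silence, keeping the original length."""
    if delay_samples <= 0:
        return list(samples)
    return ([0.0] * delay_samples + list(samples))[:len(samples)]


def widen_by_delay(left, right, left_delay=0, right_delay=0):
    """Apply independent delays (in samples) to the left and right channels."""
    return delay_channel(left, left_delay), delay_channel(right, right_delay)
```

A usage example under these assumptions: `widen_by_delay(left, right, left_delay=0, right_delay=24)` delays only the right channel by 24 samples, which is 0.5 ms at a 48 kHz sample rate.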
  • a wide stereo layer 1530 may be located, in surround form, in combination with a three-dimensional cubic including an upper layer 1510 and a lower layer 1520.
  • the structure shown in FIG. 15 corresponds to an embodiment, and is not limited to a structure in which the respective layers are combined.
  • the processor 220 combines the stereo signal, binaural output, and wide stereo output to produce an upmix stereo output.
  • an upmix stereo output with an immersive effect can be generated by mixing the immersive element provided by the binaural output and the extended stereo effect provided by the wide stereo output together with the source stereo signal.
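
The final summing step can be sketched as a per-sample addition of the three two-channel signals. The `(left, right)` tuple representation, the optional `gains` parameter, and the function name are assumptions; the source states only that the three signals are summed.

```python
def mix_upmix(stereo, binaural, wide, gains=(1.0, 1.0, 1.0)):
    """Sum source stereo, binaural output, and wide stereo output per channel.

    Each argument is a (left, right) pair of equal-length sample lists.
    `gains` is a hypothetical per-signal weighting; the default is a plain sum.
    """
    g_src, g_bin, g_wide = gains
    mixed = []
    for ch in range(2):
        mixed.append([g_src * s + g_bin * b + g_wide * w
                      for s, b, w in zip(stereo[ch], binaural[ch], wide[ch])])
    return tuple(mixed)
```

In practice a plain sum of three signals can exceed full scale, so a real mixer would likely scale the gains or limit the result; that is outside what the source describes.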
  • because the present invention can support a natural upmix function based on the processor 220 having the above-described functions, compatibility between content supporting various kinds of sound can be improved.
  • the memory 230 stores a stereo signal, binaural output, and wide stereo output.
  • the memory 230 stores various information generated in the process of generating the upmix stereo output according to an embodiment of the present invention, as described above.
  • the memory 230 may be configured independently of the upmix device to support the upmix stereo audio generation function. At this time, the memory 230 may operate as a separate mass storage and may include control functions for performing operations.
  • the upmix apparatus is equipped with a memory and can store information within the apparatus.
  • the memory is a computer-readable medium.
  • the memory may be a volatile memory unit, and in other embodiments, the memory may be a non-volatile memory unit.
  • the storage device is a computer-readable medium.
  • the storage device may include, for example, a hard disk device, an optical disk device, or any other mass storage device.
  • FIG. 16 is a flowchart illustrating a method of upmixing stereo audio to binaural audio according to an embodiment of the present invention.
  • a method of upmixing stereo audio to binaural audio according to an embodiment of the present invention performs binaural encoding based on a treble region and a bass region separated from a stereo signal to generate a binaural output (S1610).
  • the stereo signal can be input to the treble pass filter, the mid-tone pass filter, and the low-pass filter, respectively, so that it is separated into the treble region, the middle region, and the bass region.
  • the processor 220 may input a stereo signal corresponding to two channels to the treble pass filter 300, the mid-tone pass filter 400, and the low-pass filter 500, respectively.
  • the treble pass filter 300 corresponds to a filter that passes only the treble range of the input range of the stereo signal, and can output a stereo signal in a treble range as shown in FIG.
  • the mid-frequency pass filter 400 corresponds to a filter which passes only the middle region of the input band of the stereo signal, and can output the stereo signal of the middle region as shown in FIG.
  • the low-pass filter 500 corresponds to a filter which passes only the bass region of the input band of the stereo signal, and can output the low-frequency region stereo signal as shown in Fig.
  • the treble pass filter 300, the mid-tone pass filter 400, and the low-pass filter 500 used in the present invention are not limited to a specific filtering method.
  • the binaural output is generated corresponding to a three-dimensional vector for a binaural point located in an 8-channel-based three-dimensional cubic consisting of four up channels and four down channels,
  • the positions of the four up channels are set based on the treble region, and the positions of the four down channels can be set based on the bass region.
  • the 8-channel-based three-dimensional cubic corresponds to an element for creating a three-dimensional spatial image, and is a three-dimensional layer composed of an upper layer of four up channels and a lower layer of four down channels.
  • a binaural encoder 620 corresponding to the three-dimensional cubic may receive the two channels corresponding to the treble region 611 and the two channels corresponding to the bass region 612 separated from the stereo signal, and can perform binaural encoding.
  • the positions of the four up channels can be set using any one of the treble frequencies detected based on the size of the transient in the treble region 611, and the positions of the four down channels may be set using any one of the bass frequencies detected based on the size of the transient in the bass region 612.
  • the transient may be the initial amplitude rise that appears when the sound is first started in the waveform of the sound.
  • one transient is detected in each of the treble region 611 and the bass region 612, and the left and right channels generated by separating the detected frequencies in real time can be used to set the four up channels and the four down channels.
  • stereo enhancement may be applied to a panning value of an upper layer corresponding to the high-frequency region 611 to generate a natural sound.
  • the treble frequency detected in the treble region 611 is subjected to left/right separation processing to obtain positions corresponding to the left channel L and the right channel R; the speaker 711 can be disposed at the position of the left channel, and the speaker 712 can be disposed at the position of the right channel.
  • thereafter, the speaker 713 is disposed at a position where the left channel L and the right channel R are combined in accordance with 'L- (LR)', and the speaker 714 is disposed at a position where the left channel L and the right channel R are combined in accordance with 'R- (LR)', so that the upper layer 710 of the three-dimensional cubic can be constructed.
  • the bass frequency detected in the bass region 612 is subjected to left/right separation processing to obtain positions corresponding to the left channel L and the right channel R; the speaker 721 can be disposed at the position of the left channel, and the speaker 722 can be disposed at the position of the right channel.
  • thereafter, the speaker 723 is disposed at a position where the left channel L and the right channel R are combined in accordance with 'L- (LR)', and the speaker 724 is disposed at a position where the left channel L and the right channel R are combined in accordance with 'R- (LR)', so that the lower layer 720 of the three-dimensional cubic can be constructed.
  • the binaural output 630 shown in FIG. 6 is generated by binaurally encoding the 8-channel-based audio corresponding to the eight speakers 711 to 714 and 721 to 724 shown in FIG. 7, and may be output in a stereo format corresponding to two channels as shown in FIG. At this time, the two channels corresponding to the binaural output 630 may correspond to the left channel and the right channel, respectively.
  • by binaurally encoding the stereo signals of the treble and bass regions, which were only two channels 810, a binaural output can be generated.
  • the upmix apparatus may include another usable three-dimensional layer or a three-dimensional layer to be developed in the future.
  • the distance between the upper layer of the three-dimensional cubic, constituted by the four upper channels, and the lower layer of the three-dimensional cubic, constituted by the four lower channels, may be set based on the equalizer value of the stereo signal.
  • the equalizer (EQ) value of the stereo signal is used to adjust the sense of space of the sound by adjusting frequency bands.
  • the distance 910 between the upper layer and the lower layer of the three-dimensional cubic shown in FIG. 9 corresponds to the equalizer value. That is, the distance 910 between the upper layer and the lower layer can be adjusted by adjusting the frequency (Hz) of the high-frequency band corresponding to the upper layer or the frequency of the low-frequency band corresponding to the lower layer.
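The layer geometry described above can be pictured as eight vertex positions whose vertical spacing is the layer distance. The following is only a minimal sketch: the coordinate convention, the `half_width` parameter, and the assignment of speakers 711 to 714 and 721 to 724 to particular corners are assumptions made for illustration, and the mapping from an equalizer value to `layer_distance` is not specified in the text and is left to the caller.

```python
from typing import Dict, Tuple

Vec3 = Tuple[float, float, float]


def cubic_vertices(layer_distance: float, half_width: float = 1.0) -> Dict[int, Vec3]:
    """Return hypothetical (x, y, z) positions for the eight virtual speakers.

    Assumed axes: x is left(-)/right(+), y is rear(-)/front(+), z is down(-)/up(+).
    The upper layer sits at +layer_distance/2, the lower layer at -layer_distance/2.
    """
    if layer_distance <= 0:
        raise ValueError("layer_distance must be positive")
    z_up, z_down = layer_distance / 2, -layer_distance / 2
    w = half_width
    # Corner order is an assumption: front-left, front-right, rear-left, rear-right.
    corners = [(-w, w), (w, w), (-w, -w), (w, -w)]
    upper = {711 + i: (x, y, z_up) for i, (x, y) in enumerate(corners)}
    lower = {721 + i: (x, y, z_down) for i, (x, y) in enumerate(corners)}
    return {**upper, **lower}
```

Increasing `layer_distance` moves the two layers apart without changing the horizontal layout, which is the effect the description attributes to adjusting the equalizer value.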
  • the three-dimensional vector can be generated based on the reference listening point located inside the cubic.
  • a reference listening point 1010, which is a virtual representation of the position of a user or listener, may be located at the inner center of a three-dimensional cubic 1000 having eight dynamic speakers as its vertices.
  • when the binaural point 1020 is located on the upper layer of the three-dimensional cubic 1000 as shown in FIG. 10, the three-dimensional vector 1030 corresponding to the binaural output may be generated in the direction from the reference listening point 1010 to the binaural point 1020.
  • in this case, the output sound may be formed above the listener. Conversely, when the binaural point 1020 is positioned lower than the reference listening point 1010 in the three-dimensional cubic 1000, the output sound may be formed below the listener.
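The relationship between the reference listening point, the binaural point, and the three-dimensional vector can be sketched as follows. The function names and the convention that the third coordinate is height are assumptions for illustration; the text only states that the vector points from the listening point to the binaural point and that a higher or lower binaural point places the sound above or below the listener.

```python
import math
from typing import Sequence, Tuple


def direction_vector(listening_point: Sequence[float],
                     binaural_point: Sequence[float]) -> Tuple[float, float, float]:
    """Unit vector pointing from the listening point to the binaural point."""
    dx, dy, dz = (b - a for a, b in zip(listening_point, binaural_point))
    norm = math.sqrt(dx * dx + dy * dy + dz * dz)
    if norm == 0:
        raise ValueError("binaural point coincides with the listening point")
    return (dx / norm, dy / norm, dz / norm)


def vertical_placement(listening_point: Sequence[float],
                       binaural_point: Sequence[float]) -> str:
    """Report whether the sound image forms above, below, or level with the listener."""
    dz = binaural_point[2] - listening_point[2]  # assumed: index 2 is height
    if dz > 0:
        return "above"
    if dz < 0:
        return "below"
    return "level"
```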
  • the reference listening point 1010 may be located on the wide stereo layer corresponding to the midrange region of the stereo signal, which is located inside the three-dimensional cubic 1000. That is, a wide stereo layer corresponding to a two-channel based stereo layer may be located between the upper layer and the lower layer of the three-dimensional cubic 1000.
  • the binaural output can be generated by applying the direction information of the three-dimensional vector to the three-dimensional cubic rotated in accordance with the head tracking information. That is, since the binaural point is set relative to the listener's head at the reference listening point, the position of the binaural point on the three-dimensional cubic also changes when the position or angle of the listener's head changes.
  • the 3D cubic 1000 shown in FIG. 10 is rotated as shown in FIG. 11 corresponding to the head tracking information.
  • the direction information of the three-dimensional vector 1030 shown in FIG. 10 can be directly applied to the three-dimensional cubic as shown in FIG. 11 to detect the position of the changed binaural point according to the rotation.
  • the head tracking information corresponds to data obtained by tracking the head movement of the user or listener, and may be obtained corresponding to at least one of a tracking input based on a separate head tracking module and a user input based on the user interface.
  • the head tracking module can measure the distance or angle of movement of the user's head and generate and transmit the head tracking information.
  • the head tracking information may also be provided manually by the user or listener through the user interface. That is, in order to rotate the spatial image deliberately, the user or listener may input head tracking information through the user interface regardless of whether head tracking information is received from the head tracking module. The user or listener may input and modify the head tracking information while listening to the mixing process that generates the upmix stereo output, or to the upmix stereo output as it varies with the input information.
  • the three-dimensional cubic can be rotated in accordance with a rotation parameter for at least one of pan, tilt, and roll.
  • when the listener rotates the head in at least one of pan, tilt, and roll as shown in FIG. 12, the corresponding value is obtained as a rotation parameter.
  • the effect of rotating the three-dimensional cubic, or of moving it vertically and horizontally, according to the head tracking information can be mixed with the wide stereo output and the stereo signal to generate the upmix stereo output.
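Rotating the cubic by pan, tilt, and roll can be sketched with ordinary rotation matrices. The axis assignment (pan about the vertical z axis, tilt about the left/right x axis, roll about the front/back y axis), the composition order, and the sign convention are assumptions; the text does not fix them. Angles are in radians.

```python
import math
from typing import List, Sequence, Tuple

Matrix = List[List[float]]


def _matmul(a: Matrix, b: Matrix) -> Matrix:
    """Multiply two 3x3 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def rotation_matrix(pan: float, tilt: float, roll: float) -> Matrix:
    """Compose Rz(pan) * Rx(tilt) * Ry(roll); the order is an assumed convention."""
    cp, sp = math.cos(pan), math.sin(pan)
    ct, st = math.cos(tilt), math.sin(tilt)
    cr, sr = math.cos(roll), math.sin(roll)
    rz = [[cp, -sp, 0.0], [sp, cp, 0.0], [0.0, 0.0, 1.0]]
    rx = [[1.0, 0.0, 0.0], [0.0, ct, -st], [0.0, st, ct]]
    ry = [[cr, 0.0, sr], [0.0, 1.0, 0.0], [-sr, 0.0, cr]]
    return _matmul(rz, _matmul(rx, ry))


def rotate_point(point: Sequence[float], m: Matrix) -> Tuple[float, float, float]:
    """Apply a 3x3 rotation matrix to a single vertex or direction vector."""
    x, y, z = (sum(m[i][k] * point[k] for k in range(3)) for i in range(3))
    return (x, y, z)
```

Applying `rotate_point` to each of the eight vertices rotates the whole cubic, after which the unchanged direction vector can be used to locate the new binaural point.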
  • the binaural output may include harmonics based on the fundamental frequency of the upper layer.
  • the harmonics are the upper harmonics whose frequencies are integer multiples of the fundamental frequency; they are included in the binaural output and used in the mix to provide musical naturalness.
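As a small worked example of "integer multiples of the fundamental", the sketch below lists the upper harmonics of a given fundamental. The optional `max_hz` cap (for instance, half the sample rate) is an assumption added for practicality and is not mentioned in the text.

```python
from typing import List, Optional


def harmonic_frequencies(fundamental_hz: float, count: int,
                         max_hz: Optional[float] = None) -> List[float]:
    """Return the first `count` upper harmonics (2f, 3f, ...) of a fundamental."""
    if fundamental_hz <= 0 or count < 0:
        raise ValueError("fundamental must be positive and count non-negative")
    freqs = [fundamental_hz * n for n in range(2, count + 2)]
    if max_hz is not None:
        # Assumed cap: drop harmonics above a chosen limit.
        freqs = [f for f in freqs if f <= max_hz]
    return freqs
```

For a 440 Hz fundamental, the first three upper harmonics fall at 880 Hz, 1320 Hz, and 1760 Hz.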
  • stereo wide processing is performed based on a midrange region separated from the stereo signal to generate a wide stereo output (S1620).
  • the wide stereo output may be generated based on a wide stereo layer corresponding to the midrange region.
  • stereo wide processing may be performed corresponding to the wide stereo layer based on the midrange region 1310 of the stereo signal input to the stereo wider 1320.
  • the wide stereo output 1330 may be output in a stereo format corresponding to two channels as shown in FIG.
  • the wide stereo layer corresponds to an element for producing a stereo image, which may correspond to an extended stereo layer corresponding to the reverb value and the delay value.
  • a stereo image region 1400 may be extended based on a Reverb 1410 and a Delay or Pan 1420.
  • the reverberation 1410 corresponds to sound from the source that reaches the ear after being reflected two or more times by surfaces such as a wall, floor, or ceiling; by adjusting it, the size of the space corresponding to the stereo image region 1400 can be adjusted in the forward/backward direction.
  • the value of the reverberation 1410 can be adjusted based on a pre-delay value, which is the time between hearing the original sound and hearing the reverberation 1410.
  • the delay in the delay or pan 1420 corresponds to the delay values for the left and right channels; by setting these values to differ from each other, the size of the space corresponding to the stereo image region 1400 can be adjusted in the left/right direction.
  • the present invention can thus adjust the delay or pan 1420 to adjust the left/right size of the space corresponding to the stereo image region 1400.
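The left/right widening by unequal channel delays can be sketched as below. This covers only the delay part of the description, not the reverberation or pre-delay; the function names, the millisecond units, and the default sample rate are assumptions for illustration.

```python
from typing import List, Sequence, Tuple


def delay_in_samples(delay_ms: float, sample_rate: int) -> int:
    """Convert a delay in milliseconds to a whole number of samples."""
    if delay_ms < 0:
        raise ValueError("delay must be non-negative")
    return round(delay_ms * sample_rate / 1000)


def delay_channel(samples: Sequence[float], n: int) -> List[float]:
    """Delay one channel by n samples, keeping the original length."""
    n = min(n, len(samples))
    return [0.0] * n + list(samples)[:len(samples) - n]


def widen_by_delay(left: Sequence[float], right: Sequence[float],
                   left_delay_ms: float, right_delay_ms: float,
                   sample_rate: int = 48000) -> Tuple[List[float], List[float]]:
    """Apply different delays to the left and right channels."""
    return (
        delay_channel(left, delay_in_samples(left_delay_ms, sample_rate)),
        delay_channel(right, delay_in_samples(right_delay_ms, sample_rate)),
    )
```

Setting the two delay values further apart increases the inter-channel time difference, which is the mechanism the description relies on to change the perceived left/right size of the stereo image.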
  • a wide stereo layer 1530 may be located in combination with a three-dimensional cubic that includes an upper layer 1510 and a lower layer 1520 in surround form.
  • the structure shown in FIG. 15 is only one embodiment, and the way the respective layers are combined is not limited to this structure.
  • a method of upmixing stereo audio to binaural audio generates an upmix stereo output by summing a stereo signal, a binaural output, and a wide stereo output (S1630).
  • a stereo output with an immersive effect can be generated by mixing the immersive element provided by the binaural output and the extended stereo effect provided by the wide stereo output together with the source stereo signal.
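The summing step (S1630) can be sketched as a per-frame sum of the three two-channel signals. The per-signal gains and the hard clipping to the range [-1, 1] are assumptions added to make the sketch usable; the text only states that the stereo signal, the binaural output, and the wide stereo output are summed.

```python
from typing import List, Sequence, Tuple

Frame = Tuple[float, float]  # (left, right) sample pair


def _clip(x: float) -> float:
    """Limit a sample to the nominal [-1.0, 1.0] range (assumed safeguard)."""
    return max(-1.0, min(1.0, x))


def upmix_sum(stereo: Sequence[Frame], binaural: Sequence[Frame],
              wide: Sequence[Frame],
              gains: Tuple[float, float, float] = (1.0, 1.0, 1.0)) -> List[Frame]:
    """Sum the source stereo, binaural output, and wide stereo output frame by frame."""
    if not (len(stereo) == len(binaural) == len(wide)):
        raise ValueError("all three signals must have the same number of frames")
    g_s, g_b, g_w = gains
    out: List[Frame] = []
    for (sl, sr), (bl, br), (wl, wr) in zip(stereo, binaural, wide):
        left = g_s * sl + g_b * bl + g_w * wl
        right = g_s * sr + g_b * br + g_w * wr
        out.append((_clip(left), _clip(right)))
    return out
```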
  • since the present invention can support a natural upmix function based on the above-described functions, it can improve compatibility between contents supporting various kinds of sound.
  • a method of upmixing stereo audio to binaural audio transmits and receives the information required for generating upmix stereo audio through a communication network.
  • for example, a stereo signal or content serving as the source for generating the upmix stereo audio, head tracking information input through a head tracking module or a user interface for binaural encoding, and the like can be received in order to generate the upmix stereo audio.
  • the method of upmixing stereo audio to binaural audio according to an embodiment of the present invention may store various information generated in the process of generating the upmix stereo output.
  • Embodiments of the present invention may be implemented in a computer-implemented method or in a non-transitory computer-readable medium having recorded thereon instructions executable by the computer.
  • when the computer-readable instructions are executed by a processor, they can perform at least one aspect of the invention.
  • the method and apparatus for upmixing stereo audio to binaural audio according to the present invention are not limited to the configurations and methods of the above-described embodiments; all or some of the embodiments may be selectively combined so that various modifications can be made.
  • the present invention relates to a method of upmixing stereo audio to binaural audio and an apparatus therefor. It allows an existing stereo file to be upmixed without undergoing separate immersive mixing, which can reduce the time and cost required to mix content and improve compatibility with various kinds of content based on a natural upmix, thereby contributing to the development of the industry.

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Signal Processing (AREA)
  • Mathematical Physics (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Multimedia (AREA)
  • Stereophonic System (AREA)

Abstract

The present invention relates to a method for upmixing stereo audio as binaural audio and an apparatus therefor. A method of upmixing stereo audio as binaural audio according to an embodiment of the present invention comprises the steps of: generating a binaural output by performing binaural encoding based on a treble range and a bass range separated from a stereo signal; generating a wide stereo output by performing stereo wide processing based on a midrange separated from the stereo signal; and generating an upmix stereo output by mixing the stereo signal, the binaural output, and the wide stereo output.
PCT/KR2019/001018 2018-01-29 2019-01-24 Method for upmixing stereo audio as binaural audio and apparatus therefor WO2019147040A1 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
KR1020180010877A KR102119240B1 (ko) 2018-01-29 2018-01-29 Method for upmixing stereo audio to binaural audio and apparatus therefor
KR10-2018-0010877 2018-01-29

Publications (1)

Publication Number Publication Date
WO2019147040A1 true WO2019147040A1 (fr) 2019-08-01

Family

ID=67396070

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/KR2019/001018 WO2019147040A1 (fr) 2018-01-29 2019-01-24 Method for upmixing stereo audio as binaural audio and apparatus therefor

Country Status (2)

Country Link
KR (1) KR102119240B1 (fr)
WO (1) WO2019147040A1 (fr)

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR102416271B1 (ko) 2020-07-30 2022-07-05 주식회사 케이제이인터내셔널 Mattress
KR102248410B1 (ko) 2020-08-18 2021-05-06 주식회사 한성넥스 Mattress
KR102283572B1 (ko) 2021-01-26 2021-07-29 주식회사 코잔 Combination-type mattress system
KR102308368B1 (ko) 2021-03-30 2021-10-06 주식회사 한성넥스 Mattress
KR102702173B1 (ko) 2022-08-29 2024-09-04 주식회사 쏘포유 Air-circulation mattress using natural latex and memory foam

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR20100120684A (ko) * 2008-02-14 2010-11-16 돌비 레버러토리즈 라이쎈싱 코오포레이션 Stereophonic widening
KR20150083734A (ko) * 2014-01-10 2015-07-20 삼성전자주식회사 Method and apparatus for reproducing stereophonic sound using an active downmix scheme
WO2017051079A1 (fr) * 2015-09-25 2017-03-30 Nokia Technologies Oy Differential head-tracking apparatus
WO2017072118A1 (fr) * 2015-10-26 2017-05-04 Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. Apparatus and method for generating a filtered audio signal realizing elevation rendering
WO2017203011A1 (fr) * 2016-05-24 2017-11-30 Stephen Malcolm Frederick Smyth Systems and methods for improving audio virtualization systems

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR102007991B1 (ko) 2013-07-25 2019-08-06 한국전자통신연구원 Method and apparatus for binaural rendering of multichannel audio signals

Also Published As

Publication number Publication date
KR20190091825A (ko) 2019-08-07
KR102119240B1 (ko) 2020-06-05

Similar Documents

Publication Publication Date Title
WO2019147040A1 (fr) Method for upmixing stereo audio as binaural audio and apparatus therefor
WO2015156654A1 (fr) Method and apparatus for rendering a sound signal, and computer-readable recording medium
WO2018056780A1 (fr) Binaural audio signal processing method and apparatus
WO2018147701A1 (fr) Method and apparatus for processing an audio signal
WO2019004524A1 (fr) Audio playback method and audio playback apparatus in a six-degrees-of-freedom environment
WO2017191970A2 (fr) Audio signal processing method and apparatus for binaural rendering
WO2015147532A2 (fr) Sound signal rendering method, apparatus, and computer-readable recording medium
WO2018182274A1 (fr) Audio signal processing method and device
WO2011115430A2 (fr) Method and apparatus for reproducing three-dimensional sound
WO2015147619A1 (fr) Method and apparatus for rendering an acoustic signal, and computer-readable medium
WO2011139090A2 (fr) Method and apparatus for reproducing stereophonic sound
WO2015105393A1 (fr) Method and apparatus for reproducing three-dimensional audio content
WO2015142073A1 (fr) Audio signal processing method and apparatus
WO2016089180A1 (fr) Audio signal processing method and apparatus for binaural rendering
WO2019107868A1 (fr) Apparatus and method for outputting an audio signal, and display apparatus using the same
WO2012172480A2 (fr) System for producing natural 360-degree three-dimensional digital stereo surround sound (3D DSSR N-360)
WO2013103256A1 (fr) Method and device for localizing a multichannel audio signal
WO2006051001A1 (fr) Partial audio processing method, program product, electronic device and system
EP3596939A1 (fr) Sound output apparatus and signal processing method thereof
WO2018101600A1 (fr) Electronic apparatus and control method thereof
WO2019031652A1 (fr) Three-dimensional audio playback method and playback apparatus
WO2015060696A1 (fr) Method and apparatus for reproducing stereophonic sound
WO2019031767A1 (fr) Display apparatus and control method thereof
WO2016182184A1 (fr) Three-dimensional sound reproduction device and method
WO2019013400A1 (fr) Method and device for outputting audio linked with video screen zoom

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 19743309

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 19743309

Country of ref document: EP

Kind code of ref document: A1

32PN Ep: public notification in the ep bulletin as address of the adressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 08/02/2021)

122 Ep: pct application non-entry in european phase

Ref document number: 19743309

Country of ref document: EP

Kind code of ref document: A1