WO2022176417A1 - Information processing device, information processing method, and program - Google Patents
- Publication number
- WO2022176417A1 (PCT application PCT/JP2022/000160)
- Authority
- WO
- WIPO (PCT)
Classifications
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04S—STEREOPHONIC SYSTEMS
- H04S7/00—Indicating arrangements; Control arrangements, e.g. balance control
- H04S7/30—Control circuits for electronic adaptation of the sound field
- H04S7/301—Automatic calibration of stereophonic sound system, e.g. with test microphone
- G—PHYSICS
- G01—MEASURING; TESTING
- G01H—MEASUREMENT OF MECHANICAL VIBRATIONS OR ULTRASONIC, SONIC OR INFRASONIC WAVES
- G01H17/00—Measuring mechanical vibrations or ultrasonic, sonic or infrasonic waves, not provided for in the preceding groups
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10K—SOUND-PRODUCING DEVICES; METHODS OR DEVICES FOR PROTECTING AGAINST, OR FOR DAMPING, NOISE OR OTHER ACOUSTIC WAVES IN GENERAL; ACOUSTICS NOT OTHERWISE PROVIDED FOR
- G10K15/00—Acoustics not otherwise provided for
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04R—LOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
- H04R1/00—Details of transducers, loudspeakers or microphones
- H04R1/10—Earpieces; Attachments therefor; Earphones; Monophonic headphones
- H04R1/1041—Mechanical or electronic switches, or control elements
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04S—STEREOPHONIC SYSTEMS
- H04S2420/00—Techniques used in stereophonic systems covered by H04S but not provided for in its groups
- H04S2420/01—Enhancing the perception of the sound image or of the spatial distribution using head related transfer functions [HRTF's] or equivalents thereof, e.g. interaural time difference [ITD] or interaural level difference [ILD]
Definitions
- The present technology relates to an information processing device, an information processing method, and a program, and more particularly to an information processing device, an information processing method, and a program that enable spatial perception suitable for a place where silence is required.
- Patent Documents 1 and 2 disclose systems in which a visually impaired person perceives the surroundings from echoes of actually emitted test sounds, or from simulated echoes generated from the actually measured positions of objects.
- The present technology was created in view of this situation, and enables spatial perception suitable for places where silence is required.
- The information processing device of the present technology is an information processing device having a processing unit that adds a change corresponding to the situation of a space to a reproduction signal perceived by the user, based on an ultrasonic response signal returned from the space in response to an inspection signal in the ultrasonic frequency band radiated into the space; the program of the present technology is a program for causing a computer to function as such an information processing device.
- The information processing method of the present technology is an information processing method in which the processing unit of an information processing apparatus having a processing unit adds a change corresponding to the situation of the space to the reproduction signal perceived by the user, based on the ultrasonic response signal returned from the space in response to the inspection signal in the ultrasonic frequency band radiated into the space.
- In the information processing device, the information processing method, and the program of the present technology, a change corresponding to the situation of the space is thus added to the reproduction signal that the user perceives.
- FIG. 1 is a configuration diagram showing a configuration example of a first embodiment of a sound processing device to which the present technology is applied.
- FIG. 2 is a diagram illustrating the frequency spectrum of a transfer function in the audible range (the frequency spectrum of the audible range IR).
- FIG. 3 is a diagram illustrating the convolution processing performed by a reverberant sound generation unit.
- FIG. 4 is a flowchart illustrating the processing procedure of the sound processing device of FIG. 1.
- FIG. 5 is a block diagram illustrating the configuration of a second embodiment of a sound processing device to which the present technology is applied.
- FIG. 6 is a diagram illustrating the external configuration of an echo data collection device.
- FIG. 7 is a flowchart showing the processing procedure of the echo data collection device.
- FIG. 8 is a flowchart illustrating the processing procedure of the sound processing device of FIG. 5.
- FIG. 9 is a diagram showing the input and output of an inference model in an audible range IR generation unit.
- FIG. 10 is a diagram showing the input and output of a GAN in an audible range IR generation unit.
- FIG. 11 is a block diagram showing a configuration example of the hardware of a computer that executes the series of processes by a program.
- FIG. 1 is a configuration diagram showing a configuration example of a first embodiment of a sound processing device to which the present technology is applied.
- The acoustic processing device 1 of the present embodiment in FIG. 1 includes, for example, an audio output device, such as earphones, headphones, or speakers, that converts sound signals (electric signals) into sound waves.
- The audio output device may be connected to the main body of the sound processing device 1 by wire or wirelessly, or the main body of the sound processing device 1 may be incorporated into the audio output device.
- In the following, it is assumed that stereo earphones are connected to the main body of the sound processing device 1 by wire, and that the sound processing device 1 is composed of the main body and the earphones.
- The sound processing device 1 allows the user to perceive the situation of the space around the user through sound.
- The acoustic processing device 1 has an ultrasonic transmission unit 11, a binaural microphone 12, an audible range IR (Impulse Response) generation unit 13, a reverberant sound generation unit 14, and an audio output unit 15.
- The ultrasonic transmission unit 11 emits ultrasonic pulses (signals) as inspection waves into the space at predetermined time intervals (a predetermined cycle).
- The ultrasonic transmission unit 11 has, for example, a right speaker and a left speaker installed in the right earphone worn on the user's right ear and the left earphone worn on the user's left ear, respectively.
- The right speaker radiates ultrasonic pulses over a wide directivity angle range centered on the central axis pointing rightward from the user's head.
- The left speaker radiates ultrasonic pulses over a wide directivity angle range centered on the central axis pointing leftward from the user's head.
- The speakers of the ultrasonic transmission unit 11 may be arranged at positions other than the ears, and the number of speakers may be other than two.
- The ultrasonic pulse emitted by the ultrasonic transmission unit 11 is, for example, an ultrasonic signal in the ultrasonic frequency band of 40 kHz to 80 kHz, and has a pulse width of about 1 ms.
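The description fixes only the band (40 kHz to 80 kHz) and the pulse width (about 1 ms); the concrete waveform and sampling rate are not specified in this text. As a hedged sketch, a windowed linear chirp covering that band could serve as the inspection pulse (the 192 kHz sampling rate and the chirp waveform are illustrative assumptions, not taken from the patent):

```python
import numpy as np

FS = 192_000        # sampling rate in Hz (assumption; must exceed 2 x 80 kHz)
WIDTH = 1e-3        # pulse width of about 1 ms, as in the description
F_LO, F_HI = 40_000.0, 80_000.0   # ultrasonic frequency band of the pulse

def make_inspection_pulse(fs=FS, width=WIDTH, f_lo=F_LO, f_hi=F_HI):
    """Windowed linear chirp sweeping f_lo..f_hi over `width` seconds."""
    t = np.arange(round(fs * width)) / fs
    # instantaneous frequency rises linearly from f_lo to f_hi
    phase = 2.0 * np.pi * (f_lo * t + 0.5 * (f_hi - f_lo) / width * t ** 2)
    return np.sin(phase) * np.hanning(t.size)  # taper limits out-of-band energy

pulse = make_inspection_pulse()   # 192 samples at 192 kHz = 1 ms
```

Any band-limited waveform with these parameters would fit the description equally well; the chirp is chosen only because its energy is spread over the whole 40 kHz to 80 kHz band.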
- The binaural microphone 12 receives, in stereo, an ultrasonic impulse response signal (hereinafter referred to as the ultrasonic IR): the ultrasonic pulse emitted into the space by the ultrasonic transmission unit 11, reflected (scattered) by objects placed in the space, and returned.
- The binaural microphone 12 has, for example, a right microphone and a left microphone installed in the right earphone and the left earphone, respectively.
- The right microphone mainly receives the ultrasonic IR corresponding to the ultrasonic pulses radiated from the right speaker of the ultrasonic transmission unit 11.
- The left microphone mainly receives the ultrasonic IR corresponding to the ultrasonic pulses radiated from the left speaker of the ultrasonic transmission unit 11.
- The microphones for receiving the ultrasonic IR may be arranged at positions other than the ears, and the number of microphones may be other than two.
- The binaural microphone 12, like the audio output device, may be connected to the main body of the sound processing device 1 by wire or wirelessly, or the main body of the sound processing device 1 may be incorporated into the binaural microphone 12.
- The ultrasonic IR received by the binaural microphone 12 is supplied to the audible range IR generation unit 13.
- The ultrasonic IR consists of two channels: the ultrasonic IR(R) received by the right microphone of the binaural microphone 12 and the ultrasonic IR(L) received by the left microphone.
- When the ultrasonic IR(R) and the ultrasonic IR(L) are not particularly distinguished, they are simply referred to as the ultrasonic IR.
- The audible range IR generation unit 13 converts the ultrasonic IR from the binaural microphone 12 into the audible range IR.
- The audible range IR consists of two channels: the audible range IR(R) obtained from the ultrasonic IR(R) and the audible range IR(L) obtained from the ultrasonic IR(L).
- When the audible range IR(R) and the audible range IR(L) are not particularly distinguished, they are simply referred to as the audible range IR.
- The audible range IR generation unit 13 frequency-converts (Fourier-transforms) the ultrasonic IR of each cycle from the binaural microphone 12 from the time domain representation to the frequency domain representation (frequency spectrum), for example by FFT (Fast Fourier Transform).
- The audible range IR generation unit 13 shifts the frequency spectrum of the ultrasonic IR (the ultrasonic IR in the frequency domain) to the audible range (audible frequency band) and adjusts its bandwidth to fit. As a result, an audible impulse response signal (the audible range IR) for the space into which the ultrasonic pulse was emitted is generated.
- The audible range IR generated by the audible range IR generation unit 13 is the audible range IR expressed in the frequency domain, and is also referred to as the transfer function of the audible range, or simply the transfer function.
- The audible range IR generation unit 13 associates frequencies in the ultrasonic band (ultrasonic frequency band) of 40 kHz to 80 kHz with frequencies in the audible range of 20 Hz to 20 kHz. Specifically, when the frequency in the ultrasonic band is x and the frequency in the audible range is y, x and y are linearly associated by the following equation (1).
- The audible range IR generation unit 13 takes the frequency component at frequency x in the frequency spectrum of the ultrasonic IR as the frequency component at the audible-range frequency y associated with x by equation (1).
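The body of equation (1) does not survive in this text. Assuming it is the straight-line map taking the 40 kHz to 80 kHz band onto the 20 Hz to 20 kHz band described above, it would read:

```latex
y = \frac{20\,\mathrm{kHz} - 20\,\mathrm{Hz}}{80\,\mathrm{kHz} - 40\,\mathrm{kHz}}\,\bigl(x - 40\,\mathrm{kHz}\bigr) + 20\,\mathrm{Hz} \qquad (1)
```

so that x = 40 kHz maps to y = 20 Hz and x = 80 kHz maps to y = 20 kHz. This is a reconstruction consistent with the stated endpoints, not the verbatim equation.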
- The correspondence between the frequency x in the ultrasonic band and the frequency y in the audible range according to equation (1) is an example, and the correspondence is not limited to a linear one.
- The frequency ranges over which the frequency x in the ultrasonic band and the frequency y in the audible range are associated are not limited to the range of 40 kHz to 80 kHz in the ultrasonic band and the range of 20 Hz to 20 kHz in the audible range.
- For example, the ultrasonic pulse emitted from the ultrasonic transmission unit 11 may use only part of the 40 kHz to 80 kHz ultrasonic band, and that partial frequency range may be associated with the audible range of 20 Hz to 20 kHz.
- The audible range IR generation unit 13 then applies equalizing processing to the audible-range frequency components obtained in this way, so as to reflect the actual attenuation characteristics corresponding to the length of the propagation path when audible sound actually propagates in the space.
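As a sketch of how this shift-and-equalize step could look in code for one channel (the 192 kHz sampling rate, the bin-relabelling approach, and the 1/sqrt(f) equalization curve are all illustrative assumptions; the patent does not give the exact equalization):

```python
import numpy as np

FS = 192_000                      # sampling rate of the ultrasonic IR (assumption)
US_BAND = (40_000.0, 80_000.0)    # measured ultrasonic band
AUD_BAND = (20.0, 20_000.0)       # audible band the spectrum is shifted into

def ultrasonic_ir_to_audible_tf(us_ir, fs=FS):
    """FFT one channel of ultrasonic IR, relabel its in-band bins onto the
    audible range via the linear association, and apply a placeholder
    equalization standing in for the propagation-loss correction."""
    spec = np.fft.rfft(us_ir)
    freqs = np.fft.rfftfreq(us_ir.size, d=1.0 / fs)
    (x0, x1), (y0, y1) = US_BAND, AUD_BAND
    in_band = (freqs >= x0) & (freqs <= x1)
    y = (freqs[in_band] - x0) * (y1 - y0) / (x1 - x0) + y0   # linear association
    mag = np.abs(spec[in_band])
    eq = 1.0 / np.sqrt(y / y0)    # assumed high-frequency attenuation curve
    return y, mag * eq            # audible frequencies, transfer magnitudes

aud_f, aud_tf = ultrasonic_ir_to_audible_tf(np.ones(192))
```

With a 192-sample IR at 192 kHz, the bins are spaced 1 kHz apart, so the 40 kHz bin lands on 20 Hz and the 80 kHz bin on 20 kHz, matching the stated association.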
- FIG. 2 is a diagram exemplifying the frequency spectrum of the transfer function in the audible range (the frequency spectrum of the audible range IR) generated from the frequency spectrum of the ultrasonic IR by the audible range IR generation unit 13.
- The frequency spectrum 31 represents the frequency spectrum of the ultrasonic IR.
- The horizontal axis indicates frequency, and the frequency spectrum 31 has, for example, frequency components from 40 kHz to 80 kHz in the ultrasonic band. Although the attenuation characteristic of the frequency spectrum 31 with respect to frequency is approximated linearly here, it actually changes according to the spatial conditions and the like.
- The vertical axis represents power, so each frequency spectrum is shown on the graph as a power spectrum.
- The frequency spectrum 32 represents the frequency spectrum of the transfer function in the audible range generated by the audible range IR generation unit 13.
- The frequency spectrum 32 has, for example, frequency components from 20 Hz to 20 kHz in the audible range.
- The attenuation characteristic of the frequency spectrum 32 with respect to frequency is linearly approximated, like the frequency spectrum 31, but in reality it is not limited to this.
- The audible range IR generation unit 13 supplies the audible range transfer function (audible range IR) generated from the frequency spectrum of the ultrasonic IR to the reverberant sound generation unit 14 in FIG. 1.
- The transfer function in the audible range consists of two channels: the transfer function (R) generated from the ultrasonic IR(R) and the transfer function (L) generated from the ultrasonic IR(L).
- When the transfer function in the frequency domain representation is referred to as the audible range IR, or simply as the audible range IR without distinguishing between the time domain and frequency domain representations, the audible range IR likewise consists of two channels: the audible range IR(R) generated from the ultrasonic IR(R) and the audible range IR(L) generated from the ultrasonic IR(L). When the audible range IR(R) and the audible range IR(L) are not particularly distinguished, they are simply referred to as the audible range IR.
- The reverberant sound generation unit 14 gives the reproduced sound (signal) in the audible range heard by the user an acoustic effect based on the transfer function (audible range IR) from the audible range IR generation unit 13.
- The reproduced sound may be, for example, a sound signal stored in advance in a memory (not shown).
- The reproduced sound stored in the memory may be a sound signal specialized as a notification sound for notifying the user of the spatial situation, such as a continuous or intermittent alarm sound, or a sound signal such as music that the user has selected to listen to.
- The reproduced sound may also be a sound signal such as music supplied as streaming from an external device connected to the sound processing device 1 via a network such as the Internet.
- The reverberant sound generation unit 14 performs convolution reverb (sampling reverb) processing to convolve the transfer function (audible range IR) from the audible range IR generation unit 13 with the reproduced sound.
- Convolution reverb processing is also called convolution processing or convolution integration.
- The reverberant sound generation unit 14 performs the convolution processing (convolution integration) between the audible range IR and the reproduced sound that has been frequency-transformed (by FFT) into the frequency domain representation. In this case, the reproduced sound represented in the frequency domain is multiplied by the transfer function.
- The overlap-save method and the overlap-add method are known as methods of convolving a long reproduced sound (signal) using the FFT.
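The overlap-add method named here can be sketched as follows: the long reproduced signal is cut into blocks, each block is multiplied by the IR spectrum in the frequency domain, and the overlapping tails are summed back. The block size is an arbitrary choice for illustration:

```python
import numpy as np

def overlap_add_convolve(signal, ir, block=4096):
    """Convolve a long reproduced signal with an audible-range IR by the
    overlap-add method: FFT-multiply each block, then add overlapping tails."""
    n_fft = 1
    while n_fft < block + ir.size - 1:   # power of two holding block + tail
        n_fft *= 2
    ir_spec = np.fft.rfft(ir, n_fft)
    out = np.zeros(signal.size + ir.size - 1)
    for start in range(0, signal.size, block):
        seg = signal[start:start + block]
        seg_out = np.fft.irfft(np.fft.rfft(seg, n_fft) * ir_spec, n_fft)
        out[start:start + seg.size + ir.size - 1] += seg_out[:seg.size + ir.size - 1]
    return out
```

Because each zero-padded block convolution fits inside the FFT length, no circular wraparound occurs, and the summed result equals the direct linear convolution.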
- The reverberant sound generation unit 14 performs an inverse frequency transform (IFFT: Inverse Fast Fourier Transform) on the reproduced sound after the convolution reverb processing (convolution processing). This yields the reproduced sound in the time domain representation.
- The reverberant sound generation unit 14 supplies the reproduced sound to the audio output unit 15.
- FIG. 3 is a diagram illustrating the convolution processing performed by the reverberant sound generation unit 14.
- The audible range IR 33 represents the signal supplied from the audible range IR generation unit 13 to the reverberant sound generation unit 14.
- The audible range IR 33 in FIG. 3 is also the transfer function in the time domain representation.
- The reproduced sound 34 is a signal supplied to the reverberant sound generation unit 14 from a memory (not shown) or the like.
- A musical sound signal is shown as an example of the reproduced sound 34.
- The reproduced sound 35 is the sound signal supplied from the reverberant sound generation unit 14 to the audio output unit 15.
- The reverberant sound generation unit 14 performs the convolution integration of the reproduced sound 34 with the audible range IR 33, using the current audible range IR 33 until the next audible range IR 33 is supplied.
- The reproduced sound 35 obtained as a result is supplied to the audio output unit 15.
- The reproduced sound consists of two channels: a right reproduced sound (R) heard by the user's right ear and a left reproduced sound (L) heard by the left ear.
- The reverberant sound generation unit 14 supplies the result obtained by the convolution integration of the reproduced sound (R) and the transfer function (R) (audible range IR(R)) to the audio output unit 15 as the reproduced sound (R).
- The reverberant sound generation unit 14 supplies the result obtained by the convolution integration of the reproduced sound (L) and the transfer function (L) (audible range IR(L)) to the audio output unit 15 as the reproduced sound (L).
- When the reproduced sound (R) and the reproduced sound (L) are not particularly distinguished, they are simply referred to as the reproduced sound.
- In the audible range IR generation unit 13, the audible range IR (transfer function of the audible range) that would be obtained if an audible range pulse signal were radiated into the space is generated based on the ultrasonic IR. Therefore, it is not necessary to radiate a test sound in the audible range into the space, and the audible range IR corresponding to the situation of the space can be acquired even in a place where silence is required.
- Since the audible range IR generated by the audible range IR generation unit 13 is convolved with the reproduced sound heard by the user, an acoustic effect reflecting the situation of the space (such as the arrangement of objects in the space) is applied. That is, the reproduced sound is given an acoustic effect as if it were echoed by objects existing in the space. Therefore, the user can perceive from the acoustic effect on the reproduced sound, for example, that some object is approaching. Moreover, since content such as music can be used as the reproduced sound, as with a normal music player, long listening does not become a burden on the user.
- FIG. 4 is a flowchart illustrating the processing procedure of the sound processing device 1 of FIG. 1. Note that this flowchart shows the processing during one cycle of the ultrasonic pulses periodically radiated into the space.
- In step S11, the ultrasonic transmission unit 11 radiates (transmits) an ultrasonic pulse into the space. Processing proceeds from step S11 to step S12.
- In step S12, the binaural microphone 12 receives the ultrasonic IR returning from the space. Processing proceeds from step S12 to step S13.
- In step S13, the audible range IR generation unit 13 performs frequency conversion by FFT on the ultrasonic IR received in step S12 to obtain the ultrasonic IR in the frequency domain representation (the frequency spectrum of the ultrasonic IR). Processing proceeds from step S13 to step S14.
- In step S14, the audible range IR generation unit 13 shifts the band of the frequency spectrum of the ultrasonic IR obtained in step S13 to the audible range. Processing proceeds from step S14 to step S15.
- In step S15, the audible range IR generation unit 13 applies equalizing processing to the audible-range frequency components (frequency spectrum) shifted in step S14, reflecting the actual attenuation characteristics corresponding to the length of the propagation path when audible sound actually propagates in the space. As a result, the frequency spectrum of the audible range IR (the transfer function of the audible range) is obtained. Processing proceeds from step S15 to step S16.
- In step S16, the reverberant sound generation unit 14 performs frequency conversion on the reproduced sound (signal), and performs the convolution processing (convolution reverb processing) with the audible range IR obtained in step S15. As a result, an acoustic effect corresponding to the situation of the space is imparted to the reproduced sound. Processing proceeds from step S16 to step S17.
- In step S17, the reverberant sound generation unit 14 inversely transforms the reproduced sound to which the acoustic effect was applied in step S16 from the frequency domain representation to the time domain representation. Processing proceeds from step S17 to step S18.
- In step S18, the audio output unit 15 outputs the reproduced sound converted into the time domain representation in step S17 from the earphones or the like.
- The sound processing device 1 repeats the processing from step S11 to step S18 each time the ultrasonic transmission unit 11 radiates an ultrasonic pulse (one pulse) into the space.
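Steps S16 and S17, applied per channel as described for the reproduced sound (R) and (L), can be sketched as a single FFT-multiply-IFFT pass. This is an illustrative single-shot version; a streaming implementation would use the block-wise overlap methods mentioned above:

```python
import numpy as np

def render_cycle(playback_lr, audible_ir_lr):
    """Convolve each playback channel with its matching audible-range IR
    channel in the frequency domain, then IFFT back to the time domain."""
    rendered = []
    for playback, ir in zip(playback_lr, audible_ir_lr):
        n = playback.size + ir.size - 1                        # full length
        spec = np.fft.rfft(playback, n) * np.fft.rfft(ir, n)   # step S16
        rendered.append(np.fft.irfft(spec, n))                 # step S17
    return rendered    # [reproduced sound (R), reproduced sound (L)]
```

Pairing each playback channel with the IR of the same side mirrors the (R)-with-(R), (L)-with-(L) channel assignment described in the text.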
- FIG. 5 is a block diagram illustrating the configuration of the second embodiment of the sound processing device to which the present technology is applied.
- Parts common to the sound processing device 1 of FIG. 1 are denoted by the same reference numerals, and detailed description thereof is omitted as appropriate.
- The processing system 51 of FIG. 5 includes the sound processing device 52, which is the second embodiment of the sound processing device to which the present technology is applied, together with the devices used when constructing it.
- The processing system 51 has the sound processing device 52, an echo data collection device 61, and a generative model learning device 62.
- The sound processing device 52 has the ultrasonic transmission unit 11, the binaural microphone 12, the reverberant sound generation unit 14, the audio output unit 15, and an audible range IR generation unit 63. In this respect, the sound processing device 52 is common with the sound processing device 1 of FIG. 1.
- The sound processing device 52 differs from the sound processing device 1 of FIG. 1 in that the audible range IR generation unit 63 is provided instead of the audible range IR generation unit 13 of FIG. 1.
- The audible range IR generation unit 63 infers the audible range IR for the ultrasonic IR from the binaural microphone 12 using an inference model having a neural network structure.
- The inference model is generated by supervised learning using a machine learning technique in the generative model learning device 62.
- The inference model generated by the generative model learning device 62 is installed in the audible range IR generation unit 63.
- The echo data collection device 61 collects the data sets used for learning the inference model.
- The echo data collection device 61 has an ultrasonic transmission unit 71, an audible sound transmission unit 72, a binaural microphone 73, and a storage unit 74.
- The ultrasonic transmission unit 71, like the ultrasonic transmission unit 11 of the sound processing device 1, emits an ultrasonic pulse (signal) with a pulse width of about 1 ms, which is an ultrasonic signal in the ultrasonic frequency band of 40 kHz to 80 kHz.
- The period of the ultrasonic pulse does not have to match that of the ultrasonic pulse from the ultrasonic transmission unit 11 of the sound processing device 1, and is set to an arbitrary period.
- The audible sound transmission unit 72 emits an audible range pulse (signal) with a pulse width of about 1 ms, which is composed of an audible sound signal in the audible range of 20 Hz to 20 kHz.
- The pulse width of the audible range pulse is the same as that of the ultrasonic pulse emitted from the ultrasonic transmission unit 71, but may be different.
- The period of the audible range pulse is the same as that of the ultrasonic pulse emitted from the ultrasonic transmission unit 71; however, the emission timings of the ultrasonic pulse and the audible range pulse are staggered so that the time during which the audible range pulse is on does not overlap the time during which the ultrasonic pulse is on.
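The staggering described here amounts to offsetting the audible range pulse within the shared period so the two on-times never coincide. A minimal sketch (the 100 ms period and the half-period offset are assumed values, not from the patent):

```python
def pulse_schedule(period=0.1, width=1e-3, n_cycles=3):
    """Emission intervals (start, end) in seconds for the ultrasonic and
    audible range pulses; the audible pulse is delayed by half a period so
    its on-time never overlaps the ultrasonic pulse's on-time."""
    ultrasonic = [(k * period, k * period + width) for k in range(n_cycles)]
    audible = [(k * period + period / 2, k * period + period / 2 + width)
               for k in range(n_cycles)]
    return ultrasonic, audible
```

Any offset larger than the pulse width would satisfy the non-overlap condition; half a period simply maximizes the separation between the two pulse trains.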
- The binaural microphone 73 receives the ultrasonic IR and the audible range IR.
- The storage unit 74 stores the ultrasonic IR and the audible range IR received by the binaural microphone 73.
- FIG. 6 is a diagram illustrating the external configuration of the echo data collection device 61.
- A stand 83 supports a dummy head 82 imitating a human head and the areas around the left and right ears.
- At positions 81 near the outer ears of the right and left ears of the dummy head 82, the left and right ultrasonic speakers of the ultrasonic transmission unit 71, which emits the ultrasonic pulses, and the left and right audible range speakers of the audible sound transmission unit 72, which emits the audible range pulses, are installed.
- The left and right microphones of the binaural microphone 73 are incorporated in the left and right portions of the dummy head 82.
- The left and right microphones of the binaural microphone 73 each receive both the ultrasonic pulses and the audible range pulses.
- The speakers and microphones placed on the dummy head 82 are each connected to a personal computer 84.
- The personal computer 84, connected to the speakers and microphones of the dummy head 82, constitutes the echo data collection device 61 by executing a predetermined program. Note that the personal computer 84 may also serve as the generative model learning device 62.
- FIG. 7 is a flow chart showing the processing procedure of the echo data collection device 61.
- In step S31, a location for collecting the learning data for the inference model is determined, and the echo data collection device 61 is installed at that location. For example, it is installed in spaces assumed for the use of the sound processing device 52, in various places such as outdoors, corridors, indoor spaces, and rooms in which furniture is arranged. Processing proceeds from step S31 to step S32.
- In step S32, the ultrasonic transmission unit 71 transmits (radiates) ultrasonic pulses (single pulses) from the left and right ultrasonic speakers of the dummy head 82 to the surroundings. Processing proceeds from step S32 to step S33.
- In step S33, the left and right microphones of the binaural microphone 73 receive the ultrasonic pulses (the ultrasonic IR) emitted in step S32 and returned from the space. Processing proceeds from step S33 to step S34.
- In step S34, the storage unit 74 stores the right-ear-side ultrasonic IR(R) and the left-ear-side ultrasonic IR(L) received by the binaural microphone 73 in step S33. Processing proceeds from step S34 to step S35.
- In step S35, the audible sound transmission unit 72 transmits (radiates) audible range pulses (single pulses) from the left and right audible range speakers of the dummy head 82 to the surroundings. Processing proceeds from step S35 to step S36.
- In step S36, the left and right microphones of the binaural microphone 73 receive the audible range pulses (the audible range IR) emitted in step S35 and returned from the space. Processing proceeds from step S36 to step S37.
- In step S37, the storage unit 74 stores the right-ear-side audible range IR(R) and the left-ear-side audible range IR(L) received by the binaural microphone 73 in step S36.
- In step S34, the ultrasonic IR data for the two channels, the ultrasonic IR(R) and the ultrasonic IR(L), is thus saved.
- In step S37, the audible range IR data for the two channels, the audible range IR(R) and the audible range IR(L), is saved. The 2-channel ultrasonic IR data and the 2-channel audible range IR data are linked to each other as paired data, with the ultrasonic IR as the input data and the audible range IR as the teacher data (correct-answer data).
- In this way, the number of paired data is increased, and a data set, which is an aggregate of the paired data, is accumulated in the storage unit 74.
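The pairing described above (2-channel ultrasonic IR as input data, 2-channel audible range IR as teacher data) might be organized as follows; the array layout and pair count are assumptions for illustration:

```python
import numpy as np

def make_pair(us_ir_r, us_ir_l, aud_ir_r, aud_ir_l):
    """One unit of paired data: stacked 2-channel ultrasonic IR (input) and
    stacked 2-channel audible range IR (teacher / correct-answer data)."""
    x = np.stack([us_ir_r, us_ir_l])    # input data, shape (2, n)
    y = np.stack([aud_ir_r, aud_ir_l])  # teacher data, shape (2, n)
    return x, y

# the data set is simply the collection of such pairs from many locations
n = 256
dataset = [make_pair(np.zeros(n), np.zeros(n), np.zeros(n), np.zeros(n))
           for _ in range(10)]
```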
- Steps S32 to S34 correspond to the acquisition and storage of the ultrasonic IR, and steps S35 to S37 to the acquisition and storage of the audible range IR.
- the above learning data can be generated by a simulator that can reproduce spatial objects and their stereophonic sound in a virtual space of CG (Computer Graphics) by a game engine such as Unity or Unreal Engine. good.
- the generative model learning device 62 in FIG. 5 uses the data set stored in the storage unit 74 to learn an inference model in machine learning.
- the input data is the ultrasound IR
- the output data is the audible range IR estimated from the input data. If the number of samples per cycle of the ultrasonic pulse and the audible range pulse is n, the input and output of the inference model are 2n-dimensional for two channels, respectively.
- The generative model learning device 62 trains the inference model using, from the paired data in the data set stored in the storage unit 74, the ultrasonic IR data as input data and the audible range IR data as teacher data. After training is completed, the trained inference model is installed in the audible range IR generation unit 63 of the sound processing device 52.
- U-Net and the Fully Convolutional Network are known as networks that can output data of the same dimension as, but different from, the input.
- FIG. 8 is a flowchart illustrating the processing procedure of the sound processing device 52 of FIG. 5. This flowchart shows the processing during one cycle of the ultrasonic pulses that are periodically radiated into the space.
- In step S51, the ultrasonic transmission unit 11 radiates (transmits) an ultrasonic pulse into the space. Processing proceeds from step S51 to step S52.
- In step S52, the binaural microphone 12 receives the ultrasonic IR returning from the space. Processing proceeds from step S52 to step S53.
- In step S53, the audible range IR generation unit 63 inputs the ultrasonic IR received in step S52 into the inference model, which computes the audible range IR. Processing proceeds from step S53 to step S54.
- FIG. 9 is a diagram showing the input and output of the inference model in the audible range IR generation unit 63.
- The inference network 91, which is an inference model, receives as input data the n samples of the ultrasonic IR(R) 93 and the n samples of the ultrasonic IR(L) 92 from the binaural microphone 12. For this input, the inference network 91 outputs the n samples of the audible range IR(R) 96 and the n samples of the audible range IR(L) 94.
- In step S54, the reverberant sound generation unit 17 applies a frequency transform to the reproduced sound (signal) and performs convolution processing (convolution reverb processing) with the audible range IR obtained in step S53. As a result, an acoustic effect corresponding to the situation of the space is imparted to the reproduced sound. Processing proceeds from step S54 to step S55.
- In step S55, the audio output unit 15 outputs the reproduced sound obtained in step S54 from earphones or the like.
- The sound processing device 52 repeats the processing from step S51 to step S55 each time the ultrasonic transmission unit 11 outputs an ultrasonic pulse (one pulse) into the space.
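The per-pulse loop above centers on the convolution reverb of step S54; a minimal numpy sketch of that step follows. The sample rate, test tone, and two-tap impulse response are invented for the illustration and do not come from the patent:

```python
import numpy as np

def convolution_reverb(playback: np.ndarray, audible_ir: np.ndarray) -> np.ndarray:
    """Impart the space's acoustic effect to the playback signal by
    convolving it with the estimated audible-range impulse response."""
    return np.convolve(playback, audible_ir)  # full convolution

fs = 48_000
t = np.arange(fs) / fs
playback = np.sin(2 * np.pi * 440.0 * t)  # one second of a 440 Hz tone

audible_ir = np.zeros(2400)
audible_ir[0] = 1.0       # direct sound
audible_ir[1200] = 0.5    # a single early reflection at 25 ms

wet = convolution_reverb(playback, audible_ir)
```

The full convolution yields len(playback) + len(audible_ir) - 1 output samples; a streaming implementation would instead use block-wise (overlap-add) convolution so that each pulse period can be processed in real time.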
- Although the inference model of the audible range IR generation unit 63 described above uses the ultrasonic IR in a time-domain representation as input data and generates the audible range IR in a time-domain representation as output data, the model is not limited to this.
- The inference model may instead use the ultrasonic IR in a frequency-domain representation (the frequency spectrum of the ultrasonic IR) as input data and generate the audible range IR in a frequency-domain representation (the frequency spectrum of the audible range IR) as output data; that is, it may generate the transfer function of the audible range.
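As a rough sketch of this frequency-domain alternative, one block of the reproduction signal can be Fourier transformed, multiplied by the audible-range transfer function, and transformed back. The transfer-function shape below is a stand-in invented for the example; the patent does not specify one, and a real implementation would also need overlap-add block processing:

```python
import numpy as np

fs, n = 48_000, 4096
freq_bins = np.fft.rfftfreq(n, d=1.0 / fs)

# Hypothetical audible-range transfer function (magnitude only):
# a gentle roll-off toward high frequencies as a placeholder.
transfer_function = 1.0 / (1.0 + freq_bins / 2000.0)

block = np.random.randn(n)             # one block of the reproduction signal
spectrum = np.fft.rfft(block)          # to the frequency domain
shaped = spectrum * transfer_function  # apply the transfer function
wet_block = np.fft.irfft(shaped, n)    # back to the time domain
```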
- The audible range IR generation unit 13 generates, based on the ultrasonic IR, the audible range IR (or the audible-range transfer function) estimated to be obtained if an audible-range pulse signal were radiated into the space. Therefore, it is not necessary to radiate a test sound in the audible range into the space, and the audible range IR (audible-range transfer function) corresponding to the situation of the space can be acquired even in a place where silence is required.
- The audible range IR generated by the audible range IR generation unit 13 is convolved with the reproduced sound heard by the user, so that an acoustic effect (reverberation, etc.) corresponding to the situation of the space is applied. That is, a sound effect is imparted to the reproduced sound as if it were echoed by objects existing in the space. Therefore, the user can perceive, for example, that some object is approaching in the surroundings by the acoustic effect of the reproduced sound. Since content such as music can be used as the reproduced sound, as with a normal music player, long listening sessions do not become a burden on the user.
- The frequency bandwidth of the ultrasonic pulses emitted from the ultrasonic transmission unit 11 may be narrow.
- For example, an ultrasonic speaker may be able to emit only a 40 kHz sine wave.
- In such a case, the inference model of the audible range IR generation unit 63 cannot obtain enough information to infer the audible range IR.
- In that case, an inference model typified by a GAN (Generative Adversarial Network) that generates a plausible audible range IR may be used to generate the audible range IR from the ultrasonic IR.
- Paired data of the ultrasonic IR and the audible range IR are collected by the reverberant sound data collection device 61 in FIG. 5 according to the procedure shown in FIG.
- The generative model learning device 62 trains a GAN using the ultrasonic IR of each paired datum as input data and the audible range IR as teacher data (ground-truth data).
- Digital sample data of the audible range IR is generated from digital sample data of the ultrasonic IR using a GAN algorithm of the kind that generates an image from an image or a sound from a sound. For this generation, for example, a technique called pix2pix is used.
- FIG. 10 is a diagram showing the input and output of the GAN in the audible range IR generation unit 63.
- The GAN 101, which is an inference model, receives as input data the n samples of the ultrasonic IR(L) 92 and the n samples of the ultrasonic IR(R) 93 from the binaural microphone 12.
- For this input, the GAN 101 generates the n samples of the audible range IR(R) and the n samples of the audible range IR(L).
- Although the audible range IR generated by the inference model does not accurately reproduce the detailed reverberation characteristics of the real space (such as materials), reverberant sound effects such as the delay of early reflections, changes in sound pressure, the length of reverberation, and changes in frequency characteristics are imparted to the reproduced sound. Therefore, the user can perceive the extent of the space and the positions of obstacles from the acoustic effect of the reproduced sound.
- The reverberant sound data collection device 61 in FIG. 5 can be installed in various spaces to acquire and store the ultrasonic IR and the audible range IR and thereby construct a data set.
- Simulators that freely place objects in a virtual space and run physical simulations there are under active development, particularly around game engines such as Unity and Unreal Engine.
- Audio sources and microphones can be freely placed in the virtual space, and some audio formats support high resolution (for example, a sampling frequency of 192 kHz). If the ultrasonic waves are in the range of about 40 kHz to 80 kHz, ultrasonic IR and audible range IR data can therefore be collected on the simulator. Since no time is spent physically moving equipment and the simulation can be accelerated by parallel processing, a large-scale data set can be constructed relatively quickly.
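To illustrate why a 192 kHz sampling rate suffices, an ultrasonic tone burst in the roughly 40 kHz to 80 kHz band sits well below the 96 kHz Nyquist limit. The burst length and Hann windowing below are invented for the sketch:

```python
import numpy as np

fs = 192_000       # high-resolution sampling rate supported by some audio formats
f_ultra = 40_000   # ultrasonic carrier within the ~40-80 kHz range

# One short ultrasonic test pulse: a 40 kHz tone burst shaped by a Hann
# window, comfortably below the Nyquist frequency of fs / 2 = 96 kHz.
duration = 0.001   # 1 ms burst
t = np.arange(int(fs * duration)) / fs
pulse = np.sin(2 * np.pi * f_ultra * t) * np.hanning(t.size)
```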
- When collecting data on a simulator, the reverberation characteristics on the simulator may not always match the reverberation characteristics of the real world. In such cases, after a data set is built on the simulator, it is transformed by domain transformation into a data set that more closely resembles the reverberation characteristics of the real world.
- CycleGAN, for example, is known as a domain transformation method. CycleGAN is a type of GAN, but unlike pix2pix it does not require paired data; data from the two domains (here, reverberant sound in the simulator and reverberant sound in the real world) can be collected independently.
- The domain transformation can be performed with only a relatively small amount of data collection compared with the data collection required to train the inference model by machine learning.
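The cycle-consistency constraint at the heart of CycleGAN can be sketched numerically: a forward generator G (simulator domain to real-world domain) and a backward generator F are trained so that F(G(x)) reproduces x, which is what removes the need for paired data. The linear toy "generators" below are stand-ins for learned networks:

```python
import numpy as np

def G(x):  # toy generator: simulator-domain IR -> real-world-style IR
    return 2.0 * x

def F(y):  # toy generator: real-world-style IR -> simulator-domain IR
    return y / 2.0

x = np.random.randn(256)                 # an unpaired simulator-domain IR
cycle_loss = np.abs(F(G(x)) - x).mean()  # L1 cycle-consistency loss

# Because F exactly inverts G here, the cycle loss is zero; training
# drives learned generators toward this property.
```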
- The present technology can also be applied when the reproduced sound (signal) in the above embodiments is read as a reproduction signal and vibration corresponding to the reproduction signal is presented to the user. That is, the present technology covers the case in which a vibration signal (reproduction signal) that causes the user to perceive vibration, instead of a reproduced sound (signal), is changed according to the situation of the space based on the ultrasonic IR.
- This technology is useful in various fields because it allows the user to perceive the situation of the space, in particular the approach of objects, through sounds and vibrations.
- For example, a speaker and a microphone can be installed on the exterior of a vehicle such as an automobile; ultrasonic pulses are emitted around the vehicle, and the ultrasonic IR is received by the microphone.
- The reproduced sound (reproduction signal) changed based on the ultrasonic IR received by the microphone may be output from a speaker or the like inside the vehicle, or may be presented to the user as seat vibration or the like.
- The series of processes in the sound processing device 1, the sound processing device 52, the reverberant sound data collection device 61, or the generative model learning device 62 described above can be executed by hardware or by software.
- When the series of processes is executed by software, a program that constitutes the software is installed in a computer.
- Here, the computer includes, for example, a computer built into dedicated hardware and a general-purpose personal computer capable of executing various functions by installing various programs.
- FIG. 11 is a block diagram showing a configuration example of computer hardware for the case where the computer executes, by means of a program, each process performed by the sound processing device 1, the sound processing device 52, the reverberant sound data collection device 61, or the generative model learning device 62.
- In the computer, a CPU (Central Processing Unit) 201, a ROM (Read Only Memory) 202, and a RAM (Random Access Memory) 203 are connected to one another by a bus 204.
- An input/output interface 205 is further connected to the bus 204.
- An input unit 206, an output unit 207, a storage unit 208, a communication unit 209, and a drive 210 are connected to the input/output interface 205.
- The input unit 206 consists of a keyboard, a mouse, a microphone, and the like.
- The output unit 207 includes a display, a speaker, and the like.
- The storage unit 208 is composed of a hard disk, a nonvolatile memory, or the like.
- The communication unit 209 includes a network interface and the like.
- The drive 210 drives a removable medium 211 such as a magnetic disk, an optical disc, a magneto-optical disk, or a semiconductor memory.
- The CPU 201 loads, for example, a program stored in the storage unit 208 into the RAM 203 via the input/output interface 205 and the bus 204 and executes it, whereby the above-described series of processes is performed.
- The program executed by the computer (CPU 201) can be provided by being recorded on the removable medium 211 as packaged media, for example. The program can also be provided via a wired or wireless transmission medium such as a local area network, the Internet, or digital satellite broadcasting.
- The program can be installed in the storage unit 208 via the input/output interface 205 by loading the removable medium 211 into the drive 210. The program can also be received by the communication unit 209 via a wired or wireless transmission medium and installed in the storage unit 208. In addition, the program can be installed in the ROM 202 or the storage unit 208 in advance.
- The program executed by the computer may be a program in which processing is performed in chronological order according to the order described in this specification, or a program in which processing is performed in parallel or at necessary timing, such as when a call is made.
- Note that the present technology can also take the following configurations.
- (1) An information processing apparatus including a processing unit that, based on an ultrasonic response signal returned from a space in response to an inspection signal in the ultrasonic frequency band radiated into the space, modifies a reproduction signal to be perceived by a user in accordance with the situation of the space.
- (2) The information processing apparatus according to (1), wherein the inspection signal is a pulse signal radiated at a predetermined cycle.
- (3) The information processing apparatus according to (1) or (2), wherein the situation of the space is a situation of the arrangement of objects in the space.
- (4) The information processing apparatus according to (1) or (3), wherein the reproduction signal is a sound signal in the audible frequency band.
- (5) The information processing apparatus according to (4), wherein the processing unit imparts, based on the ultrasonic response signal, an acoustic effect corresponding to the situation of the space.
- (6) The information processing apparatus according to (3), wherein the processing unit generates, based on the ultrasonic response signal, a transfer function for sound signals in the audible frequency band in the space, and imparts the acoustic effect based on the transfer function to the reproduction signal.
- (7) The information processing apparatus according to (6), wherein the processing unit imparts the acoustic effect to the reproduction signal by multiplying the reproduction signal in the frequency domain, obtained by Fourier transforming the reproduction signal, by the transfer function.
- (8) The information processing apparatus according to (6) or (7), wherein the processing unit generates the transfer function based on frequency components of the ultrasonic response signal.
- (9) The information processing apparatus according to (8), wherein the processing for generating the transfer function includes associating frequencies of the ultrasonic frequency band with frequencies of the audible frequency band, and setting the frequency component of the ultrasonic response signal at each frequency of the ultrasonic frequency band as the frequency component of the transfer function at the frequency of the audible frequency band associated with that frequency of the ultrasonic response signal.
- (10) The information processing apparatus according to (8), wherein the processing unit estimates the frequency components of the transfer function from the frequency components of the ultrasonic response signal using an inference model generated by machine learning.
- (11) The information processing apparatus according to (5), wherein the processing unit generates, based on the ultrasonic response signal, an impulse response signal of the audible frequency band in the space, and imparts the acoustic effect based on the impulse response signal to the reproduction signal.
- (12) The information processing apparatus according to (11), wherein the processing unit imparts the acoustic effect to the reproduction signal by convolution integration of the reproduction signal and the impulse response signal.
- (13) The information processing apparatus according to (11) or (12), wherein the processing unit generates the impulse response signal from the ultrasonic response signal using an inference model in machine learning.
- (14) The information processing apparatus according to any one of (5) to (13), wherein the ultrasonic response signal includes a right-ear ultrasonic response signal detected for the right ear and a left-ear ultrasonic response signal detected for the left ear, and the processing unit modifies, based on the right-ear ultrasonic response signal, the reproduction signal for the right ear to be perceived by the user's right ear, and modifies, based on the left-ear ultrasonic response signal, the reproduction signal for the left ear to be perceived by the user's left ear.
- (15) The information processing apparatus according to (14), wherein the ultrasonic response signal is acquired by a right microphone, placed in the user's right ear, that acquires the right-ear ultrasonic response signal, and a left microphone, placed in the user's left ear, that acquires the left-ear ultrasonic response signal.
- (16) The information processing apparatus according to (1) or (2), wherein the reproduction signal is a vibration signal that causes the user to perceive vibration.
- (17) An information processing method in which the processing unit of an information processing apparatus having a processing unit modifies, based on an ultrasonic response signal returned from a space in response to an inspection signal in the ultrasonic frequency band radiated into the space, a reproduction signal to be perceived by a user in accordance with the situation of the space.
- (18) A program for causing a computer to function as a processing unit that modifies, based on an ultrasonic response signal returned from a space in response to an inspection signal in the ultrasonic frequency band radiated into the space, a reproduction signal to be perceived by a user in accordance with the situation of the space.
Landscapes
- Physics & Mathematics (AREA)
- Engineering & Computer Science (AREA)
- Acoustics & Sound (AREA)
- Signal Processing (AREA)
- Multimedia (AREA)
- General Physics & Mathematics (AREA)
- Circuit For Audible Band Transducer (AREA)
Abstract
Description
FIG. 1 is a configuration diagram showing a configuration example of a first embodiment of a sound processing device to which the present technology is applied.
FIG. 4 is a flowchart illustrating the processing procedure of the sound processing device 1 of FIG. 1. This flowchart shows the processing during one cycle of the ultrasonic pulses that are periodically radiated into the space.
Next, a second embodiment of a sound processing device to which the present technology is applied will be described.
FIG. 8 is a flowchart illustrating the processing procedure of the sound processing device 52 of FIG. 5. This flowchart shows the processing during one cycle of the ultrasonic pulses that are periodically radiated into the space.
In the sound processing device 52 of FIG. 5, the frequency bandwidth of the ultrasonic pulses radiated from the ultrasonic transmission unit 11 may be narrow. For example, an ultrasonic speaker may be able to radiate only a 40 kHz sine wave. In such a case, the inference model of the audible range IR generation unit 63 cannot obtain enough information to infer the audible range IR. In that case, the audible range IR may be generated from the ultrasonic IR using an inference model typified by a GAN (Generative Adversarial Network) that generates a plausible audible range IR.
The series of processes in the sound processing device 1, the sound processing device 52, the reverberant sound data collection device 61, or the generative model learning device 62 described above can be executed by hardware or by software. When the series of processes is executed by software, a program constituting the software is installed in a computer. Here, the computer includes a computer built into dedicated hardware and, for example, a general-purpose personal computer capable of executing various functions by installing various programs.
(1)
An information processing apparatus including a processing unit that, based on an ultrasonic response signal returned from a space in response to an inspection signal in the ultrasonic frequency band radiated into the space, modifies a reproduction signal to be perceived by a user in accordance with the situation of the space.
(2)
The information processing apparatus according to (1), wherein the inspection signal is a pulse signal radiated at a predetermined cycle.
(3)
The information processing apparatus according to (1) or (2), wherein the situation of the space is a situation of the arrangement of objects in the space.
(4)
The information processing apparatus according to (1) or (3), wherein the reproduction signal is a sound signal in the audible frequency band.
(5)
The information processing apparatus according to (4), wherein the processing unit imparts, based on the ultrasonic response signal, an acoustic effect corresponding to the situation of the space.
(6)
The information processing apparatus according to (3), wherein the processing unit generates, based on the ultrasonic response signal, a transfer function for sound signals in the audible frequency band in the space, and imparts the acoustic effect based on the transfer function to the reproduction signal.
(7)
The information processing apparatus according to (6), wherein the processing unit imparts the acoustic effect to the reproduction signal by multiplying the reproduction signal in the frequency domain, obtained by Fourier transforming the reproduction signal, by the transfer function.
(8)
The information processing apparatus according to (6) or (7), wherein the processing unit generates the transfer function based on frequency components of the ultrasonic response signal.
(9)
The information processing apparatus according to (8), wherein the processing for generating the transfer function includes associating frequencies of the ultrasonic frequency band with frequencies of the audible frequency band, and setting the frequency component of the ultrasonic response signal at each frequency of the ultrasonic frequency band as the frequency component of the transfer function at the frequency of the audible frequency band associated with that frequency of the ultrasonic response signal.
(10)
The information processing apparatus according to (8), wherein the processing unit estimates the frequency components of the transfer function from the frequency components of the ultrasonic response signal using an inference model generated by machine learning.
(11)
The information processing apparatus according to (5), wherein the processing unit generates, based on the ultrasonic response signal, an impulse response signal of the audible frequency band in the space, and imparts the acoustic effect based on the impulse response signal to the reproduction signal.
(12)
The information processing apparatus according to (11), wherein the processing unit imparts the acoustic effect to the reproduction signal by convolution integration of the reproduction signal and the impulse response signal.
(13)
The information processing apparatus according to (11) or (12), wherein the processing unit generates the impulse response signal from the ultrasonic response signal using an inference model in machine learning.
(14)
The information processing apparatus according to any one of (5) to (13), wherein the ultrasonic response signal includes a right-ear ultrasonic response signal detected for the right ear and a left-ear ultrasonic response signal detected for the left ear, and the processing unit modifies, based on the right-ear ultrasonic response signal, the reproduction signal for the right ear to be perceived by the user's right ear, and modifies, based on the left-ear ultrasonic response signal, the reproduction signal for the left ear to be perceived by the user's left ear.
(15)
The information processing apparatus according to (14), wherein the ultrasonic response signal is acquired by a right microphone, placed in the user's right ear, that acquires the right-ear ultrasonic response signal, and a left microphone, placed in the user's left ear, that acquires the left-ear ultrasonic response signal.
(16)
The information processing apparatus according to (1) or (2), wherein the reproduction signal is a vibration signal that causes the user to perceive vibration.
(17)
An information processing method in which the processing unit of an information processing apparatus having a processing unit modifies, based on an ultrasonic response signal returned from a space in response to an inspection signal in the ultrasonic frequency band radiated into the space, a reproduction signal to be perceived by a user in accordance with the situation of the space.
(18)
A program for causing a computer to function as a processing unit that modifies, based on an ultrasonic response signal returned from a space in response to an inspection signal in the ultrasonic frequency band radiated into the space, a reproduction signal to be perceived by a user in accordance with the situation of the space.
Claims (18)
- An information processing apparatus comprising a processing unit that, based on an ultrasonic response signal returned from a space in response to an inspection signal in the ultrasonic frequency band radiated into the space, modifies a reproduction signal to be perceived by a user in accordance with the situation of the space.
- The information processing apparatus according to claim 1, wherein the inspection signal is a pulse signal radiated at a predetermined cycle.
- The information processing apparatus according to claim 1, wherein the situation of the space is a situation of the arrangement of objects in the space.
- The information processing apparatus according to claim 1, wherein the reproduction signal is a sound signal in the audible frequency band.
- The information processing apparatus according to claim 4, wherein the processing unit imparts, based on the ultrasonic response signal, an acoustic effect corresponding to the situation of the space.
- The information processing apparatus according to claim 5, wherein the processing unit generates, based on the ultrasonic response signal, a transfer function for sound signals in the audible frequency band in the space, and imparts the acoustic effect based on the transfer function to the reproduction signal.
- The information processing apparatus according to claim 6, wherein the processing unit imparts the acoustic effect to the reproduction signal by multiplying the reproduction signal in the frequency domain, obtained by Fourier transforming the reproduction signal, by the transfer function.
- The information processing apparatus according to claim 6, wherein the processing unit generates the transfer function based on frequency components of the ultrasonic response signal.
- The information processing apparatus according to claim 8, wherein the processing for generating the transfer function includes associating frequencies of the ultrasonic frequency band with frequencies of the audible frequency band, and setting the frequency component of the ultrasonic response signal at each frequency of the ultrasonic frequency band as the frequency component of the transfer function at the frequency of the audible frequency band associated with that frequency of the ultrasonic response signal.
- The information processing apparatus according to claim 8, wherein the processing unit estimates the frequency components of the transfer function from the frequency components of the ultrasonic response signal using an inference model generated by machine learning.
- The information processing apparatus according to claim 5, wherein the processing unit generates, based on the ultrasonic response signal, an impulse response signal of the audible frequency band in the space, and imparts the acoustic effect based on the impulse response signal to the reproduction signal.
- The information processing apparatus according to claim 11, wherein the processing unit imparts the acoustic effect to the reproduction signal by convolution integration of the reproduction signal and the impulse response signal.
- The information processing apparatus according to claim 11, wherein the processing unit generates the impulse response signal from the ultrasonic response signal using an inference model in machine learning.
- The information processing apparatus according to claim 5, wherein the ultrasonic response signal includes a right-ear ultrasonic response signal detected for the right ear and a left-ear ultrasonic response signal detected for the left ear, and the processing unit modifies, based on the right-ear ultrasonic response signal, the reproduction signal for the right ear to be perceived by the user's right ear, and modifies, based on the left-ear ultrasonic response signal, the reproduction signal for the left ear to be perceived by the user's left ear.
- The information processing apparatus according to claim 14, wherein the ultrasonic response signal is acquired by a right microphone, placed in the user's right ear, that acquires the right-ear ultrasonic response signal, and a left microphone, placed in the user's left ear, that acquires the left-ear ultrasonic response signal.
- The information processing apparatus according to claim 1, wherein the reproduction signal is a vibration signal that causes the user to perceive vibration.
- An information processing method in which the processing unit of an information processing apparatus having a processing unit modifies, based on an ultrasonic response signal returned from a space in response to an inspection signal in the ultrasonic frequency band radiated into the space, a reproduction signal to be perceived by a user in accordance with the situation of the space.
- A program for causing a computer to function as a processing unit that modifies, based on an ultrasonic response signal returned from a space in response to an inspection signal in the ultrasonic frequency band radiated into the space, a reproduction signal to be perceived by a user in accordance with the situation of the space.
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP2023500603A JPWO2022176417A1 (ja) | 2021-02-16 | 2022-01-06 | |
US18/264,142 US20240040328A1 (en) | 2021-02-16 | 2022-01-06 | Information processing device, information processing method, and program |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP2021022316 | 2021-02-16 | ||
JP2021-022316 | 2021-02-16 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2022176417A1 true WO2022176417A1 (ja) | 2022-08-25 |
Family
ID=82930726
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/JP2022/000160 WO2022176417A1 (ja) | 2021-02-16 | 2022-01-06 | 情報処理装置、情報処理方法、及び、プログラム |
Country Status (3)
Country | Link |
---|---|
US (1) | US20240040328A1 (ja) |
JP (1) | JPWO2022176417A1 (ja) |
WO (1) | WO2022176417A1 (ja) |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JPS63263899A (ja) * | 1987-04-21 | 1988-10-31 | Matsushita Electric Ind Co Ltd | 音場制御装置の入力装置 |
US20160128891A1 (en) * | 2014-11-10 | 2016-05-12 | Electronics And Telecommunications Research Institute | Method and apparatus for providing space information |
2022
- 2022-01-06: JP application JP2023500603A (JPWO2022176417A1, ja), active, pending
- 2022-01-06: WO application PCT/JP2022/000160 (WO2022176417A1, ja), active, application filing
- 2022-01-06: US application US 18/264,142 (US20240040328A1, en), active, pending
Also Published As
Publication number | Publication date |
---|---|
JPWO2022176417A1 (ja) | 2022-08-25 |
US20240040328A1 (en) | 2024-02-01 |
Legal Events
Date | Code | Title | Description
---|---|---|---
| 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 22755743; Country of ref document: EP; Kind code of ref document: A1 |
| ENP | Entry into the national phase | Ref document number: 2023500603; Country of ref document: JP; Kind code of ref document: A |
| WWE | Wipo information: entry into national phase | Ref document number: 18264142; Country of ref document: US |
| NENP | Non-entry into the national phase | Ref country code: DE |
| 122 | Ep: pct application non-entry in european phase | Ref document number: 22755743; Country of ref document: EP; Kind code of ref document: A1 |