WO2023286326A1 - 情報処理装置、情報処理方法およびプログラム - Google Patents
情報処理装置、情報処理方法およびプログラム Download PDFInfo
- Publication number
- WO2023286326A1 (PCT application PCT/JP2022/008820; JP2022008820W)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- information processing
- filtering
- input signal
- sound source
- filter
- Prior art date
Classifications
- H—ELECTRICITY
  - H04—ELECTRIC COMMUNICATION TECHNIQUE
    - H04R—LOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
      - H04R3/00—Circuits for transducers, loudspeakers or microphones
        - H04R3/04—Circuits for transducers, loudspeakers or microphones for correcting frequency response
- H—ELECTRICITY
  - H03—ELECTRONIC CIRCUITRY
    - H03G—CONTROL OF AMPLIFICATION
      - H03G5/00—Tone control or bandwidth control in amplifiers
        - H03G5/16—Automatic control
- H—ELECTRICITY
  - H04—ELECTRIC COMMUNICATION TECHNIQUE
    - H04S—STEREOPHONIC SYSTEMS
      - H04S7/00—Indicating arrangements; Control arrangements, e.g. balance control
        - H04S7/30—Control circuits for electronic adaptation of the sound field
          - H04S7/307—Frequency adjustment, e.g. tone control
Definitions
- the present disclosure relates to an information processing device, an information processing method, and a program.
- for example, gunshot sounds in a game can be emphasized and made easier to hear.
- noise other than the desired sound such as air-conditioning sound or electrical noise may interfere with listening to the desired sound.
- the noise can be removed by attenuating the frequency band of the noise, making it easier to hear.
- a sound control function such as an equalizer is appropriately adjusted manually or by presetting according to the sound that the user wants to hear or does not want to hear.
- this adjustment is complicated, however, and is difficult to perform for users without a sense of pitch.
- Patent Document 1 proposes a technique for automatically controlling sound (specifically, a technique for enhancing ambient sound and acoustic noise cancellation based on context).
- the technique disclosed in Patent Document 1 adds an amplified or attenuated ambient signal to an acoustic noise canceling signal, and is intended for noise canceling headphones.
- this processing is realized by a combination of acoustic noise cancellation technology and a filter that modulates the frequency of the ambient sound.
- the technique of Patent Document 1 therefore requires a microphone to acquire ambient sounds, which poses the problem of increasing the scale and cost of the hardware.
- moreover, conventional techniques merely apply predetermined signal processing, such as a fixed equalizer, to input (playback) signals of games, voice calls, and the like; the sound cannot be controlled according to the input signal.
- One object of the present disclosure is to propose an information processing apparatus, an information processing method, and a program capable of performing processing according to an input signal while suppressing an increase in hardware scale and cost.
- the present disclosure is, for example, an information processing device including a filter processing unit that filters an input signal, and a filter setting unit that sets the filtering, the filtering setting controlling the sound of the target sound source type in the input signal and being determined using an estimation result obtained from the input signal by an estimation algorithm.
- the present disclosure is also an information processing method in which the filtering settings of a filter processing unit that filters an input signal, settings which control the sound of the target sound source type in the input signal, are determined using an estimation result obtained from the input signal by an estimation algorithm.
- the present disclosure is also a program that causes a computer to execute a process in which the filtering settings of a filter processing unit that filters an input signal are determined using an estimation result obtained from the input signal by an estimation algorithm.
- FIG. 1 is a diagram showing a configuration example of a commonly used equalizer.
- FIG. 2 is a diagram showing a display example of a sound quality adjustment setting instruction screen.
- FIG. 3 is a diagram showing a display example of the updated equalizer.
- FIG. 4 is a diagram illustrating a configuration example of functional blocks of the information processing apparatus.
- FIG. 5 is a diagram showing an example of sound source separation by a neural network.
- FIG. 6 is a diagram illustrating a hardware configuration example of an information processing apparatus.
- FIG. 7 is a flowchart illustrating an example of processing by the information processing device.
- FIG. 8 is a flowchart illustrating an example of processing for calculating filter coefficients.
- FIG. 9 is a diagram showing a configuration example of another functional block of the information processing apparatus.
- FIG. 10 is a diagram showing an example of filter coefficient calculation by a neural network.
- FIG. 11 is a diagram showing a configuration example of another functional block of the information processing apparatus.
- FIG. 12
- FIG. 1 shows a configuration example of a commonly used equalizer.
- the user can adjust the gain (specifically, the gain value) of each frequency band of the equalizer by operating a knob or the like depending on the sound that the user wants to hear or does not want to hear.
- IIR: Infinite Impulse Response
- FIR: Finite Impulse Response
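The per-band gain adjustment described above is typically realized with such IIR or FIR filters. As a minimal illustrative sketch (not taken from the patent), one band of an IIR equalizer can be built from the widely used audio-EQ-cookbook peaking-filter formulas:

```python
import math

def peaking_biquad(fs, f0, gain_db, q=1.0):
    """One equalizer band: audio-EQ-cookbook peaking filter.
    Returns normalized (b, a) IIR coefficients."""
    a_lin = 10.0 ** (gain_db / 40.0)          # sqrt of the linear gain
    w0 = 2.0 * math.pi * f0 / fs
    alpha = math.sin(w0) / (2.0 * q)
    b = [1.0 + alpha * a_lin, -2.0 * math.cos(w0), 1.0 - alpha * a_lin]
    a = [1.0 + alpha / a_lin, -2.0 * math.cos(w0), 1.0 - alpha / a_lin]
    return [x / a[0] for x in b], [x / a[0] for x in a]

def biquad_run(samples, b, a):
    """Direct-form I filtering, sample by sample (low delay)."""
    x1 = x2 = y1 = y2 = 0.0
    out = []
    for x0 in samples:
        y0 = b[0]*x0 + b[1]*x1 + b[2]*x2 - a[1]*y1 - a[2]*y2
        x2, x1 = x1, x0
        y2, y1 = y1, y0
        out.append(y0)
    return out
```

With gain_db = 0 the band is transparent (the output equals the input); cascading one such biquad per knob yields the kind of multi-band equalizer shown in FIG. 1.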
- as in Non-Patent Document 1, there is known a technique of training a neural network that separates a predetermined target sound and realizing sound source separation with the trained network. With this technology, almost any sound can be separated from the input signal, so that the desired sound can be heard directly.
- AI: Artificial Intelligence
- Non-Patent Document 1: Stefan Uhlich et al., "Improving music source separation based on deep neural networks through data augmentation and network blending," 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2017), 2017.
- however, the method of Non-Patent Document 1 requires buffering of the input signal for several tens to several hundreds of milliseconds due to the structure of the neural network, which causes a delay in the output signal. For applications such as those described above, in which real-time performance is important, this delay is a problem, making it impossible to use neural-network sound source separation as-is.
- FIG. 2 shows a display example of a sound quality adjustment setting instruction screen.
- in this example, the sound quality adjustment function is incorporated into the screen displayed during the game, making it possible to adjust the sound quality while playing.
- the game screen 2 is displayed on the left side of the display screen 1, and the equalizer 3 is displayed on the right side.
- the equalizer 3 is the commonly used one mentioned above.
- the user can adjust the sound quality of the game output signal by operating the equalizer 3.
- the user can adjust the gain (Gain value) of each frequency band of the equalizer 3 according to the sound that the user wants to hear or does not want to hear by operating the knobs (moving the knob positions).
- a user instruction input unit 4 (the portion labeled "Auto Equalizer") for automatically adjusting the equalizer 3 (specifically, its internal frequency modulation filter) is also displayed.
- the part labeled "Gain" is the gain setting section 41 with which the user sets the gain.
- the gain setting unit 41 allows the user to specify whether to amplify or attenuate the sound of the target sound source type specified by "Sound type" (described later). For example, the user selects "Up (amplification)" when specifying a sound that the user wants to hear, and "Down (attenuation)" when specifying a sound that the user does not want to hear.
- the gain setting unit 41 is not limited to a simple choice between amplification and attenuation; it may also allow the user to set the level of amplification or attenuation, such as +αdB or -αdB ("α" being a predetermined numerical value).
- the part labeled "Sound type” is the target sound source specifying section 42 that allows the user to specify the target sound source type.
- the target sound source type here is, for example, the type of sound that the user wants to control.
- the target sound source designation unit 42 allows the user to select the type of sound that the user wants to hear or does not want to hear. For example, various categories can be prepared as targets, such as "Footstep" or "Gunshot" sounds in a game that the user wants to amplify, or "Wind noise" that the user wants to attenuate. One or more "Sound types" can be selected.
- the part labeled "Update timing” is the coefficient update setting section 43 that allows the user to specify the coefficient update settings.
- the coefficient update setting unit 43 allows the user to specify the timing (time, interval, etc.) for automatically adjusting the sound quality. For example, when the "Auto" mode is selected, the equalizer 3 is adjusted during a period in which no sound is produced in the game or at the moment when the scene changes. As a result, the equalizer 3 can be changed without discomfort for the game sound that the user concentrates on listening to, so that the sense of immersion is not lost.
- when "None" is selected, the equalizer 3 is updated immediately after the above-mentioned "Gain" or "Sound type" is specified (after the setting is changed); that is, the sound quality changes immediately in response to the user's operation. When the "Manual" mode is selected, the equalizer 3 is updated periodically at the designated interval time, which meets the desire of users who want the equalizer 3 to keep updating at all times. For example, an interval such as every 0.5 seconds or every 2 seconds can be set.
- Various setting information specified by the user instruction input unit 4 is stored in a storage area in a readable manner, for example.
- an audio signal of a game playback sound is used as the input signal, signal processing (specifically, filtering) is performed on the input signal according to the settings specified by the user instruction input unit 4, and the output signal is made audible to the user.
- an image showing the difference before and after the update is displayed on the display device.
- the gains and knob positions that have changed for each frequency band are displayed in different colors so that they can be distinguished.
- the difference in color is represented by the shade.
- light-colored knobs represent the knob positions before updating
- dark-colored knobs represent the current knob positions after updating. This allows the user to easily grasp that the equalizer has changed (including the content of the change). It should be noted that a notation indicating the setting such as "amplify footsteps" may be added so that the current equalizer setting can be understood at a glance.
- FIG. 4 shows a configuration example of functional blocks of an information processing apparatus (information processing apparatus 10) according to the first embodiment.
- the information processing apparatus 10 is, for example, a circuit that performs signal processing (specifically, on a reproduction signal) and can be applied to the applications described above.
- the information processing device 10 performs signal processing on an input signal (specifically, an audio signal) to control sound.
- the audio signal may be obtained, for example, from applications such as games, voice calls (eg, web conferencing systems), etc., as described above.
- the information processing device 10 can be configured as any of various electronic devices (specifically, computer devices) that handle audio signals, such as personal computers, smartphones, tablet terminals, game machines, speaker devices, headphones, earphones, smart home appliances, televisions, players, recorders, telephones, in-vehicle devices, monitoring devices, and medical devices.
- the information processing device 10 includes a filter processing unit 11, a sound source separation coefficient selection unit 12, a sound source separation unit 13, a frequency characteristic estimation unit 14, a filter coefficient calculation unit 15, a coefficient update unit 16, and a screen display update unit 17, and performs signal processing according to the settings described above.
- the filter processing unit 11 filters and outputs an input signal (specifically, an input audio signal). Thereby, for example, the frequency characteristic of the output signal (specifically, the audio signal after filtering) is changed.
- the filter processing unit 11 has a filter circuit (frequency modulation filter), and filtering is performed using this filter circuit.
- this filter circuit can be implemented with an IIR filter or an FIR filter as described above. That is, filtering can be done using IIR filters or FIR filters.
- the filter circuit that performs the filtering is not limited to this; for example, it is also possible to perform an FFT (Fast Fourier Transform), amplify or attenuate the gain of the amplitude spectrum of the signal converted to the frequency domain, and then perform an IFFT (Inverse Fast Fourier Transform) to return to the time-domain waveform.
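As a rough, illustrative sketch of this FFT-based path (assuming NumPy; per-frame windowing and overlap-add between frames are omitted for brevity):

```python
import numpy as np

def fft_gain_filter(frame, band_gains_db, fs):
    """FFT -> scale the amplitude spectrum per band -> IFFT.
    band_gains_db maps (low_hz, high_hz) ranges to a gain in dB."""
    spec = np.fft.rfft(frame)
    freqs = np.fft.rfftfreq(len(frame), d=1.0 / fs)
    gain = np.ones_like(freqs)
    for (lo, hi), g_db in band_gains_db.items():
        gain[(freqs >= lo) & (freqs < hi)] *= 10.0 ** (g_db / 20.0)
    # multiplying the complex spectrum by a real gain changes the
    # amplitude spectrum while preserving the phase
    return np.fft.irfft(spec * gain, n=len(frame))
```

For example, attenuating the band around a 1 kHz tone removes the tone from the reconstructed frame while leaving the rest of the spectrum untouched.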
- in the present embodiment, an IIR filter is assumed in order to realize low-delay processing.
- as a result, filtering can be performed in real time and can be applied without problems to applications where real-time performance is important.
- note that real-time processing is also possible with filters other than the IIR filter, as long as the delay is low enough that the user cannot perceive it.
- the initial value of the filtering setting (specifically, the filter coefficient) may have a flat frequency characteristic, that is, have the property of outputting the input signal as it is.
- alternatively, the coefficients used last time may be held and reused when the settings are the same. In this way, an appropriately determined initial value can be used for the filtering setting.
- the output signal output from the filter processing unit 11 is output to other signal processing modules connected in the subsequent stage, output (reproduction) devices such as speakers and headphones, and the like.
- the sound source separation coefficient selection unit 12 and the sound source separation unit 13 perform processing related to sound source separation.
- the sound source separation coefficient selection unit 12 selects a sound source separation coefficient based on the set target sound source type.
- the target sound source type is, for example, the one specified as described above (a sound category such as "Footstep" or "Gunshot"), and is input to the sound source separation coefficient selection unit 12 as character information or a numerical parameter.
- the sound source separation coefficient selection unit 12 stores, in advance, a group of coefficients necessary for the sound source separation processing in the sound source separation unit 13 in a storage area such as an HDD (Hard Disk Drive), loads the corresponding coefficients based on the specified target sound source type, and sends them to the sound source separation unit 13.
- this coefficient group must be prepared for each category of sound that is to be separated and controlled by sound source separation. Conversely, as long as the sound source separation coefficients are prepared, any kind of sound can be separated and controlled; when a new sound category appears, its coefficients can simply be added and recorded here.
- the sound source separation unit 13 executes sound source separation processing.
- An estimation algorithm for sound source separation is used for this processing. Specifically, the estimation algorithm estimates and separates the sound of the designated target sound source type from an input signal (specifically, an input audio signal) and outputs the separated sound as the estimation result.
- as the estimation algorithm, for example, a neural-network-based technique (specifically, the technique disclosed in Non-Patent Document 1 mentioned above) can be adopted. For example, when learning the target sound source type "Footstep", a large number of training input signals (for example, 100,000 to 1,000,000) containing "Footstep" are used, and the network learns to separate the "Footstep" sound from each of them.
- the parameters of the neural network after learning are stored as coefficients (coefficients loaded by the sound source separation coefficient selection unit 12) necessary for the sound source separation unit 13 to separate the sound source of "Footstep".
- Fig. 5 shows an example of sound source separation based on a neural network.
- an input signal including a signal to be separated is frequency-converted, and its amplitude spectrum is used as an input signal vector.
- This vector size depends on the transform length of the frequency transform, so it is 1024 or 2048, for example.
- This vector is input to the neural network, and an output signal vector is obtained as an estimation result through internal processing using the coefficients sent from the sound source separation coefficient selector 12.
- This output signal vector is the amplitude spectrum of the signal after separation.
- the amplitude spectrum of the separated signal is obtained for each frame.
- the amplitude spectrum of this separated signal, that is, the estimation result of the estimation algorithm, is output to the frequency characteristic estimation unit 14 shown in FIG. 4.
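The framing and spectrum extraction described above can be sketched as follows (an illustrative outline assuming NumPy; the trained network itself is replaced by a caller-supplied `mask_fn` stand-in):

```python
import numpy as np

def frame_magnitudes(signal, n_fft=1024, hop=512):
    """Split the signal into frames and return one amplitude-spectrum
    vector per frame (n_fft // 2 + 1 bins), as fed to the network."""
    frames = []
    for start in range(0, len(signal) - n_fft + 1, hop):
        frame = signal[start:start + n_fft] * np.hanning(n_fft)
        frames.append(np.abs(np.fft.rfft(frame)))
    return np.array(frames)

def separate(mag_frames, mask_fn):
    """Stand-in for the trained network: mask_fn maps an input amplitude
    spectrum to the separated source's amplitude spectrum, frame by frame."""
    return np.array([mask_fn(m) for m in mag_frames])
```

The per-frame output vectors correspond to the amplitude spectra of the separated signal that are passed on to the frequency characteristic estimation unit 14.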
- sound source separation based on neural networks can separate the desired sound with high accuracy, although there is a delay of several tens to hundreds of milliseconds.
- the sound of the target sound source type included in the input signal may have frequency characteristics that vary. In other words, the sound of the target sound source type can be accurately separated even as it changes with the input signal. For example, when "Footstep" is specified as the target sound source type, footsteps can be appropriately separated even when footsteps on asphalt change to footsteps on grass.
- the estimation algorithm is not limited to the neural-network-based method of Non-Patent Document 1. Any technique for extracting the sound of the target sound source type may be used; for example, non-negative matrix factorization (NMF) may be used. Although a delay may similarly occur with other estimation algorithms, the desired sound can still be separated with high accuracy.
- the frequency characteristic estimation unit 14 shown in FIG. 4 estimates frequency characteristics. For this estimation, the amplitude spectrum of the separated signal input from the sound source separation unit 13, that is, the amplitude spectrum of the sound of the category specified in advance by the user, is used. Since an amplitude spectrum is input sequentially for each frame, the frequency characteristic of the desired sound can be estimated by, for example, averaging the spectra or calculating a weighted sum with a time constant. Here, as with "Footstep", there may be both intervals where the desired sound is present and silent intervals. If silent intervals are included in the average calculation, an error may occur in the estimated frequency characteristic. Therefore, intervals whose level is below a certain threshold can be determined to be silent and excluded from the average calculation. Note that the frequency characteristic estimator 14 is not limited to this, and may estimate the frequency characteristic by other methods.
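The weighted averaging with silence exclusion described above might look roughly like this (an illustrative sketch assuming NumPy; the threshold and time constant are arbitrary placeholder values):

```python
import numpy as np

def estimate_freq_characteristic(mag_frames, silence_threshold=1e-3, alpha=0.9):
    """Exponentially weighted average of per-frame amplitude spectra,
    skipping frames whose mean magnitude falls below the silence threshold."""
    estimate = None
    for mag in mag_frames:
        if np.mean(mag) < silence_threshold:   # treat as a silent interval
            continue                            # exclude from the average
        estimate = mag if estimate is None else alpha * estimate + (1 - alpha) * mag
    return estimate
```

Silent frames thus never bias the running estimate, which is what prevents the error mentioned above for intermittent sounds like footsteps.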
- the filter coefficient calculation unit 15 calculates the filter coefficients used by the filter processing unit 11. Specifically, the filter coefficient calculator 15 first reads the gain setting made by the user. As described above, this setting specifies whether to amplify or attenuate the sound of the specified target sound source type, or a specific numerical value such as +αdB or -αdB. The filter coefficient calculator 15 thus calculates filter coefficients that control the sound of the target sound source type included in the input signal. Specifically, the filter coefficient calculator 15 determines a target filter characteristic based on the frequency characteristic estimated by the frequency characteristic estimator 14 and the gain setting.
- then, coefficients are calculated in accordance with the format used in the filter processing unit 11 (for example, an IIR filter or an FIR filter). Any algorithm may be used for this calculation, such as a classical technique derived from a transfer function or a technique based on numerical optimization. The filter coefficients calculated in this way amplify or attenuate each frequency band of the input signal appropriately so that the sound of the target sound source type is amplified or attenuated. The calculated coefficients are output to the coefficient updating unit 16 and the screen display updating unit 17.
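One conceivable mapping from the estimated characteristic and the gain setting to a target (purely illustrative; the patent does not fix this mapping) is to apply the user's ±α dB only to the frequency bands where the separated sound's energy is concentrated:

```python
import numpy as np

def target_band_gains(est_spectrum, freqs, bands, gain_db, share=0.2):
    """For each (low_hz, high_hz) band, apply gain_db if the band holds
    at least `share` of the estimated target sound's energy, else 0 dB."""
    energies = []
    for lo, hi in bands:
        sel = (freqs >= lo) & (freqs < hi)
        energies.append(float(np.sum(est_spectrum[sel] ** 2)))
    total = sum(energies) or 1.0
    return [gain_db if e / total >= share else 0.0 for e in energies]
```

The resulting per-band gains would then be converted into IIR or FIR coefficients in whatever format the filter processing unit 11 expects.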
- the coefficient update unit 16 is a filter setting unit that sets the filtering in the filter processing unit 11. Specifically, the coefficient update unit 16 sets the filter coefficients of the filter processing unit 11 to the coefficients input from the filter coefficient calculation unit 15. That is, as described above, the filtering setting controls the sound of the target sound source type in the input signal and is determined using the estimation result obtained from the input signal by the estimation algorithm.
- the coefficient update unit 16 controls the timing of updating the filter coefficients based on the coefficient update settings set by the user.
- the coefficient updating unit 16 detects timing based on, for example, an input signal (specifically, an input audio signal).
- meanwhile, the filter processing unit 11 continues to filter and output the input signal at all times with a low delay of several hundred microseconds to several milliseconds. That is, the filtering in the filter processing unit 11 is at least processing with a lower delay than the estimation algorithm (specifically, real-time processing), so the user perceives no delay and hears the output sound of the equalizer as in the conventional case. In this way, by updating only the filter coefficients, high-precision filtering that controls (specifically, amplifies or attenuates the frequency characteristics of) the sound of the specified target sound source type can be obtained while maintaining low delay.
- when the coefficient update setting is "None", the coefficient update unit 16 updates the filter coefficients at the timing when the user changes a setting (specifically, the above-described "Gain" or "Sound type" setting) and a new filtering instruction is received. When the coefficient update setting is "Manual", the coefficient update unit 16 updates the filter coefficients at regular intervals according to the user settings. When the coefficient update setting is "Auto", the coefficient update unit 16 updates the filter coefficients at a timing that does not make the user feel uncomfortable.
- Whether or not it is a timing that does not cause discomfort is determined using predetermined determination information (for example, audio signal, video signal, etc.) that indicates the sound switching timing.
- for example, the input signal can be monitored as determination information, and the coefficients can be changed when the volume (amplitude value) of the input signal becomes smaller than a certain threshold.
- in the case of a game or music, the coefficients can be changed at a break in the sound when the scene changes. This makes it possible to avoid sudden changes in sound due to changes in the filter coefficients.
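The "Auto" update timing above can be sketched as a small state holder (illustrative only, assuming NumPy; the amplitude threshold is a placeholder): newly calculated coefficients are parked until a quiet frame arrives.

```python
import numpy as np

class CoefficientUpdater:
    """Holds newly calculated coefficients and applies them only at a
    quiet moment, so the swap is not audible ('Auto' mode sketch)."""
    def __init__(self, threshold=0.01):
        self.threshold = threshold
        self.pending = None   # coefficients waiting for a safe moment
        self.active = None    # coefficients currently in use

    def submit(self, coeffs):
        self.pending = coeffs

    def on_frame(self, frame):
        quiet = float(np.max(np.abs(frame))) < self.threshold
        if quiet and self.pending is not None:
            self.active, self.pending = self.pending, None
        return self.active
```

A "None" or "Manual" policy would differ only in when `on_frame` allows the swap (immediately, or on a timer).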
- the coefficient update unit 16 outputs information indicating the update to the screen display update unit 17.
- the screen display updating unit 17 updates the displayed parameters, such as the equalizer 3 and the filter settings, to the latest ones. Specifically, when the filter coefficients are updated by the coefficient updating unit 16, that is, when information indicating the update is input from the coefficient updating unit 16, the screen display updating unit 17 outputs information for displaying the difference before and after the update to the display device. Specifically, the screen display update unit 17 causes the display device to display an image of the user-operable equalizer 3 including information representing the difference, as shown in FIG. 3. Note that the information representing this difference is not limited to that shown in FIG. 3, and may be output to a reproduction device other than the display device (specifically, output as sound to a speaker or the like).
- FIG. 6 shows a hardware configuration example of the information processing device 10.
- the information processing apparatus 10 has a control section 101, a storage section 102, an input section 103, a communication section 104 and an output section 105 interconnected by a bus.
- the control unit 101 is composed of, for example, a CPU (Central Processing Unit), RAM (Random Access Memory) and ROM (Read Only Memory).
- the ROM stores programs and the like that are read and operated by the CPU.
- the RAM is used as work memory for the CPU.
- the CPU controls the entire information processing apparatus 10 by executing various processes and issuing commands according to programs stored in the ROM.
- the storage unit 102 is, for example, a storage medium configured by an HDD, an SSD (Solid State Drive), a semiconductor memory, or the like, and stores content data such as image data, video data, audio data, and text data, as well as programs (for example, applications) and other data.
- the input unit 103 is a device for inputting various types of information to the information processing device 10 .
- the control unit 101 performs various processes corresponding to the input information.
- the input unit 103 may be a mouse and keyboard, a microphone, various sensors, a touch panel, a touch screen integrated with a monitor, physical buttons, and the like.
- Various types of information may be input to the information processing apparatus 10 via the communication unit 104, which will be described later.
- the communication unit 104 is a communication module that communicates with other devices and the Internet according to a predetermined communication standard.
- Communication methods include wireless LAN (Local Area Network) such as Wi-Fi (Wireless Fidelity), LTE (Long Term Evolution), 5G (fifth-generation mobile communication system), broadband, and Bluetooth (registered trademark).
- the output unit 105 is a device for outputting various information from the information processing device 10 .
- the output unit 105 includes, for example, a display (display device) for displaying images and videos, and an output device for outputting sound, such as a speaker.
- Various types of information may be output from the information processing apparatus 10 via the communication unit 104 .
- the control unit 101 performs various processes by reading and executing programs (eg, applications) stored in the storage unit 102, for example. That is, the information processing device 10 has functions as a computer.
- programs eg, applications
- the program does not have to be stored in the storage unit 102.
- the information processing apparatus 10 may read and execute a program stored in a readable storage medium.
- the storage medium include optical discs, magnetic discs, semiconductor memories, HDDs, etc. that can be detachably attached to the information processing apparatus 10 .
- Programs (e.g., applications) and data may also be stored in a device (e.g., cloud storage) connected to a network such as the Internet, and the information processing device 10 may read out and execute the programs and data from there.
- the program may be, for example, a plug-in program that adds part or all of the processing to an existing application.
- The program may be one that implements all of the processing described above, or it may be a plug-in program that adds the above-described sound control functions to an existing application.
- FIG. 7 is a flowchart showing the series of processes (sound quality adjustment process) described above. Note that, in this example, the target sound source type, gain, and coefficient update settings input by the user operations described above are described as being set only once at the beginning, in order to make the flow easier to understand. However, these settings can also be changed at any time.
- When the sound quality adjustment process is started, the information processing device 10 first initializes the settings of the filter processing unit 11 (step S10). Specifically, the coefficient updating unit 16 sets the filter coefficients to initial values.
- Next, the information processing apparatus 10 sets the target sound source type, gain, and coefficient update (step S20). Specifically, the target sound source type, gain, and coefficient update settings are stored in the storage area in response to setting instructions given on the setting instruction screen shown in FIG. 2.
- the information processing device 10 inputs an audio signal after these settings are made (step S30).
- the audio signal (input signal) is input to the filter processor 11 , the sound source separator 13 , and the coefficient updater 16 .
- Next, the information processing device 10 determines whether or not it is time to update the filter coefficients (step S40). Specifically, this determination is made by the coefficient updating unit 16 based on the coefficient update settings described above. If it is determined in step S40 that it is time to update (YES), the information processing apparatus 10 updates the filter coefficients (step S50). Specifically, the coefficient updating unit 16 updates the filter coefficients using the result of the filter coefficient calculation process (described later).
- the information processing device 10 updates the screen display according to the update of the filter coefficient (step S60).
- the screen display updating unit 17 causes the display device to output information representing the difference between before and after updating (for example, the image of the equalizer 3 as shown in FIG. 3).
- After updating the screen display in step S60, or when it is determined in step S40 that it is not time to update the filter coefficients (NO), the information processing apparatus 10 performs low-delay filtering (step S70) and outputs the filtered audio signal (step S80). Specifically, the filter processing unit 11 filters the audio signal and outputs the filtered audio signal. The output audio signal is then sent to an output device such as a speaker or headphones.
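The low-delay filtering of step S70 is typically realized with IIR biquads. The following is a minimal sketch, not the patent's own implementation: the peaking-EQ coefficient formula follows the widely used Audio EQ Cookbook form, and the band frequencies and gains are illustrative assumptions.

```python
import numpy as np
from scipy.signal import lfilter

def peaking_biquad(fs, f0, gain_db, q=1.0):
    """Second-order peaking-EQ coefficients (Audio EQ Cookbook form)."""
    a_lin = 10.0 ** (gain_db / 40.0)
    w0 = 2.0 * np.pi * f0 / fs
    alpha = np.sin(w0) / (2.0 * q)
    b = np.array([1 + alpha * a_lin, -2 * np.cos(w0), 1 - alpha * a_lin])
    a = np.array([1 + alpha / a_lin, -2 * np.cos(w0), 1 - alpha / a_lin])
    return b / a[0], a / a[0]

def apply_filtering(x, fs, bands):
    """Step S70 sketch: run the input through one biquad per frequency band."""
    y = x
    for f0, gain_db in bands:
        b, a = peaking_biquad(fs, f0, gain_db)
        y = lfilter(b, a, y)
    return y

fs = 48_000
x = np.random.default_rng(0).standard_normal(fs)            # 1 s of test noise
y = apply_filtering(x, fs, [(100.0, 6.0), (1000.0, -3.0)])  # illustrative bands
```

Because each biquad only needs the last two input and output samples, the processing delay is essentially the filter's group delay, which is why the output can be produced in real time.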
- Next, the information processing device 10 determines whether or not the input signal is continuing (step S90). If it is determined that the signal is continuing (YES), the process returns to step S30. On the other hand, if it is determined that it is not continuing (NO), the sound quality adjustment process is terminated.
- FIG. 8 is a flow chart showing the flow of filter coefficient calculation processing by the information processing device 10 .
- the filter coefficient calculation process starts when an audio signal is input to the sound source separation unit 13, for example.
- the information processing device 10 buffers the audio signal (step S110).
- the information processing device 10 selects a sound source separation coefficient (step S120).
- the sound source separation coefficient selection unit 12 selects a sound source separation coefficient based on the set target sound source type, and outputs the selected sound source separation coefficient to the sound source separation unit 13 .
- the information processing device 10 performs sound source separation on the audio signal (step S130). Specifically, the sound source separation unit 13 separates the sound of the target sound source type from the audio signal based on the sound source separation coefficient, and outputs the sound to the frequency characteristic estimation unit 14 .
- Next, the information processing device 10 estimates frequency characteristics (step S140). Specifically, the frequency characteristic estimation unit 14 estimates the frequency characteristics of the separated target sound source type and outputs them to the filter coefficient calculation unit 15.
- the information processing apparatus 10 calculates filter coefficients (step S150), and ends the filter coefficient calculation process. Specifically, the filter coefficient calculator 15 calculates the filter coefficient using the estimated frequency characteristic, and outputs the calculated filter coefficient to the coefficient updater 16 . This filter coefficient is used in updating the filter coefficient (step S50) described above.
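As an illustrative sketch of step S150, per-band equalizer gains can be derived from the estimated amplitude spectrum of the separated target source, scaled by the user's gain setting. The band edges and the energy-weighted mapping below are assumptions for illustration; the patent does not prescribe this exact rule.

```python
import numpy as np

def band_gains_from_spectrum(mag, fs, band_edges, user_gain_db):
    """Give each EQ band a share of the user's gain, proportional to how
    much of the separated target source's energy falls in that band."""
    freqs = np.linspace(0.0, fs / 2.0, len(mag))
    energies = np.array([np.sum(mag[(freqs >= lo) & (freqs < hi)] ** 2)
                         for lo, hi in band_edges])
    weights = energies / (energies.sum() + 1e-12)
    return weights * user_gain_db  # per-band gains in dB

fs = 48_000
t = np.arange(fs) / fs
mag = np.abs(np.fft.rfft(np.sin(2 * np.pi * 440.0 * t)))  # a 440 Hz "target source"
bands = [(0, 300), (300, 1000), (1000, 4000), (4000, 24000)]
gains = band_gains_from_spectrum(mag, fs, bands, user_gain_db=6.0)
```

For the 440 Hz test tone, nearly all the gain is assigned to the 300–1000 Hz band, which is the behavior one would expect from boosting "where the target source lives".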
- In the information processing device 10 according to the present embodiment, the coefficient updating unit 16 sets the filtering settings of the filter processing unit 11, which filters the input signal, so as to control the sound of the target sound source type in the input signal, the settings being determined using the estimation result obtained from the input signal by an estimation algorithm. This makes it possible to generate an optimal filter according to the input signal without requiring additional hardware such as a microphone. More specifically, a filter that is optimal for the characteristics of the sound of the target sound source type contained in the input signal can be generated. As a result, a signal having acoustic characteristics optimal for the user can be reproduced.
- Since the filtering in the filter processing unit 11 has a lower delay than the processing of the estimation algorithm (specifically, it is real-time processing), the output signal filtered by the filter processing unit 11 can be output with a low delay (for example, several hundred microseconds to several milliseconds).
- the filtering settings are automatically updated according to the coefficient update settings, the user does not have to finely adjust the filtering settings (specifically, the equalizer, etc.).
- The filtering settings are updated at regular intervals, at the timing when an instruction to change the filtering settings is received (i.e., at the user's arbitrary timing), or at a timing judged not to cause discomfort to the user. As a result, the change in sound quality when the filter coefficients are updated sounds more natural, so the feeling of immersion in the content is not impaired.
- When the filter coefficients are updated by the coefficient updating unit 16, the display device outputs an image of a user-operable equalizer, so that the difference between before and after the update can be grasped.
- the information processing apparatus according to the second embodiment differs from the information processing apparatus 10 according to the first embodiment in that the filter coefficient itself is estimated by an estimation algorithm.
- Other points (specific examples of applications, hardware configuration examples, etc.) are basically the same as those of the information processing apparatus 10 . Differences from the information processing apparatus 10 described above will be described below.
- FIG. 9 shows a configuration example of functional blocks of an information processing device (information processing device 10A) according to this embodiment.
- the information processing device 10A has a filter processing unit 11, a sound source separation coefficient selection unit 12, a sound source separation unit 13A, a coefficient update unit 16, and a screen display update unit 17.
- The sound source separation unit 13A performs filter-coefficient-output-type sound source separation processing. Put simply, the sound source separation unit 13A does not output amplitude spectrum values, but directly estimates the filtering settings (specifically, the filter coefficients) of the filter processing unit 11 itself.
- An estimation algorithm for sound source separation is used for this sound source separation processing. Specifically, this estimation algorithm estimates a filter coefficient from an input signal using a coefficient input from the sound source separation coefficient selection unit 12 and a set gain setting, and outputs the result as an estimation result.
- For example, a neural network can be used as the estimation algorithm. However, any other estimation algorithm capable of similar processing may be used.
- FIG. 10 shows an example of filter coefficient calculation by a neural network.
- This neural network uses, as its input vector, an amplitude spectrum obtained by transforming the input signal into the frequency domain, together with the value of the gain setting (e.g., a boost or cut specified in dB).
- The output of the neural network is the filter coefficients used in the filter processing unit 11. Since a neural network can be trained by preparing pairs of input data and output data in advance, such an unconventional input/output mapping can be realized.
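A minimal numpy sketch of such a coefficient-output network follows. The weights are random and untrained, purely to illustrate the input/output shapes; a real system would train them on input/output pairs as described above.

```python
import numpy as np

rng = np.random.default_rng(0)
n_bins, hidden, n_coeffs = 1025, 64, 10   # 1025 = rfft bins of a 2048-point frame

# Illustrative (untrained) weights; a trained network would supply these.
w1 = rng.standard_normal((hidden, n_bins + 1)) * 0.01  # +1 input for the gain setting
b1 = np.zeros(hidden)
w2 = rng.standard_normal((n_coeffs, hidden)) * 0.01
b2 = np.zeros(n_coeffs)

def estimate_coeffs(spectrum, gain_db):
    """Forward pass: amplitude spectrum + gain setting in, filter coefficients out."""
    x = np.concatenate([spectrum, [gain_db]])
    h = np.tanh(w1 @ x + b1)
    return w2 @ h + b2

frame = rng.standard_normal(2048)
spectrum = np.abs(np.fft.rfft(frame))
coeffs = estimate_coeffs(spectrum, gain_db=6.0)
```

Note how the output vector is only ten values, in contrast to a spectrum-sized output, which is the point developed in the next passage.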
- the filter coefficients thus obtained are output to the coefficient updating unit 16 and screen display updating unit 17 .
- the coefficient update unit 16 updates the filter coefficients of the filter processing unit 11 using the filter coefficients input from the sound source separation unit 13A.
- the screen display update unit 17 updates the display of the display device using the filter coefficients input from the sound source separation unit 13A.
- The rest is the same as in the first embodiment. As described above, in the present embodiment, the processing from sound source separation (step S130) through filter coefficient calculation (step S150) in the filter coefficient calculation processing of the information processing apparatus 10 in the first embodiment (see FIG. 8) is performed collectively in the sound source separation unit 13A. Everything else is as described with reference to the figures above.
- the present embodiment has the following effects.
- By directly outputting the filter coefficients themselves from the sound source separation unit 13A, the frequency characteristic estimation unit 14 and the filter coefficient calculation unit 15 (see FIG. 4) included in the information processing apparatus 10 of the first embodiment can be omitted. That is, the processing can be simplified by reducing the number of functional blocks.
- In the computation of the neural network itself, the number of dimensions of the output vector can also be reduced. Specifically, an amplitude spectrum output requires a size of 1024 or 2048, whereas an IIR filter coefficient output requires only several to several tens of values. This reduces the multiply-add operations in the latter part of the neural network. Therefore, compared to the case shown in FIG. 5 (where separated sounds are output), the amount of calculation, and thus the power consumption, can be reduced.
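The saving in the final layer can be checked with back-of-the-envelope arithmetic; the hidden-layer width of 512 is an assumption for illustration only.

```python
hidden = 512                       # hypothetical hidden-layer width
mults_spectrum = hidden * 2048     # output layer for a 2048-bin amplitude spectrum
mults_coeffs = hidden * 30         # output layer for a few tens of IIR coefficients
reduction = mults_spectrum / mults_coeffs
print(reduction)                   # roughly 68x fewer multiply-adds in the last layer
```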
- The information processing apparatus according to the third embodiment differs from the first embodiment in that the filter coefficients are calculated with a correction of the frequency characteristics according to the output device. The other points are the same as in the first embodiment.
- FIG. 11 shows a configuration example of the functional blocks of the information processing device (information processing device 10B) according to this embodiment. Like the information processing device 10 of the first embodiment, the information processing device 10B includes a filter processing unit 11, a sound source separation coefficient selection unit 12, a sound source separation unit 13, a frequency characteristic estimation unit 14, a filter coefficient calculation unit 15, a coefficient updating unit 16, and a screen display updating unit 17.
- the information processing device 10B has a frequency characteristic corrector 18 that performs the above-described correction between the frequency characteristic estimator 14 and the filter coefficient calculator 15 . That is, in the present embodiment, the frequency characteristics estimated by the frequency characteristics estimation section 14 are output to the frequency characteristics correction section 18 .
- the frequency characteristic correction unit 18 corrects the frequency characteristic estimated from the sound source separation output using the output device frequency characteristic.
- the output device frequency characteristics are hardware-specific frequency characteristics of output devices (for example, playback devices such as headphones and speakers) that output filtered output signals.
- The output device frequency characteristic is, for example, measured in advance and stored in a readable storage area. For example, if the model of the output device to be used is fixed, the characteristics of that model are stored. The model is then determined as necessary (whether automatically or manually), and the characteristics corresponding to the determination result are used.
- The frequency characteristic correction unit 18 performs correction by multiplying the frequency characteristic of the sound source separation output by the inverse (negative) characteristic of the output device, since the characteristic of the output device is applied during reproduction. For example, depending on the model of the output device, low-pitched sounds may be hard to reproduce, and the correction compensates for this. This makes it possible to obtain filter coefficients optimal for the output device.
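In dB terms, multiplying by the inverse ("negative") device characteristic is a subtraction. A minimal sketch with made-up numbers (e.g., headphones that are 6 dB weak in the lowest band):

```python
import numpy as np

def correct_for_device(source_db, device_db):
    """Subtract the output device's per-band response (dB) so that bands the
    device reproduces weakly receive a compensating boost."""
    return source_db - device_db

source_db = np.array([0.0, 2.0, 4.0, 1.0])     # estimated target-source characteristic
device_db = np.array([-6.0, -1.0, 0.0, -2.0])  # measured device characteristic
corrected = correct_for_device(source_db, device_db)
```

The first band, where the device is 6 dB weak, ends up with the largest corrected boost.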
- the frequency characteristic corrector 18 corrects the frequency characteristic input from the frequency characteristic estimator 14 and outputs the corrected frequency characteristic to the filter coefficient calculator 15.
- the frequency characteristic estimator 14 may read the output device frequency characteristic and directly estimate the corrected frequency characteristic.
- the present embodiment has the following effects in addition to the effects described in the first embodiment.
- a more optimal filter can be generated according to the input signal and output device frequency characteristics. In other words, since the sound quality is adjusted in consideration of both the content and the playback device, it is possible to provide higher quality sound.
- The information processing apparatus according to the fourth embodiment differs from the first embodiment in that part of the processing is executed on the server side. The other points are the same as in the first embodiment.
- FIG. 12 shows a configuration example of functional blocks of an information processing device (information processing device 10C) according to this embodiment.
- The information processing device 10C on the client side has a filter processing unit 11, a coefficient updating unit 16, and a screen display updating unit 17.
- the information processing device 10C also has a communication function capable of communicating with another information processing device 10D on the server side via a network such as the Internet.
- the other information processing device 10D has a sound source separation coefficient selection unit 12, a sound source separation unit 13, a frequency characteristic estimation unit 14, and a filter coefficient calculation unit 15. Further, another information processing device 10D has a communication function capable of communicating with the information processing device 10C via a network.
- That is, the processing of the sound source separation coefficient selection unit 12, the sound source separation unit 13, the frequency characteristic estimation unit 14, and the filter coefficient calculation unit 15 (specifically, the filter coefficient calculation processing shown in FIG. 8) is performed on the server side.
- the hardware configuration of another information processing device 10D is the same as that of the information processing device 10C (see FIG. 6).
- the information processing device 10C transmits the input signal and the designated target sound source type and gain settings to the other information processing device 10D.
- The other information processing device 10D uses these to calculate filter coefficients with the sound source separation coefficient selection unit 12, the sound source separation unit 13, the frequency characteristic estimation unit 14, and the filter coefficient calculation unit 15, and transmits the calculated filter coefficients to the information processing device 10C.
- the information processing device 10C receives the filter coefficients transmitted from the other information processing device 10D. Specifically, the coefficient updating unit 16 and the screen display updating unit 17 perform the above-described processes using the received filter coefficients. Thus, the information processing device 10C acquires the filter coefficients determined by the other information processing device 10D via the network.
- The information processing device 10C on the client side sends the input signal and the various settings used for sound quality adjustment to the other information processing device 10D on the server side, and receives the filter coefficients from the other information processing device 10D. This makes it possible to obtain high-performance filter coefficients with low computational complexity on the client. In other words, by executing the processing that requires a relatively large amount of calculation (specifically, sound source separation) on the server side, the calculation load on the client side can be significantly reduced.
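The client–server exchange can be sketched as follows. The JSON field names and the server stub are invented for illustration; the real server would run the full separation pipeline of FIG. 8 rather than return placeholders.

```python
import json

def server_compute_coefficients(request_json: str) -> str:
    """Server side (stub): would run sound source separation, frequency
    estimation, and coefficient calculation; here it returns placeholders."""
    req = json.loads(request_json)
    coeffs = [0.0] * req["n_bands"]       # placeholder for the heavy computation
    return json.dumps({"filter_coeffs": coeffs})

def client_request_coefficients(target_source, gain_db, n_bands=5):
    """Client side: send the settings, receive filter coefficients to apply."""
    request = json.dumps({"target": target_source,
                          "gain_db": gain_db,
                          "n_bands": n_bands})
    response = json.loads(server_compute_coefficients(request))
    return response["filter_coeffs"]

coeffs = client_request_coefficients("vocals", 6.0)
```

Only the lightweight filtering itself then runs on the client, using the received coefficients.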
- In the embodiments above, the setting instruction screen shown in FIG. 2 was used to set the target sound source type, gain, and coefficient update, but the setting instruction is not limited to this.
- the setting instruction screen may have another screen configuration.
- the user instruction input section 4 may be displayed separately from the game screen 2 .
- the instruction for each setting is not limited to using the setting instruction screen, and may be performed by voice input, for example.
- each setting is not limited to being appropriately set by the user, and a predetermined setting may be used.
- In the embodiments above, the equalizer 3 shown in FIG. 2 was given as an example of an equalizer used for sound quality adjustment, but the applicable equalizer is not limited to this; any equalizer (of any type, function, etc.) can be selected.
- The setting items to be set by the user, the configuration of the filter processing unit 11, and the like may be changed as necessary according to the equalizer used.
- The equalizer 3 may be a graphic equalizer or a parametric equalizer, and the setting items may be adapted to the parameters of the equalizer used.
- In the embodiments above, the sound of the target sound source type is amplified or attenuated, but sound control is not limited to this.
- For example, the sound of the target sound source type may be extracted or removed, or its frequency characteristics (for example, pitch) may be changed.
- By setting the sound of a specific person, the sound of a specific musical instrument, noise, or the like as the target sound source type, enhancement, modification, extraction, or removal of these sounds can be applied.
- In the embodiments above, a game was given as a specific example of an application to which the sound quality adjustment function of the information processing device 10 can be applied, but the function can also be applied to other applications, such as voice calls.
- In the third embodiment, the information processing apparatus 10B having the sound source separation unit 13 described in the first embodiment was provided with a function of correcting for the frequency characteristics of the output device (the frequency characteristic correction unit 18).
- However, this function may also be provided to a device having the sound source separation unit 13A described in the second embodiment.
- In that case, the output device frequency characteristic may be input to the sound source separation unit 13A in the same manner as the gain setting shown in FIG. 5 to obtain the estimation result.
- In the fourth embodiment, the processing of the sound source separation coefficient selection unit 12, the sound source separation unit 13, the frequency characteristic estimation unit 14, and the filter coefficient calculation unit 15 described in the first embodiment was performed on the server side.
- However, the processing performed on the server side is not limited to this.
- For example, the processing of the sound source separation coefficient selection unit 12 and the sound source separation unit 13A may be performed on the server side, or only a part of these processes may be performed on the server side.
- The present disclosure can also adopt the following configurations.
- (1) An information processing device having: a filter processing unit that filters an input signal; and a filter setting unit that sets the filtering settings so as to control the sound of the target sound source type in the input signal, the settings being determined using an estimation result obtained from the input signal by an estimation algorithm.
- (2) The information processing device according to (1), wherein the filtering is processing with a lower delay than the estimation algorithm.
- (3) The information processing device according to (1) or (2), wherein the filtering is real-time processing.
- (4) The information processing device according to any one of (1) to (3), wherein the estimation algorithm estimates and separates the sound of the target sound source type from the input signal, and the filtering settings are determined based on the frequency characteristics of the sound of the target sound source type separated by the estimation algorithm.
- (5) The information processing device according to any one of (1) to (3), wherein the estimation algorithm estimates the filtering settings themselves.
- (6) The information processing device according to any one of (1) to (5), wherein the filtering settings amplify or attenuate each frequency band of the input signal as appropriate so that the sound of the target sound source type is amplified or attenuated.
- (7) The information processing device according to any one of (1) to (6), wherein the filtering is performed using an IIR (Infinite Impulse Response) filter, and the filtering settings are the filter coefficients of the filter.
- (8) The information processing device according to any one of (1) to (7), wherein the estimation algorithm uses a neural network trained on learning input signals so as to obtain the estimation result.
- (9) The information processing device according to any one of (1) to (8), wherein the filtering settings are updated at the timing when an instruction for the filtering is received, at regular intervals, or at a timing judged, based on predetermined judgment information, not to cause discomfort.
- (10) The information processing device according to any one of (1) to (9), wherein, when the filtering settings are updated, information representing the difference between before and after the update is output to an output device.
- (11) The information processing device according to any one of (1) to (10), wherein the output device is a display device, and the display device is caused to display an image of a user-operable equalizer including the information representing the difference.
- (12) The information processing device according to any one of (1) to (11), wherein the filtering settings are corrected according to the frequency characteristics of an output device that outputs the filtered output signal.
- (13) The information processing device according to any one of (1) to (12), wherein the filtering settings are determined by another information processing device and acquired via a network.
- (14) An information processing method that performs processing in which the filtering settings of a filter processing unit that filters an input signal are set so as to control the sound of the target sound source type in the input signal, the settings being determined using an estimation result obtained from the input signal by an estimation algorithm.
- (15) A program that causes a computer to execute such processing.
- 3 Equalizer
- 10A, 10B, 10C Information processing device
- 11 Filter processing unit
- 13, 13A Sound source separation unit
- 14 Frequency characteristic estimation unit
- 15 Filter coefficient calculation unit
- 16 Coefficient updating unit
- 17 Screen display updating unit
- 18 Frequency characteristic correction unit
Abstract
Description
The present disclosure is an information processing device having:
a filter processing unit that filters an input signal; and
a filter setting unit that sets the filtering settings so as to control the sound of the target sound source type in the input signal, the settings being determined using an estimation result obtained from the input signal by an estimation algorithm.
The present disclosure is also an information processing method that performs processing in which the filtering settings of a filter processing unit that filters an input signal are set so as to control the sound of the target sound source type in the input signal, the settings being determined using an estimation result obtained from the input signal by an estimation algorithm.
The present disclosure is also a program that causes a computer to execute such processing.
<1. Background>
<2. First Embodiment>
<3. Second Embodiment>
<4. Third Embodiment>
<5. Fourth Embodiment>
<6. Modifications>
The embodiments and the like described below are preferred specific examples of the present disclosure, and the content of the present disclosure is not limited to these embodiments. In the following description, elements having substantially the same functional configuration are given the same reference numerals, and redundant descriptions are omitted as appropriate.
First, the background of the present disclosure will be described. FIG. 1 shows a configuration example of a commonly used equalizer. For example, the user can adjust the gain (specifically, the Gain value) of each frequency band of the equalizer with knob operations or the like, according to the sounds they want or do not want to hear. For the internal signal processing, an IIR (Infinite Impulse Response) filter or an FIR (Finite Impulse Response) filter is generally used. In particular, when an IIR filter is used, the delay from input to output is on the order of several hundred microseconds to several milliseconds, so no perceptible sound delay is felt. Since sound quality can therefore be adjusted in real time, such filters are widely used in applications where real-time performance is important, such as games and voice calls.
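The sub-millisecond latency figure for IIR filtering can be checked numerically. As a sketch with illustrative parameters (one +6 dB peaking biquad at 1 kHz, Audio EQ Cookbook form), the filter's group delay at 48 kHz stays well below a few milliseconds:

```python
import numpy as np
from scipy.signal import group_delay

fs = 48_000
f0, q, gain_db = 1000.0, 1.0, 6.0      # illustrative peaking-EQ settings
a_lin = 10.0 ** (gain_db / 40.0)
w0 = 2.0 * np.pi * f0 / fs
alpha = np.sin(w0) / (2.0 * q)
b = np.array([1 + alpha * a_lin, -2 * np.cos(w0), 1 - alpha * a_lin])
a = np.array([1 + alpha / a_lin, -2 * np.cos(w0), 1 - alpha / a_lin])

_, gd = group_delay((b, a), fs=fs)     # group delay in samples vs. frequency
max_delay_ms = float(gd.max()) / fs * 1e3
```

Each additional biquad in a cascade adds a similarly small delay, which is why IIR equalizers feel instantaneous in games and calls.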
Stefan Uhlich et al. "Improving music source separation based on deep neural networks through data augmentation and network blending." 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2017), 2017.
[2-1. Specific example of an application]
First, a specific example of an application according to the first embodiment will be described. In this embodiment, a case in which the sound quality of game playback audio is adjusted will be described as an example. FIG. 2 shows a display example of a setting instruction screen for sound quality adjustment. In the illustrated example, the sound quality adjustment function is incorporated into the in-game screen. This makes it possible, for example, to adjust sound quality during a game.
FIG. 4 shows a configuration example of the functional blocks of the information processing device (information processing device 10) according to the first embodiment. The information processing device 10 implements, for example, signal processing circuitry applicable to the applications described above (specifically, to playback signals). The information processing device 10 controls sound by applying signal processing to an input signal (specifically, an audio signal). The audio signal is obtained, for example, from an application such as the above-described game or a voice call (for example, a web conference system). The information processing device 10 can be configured as any of various electronic devices (specifically, computer devices) that handle audio signals, such as a personal computer, smartphone, tablet terminal, game console, speaker device, headphones, earphones, smart home appliance, television, player, recorder, telephone, in-vehicle device, monitoring device, or medical device.
FIG. 6 shows a hardware configuration example of the information processing device 10. The information processing device 10 has a control unit 101, a storage unit 102, an input unit 103, a communication unit 104, and an output unit 105, which are interconnected by a bus.
Although embodiments of the present disclosure have been specifically described above, the present disclosure is not limited to the embodiments described above, and various modifications based on the technical idea of the present disclosure are possible. For example, the various modifications described below are possible, and one or more arbitrarily selected modification aspects can be combined as appropriate. The configurations, methods, steps, shapes, materials, numerical values, and the like of the embodiments described above can be combined with or replaced by one another without departing from the gist of the present disclosure. One element can also be divided into two or more, and parts can be omitted.
(1)
An information processing device having:
a filter processing unit that filters an input signal; and
a filter setting unit that sets the filtering settings so as to control the sound of the target sound source type in the input signal, the settings being determined using an estimation result obtained from the input signal by an estimation algorithm.
(2)
The information processing device according to (1), wherein the filtering is processing with a lower delay than the estimation algorithm.
(3)
The information processing device according to (1) or (2), wherein the filtering is real-time processing.
(4)
The information processing device according to any one of (1) to (3), wherein the estimation algorithm estimates and separates the sound of the target sound source type from the input signal, and the filtering settings are determined based on the frequency characteristics of the sound of the target sound source type separated by the estimation algorithm.
(5)
The information processing device according to any one of (1) to (3), wherein the estimation algorithm estimates the filtering settings themselves.
(6)
The information processing device according to any one of (1) to (5), wherein the filtering settings amplify or attenuate each frequency band of the input signal as appropriate so that the sound of the target sound source type is amplified or attenuated.
(7)
The information processing device according to any one of (1) to (6), wherein the filtering is performed using an IIR (Infinite Impulse Response) filter, and the filtering settings are the filter coefficients of the filter.
(8)
The information processing device according to any one of (1) to (7), wherein the estimation algorithm uses a neural network trained on learning input signals so as to obtain the estimation result.
(9)
The information processing device according to any one of (1) to (8), wherein the filtering settings are updated at the timing when an instruction for the filtering is received, at regular intervals, or at a timing judged, based on predetermined judgment information, not to cause discomfort.
(10)
The information processing device according to any one of (1) to (9), wherein, when the filtering settings are updated, information representing the difference between before and after the update is output to an output device.
(11)
The information processing device according to any one of (1) to (10), wherein the output device is a display device, and the display device is caused to display an image of a user-operable equalizer including the information representing the difference.
(12)
The information processing device according to any one of (1) to (11), wherein the filtering settings are corrected according to the frequency characteristics of an output device that outputs the filtered output signal.
(13)
The information processing device according to any one of (1) to (12), wherein the filtering settings are determined by another information processing device and acquired via a network.
(14)
An information processing method that performs processing in which the filtering settings of a filter processing unit that filters an input signal are set so as to control the sound of the target sound source type in the input signal, the settings being determined using an estimation result obtained from the input signal by an estimation algorithm.
(15)
A program that causes a computer to execute processing in which the filtering settings of a filter processing unit that filters an input signal are set so as to control the sound of the target sound source type in the input signal, the settings being determined using an estimation result obtained from the input signal by an estimation algorithm.
Claims (15)
- An information processing device comprising: a filter processing unit that filters an input signal; and a filter setting unit that sets the filtering settings so as to control the sound of the target sound source type in the input signal, the settings being determined using an estimation result obtained from the input signal by an estimation algorithm.
- The information processing device according to claim 1, wherein the filtering is processing with a lower delay than the estimation algorithm.
- The information processing device according to claim 1, wherein the filtering is real-time processing.
- The information processing device according to claim 1, wherein the estimation algorithm estimates and separates the sound of the target sound source type from the input signal, and the filtering settings are determined based on the frequency characteristics of the sound of the target sound source type separated by the estimation algorithm.
- The information processing device according to claim 1, wherein the estimation algorithm estimates the filtering settings themselves.
- The information processing device according to claim 1, wherein the filtering settings amplify or attenuate each frequency band of the input signal as appropriate so that the sound of the target sound source type is amplified or attenuated.
- The information processing device according to claim 1, wherein the filtering is performed using an IIR (Infinite Impulse Response) filter, and the filtering settings are the filter coefficients of the filter.
- The information processing device according to claim 1, wherein the estimation algorithm uses a neural network trained on learning input signals so as to obtain the estimation result.
- The information processing device according to claim 1, wherein the filtering settings are updated at the timing when an instruction for the filtering is received, at regular intervals, or at a timing judged, based on predetermined judgment information, not to cause discomfort.
- The information processing device according to claim 1, wherein, when the filtering settings are updated, information representing the difference between before and after the update is output to an output device.
- The information processing device according to claim 10, wherein the output device is a display device, and the display device is caused to display an image of a user-operable equalizer including the information representing the difference.
- The information processing device according to claim 1, wherein the filtering settings are corrected according to the frequency characteristics of an output device that outputs the filtered output signal.
- The information processing device according to claim 1, wherein the filtering settings are determined by another information processing device and acquired via a network.
- An information processing method that performs processing in which the filtering settings of a filter processing unit that filters an input signal are set so as to control the sound of the target sound source type in the input signal, the settings being determined using an estimation result obtained from the input signal by an estimation algorithm.
- A program that causes a computer to execute processing in which the filtering settings of a filter processing unit that filters an input signal are set so as to control the sound of the target sound source type in the input signal, the settings being determined using an estimation result obtained from the input signal by an estimation algorithm.
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
EP22841676.4A EP4373134A1 (en) | 2021-07-15 | 2022-03-02 | Information processing device, information processing method, and program |
CN202280048095.7A CN117652159A (zh) | 2021-07-15 | 2022-03-02 | 信息处理装置、信息处理方法和程序 |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP2021116815 | 2021-07-15 | ||
JP2021-116815 | 2021-07-15 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2023286326A1 true WO2023286326A1 (ja) | 2023-01-19 |
Family
ID=84918824
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/JP2022/008820 WO2023286326A1 (ja) | 2021-07-15 | 2022-03-02 | 情報処理装置、情報処理方法およびプログラム |
Country Status (3)
Country | Link |
---|---|
EP (1) | EP4373134A1 (ja) |
CN (1) | CN117652159A (ja) |
WO (1) | WO2023286326A1 (ja) |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2013009198A (ja) * | 2011-06-24 | 2013-01-10 | Toshiba Corp | 音響制御装置、音響補正装置、及び音響補正方法 |
JP2020197712A (ja) | 2019-05-31 | 2020-12-10 | アップル インコーポレイテッドApple Inc. | コンテキストに基づく周囲音の増強及び音響ノイズキャンセル |
JP2021509552A (ja) * | 2017-12-29 | 2021-03-25 | ハーマン インターナショナル インダストリーズ, インコーポレイテッド | 高度なオーディオ処理システム |
JP2021076831A (ja) * | 2019-10-21 | 2021-05-20 | ソニーグループ株式会社 | 電子機器、方法およびコンピュータプログラム |
Non-Patent Citations (1)
Title |
---|
UHLICH, STEFAN et al.: "Improving music source separation based on deep neural networks through data augmentation and network blending", IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2017 |
Also Published As
Publication number | Publication date |
---|---|
EP4373134A1 (en) | 2024-05-22 |
CN117652159A (zh) | 2024-03-05 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN113676803B (zh) | Active noise reduction method and device | |
CN107615651B (zh) | System and method for improved audio perception | |
CN106664473B (zh) | Information processing device, information processing method, and program | |
US9413322B2 (en) | Audio loudness control system | |
US20170223471A1 (en) | Remotely updating a hearing aid profile | |
US9531338B2 (en) | Signal processing apparatus, signal processing method, program, signal processing system, and communication terminal | |
JP4262597B2 (ja) | Sound system | |
CN104685563B (zh) | Audio signal shaping for playback in a noisy environment | |
US20140254828A1 (en) | System and Method for Personalization of an Audio Equalizer | |
US20110002467A1 (en) | Dynamic enhancement of audio signals | |
JP2005537702A (ja) | Hearing aid and method for enhancing speech intelligibility | |
JP5085769B1 (ja) | Acoustic control device, acoustic correction device, and acoustic correction method | |
EP3038255B1 (en) | An intelligent volume control interface | |
WO2023286326A1 (ja) | Information processing device, information processing method, and program | |
WO2024001463A1 (zh) | Audio signal processing method and apparatus, electronic device, computer-readable storage medium, and computer program product | |
GB2490092A (en) | Reducing howling by applying a noise attenuation factor to a frequency which has above average gain | |
CN110740404B (zh) | Audio correlation processing method and audio processing device | |
JP5695896B2 (ja) | Sound quality control device, sound quality control method, and sound quality control program | |
JP2019091971A (ja) | Audio processor and audio playback device | |
CN116349252A (zh) | Method and device for processing binaural recordings | |
US10902864B2 (en) | Mixed-reality audio intelligibility control | |
JP6954905B2 (ja) | System for outputting an audio signal and corresponding method and setting device | |
EP4333464A1 (en) | Hearing loss amplification that amplifies speech and noise subsignals differently | |
CN114125625B (zh) | Noise reduction adjustment method, earphone, and computer-readable storage medium | |
US20240064487A1 (en) | Customized selective attenuation of game audio
Legal Events
Date | Code | Title | Description |
---|---|---|---|
121 | EP: the EPO has been informed by WIPO that EP was designated in this application | Ref document number: 22841676; Country of ref document: EP; Kind code of ref document: A1 |
WWE | WIPO information: entry into national phase | Ref document number: 202280048095.7; Country of ref document: CN |
WWE | WIPO information: entry into national phase | Ref document number: 18577560; Country of ref document: US |
WWE | WIPO information: entry into national phase | Ref document number: 2022841676; Country of ref document: EP |
NENP | Non-entry into the national phase | Ref country code: DE |
ENP | Entry into the national phase | Ref document number: 2022841676; Country of ref document: EP; Effective date: 20240215 |
NENP | Non-entry into the national phase | Ref country code: JP |