WO2021085174A1 - Voice processing device and voice processing method - Google Patents

Voice processing device and voice processing method

Info

Publication number
WO2021085174A1
WO2021085174A1 PCT/JP2020/039054 JP2020039054W WO2021085174A1 WO 2021085174 A1 WO2021085174 A1 WO 2021085174A1 JP 2020039054 W JP2020039054 W JP 2020039054W WO 2021085174 A1 WO2021085174 A1 WO 2021085174A1
Authority
WO
WIPO (PCT)
Prior art keywords
unit
processing device
voice
microphone
signal
Prior art date
Application number
PCT/JP2020/039054
Other languages
French (fr)
Japanese (ja)
Inventor
Yohei Sakuraba (洋平 櫻庭)
Original Assignee
Sony Corporation (ソニー株式会社)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Sony Corporation (ソニー株式会社)
Publication of WO2021085174A1

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04R: LOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
    • H04R1/00: Details of transducers, loudspeakers or microphones
    • H04R1/20: Arrangements for obtaining desired frequency or directional characteristics
    • H04R1/32: Arrangements for obtaining desired frequency or directional characteristics for obtaining desired directional characteristic only
    • H04R1/40: Arrangements for obtaining desired frequency or directional characteristics for obtaining desired directional characteristic only by combining a number of identical transducers
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04R: LOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
    • H04R3/00: Circuits for transducers, loudspeakers or microphones
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04R: LOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
    • H04R3/00: Circuits for transducers, loudspeakers or microphones
    • H04R3/02: Circuits for transducers, loudspeakers or microphones for preventing acoustic reaction, i.e. acoustic oscillatory feedback
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04R: LOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
    • H04R3/00: Circuits for transducers, loudspeakers or microphones
    • H04R3/04: Circuits for transducers, loudspeakers or microphones for correcting frequency response

Definitions

  • the present technology relates to a voice processing device and a voice processing method, and more particularly to a voice processing device and a voice processing method capable of suppressing howling more reliably.
  • the method of providing a notch filter is not sufficient as a countermeasure when the parameters in the signal processing unit that processes the audio signal are changed, and a technique for reliably suppressing howling has been required.
  • This technology was made in view of such a situation, and makes it possible to suppress howling more reliably.
  • the voice processing device of one aspect of the present technology includes a signal processing unit that processes the voice signal picked up by a microphone and performs a threshold determination by comparing, with a predetermined threshold value, an index regarding the amount of sound, output from a speaker in accordance with the voice signal, that wraps around to the microphone.
  • in the voice processing method of one aspect of the present technology, a voice processing device processes the voice signal picked up by a microphone and performs a threshold determination by comparing, with a predetermined threshold value, an index relating to the amount of sound, output from a speaker in accordance with the voice signal, that wraps around to the microphone.
  • that is, in the voice processing device and the voice processing method of one aspect of the present technology, the audio signal picked up by the microphone is processed, the index regarding the amount of wraparound is compared with a predetermined threshold, and the threshold determination is performed.
  • the voice processing device of one aspect of the present technology may be an independent device or an internal block constituting one device.
  • a hand microphone or pin microphone is used for loudspeaking (sound picked up by a microphone is reproduced from a speaker installed in the same room).
  • the reason for this is that the sensitivity of the microphone must be reduced in order to reduce the amount of wraparound from the speaker to the microphone, so the microphone must be placed near the talker's mouth so that the volume can be increased.
  • as shown in FIG. 1, amplifying sound with a microphone installed at a position away from the talker's mouth, such as a microphone 10 mounted on the ceiling, instead of a hand-held microphone or a pin microphone, is called "off-mic loudspeaker" (off-microphone sound amplification).
  • for example, the voice spoken by the teacher is picked up by the microphone 10 mounted on the ceiling and amplified throughout the classroom so that the students can hear it.
  • the ceiling-mounted microphone 10 needs to be more sensitive than a handheld microphone or pin microphone, so the amount of self-voice wraparound from the speaker 20 to the microphone 10, that is, the acoustic coupling, is large.
  • the device that suppresses howling is called a howling suppressor or feedback reducer.
  • there are two types of feedback reducers: a pre-measurement type and a follow-up type.
  • with a pre-measurement type feedback reducer, it is common to measure in advance whether howling occurs and, if it does, to insert a notch filter at that frequency. Instead of the notch filter, a graphic equalizer or the like may be used to reduce the gain at the frequency at which howling occurs.
  • the follow-up type feedback reducer is a device that automatically detects howling and dynamically adds a notch filter at the frequency at which howling occurs, or lowers the gain at that frequency with a graphic equalizer instead of the notch filter.
  • the pre-measurement type feedback reducer has a problem that new howling occurs when internal parameters such as volume and equalizer are changed.
  • this technology makes it possible to suppress howling more reliably.
  • howling can be reliably suppressed even when internal parameters such as volume and equalizer are changed during off-mic loudspeaker.
  • FIG. 2 shows an example of the configuration of an embodiment of a voice processing device to which the present technology is applied.
  • the voice processing device 1 includes an A / D conversion unit 12, a signal processing unit 13, and a signal output unit 14.
  • the voice processing device 1 is electrically connected to each of the microphone 10 and the speaker 20.
  • the microphone 10 is composed of a microphone unit 11-1 and a microphone unit 11-2.
  • An A / D conversion unit 12 is provided after the two microphone units 11.
  • the microphone 10 may be provided with one or more microphone units 11. Further, the microphone unit 11 may be an omnidirectional microphone or a unidirectional microphone.
  • the microphone unit 11-1 collects voice (sound) and supplies a voice signal as an analog signal to the A / D conversion unit 12.
  • the microphone unit 11-2 collects voice (sound) and supplies a voice signal as an analog signal to the A / D conversion unit 12.
  • the A / D conversion unit 12 converts the audio signals supplied from the microphone unit 11-1 and the microphone unit 11-2 from analog signals to digital signals and supplies them to the signal processing unit 13.
  • the signal processing unit 13 is configured as a digital signal processor (DSP: Digital Signal Processor) or the like.
  • the signal processing unit 13 performs predetermined signal processing on the audio signal supplied from the A / D conversion unit 12, and supplies the resulting audio signal (audio signal with suppressed howling) to the signal output unit 14.
  • the signal output unit 14 includes an audio output terminal.
  • the signal output unit 14 outputs the audio signal supplied from the signal processing unit 13 to the speaker 20 connected to the audio output terminal.
  • the speaker 20 processes the audio signal output from the audio processing device 1 (signal output unit 14), and outputs the audio (sound) corresponding to the audio signal.
  • the voice processing device 1 may include at least one of the microphone 10 and the speaker 20. Further, the microphone 10 may include all or at least a part of the A / D conversion unit 12, the signal processing unit 13, and the signal output unit 14. That is, the signal processing unit 13 may be provided inside the housing of the microphone 10 or may be externally attached to the housing of the microphone 10.
  • the signal processing unit 13 includes an auto equalizer unit 101, a volume unit 102, a calibration signal generation unit 103, an output sound power value calculation unit 104, an input sound power value calculation unit 105, and a feedback rate calculation unit 106.
  • the auto equalizer unit 101 automatically changes the frequency characteristics of the audio signal input therein and supplies it to the volume unit 102.
  • the volume unit 102 adjusts the volume of the audio signal supplied from the auto equalizer unit 101 and supplies it to the signal output unit 14.
  • the calibration signal generation unit 103 generates a calibration signal such as a white noise signal or a pink noise signal during a calibration period such as during setting, and supplies the calibration signal to the signal output unit 14.
  • the signal output unit 14 outputs the calibration signal supplied from the calibration signal generation unit 103 to the speaker 20.
  • the speaker 20 outputs a calibration sound corresponding to the calibration signal input from the signal output unit 14.
  • the calibration sound is picked up by the microphone 10, and the audio signal is input to the signal processing unit 13.
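  • The calibration signal described above can be sketched as follows. This is a minimal illustration only: the helper name `white_noise`, the amplitude, the seed, and the 48 kHz sampling rate are assumptions made for the example and are not taken from the source.

```python
import random


def white_noise(num_samples, amplitude=0.1, seed=0):
    """Return Gaussian white noise samples (hypothetical helper, not from the source)."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    return [rng.gauss(0.0, amplitude) for _ in range(num_samples)]


# One second of calibration signal at an assumed 48 kHz sampling rate.
calibration_signal = white_noise(48000)
```

  • In a real system the samples would be sent to the signal output unit while the microphone input is recorded at the same time; pink noise would additionally need a spectral shaping filter, which is omitted here.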
  • each of the output sound power and the input sound power is calculated by the output sound power value calculation unit 104 and the input sound power value calculation unit 105.
  • the output sound power value calculation unit 104 calculates the output sound power value based on the calibration signal input therein, that is, the audio signal output from the signal processing unit 13, and supplies the output sound power value to the feedback rate calculation unit 106.
  • the input sound power value calculation unit 105 calculates the input sound power value based on the audio signal input therein, that is, the audio signal input to the signal processing unit 13, and supplies the input sound power value to the feedback rate calculation unit 106.
  • the feedback rate calculation unit 106 is input with the output sound power value from the output sound power value calculation unit 104 and the input sound power value from the input sound power value calculation unit 105.
  • the feedback rate calculation unit 106 calculates the feedback rate by performing a predetermined calculation using the output sound power value and the input sound power value.
  • the feedback rate is an index related to the amount of sound wraparound (wraparound rate) that quantitatively expresses how much the sound output from the speaker 20 wraps around the microphone 10.
  • the feedback rate calculated by the feedback rate calculation unit 106 is supplied to the auto equalizer unit 101. As a result, the feedback rate is set in the auto equalizer unit 101 during the calibration period.
  • after that, at the time of off-mic loudspeaker, the speaker 20 outputs a sound corresponding to the audio signal input from the signal output unit 14. This sound is picked up by the microphone 10, and the audio signal is input to the signal processing unit 13.
  • the auto equalizer unit 101 performs threshold determination using the feedback rate of each frequency band.
  • the auto equalizer unit 101 calculates a gain for lowering the feedback rate to the threshold value or less, and applies the gain to the audio signal.
  • the auto equalizer unit 101 resets (updates) the feedback rate in conjunction with the changed volume.
  • the auto equalizer unit 101 corrects the feedback rate by updating it, and performs the threshold determination using the corrected feedback rate of each frequency band.
  • a gain for lowering the corrected feedback rate to the threshold value or less is calculated, and the gain is applied to the audio signal.
  • the signal processing unit 13 processes the voice signal picked up by the microphone 10, performs a threshold determination by comparing, with a predetermined threshold, the index (feedback rate) related to the amount of sound, output from the speaker 20 in accordance with the voice signal, that wraps around to the microphone 10, and applies the gain (suppression gain) corresponding to the result of the threshold determination to the voice signal. This reduces howling that occurs during off-mic loudspeaker and, at the same time, improves the loudspeaker sound quality.
  • the signal processing unit 13 corrects the feedback rate according to the volume change and adjusts the gain (suppression gain) for howling countermeasures, so that the occurrence of howling can be suppressed even if the volume is changed.
  • in step S11, it is determined whether or not it is the time of setting. If it is determined in step S11 that it is the time of setting, the process proceeds to step S12, and the processes of steps S12 to S15 are performed in order to set the feedback rate.
  • in step S12, the calibration signal generation unit 103 generates a calibration signal.
  • for example, a white noise signal or the like is generated.
  • in step S13, the signal output unit 14 outputs the generated calibration signal to the speaker 20.
  • as a result, white noise or the like is output from the speaker 20 as a calibration sound.
  • in step S14, the feedback rate calculation process is performed.
  • the details of the feedback rate calculation process will be described with reference to the flowchart of FIG.
  • in step S31, the output sound power value calculation unit 104 calculates the output sound power value based on the calibration signal output from the signal processing unit 13.
  • in step S32, the input sound power value calculation unit 105 calculates the input sound power value based on the voice signal input to the signal processing unit 13.
  • in step S33, the feedback rate calculation unit 106 calculates the feedback rate by performing a predetermined calculation using the calculated output sound power value and input sound power value.
  • regarding equation (1), even if the value of F0(ω) is less than 1.0 and howling does not occur, when the value exceeds a certain value (for example, about 0.5), the sound loop is large, and reverberation occurs as if one were talking in a hall. For example, this value of approximately 0.5 can be set as the threshold value T, as a criterion for judging sound quality with a reverberant feeling.
  • when the process of step S33 is completed, the process returns to step S14 of FIG. 3, and the subsequent processes are performed.
  • in step S15, the auto equalizer unit 101 sets the feedback rate F0(ω) for each frequency band calculated by the feedback rate calculation process.
  • if the process of step S15 is completed, or if it is determined in step S11 that it is not the time of setting, the process is terminated.
  • in this way, the feedback rate F0(ω) for each frequency band is calculated using the calibration sound at the time of setting and is set as an initial value.
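  • Equation (1) is not reproduced in this text, so the following is only a sketch of one plausible form of the feedback rate calculation. It assumes the per-band feedback rate is the amplitude ratio of input to output, sqrt(P_in / P_out), which is chosen because it makes a rate of 0.5 correspond to about -6 dB, as the description states elsewhere. The function names and the per-band power values are hypothetical.

```python
import math


def feedback_rate(output_power, input_power):
    """Per-band feedback rate as an amplitude ratio (assumed form, not the source's equation (1))."""
    rates = []
    for p_out, p_in in zip(output_power, input_power):
        # Guard against silent bands in the calibration signal.
        rates.append(math.sqrt(p_in / p_out) if p_out > 0.0 else 0.0)
    return rates


def to_db(ratio):
    """Convert an amplitude ratio to decibels."""
    return 20.0 * math.log10(ratio)


# Made-up per-band power values for four frequency bands.
p_out = [1.0, 1.0, 1.0, 1.0]
p_in = [0.04, 0.25, 0.64, 1.21]
f0 = feedback_rate(p_out, p_in)
```

  • With these made-up values the third and fourth bands exceed a threshold of 0.5, and the fourth exceeds 1.0, the level at which the loop gain would allow howling.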
  • the feedback rate setting process is not limited to the setting time, and may be executed when a predetermined operation such as pressing the start button is performed at the start of use such as at the start of a class or a conference.
  • in step S51, the auto equalizer unit 101 determines whether or not the feedback rate F0(ω) of each frequency band is equal to or less than the threshold value T for the audio signal input therein.
  • this feedback rate F0(ω) is set at the time of setting or the like.
  • if it is determined in step S51 that the feedback rate F0(ω) of the frequency band is equal to or less than the threshold value T, the process proceeds to step S52.
  • in step S52, the auto equalizer unit 101 sets the suppression gain G(ω) of the frequency band to 1.0.
  • on the other hand, if the feedback rate F0(ω) exceeds the threshold value T, in step S53 the auto equalizer unit 101 calculates and sets the suppression gain G(ω) of the frequency band by the following equation (2).
  • the suppression gain G(ω) is calculated by the relationship shown in the following equation (3) according to the result of the threshold value determination of the feedback rate F0(ω).
  • when the process of step S52 or S53 is completed, the process proceeds to step S54.
  • in step S54, the auto equalizer unit 101 applies the calculated suppression gain G(ω) for each frequency band to the audio signal input therein. That is, the auto equalizer unit 101 calculates the suppression gain G(ω) for limiting the actual feedback rate F0(ω) to the threshold value T or less, and applies the calculated suppression gain G(ω) to the audio signal.
  • in this way, a threshold determination is performed using the feedback rate F0(ω) of each frequency band at the time of off-mic loudspeaker, and when the feedback rate F0(ω) exceeds the threshold T, the suppression gain G(ω) is calculated so that the feedback rate F0(ω) becomes equal to or less than the threshold value T, and the calculated suppression gain G(ω) is multiplied by the voice signal.
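  • Equations (2) and (3) are not reproduced in this text. The sketch below assumes the simplest gain that satisfies the stated goal of limiting the effective feedback rate to the threshold: G = 1.0 when F0 is at or below T, and G = T / F0 otherwise. The function name and the example values are hypothetical.

```python
def suppression_gains(feedback_rates, threshold=0.5):
    """Per-band suppression gain (assumed form: 1.0 or threshold / rate)."""
    gains = []
    for rate in feedback_rates:
        if rate <= threshold:
            gains.append(1.0)               # step S52: no suppression needed
        else:
            gains.append(threshold / rate)  # step S53: scale the band down to the threshold
    return gains


# Made-up feedback rates for four frequency bands.
rates = [0.2, 0.5, 0.8, 1.0]
gains = suppression_gains(rates)
# Effective feedback rate after the gain is applied (step S54).
limited = [r * g for r, g in zip(rates, gains)]
```

  • Only the bands above the threshold are attenuated, which matches the description's point that sound quality is not sacrificed more than necessary.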
  • FIG. 6 shows an example of feedback rate control in the auto equalizer unit 101.
  • A in FIG. 6 is a bar graph showing the feedback rate (unit: dB) for each frequency band as vertical bars, with frequency on the horizontal axis. This feedback rate is obtained from the relationship between the output sound and the input sound by the above-mentioned equation (1) at the time of setting.
  • the numbers #1 to #32 corresponding to the vertical bars are given as the values on the horizontal axis, and these numbers identify the frequency bands.
  • the feedback rate F0(ω) is set for each of the frequency bands #1 to #32.
  • This auto equalizer value is an audio signal obtained by the processing of the auto equalizer unit 101 at the time of off-mic loudspeaker.
  • the auto equalizer value is shown for each frequency band # 1 to # 32.
  • the threshold T is set to -6 dB, and the auto-equalizer value is set so that the feedback rate F0(ω) does not exceed -6 dB.
  • C in FIG. 6 is a bar graph showing the corrected feedback rate (unit: dB) for each frequency band by a vertical bar with the horizontal axis as the frequency.
  • the feedback rate F0(ω) for each of the frequency bands #1 to #32 is adjusted so as to be equal to or less than the threshold value T of -6 dB.
  • the suppression gain G(ω) corresponding to the result of the threshold determination of the feedback rate F0(ω) for each frequency band is applied to the audio signal in the corresponding frequency band. As a result, the corrected feedback rate F0(ω) for each frequency band can be kept at or below the threshold value T.
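  • The three panels of FIG. 6 (feedback rate, auto-equalizer value, corrected feedback rate) can be illustrated in the dB domain as below. This is a sketch under the assumption that the auto-equalizer value is a per-band cut equal to the amount by which the feedback rate exceeds the threshold; the numbers are made up and are not read from FIG. 6.

```python
def auto_equalizer_db(feedback_db, threshold_db=-6.0):
    """Per-band cut in dB; 0 dB where the band is already at or below the threshold."""
    return [min(0.0, threshold_db - f) for f in feedback_db]


# Made-up feedback rates in dB for four bands (panel A).
feedback_db = [-12.0, -6.0, -3.0, 0.0]
# Auto-equalizer value per band (panel B).
eq_db = auto_equalizer_db(feedback_db)
# Feedback rate after the cut is applied (panel C).
corrected_db = [f + g for f, g in zip(feedback_db, eq_db)]
```

  • In the dB domain the gain is added rather than multiplied, so the corrected rate is simply the sum of the two panels.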
  • when the positions of the microphone 10 and the speaker 20 are fixed, as in a classroom or the like, and the volume of the amplifier is fixed, the frequency at which howling occurs can be identified in advance by outputting the calibration sound from the speaker 20 and measuring the sound picked up by the microphone 10.
  • the actual feedback rate F0(ω) is controlled to be 0.5 or less (-6 dB or less).
  • the suppression gain G(ω) according to the result of the threshold determination of the feedback rate F0(ω) is applied for each frequency band, and only the necessary frequency bands are controlled, with the minimum necessary reduction of the auto-equalizer value. Howling can therefore be suppressed without sacrificing sound quality more than necessary.
  • in step S71, it is determined whether or not the volume has been changed by a user operation or the like. If it is determined in step S71 that the volume has been changed, the process proceeds to step S72.
  • in step S72, the auto equalizer unit 101 updates (corrects) the value (initial value, etc.) of the feedback rate F0(ω) set at the time of setting or the like in conjunction with the change in volume.
  • the feedback rate F(ω) linked to the volume change can be expressed by the following equation (4).
  • the amplification amount corresponding to the volume change is input to the auto equalizer unit 101 when the volume is changed.
  • when the process of step S72 is completed, or when it is determined in step S71 that the volume has not been changed, the process proceeds to step S73.
  • in step S73, the feedback rate control process is performed.
  • this feedback rate control process is as described with reference to FIG. 5. Here, however, since the feedback rate has been updated (corrected) to the feedback rate F(ω) in conjunction with the volume change, the suppression gain G(ω) for each frequency band is recalculated using the corrected feedback rate F(ω).
  • the suppression gain G(ω) is calculated by the relationship shown in the following equation (5) according to the result of the threshold value determination of the corrected feedback rate F(ω).
  • the suppression gain G(ω) for each frequency band calculated according to the threshold value determination of the corrected feedback rate F(ω) according to the equation (5) is applied to the audio signal.
  • the flow of the feedback rate update process has been explained above.
  • the feedback rate F(ω) is corrected in conjunction with the volume change, and when the corrected feedback rate F(ω) exceeds the threshold value T, the suppression gain G(ω) is calculated so that the feedback rate F(ω) becomes equal to or less than the threshold value T, and the calculated suppression gain G(ω) is multiplied by the voice signal.
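  • Equations (4) and (5) are not reproduced in this text. The following sketch assumes that the volume change enters as a linear amplification factor V, so that F = F0 × V, and that the suppression gain is then recomputed as 1.0 or T / F. Both forms, the function names, and the values are assumptions made for illustration.

```python
def update_feedback_rate(initial_rates, volume_gain):
    """Scale the stored per-band rates by the volume amplification (assumed form of the update)."""
    return [f0 * volume_gain for f0 in initial_rates]


def suppression_gains(rates, threshold=0.5):
    """Recompute the per-band suppression gain from the corrected rates (assumed form)."""
    return [1.0 if f <= threshold else threshold / f for f in rates]


# Made-up initial rates measured during calibration.
f0 = [0.2, 0.4, 0.5]
# The user doubles the volume (about +6 dB).
f = update_feedback_rate(f0, volume_gain=2.0)
gains = suppression_gains(f)
```

  • Because the stored initial rates are rescaled rather than remeasured, no new calibration sound is needed when the volume changes.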
  • FIG. 8 shows an example of updating the feedback rate in the auto equalizer unit 101.
  • A1 and A2 in FIG. 8 show the feedback rate F0(ω) and the auto-equalizer value before the volume change as vertical bars, with frequency on the horizontal axis.
  • B1 and B2 in FIG. 8 and C1 and C2 in FIG. 8 show the feedback rate F(ω) and the auto-equalizer value after the volume change as vertical bars, with frequency on the horizontal axis.
  • the feedback rate increases in conjunction with the increase in volume. Further, comparing the auto-equalizer values of A2 to C2 in FIG. 8 shows that the auto equalizer unit 101 corrects the feedback rate F(ω) in conjunction with the increase in volume and applies the suppression gain G(ω), calculated according to the threshold value determination of the corrected feedback rate F(ω), to the voice signal.
  • FIG. 9 shows an example of another configuration of an embodiment of a voice processing device to which the present technology is applied.
  • the same parts as those in FIG. 2 are designated by the same or corresponding reference numerals, and the description thereof will be omitted as appropriate.
  • the microphone 10 is composed of a microphone unit 11-1 and a microphone unit 11-2.
  • the A / D conversion units 12-1 and 12-2 convert the audio signals from the microphone units 11-1 and 11-2 from analog signals to digital signals, respectively, and supply them to the signal processing unit 13A.
  • the signal processing unit 13A is further provided with a beamforming unit 111 and an equalizer unit 112 in addition to the auto equalizer unit 101 to the feedback rate calculation unit 106.
  • the beamforming unit 111 performs beamforming processing based on the audio signals from the A / D conversion units 12-1 and 12-2.
  • the directivity of the microphone 10 is formed so that sound from the direction in which the speaker 20 is installed is picked up as little as possible, and the result is passed to the subsequent processing as a monaural signal.
  • that is, a directivity that reduces the sensitivity in the direction in which the speaker 20 is installed is formed, and a monaural signal is generated.
  • in order to suppress the sound from the direction of the speaker 20 (to prevent howling) by using a method such as an adaptive beamformer, the internal parameters of the beamformer are learned in a section in which sound is output only from the speaker 20, such as at the time of calibration, and the directivity is calculated so that a blind spot (null directivity) is formed in the direction in which the speaker 20 is installed.
  • the beamforming unit 111 can obtain the suppression amount B(ω) of the loudspeaker component by beamforming based on the directivity coefficient and the information on the loudspeaker sound calculated at the time of calibration.
  • the beamforming method is not limited to the adaptive beamformer, and other methods such as the delay sum method and the three-microphone integration method are also known, and any method may be used.
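  • The idea of forming a blind spot toward the speaker can be illustrated with a two-microphone delay-and-subtract arrangement. This is a deliberately simple stand-in, not the adaptive beamformer, the delay sum method, or the three-microphone integration method named in the description. It assumes the speaker sound reaches the second microphone exactly a whole number of samples after the first; the function name and signals are hypothetical.

```python
def null_toward_source(mic1, mic2, delay_samples):
    """Subtract a delayed copy of mic1 from mic2 to cancel sound arriving with that delay."""
    out = []
    for n in range(len(mic2)):
        # Samples before the delay has elapsed are treated as silence.
        delayed = mic1[n - delay_samples] if n >= delay_samples else 0.0
        out.append(mic2[n] - delayed)
    return out


# Made-up speaker sound that reaches mic2 two samples after mic1.
speaker_sound = [0.0, 1.0, -0.5, 0.25, 0.0, 0.0]
delay = 2
mic1 = speaker_sound
mic2 = [0.0] * delay + speaker_sound[:-delay]
residual = null_toward_source(mic1, mic2, delay)
```

  • Sound from other directions arrives with a different inter-microphone delay and is therefore not cancelled, which is what gives the array its directivity.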
  • the beamforming processed audio signal is supplied to the equalizer unit 112.
  • the equalizer unit 112 changes the frequency characteristics of the audio signal supplied from the beamforming unit 111 according to the user's operation, and supplies the sound signal to the auto equalizer unit 101.
  • the equalizer unit 112 holds the gain EQ(ω) for each frequency band as an internal parameter.
  • using the internal parameters of the beamforming unit 111 and the equalizer unit 112, the auto equalizer unit 101 can modify the calculation of the feedback rate F(ω) in equation (4) above as in the following equation (6).
  • the internal parameters are input to the auto equalizer unit 101 during the beamforming process or the frequency characteristic change process.
  • the suppression gain G(ω) is calculated by the relationship shown in the following equation (7) according to the result of the threshold value determination of the corrected feedback rate F(ω).
  • the suppression gain G(ω) for each frequency band calculated according to the threshold value determination of the corrected feedback rate F(ω) according to the equation (7) is applied to the audio signal.
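  • Equations (6) and (7) are not reproduced in this text. The sketch below assumes that the beamforming suppression B(ω) and the equalizer gain EQ(ω) are linear per-band gains that simply multiply into the corrected feedback rate together with the volume amplification V. This multiplicative form, the function names, and all values are assumptions.

```python
def corrected_feedback_rate(f0, volume_gain, beam_gain, eq_gain):
    """Combine the per-band internal parameters into a corrected rate (assumed multiplicative form)."""
    return [f * volume_gain * b * e for f, b, e in zip(f0, beam_gain, eq_gain)]


def suppression_gains(rates, threshold=0.5):
    """Per-band suppression gain from the corrected rates (assumed form)."""
    return [1.0 if f <= threshold else threshold / f for f in rates]


# Made-up values for two frequency bands.
f0 = [0.4, 0.4]            # initial feedback rates from calibration
beam_gain = [0.5, 1.0]     # beamformer attenuates the speaker component in band 0 only
eq_gain = [1.0, 1.5]       # user boosts band 1 with the equalizer
f = corrected_feedback_rate(f0, 2.0, beam_gain, eq_gain)
gains = suppression_gains(f)
```

  • Further effector stages such as a low frequency cut filter or auto gain control could be folded in the same way, as additional per-band factors.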
  • audio signals are input to the input sound power value calculation unit 105 from both the A / D conversion unit 12-1 and the A / D conversion unit 12-2. Only the audio signal from either one may be input.
  • FIG. 10 shows an example of still another configuration of one embodiment of the voice processing device to which the present technology is applied.
  • the same parts as those in FIG. 2 or 9 are designated by the same or corresponding reference numerals, and the description thereof will be omitted as appropriate.
  • the signal processing unit 13B is further provided with a beamforming unit 111, an equalizer unit 112, a low frequency cut filter unit 121, and an auto gain control unit 122.
  • the low frequency cut filter unit 121 cuts a specific low frequency component among the frequency components of the audio signal supplied from the beamforming unit 111, and supplies the result to the equalizer unit 112.
  • the auto gain control unit 122 automatically corrects the gain according to the input level of the audio signal supplied from the auto equalizer unit 101, keeps the output level constant for signals having level differences, and supplies the result to the volume unit 102.
  • the calculation formula of the feedback rate F(ω) in the above equation (6) can be further modified by using the internal parameters of the low frequency cut filter unit 121 and the auto gain control unit 122. Further, the suppression gain G(ω) is obtained according to the result of the threshold value determination of the corrected feedback rate F(ω), as in the above-described equation (7).
  • the equalizer unit 112, the low frequency cut filter unit 121, and the auto gain control unit 122 are examples of effector units that perform signal processing related to sound effects (signal processing that affects the volume). When other signal processing is used, the formula for calculating the feedback rate F(ω) in the above equation (6) may be further modified according to its internal parameters.
  • the arrangement position of the low frequency cut filter unit 121 and the auto gain control unit 122 is an example, and may be arranged at other positions.
  • by using the suppression gain G(ω) described above, it is possible to estimate the loudspeaker sound quality at the time of off-mic loudspeaker.
  • when the average value of the suppression gain G(ω) over the frequency bands at time t is taken as the sound quality score S(t), it can be calculated by the following equation (8).
  • when the suppression gain G(ω) is 1.0 in all frequency bands, no suppression filter is applied, which means that the loudspeaker sound quality is good.
  • when the suppression gain G(ω) is applied in a part of the frequency bands but the amount of suppression by the suppression filter is limited, the loudspeaker sound quality is relatively good.
  • when the suppression gain G(ω) is applied to many frequency bands and the suppression filter is applied, the loudspeaker sound quality is relatively degraded.
  • Loudspeaker sound quality can be conveyed in an easy-to-understand manner.
  • based on the loudspeaker sound quality, it is possible to support volume adjustment and setting.
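  • Equation (8) is not reproduced in this text, but the description states that the sound quality score is the average of the suppression gain over the frequency bands. A minimal sketch of that average, computed for a few made-up time frames, follows; the function name and values are hypothetical.

```python
def sound_quality_score(gains):
    """Average of the per-band suppression gains for one time frame."""
    return sum(gains) / len(gains)


# Made-up suppression gains for four bands at three points in time.
gains_over_time = [
    [1.0, 1.0, 1.0, 1.0],  # no suppression applied
    [1.0, 0.8, 0.6, 1.0],  # suppression in some bands
    [0.5, 0.5, 0.5, 0.5],  # suppression in every band
]
scores = [sound_quality_score(g) for g in gains_over_time]
```

  • A score of 1.0 means no band is attenuated, and the score falls as more bands are suppressed or are suppressed more deeply.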
  • FIG. 11 is a block diagram showing an example of a configuration of an embodiment of an information processing device to which the present technology is applied.
  • the information processing device 100 is a device for calculating and presenting a sound quality score as an index for evaluating whether or not the loudspeaker sound quality is appropriate.
  • the information processing device 100 calculates the sound quality score based on the data for calculating the sound quality score (hereinafter referred to as score calculation data). Further, the information processing device 100 generates evaluation information based on data for generating evaluation information (hereinafter referred to as evaluation information generation data) and presents it to the display device 30.
  • the score calculation data includes the suppression gain G(ω) for each frequency band input from the voice processing device 1 (signal processing unit 13). Further, the evaluation information generation data can include, for example, in addition to the calculated sound quality score, information obtained when performing off-mic loudspeaker.
  • the display device 30 is a device having a display such as an LCD (Liquid Crystal Display) or an OLED (Organic Light Emitting Diode), for example.
  • the display device 30 presents the evaluation information output from the information processing device 100.
  • The information processing device 100 may, of course, be configured as a single electronic device such as an audio device constituting a sound amplification system, a dedicated measuring device, or a personal computer. It may also be configured as part of the functions of an electronic device such as the voice processing device 1, the microphone 10, or the speaker 20 described above. Further, the information processing device 100 and the display device 30 may be integrated into one electronic device.
  • the information processing device 100 includes a sound quality score calculation unit 151, an evaluation information generation unit 152, and a presentation control unit 153.
  • The sound quality score calculation unit 151 calculates the sound quality score S(t) by applying the above equation (8) to the suppression gain G(ω) for each frequency band, which is input to it as score calculation data, and supplies the score to the evaluation information generation unit 152.
  • the evaluation information generation unit 152 generates evaluation information based on the sound quality score S (t) input as evaluation information generation data, and supplies the evaluation information to the presentation control unit 153.
  • The evaluation information includes information on the sound quality during off-microphone sound amplification, such as the numerical value of the sound quality score S(t) and the result of the threshold value determination.
  • the presentation control unit 153 controls to present the evaluation information supplied from the evaluation information generation unit 152 on the screen of the display device 30.
  • the sound quality score is calculated using the suppression gain for each frequency band, and evaluation information including the calculated numerical value of the sound quality score and the result of the threshold value determination is generated.
  • the presentation of the generated evaluation information is controlled.
  • FIG. 12 shows an example of evaluation information presented as a GUI display.
  • The evaluation information screen 201 presents a user interface for adjusting the volume and three rectangular areas showing the result of the threshold value determination of the sound quality score.
  • The sound quality score S(t) is represented in three stages: equal to 1.0, at or above a threshold value T, or below the threshold value T. According to these three stages, the lighting of the three rectangular areas is controlled so that only the first rectangular area from the left is lit, the first and second rectangular areas from the left are lit, or all three rectangular areas are lit.
  • In this way, the lighting display indicates whether the suppression filter is not applied (the amplified sound quality is good), the amount of suppression by the suppression filter is limited (the amplified sound quality is relatively good), or the suppression filter is applied in many frequency bands (the amplified sound quality is relatively degraded).
  • The user, the installer, or the like can intuitively recognize how much the amplified sound quality is affected by howling suppression when the volume is adjusted. For example, when the user, the installer, or the like recognizes from the evaluation information screen 201 that the amount of sound wraparound is large, he or she can take measures such as not raising the volume any further.
  • The result of the threshold value determination of the sound quality score is not limited to being presented by the number of lit rectangular areas, and may be presented by displaying the rectangular area in different colors such as green, yellow, and red. Further, when different colors are used, a single rectangular area may be provided and its color may be changed.
  • evaluation information is not limited to the GUI display, and may be presented by lighting the LED.
  • FIG. 13 shows an example of evaluation information presented by lighting the LED.
  • The information processing device 100 is provided with an LED 202 indicating the result of the threshold value determination of the sound quality score.
  • The LED 202 can emit three colors, for example green, yellow, and red, and is controlled to light in one of the three colors.
  • For example, a state in which the suppression filter is not applied is presented by the LED 202 lit in green.
  • Alternatively, the information processing device 100 is provided with LEDs 202-1 to 202-3 indicating the result of the threshold value determination of the sound quality score.
  • According to the three stages of the sound quality score S(t) being 1.0, above the threshold value T, or below the threshold value T, the lighting is controlled so that the LED 202-1 lights in green, the LED 202-2 lights in yellow, or the LED 202-3 lights in red.
  • For example, a state in which the suppression filter is not applied is presented by the LED 202-1 lit in green.
  • the method of presenting the evaluation information shown in FIGS. 12 and 13 is an example, and the evaluation information may be presented by another presentation method.
  • The presentation is not limited to the GUI display or to identification by the number or color of lit LEDs. The value of the sound quality score itself may be presented, or the value may be read aloud and output as voice (sound).
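The three-stage presentation described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names `Stage`, `classify_score`, and `indicator` are hypothetical, the boundary handling (a score exactly equal to T falls in the middle stage) is an assumption, and the pairing of each stage with a number of lit areas and a color simply follows the order in which the text lists them.

```python
from enum import Enum


class Stage(Enum):
    """Hypothetical three-stage result: (number of lit areas, colour)."""
    NOT_APPLIED = (1, "green")   # score is 1.0: suppression filter not applied
    LIMITED = (2, "yellow")      # score at or above threshold T: limited suppression
    HEAVY = (3, "red")           # score below threshold T: many bands suppressed


def classify_score(score: float, threshold: float) -> Stage:
    """Map a sound quality score S(t) to one of the three stages.

    Assumption: the score lies in [0, 1] and 0 < threshold < 1.
    """
    if score >= 1.0:
        return Stage.NOT_APPLIED
    if score >= threshold:
        return Stage.LIMITED
    return Stage.HEAVY


def indicator(score: float, threshold: float) -> tuple:
    """Return (lit_count, colour) for a GUI or LED presentation."""
    return classify_score(score, threshold).value
```

A presentation control routine could call `indicator(score, threshold)` each time the score is updated and light the corresponding rectangles or LED color.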
  • The feedback rate is a value corresponding to the ratio of the audio signal output to the speaker 20 and the audio signal input from the microphone 10. It is not limited to being obtained for each frequency band and may be obtained, for example, in the time domain.
  • The auto equalizer unit 101 performs a threshold value determination using this feedback rate F0, and calculates a suppression gain G(ω) according to the result of the threshold value determination.
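As a rough illustration of a feedback rate obtained in the time domain, the following Python sketch takes the ratio of the mean power of the microphone input to the mean power of the speaker output. The function names and the choice of a mean-square power ratio (input over output) are assumptions made for illustration; the passage only states that the value corresponds to the ratio of the two signals.

```python
def mean_power(samples):
    """Mean-square power of a block of samples."""
    if not samples:
        raise ValueError("samples must not be empty")
    return sum(x * x for x in samples) / len(samples)


def feedback_rate_time_domain(speaker_out, mic_in):
    """Single (full-band) feedback rate for one block of samples.

    Assumption: the rate is defined as input power divided by output power,
    so a larger value means more sound wraps around into the microphone.
    """
    p_out = mean_power(speaker_out)
    if p_out == 0.0:
        raise ValueError("speaker output is silent; the rate is undefined")
    return mean_power(mic_in) / p_out
```

For example, if the microphone picks up the speaker signal at half its amplitude, the power ratio computed this way is 0.25.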
  • noise included in the input sound may be added.
  • voice signal may be read as “sound signal”
  • voice processing device may be read as “sound processing device” or “signal processing device”.
  • the series of processes of the voice processing device 1 described above can be executed by hardware or software.
  • the programs constituting the software are installed on the computer of each device.
  • FIG. 14 is a block diagram showing an example of the hardware configuration of a computer that executes the above-described series of processes by means of a program.
  • In the computer, a CPU (Central Processing Unit) 1001, a ROM (Read Only Memory) 1002, and a RAM (Random Access Memory) 1003 are connected to one another by a bus 1004.
  • An input / output interface 1005 is further connected to the bus 1004.
  • An input unit 1006, an output unit 1007, a storage unit 1008, a communication unit 1009, and a drive 1010 are connected to the input / output interface 1005.
  • the input unit 1006 includes a microphone, a keyboard, a mouse, and the like.
  • the output unit 1007 includes a speaker, a display, and the like.
  • the storage unit 1008 includes a hard disk, a non-volatile memory, and the like.
  • the communication unit 1009 includes a network interface and the like.
  • the drive 1010 drives a removable recording medium 1011 such as a magnetic disk, an optical disk, a magneto-optical disk, or a semiconductor memory.
  • The CPU 1001 loads the program recorded in the ROM 1002 or the storage unit 1008 into the RAM 1003 via the input / output interface 1005 and the bus 1004 and executes it, whereby the above-described series of processes is performed.
  • the program executed by the computer can be recorded and provided on the removable recording medium 1011 as a package medium or the like, for example. Programs can also be provided via wired or wireless transmission media such as local area networks, the Internet, and digital satellite broadcasting.
  • the program can be installed in the storage unit 1008 via the input / output interface 1005 by mounting the removable recording medium 1011 in the drive 1010. Further, the program can be received by the communication unit 1009 and installed in the storage unit 1008 via a wired or wireless transmission medium. In addition, the program can be pre-installed in the ROM 1002 or the storage unit 1008.
  • The processing performed by the computer according to the program does not necessarily have to be performed in chronological order in the order described in the flowcharts. That is, the processing performed by the computer according to the program also includes processing executed in parallel or individually (for example, parallel processing or object-based processing). Further, the program may be processed by one computer (processor) or may be processed in a distributed manner by a plurality of computers.
  • each step of the above-mentioned processing can be executed by one device or shared by a plurality of devices. Further, when a plurality of processes are included in one step, the plurality of processes included in the one step can be executed by one device or shared by a plurality of devices.
  • A voice processing device including a signal processing unit that processes a voice signal picked up by a microphone, compares an index relating to the amount of wraparound into the microphone of sound output from a speaker according to the voice signal with a predetermined threshold value, and performs a threshold value determination.
  • the index is calculated for each frequency band or in the time domain.
  • The voice processing device according to (4), wherein the signal processing unit applies, to the voice signal, a gain calculated from the value of the index and the threshold value for each frequency band.
  • (6) The voice processing device according to (5), wherein the signal processing unit calculates the gain so that the value of the index becomes equal to or less than the threshold value.
  • the signal processing unit includes an auto equalizer unit.
  • The voice processing device according to any one of (2) to (7) above, wherein the signal processing unit generates a calibration signal and, during a calibration period, calculates as the index a value according to the ratio of the calibration signal output to the speaker and the voice signal input from the microphone.
  • (9) The voice processing device according to (8) above, wherein the signal processing unit further includes a volume unit, calculates, when the volume is changed in the volume unit, a value corresponding to the ratio of the voice signal output to the speaker and the voice signal input from the microphone, and corrects the value of the index using the calculated value.
  • (10) The voice processing device according to (9) above, wherein the signal processing unit further includes a beamforming unit that performs beamforming processing, and corrects the value of the index using internal parameters of the beamforming unit.
  • (11) The voice processing device according to (9) or (10) above, wherein the signal processing unit further includes an effector unit that performs effect processing on the voice signal, and corrects the value of the index using internal parameters of the effector unit.
  • (12) The voice processing device according to (11) above, wherein the effector unit includes at least one of an equalizer unit, an auto gain control unit, and a filter unit.
  • the threshold value is set according to a criterion for determining sound quality with a reverberant feeling.
  • the voice processing device according to any one of (3) to (7) above, further comprising a calculation unit for calculating a score according to the gain.
  • The voice processing device according to (14), further including a generation unit that generates evaluation information based on the score, and a presentation control unit that controls presentation of the evaluation information.
  • The evaluation information includes information on sound quality during sound amplification.
  • The voice processing device according to any one of (1) to (16) above, wherein the microphone is installed at a position away from the mouth of the talker.
  • (18) The voice processing device according to (17), wherein the microphone and the speaker are fixedly installed at predetermined positions in the same space.
  • (19) The voice processing device according to any one of (1) to (18), which is provided in the housing of the microphone or in an external housing.
  • (20) A voice processing method in which a voice processing device processes a voice signal picked up by a microphone, compares an index relating to the amount of wraparound into the microphone of sound output from a speaker according to the voice signal with a predetermined threshold value, and performs a threshold value determination.
  • 1 voice processing device, 10 microphone, 11, 11-1, 11-2 microphone unit, 12 A/D conversion unit, 13, 13A, 13B signal processing unit, 14 signal output unit, 20 speaker, 30 display device, 100 information processing device, 101 auto equalizer unit, 102 volume unit, 103 calibration signal generation unit, 104 output sound power value calculation unit, 105 input sound power value calculation unit, 106 feedback rate calculation unit, 111 beamforming unit, 112 equalizer unit, 121 low-frequency cut filter unit, 122 auto gain control unit, 151 sound quality score calculation unit, 152 evaluation information generation unit, 153 presentation control unit, 1001 CPU

Abstract

The present invention pertains to a voice processing device and a voice processing method with which it is possible to minimize acoustic feedback more assuredly. Provided is a voice processing device equipped with a signal processing unit that processes a voice signal picked up by a microphone, that compares, with a given threshold, an index relating to the amount of wraparound of a sound going into the microphone according to a voice signal outputted from a speaker, and that performs a threshold value determination. Accordingly, it is possible to minimize acoustic feedback more assuredly. The present invention is applicable to, for example, a sound amplification system that performs off-microphone sound amplification.

Description

Voice processing device and voice processing method
The present technology relates to a voice processing device and a voice processing method, and more particularly to a voice processing device and a voice processing method capable of suppressing howling (acoustic feedback) more reliably.
In systems composed of a microphone, a speaker, and the like, many howling suppressors suppress howling by detecting, through prior measurement, the frequencies at which howling occurs and providing notch filters at those frequencies (see, for example, Patent Document 1).
Patent Document 1: Japanese Unexamined Patent Publication No. 2008-224816
However, the method of providing a notch filter is not a sufficient countermeasure when parameters in the signal processing unit that processes the voice signal are changed, and a technique for reliably suppressing howling has been sought.
The present technology has been made in view of such circumstances and makes it possible to suppress howling more reliably.
A voice processing device according to one aspect of the present technology includes a signal processing unit that processes a voice signal picked up by a microphone, compares an index relating to the amount of wraparound into the microphone of sound output from a speaker according to the voice signal with a predetermined threshold value, and performs a threshold value determination.
A voice processing method according to one aspect of the present technology is a method in which a voice processing device processes a voice signal picked up by a microphone, compares an index relating to the amount of wraparound into the microphone of sound output from a speaker according to the voice signal with a predetermined threshold value, and performs a threshold value determination.
In the voice processing device and the voice processing method according to one aspect of the present technology, a voice signal picked up by a microphone is processed, an index relating to the amount of wraparound into the microphone of sound output from a speaker according to the voice signal is compared with a predetermined threshold value, and a threshold value determination is performed.
The voice processing device according to one aspect of the present technology may be an independent device or may be an internal block constituting one device.
FIG. 1 is a diagram showing an example of the installation of a microphone and a speaker to which the present technology is applied.
FIG. 2 is a block diagram showing a configuration example of an embodiment of a voice processing device to which the present technology is applied.
FIG. 3 is a flowchart illustrating the flow of the feedback rate setting process.
FIG. 4 is a flowchart illustrating the flow of the feedback rate calculation process.
FIG. 5 is a flowchart illustrating the flow of the feedback rate control process.
FIG. 6 is a diagram showing an example of feedback rate control.
FIG. 7 is a flowchart illustrating the flow of the feedback rate update process.
FIG. 8 is a diagram showing an example of a feedback rate update.
FIG. 9 is a block diagram showing another configuration example of an embodiment of a voice processing device to which the present technology is applied.
FIG. 10 is a block diagram showing still another configuration example of an embodiment of a voice processing device to which the present technology is applied.
FIG. 11 is a block diagram showing a configuration example of an embodiment of an information processing device to which the present technology is applied.
FIG. 12 is a diagram showing a first example of the presentation of evaluation information.
FIG. 13 is a diagram showing a second example of the presentation of evaluation information.
FIG. 14 is a block diagram showing an example of the hardware configuration of a computer.
<1. First Embodiment>
In general, a handheld microphone, a pin microphone, or the like is used for sound amplification (reproducing the sound picked up by a microphone from a speaker installed in the same room). This is because the sensitivity of the microphone must be kept low in order to reduce the amount of sound wraparound between the speaker and the microphone, so the microphone must be placed close to the talker's mouth to pick up the voice at a sufficient level.
On the other hand, as shown in FIG. 1, amplifying sound with a microphone installed at a position away from the talker's mouth, such as a microphone 10 mounted on the ceiling, instead of a handheld microphone or pin microphone, is called off-microphone sound amplification. For example, in FIG. 1, the voice of the teacher is picked up by the microphone 10 mounted on the ceiling and amplified throughout the classroom so that the students can hear it.
However, when off-microphone sound amplification is actually performed in a classroom, a conference room, or the like, severe howling occurs. This is because the ceiling-mounted microphone 10 must be more sensitive than a handheld microphone or pin microphone, so the amount of the system's own sound that wraps around from the speaker 20 to the microphone 10 is large; that is, the acoustic coupling is strong.
A device that suppresses howling is called a howling suppressor, a feedback reducer, or the like. Feedback reducers come in two types: a pre-measurement type and a follow-up type.
A pre-measurement type feedback reducer generally measures in advance whether howling occurs and, if it does, inserts a notch filter at that frequency. Instead of a notch filter, a graphic equalizer or the like may be used to lower the gain at the frequency where howling occurs.
A follow-up type feedback reducer automatically detects howling and dynamically adds a notch filter at the frequency where howling occurs, or lowers the gain at that frequency with a graphic equalizer or the like instead of a notch filter.
The pre-measurement type feedback reducer has the problem that new howling occurs when internal parameters such as the volume or the equalizer are changed.
In the follow-up type feedback reducer, on the other hand, when new howling occurs due to a change in internal parameters such as the volume or the equalizer, it is detected automatically and suppression processing acts to stop it. However, since it takes a certain amount of time for the howling to be detected and for the suppression filter to take effect, there is the problem that howling is audible during that time.
In view of this situation, the present technology makes it possible to suppress howling more reliably. In particular, the present technology makes it possible to suppress howling reliably even when internal parameters such as the volume or the equalizer are changed during off-microphone sound amplification.
Hereinafter, embodiments of the present technology will be described with reference to the drawings.
(Configuration of voice processing device)
FIG. 2 shows a configuration example of an embodiment of a voice processing device to which the present technology is applied.
In FIG. 2, the voice processing device 1 includes an A/D conversion unit 12, a signal processing unit 13, and a signal output unit 14. The voice processing device 1 is electrically connected to each of the microphone 10 and the speaker 20.
The microphone 10 is composed of a microphone unit 11-1 and a microphone unit 11-2. The A/D conversion unit 12 is provided downstream of the two microphone units 11.
Note that the microphone 10 only needs to be provided with one or more microphone units 11. Further, each microphone unit 11 may be an omnidirectional microphone or a unidirectional microphone.
Each of the microphone units 11-1 and 11-2 picks up voice (sound) and supplies the resulting voice signal, as an analog signal, to the A/D conversion unit 12.
The A/D conversion unit 12 converts the voice signals supplied from the microphone unit 11-1 and the microphone unit 11-2 from analog signals to digital signals and supplies them to the signal processing unit 13.
The signal processing unit 13 is configured as a digital signal processor (DSP) or the like. The signal processing unit 13 performs predetermined signal processing on the voice signal supplied from the A/D conversion unit 12, and supplies the resulting voice signal (a voice signal in which howling is suppressed) to the signal output unit 14.
The signal output unit 14 includes an audio output terminal. The signal output unit 14 outputs the voice signal supplied from the signal processing unit 13 to the speaker 20 connected to the audio output terminal.
The speaker 20 processes the voice signal output from the voice processing device 1 (its signal output unit 14), and outputs voice (sound) corresponding to the voice signal.
Note that the voice processing device 1 may include at least one of the microphone 10 and the speaker 20. Further, the microphone 10 may include all or at least some of the A/D conversion unit 12, the signal processing unit 13, and the signal output unit 14. That is, the signal processing unit 13 may be provided inside the housing of the microphone 10, or may be provided in an external housing that is attached externally to the housing of the microphone 10.
Further, the signal processing unit 13 includes an auto equalizer unit 101, a volume unit 102, a calibration signal generation unit 103, an output sound power value calculation unit 104, an input sound power value calculation unit 105, and a feedback rate calculation unit 106.
The auto equalizer unit 101 automatically changes the frequency characteristics of the voice signal input to it and supplies the result to the volume unit 102.
The volume unit 102 adjusts the volume of the voice signal supplied from the auto equalizer unit 101 and supplies it to the signal output unit 14.
The calibration signal generation unit 103 generates a calibration signal such as a white noise signal or a pink noise signal during a calibration period, for example at the time of setup, and supplies it to the signal output unit 14. The signal output unit 14 outputs the calibration signal supplied from the calibration signal generation unit 103 to the speaker 20.
As a result, during the calibration period, the speaker 20 outputs a calibration sound corresponding to the calibration signal input from the signal output unit 14. The calibration sound is picked up by the microphone 10, and the resulting voice signal is input to the signal processing unit 13.
At this time, in the signal processing unit 13, the output sound power and the input sound power are calculated by the output sound power value calculation unit 104 and the input sound power value calculation unit 105, respectively.
The output sound power value calculation unit 104 calculates an output sound power value based on the calibration signal input to it, that is, the voice signal output from the signal processing unit 13, and supplies the value to the feedback rate calculation unit 106.
The input sound power value calculation unit 105 calculates an input sound power value based on the voice signal input to it, that is, the voice signal input to the signal processing unit 13, and supplies the value to the feedback rate calculation unit 106.
The feedback rate calculation unit 106 receives the output sound power value from the output sound power value calculation unit 104 and the input sound power value from the input sound power value calculation unit 105. The feedback rate calculation unit 106 calculates the feedback rate by performing a predetermined calculation using the output sound power value and the input sound power value.
Here, the feedback rate is an index of the amount of sound wraparound (wraparound rate) that quantitatively expresses how much of the sound output from the speaker 20 wraps around into the microphone 10. The following describes the case where the feedback rate is obtained for each frequency band.
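The per-band calculation can be sketched as follows. This is a minimal illustration, not the device's actual computation: the "predetermined calculation" is not given in this passage, so the sketch assumes the feedback rate of each band is the input sound power divided by the output sound power, and the function name `feedback_rates` and the `eps` guard are hypothetical.

```python
def feedback_rates(output_powers, input_powers, eps=1e-12):
    """Per-band feedback rate from band power values measured during calibration.

    output_powers: power of the calibration signal sent to the speaker, per band.
    input_powers:  power of the signal picked up by the microphone, per band.
    Assumption: rate = input power / output power for each band.
    """
    if len(output_powers) != len(input_powers):
        raise ValueError("both inputs must cover the same frequency bands")
    # eps avoids division by zero for bands in which nothing was output
    return [p_in / max(p_out, eps)
            for p_out, p_in in zip(output_powers, input_powers)]
```

With output powers `[1.0, 2.0, 4.0]` and input powers `[0.5, 0.5, 4.0]`, the third band has the largest rate under this definition, so it would be the first candidate for suppression.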
The feedback rate calculated by the feedback rate calculation unit 106 is supplied to the auto equalizer unit 101. As a result, the feedback rate is set in the auto equalizer unit 101 during the calibration period.
After that, during off-microphone sound amplification, the speaker 20 outputs sound corresponding to the voice signal input from the signal output unit 14. This sound is picked up by the microphone 10, and the resulting voice signal is input to the signal processing unit 13.
At this time, the auto equalizer unit 101 performs a threshold determination using the feedback rate of each frequency band. When the feedback rate exceeds the threshold value, the auto equalizer unit 101 calculates a gain that brings the feedback rate to the threshold value or less, and applies that gain to the voice signal.
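The threshold determination and gain calculation can be illustrated with a short Python sketch. It assumes that the gain scales the feedback rate linearly, so that a gain of threshold / rate brings the rate exactly to the threshold, and that a gain of 1.0 means no suppression; the function name `suppression_gains` is hypothetical and the actual gain formula may differ.

```python
def suppression_gains(rates, threshold):
    """Per-band suppression gain from per-band feedback rates.

    Bands whose rate is at or below the threshold keep a gain of 1.0
    (no suppression). Bands above it get the largest gain for which
    rate * gain does not exceed the threshold.
    Assumption: the gain is expressed in the same domain as the rate.
    """
    if threshold <= 0:
        raise ValueError("threshold must be positive")
    return [1.0 if rate <= threshold else threshold / rate for rate in rates]
```

Using the largest admissible gain in each band, rather than a fixed deep notch, limits the suppression to what the threshold requires, which is in line with the stated aim of keeping the amplified sound quality as high as possible.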
Further, when the volume is changed by the volume unit 102 during off-microphone sound amplification, the auto equalizer unit 101 resets (updates) the feedback rate in conjunction with the changed volume.
In the auto equalizer unit 101, the feedback rate is corrected by this update, and a threshold determination is performed using the corrected feedback rate of each frequency band. When the corrected feedback rate exceeds the threshold value, a gain that brings the corrected feedback rate to the threshold value or less is calculated, and that gain is applied to the voice signal.
In the voice processing device 1 configured as described above, the signal processing unit 13 processes the audio signal picked up by the microphone 10 and performs a threshold determination that compares an index (the feedback rate) of how much of the sound output from the speaker 20 wraps around to the microphone 10 with a predetermined threshold. A gain (suppression gain) corresponding to the result of the threshold determination is applied to the audio signal, which reduces the howling that occurs during off-microphone sound amplification and improves the amplified sound quality.
Further, when the volume is changed, the signal processing unit 13 corrects the feedback rate to follow the volume change and adjusts the howling-countermeasure gain (suppression gain). The occurrence of howling can therefore be suppressed even if the volume is changed during off-microphone sound amplification.
In particular, compared with existing pre-measurement type feedback reducers, new howling caused by changes to internal parameters such as the volume can be suppressed. Compared with existing follow-up type feedback reducers, there is no delay before howling is detected, so howling does not occur in the first place.
Next, the details of the signal processing performed by the signal processing unit 13 will be described with reference to FIGS. 3 to 8.
(Feedback rate setting process)
First, the flow of the feedback rate setting process performed at setup time will be described with reference to the flowchart of FIG. 3.
In step S11, it is determined whether the device is being set up. When it is determined in step S11 that it is setup time, the process proceeds to step S12, and the processes of steps S12 to S15 are performed in order to set the feedback rate.
In step S12, the calibration signal generation unit 103 generates a calibration signal. A white noise signal or the like is generated as this calibration signal.
In step S13, the signal output unit 14 outputs the generated calibration signal to the speaker 20. As a result, the speaker 20 outputs white noise or the like as the calibration sound.
In step S14, the feedback rate calculation process is performed. The details of the feedback rate calculation process are described here with reference to the flowchart of FIG. 4.
In step S31, the output sound power value calculation unit 104 calculates the output sound power value based on the calibration signal output from the signal processing unit 13.
In step S32, the input sound power value calculation unit 105 calculates the input sound power value based on the audio signal input to the signal processing unit 13.
In step S33, the feedback rate calculation unit 106 calculates the feedback rate by performing a predetermined calculation using the calculated output sound power value and input sound power value.
For example, when the feedback rate is obtained for each frequency band, let Y(ω) be the output sound and M(ω) the input sound for each frequency band, and let F0(ω) be the feedback rate for each frequency band. F0(ω) is then given by the following equation (1).
F0(ω) = |M(ω)|/|Y(ω)|    ... (1)
That is, in equation (1), a frequency at which F0(ω) exceeds 1.0 is one at which the sound output from the speaker 20 arrives at the microphone 10 at an even higher level than it was output, which means that howling occurs.
Equation (1) also means that even when F0(ω) is less than 1.0, if it exceeds a certain value (for example, approximately 0.5), howling does not occur but the sound loop is strong enough to produce reverberation, as if the talker were speaking in a large hall. This value of approximately 0.5 can therefore be set as the threshold T, serving as a criterion for reverberant sound quality.
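As a minimal sketch of equation (1) and the two levels discussed above, the following Python snippet computes a per-band feedback rate and labels each band. The function names, the label strings, and the sample magnitudes are hypothetical and chosen only for illustration; the threshold of 0.5 is the example value given in the text.

```python
def feedback_rate(mic_mag, out_mag):
    """Per-band feedback rate F0 = |M| / |Y|, following equation (1).

    Assumes every output magnitude is non-zero (calibration sound present).
    """
    if len(mic_mag) != len(out_mag):
        raise ValueError("input and output must have the same number of bands")
    return [abs(m) / abs(y) for m, y in zip(mic_mag, out_mag)]


def classify(f0, threshold=0.5):
    """Label one band: above 1.0 howls, above the threshold sounds reverberant."""
    if f0 > 1.0:
        return "howling"
    if f0 > threshold:
        return "reverberant"
    return "ok"


# Hypothetical per-band magnitudes measured during calibration.
Y = [1.0, 1.0, 2.0]   # output sound |Y(w)|
M = [0.2, 0.8, 2.4]   # input sound  |M(w)|

F0 = feedback_rate(M, Y)
labels = [classify(f) for f in F0]
print(F0, labels)
```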
When the process of step S33 is completed, the process returns to step S14 of FIG. 3, and the subsequent processes are performed.
In step S15, the auto equalizer unit 101 sets the feedback rate F0(ω) for each frequency band calculated in the feedback rate calculation process.
When the process of step S15 is completed, or when it is determined in step S11 that it is not setup time, the process ends.
The flow of the feedback rate setting process has been described above. In this process, the feedback rate F0(ω) for each frequency band is calculated at setup time using the calibration sound and is set as the initial value.
Note that the feedback rate setting process is not limited to setup time. It may also be performed at the start of use, such as at the beginning of a class or a conference, when a predetermined operation such as pressing a start button is carried out.
(Feedback rate control process)
Next, the flow of the feedback rate control process performed during off-microphone sound amplification after setup is complete will be described with reference to the flowchart of FIG. 5.
In step S51, the auto equalizer unit 101 determines, for the audio signal input to it, whether the feedback rate F0(ω) of each frequency band is equal to or less than the threshold T. This feedback rate F0(ω) was set at setup time or the like.
When it is determined in step S51 that the feedback rate F0(ω) of a frequency band is equal to or less than the threshold T, the process proceeds to step S52. In step S52, the auto equalizer unit 101 sets the suppression gain G(ω) of that frequency band to 1.0.
On the other hand, when it is determined in step S51 that the feedback rate F0(ω) of a frequency band exceeds the threshold T, the process proceeds to step S53. In step S53, the auto equalizer unit 101 calculates and sets the suppression gain G(ω) of that frequency band by the following equation (2).
G(ω) = T/F0(ω)    ... (2)
That is, the suppression gain G(ω) is calculated according to the result of the threshold determination on the feedback rate F0(ω), as shown in the following equation (3).
F0(ω) ≤ T : G(ω) = 1.0
F0(ω) > T : G(ω) = T/F0(ω)    ... (3)
When the process of step S52 or S53 is completed, the process proceeds to step S54.
In step S54, the auto equalizer unit 101 applies the calculated suppression gain G(ω) for each frequency band to the audio signal input to it. That is, the auto equalizer unit 101 calculates the suppression gain G(ω) needed to limit the actual feedback rate F0(ω) to the threshold T or below, and applies the calculated suppression gain G(ω) to the audio signal.
The flow of the feedback rate control process has been described above. In this process, during off-microphone sound amplification, a threshold determination is performed using the feedback rate F0(ω) of each frequency band. When the feedback rate F0(ω) exceeds the threshold T, the suppression gain G(ω) is calculated so that the feedback rate F0(ω) becomes equal to or less than the threshold T, and the audio signal is multiplied by the calculated suppression gain G(ω). As a result, howling is suppressed before it occurs, and high sound quality with little reverberation can be achieved.
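The threshold determination and gain calculation of steps S51 to S54 can be sketched as follows. This is an illustrative Python sketch under stated assumptions, not the patented implementation: the function name and the sample feedback rates are hypothetical, and the effective feedback rate after suppression is modeled simply as the product F0(ω) × G(ω).

```python
def suppression_gain(f0, threshold=0.5):
    """Equation (3): gain 1.0 at or below the threshold T, otherwise T / F0."""
    return 1.0 if f0 <= threshold else threshold / f0


# Hypothetical feedback rates F0(w) for four frequency bands.
F0 = [0.25, 0.5, 1.0, 2.0]

# Steps S51 to S53: one suppression gain per band.
G = [suppression_gain(f) for f in F0]

# Step S54 (modeled): applying G scales the loop, so the effective rate is F0 * G.
limited = [f * g for f, g in zip(F0, G)]
print(G, limited)
```

Bands already at or below the threshold are left untouched (gain 1.0), so only the bands that need suppression are attenuated.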
FIG. 6 shows an example of feedback rate control in the auto equalizer unit 101.
A of FIG. 6 is a bar graph in which the horizontal axis is frequency and the vertical bars show the feedback rate (unit: dB) for each frequency band. This feedback rate is obtained at setup time from the relationship between the output sound and the input sound by equation (1) above.
In A of FIG. 6, the numbers #1 to #32 corresponding to the vertical bars (only odd numbers are shown) are given as values on the horizontal axis. These numbers identify the frequency bands. In the auto equalizer unit 101, the feedback rate F0(ω) is set for each of the frequency bands #1 to #32.
B of FIG. 6 is a bar graph in which the horizontal axis is frequency and the vertical bars show the auto equalizer value (unit: dB) for each frequency band. This auto equalizer value represents the audio signal obtained through the processing of the auto equalizer unit 101 during off-microphone sound amplification.
In B of FIG. 6, the auto equalizer value is shown for each of the frequency bands #1 to #32. In this example, the threshold T is set to -6 dB, and the auto equalizer value is applied so that the feedback rate F0(ω) becomes -6 dB.
C of FIG. 6 is a bar graph in which the horizontal axis is frequency and the vertical bars show the corrected feedback rate (unit: dB) for each frequency band. The feedback rate F0(ω) for each of the frequency bands #1 to #32 is adjusted so as to be equal to or less than the threshold T of -6 dB.
That is, the auto equalizer unit 101 applies the suppression gain G(ω), determined from the threshold determination on the feedback rate F0(ω) of each frequency band, to the audio signal in the corresponding frequency band. As a result, the corrected feedback rate F0(ω) of each frequency band can be kept at or below the threshold T.
In this way, when the positions of the microphone 10 and the speaker 20 are fixed, as in a classroom, and the amplifier volume is also fixed, the frequencies at which howling occurs can be identified in advance by outputting a calibration sound from the speaker 20 and measuring the sound picked up by the microphone 10.
Howling occurs when the feedback rate F0(ω) exceeds 1.0, and the sound becomes reverberant when it exceeds approximately 0.5. During off-microphone sound amplification, therefore, the suppression gain G(ω) determined by comparing the feedback rate F0(ω) with the threshold T (approximately 0.5) is applied, so that the actual feedback rate F0(ω) is controlled to be 0.5 or less (-6 dB or less).
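The correspondence between the linear threshold of approximately 0.5 and the value of -6 dB used in FIG. 6 can be checked with a short Python calculation. This sketch assumes that the feedback rate is an amplitude ratio, so that the decibel value is 20·log10 of the ratio; the helper names are hypothetical.

```python
import math


def to_db(ratio):
    """Convert a linear amplitude ratio to decibels (assumes 20*log10)."""
    return 20.0 * math.log10(ratio)


def from_db(db):
    """Convert decibels back to a linear amplitude ratio."""
    return 10.0 ** (db / 20.0)


print(round(to_db(0.5), 2))    # threshold T of about 0.5, close to -6 dB
print(round(from_db(-6.0), 3)) # -6 dB expressed as a linear ratio
print(to_db(1.0))              # howling boundary F0 = 1.0 corresponds to 0 dB
```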
This makes it possible to suppress howling more reliably. In addition, the suppression gain G(ω) determined from the threshold determination on the feedback rate F0(ω) of each frequency band is applied, so that sound quality is maintained with the minimum necessary auto equalizer value while howling is suppressed in the frequency bands that require it. Howling suppression and avoiding unnecessary loss of sound quality can therefore both be achieved.
(Feedback rate update process)
Next, the flow of the feedback rate update process performed when the volume is changed during off-microphone sound amplification will be described with reference to the flowchart of FIG. 7.
In step S71, it is determined whether the volume has been changed by a user operation or the like. When it is determined in step S71 that the volume has been changed, the process proceeds to step S72.
In step S72, the auto equalizer unit 101 updates (corrects) the value of the feedback rate F0(ω) set at setup time or the like (the initial value or the like) in conjunction with the volume change.
Here, if the amplification corresponding to the volume change is a factor of P, the feedback rate F(ω) linked to the volume change can be expressed by the following equation (4). The amplification corresponding to the volume change is input to the auto equalizer unit 101 when the volume is changed.
F(ω) = P×F0(ω)    ... (4)
When the process of step S72 is completed, or when it is determined in step S71 that the volume has not been changed, the process proceeds to step S73.
In step S73, the feedback rate control process is performed. This process is as described with reference to FIG. 5, except that the feedback rate has been updated (corrected) to F(ω) in conjunction with the volume change, so the suppression gain G(ω) for each frequency band is recalculated using the corrected feedback rate F(ω).
That is, the suppression gain G(ω) is calculated according to the result of the threshold determination on the corrected feedback rate F(ω), as shown in the following equation (5).
F(ω) ≤ T : G(ω) = 1.0
F(ω) > T : G(ω) = T/F(ω)    ... (5)
The auto equalizer unit 101 applies to the audio signal the suppression gain G(ω) for each frequency band calculated by equation (5) according to the threshold determination on the corrected feedback rate F(ω).
The flow of the feedback rate update process has been described above. In this process, when the volume is changed, the feedback rate F(ω) is corrected in conjunction with the volume change. When the corrected feedback rate F(ω) exceeds the threshold T, the suppression gain G(ω) is calculated so that the feedback rate F(ω) becomes equal to or less than the threshold T, and the audio signal is multiplied by the calculated suppression gain G(ω).
As a result, howling is suppressed before it occurs, and high sound quality can be maintained. In particular, because the howling-countermeasure suppression gain G(ω) follows the volume change, the occurrence of howling can be suppressed even when the user changes the volume.
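The update of equations (4) and (5) can be sketched in Python as follows. The function names and the initial feedback rates are hypothetical; the amplification factors 1.0, 2.0, and 4.0 are used as round approximations of volume steps of 0 dB, +6 dB, and +12 dB.

```python
def updated_rate(f0, p):
    """Equation (4): F = P * F0 for every frequency band."""
    return [p * f for f in f0]


def suppression_gains(rates, threshold=0.5):
    """Equation (5): gain 1.0 at or below T, otherwise T / F."""
    return [1.0 if f <= threshold else threshold / f for f in rates]


# Hypothetical initial feedback rates F0(w) stored at setup time.
F0 = [0.125, 0.25, 0.5]

results = {}
for P in (1.0, 2.0, 4.0):  # roughly 0 dB, +6 dB, +12 dB
    F = updated_rate(F0, P)
    results[P] = suppression_gains(F)
    print(P, F, results[P])
```

No new calibration measurement is needed in this sketch: the stored F0(ω) is rescaled by the known amplification factor and the gains are recalculated from the result.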
FIG. 8 shows an example of updating the feedback rate in the auto equalizer unit 101.
A1 and A2 of FIG. 8 show the feedback rate F0(ω) and the auto equalizer value before the volume change as vertical bars, with frequency on the horizontal axis. B1 and B2 of FIG. 8 and C1 and C2 of FIG. 8 show the feedback rate F(ω) and the auto equalizer value after the volume change in the same way.
As indicated by the transition "Gain 0dB", "Gain +6dB", "Gain +12dB", A of FIG. 8 shows the relationship between the feedback rate and the auto equalizer value for the volume at setup time, and B and C of FIG. 8 show that relationship as the volume is gradually raised during off-microphone sound amplification.
Comparing the feedback rates in A1 to C1 of FIG. 8 shows that the feedback rate rises as the volume increases. Comparing the auto equalizer values in A2 to C2 of FIG. 8 shows that the auto equalizer unit 101 corrects the feedback rate F(ω) as the volume increases and applies to the audio signal the suppression gain G(ω) calculated from the threshold determination on the corrected feedback rate F(ω).
<2. Second Embodiment>
Next, with reference to FIGS. 9 and 10, linkage not only with the volume but also with other processing that affects sound wraparound, such as beamforming and an equalizer, will be described.
(First alternative configuration of the voice processing device)
FIG. 9 shows an example of another configuration of an embodiment of a voice processing device to which the present technology is applied. In FIG. 9, parts that are the same as in FIG. 2 are given the same or corresponding reference numerals, and their description is omitted as appropriate.
The microphone 10 is composed of a microphone unit 11-1 and a microphone unit 11-2. The A/D conversion units 12-1 and 12-2 convert the audio signals from the microphone units 11-1 and 11-2, respectively, from analog to digital and supply them to the signal processing unit 13A.
In addition to the auto equalizer unit 101 through the feedback rate calculation unit 106, the signal processing unit 13A further includes a beamforming unit 111 and an equalizer unit 112.
The beamforming unit 111 performs beamforming processing based on the audio signals from the A/D conversion units 12-1 and 12-2.
This beamforming processing can reduce sensitivity in directions other than the target sound direction while preserving sensitivity in the target sound direction. That is, the beamforming processing forms a directivity for the microphone 10 that does not pick up (or picks up as little as possible of) sound from the direction in which the speaker 20 is installed, and the result is passed to the subsequent processing as a monaural signal.
Here, a technique such as an adaptive beamformer is used to form, as the directivity of the microphone 10 (its microphone units 11-1 and 11-2), a directivity that lowers sensitivity in the direction in which the speaker 20 is installed, and a monaural signal is generated.
To suppress sound from the direction of the speaker 20 (that is, to prevent howling) using a technique such as an adaptive beamformer, the internal parameters of the beamformer are learned during a period in which sound is output only from the speaker 20, such as during calibration, and a directivity is calculated that forms a null (blind spot) in the direction in which the speaker 20 is installed.
The beamforming unit 111 can obtain the amount B(ω) by which beamforming suppresses the amplified sound component, based on the directivity coefficients and the information about the amplified sound calculated during calibration.
The beamforming technique is not limited to the adaptive beamformer. Other techniques, such as the delay-and-sum method and the three-microphone integration method, are also known, and any of them may be used.
The audio signal that has undergone beamforming processing is supplied to the equalizer unit 112.
The equalizer unit 112 changes the frequency characteristics of the audio signal supplied from the beamforming unit 111 according to the user's operation and supplies the result to the auto equalizer unit 101. The equalizer unit 112 holds the gain EQ(ω) for each frequency band as an internal parameter.
In the signal processing unit 13A of FIG. 9, the auto equalizer unit 101 can use the internal parameters of the beamforming unit 111 and the equalizer unit 112 to modify the calculation of the feedback rate F(ω) in equation (4) above into the following equation (6).
The internal parameters are input to the auto equalizer unit 101 when the beamforming processing or the frequency characteristic change processing is performed.
F(ω) = P×B(ω)×EQ(ω)×F0(ω)    ... (6)
That is, the suppression gain G(ω) is calculated according to the result of the threshold determination on the corrected feedback rate F(ω), as shown in the following equation (7).
F(ω) ≤ T(ω) : G(ω) = 1.0
F(ω) > T(ω) : G(ω) = T/F(ω)    ... (7)
The auto equalizer unit 101 applies to the audio signal the suppression gain G(ω) for each frequency band calculated by equation (7) according to the threshold determination on the corrected feedback rate F(ω).
In this way, by updating (correcting) the feedback rate F(ω) according to the internal parameters of the volume, the equalizer, and the beamforming, a suppression gain G(ω) that suppresses howling while maintaining high sound quality can be obtained.
In particular, because the howling-countermeasure suppression gain G(ω) is changed in conjunction with signal processing such as beamforming and frequency characteristic changes, the occurrence of howling can be suppressed even when the internal parameters are changed. Linking to the internal parameters also removes the need for additional adjustment or recalibration.
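Equation (6) and the subsequent gain calculation can be sketched in Python as follows. This is only an illustration under stated assumptions: B(ω) and EQ(ω) are treated as linear per-band multipliers (with B(ω) at most 1.0 when beamforming attenuates the speaker direction), a single scalar threshold T is used, and all names and sample values are hypothetical.

```python
def corrected_rate(f0, p, b, eq):
    """Equation (6): F = P * B * EQ * F0, evaluated band by band."""
    return [p * bi * ei * fi for bi, ei, fi in zip(b, eq, f0)]


def suppression_gains(rates, threshold=0.5):
    """Gain 1.0 at or below the threshold, otherwise threshold / F."""
    return [1.0 if f <= threshold else threshold / f for f in rates]


# Hypothetical per-band values for three frequency bands.
F0 = [0.5, 1.0, 0.25]   # initial feedback rate from calibration
P = 2.0                 # volume amplification factor
B = [0.5, 0.25, 1.0]    # beamforming suppression of the speaker direction
EQ = [1.0, 2.0, 4.0]    # user equalizer gain

F = corrected_rate(F0, P, B, EQ)
G = suppression_gains(F)
print(F, G)
```

Further effector stages could be handled the same way by multiplying in one more per-band factor for each stage.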
In the signal processing unit 13A of FIG. 9, audio signals are input to the input sound power value calculation unit 105 from both the A/D conversion unit 12-1 and the A/D conversion unit 12-2, but the audio signal from only one of them may be input instead.
(Second alternative configuration of the voice processing device)
FIG. 10 shows an example of yet another configuration of an embodiment of the voice processing device to which the present technology is applied. In FIG. 10, parts that are the same as in FIG. 2 or FIG. 9 are given the same or corresponding reference numerals, and their description is omitted as appropriate.
In addition to the auto equalizer unit 101 through the feedback rate calculation unit 106, the signal processing unit 13B further includes a beamforming unit 111, an equalizer unit 112, a low-cut filter unit 121, and an auto gain control unit 122.
The low-cut filter unit 121 removes specific low-frequency components from the audio signal supplied from the beamforming unit 111 and supplies the result to the equalizer unit 112.
The auto gain control unit 122 automatically corrects the gain according to the input level of the audio signal supplied from the auto equalizer unit 101, so that signals with differing levels are output at a constant level, and supplies the result to the volume unit 102.
In the signal processing unit 13B of FIG. 10, the calculation of the feedback rate F(ω) in equation (6) above can be further modified using the internal parameters of the low-cut filter unit 121 and the auto gain control unit 122. The suppression gain G(ω) is obtained according to the result of the threshold determination on the corrected feedback rate F(ω), in the same way as in equation (7) above.
The equalizer unit 112, the low-cut filter unit 121, and the auto gain control unit 122 are examples of effector units that perform signal processing related to acoustic effects (signal processing that affects the volume). When other signal processing is used, the calculation of the feedback rate F(ω) in equation (6) above may likewise be further modified according to its internal parameters.
The positions of the low-cut filter unit 121 and the auto gain control unit 122 in the signal processing unit 13B of FIG. 10 are only an example, and they may be placed elsewhere.
<3. Third Embodiment>
The suppression gain G(ω) described above can also be used to estimate the amplified sound quality during off-microphone sound amplification. For example, when the average of the suppression gain G(ω) over the frequency bands is taken as the sound quality score S(t), the average suppression gain at time t can be calculated by the following equation (8).
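$$S(t) = \frac{1}{N} \sum_{\omega} G(\omega) \qquad \cdots (8)$$
 Here, N denotes the number of frequency bands, and G(ω) is the suppression gain of each frequency band at time t.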
 When the sound quality score S(t) is 1.0, the suppression gain G(ω) is 1.0 in all frequency bands. In other words, no suppression filter is applied, which means that the amplified sound quality is good.
 When the sound quality score S(t) is below 1.0 but equal to or greater than a threshold T, the suppression gain G(ω) is applied in some frequency bands, but the amount of suppression by the suppression filter is limited, which means that the amplified sound quality is relatively good.
 When the sound quality score S(t) is less than the threshold T, the suppression gain G(ω) is applied in many frequency bands and the suppression filter is in effect, which means that the amplified sound quality is relatively degraded.
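 The three-level interpretation above can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the implementation of this publication: the names `sound_quality_score`, `classify`, and `Quality`, the example gains, and the threshold value 0.8 are hypothetical. Only the averaging of G(ω) and the comparisons against 1.0 and T come from the text.

```python
from enum import Enum


class Quality(Enum):
    GOOD = "good"                                # S(t) is 1.0: no suppression
    RELATIVELY_GOOD = "relatively good"          # T <= S(t) < 1.0
    RELATIVELY_DEGRADED = "relatively degraded"  # S(t) < T


def sound_quality_score(gains):
    """Return S(t): the mean of the per-band suppression gains G(w) at one time t."""
    if not gains:
        raise ValueError("at least one frequency band is required")
    return sum(gains) / len(gains)


def classify(score, threshold):
    """Map S(t) to one of the three levels described in the text."""
    if score >= 1.0:
        return Quality.GOOD
    if score >= threshold:
        return Quality.RELATIVELY_GOOD
    return Quality.RELATIVELY_DEGRADED


T = 0.8                       # hypothetical threshold; the text gives no value
gains = [1.0, 1.0, 0.5, 1.0]  # hypothetical G(w) for four frequency bands
score = sound_quality_score(gains)
print(score, classify(score, T).value)
```

 Gains are assumed to lie in the range 0 to 1, so a score of exactly 1.0 can only occur when no band is suppressed.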
 Presenting the value of the sound quality score S(t) or the result of the threshold determination as evaluation information to a user, an installer, or the like, by means of a GUI (Graphical User Interface) display, an LED (Light Emitting Diode), or the like, conveys the amplified sound quality in an easily understood manner. Presenting the amplified sound quality in this way supports tasks such as volume adjustment and setup.
 Specifically, by checking this evaluation information, the user, installer, or the like can improve the sound quality by, for example, lowering the volume or moving the microphone 10 and the speaker 20 to installation positions better suited to sound amplification.
(Configuration of information processing device)
 FIG. 11 is a block diagram showing a configuration example of an embodiment of an information processing device to which the present technology is applied.
 The information processing device 100 is a device that calculates and presents a sound quality score as an index for evaluating whether the amplified sound quality is appropriate.
 The information processing device 100 calculates the sound quality score based on data for calculating the sound quality score (hereinafter referred to as score calculation data). The information processing device 100 also generates evaluation information based on data for generating evaluation information (hereinafter referred to as evaluation information generation data) and presents the evaluation information on the display device 30.
 The score calculation data includes the suppression gain G(ω) for each frequency band input from the voice processing device 1 (from its signal processing unit 13). The evaluation information generation data can include, for example, the calculated sound quality score as well as information obtained when off-microphone amplification is performed.
 The display device 30 is a device having a display such as an LCD (Liquid Crystal Display) or an OLED (Organic Light Emitting Diode) display. The display device 30 presents the evaluation information output from the information processing device 100.
 The information processing device 100 may be configured as a stand-alone electronic device, such as an audio device constituting a sound amplification system, a dedicated measuring device, or a personal computer, or as part of the functions of an electronic device such as the voice processing device 1, the microphone 10, or the speaker 20 described above. The information processing device 100 and the display device 30 may also be integrated into a single electronic device.
 In FIG. 11, the information processing device 100 includes a sound quality score calculation unit 151, an evaluation information generation unit 152, and a presentation control unit 153.
 The sound quality score calculation unit 151 calculates the sound quality score S(t) by applying equation (8) above to the suppression gain G(ω) for each frequency band, which is input to it as score calculation data, and supplies the score to the evaluation information generation unit 152.
 The evaluation information generation unit 152 generates evaluation information based on the sound quality score S(t) input to it as evaluation information generation data, and supplies the evaluation information to the presentation control unit 153. The evaluation information includes information on the sound quality during off-microphone amplification, such as the value of the sound quality score S(t) and the result of the threshold determination.
 The presentation control unit 153 performs control to present the evaluation information supplied from the evaluation information generation unit 152 on the screen of the display device 30.
 In the information processing device 100 configured as described above, the sound quality score is calculated using the suppression gain for each frequency band, evaluation information including the value of the calculated sound quality score and the result of the threshold determination is generated, and the presentation of the generated evaluation information is controlled.
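 The data flow through the three units can be sketched as a small pipeline. This is only an illustrative sketch: the function names, the `EvaluationInfo` type, the level labels, and the use of a callback in place of the display device 30 are assumptions, and the publication does not prescribe any particular software structure.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass
class EvaluationInfo:
    score: float  # sound quality score S(t)
    level: str    # result of the threshold determination


def calculate_score(gains: Sequence[float]) -> float:
    """Role of the sound quality score calculation unit 151: mean of G(w)."""
    return sum(gains) / len(gains)


def generate_evaluation(score: float, threshold: float) -> EvaluationInfo:
    """Role of the evaluation information generation unit 152."""
    if score >= 1.0:
        level = "good"
    elif score >= threshold:
        level = "relatively good"
    else:
        level = "relatively degraded"
    return EvaluationInfo(score=score, level=level)


def present(info: EvaluationInfo, show: Callable[[str], None] = print) -> None:
    """Role of the presentation control unit 153: pass text to a display callback."""
    show(f"S(t) = {info.score:.2f} ({info.level})")


def process(gains: Sequence[float], threshold: float,
            show: Callable[[str], None] = print) -> EvaluationInfo:
    """Run the three stages in order and return the evaluation information."""
    info = generate_evaluation(calculate_score(gains), threshold)
    present(info, show)
    return info
```

 Passing the display as a callback keeps the score calculation independent of how the result is shown, which mirrors the option of building the device 100 and the display device 30 either separately or as one unit.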
 FIG. 12 shows an example of evaluation information presented as a GUI display.
 In FIG. 12, the evaluation information screen 201 presents a user interface for adjusting the volume together with three rectangular areas that represent the result of the threshold determination on the sound quality score.
 For example, when the sound quality score S(t) is expressed in three levels (equal to 1.0, equal to or greater than the threshold T, or less than the threshold T), the lighting of the three rectangular areas is controlled according to the level: only the first rectangular area from the left is lit, the first and second rectangular areas from the left are lit, or all of the rectangular areas are lit.
 As a result, the lighting display presents the result of the threshold determination on the sound quality score as one of three states: no suppression filter is applied (the amplified sound quality is good), the amount of suppression by the suppression filter is limited (the amplified sound quality is relatively good), or the suppression filter is applied in many frequency bands (the amplified sound quality is relatively degraded).
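 The lighting rule for the three rectangular areas can be written out as a short Python sketch. The function names and the text rendering are hypothetical and merely stand in for the GUI of FIG. 12. The mapping of levels to one, two, or three lit areas follows the description above.

```python
def lit_rectangles(score, threshold):
    """Return the on/off state of the three rectangular areas, left to right."""
    if score >= 1.0:
        lit = 1   # no suppression filter applied
    elif score >= threshold:
        lit = 2   # limited suppression
    else:
        lit = 3   # suppression applied in many frequency bands
    return [i < lit for i in range(3)]


def render(states):
    """Draw the areas as text; '#' marks a lit area."""
    return "".join("[#]" if on else "[ ]" for on in states)


print(render(lit_rectangles(0.9, 0.8)))
```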
 By checking the evaluation information screen 201, the user, installer, or the like can also intuitively recognize what level of amplified sound quality results from howling suppression when the volume is adjusted. For example, on recognizing from the evaluation information screen 201 that the amount of sound wraparound is large, the user or installer can respond by, for example, not raising the volume any further.
 On the evaluation information screen 201, the result of the threshold determination on the sound quality score may be presented not only by the number of lit rectangular areas but also by displaying the rectangular areas in different colors such as green, yellow, and red. When different colors are used, a single rectangular area may be used and its color changed.
 The evaluation information is not limited to a GUI display and may be presented by lighting an LED. FIG. 13 shows an example of evaluation information presented by lighting an LED.
 In A of FIG. 13, the information processing device 100 is provided with an LED 202 that indicates the result of the threshold determination on the sound quality score.
 For example, the LED 202, which can emit three colors, is controlled to light in green, yellow, or red according to which of the three levels the sound quality score S(t) falls in: equal to 1.0, equal to or greater than the threshold T, or less than the threshold T. As a result, the result of the threshold determination, such as the state in which no suppression filter is applied (the amplified sound quality is good), is presented by the LED 202 lit in green or another color.
 In B of FIG. 13, the information processing device 100 is provided with LEDs 202-1 to 202-3 that indicate the result of the threshold determination on the sound quality score.
 For example, according to which of the three levels the sound quality score S(t) falls in (equal to 1.0, equal to or greater than the threshold T, or less than the threshold T), control is performed so that the LED 202-1 lights in green, the LED 202-2 lights in yellow, or the LED 202-3 lights in red. As a result, the result of the threshold determination, such as the state in which no suppression filter is applied (the amplified sound quality is good), is presented by, for example, the LED 202-1 lit in green.
 By checking the LED 202, the user, installer, or the like can intuitively recognize what level of amplified sound quality results from howling suppression.
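 The two LED arrangements of FIG. 13 differ only in how the same three-level result is mapped to hardware, as the following sketch illustrates. The function names and the dictionary keys are hypothetical, and the assumption that the two non-selected LEDs in arrangement B stay off is an interpretation of the text rather than something it states.

```python
COLORS = ("green", "yellow", "red")


def level_index(score, threshold):
    """0: S(t) is 1.0, 1: S(t) >= T, 2: S(t) < T."""
    if score >= 1.0:
        return 0
    if score >= threshold:
        return 1
    return 2


def single_led_color(score, threshold):
    """Arrangement A: one three-color LED 202 changes color."""
    return COLORS[level_index(score, threshold)]


def three_led_states(score, threshold):
    """Arrangement B: LEDs 202-1 to 202-3; only the matching one is lit (assumed)."""
    idx = level_index(score, threshold)
    return {f"202-{i + 1}": (COLORS[i] if i == idx else "off") for i in range(3)}


print(single_led_color(0.9, 0.8))
print(three_led_states(0.9, 0.8))
```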
 The methods of presenting the evaluation information shown in FIGS. 12 and 13 are examples, and the evaluation information may be presented by other methods. For example, presentation is not limited to a GUI display or to identification by the number or color of lit LEDs: the value of the sound quality score itself may be presented, or the value may be read aloud and output as sound.
<4. Modifications>
 The above description covers the case where the feedback rate is obtained for each frequency band. However, the feedback rate is a value corresponding to the ratio between the audio signal output to the speaker 20 and the audio signal input from the microphone 10, and it need not be obtained per frequency band. It may be obtained, for example, in the time domain.
 For example, if the output sound in the time domain is Y, the input sound is M, and the feedback rate in the time domain is F0, then F0 is expressed by the following equation (9).
 F0 = |M| / |Y|    ... (9)
 In the auto equalizer unit 101, a threshold determination is performed using this feedback rate F0, and the suppression gain G(ω) is calculated according to the result of the threshold determination.
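 A time-domain version of this calculation might look like the following sketch. Several points are assumptions not stated in the text: |M| and |Y| are taken as the mean absolute amplitude over one frame of samples, a single broadband gain is returned, and the gain is set to T / F0 when F0 exceeds the threshold T so that the product of gain and feedback rate falls to the threshold. The function names and the small constant `eps` are hypothetical.

```python
def mean_abs(frame):
    """Mean absolute amplitude of one frame of time-domain samples."""
    return sum(abs(x) for x in frame) / len(frame)


def feedback_rate(mic_frame, out_frame, eps=1e-12):
    """F0 = |M| / |Y| as in equation (9); eps guards against a silent output."""
    return mean_abs(mic_frame) / max(mean_abs(out_frame), eps)


def suppression_gain(f0, threshold):
    """Return 1.0 when F0 is within the threshold, otherwise threshold / F0."""
    if f0 <= threshold:
        return 1.0
    return threshold / f0


f0 = feedback_rate([0.5, -0.5], [1.0, -1.0])  # hypothetical frames
print(f0, suppression_gain(f0, 0.25))
```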
 When the feedback rate is calculated, noise contained in the input sound and similar factors may be taken into account.
 In the present specification, "audio signal" may be read as "sound signal", and "voice processing device" may be read as "sound processing device" or "signal processing device".
<5. Configuration of Computer>
 The series of processes of the voice processing device 1 described above can be executed by hardware or by software. When the series of processes is executed by software, a program constituting the software is installed on the computer of each device.
 FIG. 14 is a block diagram showing an example of the hardware configuration of a computer that executes the above series of processes by means of a program.
 In the computer, a CPU (Central Processing Unit) 1001, a ROM (Read Only Memory) 1002, and a RAM (Random Access Memory) 1003 are connected to one another by a bus 1004. An input/output interface 1005 is also connected to the bus 1004. An input unit 1006, an output unit 1007, a storage unit 1008, a communication unit 1009, and a drive 1010 are connected to the input/output interface 1005.
 The input unit 1006 includes a microphone, a keyboard, a mouse, and the like. The output unit 1007 includes a speaker, a display, and the like. The storage unit 1008 includes a hard disk, a non-volatile memory, and the like. The communication unit 1009 includes a network interface and the like. The drive 1010 drives a removable recording medium 1011 such as a magnetic disk, an optical disk, a magneto-optical disk, or a semiconductor memory.
 In the computer configured as described above, the CPU 1001 loads a program recorded in the ROM 1002 or the storage unit 1008 into the RAM 1003 via the input/output interface 1005 and the bus 1004 and executes it, whereby the series of processes described above is performed.
 The program executed by the computer (CPU 1001) can be provided by being recorded on the removable recording medium 1011 as, for example, a package medium. The program can also be provided via a wired or wireless transmission medium such as a local area network, the Internet, or digital satellite broadcasting.
 In the computer, the program can be installed in the storage unit 1008 via the input/output interface 1005 by loading the removable recording medium 1011 into the drive 1010. The program can also be received by the communication unit 1009 via a wired or wireless transmission medium and installed in the storage unit 1008. Alternatively, the program can be installed in advance in the ROM 1002 or the storage unit 1008.
 In the present specification, the processes that the computer performs according to the program do not necessarily have to be performed in time series in the order described in the flowcharts. That is, the processes that the computer performs according to the program also include processes executed in parallel or individually (for example, parallel processing or object-based processing). The program may be processed by a single computer (processor) or may be processed in a distributed manner by a plurality of computers.
 Embodiments of the present technology are not limited to the embodiments described above, and various modifications are possible without departing from the gist of the present technology.
 Each step of the processes described above can be executed by a single device or shared among a plurality of devices. Furthermore, when a single step includes a plurality of processes, those processes can be executed by a single device or shared among a plurality of devices.
 The present technology can also take the following configurations.
(1)
 A voice processing device including a signal processing unit that processes an audio signal picked up by a microphone, compares an index relating to the amount by which sound corresponding to the audio signal and output from a speaker wraps around to the microphone with a predetermined threshold, and performs a threshold determination.
(2)
 The voice processing device according to (1), in which the index is a value corresponding to the ratio between an audio signal output to the speaker and an audio signal input from the microphone.
(3)
 The voice processing device according to (2), in which the signal processing unit applies a gain corresponding to the result of the threshold determination to the audio signal.
(4)
 The voice processing device according to (3), in which the index is calculated for each frequency band or in the time domain.
(5)
 The voice processing device according to (4), in which, when the value of the index calculated for a frequency band exceeds the threshold, the signal processing unit applies, for that frequency band, a gain calculated from the value of the index and the threshold to the audio signal.
(6)
 The voice processing device according to (5), in which the signal processing unit calculates the gain so that the value of the index becomes equal to or less than the threshold.
(7)
 The voice processing device according to any one of (3) to (6), in which
 the signal processing unit includes an auto equalizer unit, and
 the auto equalizer unit applies the gain to the audio signal.
(8)
 The voice processing device according to any one of (2) to (7), in which the signal processing unit
  generates a calibration signal, and
  during a calibration period, calculates, as the index, a value corresponding to the ratio between the calibration signal output to the speaker and an audio signal input from the microphone.
(9)
 The voice processing device according to (8), in which the signal processing unit
  further includes a volume unit,
  when the volume is changed by the volume unit, calculates a value corresponding to the ratio between an audio signal output to the speaker and an audio signal input from the microphone, and
  corrects the value of the index using the calculated value.
(10)
 The voice processing device according to (9), in which the signal processing unit
  further includes a beamforming unit that performs beamforming processing, and
  corrects the value of the index using an internal parameter of the beamforming unit.
(11)
 The voice processing device according to (9) or (10), in which the signal processing unit
  further includes an effector unit that performs processing on the audio signal, and
  corrects the value of the index using an internal parameter of the effector unit.
(12)
 The voice processing device according to (11), in which the effector unit includes at least one of an equalizer unit, an auto gain control unit, and a filter unit.
(13)
 The voice processing device according to any one of (1) to (12), in which the threshold is set according to a criterion for judging sound quality to be reverberant.
(14)
 The voice processing device according to any one of (3) to (7), further including a calculation unit that calculates a score corresponding to the gain.
(15)
 The voice processing device according to (14), further including
 a generation unit that generates evaluation information based on the score, and
 a presentation control unit that controls presentation of the evaluation information.
(16)
 The voice processing device according to (15), in which the evaluation information includes information on sound quality during sound amplification.
(17)
 The voice processing device according to any one of (1) to (16), in which the microphone is installed at a position away from the mouth of a speaking person.
(18)
 The voice processing device according to (17), in which the microphone and the speaker are each fixedly installed at a predetermined position in the same space.
(19)
 The voice processing device according to any one of (1) to (18), which is provided in the housing of the microphone or in an external housing.
(20)
 A voice processing method in which a voice processing device
 processes an audio signal picked up by a microphone, compares an index relating to the amount by which sound corresponding to the audio signal and output from a speaker wraps around to the microphone with a predetermined threshold, and performs a threshold determination.
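 Configurations (5) and (6) can be illustrated with a per-band sketch. It is written under stated assumptions: the per-band input and output levels are supplied as plain lists (for example, levels measured during a calibration period), the index is their ratio, and the gain is threshold / index for bands whose index exceeds the threshold. The function names, the example values, and the constant `eps` are hypothetical, and equations (6) and (7) of the specification may differ in detail.

```python
def band_feedback_rates(input_levels, output_levels, eps=1e-12):
    """Per-band index: microphone input level divided by speaker output level."""
    if len(input_levels) != len(output_levels):
        raise ValueError("both lists must cover the same frequency bands")
    return [m / max(y, eps) for m, y in zip(input_levels, output_levels)]


def band_gains(rates, threshold):
    """Gain of 1.0 for bands within the threshold, threshold / rate otherwise."""
    return [1.0 if f <= threshold else threshold / f for f in rates]


mic = [0.1, 0.5, 1.0]  # hypothetical per-band input levels
spk = [1.0, 1.0, 2.0]  # hypothetical per-band output levels
T = 0.25               # hypothetical threshold
rates = band_feedback_rates(mic, spk)
gains = band_gains(rates, T)
print(rates, gains)
```

 With this rule, the product of each band's index and gain never exceeds the threshold, which is one way to satisfy the condition in configuration (6).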
 1 voice processing device, 10 microphone, 11-1, 11-2, 11 microphone unit, 12 A/D conversion unit, 13, 13A, 13B signal processing unit, 14 signal output unit, 20 speaker, 30 display device, 100 information processing device, 101 auto equalizer unit, 102 volume unit, 103 calibration signal generation unit, 104 output sound power value calculation unit, 105 input sound power value calculation unit, 106 feed rate calculation unit, 111 beamforming unit, 112 equalizer unit, 121 low-frequency cut filter unit, 122 auto gain control unit, 151 sound quality score calculation unit, 152 evaluation information generation unit, 153 presentation control unit, 1001 CPU

Claims (20)

  1.  A voice processing device comprising a signal processing unit that processes an audio signal picked up by a microphone, compares an index relating to the amount by which sound corresponding to the audio signal and output from a speaker wraps around to the microphone with a predetermined threshold, and performs a threshold determination.
  2.  The voice processing device according to claim 1, wherein the index is a value corresponding to the ratio between an audio signal output to the speaker and an audio signal input from the microphone.
  3.  The voice processing device according to claim 2, wherein the signal processing unit applies a gain corresponding to the result of the threshold determination to the audio signal.
  4.  The voice processing device according to claim 3, wherein the index is calculated for each frequency band or in the time domain.
  5.  The voice processing device according to claim 4, wherein, when the value of the index calculated for a frequency band exceeds the threshold, the signal processing unit applies, for that frequency band, a gain calculated from the value of the index and the threshold to the audio signal.
  6.  The voice processing device according to claim 5, wherein the signal processing unit calculates the gain so that the value of the index becomes equal to or less than the threshold.
  7.  The voice processing device according to claim 5, wherein
     the signal processing unit includes an auto equalizer unit, and
     the auto equalizer unit applies the gain to the audio signal.
  8.  The voice processing device according to claim 2, wherein the signal processing unit
      generates a calibration signal, and
      during a calibration period, calculates, as the index, a value corresponding to the ratio between the calibration signal output to the speaker and an audio signal input from the microphone.
  9.  The voice processing device according to claim 8, wherein the signal processing unit
      further includes a volume unit,
      when the volume is changed by the volume unit, calculates a value corresponding to the ratio between an audio signal output to the speaker and an audio signal input from the microphone, and
      corrects the value of the index using the calculated value.
  10.  The voice processing device according to claim 9, wherein the signal processing unit
      further includes a beamforming unit that performs beamforming processing, and
      corrects the value of the index using an internal parameter of the beamforming unit.
  11.  The voice processing device according to claim 9, wherein the signal processing unit
      further includes an effector unit that performs processing on the audio signal, and
      corrects the value of the index using an internal parameter of the effector unit.
  12.  The voice processing device according to claim 11, wherein the effector unit includes at least one of an equalizer unit, an auto gain control unit, and a filter unit.
  13.  The voice processing device according to claim 1, wherein the threshold is set according to a criterion for judging sound quality to be reverberant.
  14.  The voice processing device according to claim 3, further comprising a calculation unit that calculates a score corresponding to the gain.
  15.  The voice processing device according to claim 14, further comprising
     a generation unit that generates evaluation information based on the score, and
     a presentation control unit that controls presentation of the evaluation information.
  16.  The voice processing device according to claim 15, wherein the evaluation information includes information on sound quality during sound amplification.
  17.  The voice processing device according to claim 1, wherein the microphone is installed at a position away from the mouth of a speaking person.
  18.  The voice processing device according to claim 17, wherein the microphone and the speaker are each fixedly installed at a predetermined position in the same space.
  19.  The voice processing device according to claim 1, which is provided in the housing of the microphone or in an external housing.
  20.  A voice processing method in which a voice processing device
     processes an audio signal picked up by a microphone, compares an index relating to the amount by which sound corresponding to the audio signal and output from a speaker wraps around to the microphone with a predetermined threshold, and performs a threshold determination.
PCT/JP2020/039054 2019-10-30 2020-10-16 Voice processing device and voice processing method WO2021085174A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
JP2019-197023 2019-10-30
JP2019197023 2019-10-30

Publications (1)

Publication Number Publication Date
WO2021085174A1 (en) 2021-05-06

Family

ID=75716262

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2020/039054 WO2021085174A1 (en) 2019-10-30 2020-10-16 Voice processing device and voice processing method

Country Status (1)

Country Link
WO (1) WO2021085174A1 (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPS56106494A (en) * 1980-01-25 1981-08-24 Toa Tokushu Denki Kk Method and device for howling prevention of loudspeaker
JP2018046452A (en) * 2016-09-15 2018-03-22 沖電気工業株式会社 Signal processing apparatus, program, method, and communications device
WO2019188388A1 (en) * 2018-03-29 2019-10-03 ソニー株式会社 Sound processing device, sound processing method, and program

Similar Documents

Publication Publication Date Title
US8634547B2 (en) Echo canceller operative in response to fluctuation on echo path
JP4311402B2 (en) Loudspeaker system
JP5969727B2 (en) Frequency band compression using dynamic threshold
US20070165878A1 (en) Loudspeaker array audio signal supply apparartus
EP3348047A1 (en) Audio signal processing
KR20100119890A (en) Audio device and method of operation therefor
US8718562B2 (en) Processing audio signals
JP2006005902A (en) Amplifier and amplitude frequency characteristics adjusting method
US10721562B1 (en) Wind noise detection systems and methods
GB2491173A (en) Setting gain applied to an audio signal based on direction of arrival (DOA) information
JP2007043295A (en) Amplifier and method for regulating amplitude frequency characteristics
CN110169083A (en) Microphone array Wave beam forming
US11902758B2 (en) Method of compensating a processed audio signal
US20230079741A1 (en) Automated audio tuning launch procedure and report
US8804981B2 (en) Processing audio signals
US11336999B2 (en) Sound processing device, sound processing method, and program
WO2021085174A1 (en) Voice processing device and voice processing method
JPWO2018211988A1 (en) Audio output control device, audio output control method, and program
JP7195344B2 (en) Forced gap insertion for pervasive listening
US20230146772A1 (en) Automated audio tuning and compensation procedure
US9137601B2 (en) Audio adjusting method and acoustic processing apparatus
JPH06327088A (en) Acoustic system design/operation supporting device and adaptive control type equalizer
WO2023081534A1 (en) Automated audio tuning launch procedure and report
JP2023132138A (en) Soundbar device, audio system, and setting method of soundbar device
JP2018137532A (en) Gain setting apparatus, gain setting method, and gain setting program

Legal Events

Date Code Title Description
121 EP: the EPO has been informed by WIPO that EP was designated in this application (Ref document number: 20882744; Country of ref document: EP; Kind code of ref document: A1)
NENP Non-entry into the national phase (Ref country code: DE)
122 EP: PCT application non-entry in European phase (Ref document number: 20882744; Country of ref document: EP; Kind code of ref document: A1)
NENP Non-entry into the national phase (Ref country code: JP)