WO2018211806A1 - Audio signal processing device - Google Patents
Audio signal processing device
- Publication number
- WO2018211806A1 (PCT/JP2018/010328)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- unit
- signal
- output
- input
- input signal
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04M—TELEPHONIC COMMUNICATION
- H04M3/00—Automatic or semi-automatic exchanges
- H04M3/002—Applications of echo suppressors or cancellers in telephonic connections
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/02—Speech enhancement, e.g. noise reduction or echo cancellation
- G10L21/0208—Noise filtering
- G10L21/0216—Noise filtering characterised by the method used for estimating noise
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
- G10L25/78—Detection of presence or absence of voice signals
- G10L25/84—Detection of presence or absence of voice signals for discriminating voice from noise
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04M—TELEPHONIC COMMUNICATION
- H04M3/00—Automatic or semi-automatic exchanges
- H04M3/18—Automatic or semi-automatic exchanges with means for reducing interference or noise; with means for reducing effects due to line faults with means for protecting lines
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04M—TELEPHONIC COMMUNICATION
- H04M3/00—Automatic or semi-automatic exchanges
- H04M3/42—Systems providing special services or facilities to subscribers
- H04M3/56—Arrangements for connecting several subscribers to a common circuit, i.e. affording conference facilities
- H04M3/568—Arrangements for connecting several subscribers to a common circuit, i.e. affording conference facilities audio processing specific to telephonic conferencing, e.g. spatial distribution, mixing of participants
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04R—LOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
- H04R3/00—Circuits for transducers, loudspeakers or microphones
- H04R3/005—Circuits for transducers, loudspeakers or microphones for combining the signals of two or more microphones
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/02—Speech enhancement, e.g. noise reduction or echo cancellation
- G10L21/0208—Noise filtering
- G10L2021/02087—Noise filtering the noise being separate speech, e.g. cocktail party
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/02—Speech enhancement, e.g. noise reduction or echo cancellation
- G10L21/0208—Noise filtering
- G10L21/0216—Noise filtering characterised by the method used for estimating noise
- G10L2021/02161—Number of inputs available containing the signal or the noise to be suppressed
- G10L2021/02165—Two microphones, one receiving mainly the noise signal and the other one mainly the speech signal
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
- G10L25/78—Detection of presence or absence of voice signals
- G10L2025/783—Detection of presence or absence of voice signals based on threshold decision
Definitions
- the present invention relates to an audio signal processing device.
- A conference system is used for a conference attended by a large number of people, such as a committee meeting or a television conference.
- the conference system smoothly advances the conference by processing audio signals from a plurality of microphones.
- Some conference systems include an automatic voice recognition function that automatically detects the speech (voice) of a participant and controls switching between output and blocking of a signal from the microphone.
- The automatic voice recognition function may erroneously detect noise, such as the sound of tapping a desk with a pen or the sound of touching a document, as speech. In this case, the signal from the microphone is switched between output and cut-off in a way the user did not intend, and problems such as noise being emitted into the conference hall may occur.
- The noise detection device disclosed in Patent Document 1 can individually detect silence, low-frequency noise, and high-frequency noise by comparing the autocorrelation coefficient of each order with a threshold value.
- The noise detection device disclosed in Patent Document 1 can detect, for example, impulsive noise whose frequency spectrum has substantially the same level from low to high frequencies, such as the sound of hitting a desk with a pen. As a result, the device suppresses erroneous detection. However, it cannot detect complex noise that combines impulsive noise and high-frequency noise, such as the sound of paper being rolled up. For this reason, it may erroneously detect such complex noise as speech and emit the noise into the conference hall.
- In addition, the noise detection device disclosed in Patent Document 1 may judge a section that contains many low-frequency and high-frequency components, within a voice section in which voice is being output, to be noise. That is, the device may block the signal from the microphone in the middle of a participant's utterance when noise is detected while the audio signal is being output.
- An audio signal processing device according to the present invention includes: an input unit that receives a signal from a microphone; an input signal determination unit that determines the presence or absence of an input signal from the input unit; a noise detection unit that detects noise included in the input signal from the input unit; an output unit that outputs the input signal as an output signal; an output switching unit that switches between an output state in which the output signal is output from the output unit and a non-output state in which no output signal is output from the output unit; and a control unit that controls switching of the output switching unit. The switching control by the control unit is switched based on the determination result of the input signal determination unit and the detection result of the noise detection unit.
- The present invention provides an audio signal processing device that accurately detects various noises, including complex noise, and that does not block the audio signal even if noise is detected while the audio signal is being output.
- Audio signal processing device
- Embodiments of an audio signal processing device according to the present invention will be described below with reference to the drawings.
- FIG. 1 is a functional block diagram showing an embodiment of an audio signal processing device (hereinafter referred to as “this device”) according to the present invention.
- the apparatus 1 performs processing such as mixing, distribution, and balance adjustment of an electric signal (input signal) from a device such as a microphone 2 that converts voice or musical sound into an electric signal.
- the apparatus 1 is, for example, a mixer or a control unit of a conference system.
- The apparatus 1 includes an input unit 10, an input signal determination unit 20, a noise detection unit 30, a delay unit 40, a switching unit 50, a control unit 60, a storage unit 70, and an output unit 80.
- the input unit 10 is connected to the microphone 2 and receives the input signal s1 from the microphone 2, for example.
- The input signal s1 from the microphone 2 is input to the input unit 10 and passed from the input unit 10 to the delay unit 40 and the switching unit 50. After the processing by the input unit 10 described later, it is also input to the input signal determination unit 20 and the noise detection unit 30.
- the input unit 10 includes a receiving unit 11, a band pass filter 12, and a rectifier 13.
- the receiving unit 11 receives the input signal s1 from the microphone 2 and inputs the input signal s1 to the bandpass filter 12, the delay unit 40, and the switching unit 50.
- The bandpass filter 12 removes low-frequency and high-frequency signals from the input signal s1. In other words, the bandpass filter 12 removes noise that appears at low frequencies and noise that appears at high frequencies from the input signal s1.
- The signal s2 output from the bandpass filter 12 (hereinafter referred to as the “filter signal”) is input to the rectifier 13 and the noise detection unit 30.
- the band-pass filter may be configured by combining a low-pass filter and a high-pass filter.
- the rectifier 13 converts the filter signal s2 that is an AC signal into a DC signal.
- The filter signal converted into a DC signal (hereinafter referred to as the “DC signal”) s3 is input to the input signal determination unit 20.
- the input signal discriminating unit 20 discriminates the presence / absence of the input signal s1 (audio signal) from the microphone 2.
- the input signal determination unit 20 includes a first comparison unit 21.
- the first comparison unit 21 compares the DC signal s3 and the first threshold value V1 to determine the presence or absence of the input signal s1.
- The output of the first comparison unit 21, that is, the output r1 from the input signal determination unit 20 (hereinafter referred to as the “determination result”), is input to the control unit 60.
- the “first threshold value V1” is a threshold value used by the device 1 to determine the presence or absence of the input signal s1.
- The first threshold value V1 is, for example, a variable value set based on a signal corresponding to the environmental sound collected by the microphone 2.
- “Environmental sound” is, for example, air-conditioning sound at the installation location (conference room, auditorium, etc.) of the apparatus 1 and the microphone 2 and reverberation sound of the room where the apparatus 1 and the microphone 2 are installed.
- By using a variable value for the first threshold value V1, the apparatus 1 can determine the presence or absence of the input signal s1 according to the environment in which the apparatus 1 is used (for example, whether the room in which the apparatus 1 is installed is air-conditioned, the size of the room, and the gain of the microphone 2).
- the first threshold value V1 is stored in the storage unit 70.
- the first threshold value may be a fixed value that matches the environment in which the apparatus is used.
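The relationship between the environmental sound, the first threshold value V1, and the presence/absence decision can be sketched as follows. This is a minimal illustration, not the patented implementation: the helper names, the averaging used as a stand-in for the rectifier 13, and the `margin` rule for deriving V1 from the environmental sound are all assumptions introduced here.

```python
def dc_level(samples):
    """Full-wave rectify and average: an assumed stand-in for rectifier 13."""
    return sum(abs(s) for s in samples) / len(samples)


def threshold_from_environment(ambient_samples, margin=2.0):
    """Hypothetical rule: V1 = margin x level of the environmental sound."""
    return margin * dc_level(ambient_samples)


def has_input(signal_samples, v1):
    """First comparison unit 21: 'sound' when the DC level is >= V1."""
    return dc_level(signal_samples) >= v1


ambient = [0.01, -0.02, 0.015, -0.01]   # e.g. air-conditioning sound
speech = [0.3, -0.4, 0.35, -0.25]       # a louder, voice-like excerpt
v1 = threshold_from_environment(ambient)
print(has_input(ambient, v1), has_input(speech, v1))  # False True
```

With a variable V1, re-measuring `ambient` in a different room changes the decision boundary without touching the comparison itself.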
- The noise detection unit 30 detects the noise included in the input signal s1 received from the microphone 2 by detecting the characteristics caused by that noise. That is, the noise detection unit 30 determines whether the input signal s1 is a signal due to noise (hereinafter referred to as a “noise signal”) or a signal due to voice (hereinafter referred to as an “audio signal”).
- The output r2 of the noise detection unit 30 (hereinafter referred to as the “detection result”) is input to the control unit 60.
- “Noise” includes, for example, the sound of tapping a desk with a pen, the sound of turning a paper, the sound of sneezing, the sound of clapping a hand, the sound of rolling paper.
- FIG. 2 is a functional block diagram of the noise detection unit 30.
- the noise detection unit 30 includes a frequency component determination unit 31, a time change determination unit 32, and a logical sum operation unit 33.
- FIG. 3 is a functional block diagram of the frequency component determination unit 31.
- The frequency component determination unit 31 determines the presence or absence of noise based on the frequency components of the filter signal s2 (input signal s1). Usually, the power spectrum of an audio signal has more power in the mid-low range than in the high range, and it tends to appear prominently in only some frequency bands. The power spectrum of noise, on the other hand, tends to appear across the entire frequency band.
- the frequency component discriminating unit 31 divides the power spectrum of the filter signal s2 into a power spectrum of a middle / low frequency band (middle / low band) and a power spectrum of a middle / high frequency band (middle / high band). The frequency component determination unit 31 determines whether the filter signal s2 (input signal s1) is an audio signal or a noise signal by comparing the two power spectra.
- the mid-low range is a frequency band including a frequency of about 100 Hz to 3 kHz, for example.
- the mid-high range is a frequency band including a frequency of 3 kHz or more, for example.
- mid-low range may overlap with the mid-high range in some frequency bands.
- the frequency component determination unit 31 includes a low-pass filter 311, a first moving average unit 312, a high-pass filter 313, a second moving average unit 314, a relative comparison unit 315, and a second comparison unit 316.
- The low-pass filter 311 extracts the mid-low frequency signal from the filter signal s2.
- The first moving average unit 312 converts the mid-low frequency signal into a DC signal by moving average processing and generates the power spectrum of that signal (hereinafter referred to as the “mid-low frequency signal power spectrum”).
- The high-pass filter 313 extracts the mid-high frequency signal from the filter signal s2.
- The second moving average unit 314 converts the mid-high frequency signal into a DC signal by moving average processing and generates the power spectrum of that signal (hereinafter referred to as the “mid-high frequency signal power spectrum”).
- the relative comparison unit 315 compares the mid-low frequency signal power spectrum generated by the first moving average unit 312 with the mid-high frequency signal power spectrum generated by the second moving average unit 314, and calculates the difference.
- the relative comparison unit 315 inputs the calculated difference to the second comparison unit 316.
- the second comparison unit 316 compares the difference from the relative comparison unit 315 with the second threshold value V2, and inputs a signal indicating the result to the OR operation unit 33.
- the “second threshold V2” is a threshold used by the frequency component determination unit 31 to determine whether the filter signal s2 (input signal s1) is an audio signal or a noise signal.
- the second threshold value V2 is stored in the storage unit 70 (see FIG. 1).
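The band split, power averaging, and threshold comparison described for the frequency component determination unit 31 can be sketched as below. Several choices are assumptions made only for illustration: two-tap filters stand in for the low-pass filter 311 and high-pass filter 313 (their crossover sits at a quarter of the sampling rate), one mean over the analysis window stands in for the moving average units, the difference is taken in decibels, and a difference below `v2_db` is treated as noise.

```python
import math


def band_split(samples):
    """Two-tap low/high split (assumed stand-ins for filters 311 and 313)."""
    low, high = [], []
    prev = 0.0
    for x in samples:
        low.append((x + prev) / 2.0)    # passes slowly varying content
        high.append((x - prev) / 2.0)   # passes rapidly varying content
        prev = x
    return low, high


def mean_power(band):
    """Mean power over one analysis window."""
    return sum(v * v for v in band) / len(band)


def level_db(power, eps=1e-12):
    """Power expressed in decibels; eps avoids log10(0)."""
    return 10.0 * math.log10(power + eps)


def is_noise_by_frequency(samples, v2_db=6.0):
    """Assumed rule: noise when (mid-low level - mid-high level) < V2."""
    low, high = band_split(samples)
    difference = level_db(mean_power(low)) - level_db(mean_power(high))
    return difference < v2_db


fs = 16000
tone = [math.sin(2 * math.pi * 200 * n / fs) for n in range(800)]   # voice-like
clicks = [1.0 if n % 100 == 0 else 0.0 for n in range(800)]         # broadband
print(is_noise_by_frequency(tone), is_noise_by_frequency(clicks))  # False True
```

The 200 Hz tone keeps almost all of its power in the low band, so the difference is large; the click train spreads power equally over both bands, so the difference is zero and falls below the threshold.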
- FIG. 4 is a functional block diagram of the time change determination unit 32.
- The time change determination unit 32 determines the presence or absence of noise based on the time change of the filter signal s2 (input signal s1). Usually, the time-axis waveform of impulsive noise fluctuates immediately after the noise occurs and then attenuates within a predetermined time.
- That is, the time change determination unit 32 counts the time change of a signal having the time-axis waveform of impulsive noise, and determines whether the filter signal s2 (input signal s1) is an audio signal or a noise signal.
- the time change determination unit 32 includes a third moving average unit 321, a third comparison unit 322, a time change counter unit 323, and a fourth comparison unit 324.
- the third moving average unit 321 converts the filter signal s2 into a DC signal by moving average processing of the filter signal s2, and generates a power spectrum of the signal (hereinafter referred to as “input signal power spectrum”).
- the third comparison unit 322 compares the input signal power spectrum generated by the third moving average unit 321 with the third threshold value V3 and outputs the result to the time change counter unit 323.
- the “third threshold value V3” is a threshold value used by the time change determination unit 32 to determine whether the filter signal s2 (input signal s1) is an audio signal or a noise signal.
- The third threshold value V3 is, for example, a variable value set based on a signal corresponding to the environmental sound collected by the microphone 2.
- By using a variable value for the third threshold value V3, the apparatus 1 can determine whether the filter signal s2 is an audio signal or a noise signal according to the environment in which the apparatus 1 is used.
- the third threshold value V3 is stored in the storage unit 70 (see FIG. 1).
- the third threshold value may be a fixed value according to the environment in which the present apparatus is used, or may be calculated by adding a predetermined adjustment to a signal that is the basis of the first threshold value.
- The time change counter unit 323 counts the time change (attenuation time) in the time-axis waveform of the signal determined by the third comparison unit 322 to exceed the third threshold value V3, and outputs the result to the fourth comparison unit 324.
- the fourth comparison unit 324 compares the count value of the time change counter unit 323 with the fourth threshold value V4 and outputs a signal indicating the result to the logical sum operation unit 33.
- the “fourth threshold V4” is a threshold used by the time change determination unit 32 to determine whether the filter signal s2 (input signal s1) is an audio signal or a noise signal.
- the fourth threshold value V4 is stored in the storage unit 70 (see FIG. 1).
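The chain of the third moving average unit 321, third comparison unit 322, time change counter unit 323, and fourth comparison unit 324 can be sketched as follows. The function names, the window length, and in particular the decision direction (a burst that stays above V3 for fewer than V4 samples is treated as impulsive noise) are assumptions for illustration; the text above states only that the count is compared with V4.

```python
def moving_average_power(samples, window):
    """Running mean of squared samples over the last `window` samples."""
    squares = [s * s for s in samples]
    out, acc = [], 0.0
    for i, sq in enumerate(squares):
        acc += sq
        if i >= window:
            acc -= squares[i - window]   # drop the sample leaving the window
        out.append(acc / window)
    return out


def attenuation_count(power, v3):
    """Length of the first run of values above V3 (0 if there is none)."""
    count = 0
    for p in power:
        if p > v3:
            count += 1
        elif count:
            break                        # the burst has decayed below V3
    return count


def is_noise_by_time(samples, v3, v4, window=4):
    """Assumed rule: a burst shorter than V4 samples is impulsive noise."""
    count = attenuation_count(moving_average_power(samples, window), v3)
    return 0 < count < v4


click = [0.0] * 10 + [4.0, -2.0, 1.0] + [0.0] * 20   # short impulsive burst
voice = [1.5, -1.5] * 20                             # sustained signal
print(is_noise_by_time(click, 1.0, 8), is_noise_by_time(voice, 1.0, 8))  # True False
```

A signal that never exceeds V3 yields a count of zero and is not flagged as noise by this check.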
- the logical sum calculation unit 33 calculates a logical sum of the output of the frequency component determination unit 31 (second comparison unit 316) and the output of the time change determination unit 32 (fourth comparison unit 324).
- The logical sum operation unit 33 determines that the filter signal s2 (input signal s1) is a noise signal when either the output of the second comparison unit 316 or the output of the fourth comparison unit 324 indicates noise. That is, the logical sum operation unit 33 detects noise from the filter signal s2 (input signal s1) based on the determination result of the frequency component determination unit 31 and the determination result of the time change determination unit 32.
- the delay unit 40 stores the input signal s1 from the input unit 10 for a predetermined time, and generates and outputs a delay signal s4 obtained by delaying the input signal s1 for a predetermined time.
- the “predetermined time” is set to a time longer than the time (for example, the fourth threshold value V4) required for the process of the time change determination unit 32 (time change determination process (ST202) described later).
- As a result, the apparatus 1 generates a delay signal s4 in which no information (voice) is missing compared with the input signal s1.
- the delay unit 40 includes, for example, a ring buffer. The generation of the delay signal s4 of the delay unit 40 is always performed while the input signal s1 is input to the delay unit 40.
- the delay signal s4 is input from the delay unit 40 to the switching unit 50.
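A ring-buffer delay such as the one in the delay unit 40 can be sketched with a fixed-length queue. The class name and the sample-based delay length are assumptions for illustration; in the apparatus the delay would be chosen to exceed the time needed by the time change determination process.

```python
from collections import deque


class DelayLine:
    """Fixed delay of `delay_samples` samples backed by a ring buffer."""

    def __init__(self, delay_samples):
        if delay_samples < 1:
            raise ValueError("delay_samples must be at least 1")
        # Pre-fill with silence so the first outputs are zeros.
        self._buf = deque([0.0] * delay_samples, maxlen=delay_samples)

    def push(self, sample):
        """Store one input sample (s1) and return the delayed sample (s4)."""
        delayed = self._buf[0]
        self._buf.append(sample)   # maxlen discards the oldest entry
        return delayed


line = DelayLine(3)
print([line.push(x) for x in [1, 2, 3, 4, 5, 6]])  # [0.0, 0.0, 0.0, 1, 2, 3]
```

Because every input sample is stored before it is overwritten, the delayed stream contains the whole input, only shifted in time.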
- In accordance with a control signal cs1 from the control unit 60 described later, the switching unit 50 switches the signal input from the switching unit 50 to the output unit 80 between the input signal s1 and the delay signal s4, and also switches the presence or absence of signal input to the output unit 80.
- the switching unit 50 includes a signal switching unit 51 and an output switching unit 52.
- the signal switching unit 51 switches a signal input from the switching unit 50 to the output unit 80 in accordance with a control signal cs1 from the control unit 60 described later.
- the signal switching unit 51 includes two contacts P, that is, a contact P1 and a contact P2.
- the contact P1 is connected to the delay unit 40.
- the delay signal s4 from the delay unit 40 is input to the contact P1.
- the contact P2 is connected to the receiving unit 11.
- the input signal s1 from the receiving unit 11 is input to the contact P2. That is, the switching unit 50 inputs either the input signal s1 or the delay signal s4 to the output unit 80 by switching the contact point P (contact point P1, P2) of the signal switching unit 51.
- When the apparatus 1 is in the initial state, the contact P of the signal switching unit 51 is the contact P2.
- the output switching unit 52 switches presence / absence of signal input from the switching unit 50 to the output unit 80 in accordance with a control signal cs1 from the control unit 60 described later.
- The output switching unit 52 is, for example, a gate circuit. That is, for example, the output switching unit 52 enters an output state in which a signal is output when a high voltage is applied to the gate (hereinafter referred to as “gate on”), and enters a non-output state in which the signal is cut off when a low voltage is applied to the gate (hereinafter referred to as “gate off”).
- When the state of the output switching unit 52 is gate-off, the switching unit 50 does not input a signal to the output unit 80 (mute on).
- When the state of the output switching unit 52 is gate-on, the switching unit 50 inputs a signal to the output unit 80 (mute off). In other words, in accordance with the control signal cs1 from the control unit 60, the output switching unit 52 switches between an output state in which an output signal is output from the output unit 80 and a non-output state in which no output signal is output from the output unit 80. The “output signal” will be described later.
- When the apparatus 1 is in the initial state, the state of the output switching unit 52 is gate-off. A signal gs1 indicating the state of the output switching unit 52 (hereinafter referred to as the “state signal”) is input from the switching unit 50 to the control unit 60.
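The behavior of the switching unit 50 (contact selection by the signal switching unit 51 and gating by the output switching unit 52) can be modeled as a small state holder. The class, method names, and the use of `None` for "no signal" are assumptions for illustration; the defaults follow the states described above (contact P2, gate off).

```python
from dataclasses import dataclass


@dataclass
class SwitchingUnit:
    contact: str = "P2"     # P1 = delay signal s4, P2 = input signal s1
    gate_on: bool = False   # gate off = mute on (nothing reaches the output)

    def apply_control(self, contact=None, gate_on=None):
        """Apply a control signal cs1 (hypothetical representation)."""
        if contact is not None:
            if contact not in ("P1", "P2"):
                raise ValueError("contact must be 'P1' or 'P2'")
            self.contact = contact
        if gate_on is not None:
            self.gate_on = gate_on

    def route(self, s1, s4):
        """Return the sample passed to the output unit, or None when muted."""
        if not self.gate_on:
            return None
        return s4 if self.contact == "P1" else s1

    def state_signal(self):
        """State signal gs1 reported back to the control unit."""
        return self.gate_on


unit = SwitchingUnit()
print(unit.route(0.5, 0.1))           # None (gate off)
unit.apply_control(gate_on=True)
print(unit.route(0.5, 0.1))           # 0.5 (contact P2: input signal s1)
unit.apply_control(contact="P1")
print(unit.route(0.5, 0.1))           # 0.1 (contact P1: delay signal s4)
```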
- Based on the determination result r1 from the input signal determination unit 20, the detection result r2 from the noise detection unit 30, and the state signal gs1 of the output switching unit 52 from the switching unit 50, the control unit 60 generates the control signal cs1 for controlling the switching operation of the switching unit 50. That is, the control unit 60 controls the output from the output unit 80 of either the input signal s1 or the delay signal s4 based on the determination result r1, the detection result r2, and the state signal gs1.
- the control unit 60 includes an AND operation unit 61 and a counter unit 62.
- The control signal cs1 is, for example, a signal for switching between the contact P1 and the contact P2 of the signal switching unit 51, or a signal for switching between gate on and gate off of the output switching unit 52.
- the control signal cs1 is input from the control unit 60 to the switching unit 50.
- The logical product operation unit 61 selects the control (first control or second control), described later, by which the control unit 60 switches the switching unit 50, based on the logical product of a signal r2s indicating the detection result r2 (hereinafter referred to as the “detection result signal”) and the state signal gs1. The operation of the logical product operation unit 61 will be described later.
- the counter unit 62 counts the silent time.
- the storage unit 70 is means for storing information necessary for the apparatus 1 to perform signal processing described later.
- The storage unit 70 stores the first threshold value V1, the second threshold value V2 (see FIG. 3), the third threshold value V3 (see FIG. 4), the fourth threshold value V4 (see FIG. 4), and a fifth threshold value V5 described later (see FIG. 11).
- The output unit 80 outputs either the input signal s1 or the delay signal s4 from the switching unit 50 as an output signal, for example, to a speaker or a communication line connected to the apparatus 1.
- FIG. 5 is a flowchart showing signal processing of the apparatus 1.
- The input signal s1 input to the receiving unit 11 of the input unit 10 is input to the delay unit 40 and the switching unit 50, and is also input to the noise detection unit 30 as the filter signal s2 via the bandpass filter 12.
- the apparatus 1 executes an input signal determination process (ST1), a noise detection process (ST2), and a switching process (ST3) for each input signal s1 input to the input unit 10.
- the switching process (ST3) is executed after the input signal discrimination process (ST1) and the noise detection process (ST2).
- the input signal discrimination process and the noise detection process are not limited to being executed simultaneously, and either one of the processes may be executed first.
- the input signal discriminating process (ST1) is a process for discriminating whether or not the input signal s1 (DC signal s3) from the microphone 2 is present.
- FIG. 6 is a flowchart showing the input signal discrimination process (ST1).
- the apparatus 1 uses the input signal determination unit 20 to determine the presence or absence of the input signal s1 (DC signal s3).
- the DC signal s3 from the input unit 10 is input to the first comparison unit 21 of the input signal determination unit 20.
- the apparatus 1 uses the first comparison unit 21 to compare the DC signal s3 with the first threshold value V1 (ST101). When the DC signal s3 is equal to or higher than the first threshold value V1 (“Yes” in ST101), the apparatus 1 determines that the input signal s1 is present (sound) (ST102).
- When the DC signal s3 is less than the first threshold value V1 (“No” in ST101), the apparatus 1 determines that there is no input signal s1 (silence) (ST103).
- the determination result r1 is input from the input signal determination unit 20 to the control unit 60 (ST104).
- The first threshold value V1 is a variable value set based on a signal corresponding to the environmental sound collected by the microphone 2. That is, for example, when the sound collected by the microphone 2 is an environmental sound, the apparatus 1 determines that there is no input signal s1 (silence). On the other hand, for example, when the sound collected by the microphone 2 is voice or noise, the apparatus 1 determines that the input signal s1 is present (sound).
- The apparatus 1 treats a sound (voice or noise) that is equal to or higher than the first threshold value V1 set based on the environmental sound as the input signal s1, and does not treat a sound (voice or noise) smaller than the first threshold value V1 as the input signal s1. That is, in the present invention, the apparatus 1 does not handle a sound (voice or noise) equivalent to the environmental sound as the input signal s1.
- the noise detection process (ST2) is a process for detecting noise included in the filter signal s2 (input signal s1). That is, the noise detection process (ST2) is a process for determining whether the filter signal s2 is an audio signal or a noise signal.
- FIG. 7 is a flowchart showing the noise detection process (ST2). While the filter signal s2 is input from the input unit 10, the apparatus 1 performs a frequency component determination process (ST201) and a time change determination process (ST202).
- FIG. 8 is a flowchart showing the frequency component discrimination process (ST201).
- The frequency component determination process (ST201) is a process for detecting noise whose power spectrum has the same level from low to high frequencies.
- the apparatus 1 uses the frequency component determination unit 31 to execute frequency component determination processing (ST201).
- the apparatus 1 uses the low-pass filter 311 to extract a medium-low frequency signal from the filter signal s2 (ST211).
- the apparatus 1 uses the first moving average unit 312 to convert the mid-low frequency signal into a DC signal, and generates a mid-low frequency signal power spectrum from the signal (ST212).
- the present apparatus 1 uses the high-pass filter 313 to extract a medium-high frequency signal from the filter signal s2 (ST213).
- the apparatus 1 uses the second moving average unit 314 to convert the mid-high frequency signal into a DC signal, and generates a mid-high frequency signal power spectrum from the signal (ST214).
- the present apparatus 1 uses the relative comparison unit 315 to compare the mid-low band signal power spectrum and the mid-high band signal power spectrum and calculate the difference (ST215).
- the difference is calculated, for example, by subtracting the mid-high range signal power spectrum from the mid-low range signal power spectrum.
- the apparatus 1 uses the second comparison unit 316 to compare the difference calculated by the relative comparison unit 315 with the second threshold value V2 (ST216).
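- Depending on the result of this comparison, the apparatus 1 determines that the input signal s1 is a noise signal (ST217) or an audio signal (ST218). Because the power spectrum of noise tends to appear across the entire frequency band while that of voice is larger in the mid-low range, a small difference indicates noise and a large difference indicates voice.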
- the apparatus 1 inputs the discrimination result of the frequency component discrimination unit 31 to the logical sum operation unit 33 (ST219).
- FIG. 9 is a flowchart of the time change determination process (ST202).
- the time change determination process (ST202) is a process for detecting noise having an impulsive power spectrum.
- the apparatus 1 uses the time change determination unit 32 to execute a time change determination process (ST202).
- the apparatus 1 uses the third moving average unit 321 to convert the filter signal s2 into a DC signal, and generates an input signal power spectrum from the signal (ST221).
- the present apparatus 1 uses the third comparison unit 322 to compare the input signal power spectrum and the third threshold value V3 (ST222).
- the apparatus 1 uses the time change counter unit 323 to count the time change in the time-axis waveform of the signal exceeding the third threshold value V3 (ST223).
- the present apparatus 1 determines that the input signal s1 is an audio signal (ST226).
- the present apparatus 1 uses the fourth comparison unit 324 to compare the count value of the time change counter unit 323 with the fourth threshold value V4 (ST224).
- the present apparatus 1 determines that the input signal s1 is a noise signal (ST225).
- the apparatus 1 determines that the input signal s1 is an audio signal (ST226).
- the apparatus 1 inputs the determination result of the time change determination unit 32 to the logical sum operation unit 33 (ST227).
- the apparatus 1 uses the OR operation unit 33 to calculate the logical sum of the determination result of the frequency component determination process (ST201) and the determination result of the time change determination process (ST202) (ST203). Based on the logical sum, it is determined whether the filter signal s2 (input signal s1) is a noise signal or an audio signal (ST204).
- this apparatus 1 determines that the input signal s1 is a noise signal (ST205). That is, the noise detection unit 30 detects noise based on the logical sum of the determination result of the frequency component determination unit 31 and the determination result of the time change determination unit 32.
- when neither the discrimination result of the frequency component discrimination process (ST201) nor the discrimination result of the time change discrimination process (ST202) indicates noise ("No" in ST204),
- the apparatus 1 determines that the input signal s1 is an audio signal (ST206).
- the present apparatus 1 inputs the detection result r2 of the noise detection unit 30 to the control unit 60 (ST207).
- the switching process (ST3) is a process of generating a control signal cs1 from the determination result r1 from the input signal determination unit 20 and the detection result r2 from the noise detection unit 30, and switching the signal switching unit 51 and the output switching unit 52 of the switching unit 50.
- FIG. 10 is a flowchart showing a part of the switching process (ST3).
- the apparatus 1 uses the control unit 60 to check whether the state of the output switching unit 52 is gate-on (output state) or gate-off (non-output state) (ST301).
- the state signal gs1 of the output switching unit 52 is input from the switching unit 50 to the logical product operation unit 61 of the control unit 60.
- the present apparatus 1 checks the presence / absence of the input signal s1 from the determination result r1 of the input signal determination unit 20 (ST302).
- the present apparatus 1 confirms whether the input signal s1 is an audio signal or a noise signal from the detection result r2 of the noise detection unit 30 (ST303). At this time, the detection result signal r2s of the noise detection unit 30 is input to the logical product operation unit 61.
- the apparatus 1 uses the control unit 60 to generate a control signal cs1 that switches the contact P of the signal switching unit 51 to the contact P1 and switches the output switching unit 52 to gate-on (ST304).
- the apparatus 1 inputs the control signal cs1 from the control unit 60 to the switching unit 50, and executes a process (ST308) described later.
- the contact P of the signal switching unit 51 becomes the contact P1, and the output switching unit 52 is turned on (mute off). That is, the delay signal s4 is input from the switching unit 50 to the output unit 80, and the device 1 outputs the delay signal s4 as an output signal.
- the apparatus 1 uses the control unit 60 to generate the control signal cs1 for maintaining the contact P of the signal switching unit 51 at the contact P2 and maintaining the output switching unit 52 in the gate-off state (ST305).
- the apparatus 1 inputs the control signal cs1 from the control unit 60 to the switching unit 50, and returns to the process (ST301).
- the contact P of the signal switching unit 51 becomes the contact P2, and the output switching unit 52 is turned off (mute on). That is, no signal (input signal s1 or delayed signal s4) is input from the switching unit 50 to the output unit 80. That is, the device 1 does not output an output signal.
- the present apparatus 1 checks the presence / absence of the input signal s1 from the determination result of the input signal determination unit 20 (ST306).
- the apparatus 1 uses the control unit 60 to generate a control signal cs1 for maintaining the contact P of the signal switching unit 51 at the contact P2 and switching the output switching unit 52 to gate-off (ST307).
- the apparatus 1 inputs the control signal cs1 from the control unit 60 to the switching unit 50, and returns to the process (ST301).
- FIG. 11 is a flowchart showing another part of the switching process (ST3). This figure shows the process when the state of the output switching unit 52 is gate-on and the input signal s1 is present in the switching process (ST3).
- the device 1 detects a short silence period such as a prompt sound or breathing included in the input signal s1.
- the silence time is detected by, for example, detecting a rising edge of a signal indicating silence.
- a signal indicating silence is generated by the input signal determination unit 20 and is input to the control unit 60 together with the determination result r1.
- the apparatus 1 uses the control unit 60 to detect a rising edge of a signal indicating silence from the input signal determination unit 20 (ST308).
- the present apparatus 1 starts counting silence time using the counter unit 62 of the control unit 60 (ST309).
- the silent time count is continued until the control unit 60 detects a falling edge of a signal indicating silence from the input signal determining unit 20 (“NO” in ST310).
- the apparatus 1 checks whether the silence time is equal to or greater than a predetermined fifth threshold value V5 (ST311).
- the “fifth threshold value V5” is a threshold value for classifying whether a short period of silence is a breath connection or a prompt sound. That is, when the silence time is equal to or greater than the fifth threshold value V5, the short silence is silence caused by breathing. On the other hand, when the silence time is smaller than the fifth threshold value V5, the short silence is silence caused by the prompt sound.
- the fifth threshold value V5 is set to a value larger than the fourth threshold value V4 in the time change determination process (ST202).
- when the silence time is equal to or greater than the fifth threshold value V5 ("Yes" in ST311), the apparatus 1 generates the control signal cs1 that switches the contact P of the signal switching unit 51 to the contact P2 and maintains the output switching unit 52 in the gate-on state (ST312). Next, the apparatus 1 clears the count of the counter unit 62, ends the counting of the counter unit 62 (ST313), and returns to the process (ST301).
- the present apparatus 1 clears the count of the counter unit 62 (ST314), and returns to the processing (ST308).
- the apparatus 1 inputs the real-time input signal s1 to the output unit 80 when the silence time such as breathing is detected, and inputs the delay signal s4 to the output unit 80 when the silence time is not detected.
- the signal switching unit 51 inputs the input signal s1 from the input unit 10 to the output unit 80 if the silence time is equal to or greater than the fifth threshold value V5. That is, the signal switching unit 51 outputs either the delay signal s4 or the input signal s1 to the output unit 80 based on the determination result r1 of the input signal determination unit 20.
- the control of the switching of the output switching unit 52 by the control unit 60 includes a first control that controls the switching based on the determination result r1 of the input signal determination unit 20 and the detection result r2 of the noise detection unit 30 (see ST301 to ST305), and a second control that controls the switching based on the determination result r1 of the input signal determination unit 20 (see ST301, ST306, and ST307).
- when the state of the output switching unit 52 is gate-off, the device 1 selects the first control. Only when the state of the output switching unit 52 is gate-off and the detection result r2 of the noise detection unit 30 is an audio signal, the device 1 switches the state of the output switching unit 52 to gate-on. When the state of the output switching unit 52 is gate-on, the device 1 selects the second control. As described above, when the state of the output switching unit 52 is gate-off, the apparatus 1 switches the state of the output switching unit 52 to gate-on and selects the second control based on the logical product of the state of the output switching unit 52 and the detection result r2 of the noise detection unit 30.
- the apparatus 1 uses the logical product operation unit 61 to select either the first control or the second control.
- the detection result signal r2s and the state signal gs1 are input to the AND operation unit 61.
- the AND operation unit 61 selects either the first control or the second control based on the logical product of the detection result signal r2s and the state signal gs1 when the state of the output switching unit 52 is gate-off.
- the AND operation unit 61 selects the second control when the state of the output switching unit 52 is gate-on. That is, the first control and the second control are selected by the device 1 based on the state of the output switching unit 52.
- the present apparatus 1 selects the first control when the state of the output switching unit 52 is gate-off (non-output state), and controls the switching of the output switching unit 52 based on the first control.
- the device 1 selects the second control and controls the switching of the output switching unit 52 based on the second control.
- the device 1 does not block (gate off) the output of the input signal s1 (or delayed signal s4) from the microphone 2 even if the microphone 2 picks up noise while the user of the microphone 2 speaks.
- the switching control of the output switching unit 52 by the control unit 60 includes the first control and the second control, and the first control is selected when the state of the output switching unit 52 is gate-off.
- the present apparatus 1 maintains the state of the output switching unit 52 in the gate-off state when detecting noise. That is, the present apparatus 1 does not switch the output of the signal from the microphone 2 as a result of erroneously detecting noise as voice. In other words, in the initial state, the audio signal processing device according to the present invention outputs the delay signal s4 when an audio signal is input (mute off), and does not output a signal when a noise signal is input (mute on).
- the second control is selected when the state of the output switching unit 52 is gate-on.
- the present apparatus 1 maintains the state of the output switching unit 52 in the gate-on state even if noise is detected. That is, while the user of the microphone 2 is speaking (hereinafter referred to as the "speech state"), the apparatus 1 outputs the input signal s1 (or the delay signal s4) from the microphone 2 even if the microphone 2 picks up noise. In other words, the apparatus 1 does not cut off the output of the audio signal even if noise is detected during the output of the audio signal.
- the control unit 60 includes the logical product operation unit 61.
- the apparatus 1 calculates a logical product of the detection result signal r2s and the state signal gs1 using the logical product operation unit 61, and selects either the first control or the second control based on the logical product.
- the present apparatus 1 switches the state of the output switching unit 52 to gate-on and selects the second control.
- in the initial state, the apparatus 1 outputs an audio signal (delayed signal s4) when the audio signal is input (mute off), and does not output a signal when a noise signal is input (mute on).
- in the speech state, the device 1 outputs either the input signal s1 or the delayed signal s4 even if noise is detected. That is, the apparatus 1 does not cut off the output of the audio signal even if noise is detected during the output of the audio signal.
- the input signal determination unit 20 determines the presence or absence of the input signal s1 from the input unit 10 based on the result of comparing the signal corresponding to the environmental sound (first threshold value V1) with the input signal s1 (DC signal s3) from the input unit 10. For this reason, the present apparatus 1 can determine the presence or absence of the input signal s1 in accordance with the environment in which the present apparatus 1 is installed (for example, the presence or absence of air conditioning in the room in which the apparatus is installed, the size of the room, the gain value of the microphone 2, etc.).
- the noise detection unit 30 includes the frequency component determination unit 31 and the time change determination unit 32, and detects noise based on these determination results. Therefore, the present apparatus 1 can accurately detect complex noises having various shapes of waveforms.
- the present apparatus 1 outputs a delay signal s4 at the beginning of an utterance, and outputs a real-time input signal s1 when a short silence time such as breathing is detected. That is, the present apparatus 1 prevents so-called head missing of an audio signal generated by the processing of the noise detection unit 30 or the like.
- the switching unit 50 includes the output switching unit 52.
- the output unit may include an output switching unit.
- a control signal for switching between gate-on and gate-off is input from the control unit to the output unit.
- the control unit is not limited to the configuration of the present embodiment. That is, for example, the control unit may include a control circuit that controls the signal switching unit and a control circuit that controls the output switching unit.
- the present apparatus may include a plurality of input units. That is, for example, this apparatus may include six input units (6ch) and process input signals from six microphones.
- the present apparatus may detect a short silence period based on the interval between successive input signals. That is, for example, the present apparatus may count the silence time by detecting the falling edge of a certain input signal and terminate the silence time counting by detecting the rising edge of the next input signal.
- the signal switching unit inputs the input signal from the input unit to the output unit when the silence time is equal to or greater than the fifth threshold, and inputs the delay signal from the delay unit to the output unit when the silence time is smaller than the fifth threshold.
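As an illustrative sketch only, the silence-time rule described above (count the silence between its rising and falling edges, then compare the count with the fifth threshold value V5) might be written in Python as follows. The function name `select_source`, the per-sample boolean silence flags, and the labels `"delayed"` and `"input"` are assumptions introduced here and do not appear in the publication; V5 is expressed in samples.

```python
def select_source(silence_flags, v5, current="delayed"):
    """Return which signal the signal switching unit would pass on.

    silence_flags: iterable of booleans, True while the (hypothetical)
                   silence-indicating signal is high.
    v5:            fifth threshold, here assumed to be a sample count.
    current:       source selected before the flags are processed.
    """
    count = 0
    for silent in silence_flags:
        if silent:
            # Between the rising and falling edge: keep counting (ST309/ST310).
            count += 1
        else:
            # Falling edge (or no silence at all): evaluate the count (ST311).
            if count >= v5:
                current = "input"   # long pause: switch to the real-time signal
            count = 0               # clear the counter (ST313/ST314)
    return current
```

A silence that has not yet ended when the flags run out is left unevaluated in this sketch, which mirrors the description that counting continues until the falling edge is detected.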
Landscapes
- Engineering & Computer Science (AREA)
- Signal Processing (AREA)
- Multimedia (AREA)
- Physics & Mathematics (AREA)
- Health & Medical Sciences (AREA)
- Acoustics & Sound (AREA)
- Computational Linguistics (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Human Computer Interaction (AREA)
- Otolaryngology (AREA)
- Quality & Reliability (AREA)
- General Health & Medical Sciences (AREA)
- Telephone Function (AREA)
- Circuit For Audible Band Transducer (AREA)
Abstract
Description
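Hereinafter, embodiments of an audio signal processing device according to the present invention will be described with reference to the drawings.
FIG. 1 is a functional block diagram showing an embodiment of the audio signal processing device according to the present invention (hereinafter referred to as "the present apparatus").
The present apparatus 1 performs processing such as mixing, distribution, and balance adjustment of electric signals (input signals) from devices such as a microphone 2 that converts voice or musical sound into an electric signal. The present apparatus 1 is, for example, a mixer or a control unit of a conference system.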
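The noise detection unit 30 includes a frequency component determination unit 31, a time change determination unit 32, and a logical sum operation unit 33.
The frequency component determination unit 31 determines the presence or absence of noise based on the frequency components of the filter signal s2 (input signal s1). Normally, in the power spectrum of an audio signal, the power in the mid-low range is greater than the power in the high range. The power spectrum of an audio signal also tends to stand out in certain frequency bands. In contrast, the power spectrum of noise tends to appear across the entire frequency band. The frequency component determination unit 31 divides the power spectrum of the filter signal s2 into a power spectrum of the mid-low frequency band (mid-low range) and a power spectrum of the mid-high frequency band (mid-high range). By comparing the two power spectra, the frequency component determination unit 31 determines whether the filter signal s2 (input signal s1) is an audio signal or a noise signal.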
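The band comparison described above can be sketched in Python under several loud assumptions: a one-pole IIR filter stands in for the low-pass filter, its complement stands in for the high-pass filter, a rectified average stands in for the moving average units, and a signal is treated as noise when the mid-low minus mid-high difference falls below the second threshold value V2. The direction of that comparison, the names, and the parameter values are all hypothetical and are not taken from the publication.

```python
import math
import random


def one_pole_lowpass(samples, alpha):
    """One-pole IIR low-pass filter (assumed stand-in for the low-pass filter)."""
    out, state = [], 0.0
    for x in samples:
        state += alpha * (x - state)
        out.append(state)
    return out


def band_level(samples, window):
    """Rectify and average the last `window` samples (assumed moving average)."""
    tail = samples[-window:]
    return sum(abs(x) for x in tail) / len(tail)


def is_noise_by_frequency(samples, v2=0.1, alpha=0.2, window=400):
    """Return True when the two band levels are too similar to be speech."""
    low = one_pole_lowpass(samples, alpha)
    high = [x - l for x, l in zip(samples, low)]  # complementary high band
    difference = band_level(low, window) - band_level(high, window)
    # Assumption: a small (or negative) difference means broadband noise.
    return difference < v2


# Example inputs: a low-frequency tone and seeded broadband noise.
tone = [math.sin(2 * math.pi * n / 80) for n in range(1600)]
rng = random.Random(0)
noise = [rng.uniform(-1.0, 1.0) for _ in range(1600)]
```

In this sketch the tone concentrates its energy in the low band, so the difference stays well above V2, whereas the broadband signal does not.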
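The time change determination unit 32 determines the presence or absence of noise based on the time change of the filter signal s2 (input signal s1). Normally, the time-axis waveform of impulsive noise fluctuates steeply immediately after the noise occurs and then decays within a predetermined time. That is, the time change determination unit 32 counts the time change of a signal having the time-axis waveform of impulsive noise and determines whether the filter signal s2 (input signal s1) is an audio signal or a noise signal.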
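A minimal Python sketch of this duration-based idea follows. It assumes that the signal level has already been smoothed, that the third and fourth threshold values V3 and V4 are a level and a sample count respectively, and that a burst which falls back below V3 before V4 samples have elapsed is treated as impulsive noise. That direction of the V4 comparison is an assumption made for illustration, as are all names.

```python
def is_noise_by_time_change(levels, v3, v4):
    """Classify a level sequence as impulsive noise (True) or not (False).

    levels: smoothed signal levels, one per sample (assumed input).
    v3:     level threshold; samples above it are counted.
    v4:     duration threshold in samples (assumption).
    """
    count = 0
    for level in levels:
        if level > v3:
            count += 1
            if count >= v4:
                # Stayed above V3 long enough: treated as an audio signal.
                return False
        elif count > 0:
            # Dropped below V3 before reaching V4: treated as impulsive noise.
            return True
    # Never exceeded V3, or still undecided at the end: not flagged as noise.
    return False
```

For example, a two-sample burst with `v4=4` would be flagged, while a five-sample run above `v3` would not.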
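The logical sum operation unit 33 calculates the logical sum of the output of the frequency component determination unit 31 (second comparison unit 316) and the output of the time change determination unit 32 (fourth comparison unit 324). When either the output of the second comparison unit 316 or the output of the fourth comparison unit 324 indicates noise, the logical sum operation unit 33 determines that the filter signal s2 (input signal s1) is a noise signal. That is, the logical sum operation unit 33 detects noise from the filter signal s2 (input signal s1) based on the determination result of the frequency component determination unit 31 and the determination result of the time change determination unit 32.
The delay unit 40 stores the input signal s1 from the input unit 10 for a predetermined time, and generates and outputs a delay signal s4 obtained by delaying the input signal s1 by the predetermined time. The "predetermined time" is set longer than the time required for the processing of the time change determination unit 32 (the time change determination process (ST202) described later), for example, the fourth threshold value V4. As a result, even when the processing of the time change determination unit 32 is executed, the present apparatus 1 generates a delay signal s4 with no loss of information (voice) compared with the input signal s1. The delay unit 40 includes, for example, a ring buffer. The delay unit 40 generates the delay signal s4 continuously while the input signal s1 is being input to the delay unit 40. The delay signal s4 is input from the delay unit 40 to the switching unit 50.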
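As a rough illustration of a ring-buffer delay of the kind mentioned above, the following Python sketch delays a sample stream by a fixed number of samples. The class name `DelayLine` and the choice to express the delay in samples are assumptions for this example, not details from the publication.

```python
class DelayLine:
    """Fixed-length delay implemented as a ring buffer (illustrative only)."""

    def __init__(self, delay_samples):
        if delay_samples < 1:
            raise ValueError("delay_samples must be at least 1")
        self._buf = [0.0] * delay_samples  # pre-filled with silence
        self._pos = 0                      # next slot to read and overwrite

    def process(self, x):
        """Store the newest sample and return the one stored delay_samples ago."""
        delayed = self._buf[self._pos]
        self._buf[self._pos] = x
        self._pos = (self._pos + 1) % len(self._buf)
        return delayed
```

Because the buffer is written on every call, the delayed stream is always available, which matches the statement that the delay signal is generated for as long as the input signal arrives.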
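Next, the signal processing (operation) of the present apparatus 1 will be described.
As shown in FIG. 1, the input signal s1 input to the receiving unit 11 of the input unit 10 is input to the delay unit 40 and the switching unit 50. It is also input to the noise detection unit 30 as the filter signal s2 via the band-pass filter 12, and is then converted into the DC signal s3 via the rectifier 13 and input to the input signal determination unit 20. For each input signal s1 input to the input unit 10, the present apparatus 1 executes an input signal determination process (ST1), a noise detection process (ST2), and a switching process (ST3). The switching process (ST3) is executed after the input signal determination process (ST1) and the noise detection process (ST2).
The input signal determination process (ST1) is a process of determining the presence or absence of the input signal s1 (DC signal s3) from the microphone 2.
The present apparatus 1 determines the presence or absence of the input signal s1 (DC signal s3) using the input signal determination unit 20. The DC signal s3 from the input unit 10 is input to the first comparison unit 21 of the input signal determination unit 20. The present apparatus 1 compares the DC signal s3 with the first threshold value V1 using the first comparison unit 21 (ST101). When the DC signal s3 is equal to or greater than the first threshold value V1 ("Yes" in ST101), the present apparatus 1 determines that the input signal s1 is present (sound) (ST102). On the other hand, when the DC signal s3 is smaller than the first threshold value V1 ("No" in ST101), the present apparatus 1 determines that the input signal s1 is absent (silence) (ST103). The determination result r1 is input from the input signal determination unit 20 to the control unit 60 (ST104).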
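The comparison in ST101 to ST103 can be sketched in Python as below. The mean absolute value of a frame is used here as a rough stand-in for the DC signal s3; that choice, the frame-based interface, and the function names are assumptions made for illustration.

```python
def rectified_level(frame):
    """Mean absolute value of a frame (assumed stand-in for the DC signal s3)."""
    return sum(abs(x) for x in frame) / len(frame)


def has_input_signal(frame, v1):
    """True (sound present) when the level is equal to or greater than V1."""
    return rectified_level(frame) >= v1
```

Setting `v1` from a measurement of the room's ambient level would correspond to treating the first threshold value as the signal that represents environmental sound.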
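The noise detection process (ST2) is a process of detecting noise included in the filter signal s2 (input signal s1). That is, the noise detection process (ST2) is a process of determining whether the filter signal s2 is an audio signal or a noise signal.
While the filter signal s2 is being input from the input unit 10, the present apparatus 1 executes the frequency component determination process (ST201) and the time change determination process (ST202).
The frequency component determination process (ST201) is a process of detecting noise whose power spectrum has the same level from the low range to the high range. The present apparatus 1 executes the frequency component determination process (ST201) using the frequency component determination unit 31.
The time change determination process (ST202) is a process of detecting noise having an impulsive power spectrum. The present apparatus 1 executes the time change determination process (ST202) using the time change determination unit 32.
Using the logical sum operation unit 33, the present apparatus 1 calculates the logical sum of the determination result of the frequency component determination process (ST201) and the determination result of the time change determination process (ST202) (ST203), and determines based on the logical sum whether the filter signal s2 (input signal s1) is a noise signal or an audio signal (ST204).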
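Returning to FIG. 5.
The switching process (ST3) is a process of generating the control signal cs1 from the determination result r1 from the input signal determination unit 20 and the detection result r2 from the noise detection unit 30, and switching the signal switching unit 51 and the output switching unit 52 of the switching unit 50.
First, the present apparatus 1 uses the control unit 60 to check whether the state of the output switching unit 52 is gate-on (output state) or gate-off (non-output state) (ST301). At this time, the state signal gs1 of the output switching unit 52 is input from the switching unit 50 to the logical product operation unit 61 of the control unit 60. When the state of the output switching unit 52 is gate-off ("No" in ST301), the present apparatus 1 checks the presence or absence of the input signal s1 from the determination result r1 of the input signal determination unit 20 (ST302).
The figure shows the part of the switching process (ST3) performed when the state of the output switching unit 52 is gate-on and the input signal s1 is present.
As shown in FIG. 10, the control of the switching of the output switching unit 52 by the control unit 60 includes a first control that controls the switching based on the determination result r1 of the input signal determination unit 20 and the detection result r2 of the noise detection unit 30 (see ST301 to ST305), and a second control that controls the switching based on the determination result r1 of the input signal determination unit 20 (see ST301, ST306, and ST307).
According to the embodiment described above, the control of the switching of the output switching unit 52 by the control unit 60 includes the first control and the second control, and the first control is selected when the state of the output switching unit 52 is gate-off. As a result, when the state of the output switching unit 52 is gate-off, the present apparatus 1 maintains the state of the output switching unit 52 at gate-off when it detects noise. That is, the present apparatus 1 does not switch the output of the signal from the microphone 2 as a result of erroneously detecting noise as voice. In other words, in the initial state, the audio signal processing device according to the present invention outputs the delay signal s4 when an audio signal is input (mute off), and outputs no signal when a noise signal is input (mute on).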
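The selection between the first control and the second control can be illustrated with a small state-transition sketch in Python. The names `GateState` and `switching_step`, the boolean inputs, and the string labels for the contacts are assumptions introduced for this example; the sketch also leaves the contact unchanged while the gate is on and input is present, since that case is handled by the separate silence-time processing.

```python
from dataclasses import dataclass


@dataclass
class GateState:
    gate_on: bool = False   # False = gate-off (non-output state)
    contact: str = "P2"     # "P1" = delay signal s4, "P2" = input signal s1


def switching_step(state, input_present, noise_detected):
    """One pass of the switching decision (illustrative, not the patented circuit)."""
    if not state.gate_on:
        # First control: uses both r1 (input presence) and r2 (noise detection).
        if input_present and not noise_detected:
            return GateState(gate_on=True, contact="P1")
        return GateState(gate_on=False, contact="P2")
    # Second control: uses r1 only, so detected noise does not close the gate.
    if not input_present:
        return GateState(gate_on=False, contact="P2")
    return state
```

Starting from the default state, a noisy input keeps the gate off, a clean audio input opens it with the delayed signal selected, and later noise leaves the gate open until the input disappears.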
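10 Input unit
20 Input signal determination unit
30 Noise detection unit
31 Frequency component determination unit
32 Time change determination unit
33 Logical sum operation unit
40 Delay unit
50 Switching unit
51 Signal switching unit
52 Output switching unit
60 Control unit
61 Logical product operation unit
80 Output unit
r1 Determination result
r2 Detection result
s1 Input signal
s4 Delay signal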
Claims (10)
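- An audio signal processing device comprising:
an input unit to which a signal from a microphone is input;
an input signal determination unit that determines the presence or absence of an input signal from the input unit;
a noise detection unit that detects noise included in the input signal from the input unit;
an output unit that outputs the input signal as an output signal;
an output switching unit that switches between an output state in which the output signal is output from the output unit and a non-output state in which the output signal is not output from the output unit; and
a control unit that controls the switching of the output switching unit,
wherein the control of the switching by the control unit includes:
a first control that controls the switching based on a determination result of the input signal determination unit and a detection result of the noise detection unit; and
a second control that controls the switching based on the determination result of the input signal determination unit,
and wherein the first control and the second control are selected based on the state of the output switching unit.
- The audio signal processing device according to claim 1, wherein the first control is selected when the output switching unit is in the non-output state, and
the second control is selected when the output switching unit is in the output state.
- The audio signal processing device according to claim 1, wherein the control unit includes
a logical product operation unit to which a detection result signal indicating the detection result of the noise detection unit and a state signal indicating the state of the output switching unit are input, and
the logical product operation unit selects either the first control or the second control based on the logical product of the detection result signal and the state signal.
- The audio signal processing device according to claim 1, wherein the microphone picks up environmental sound at the installation location of the microphone, and
the input signal determination unit determines the presence or absence of the input signal from the input unit based on a result of comparing a signal corresponding to the environmental sound with the input signal from the input unit.
- The audio signal processing device according to claim 1, wherein the output unit does not output the output signal when the noise detection unit detects the noise.
- The audio signal processing device according to claim 1, wherein the noise detection unit includes
a frequency component determination unit that determines the presence or absence of the noise based on frequency components of the input signal, and
a time change determination unit that determines the presence or absence of the noise based on a time change of the input signal, and
the noise detection unit detects the noise based on a determination result of the frequency component determination unit and a determination result of the time change determination unit.
- The audio signal processing device according to claim 6, wherein the noise detection unit detects the noise based on the logical sum of the determination result of the frequency component determination unit and the determination result of the time change determination unit.
- The audio signal processing device according to claim 1, further comprising:
a delay unit that delays and outputs the input signal input from the input unit; and
a signal switching unit to which a delay signal from the delay unit and the input signal from the input unit are input and which outputs either the delay signal or the input signal,
wherein the output unit outputs, as the output signal, the delay signal or the input signal input from the signal switching unit.
- The audio signal processing device according to claim 8, wherein the signal switching unit inputs either the delay signal or the input signal to the output unit based on the determination result of the input signal determination unit.
- The audio signal processing device according to claim 9, wherein the signal switching unit inputs the input signal from the input unit to the output unit when the input signal determination unit determines that the input signal from the input unit is absent.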
Priority Applications (4)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
EP18802763.5A EP3627853A4 (en) | 2017-05-19 | 2018-03-15 | AUDIO SIGNAL PROCESSOR |
CN201880032965.5A CN110663258B (zh) | 2017-05-19 | 2018-03-15 | 语音信号处理装置 |
JP2019519088A JP7004332B2 (ja) | 2017-05-19 | 2018-03-15 | 音声信号処理装置 |
US16/614,628 US10971169B2 (en) | 2017-05-19 | 2018-03-15 | Sound signal processing device |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP2017099804 | 2017-05-19 | ||
JP2017-099804 | 2017-05-19 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2018211806A1 true WO2018211806A1 (ja) | 2018-11-22 |
Family
ID=64274499
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/JP2018/010328 WO2018211806A1 (ja) | 2017-05-19 | 2018-03-15 | 音声信号処理装置 |
Country Status (5)
Country | Link |
---|---|
US (1) | US10971169B2 (ja) |
EP (1) | EP3627853A4 (ja) |
JP (1) | JP7004332B2 (ja) |
CN (1) | CN110663258B (ja) |
WO (1) | WO2018211806A1 (ja) |
Cited By (15)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2020243471A1 (en) * | 2019-05-31 | 2020-12-03 | Shure Acquisition Holdings, Inc. | Low latency automixer integrated with voice and noise activity detection |
US11297426B2 (en) | 2019-08-23 | 2022-04-05 | Shure Acquisition Holdings, Inc. | One-dimensional array microphone with improved directivity |
US11297423B2 (en) | 2018-06-15 | 2022-04-05 | Shure Acquisition Holdings, Inc. | Endfire linear array microphone |
US11303981B2 (en) | 2019-03-21 | 2022-04-12 | Shure Acquisition Holdings, Inc. | Housings and associated design features for ceiling array microphones |
US11310596B2 (en) | 2018-09-20 | 2022-04-19 | Shure Acquisition Holdings, Inc. | Adjustable lobe shape for array microphones |
US11310592B2 (en) | 2015-04-30 | 2022-04-19 | Shure Acquisition Holdings, Inc. | Array microphone system and method of assembling the same |
US11438691B2 (en) | 2019-03-21 | 2022-09-06 | Shure Acquisition Holdings, Inc. | Auto focus, auto focus within regions, and auto placement of beamformed microphone lobes with inhibition functionality |
US11445294B2 (en) | 2019-05-23 | 2022-09-13 | Shure Acquisition Holdings, Inc. | Steerable speaker array, system, and method for the same |
US11477327B2 (en) | 2017-01-13 | 2022-10-18 | Shure Acquisition Holdings, Inc. | Post-mixing acoustic echo cancellation systems and methods |
US11523212B2 (en) | 2018-06-01 | 2022-12-06 | Shure Acquisition Holdings, Inc. | Pattern-forming microphone array |
US11552611B2 (en) | 2020-02-07 | 2023-01-10 | Shure Acquisition Holdings, Inc. | System and method for automatic adjustment of reference gain |
US11558693B2 (en) | 2019-03-21 | 2023-01-17 | Shure Acquisition Holdings, Inc. | Auto focus, auto focus within regions, and auto placement of beamformed microphone lobes with inhibition and voice activity detection functionality |
US11678109B2 (en) | 2015-04-30 | 2023-06-13 | Shure Acquisition Holdings, Inc. | Offset cartridge microphones |
US11706562B2 (en) | 2020-05-29 | 2023-07-18 | Shure Acquisition Holdings, Inc. | Transducer steering and configuration systems and methods using a local positioning system |
US11785380B2 (en) | 2021-01-28 | 2023-10-10 | Shure Acquisition Holdings, Inc. | Hybrid audio beamforming system |
Families Citing this family (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US10971169B2 (en) * | 2017-05-19 | 2021-04-06 | Audio-Technica Corporation | Sound signal processing device |
WO2022119752A1 (en) * | 2020-12-02 | 2022-06-09 | HearUnow, Inc. | Dynamic voice accentuation and reinforcement |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JPH0683391A (ja) | 1992-09-04 | 1994-03-25 | Matsushita Electric Ind Co Ltd | テレビ会議用発言音声検出装置 |
JPH0744996A (ja) * | 1993-07-30 | 1995-02-14 | Aiwa Co Ltd | ノイズ低減回路 |
JP2008015481A (ja) * | 2006-06-08 | 2008-01-24 | Audio Technica Corp | 音声会議装置 |
US20080167868A1 (en) * | 2007-01-04 | 2008-07-10 | Dimitri Kanevsky | Systems and methods for intelligent control of microphones for speech recognition applications |
US20080279366A1 (en) * | 2007-05-08 | 2008-11-13 | Polycom, Inc. | Method and Apparatus for Automatically Suppressing Computer Keyboard Noises in Audio Telecommunication Session |
JP2014053890A (ja) * | 2012-09-10 | 2014-03-20 | Polycom Inc | 望ましくないノイズに対する自動的マイクロホンミューティング |
Family Cites Families (18)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2000267690A (ja) * | 1999-03-19 | 2000-09-29 | Toshiba Corp | 音声検知装置及び音声制御システム |
WO2007122923A1 (ja) * | 2006-04-24 | 2007-11-01 | Panasonic Corporation | 雑音抑圧装置 |
JP4747949B2 (ja) * | 2006-05-25 | 2011-08-17 | ヤマハ株式会社 | 音声会議装置 |
EP2047669B1 (de) * | 2006-07-28 | 2014-05-21 | Unify GmbH & Co. KG | Verfahren zum durchführen einer audiokonferenz, audiokonferenzeinrichtung und umschalteverfahren zwischen kodierern |
US8175291B2 (en) * | 2007-12-19 | 2012-05-08 | Qualcomm Incorporated | Systems, methods, and apparatus for multi-microphone based speech enhancement |
JP4474488B1 (ja) * | 2009-04-23 | 2010-06-02 | パナソニック株式会社 | 音声受信装置、音声処理方法、プログラムおよび音声受信システム |
EP2567377A4 (en) * | 2010-05-03 | 2016-10-12 | Aliphcom | WIND REMOVAL / REPLACEMENT COMPONENT FOR USE WITH ELECTRONIC SYSTEMS |
EP2405634B1 (en) * | 2010-07-09 | 2014-09-03 | Google, Inc. | Method of indicating presence of transient noise in a call and apparatus thereof |
US9288331B2 (en) * | 2011-08-16 | 2016-03-15 | Cisco Technology, Inc. | System and method for muting audio associated with a source |
US9282405B2 (en) * | 2012-04-24 | 2016-03-08 | Polycom, Inc. | Automatic microphone muting of undesired noises by microphone arrays |
JP6113303B2 (ja) * | 2012-12-27 | 2017-04-12 | ローベルト ボツシユ ゲゼルシヤフト ミツト ベシユレンクテル ハフツングRobert Bosch Gmbh | 会議システム及び会議システムにおけるボイスアクティベーションのための処理方法 |
US9607630B2 (en) * | 2013-04-16 | 2017-03-28 | International Business Machines Corporation | Prevention of unintended distribution of audio information |
US9215543B2 (en) * | 2013-12-03 | 2015-12-15 | Cisco Technology, Inc. | Microphone mute/unmute notification |
US9294858B2 (en) * | 2014-02-26 | 2016-03-22 | Revo Labs, Inc. | Controlling acoustic echo cancellation while handling a wireless microphone |
US9560316B1 (en) * | 2014-08-21 | 2017-01-31 | Google Inc. | Indicating sound quality during a conference |
JP2016051038A (ja) * | 2014-08-29 | 2016-04-11 | 株式会社Jvcケンウッド | ノイズゲート装置 |
US10499164B2 (en) * | 2015-03-18 | 2019-12-03 | Lenovo (Singapore) Pte. Ltd. | Presentation of audio based on source |
US10971169B2 (en) * | 2017-05-19 | 2021-04-06 | Audio-Technica Corporation | Sound signal processing device |
-
2018
- 2018-03-15 US US16/614,628 patent/US10971169B2/en active Active
- 2018-03-15 EP EP18802763.5A patent/EP3627853A4/en active Pending
- 2018-03-15 CN CN201880032965.5A patent/CN110663258B/zh active Active
- 2018-03-15 JP JP2019519088A patent/JP7004332B2/ja active Active
- 2018-03-15 WO PCT/JP2018/010328 patent/WO2018211806A1/ja active Application Filing
Non-Patent Citations (1)
Title |
---|
See also references of EP3627853A4 |
Cited By (23)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US11832053B2 (en) | 2015-04-30 | 2023-11-28 | Shure Acquisition Holdings, Inc. | Array microphone system and method of assembling the same |
US11678109B2 (en) | 2015-04-30 | 2023-06-13 | Shure Acquisition Holdings, Inc. | Offset cartridge microphones |
US11310592B2 (en) | 2015-04-30 | 2022-04-19 | Shure Acquisition Holdings, Inc. | Array microphone system and method of assembling the same |
US11477327B2 (en) | 2017-01-13 | 2022-10-18 | Shure Acquisition Holdings, Inc. | Post-mixing acoustic echo cancellation systems and methods |
US11800281B2 (en) | 2018-06-01 | 2023-10-24 | Shure Acquisition Holdings, Inc. | Pattern-forming microphone array |
US11523212B2 (en) | 2018-06-01 | 2022-12-06 | Shure Acquisition Holdings, Inc. | Pattern-forming microphone array |
US11297423B2 (en) | 2018-06-15 | 2022-04-05 | Shure Acquisition Holdings, Inc. | Endfire linear array microphone |
US11770650B2 (en) | 2018-06-15 | 2023-09-26 | Shure Acquisition Holdings, Inc. | Endfire linear array microphone |
US11310596B2 (en) | 2018-09-20 | 2022-04-19 | Shure Acquisition Holdings, Inc. | Adjustable lobe shape for array microphones |
US11778368B2 (en) | 2019-03-21 | 2023-10-03 | Shure Acquisition Holdings, Inc. | Auto focus, auto focus within regions, and auto placement of beamformed microphone lobes with inhibition functionality |
US11438691B2 (en) | 2019-03-21 | 2022-09-06 | Shure Acquisition Holdings, Inc. | Auto focus, auto focus within regions, and auto placement of beamformed microphone lobes with inhibition functionality |
US11558693B2 (en) | 2019-03-21 | 2023-01-17 | Shure Acquisition Holdings, Inc. | Auto focus, auto focus within regions, and auto placement of beamformed microphone lobes with inhibition and voice activity detection functionality |
US11303981B2 (en) | 2019-03-21 | 2022-04-12 | Shure Acquisition Holdings, Inc. | Housings and associated design features for ceiling array microphones |
US11800280B2 (en) | 2019-05-23 | 2023-10-24 | Shure Acquisition Holdings, Inc. | Steerable speaker array, system and method for the same |
US11445294B2 (en) | 2019-05-23 | 2022-09-13 | Shure Acquisition Holdings, Inc. | Steerable speaker array, system, and method for the same |
WO2020243471A1 (en) * | 2019-05-31 | 2020-12-03 | Shure Acquisition Holdings, Inc. | Low latency automixer integrated with voice and noise activity detection |
US11688418B2 (en) | 2019-05-31 | 2023-06-27 | Shure Acquisition Holdings, Inc. | Low latency automixer integrated with voice and noise activity detection |
US11302347B2 (en) | 2019-05-31 | 2022-04-12 | Shure Acquisition Holdings, Inc. | Low latency automixer integrated with voice and noise activity detection |
US11750972B2 (en) | 2019-08-23 | 2023-09-05 | Shure Acquisition Holdings, Inc. | One-dimensional array microphone with improved directivity |
US11297426B2 (en) | 2019-08-23 | 2022-04-05 | Shure Acquisition Holdings, Inc. | One-dimensional array microphone with improved directivity |
US11552611B2 (en) | 2020-02-07 | 2023-01-10 | Shure Acquisition Holdings, Inc. | System and method for automatic adjustment of reference gain |
US11706562B2 (en) | 2020-05-29 | 2023-07-18 | Shure Acquisition Holdings, Inc. | Transducer steering and configuration systems and methods using a local positioning system |
US11785380B2 (en) | 2021-01-28 | 2023-10-10 | Shure Acquisition Holdings, Inc. | Hybrid audio beamforming system |
Also Published As
Publication number | Publication date |
---|---|
US20200152218A1 (en) | 2020-05-14 |
JPWO2018211806A1 (ja) | 2020-03-19 |
CN110663258A (zh) | 2020-01-07 |
CN110663258B (zh) | 2021-08-03 |
JP7004332B2 (ja) | 2022-01-21 |
EP3627853A1 (en) | 2020-03-25 |
US10971169B2 (en) | 2021-04-06 |
EP3627853A4 (en) | 2021-02-24 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
WO2018211806A1 (ja) | Audio signal processing device | |
US8284947B2 (en) | Reverberation estimation and suppression system | |
JP4744874B2 (ja) | Sound detection and identification system | |
US9959886B2 (en) | Spectral comb voice activity detection | |
US20180197525A1 (en) | Noise detector and sound signal output device | |
CA2390287C (en) | Acoustic source range detection system | |
JPH11327582A (ja) | Voice detection system in noisy environments | |
US11621017B2 (en) | Event detection for playback management in an audio device | |
US10581386B2 (en) | Protective device | |
US20020103636A1 (en) | Frequency-domain post-filtering voice-activity detector | |
US20190355380A1 (en) | Audio signal processing | |
JP3500953B2 (ja) | Method and apparatus for setting up an audio reproduction system | |
GB2563868A (en) | Sound responsive device and method | |
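JPH0327698A (ja) | Acoustic signal detection method | |
JPS63118197A (ja) | Voice detection device | |
JPS62287297A (ja) | Voice detection device | |
JPH09292894A (ja) | Speech recognition method and apparatus | |
JP3901425B2 (ja) | Voice detection device | |
JPS63166346A (ja) | Multi-frequency comparison type hands-free circuit | |
JP2020166148A (ja) | Sound collection control device, sound collection control program, and conference support system | |
JPH0527795A (ja) | Speech recognition device | |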
EP3753013A1 (en) | Speech processing apparatus, method, and program | |
EP3332558A2 (en) | Event detection for playback management in an audio device | |
JPH03220600A (ja) | Voice detection device | |
JPS5923397A (ja) | Speech recognition device |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| | 121 | EP: the EPO has been informed by WIPO that EP was designated in this application | Ref document number: 18802763; Country of ref document: EP; Kind code of ref document: A1 |
| | ENP | Entry into the national phase | Ref document number: 2019519088; Country of ref document: JP; Kind code of ref document: A |
| | NENP | Non-entry into the national phase | Ref country code: DE |
| | WWE | WIPO information: entry into national phase | Ref document number: 2018802763; Country of ref document: EP |
| | ENP | Entry into the national phase | Ref document number: 2018802763; Country of ref document: EP; Effective date: 20191219 |