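WO2012096073A1 - Voice processing device, control method therefor, storage medium storing a control program therefor, vehicle provided with the voice processing device, information processing device, and information processing system - Google Patents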
- Publication number
- WO2012096073A1 (PCT/JP2011/077996)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- sound
- microphone
- mixed
- noise
- signal
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/02—Speech enhancement, e.g. noise reduction or echo cancellation
- G10L21/0208—Noise filtering
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04R—LOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
- H04R3/00—Circuits for transducers, loudspeakers or microphones
- H04R3/005—Circuits for transducers, loudspeakers or microphones for combining the signals of two or more microphones
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/02—Speech enhancement, e.g. noise reduction or echo cancellation
- G10L21/0208—Noise filtering
- G10L21/0216—Noise filtering characterised by the method used for estimating noise
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/20—Speech recognition techniques specially adapted for robustness in adverse environments, e.g. in noise, of stress induced speech
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04R—LOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
- H04R1/00—Details of transducers, loudspeakers or microphones
- H04R1/20—Arrangements for obtaining desired frequency or directional characteristics
- H04R1/32—Arrangements for obtaining desired frequency or directional characteristics for obtaining desired directional characteristic only
- H04R1/34—Arrangements for obtaining desired frequency or directional characteristics for obtaining desired directional characteristic only by using a single transducer with sound reflecting, diffracting, directing or guiding means
- H04R1/342—Arrangements for obtaining desired frequency or directional characteristics for obtaining desired directional characteristic only by using a single transducer with sound reflecting, diffracting, directing or guiding means for microphones
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04R—LOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
- H04R2410/00—Microphones
- H04R2410/05—Noise reduction with a separate noise microphone
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04R—LOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
- H04R2499/00—Aspects covered by H04R or H04S not otherwise provided for in their subgroups
- H04R2499/10—General applications
- H04R2499/11—Transducers incorporated or for use in hand-held devices, e.g. mobile phones, PDA's, camera's
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04R—LOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
- H04R2499/00—Aspects covered by H04R or H04S not otherwise provided for in their subgroups
- H04R2499/10—General applications
- H04R2499/13—Acoustic transducers and sound field adaptation in vehicles
Definitions
- the present invention relates to a technique for acquiring pseudo sound from mixed sound in which desired sound and noise are mixed.
- Patent Document 1 discloses a technique for suppressing noise from outside the vehicle with respect to in-vehicle sound in the vehicle.
- the external noise is suppressed using an adaptive filter based on the output signal of the microphone that picks up the in-vehicle sound and the output signal of the microphone that picks up the external noise.
- the technique of the above-mentioned Patent Document 1 is configured to block, at each microphone, whichever of the desired voice and the noise is not that microphone's primary input. For this reason, if the desired voice input to the voice pickup microphone is weak, the restored pseudo voice is also weak. Conversely, if the noise input to the noise pickup microphone is weak, the accuracy of the noise estimate used for suppression falls, and the restored pseudo voice becomes unstable.
- An object of the present invention is to provide a technique for solving the above-described problems.
- an apparatus according to the present invention includes: a first microphone that receives a first mixed sound in which a desired voice and noise are mixed and outputs a first mixed signal; a second microphone that is open to the same sound space as the first microphone, receives a second mixed sound in which the desired voice and the noise are mixed at a ratio different from that of the first mixed sound, and outputs a second mixed signal;
- a first sound collecting unit having a concave surface that collects the first mixed sound onto the first microphone;
- a second sound collecting unit that has a concave surface that collects the second mixed sound onto the second microphone and that is oriented in a direction different from that of the first sound collecting unit; and
- a noise suppression circuit that suppresses an estimated noise signal based on the first mixed signal and the second mixed signal and outputs a pseudo voice signal.
- a vehicle according to the present invention is equipped with the voice processing device. The first microphone and the first sound collecting unit are arranged at a position where the first sound collecting unit collects, onto the first microphone, a desired voice uttered by a passenger in the vehicle. The second microphone and the second sound collecting unit are arranged at a position where the second sound collecting unit collects, onto the second microphone, noise generated by a noise source in the vehicle.
- an information processing apparatus according to the present invention is provided with the voice processing apparatus, wherein
- the first microphone and the first sound collecting unit are arranged at a position where the first sound collecting unit collects, onto the first microphone, a desired voice uttered by an operator of the information processing apparatus, and
- the second microphone and the second sound collecting unit are arranged at a position where the second sound collecting unit collects, onto the second microphone, noise generated by a noise source in the same sound space as the operator.
- a system according to the present invention includes: the voice processing device; a voice recognition device that recognizes a desired voice from the pseudo voice signal output by the voice processing device; and an information processing apparatus that processes information according to the desired voice recognized by the voice recognition device.
- a method according to the present invention is a method for controlling a voice processing apparatus that includes: a first microphone that receives a first mixed sound in which a desired voice and noise are mixed and outputs a first mixed signal; a second microphone that is open to the same sound space as the first microphone, receives a second mixed sound in which the desired voice and the noise are mixed at a ratio different from that of the first mixed sound, and outputs a second mixed signal;
- a first sound collecting unit having a concave surface that collects the first mixed sound onto the first microphone;
- a second sound collecting unit that has a concave surface that collects the second mixed sound onto the second microphone and that is oriented in a direction different from that of the first sound collecting unit; and
- a noise suppression circuit that suppresses an estimated noise signal based on the first mixed signal and the second mixed signal and outputs a pseudo voice signal.
- the method includes: obtaining a parameter of the noise suppression circuit; determining, according to that parameter, a direction of the second sound collecting unit such that the ratio of the noise in the second mixed sound input to the second microphone increases; and controlling the direction of the second sound collecting unit accordingly.
- a storage medium according to the present invention stores a control program for a voice processing device that includes: a first microphone that receives a first mixed sound in which a desired voice and noise are mixed and outputs a first mixed signal; a second microphone that is open to the same sound space as the first microphone, receives a second mixed sound in which the desired voice and the noise are mixed at a ratio different from that of the first mixed sound, and outputs a second mixed signal;
- a first sound collecting unit having a concave surface that collects the first mixed sound onto the first microphone;
- a second sound collecting unit that has a concave surface that collects the second mixed sound onto the second microphone and that is oriented in a direction different from that of the first sound collecting unit; and
- a noise suppression circuit that suppresses an estimated noise signal based on the first mixed signal and the second mixed signal and outputs a pseudo voice signal.
- the control program causes a computer to execute: obtaining a parameter of the noise suppression circuit; determining, according to that parameter, a direction of the second sound collecting unit such that the ratio of the noise in the second mixed sound input to the second microphone increases; and controlling the direction of the second sound collecting unit accordingly.
- the desired voice and noise are collected respectively, and the noise can be accurately estimated to restore a pseudo voice close to the desired voice.
- the sound processing apparatus 100 includes a first microphone 101, a second microphone 103, a first sound collection unit 111, a second sound collection unit 112, and a noise suppression circuit 106.
- the first microphone 101 inputs a first mixed sound 108 in which desired voice and noise are mixed and outputs a first mixed signal 102.
- the second microphone 103 is open to the same sound space 110 as the first microphone 101, receives a second mixed sound 109 in which the desired voice and the noise are mixed at a ratio different from that of the first mixed sound 108, and outputs a second mixed signal 104.
- the first sound collection unit 111 includes a concave surface 111 a that collects the first mixed sound 108 with respect to the first microphone 101.
- the second sound collection unit 112 includes a concave surface 112 a that collects the second mixed sound 109 with respect to the second microphone 103, and is disposed in a different direction from the first sound collection unit 111.
- the noise suppression circuit 106 suppresses the estimated noise signal based on the first mixed signal 102 and the second mixed signal 104 and outputs a pseudo audio signal 107.
- the desired voice and noise are collected respectively, and the noise is accurately estimated to restore a pseudo voice close to the desired voice.
- the second embodiment has a microphone set in which a first microphone, a second microphone, a first sound collection unit, and a second sound collection unit are fixed integrally.
- by placing this microphone set at a desired position in consideration of the positions of the sound source and the noise source, the desired sound and the noise can each be collected, with a simple configuration, in the same sound space where they are mixed. The noise can then be estimated accurately, and a pseudo sound close to the desired sound can be restored.
- FIG. 2 is a block diagram illustrating a configuration of an information processing system 200 including the audio processing device 220 according to the present embodiment.
- the audio processing device 220 includes a microphone set 230 in which a first microphone, a second microphone, a first sound collection unit, and a second sound collection unit are integrally fixed, and a noise suppression circuit 206.
- the information processing system 200 includes a voice processing device 220, a voice recognition device 208, and an information processing device 209.
- the first microphone in the microphone set 230 converts the first mixed sound, in which the desired sound collected by the first sound collecting unit and the wraparound noise are mixed, into the first mixed signal 202, in which a voice signal and a noise signal are mixed, and transmits it to the noise suppression circuit 206.
- the second mixed sound in which the noise collected by the second sound collecting unit and the wraparound sound are mixed in a different ratio from the first mixed sound is input to the second microphone in the microphone set 230.
- the second microphone converts the second mixed sound into a second mixed signal 204 in which an audio signal and a noise signal are mixed at a different ratio from the first mixed signal, and transmits the second mixed signal 204 to the noise suppression circuit 206.
- the noise suppression circuit 206 outputs a pseudo audio signal 207 based on the transmitted first mixed signal 202 and second mixed signal 204.
- the pseudo voice signal 207 is recognized by the voice recognition device 208 and the information processing device 209 processes information by the recognized voice.
- the information processing apparatus 209 may perform processing according to a voice message or may process the voice input itself as information.
- in this configuration, the mixed sound, in which the desired sound and the noise generated in the same sound space are mixed, is input at different mixing ratios to the first microphone, onto which the concave surface of the first sound collection unit collects the desired sound, and to the second microphone, onto which the concave surface of the second sound collection unit collects the noise.
- the noise suppression circuit 206 then restores the pseudo voice signal based on the first mixed signal from the first microphone and the second mixed signal from the second microphone, and the voice recognition device 208 recognizes the restored pseudo voice signal. The information processing device 209 performs information processing based on the recognized voice.
- the signal lines that transmit the first and second mixed signals 202 and 204 may also carry a return signal, such as ground, and the power supply for operating the microphones.
- the noise suppression circuit 206 may be attached to the microphone set 230. In that case, a pseudo audio signal is output from the microphone set.
- this embodiment is described using voice recognition as an example, but the invention is not limited to this.
- accurate restoration of the desired voice is useful in other applications as well: the present invention can be applied to a telephone or to the operation of a vehicle or a device.
- <<Configuration of a microphone set including a fixed sound collecting unit according to this embodiment>>
- the first and second sound collection units are fixedly arranged in advance at predetermined positions.
- two configuration examples of the microphone set will be described, but the present invention is not limited to this.
- FIG. 3A is a diagram illustrating an example 230-1 of the microphone set 230 including the fixed sound collection unit according to the present embodiment.
- the microphone set 230-1 includes a first microphone 301, a second microphone 303, and a microphone support member 305 in which the first microphone 301 and the second microphone 303 are arranged on both sides.
- the sound reflection surfaces 305a and 305b on which the first microphone 301 and the second microphone 303 are arranged form a concave surface formed of a quadratic curved surface or a pseudo curved surface approximating a quadratic curved surface.
- the first microphone 301 and the second microphone 303 are arranged at the focal positions of the quadratic curved surfaces or of the pseudo curved surfaces that approximate them.
- as shown in the figure, the sound reflection surfaces 305a and 305b of the microphone support member 305 are formed symmetrically, and the first microphone 301 and the second microphone 303 are disposed on opposite sides of the microphone support member 305. That is, the first microphone 301 is attached to one surface of the microphone support member 305, and the second microphone 303 is attached to the other surface. The first mixed signal 202 and the second mixed signal 204 are then output to the noise suppression circuit 206 from the first microphone 301 and the second microphone 303, respectively.
- the sound 311 directed to the sound reflecting surface 305a, which is a quadric surface or a pseudo curved surface approximating one, is reflected by the sound reflecting surface 305a and collected by the first microphone 301. Therefore, the sound reflection surface 305a functions as a first sound collection unit.
- the noise 322 from the noise source 320 also wraps around to the first microphone 301, so the first mixed sound, in which the collected sound 311 and the noise 322 are mixed, is input to the first microphone 301.
- the noise 321 directed to the sound reflecting surface 305b, which is a quadric surface or a pseudo curved surface approximating one, is reflected by the sound reflecting surface 305b and collected by the second microphone 303. Therefore, the sound reflection surface 305b functions as a second sound collection unit.
- the sound 312 from the sound source 310 also wraps around to the second microphone 303, so the second mixed sound, in which the collected noise 321 and the sound 312 are mixed, is input to the second microphone 303.
- the microphone support member 305 is preferably a sound insulator that blocks sound transmission.
- FIG. 3B is a diagram showing another example 230-2 of the microphone set 230 including the fixed sound collection unit according to the present embodiment.
- the microphone set 230-2 includes a first microphone 301, a second microphone 303, and a microphone support member 355 in which the first microphone 301 and the second microphone 303 are arranged on both sides.
- the sound reflection surfaces 355a and 355b on which the first microphone 301 and the second microphone 303 are arranged form a concave surface formed of a quadratic curved surface or a pseudo curved surface approximating a quadratic curved surface.
- the first microphone 301 and the second microphone 303 are arranged at the focal positions of the quadratic curved surfaces or of the pseudo curved surfaces that approximate them.
- as shown in the figure, the sound reflection surfaces 355a and 355b of the microphone support member 355 are formed at an angle to each other so that the axes of the curved surfaces face the sound source and the noise source, respectively.
- the first mixed signal 202 and the second mixed signal 204 are output to the noise suppression circuit 206 from the first microphone 301 and the second microphone 303, respectively.
- the sound 311 directed to the sound reflecting surface 355a, which is a quadric surface or a pseudo curved surface approximating one, is reflected by the sound reflecting surface 355a and collected by the first microphone 301. Therefore, the sound reflection surface 355a functions as a first sound collection unit.
- the noise 322 from the noise source 320 also wraps around to the first microphone 301, so the first mixed sound, in which the collected sound 311 and the noise 322 are mixed, is input to the first microphone 301.
- the noise 321 directed to the sound reflecting surface 355b, which is a quadric surface or a pseudo curved surface approximating one, is reflected by the sound reflecting surface 355b and collected by the second microphone 303. Therefore, the sound reflection surface 355b functions as a second sound collection unit.
- the sound 312 from the sound source 310 also wraps around to the second microphone 303, so the second mixed sound, in which the collected noise 321 and the sound 312 are mixed, is input to the second microphone 303.
- the microphone support member 355 is preferably a sound insulator that blocks sound transmission.
- as the sound insulator, a material with a large mass and a high density is desirable. Such materials require more energy to vibrate and can therefore prevent sound from passing through.
- the surface of the sound insulator is preferably a hard material, and the inside of the sound insulator is preferably a soft material. Hard materials reflect sound readily, so a hard surface lets the microphone collect sound reflected by the sound insulator in addition to the sound that enters the microphone directly. Soft materials absorb sound readily, so a soft material inside the sound insulator prevents unwanted sound from passing through.
- it is desirable that the surface material on the first microphone side and the surface material on the second microphone side be separated rather than continuous. If the structure were continuous, sound would propagate through the surface material and pass through the sound insulator. A three-layer structure, in which a soft material is sandwiched between the hard materials of the two surfaces, is therefore desirable.
- the sound reflecting surfaces 305a, 305b, 355a, and 355b of FIGS. 3A and 3B, which are quadric surfaces or pseudo curved surfaces approximating them, collect sound at the focal position.
- sound collection by a quadric surface will be described with reference to FIG. 4A, and sound collection by a pseudo curved surface that approximates a quadric surface will be described with reference to FIG. 4B.
- FIG. 4A is a diagram for explaining sound collection by the microphone support member 405 having the quadric surface 405a serving as the sound collection unit according to the present embodiment.
- the line segments indicated by 406 and 408 are tangent lines of the quadric surface 405a.
- the sound 411 from the sound source 410 is reflected at the points of tangency on the quadric surface 405a with equal angles of incidence and reflection, θ1 and θ2 respectively, measured from the normals 407 and 409, which are perpendicular to the line segments 406 and 408.
- the sound 411 is thereby collected by the microphone 401 located at the focal point of the quadric surface 405a.
- FIG. 4B is a view for explaining sound collection by the microphone support member 455 having the pseudo curved surface 455a serving as the sound collection unit according to this embodiment.
- the pseudo curved surface 455a is an aggregate of planes extending in the tangential direction of the quadric surface.
- the line segments indicated by 456 and 458 are the surfaces of the pseudo curved surface 455a.
- the sound 411 from the sound source 410 is reflected with equal angles of incidence and reflection, θ1 and θ2 respectively, measured from the normals 457 and 459, which are perpendicular to the line segments 456 and 458.
- the sound 411 is collected by the microphone 401 positioned at the focal point of the pseudo curved surface 455a.
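The focusing behavior described for FIGS. 4A and 4B can be checked numerically. The following is a minimal sketch, not taken from the patent: it assumes a two-dimensional cross-section, takes a parabola y = x²/(4f) as the quadric surface, and assumes that sound arrives parallel to the axis (a distant source). The function names, the focal length, and the facet width are all hypothetical. Each ray is reflected with equal angles of incidence and reflection, first off the true curve and then off a flat facet lying on a tangent line, and the sketch reports how far the reflected ray passes from the focus.

```python
import math


def reflect(d, n):
    """Mirror direction d about unit normal n (angle of incidence = angle of reflection)."""
    dot = d[0] * n[0] + d[1] * n[1]
    return (d[0] - 2 * dot * n[0], d[1] - 2 * dot * n[1])


def miss_distance(point, direction, target):
    """Perpendicular distance from target to the line through point along direction."""
    px, py = target[0] - point[0], target[1] - point[1]
    norm = math.hypot(direction[0], direction[1])
    return abs(px * direction[1] - py * direction[0]) / norm


def parabola_normal(x, f):
    """Unit normal of y = x^2 / (4 f) at abscissa x, pointing into the concave side."""
    slope = x / (2 * f)
    length = math.hypot(slope, 1.0)
    return (-slope / length, 1.0 / length)


def focus_error_exact(x, f):
    """A ray parallel to the axis hits the true curve at x; return its miss distance at the focus."""
    hit = (x, x * x / (4 * f))
    out = reflect((0.0, -1.0), parabola_normal(x, f))
    return miss_distance(hit, out, (0.0, f))


def focus_error_faceted(x, f, facet_width):
    """Same ray, but the surface is a flat facet tangent to the curve at the facet centre."""
    centre = (math.floor(x / facet_width) + 0.5) * facet_width
    slope = centre / (2 * f)
    # The facet lies on the tangent line drawn at `centre`.
    hit = (x, centre * centre / (4 * f) + slope * (x - centre))
    out = reflect((0.0, -1.0), parabola_normal(centre, f))
    return miss_distance(hit, out, (0.0, f))
```

In this simplified model the true curve sends every ray through the focus, while a faceted surface misses the focus by the ray's horizontal offset from the facet centre, so the miss is at most half the facet width. Under these assumptions, a microphone whose diaphragm is wider than one facet would still receive the reflected sound.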
- FIG. 5 is a diagram showing a configuration of the noise suppression circuit 206 according to the present embodiment.
- the noise suppression circuit 206 includes a subtractor 501 that subtracts, from the first mixed signal 202, the estimated noise signal Y1 estimated to be mixed in the first mixed signal 202.
- it also includes a subtractor 503 that subtracts, from the second mixed signal 204, the estimated audio signal Y2 estimated to be mixed in the second mixed signal 204.
- it further includes an adaptive filter NF502, an estimated noise signal generation unit that generates the estimated noise signal Y1 from the pseudo noise signal E2, which is the output signal of the subtractor 503.
- finally, it includes an adaptive filter XF504, an estimated audio signal generation unit that generates the estimated audio signal Y2 from the pseudo audio signal E1 (207), which is the output signal of the subtractor 501.
- a specific example of the adaptive filter XF504 is described in International Publication No. 2005/024787. Even when the target sound wraps around into the second microphone 203 and its sound signal is mixed into the second mixed signal 204, the adaptive filter XF504 prevents the subtractor 501 from erroneously removing that sound signal from the first mixed signal 202.
- the subtractor 501 subtracts the estimated noise signal Y1 from the first mixed signal 202 transmitted from the first microphone 201, and outputs a pseudo audio signal E1 (207).
- the adaptive filter NF502 generates the estimated noise signal Y1 by processing the pseudo noise signal E2 with parameters that are updated based on the pseudo audio signal E1 (207).
- the pseudo noise signal E2 is a signal obtained by subtracting the estimated audio signal Y2 by the subtractor 503 from the second mixed signal 204 transmitted from the second microphone 203 through the signal line.
- the adaptive filter XF504 generates the estimated sound signal Y2 by processing the pseudo sound signal E1 (207) with parameters that are updated based on the estimated sound signal Y2.
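The signal flow of FIG. 5 can be sketched in software as a cross-coupled pair of adaptive filters. The text does not specify the filter structure or the update rule, so everything below is an assumption made for illustration: both filters are short FIR filters, both are updated with normalized LMS (NF driven by E1, XF driven by E2 = second mixed signal minus Y2), and XF reads E1 with a one-sample delay so that the loop between the two subtractors can be computed sample by sample. The names `cross_coupled_canceller` and `demo`, the tap count, and the step sizes are hypothetical.

```python
import math
import random


def cross_coupled_canceller(x1, x2, taps=4, mu_nf=0.05, mu_xf=0.005, eps=1e-8):
    """Return (E1, E2): pseudo audio signal and pseudo noise signal, one sample per input pair."""
    nf = [0.0] * taps        # coefficients of adaptive filter NF (assumed FIR)
    xf = [0.0] * taps        # coefficients of adaptive filter XF (assumed FIR)
    e1_hist = [0.0] * taps   # past E1 samples, newest first (assumed one-sample delay)
    e2_hist = [0.0] * taps   # E2 samples, newest first (includes the current sample)
    e1_out, e2_out = [], []
    for s1, s2 in zip(x1, x2):
        y2 = sum(w * e for w, e in zip(xf, e1_hist))   # estimated audio signal Y2
        e2 = s2 - y2                                   # subtractor 503 -> pseudo noise E2
        e2_hist = [e2] + e2_hist[:-1]
        y1 = sum(w * e for w, e in zip(nf, e2_hist))   # estimated noise signal Y1
        e1 = s1 - y1                                   # subtractor 501 -> pseudo audio E1
        # Normalized LMS updates (an assumed update rule, not stated in the text).
        p2 = sum(e * e for e in e2_hist) + eps
        nf = [w + mu_nf * e1 * e / p2 for w, e in zip(nf, e2_hist)]
        p1 = sum(e * e for e in e1_hist) + eps
        xf = [w + mu_xf * e2 * e / p1 for w, e in zip(xf, e1_hist)]
        e1_hist = [e1] + e1_hist[:-1]
        e1_out.append(e1)
        e2_out.append(e2)
    return e1_out, e2_out


def demo(n=4000, seed=0):
    """Synthetic check: residual noise power in the last 1000 samples, before and after."""
    rng = random.Random(seed)
    voice = [0.5 * math.sin(2 * math.pi * 0.01 * k) for k in range(n)]
    noise = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    x1 = [v + 0.5 * w for v, w in zip(voice, noise)]   # voice-side microphone
    x2 = list(noise)                                   # noise-side microphone, no wraparound voice
    e1, _ = cross_coupled_canceller(x1, x2, mu_xf=0.0)  # XF frozen for this simple case
    tail = range(n - 1000, n)
    before = sum((x1[k] - voice[k]) ** 2 for k in tail) / 1000
    after = sum((e1[k] - voice[k]) ** 2 for k in tail) / 1000
    return before, after
```

`demo()` covers only the simplest case, in which the second microphone picks up pure noise and XF is frozen, and it compares the residual noise power before and after suppression. When the voice also wraps around into the second microphone, letting both filters adapt freely can mislead the estimate, which is presumably why the text refers to International Publication No. 2005/024787 for a concrete design of XF504.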
- the noise suppression circuit 206 may be an analog circuit, a digital circuit, or a mixed circuit thereof. If the noise suppression circuit 206 is an analog circuit, the pseudo audio signal E1 (207) is converted into a digital signal by an A / D converter when used for digital control. On the other hand, if the noise suppression circuit 206 is a digital circuit, the signal from the microphone is converted into a digital signal by the A / D converter before entering the noise suppression circuit 206.
- as an example of a mixed circuit, the subtractors 501 and 503 may be configured as analog circuits, and the adaptive filter NF502 and the adaptive filter XF504 as analog circuits controlled by a digital circuit.
- further, the configuration of the noise suppression circuit 206 is not limited to that of FIG. 5. For example, the adaptive filter XF504 of FIG. 5 can be replaced with a circuit that outputs a constant level in order to filter the wraparound sound.
- the subtractors 501 and/or 503 can also be replaced with an integrator by representing the estimated noise signal Y1 and the estimated speech signal Y2 as coefficients that are integrated with the first mixed signal 202 and the second mixed signal 204, respectively.
- FIG. 6 is a block diagram illustrating a configuration of an information processing system 600 including the audio processing device 620 according to the present embodiment.
- the sound processing device 620 includes a microphone set 630, in which a first microphone, a second microphone, a first sound collection unit, a second sound collection unit, and a movable unit that makes the second sound collection unit movable are integrally fixed; a noise suppression circuit 606; and a sound collection control unit 640.
- the information processing system 600 includes a voice processing device 620, a voice recognition device 208, and an information processing device 209.
- the first microphone in the microphone set 630 converts the first mixed sound, in which the desired sound collected by the first sound collecting unit and the wraparound noise are mixed, into the first mixed signal 202, in which a voice signal and a noise signal are mixed, and transmits it to the noise suppression circuit 606.
- the second mixed sound in which the noise collected by the second sound collecting unit and the wraparound sound are mixed at a different ratio from the first mixed sound is input to the second microphone in the microphone set 630.
- the second microphone converts the second mixed sound into a second mixed signal 204 in which an audio signal and a noise signal are mixed at a different ratio from the first mixed signal, and transmits the second mixed signal 204 to the noise suppression circuit 606.
- the second sound collection unit of the microphone set 630 is moved by the control signal 641 from the sound collection control unit 640 so that the input of noise increases.
- the noise suppression circuit 606 outputs a pseudo audio signal 207 based on the transmitted first mixed signal 202 and second mixed signal 204.
- the pseudo voice signal 207 is recognized by the voice recognition device 208 and the information processing device 209 processes information by the recognized voice.
- the information processing apparatus 209 may perform processing according to a voice message or may process the voice input itself as information.
- the sound collection control unit 640 outputs a control signal 641 for changing the sound collection direction of the second sound collection unit in the microphone set 630 according to the pseudo sound signal 207 and the parameter 607 of the noise suppression circuit 606.
- in this configuration, the mixed sound is input at different mixing ratios to the first microphone, onto which the first sound collecting unit collects the desired sound, and to the second microphone, onto which the second sound collecting unit collects the noise. The noise suppression circuit 606 then restores the pseudo voice signal based on the first mixed signal from the first microphone and the second mixed signal from the second microphone, and the voice recognition device 208 recognizes the restored pseudo voice signal. The information processing device 209 performs information processing based on the recognized voice.
- the signal lines that transmit the first and second mixed signals 202 and 204 may also carry a return signal, such as ground, and the power supply for operating the microphones.
- the noise suppression circuit 606 and the sound collection control unit 640 may be attached to the microphone set 630. In that case, a pseudo audio signal is output from the microphone set.
- this embodiment is described using voice recognition as an example, but the invention is not limited to this.
- accurate restoration of the desired voice is useful in other applications as well: the present invention can be applied to a telephone or to the operation of a vehicle or a device.
- FIG. 7 is a diagram showing an example 630-1 of the microphone set 630 including the sound reflecting surface 752a serving as the moving second sound collection unit according to the present embodiment.
- the movable part that moves the second sound collecting part is not shown.
- a step motor or the like is arranged to automatically adjust the direction of the second sound collection unit.
- the microphone set 630-1 includes a first microphone 301, a second microphone 303, a first microphone support member 751 on which the first microphone 301 is disposed, and a second microphone support member 752 on which the second microphone 303 is disposed. including.
- the sound reflection surfaces 751a and 752a on which the first microphone 301 and the second microphone 303 are arranged are quadratic curved surfaces or pseudo curved surfaces that approximate quadratic curved surfaces.
- a concave surface is formed.
- The first microphone 301 and the second microphone 303 are each arranged at the focal position of the quadric surface or of the pseudo curved surface approximating a quadric surface.
- As shown in the figure, the first microphone support member 751 is arranged in a predetermined direction so as to collect the desired sound, whereas the second microphone support member 752 is mounted on a shaft 753 so that it can rotate in the direction of arrow 754 and be pointed in a direction that collects noise.
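- The reason for placing each microphone at the focal position can be illustrated with the simplest quadric reflector, a paraboloid of revolution. The following is a worked example under the assumption that the reflecting surface is a paraboloid with focal length $f$ and that the sound arrives parallel to its axis; $f$ and the aperture height $h$ are symbols introduced here, since the text only requires a quadric surface or an approximation of one.

```latex
% Paraboloid of revolution with its axis along z and its vertex at the origin:
z = \frac{x^{2} + y^{2}}{4f}
% Sound arriving parallel to the z axis is reflected through the focus
F = (0,\ 0,\ f)
% and every such path from the aperture plane z = h to F has the same length,
% because x^2 + y^2 = 4 f z on the surface:
(h - z) + \sqrt{x^{2} + y^{2} + (z - f)^{2}} = (h - z) + (z + f) = h + f
```

- Because all on-axis paths have equal length, sound from the direction the surface faces adds in phase at the focus, while sound from other directions does not, which is consistent with each reflecting surface acting as a sound collection unit for its own microphone.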
- the first mixed signal 202 and the second mixed signal 204 are output to the noise suppression circuit 206 from the first microphone 301 and the second microphone 303, respectively.
- Of the sound from the sound source 310 that utters the desired sound, the sound 311 directed toward the sound reflection surface 751a, which is a quadric surface or a pseudo curved surface approximating a quadric surface, is reflected by the sound reflection surface 751a and collected at the first microphone 301. The sound reflection surface 751a therefore functions as the first sound collection unit.
- Because the noise 322 from the noise source 320 that generates noise also wraps around, the first mixed sound, in which the collected sound 311 and the noise 322 are mixed, is input to the first microphone 301.
- Similarly, the noise 321 directed toward the sound reflecting surface 752a, which is a quadric surface or a pseudo curved surface approximating a quadric surface, is reflected by the sound reflecting surface 752a and collected at the second microphone 303. The sound reflection surface 752a therefore functions as the second sound collection unit.
- Because the sound 312 from the sound source 310 also wraps around, the second mixed sound, in which the collected noise 321 and the sound 312 are mixed, is input to the second microphone 303.
- The rotation of the sound reflecting surface 752a serving as the second sound collecting unit about the shaft 753 is performed by a step motor or the like in response to the control signal 641 from the sound collecting control unit 640, but is not limited thereto. FIG. 7 shows one-dimensional rotation around the shaft 753, but the rotation may be two-dimensional or three-dimensional.
- The first and second microphone support members 751 and 752 are preferably sound insulators that block the transmission of sound, and are disposed at positions where the first sound collection unit and the second sound collection unit are sandwiched between the first microphone and the second microphone, respectively.
- FIG. 8 is a diagram illustrating another example 630-2 of the microphone set 630 including the sound collector 805 which is the second sound collector that moves according to the present embodiment.
- the movable part that moves the second sound collecting part is not shown.
- a step motor or the like is arranged to automatically adjust the direction of the second sound collection unit.
- The microphone set 630-2 includes a first microphone 301, a second microphone 303, a microphone support member 305 having a sound reflecting surface 305a that serves as a first sound collection unit and on which the first microphone 301 is disposed, and a sound collector 805, which is a movable second sound collection unit that collects noise for the second microphone 303.
- The sound reflection surface 305a on which the first microphone 301 is disposed forms a concave surface that is a quadric surface or a pseudo curved surface approximating a quadric surface.
- the first microphone 301 is arranged at the focal position of a quadratic curved surface or a pseudo curved surface approximating a quadratic curved surface.
- the sound collector 805 as the second sound collector is in contact with the curved surface (305b) of the microphone support member 305 in a rotatable manner together with the second microphone 303.
- Such rotatable contact is possible, for example, with a magnet, but is not limited thereto.
- the sound reflecting surface 805a of the sound collector 805, which is the second sound collector forms a quadratic curved surface or a pseudo curved surface that approximates a quadric surface.
- the second microphone 303 is arranged at the focal position of a quadratic curved surface or a pseudo curved surface approximating the quadratic curved surface.
- the first mixed signal 202 and the second mixed signal 204 are output to the noise suppression circuit 206 from the first microphone 301 and the second microphone 303, respectively.
- Of the sound from the sound source 310 that utters the desired sound, the sound 311 directed toward the sound reflecting surface 305a, which is a quadric surface or a pseudo curved surface approximating a quadric surface, is reflected by the sound reflecting surface 305a and collected at the first microphone 301. The sound reflection surface 305a therefore functions as the first sound collection unit.
- Because the noise 322 from the noise source 320 that generates noise also wraps around, the first mixed sound, in which the collected sound 311 and the noise 322 are mixed, is input to the first microphone 301.
- Similarly, the noise 321 directed toward the sound reflection surface 805a, which is a quadric surface or a pseudo curved surface approximating a quadric surface, is reflected by the sound reflection surface 805a and collected at the second microphone 303. The sound reflection surface 805a therefore functions as the second sound collection unit.
- Because the sound 312 from the sound source 310 also wraps around, the second mixed sound, in which the collected noise 321 and the sound 312 are mixed, is input to the second microphone 303.
- the sound reflecting surface 805a serving as the second sound collecting unit is rotated by a control signal 641 from the sound collecting control unit 640. Further, although one-dimensional rotation is shown in FIG. 8, it may be two-dimensional rotation or three-dimensional rotation.
- the microphone support member 305 is preferably a sound insulator that blocks transmission of sound.
- FIG. 9 is a block diagram showing a hardware configuration of the speech processing apparatus according to the present embodiment. Note that FIG. 9 also shows data used in the following fourth embodiment. Further, FIG. 9 illustrates a voice recognition device 208 and an information processing device 209 connected to the voice processing device 620.
- a CPU 910 is a processor for arithmetic control, and realizes a control unit of the voice processing device 620 by executing a program.
- the ROM 920 stores fixed data and programs such as initial data and programs.
- the communication control unit 930 exchanges information between the voice processing device 620, the voice recognition device 208, and the information processing device 209. Such communication may be wired or wireless.
- the noise suppression circuit 206 is illustrated as a unique functional component, but part or all of the processing of the noise suppression circuit 206 may be realized by processing by the CPU 910.
- the RAM 940 is a random access memory that the CPU 910 uses as a work area for temporary storage.
- an area for storing data necessary for realizing the present embodiment is secured.
- the first sound collection unit position control parameter 943 determined from the evaluation result 942 and the second sound collection unit position control parameter 944 determined from the evaluation result 942 are stored.
- the storage 950 is a mass storage device that stores a database, various parameters, and a program executed by the CPU 910 in a nonvolatile manner.
- the storage 950 stores the following data or programs necessary for realizing the present embodiment.
- A sound collection unit position control parameter DB 951, used for determining the first sound collection unit position control parameter 943 and the second sound collection unit position control parameter 944 from the evaluation result 942, is stored (see FIG. 10). Alternatively, instead of using the sound collection unit position control parameter DB 951, a sound collection unit position control algorithm 952 is stored, such as an arithmetic expression that determines the first sound collection unit position control parameter 943 and the second sound collection unit position control parameter 944 from the evaluation result 942 as needed.
- a sound collection control program 953 for controlling sound collection is stored as a program.
- a sound collection unit position control module 954 for controlling the position of the sound collection unit is stored.
- the input interface 960 is an interface for inputting control signals and data necessary for control by the CPU 910.
- the pseudo speech signal 207 that is an output from the noise suppression circuit 206 and the parameters 961 such as the parameters of the adaptive filter NF 502 and the adaptive filter XF 504 or the estimated noise signal Y1 are input.
- the parameter 961 is used for controlling the position of the sound collecting unit.
- the output interface 970 is an interface that outputs a control signal and data to the device under the control of the CPU 910.
- The first sound collection unit position control parameter 943 is output to the first sound collection unit position control unit 971, or the second sound collection unit position control parameter 944 is output to the second sound collection unit position control unit 972. If the first sound collection unit position control unit 971 or the second sound collection unit position control unit 972 includes a motor, the first sound collection unit position control parameter 943 and the second sound collection unit position control parameter 944 may be a rotation direction and a rotation angle.
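- As an illustration of a control parameter expressed as a rotation direction and a rotation angle, the following minimal Python sketch converts a signed change angle into a direction and a whole number of motor steps. The function name, the direction labels "cw" and "ccw", and the default step angle of 1.8 degrees are assumptions made for this example and do not come from the embodiment.

```python
from typing import Tuple


def to_motor_steps(change_angle_deg: float, step_angle_deg: float = 1.8) -> Tuple[str, int]:
    """Split a signed change angle into a rotation direction and a step count.

    step_angle_deg is a hypothetical motor property (degrees per step).
    """
    # Non-negative angles are treated as clockwise; this convention is arbitrary.
    direction = "cw" if change_angle_deg >= 0 else "ccw"
    # Round to the nearest whole step, since a step motor cannot move a fraction.
    steps = round(abs(change_angle_deg) / step_angle_deg)
    return direction, steps
```

- A position control unit that receives such a pair would then drive its motor the given number of steps in the given direction.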
- FIG. 9 shows only the data and programs essential to the present embodiment; general-purpose data and programs such as the OS are not shown. The CPU 910 in FIG. 9 may also be used to control the voice recognition device 208 and the information processing device 209.
- FIG. 10 is a diagram showing the configuration of the sound collection unit position control parameter DB 951 according to this embodiment.
- The sound collection unit position control parameter DB 951 holds, as conditions, at least one of the pseudo audio signal 1001, the estimated noise signal 1002, the pseudo noise signal 1003, the estimated audio signal 1004, the adaptive filter NF parameter 1005, and the adaptive filter XF parameter 1006 acquired from the noise suppression circuit 206. A first sound collection unit position control parameter 1007 and a second sound collection unit position control parameter 1008 are stored in association with these conditions. The first sound collection unit position control parameter 1007 and the second sound collection unit position control parameter 1008 are stored as a change angle in one direction if the movement is one-dimensional, as change angles in two directions if the movement is two-dimensional, and as change angles in three directions if the movement is three-dimensional.
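- The table of FIG. 10 can be pictured as a lookup from a measured condition to stored control parameters. The following is a minimal Python sketch under stated assumptions: only one condition (a pseudo noise signal level) is used, and the class name, field names, level ranges, and angle values are all hypothetical placeholders rather than contents of the DB 951.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class ControlEntry:
    # Condition: half-open range [noise_min, noise_max) of the pseudo noise level.
    noise_min: float
    noise_max: float
    # Stored control parameters: one change angle (degrees) per axis of movement.
    first_unit_angles: Tuple[float, ...]
    second_unit_angles: Tuple[float, ...]


# Hypothetical contents for one-dimensional movement.
PARAMETER_DB: List[ControlEntry] = [
    ControlEntry(0.0, 0.2, (0.0,), (10.0,)),
    ControlEntry(0.2, 0.5, (0.0,), (5.0,)),
    ControlEntry(0.5, float("inf"), (0.0,), (0.0,)),
]


def lookup_control_parameters(noise_level: float) -> Optional[ControlEntry]:
    """Return the first entry whose condition range contains noise_level."""
    for entry in PARAMETER_DB:
        if entry.noise_min <= noise_level < entry.noise_max:
            return entry
    return None  # no condition matched
```

- For two-dimensional or three-dimensional movement, the angle tuples in such a table would simply hold two or three values.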
- FIG. 11 is a flowchart showing an audio processing procedure according to the present embodiment. The flowchart in FIG. 11 is executed by the CPU 910 in FIG. 9 using the RAM 940, and implements the sound collection control unit 640 in FIG.
- In step S1101, it is determined whether it is time to adjust the second sound collection unit. If it is not time to adjust the second sound collection unit, the process ends.
- Possible timings for adjusting the second sound collection unit include, for example, initialization, when voice recognition by the voice recognition device deteriorates, or when the noise input is judged to be small from the pseudo noise signal E2 and the parameters of the adaptive filter NF in the noise suppression circuit.
- If it is time to adjust the second sound collection unit, the position of the second sound collection unit is adjusted in step S1103.
- In step S1105, the voice recognition device 208 and/or the information processing device 209 are notified, via the communication control unit 930, of the completion or start of voice input.
- FIG. 12A is a flowchart illustrating a first example of an adjustment procedure of the second sound collection unit according to the present embodiment.
- the second sound collection unit is adjusted to increase the noise input to the second microphone based on the output signal and parameters from the noise suppression circuit.
- In step S1211, the noise-to-speech ratio of the second microphone and the parameters of the adaptive filter NF are acquired from the noise suppression circuit.
- In step S1213, it is determined from the data acquired in step S1211 whether the noise input to the second microphone is sufficient. If the noise input to the second microphone is sufficient, the process ends and returns.
- Otherwise, in step S1217, the motor that moves the second sound collecting unit is driven by one step, and the process returns to step S1211; this is repeated until the noise input to the second microphone becomes sufficient.
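- The loop of FIG. 12A can be sketched as follows. This is a minimal Python illustration, assuming that sufficiency is judged by comparing the noise-to-speech ratio with a threshold; the callback names, the threshold, and the max_steps safety bound are hypothetical additions for this example, and the flowchart itself has no such bound.

```python
from typing import Callable


def adjust_until_noise_sufficient(
    read_noise_to_speech_ratio: Callable[[], float],
    step_motor: Callable[[], None],
    threshold: float,
    max_steps: int = 200,
) -> int:
    """Step the motor until the noise-to-speech ratio reaches the threshold.

    Returns the number of motor steps taken.
    """
    steps = 0
    while steps < max_steps:
        ratio = read_noise_to_speech_ratio()  # S1211: acquire data
        if ratio >= threshold:                # S1213: noise input sufficient?
            break
        step_motor()                          # S1217: drive the motor one step
        steps += 1
    return steps
```

- Passing the measurement and the motor drive as callbacks keeps the loop independent of how the noise suppression circuit and the motor are actually accessed.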
- FIG. 12B is a flowchart illustrating a second example of the adjustment procedure of the second sound collection unit according to the present embodiment.
- The second sound collecting unit is adjusted so as to increase the noise input to the second microphone, by moving the second microphone little by little in the vertical and horizontal directions and directing it toward the direction in which the noise volume increases.
- In step S1221, the pseudo noise signal E2 is acquired from the noise suppression circuit.
- In step S1223, the acquired pseudo noise signal E2 is stored in association with the position (angle) of the second sound collection unit.
- In step S1225, it is determined whether the pseudo noise signal E2 at the current position is larger than the values in the adjacent vertical and horizontal directions, that is, whether it is at its maximum. If the position gives the maximum value, the process ends and returns. If not, in step S1227, the motor that moves the second sound collection unit is driven by one step, and the process returns to step S1221; this is repeated until the second sound collection unit is placed at the position (direction) where the pseudo noise signal E2 is maximum.
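- The search of FIG. 12B resembles a hill climb over motor positions. The following minimal Python sketch assumes a hypothetical callback measure_e2 that moves the second sound collection unit to a given (horizontal, vertical) step position and returns the level of the pseudo noise signal E2 there; the grid of integer positions and the max_moves bound are also assumptions of this example.

```python
from typing import Callable, Dict, Tuple

Position = Tuple[int, int]  # (horizontal step, vertical step) of the motor


def climb_to_noise_maximum(
    measure_e2: Callable[[Position], float],
    start: Position = (0, 0),
    max_moves: int = 100,
) -> Position:
    """Move one step at a time toward a larger pseudo noise level E2."""
    levels: Dict[Position, float] = {}  # S1223: E2 stored per position

    def level(p: Position) -> float:
        if p not in levels:
            levels[p] = measure_e2(p)   # S1221: acquire E2 at this position
        return levels[p]

    pos = start
    for _ in range(max_moves):
        x, y = pos
        neighbours = [(x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)]
        best = max(neighbours, key=level)
        if level(best) <= level(pos):   # S1225: no adjacent position is larger
            return pos
        pos = best                      # S1227: drive the motor one step
    return pos
```

- Like any hill climb, this stops at a local maximum, so a noise field with several peaks may need more than one starting position.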
- FIG. 12C is a flowchart illustrating a third example of the adjustment procedure of the second sound collecting unit according to the present embodiment.
- The second sound collection unit is adjusted so as to increase the noise input to the second microphone, by determining the direction of the noise source using the two microphones while no voice is being uttered.
- In step S1231, it is determined whether the pseudo audio signal E1 is almost zero.
- The direction of the noise source is then estimated from the time delay, that is, the difference in the arrival time of the noise between the first microphone and the second microphone.
- In step S1235, the second sound collection unit is turned toward the estimated noise source direction.
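- Estimating a direction from the arrival-time difference between two microphones can be sketched as follows. This minimal Python example uses a brute-force cross-correlation and the far-field relation sin(theta) = c * delay / d. These are textbook choices, not a method stated in this document, and they treat the two microphones as free-field point receivers, which ignores the reflecting surfaces. The function names, the sample rate, and the microphone spacing are hypothetical.

```python
import math
from typing import Sequence

SPEED_OF_SOUND = 343.0  # metres per second in air at about 20 degrees Celsius


def estimate_delay_samples(first: Sequence[float], second: Sequence[float], max_lag: int) -> int:
    """Return the lag of `second` relative to `first` that maximises their cross-correlation.

    A positive result means the sound reached the first microphone earlier.
    """
    best_lag, best_score = 0, -math.inf
    n = min(len(first), len(second))
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for i in range(n):
            j = i + lag
            if 0 <= j < n:
                score += first[i] * second[j]
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


def noise_direction_deg(delay_samples: int, sample_rate: float, mic_spacing: float) -> float:
    """Far-field angle of arrival measured from the broadside direction, in degrees."""
    sin_theta = SPEED_OF_SOUND * delay_samples / (sample_rate * mic_spacing)
    sin_theta = max(-1.0, min(1.0, sin_theta))  # clamp against measurement error
    return math.degrees(math.asin(sin_theta))
```

- The resulting angle could then be converted into a control parameter for the motor that turns the second sound collection unit.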
- the position of the second sound collection unit can be adjusted, and the input of noise to the second microphone is increased corresponding to the changing noise source.
- the position of the first sound collection unit can be changed to adjust the input of the desired sound.
- the input of the desired sound is increased in response to the change in the position of the sound source that emits the desired sound, and more accurate pseudo sound is restored. Note that description of configurations and processes common to the second and third embodiments is omitted.
- FIG. 13 is a block diagram showing a configuration of an information processing system 1300 provided with a voice processing device 1320 according to this embodiment.
- The sound processing device 1320 includes a microphone set 1330 in which a first microphone, a second microphone, a first sound collection unit, and a second sound collection unit are fixed integrally, a noise suppression circuit 1306, and a sound collection control unit 1340.
- the information processing system 1300 includes a voice processing device 1320, a voice recognition device 208, and an information processing device 209.
- The fourth embodiment differs from the third embodiment in that the direction of the first sound collection unit of the microphone set 1330 can be changed toward the sound source. This difference is described below, but since the configuration and operation are similar to those of the second sound collection unit of the third embodiment, a detailed description is omitted.
- the second sound collection unit of the microphone set 1330 is moved by the control signal 641 from the sound collection control unit 1340 so that the input of noise increases. Furthermore, the first sound collection unit of the microphone set 1330 is moved by the control signal 1341 from the sound collection control unit 1340 so that the input of the desired sound is increased.
- Based on the pseudo sound signal 207 and the parameter 1307 of the noise suppression circuit 1306, the sound collection control unit 1340 outputs a control signal 1341 for changing the sound collection direction of the first sound collection unit in the microphone set 1330 and a control signal 641 for changing the noise collection direction of the second sound collection unit.
- The desired sound and the noise are input at different mixing ratios to the first microphone, for which the first sound collecting unit collects the desired sound, and to the second microphone, for which the second sound collecting unit collects the noise. Then, based on the first mixed signal from the first microphone and the second mixed signal from the second microphone, the noise suppression circuit 1306 restores the pseudo voice signal, and the restored pseudo voice signal is recognized by the voice recognition device 208. The information processing device 209 performs information processing based on the recognized voice.
- The signal lines that transmit the first and second mixed signals 202 and 204 may also carry a return signal such as ground and a power supply for operating the microphones.
- the noise suppression circuit 1306 and the sound collection control unit 1340 may be attached to the microphone set 1330. In that case, a pseudo audio signal is output from the microphone set.
- Although this embodiment is described in terms of voice recognition, it is not limited to this. Accurate restoration of the voice can also be applied to a telephone or to the operation of a vehicle or a device.
- FIG. 14 is a flowchart showing an audio processing procedure according to the present embodiment. The flowchart in FIG. 14 is executed by the CPU 910 in FIG. 9 using the RAM 940, and implements the sound collection control unit 1340 in FIG.
- In step S1401, it is determined whether it is time to adjust the first sound collection unit and/or the second sound collection unit. If it is not time to adjust, the process ends.
- Possible timings for adjusting the first sound collection unit and/or the second sound collection unit include, for example, initialization; when voice recognition by the voice recognition device deteriorates; when the noise input is judged to have decreased from the pseudo noise signal E2 and the parameters of the adaptive filter NF in the noise suppression circuit; or when the voice input is judged to have decreased from the pseudo audio signal E1 and the parameters of the adaptive filter XF.
- If it is time to adjust the first sound collection unit and/or the second sound collection unit, the position of the first sound collection unit and/or the second sound collection unit is adjusted in step S1403.
- There are various methods for adjusting the position of the first sound collecting unit and/or the second sound collecting unit; several examples have been described above with reference to FIGS. 12A to 12C, so their description is omitted here.
- In step S1405, the voice recognition device 208 and/or the information processing device 209 is notified, via the communication control unit 930, of the completion of preparation for voice input or of its start.
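- The timing decision of step S1401 can be sketched as a small function that reports which sound collection unit needs adjusting. This is a minimal Python illustration under assumptions: the mapping (reduced noise input adjusts the second unit, reduced voice input adjusts the first unit, initialization or degraded recognition adjusts both) is one plausible reading of the listed timings, and the Status fields and floor thresholds are hypothetical.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class Status:
    initializing: bool
    recognition_degraded: bool
    noise_level: float  # e.g. derived from the pseudo noise signal E2
    voice_level: float  # e.g. derived from the pseudo audio signal E1


def units_to_adjust(s: Status, noise_floor: float, voice_floor: float) -> Tuple[bool, bool]:
    """Return (adjust_first_unit, adjust_second_unit) for the decision in step S1401."""
    if s.initializing or s.recognition_degraded:
        return True, True
    # Low voice input suggests re-aiming the first unit; low noise input the second.
    return s.voice_level < voice_floor, s.noise_level < noise_floor
```

- If both returned values are false, the procedure would end without moving either unit.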
- The fifth embodiment assumes a vehicle system as the information processing system including the above-described sound processing device, and uses the microphone set 230-2 shown in FIG. 3B, in which the first microphone and the second microphone differ in angle. According to the present embodiment, a voice instruction from an occupant to a car navigation device can be conveyed accurately while the vehicle is being driven, by suppressing noise in the vehicle, for example noise generated by an air conditioner.
- FIG. 15 is a block diagram illustrating a configuration of a vehicle system 1500 that is an information processing system including the audio processing device according to the present embodiment.
- The sound processing apparatus includes a first microphone 301, a second microphone 303, a microphone support member 355 having on its two sides a sound reflection surface 355a, which serves as a first sound collection unit that collects sound for the first microphone 301, and a sound reflection surface 355b, which serves as a second sound collection unit that collects noise for the second microphone 303, and a noise suppression circuit 206.
- the microphone support member 355 is preferably a sound insulator.
- the vehicle system 1500 includes a voice processing device, a voice recognition device 208, and a car navigation device 1509 that is an information processing device.
- the first microphone 301, the second microphone 303, and the microphone support member 355 that is a sound insulator may be provided as a microphone set that is an integrated audio input unit.
- a sound space 1510 is a space in the vehicle.
- a part of the sound space 1510 in FIG. 15 is defined by a windshield 1530 and a ceiling 1540.
- the configuration and operation of the present embodiment will be described by taking as an example a case where an occupant 1520 operates the car navigation device 1509 by voice within a sound space 1510 in which noise from an air conditioner or the like is mixed. It is assumed that the air conditioner is in the dashboard 1516. However, the noise source is not limited to the air conditioner, and may be other devices arranged at other positions. The voice of the occupant 1520 is not limited to the operation of the car navigation device 1509.
- the first microphone 301, the second microphone 303, and the microphone support member 355 that is a sound insulator are disposed on the ceiling portion in the front of the vehicle interior.
- The portion of the microphone support member 355 that protrudes from the ceiling 1540 into the vehicle interior intersects the line segment connecting the first microphone 301 and the noise source, and thereby blocks direct air propagation noise from the noise source from mixing into the first microphone 301.
- The microphone support member 355 also blocks solid propagation noise transmitted from the noise source to the first microphone 301 through the windshield 1530 and the ceiling 1540.
- The protrusion of the microphone support member 355 may also serve as a sun visor. In this case, it is particularly preferable to use a material that is transparent when not exposed to direct sunlight and becomes opaque when it receives direct sunlight.
- The first mixed sound, in which the air propagation sound 1511 uttered by the occupant 1520 and collected by the sound reflecting surface 355a serving as the first sound collection unit is mixed with the wraparound air propagation noise 1522, is input to the first microphone 301.
- the first microphone 301 converts the first mixed sound into a first mixed signal 202 in which an audio signal and a noise signal are mixed, and transmits the first mixed signal 202 to the noise suppression circuit 206.
- A second mixed sound, in which the air propagation noise 1521 collected by the sound reflection surface 355b serving as the second sound collection unit and the wraparound air propagation sound 1512 are mixed at a ratio different from that of the first mixed sound, is input to the second microphone 303.
- the second microphone 303 converts the second mixed sound into a second mixed signal 204 in which an audio signal and a noise signal are mixed at a different ratio from the first mixed signal, and transmits the second mixed signal 204 to the noise suppression circuit 206.
- the noise suppression circuit 206 outputs a pseudo audio signal 207 based on the transmitted first mixed signal 202 and second mixed signal 204.
- the pseudo voice signal 207 is recognized by the voice recognition device 208 and processed as a voice operation by the occupant 1520 in the car navigation device 1509.
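- How a voice signal can be recovered from two differently mixed signals can be illustrated with a textbook adaptive noise canceller. The following minimal Python sketch is a single-filter normalized LMS (NLMS) canceller, a standard technique substituted here for illustration and not the circuit of the noise suppression circuit 206. It assumes that the second mixed signal is dominated by noise; the function name, the number of taps, and the step size mu are hypothetical.

```python
from typing import List, Sequence


def nlms_cancel(
    primary: Sequence[float],
    reference: Sequence[float],
    taps: int = 4,
    mu: float = 0.1,
    eps: float = 1e-8,
) -> List[float]:
    """Subtract an adaptively filtered copy of `reference` from `primary`.

    primary   : samples of the first mixed signal (voice plus noise)
    reference : samples of the second mixed signal (assumed mostly noise)
    Returns the error samples, which approximate the voice.
    """
    w = [0.0] * taps    # adaptive filter coefficients
    buf = [0.0] * taps  # most recent reference samples, newest first
    out: List[float] = []
    for x, r in zip(primary, reference):
        buf = [r] + buf[:-1]
        y = sum(wi * bi for wi, bi in zip(w, buf))  # estimated noise sample
        e = x - y                                   # restored (pseudo) voice sample
        norm = eps + sum(b * b for b in buf)        # input power for normalization
        w = [wi + mu * e * bi / norm for wi, bi in zip(w, buf)]
        out.append(e)
    return out
```

- The cleaner the noise reference, the better such a canceller works, which is consistent with giving the two microphones clearly different mixing ratios.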
- The voice uttered by the occupant 1520 to operate the car navigation device 1509 is input, as mixed sounds having different mixing ratios, via the sound reflection surface 355a serving as the first sound collection unit and the first microphone 301, and via the sound reflection surface 355b serving as the second sound collection unit and the second microphone 303.
- The pseudo audio signal is then restored by the noise suppression circuit 206, and the restored pseudo audio signal is recognized by the voice recognition device 208.
- the car navigation device 1509 is operated by the recognized voice.
- The signal lines that transmit the first and second mixed signals 202 and 204 may also carry a return signal such as ground and a power supply for operating the microphones.
- the noise suppression circuit 206 may be attached to the microphone support member 355.
- the pseudo voice signal is transmitted from the noise suppression circuit 206 to the voice recognition device 208 through the signal line.
- Although this embodiment has been described in terms of voice recognition and car navigation, the present invention is not limited to this, and accurate restoration of the voice uttered by the occupant 1520 is also useful in other processes. For example, it can be applied to a car phone or to a vehicle operation that is not directly related to driving.
- the sixth embodiment is a case where a vehicle system is assumed as an information processing system including the above-described sound processing device, and in FIG. 8, the direction of the second sound collecting unit that collects noise can be adjusted. It is embodiment using the microphone set which isolate
- FIG. 16 is a block diagram illustrating a configuration of a vehicle system 1600 that is an information processing system including the audio processing device according to the present embodiment.
- The audio processing apparatus includes a first microphone 301, a second microphone 303, and a first microphone support member 751 having a sound reflection surface 751a serving as a first sound collection unit that collects sound for the first microphone 301.
- the first microphone support member 751 is preferably a sound insulator.
- the vehicle system 1600 includes a voice processing device, a voice recognition device 208, and a car navigation device 1509 that is an information processing device.
- The first microphone 301, the second microphone 303, the first microphone support member 751, the second microphone support member 1652, and the sound collector 805 serving as the second sound collection unit may be provided together as a microphone set that is an audio input unit.
- the first microphone 301 and the first microphone support member 751 that is a sound insulator are disposed on the ceiling portion in the front of the vehicle.
- The sound reflecting surface 751a, which is the first sound collecting unit of the first microphone support member 751, collects the sound uttered by the occupant 1520 and inputs it to the first microphone 301.
- The portion of the first microphone support member 751 that protrudes from the ceiling 1540 into the vehicle interior intersects the line segment connecting the first microphone 301 and a noise source (in particular, the dashboard air conditioner), and thereby blocks direct air propagation noise from mixing into the first microphone 301.
- the first microphone support member 751 blocks a mixture of solid propagation noise transmitted from the noise source to the first microphone 301 through the windshield 1530 and the ceiling 1540.
- The protrusion of the first microphone support member 751 may also serve as a sun visor. In this case, it is particularly preferable to use a material that is transparent when not exposed to direct sunlight and becomes opaque when it receives direct sunlight.
- The second microphone and the sound collector 805 serving as the second sound collection unit are installed, so that their direction can be changed, on the second microphone support member 1652 at the center of the ceiling, where more noise from a plurality of noise sources in the vehicle can be collected.
- In accordance with the control signal 641 from the sound collection control unit 640, the movement of the second microphone and the sound collector 805 is controlled by a movement control unit (for example, a motor; not shown) so that they face a direction in which more noise from the plurality of noise sources in the vehicle is collected.
- The first mixed sound, in which the air propagation sound 1611 uttered by the occupant 1520 and collected by the sound reflecting surface 751a serving as the first sound collection unit is mixed with the wraparound air propagation noise 1622, is input to the first microphone 301.
- the first microphone 301 converts the first mixed sound into a first mixed signal 202 in which an audio signal and a noise signal are mixed, and transmits the first mixed signal 202 to the noise suppression circuit 606.
- A second mixed sound, in which the air propagation noise 1621 from the plurality of noise sources collected by the sound collector 805 serving as the second sound collection unit and the wraparound air propagation sound 1612 are mixed at a ratio different from that of the first mixed sound, is input to the second microphone 303.
- The second microphone 303 converts the second mixed sound into a second mixed signal 204 in which an audio signal and a noise signal are mixed at a different ratio from the first mixed signal, and transmits the second mixed signal 204 to the noise suppression circuit 606.
- the noise suppression circuit 606 outputs the pseudo audio signal 207 and the parameter 607 used by the sound collection control unit 640 based on the transmitted first mixed signal 202 and second mixed signal 204.
- the pseudo voice signal 207 is recognized by the voice recognition device 208 and processed as a voice operation by the occupant 1520 in the car navigation device 1509.
- Based on the pseudo sound signal 207 and the parameter 607 from the noise suppression circuit 606, the sound collection control unit 640 outputs a control signal 641 for controlling the direction of the second microphone 303 and of the sound collector 805 serving as the second sound collection unit.
- The voice uttered by the occupant 1520 to operate the car navigation device 1509 is input, as mixed sounds having different mixing ratios, via the sound reflecting surface 751a serving as the first sound collection unit and the first microphone 301, and via the second microphone 303 and the sound collector 805 serving as the second sound collection unit, whose direction is adjusted so as to collect more in-vehicle noise.
- The pseudo audio signal is then restored by the noise suppression circuit 606, and the restored pseudo audio signal is recognized by the voice recognition device 208.
- the car navigation device 1509 is operated by the recognized voice.
- the noise suppression circuit 606 and the sound collection control unit 640 may be attached to the first microphone support member 751 or the second microphone support member 1652.
- the pseudo voice signal is transmitted from the noise suppression circuit 606 to the voice recognition device 208 through the signal line.
- voice recognition and car navigation are described here, but the present invention is not limited to these; accurately restoring the voice uttered by the occupant 1520 is also useful in other processing. For example, it can be applied to a car phone or to operating vehicle functions that are not directly related to driving.
- the seventh embodiment is a case where a personal computer (hereinafter abbreviated as a PC), particularly a notebook PC, is assumed as an information processing system including the above-described sound processing device.
- this embodiment uses the microphone set 230-1 shown in FIG. 3B, in which the first microphone and the second microphone are installed on both sides of the microphone support member.
- the operator's voice instruction to the notebook PC can be conveyed accurately to the notebook PC by suppressing room noise, for example noise from air conditioners or voices uttered by other people.
- FIG. 17 is a block diagram showing a configuration of a notebook personal computer (hereinafter referred to as notebook PC 1700), which is an information processing system including the voice processing apparatus according to the present embodiment.
- description of the original function of the notebook PC is omitted, and a configuration related to sound collection to the first microphone 301 and the second microphone 303, which is a feature of the present embodiment, will be described.
- the notebook PC 1700 includes a display unit 1730 having a display screen and a keyboard unit 1740 including a keyboard.
- the microphone set 230-1 is disposed in the display unit 1730; it includes the first microphone 301, the second microphone 303, and the microphone support member 305, which has on its two sides the sound reflecting surface 305a (the first sound collecting unit) and the sound reflecting surface 305b (the second sound collecting unit). That is, the first microphone 301 and the sound reflecting surface 305a are disposed on the operator side of the display unit 1730, and the second microphone 303 and the sound reflecting surface 305b are disposed on the side of the display unit 1730 opposite the operator.
- the first microphone 301 receives a first mixed sound in which the voice 1711, uttered by the operator 1720 and collected by the sound reflecting surface 305a (the first sound collection unit), and the wraparound air propagation noise 1714 are mixed.
- the first microphone 301 converts the first mixed sound into a first mixed signal in which an audio signal and a noise signal are mixed, and transmits the first mixed signal to a noise suppression circuit 206 (not shown).
- the second microphone 303 receives a second mixed sound in which the air propagation noise 1713, collected by the sound reflecting surface 305b serving as the second sound collecting unit, and the wraparound sound 1712 are mixed at a ratio different from that of the first mixed sound.
- the second microphone 303 converts the second mixed sound into a second mixed signal in which an audio signal and a noise signal are mixed at a different ratio from the first mixed signal, and transmits the second mixed signal to a noise suppression circuit 206 (not shown).
- the noise suppression circuit 206 outputs a pseudo audio signal 207 based on the first mixed signal and the second mixed signal transmitted from the first microphone 301 and the second microphone 303, respectively.
- the pseudo voice signal 207 is recognized by the voice recognition device 208 and processed as voice operation or data voice input by the operator 1720 in the notebook PC 1700.
- the voice uttered by the operator 1720 toward the notebook PC 1700 is input as mixed sounds with different mixing ratios through the sound reflecting surface 305a (the first sound collection unit) with the first microphone 301, and through the sound reflection surface 305b (the second sound collection unit) with the second microphone 303. Then, based on the first mixed signal from the first microphone 301 and the second mixed signal from the second microphone 303, the pseudo audio signal is restored by the noise suppression circuit 206, and the restored pseudo audio signal is recognized by the voice recognition device 208. The recognized voice is processed by the notebook PC 1700.
- the first sound collection unit and the second sound collection unit are fixed to the microphone support member.
- the eighth embodiment has a configuration similar to that of FIG. 8, in which the direction of the second sound collecting unit that collects noise can be adjusted, but conversely makes the direction of the first sound collecting unit that collects voice adjustable, and uses a microphone set in which the microphone support members are separated.
- the operator's voice instruction to the notebook PC is input as a louder collected voice and, with room noise suppressed (for example noise from air conditioners or voices uttered by other people), can be conveyed accurately to the notebook PC.
- FIG. 18 is a block diagram illustrating a configuration of a personal computer (notebook type PC 1800) that is an information processing system including the voice processing device according to the present embodiment.
- description of the original functions of the notebook PC is omitted, and a configuration related to sound collection to the first microphone 301 and the second microphone 303, which is a feature of the present embodiment, will be described.
- the notebook PC 1800 includes a display unit 1830 having a display screen and a keyboard unit 1840 including a keyboard.
- the first microphone 301 constituting the microphone set, the sound collector 805 as the first sound collector, and the first microphone support member 1851 are arranged in the display unit 1830.
- the second microphone 303 and the second microphone support member 1852 having the sound reflection surface 1852a which is the second sound collection unit are disposed in the keyboard unit 1840. That is, the first microphone 301 and the sound collector 805 that is the first sound collection unit are arranged on the keyboard surface of the keyboard unit 1840, and the second microphone 303 and the sound reflection surface 1852a that is the second sound collection unit are arranged on the side of the display unit 1830 opposite the operator.
- for example, the position of the operator is determined from the angle formed by the display unit 1830 and the keyboard unit 1840, and the first microphone 301 and the sound collector 805 that is the first sound collection unit are turned to face that position.
- the first microphone 301 receives a first mixed sound in which the voice 1811, uttered by the operator 1820 and collected by the sound collector 805 (the first sound collecting unit) facing the operator 1820, and the wraparound air propagation noise 1814 are mixed.
- the first microphone 301 converts the first mixed sound into a first mixed signal in which an audio signal and a noise signal are mixed, and transmits the first mixed signal to a noise suppression circuit 206 (not shown).
- the second microphone 303 receives a second mixed sound in which the air propagation noise 1813, collected by the sound reflecting surface 1852a serving as the second sound collecting unit, and the wraparound sound 1812 are mixed at a ratio different from that of the first mixed sound.
- the second microphone 303 converts the second mixed sound into a second mixed signal in which an audio signal and a noise signal are mixed at a different ratio from the first mixed signal, and transmits the second mixed signal to a noise suppression circuit 206 (not shown).
- the noise suppression circuit 206 outputs a pseudo audio signal 207 based on the first mixed signal and the second mixed signal transmitted from the first microphone 301 and the second microphone 303, respectively.
- the pseudo voice signal 207 is recognized by the voice recognition device 208 and processed as voice operation or data voice input by the operator 1820 in the notebook PC 1800.
- the voice uttered by the operator 1820 toward the notebook type PC 1800 is input as mixed sounds with different mixing ratios through the sound collecting body 805 (the first sound collecting unit) with the first microphone 301, and through the sound reflection surface 1852a (the second sound collection unit) with the second microphone 303. Then, based on the first mixed signal from the first microphone 301 and the second mixed signal from the second microphone 303, the pseudo audio signal is restored by the noise suppression circuit 206, and the restored pseudo audio signal is recognized by the voice recognition device 208. The recognized voice is processed by the notebook PC 1800.
- the present invention may be applied to a system composed of a plurality of devices, or may be applied to a single device. Furthermore, the present invention can also be applied to a case where a control program that realizes the functions of the embodiments is supplied directly or remotely to a system or apparatus. Therefore, a control program installed in a computer in order to realize the functions of the present invention on that computer, a medium storing the control program, and a WWW (World Wide Web) server from which the control program is downloaded are also included in the scope of the present invention.
Landscapes
- Engineering & Computer Science (AREA)
- Signal Processing (AREA)
- Acoustics & Sound (AREA)
- Physics & Mathematics (AREA)
- Health & Medical Sciences (AREA)
- Quality & Reliability (AREA)
- Computational Linguistics (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Human Computer Interaction (AREA)
- Multimedia (AREA)
- General Health & Medical Sciences (AREA)
- Otolaryngology (AREA)
- Circuit For Audible Band Transducer (AREA)
- Fittings On The Vehicle Exterior For Carrying Loads, And Devices For Holding Or Mounting Articles (AREA)
- Obtaining Desirable Characteristics In Audible-Bandwidth Transducers (AREA)
Abstract
Description
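A speech processing apparatus characterized by comprising:
a first microphone that receives a first mixed sound in which a desired voice and noise are mixed, and outputs a first mixed signal;
a second microphone that is open to the same sound space as the first microphone, receives a second mixed sound in which the desired voice and the noise are mixed at a ratio different from that of the first mixed sound, and outputs a second mixed signal;
a first sound collection unit having a concave surface that collects the first mixed sound onto the first microphone;
a second sound collection unit that has a concave surface collecting the second mixed sound onto the second microphone and is oriented in a direction different from that of the first sound collection unit; and
a noise suppression circuit that suppresses an estimated noise signal based on the first mixed signal and the second mixed signal, and outputs a pseudo voice signal.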
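A vehicle including the above speech processing apparatus, characterized in that
the first microphone and the first sound collection unit are arranged at a position where a desired voice uttered by an occupant in the vehicle is collected onto the first microphone by the first sound collection unit, and
the second microphone and the second sound collection unit are arranged at a position where noise generated by a noise source in the vehicle is collected onto the second microphone by the second sound collection unit.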
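An information processing apparatus including the above speech processing apparatus, characterized in that
the first microphone and the first sound collection unit are arranged at a position where a desired voice uttered by an operator of the information processing apparatus is collected onto the first microphone by the first sound collection unit, and
the second microphone and the second sound collection unit are arranged at a position where noise generated by a noise source in the same sound space as the operator is collected onto the second microphone by the first sound collection unit.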
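An information processing system including the above speech processing apparatus, characterized by comprising:
a voice recognition device that recognizes the desired voice from the pseudo voice signal output by the speech processing apparatus; and
an information processing apparatus that processes information in accordance with the desired voice recognized by the voice recognition device.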
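A control method of a speech processing apparatus that comprises:
a first microphone that receives a first mixed sound in which a desired voice and noise are mixed, and outputs a first mixed signal;
a second microphone that is open to the same sound space as the first microphone, receives a second mixed sound in which the desired voice and the noise are mixed at a ratio different from that of the first mixed sound, and outputs a second mixed signal;
a first sound collection unit having a concave surface that collects the first mixed sound onto the first microphone;
a second sound collection unit that has a concave surface collecting the second mixed sound onto the second microphone and is oriented in a direction different from that of the first sound collection unit; and
a noise suppression circuit that suppresses an estimated noise signal based on the first mixed signal and the second mixed signal, and outputs a pseudo voice signal,
the control method being characterized by comprising the steps of:
acquiring a parameter of the noise suppression circuit;
determining, in accordance with the parameter of the noise suppression circuit, a direction of the second sound collection unit such that the proportion of the noise in the second mixed sound input to the second microphone increases; and
controlling the direction of the second sound collection unit.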
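A storage medium storing a control program of a speech processing apparatus that comprises:
a first microphone that receives a first mixed sound in which a desired voice and noise are mixed, and outputs a first mixed signal;
a second microphone that is open to the same sound space as the first microphone, receives a second mixed sound in which the desired voice and the noise are mixed at a ratio different from that of the first mixed sound, and outputs a second mixed signal;
a first sound collection unit having a concave surface that collects the first mixed sound onto the first microphone;
a second sound collection unit that has a concave surface collecting the second mixed sound onto the second microphone and is oriented in a direction different from that of the first sound collection unit; and
a noise suppression circuit that suppresses an estimated noise signal based on the first mixed signal and the second mixed signal, and outputs a pseudo voice signal,
the storage medium being characterized by storing a control program that causes a computer to execute the steps of:
acquiring a parameter of the noise suppression circuit;
determining, in accordance with the parameter of the noise suppression circuit, a direction of the second sound collection unit such that the proportion of the noise in the second mixed sound input to the second microphone increases; and
controlling the direction of the second sound collection unit.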
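A speech processing apparatus 100 as a first embodiment of the present invention will be described with reference to FIG. 1. As shown in FIG. 1, the speech processing apparatus 100 includes a first microphone 101, a second microphone 103, a first sound collection unit 111, a second sound collection unit 112, and a noise suppression circuit 106. The first microphone 101 receives a first mixed sound 108 in which a desired voice and noise are mixed, and outputs a first mixed signal 102. The second microphone 103 is open to the same sound space 110 as the first microphone 101, receives a second mixed sound 109 in which the desired voice and the noise are mixed at a ratio different from that of the first mixed sound 108, and outputs a second mixed signal 104. The first sound collection unit 111 has a concave surface 111a that collects the first mixed sound 108 onto the first microphone 101. The second sound collection unit 112 has a concave surface 112a that collects the second mixed sound 109 onto the second microphone 103, and is oriented in a direction different from that of the first sound collection unit 111. The noise suppression circuit 106 suppresses an estimated noise signal based on the first mixed signal 102 and the second mixed signal 104, and outputs a pseudo voice signal 107.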
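The second embodiment has a microphone set in which the first microphone, the second microphone, the first sound collection unit, and the second sound collection unit are fixed together as a single unit. By placing this microphone set at a desired position in consideration of the positions of the voice source and the noise sources, the desired voice and the noise can each be collected within the same sound space in which they are mixed, so that, with a simple configuration, the noise is estimated accurately and a pseudo voice close to the desired voice is restored.
FIG. 2 is a block diagram showing the configuration of an information processing system 200 including a speech processing apparatus 220 according to this embodiment. In FIG. 2, the speech processing apparatus 220 includes a microphone set 230, in which the first microphone, the second microphone, the first sound collection unit, and the second sound collection unit are fixed together, and a noise suppression circuit 206. The information processing system 200 includes the speech processing apparatus 220 and, in addition, a voice recognition device 208 and an information processing apparatus 209.
In this embodiment, the first and second sound collection units are fixed in advance at predetermined positions. Two configuration examples of the microphone set are described below, but the configuration is not limited to these.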
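FIG. 3A is a diagram showing an example 230-1 of the microphone set 230 including fixed sound collection units according to this embodiment.
FIG. 3B is a diagram showing another example 230-2 of the microphone set 230 including fixed sound collection units according to this embodiment.
Below, how the sound reflecting surfaces 305a, 305b, 355a, and 355b of FIGS. 3A and 3B, which are quadric surfaces or pseudo-curved surfaces approximating quadric surfaces, collect sound at their focal positions is explained using FIG. 4A for the quadric surface and FIG. 4B for the pseudo-curved surface approximating a quadric surface.
FIG. 4A is a diagram explaining sound collection by a microphone support member 405 having a quadric surface 405a that serves as the sound collection unit according to this embodiment.
FIG. 4B is a diagram explaining sound collection by a microphone support member 455 having a pseudo-curved surface 455a that serves as the sound collection unit according to this embodiment. The pseudo-curved surface 455a is a collection of planes extending in the tangential directions of a quadric surface.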
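The focusing behaviour attributed to FIGS. 4A and 4B can be checked numerically. The following is a minimal sketch, assuming the quadric surface is a paraboloid seen in cross-section (y = x²/(4f)) and that sound arrives as rays parallel to the axis. The function name, the focal length, and the ray model are illustrative assumptions and do not come from the source.

```python
import numpy as np

def crossing_height(x_hit, f, x_tangent=None):
    """Height at which a reflected ray crosses the axis (x = 0).

    A ray travelling straight down (parallel to the axis) hits the reflector
    at horizontal position x_hit. If x_tangent is None the reflector is the
    exact parabola y = x^2 / (4 f); otherwise it is a flat facet tangent to
    that parabola at x_tangent (the "pseudo-curved surface" case).
    x_tangent must be non-zero so that the reflected ray is not vertical.
    """
    if x_tangent is None:
        x_tangent = x_hit
    slope = x_tangent / (2.0 * f)                       # dy/dx of the parabola
    y_hit = x_tangent**2 / (4.0 * f) + slope * (x_hit - x_tangent)
    normal = np.array([-slope, 1.0]) / np.hypot(slope, 1.0)
    incoming = np.array([0.0, -1.0])
    # Mirror reflection: r = d - 2 (d . n) n
    reflected = incoming - 2.0 * np.dot(incoming, normal) * normal
    t = -x_hit / reflected[0]                           # travel until x = 0
    return y_hit + t * reflected[1]

focal_length = 0.25  # hypothetical value, in arbitrary length units
exact = [crossing_height(x, focal_length) for x in (0.5, 1.0, 2.0)]
facet = crossing_height(1.05, focal_length, x_tangent=1.0)
```

On the exact parabola every ray crosses the axis at the focal height, which is where the microphone would sit. On a tangent facet, rays that miss the tangent point cross slightly away from the focus, which suggests why the facets would need to be small relative to the reflector.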
FIG. 5 is a diagram showing the configuration of the noise suppression circuit 206 according to this embodiment.
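The claims describe the noise suppression circuit as two cross-coupled subtractors: the first removes an estimated noise signal from the first mixed signal, the second removes an estimated voice signal from the second mixed signal, and each estimate is generated from the other subtractor's output. The sketch below illustrates only that signal flow. It assumes, purely for illustration, that each leakage path is a single known scalar gain (`a_hat` for noise leaking into the first microphone, `b_hat` for voice leaking into the second). The source does not state how the estimated-signal generators are realized, and all names here are hypothetical.

```python
import numpy as np

def cross_coupled_suppress(x1, x2, a_hat, b_hat, iterations=50):
    """Resolve two cross-coupled subtractors by fixed-point iteration.

    x1, x2 : first and second mixed signals (1-D arrays of equal length).
    a_hat  : assumed gain of the noise leaking into x1 (hypothetical).
    b_hat  : assumed gain of the voice leaking into x2 (hypothetical).
    The iteration converges when |a_hat * b_hat| < 1.
    """
    e1 = np.array(x1, dtype=float)  # first subtractor output (pseudo voice)
    e2 = np.array(x2, dtype=float)  # second subtractor output (noise reference)
    for _ in range(iterations):
        e2 = x2 - b_hat * e1        # subtract the estimated voice from x2
        e1 = x1 - a_hat * e2        # subtract the estimated noise from x1
    return e1, e2

# Synthetic check: a tone as "voice", white noise as "noise".
rng = np.random.default_rng(0)
t = np.arange(1600) / 16000.0
voice = np.sin(2 * np.pi * 200.0 * t)
noise = rng.standard_normal(t.size)
a, b = 0.3, 0.2                     # mixing gains, chosen arbitrarily
x1 = voice + a * noise              # first mixed signal: mostly voice
x2 = noise + b * voice              # second mixed signal: mostly noise
pseudo_voice, noise_ref = cross_coupled_suppress(x1, x2, a, b)
```

With exact gains the first output equals the voice, because x1 - a·x2 = (1 - ab)·voice. In a real device the gains would be unknown, frequency-dependent paths that have to be estimated, so this shows only why having the two microphones see different mixing ratios makes the separation possible.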
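The second embodiment described an example in which, in the microphone set, the first microphone and the second microphone are fixed to the microphone support member in predetermined directions. The third embodiment describes an example in which the direction of the second sound collection unit can be changed by moving the microphone support member, or in which the direction of the second sound collection unit itself is movable. The second sound collection unit moves so that the noise input becomes larger. According to this embodiment, because the second microphone receives larger noise, the accuracy of the noise suppressed in the noise suppression circuit, and of the output pseudo voice, can be improved. Description of configurations and processing shared with the second embodiment is omitted.
FIG. 6 is a block diagram showing the configuration of an information processing system 600 including a speech processing apparatus 620 according to this embodiment. In FIG. 6, the speech processing apparatus 620 includes a microphone set 630, in which the first microphone, the second microphone, the first sound collection unit, the second sound collection unit, and a movable unit that makes the second sound collection unit movable are fixed together; a noise suppression circuit 606; and a sound collection control unit 640. The information processing system 600 includes the speech processing apparatus 620 and, in addition, the voice recognition device 208 and the information processing apparatus 209.
In this embodiment, the second sound collection unit moves so as to collect noise. Two configuration examples of the microphone set are described below, but the configuration is not limited to these.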
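FIG. 7 is a diagram showing an example 630-1 of the microphone set 630 including a sound reflecting surface 752a that serves as the movable second sound collection unit according to this embodiment. The movable unit that moves the second sound collection unit is not shown. For example, a stepping motor or the like is provided so that the direction of the second sound collection unit is adjusted automatically.
FIG. 8 is a diagram showing another example 630-2 of the microphone set 630 including a sound collecting body 805 that is the movable second sound collection unit according to this embodiment. The movable unit that moves the second sound collection unit is not shown. For example, a stepping motor or the like is provided so that the direction of the second sound collection unit is adjusted automatically.
FIG. 9 is a block diagram showing the hardware configuration of the speech processing apparatus according to this embodiment. FIG. 9 also shows data used in the fourth embodiment described next, as well as the voice recognition device 208 and the information processing apparatus 209 connected to the speech processing apparatus 620.
FIG. 10 is a diagram showing the structure of a sound collection unit position control parameter DB 951 according to this embodiment.
FIG. 11 is a flowchart showing the speech processing procedure according to this embodiment. The flowchart of FIG. 11 is executed by the CPU 910 of FIG. 9 using the RAM 940, and implements the sound collection control unit 640 of FIG. 6.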
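FIG. 12A is a flowchart showing a first example of the adjustment procedure for the second sound collection unit according to this embodiment. In the example of FIG. 12A, the second sound collection unit is adjusted, based on the output signal and parameters from the noise suppression circuit, so as to increase the noise input to the second microphone.
FIG. 12B is a flowchart showing a second example of the adjustment procedure for the second sound collection unit according to this embodiment. In the example of FIG. 12B, the second sound collection unit is adjusted so as to increase the noise input to the second microphone by moving the second microphone little by little up, down, left, and right and turning it toward the direction in which the noise volume becomes larger.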
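The second adjustment example (FIG. 12B) amounts to a greedy search over orientation: try a small move in each of four directions and keep whichever is loudest. The sketch below is one way to express that idea, not the flowchart itself. The callback `measure_noise`, the pan/tilt parameterization in degrees, the step size, and the synthetic noise field are all assumptions made for illustration.

```python
def climb_to_loudest(measure_noise, pan, tilt, step=2.0, max_moves=200):
    """Greedy search: nudge the collector in four directions, keep the loudest.

    measure_noise(pan, tilt) -> float is a hypothetical callback returning the
    noise level seen by the second microphone at that orientation (degrees).
    """
    level = measure_noise(pan, tilt)
    for _ in range(max_moves):
        candidates = [(pan + step, tilt), (pan - step, tilt),
                      (pan, tilt + step), (pan, tilt - step)]
        best = max(candidates, key=lambda c: measure_noise(*c))
        best_level = measure_noise(*best)
        if best_level <= level:
            break                      # no neighbour is louder: stop here
        (pan, tilt), level = best, best_level
    return pan, tilt, level

def fake_noise_level(pan, tilt):
    # Synthetic stand-in for a real measurement: loudest at pan=30, tilt=-10.
    return -((pan - 30.0) ** 2 + (tilt + 10.0) ** 2)

pan, tilt, level = climb_to_loudest(fake_noise_level, 0.0, 0.0)
```

A greedy search of this kind can stall on a local maximum when several noise sources are present. Restarting from more than one orientation would be one possible mitigation, although the source does not discuss this.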
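FIG. 12C is a flowchart showing a third example of the adjustment procedure for the second sound collection unit according to this embodiment. In the example of FIG. 12C, the second sound collection unit is adjusted so as to increase the noise input to the second microphone by determining the direction of the noise source using the two microphones while no voice is being uttered.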
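The claims state that, for this third example, the noise source position is estimated from the time delay between the noise at the two microphones while no voice is present. A common way to obtain such a delay is the peak of the cross-correlation, sketched below. The far-field plane-wave model, the microphone spacing, the sample rate, and the speed of sound are assumptions for illustration, and the sketch ignores the effect of the concave reflectors on arrival times.

```python
import numpy as np

SPEED_OF_SOUND = 343.0  # m/s, assumed room-temperature value

def estimate_delay_samples(x1, x2):
    """Lag (in samples) by which x2 trails x1, from the cross-correlation peak."""
    corr = np.correlate(x2, x1, mode="full")
    return int(np.argmax(corr)) - (len(x1) - 1)

def noise_direction_deg(x1, x2, sample_rate, mic_spacing):
    """Far-field arrival angle relative to broadside of the two-microphone axis."""
    tau = estimate_delay_samples(x1, x2) / sample_rate
    sin_theta = np.clip(SPEED_OF_SOUND * tau / mic_spacing, -1.0, 1.0)
    return float(np.degrees(np.arcsin(sin_theta)))

# Synthetic check: the same noise reaches microphone 2 seven samples later.
rng = np.random.default_rng(1)
noise_at_mic1 = rng.standard_normal(2000)
delay = 7
noise_at_mic2 = np.concatenate([np.zeros(delay), noise_at_mic1[:-delay]])
angle = noise_direction_deg(noise_at_mic1, noise_at_mic2,
                            sample_rate=16000, mic_spacing=0.3)
```

With these assumed values the seven-sample delay corresponds to a path difference of about 0.15 m, so the estimated angle comes out close to 30 degrees. The sound collection control unit could then turn the second sound collection unit toward that direction.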
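In the third embodiment, the position of the second sound collection unit was made adjustable so that the noise input to the second microphone is increased in response to changing noise sources. In the fourth embodiment, the position of the first sound collection unit is also made changeable, so that an adjustment that increases the input of the desired voice is performed. According to this embodiment, the input of the desired voice is increased even when the position of the voice source uttering the desired voice changes, and a more accurate pseudo voice is restored. Description of configurations and processing shared with the second and third embodiments is omitted.
FIG. 13 is a block diagram showing the configuration of an information processing system 1300 including a speech processing apparatus 1320 according to this embodiment.
FIG. 14 is a flowchart showing the speech processing procedure according to this embodiment. The flowchart of FIG. 14 is executed by the CPU 910 of FIG. 9 using the RAM 940, and implements the sound collection control unit 1340 of FIG. 13.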
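The second and fourth embodiments described the general-purpose configuration and operation of an information processing system including the speech processing apparatus. The fifth to eighth embodiments describe several examples in which that information processing system is applied to specific information processing systems.
FIG. 15 is a block diagram showing the configuration of a vehicle system 1500, which is an information processing system including the speech processing apparatus according to this embodiment. In FIG. 15, the speech processing apparatus includes the first microphone 301; the second microphone 303; a microphone support member 355 having, on its two sides, a sound reflecting surface 355a serving as the first sound collection unit that collects voice onto the first microphone 301 and a sound reflecting surface 355b serving as the second sound collection unit that collects noise onto the second microphone 303; and the noise suppression circuit 206. The microphone support member 355 is preferably a sound insulator. The vehicle system 1500 includes the speech processing apparatus and, in addition, the voice recognition device 208 and a car navigation device 1509, which is the information processing apparatus. The first microphone 301, the second microphone 303, and the microphone support member 355 serving as a sound insulator may be provided as a microphone set, that is, an integrated voice input unit.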
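The sixth embodiment assumes a vehicle system as the information processing system including the above speech processing apparatus, and uses a microphone set in which the microphone support member of FIG. 8, where the direction of the second sound collection unit that collects noise is adjustable, is separated. According to this embodiment, an occupant's voice instruction to the car navigation device while driving can be conveyed accurately by suppressing the noise emitted by the many noise sources in the vehicle.
FIG. 16 is a block diagram showing the configuration of a vehicle system 1600, which is an information processing system including the speech processing apparatus according to this embodiment. In FIG. 16, the speech processing apparatus includes the first microphone 301; the second microphone 303; a first microphone support member 751 having a sound reflecting surface 751a, which is the first sound collection unit that collects voice onto the first microphone 301; a second microphone support member 1652 having a sound collecting body 805, which is the movable second sound collection unit that collects sound onto the second microphone 303; the noise suppression circuit 206; and the sound collection control unit 640. The first microphone support member 751 is preferably a sound insulator. The vehicle system 1600 includes the speech processing apparatus and, in addition, the voice recognition device 208 and the car navigation device 1509, which is the information processing apparatus. The first microphone 301, the second microphone 303, the first microphone support member 355, the second microphone support member 1652, and the sound collecting body 805 serving as the second sound collection unit may be provided as a microphone set, that is, a voice input unit.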
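The seventh embodiment assumes a personal computer (hereinafter abbreviated as PC), in particular a notebook PC, as the information processing system including the above speech processing apparatus, and uses the microphone set 230-1 shown in FIG. 3B, in which the first microphone and the second microphone are installed on both sides of the microphone support member. According to this embodiment, the operator's voice instruction to the notebook PC can be conveyed accurately to the notebook PC by suppressing room noise, for example noise from equipment such as air conditioners or voices uttered by other people.
FIG. 17 is a block diagram showing the configuration of a notebook personal computer (hereinafter, notebook PC 1700), which is an information processing system including the speech processing apparatus according to this embodiment. For FIG. 17, description of the notebook PC's original functions is omitted, and the configuration related to sound collection onto the first microphone 301 and the second microphone 303, which is the feature of this embodiment, is described.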
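In the seventh embodiment, the first and second sound collection units were fixed to the microphone support member. The eighth embodiment has a configuration similar to that of FIG. 8, in which the direction of the second sound collection unit that collects noise is adjustable, but conversely makes the direction of the first sound collection unit that collects voice adjustable, and uses a microphone set in which the microphone support members are separated. According to this embodiment, the operator's voice instruction to the notebook PC is input as a louder collected voice and, with room noise suppressed (for example noise from equipment such as air conditioners or voices uttered by other people), can be conveyed accurately to the notebook PC.
FIG. 18 is a block diagram showing the configuration of a personal computer (notebook PC 1800), which is an information processing system including the speech processing apparatus according to this embodiment. For FIG. 18, description of the notebook PC's original functions is omitted, and the configuration related to sound collection onto the first microphone 301 and the second microphone 303, which is the feature of this embodiment, is described.
The present invention has been described above with reference to the embodiments, but it is not limited to those embodiments. Various changes that those skilled in the art can understand may be made to the configuration and details of the present invention within its scope. Systems or apparatuses that combine, in any way, the separate features included in the respective embodiments also fall within the scope of the present invention.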
Claims (25)
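- A speech processing apparatus comprising:
a first microphone that receives a first mixed sound in which a desired voice and noise are mixed, and outputs a first mixed signal;
a second microphone that is open to the same sound space as the first microphone, receives a second mixed sound in which the desired voice and the noise are mixed at a ratio different from that of the first mixed sound, and outputs a second mixed signal;
a first sound collection unit having a concave surface that collects the first mixed sound onto the first microphone;
a second sound collection unit that has a concave surface collecting the second mixed sound onto the second microphone and is oriented in a direction different from that of the first sound collection unit; and
a noise suppression circuit that suppresses an estimated noise signal based on the first mixed signal and the second mixed signal, and outputs a pseudo voice signal.
- The speech processing apparatus according to claim 1, wherein the concave surfaces of the first sound collection unit and the second sound collection unit are sound reflecting surfaces that are quadric surfaces whose foci are at the positions of the first microphone and the second microphone, respectively.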
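- The speech processing apparatus according to claim 1, wherein the concave surfaces of the first sound collection unit and the second sound collection unit are sound reflecting surfaces that are pseudo-curved surfaces approximating quadric surfaces whose foci are at the positions of the first microphone and the second microphone, respectively.
- The speech processing apparatus according to claim 3, wherein the pseudo-curved surface is a collection of planes extending in the tangential directions of the quadric surface.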
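- The speech processing apparatus according to any one of claims 1 to 4, wherein the first microphone is a microphone that collects the desired voice and the second microphone is a microphone that collects the noise, and
the range, perpendicular to the axis of the curved surface, over which the quadric surface or pseudo-curved surface of the second sound collection unit collects sound is wider than the range, perpendicular to the axis of the curved surface, over which the quadric surface or pseudo-curved surface of the first sound collection unit collects sound.
- The speech processing apparatus according to any one of claims 1 to 5, further comprising a first movable unit that makes the first sound collection unit movable in a direction in which the first microphone collects the desired voice.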
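- The speech processing apparatus according to claim 6, further comprising first movement control means for controlling movement of the first movable unit such that the proportion of the desired voice in the first mixed sound input to the first microphone increases.
- The speech processing apparatus according to claim 7, wherein the first movement control means changes the direction of the first sound collection unit.
- The speech processing apparatus according to claim 7 or 8, wherein the first movement control means controls movement of the first movable unit in accordance with a first parameter used by the noise suppression circuit.
- The speech processing apparatus according to any one of claims 1 to 9, further comprising a second movable unit that makes the second sound collection unit movable in a direction in which the second microphone collects the noise.
- The speech processing apparatus according to claim 10, further comprising second movement control means for controlling movement of the second movable unit such that the proportion of the noise in the second mixed sound input to the second microphone increases.
- The speech processing apparatus according to claim 11, wherein the second movement control means changes the direction of the second sound collection unit.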
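- The speech processing apparatus according to claim 11 or 12, wherein the second movement control means controls movement of the movable unit in accordance with a second parameter used by the noise suppression circuit.
- The speech processing apparatus according to claim 11 or 12, wherein the second movement control means acquires information indicating the noise mixed in the second mixed sound while changing the direction, and controls movement of the second sound collection unit toward the direction in which the noise is greatest.
- The speech processing apparatus according to claim 11 or 12, wherein the second movement control means estimates, under a condition in which the desired voice is absent, the position of a noise source based on the time delay between the noise in the first mixed sound input to the first microphone and the noise in the second mixed sound input to the second microphone, and controls movement of the second sound collection unit toward the direction of the estimated noise source.
- The speech processing apparatus according to any one of claims 1 to 15, further comprising a sound insulator arranged between the first microphone and the second microphone.
- The speech processing apparatus according to claim 16, wherein the first microphone and the first sound collection unit are attached to one surface of the sound insulator, the second microphone and the second sound collection unit are attached to the other surface of the sound insulator, and the first microphone, the second microphone, the first sound collection unit, the second sound collection unit, and the sound insulator are provided as an integrated voice input unit.
- The speech processing apparatus according to any one of claims 1 to 15, further comprising a first sound insulator attached at a position where the first sound collection unit is sandwiched between it and the first microphone, and a second sound insulator attached at a position where the second sound collection unit is sandwiched between it and the second microphone.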
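- The speech processing apparatus according to any one of claims 1 to 17, wherein the noise suppression circuit includes:
first subtraction means for subtracting, from the first mixed signal, the estimated noise signal estimated to be mixed in the first mixed signal;
second subtraction means for subtracting, from the second mixed signal, an estimated voice signal estimated to be mixed in the second mixed signal;
estimated noise signal generation means for generating the estimated noise signal from the output signal of the second subtraction means; and
estimated voice signal generation means for generating the estimated voice signal from the output signal of the first subtraction means,
and wherein
the pseudo voice signal is the output signal of the first subtraction means.
- A vehicle including the speech processing apparatus according to any one of claims 1 to 19, wherein
the first microphone and the first sound collection unit are arranged at a position where a desired voice uttered by an occupant in the vehicle is collected onto the first microphone by the first sound collection unit, and
the second microphone and the second sound collection unit are arranged at a position where noise generated by a noise source in the vehicle is collected onto the second microphone by the second sound collection unit.
- An information processing apparatus including the speech processing apparatus according to any one of claims 1 to 19, wherein
the first microphone and the first sound collection unit are arranged at a position where a desired voice uttered by an operator of the information processing apparatus is collected onto the first microphone by the first sound collection unit, and
the second microphone and the second sound collection unit are arranged at a position where noise generated by a noise source in the same sound space as the operator is collected onto the second microphone by the first sound collection unit.
- The information processing apparatus according to claim 21, wherein the information processing apparatus is a notebook personal computer, and
the first microphone and the first sound collection unit are arranged on the operator-side surface of the display or on the keyboard surface, and the second microphone and the second sound collection unit are arranged on the surface of the display opposite the operator.
- An information processing system including the speech processing apparatus according to any one of claims 1 to 19, the information processing system comprising:
a voice recognition device that recognizes the desired voice from the pseudo voice signal output by the speech processing apparatus; and
an information processing apparatus that processes information in accordance with the desired voice recognized by the voice recognition device.
- A control method of a speech processing apparatus that comprises:
a first microphone that receives a first mixed sound in which a desired voice and noise are mixed, and outputs a first mixed signal;
a second microphone that is open to the same sound space as the first microphone, receives a second mixed sound in which the desired voice and the noise are mixed at a ratio different from that of the first mixed sound, and outputs a second mixed signal;
a first sound collection unit having a concave surface that collects the first mixed sound onto the first microphone;
a second sound collection unit that has a concave surface collecting the second mixed sound onto the second microphone and is oriented in a direction different from that of the first sound collection unit; and
a noise suppression circuit that suppresses an estimated noise signal based on the first mixed signal and the second mixed signal, and outputs a pseudo voice signal,
the control method comprising the steps of:
acquiring a parameter of the noise suppression circuit;
determining, in accordance with the parameter of the noise suppression circuit, a direction of the second sound collection unit such that the proportion of the noise in the second mixed sound input to the second microphone increases; and
controlling the direction of the second sound collection unit.
- A storage medium storing a control program of a speech processing apparatus that comprises:
a first microphone that receives a first mixed sound in which a desired voice and noise are mixed, and outputs a first mixed signal;
a second microphone that is open to the same sound space as the first microphone, receives a second mixed sound in which the desired voice and the noise are mixed at a ratio different from that of the first mixed sound, and outputs a second mixed signal;
a first sound collection unit having a concave surface that collects the first mixed sound onto the first microphone;
a second sound collection unit that has a concave surface collecting the second mixed sound onto the second microphone and is oriented in a direction different from that of the first sound collection unit; and
a noise suppression circuit that suppresses an estimated noise signal based on the first mixed signal and the second mixed signal, and outputs a pseudo voice signal,
the control program causing a computer to execute the steps of:
acquiring a parameter of the noise suppression circuit;
determining, in accordance with the parameter of the noise suppression circuit, a direction of the second sound collection unit such that the proportion of the noise in the second mixed sound input to the second microphone increases; and
controlling the direction of the second sound collection unit.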
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
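JP2012552642A JP5936070B2 (ja) | 2011-01-13 | 2011-12-03 | Speech processing apparatus, control method thereof, control program thereof, and vehicle, information processing apparatus, and information processing system including the speech processing apparatus |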
US13/978,446 US20130282370A1 (en) | 2011-01-13 | 2011-12-03 | Speech processing apparatus, control method thereof, storage medium storing control program thereof, and vehicle, information processing apparatus, and information processing system including the speech processing apparatus |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP2011005316 | 2011-01-13 | ||
JP2011-005316 | 2011-01-13 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2012096073A1 (ja) | 2012-07-19 |
Family
ID=46506987
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
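PCT/JP2011/077996 WO2012096073A1 (ja) | 2011-01-13 | 2011-12-03 | Speech processing apparatus, control method thereof, storage medium storing control program thereof, and vehicle, information processing apparatus, and information processing system including the speech processing apparatus |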
Country Status (3)
Country | Link |
---|---|
US (1) | US20130282370A1 (ja) |
JP (1) | JP5936070B2 (ja) |
WO (1) | WO2012096073A1 (ja) |
Families Citing this family (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
KR101827276B1 (ko) * | 2016-05-13 | 2018-03-22 | 엘지전자 주식회사 | 전자 장치 및 그 제어 방법 |
JP2018191145A (ja) * | 2017-05-08 | 2018-11-29 | オリンパス株式会社 | 収音装置、収音方法、収音プログラム及びディクテーション方法 |
CN110750142A (zh) * | 2019-10-21 | 2020-02-04 | 湖南理工学院 | 一种基于人工智能的自媒体信息编辑装置 |
CN111627456B (zh) * | 2020-05-13 | 2023-07-21 | 广州国音智能科技有限公司 | 噪音排除方法、装置、设备及可读存储介质 |
CN113066500B (zh) * | 2021-03-30 | 2023-05-23 | 联想(北京)有限公司 | 声音采集方法、装置及设备和存储介质 |
Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JPS5514827U (ja) * | 1978-07-12 | 1980-01-30 | ||
JPH07231495A (ja) * | 1994-02-18 | 1995-08-29 | Hokkaido Univ | 集音器 |
JP2004215066A (ja) * | 2003-01-07 | 2004-07-29 | Nissan Motor Co Ltd | 自動車用音声入力装置 |
JP2004279241A (ja) * | 2003-03-17 | 2004-10-07 | Internatl Business Mach Corp <Ibm> | 音源位置取得システム、音源位置取得方法、該音源位置取得システムに使用するための音反射要素および該音反射要素の形成方法 |
JP2005236407A (ja) * | 2004-02-17 | 2005-09-02 | Toshiba Corp | 音響処理装置、音響処理方法および製造方法 |
JP2006525743A (ja) * | 2003-05-08 | 2006-11-09 | タンドベルク・テレコム・エイ・エス | 音源追跡のための配置及び方法 |
WO2009051132A1 (ja) * | 2007-10-19 | 2009-04-23 | Nec Corporation | 信号処理システムと、その装置、方法及びそのプログラム |
Family Cites Families (12)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JPH07210180A (ja) * | 1994-01-12 | 1995-08-11 | Sony Corp | 集音マイク |
JP4163294B2 (ja) * | 1998-07-31 | 2008-10-08 | 株式会社東芝 | 雑音抑圧処理装置および雑音抑圧処理方法 |
US6826528B1 (en) * | 1998-09-09 | 2004-11-30 | Sony Corporation | Weighted frequency-channel background noise suppressor |
US20040114778A1 (en) * | 2002-12-11 | 2004-06-17 | Gobeli Garth W. | Miniature directional microphone |
KR100806769B1 (ko) * | 2003-09-02 | 2008-03-06 | 닛본 덴끼 가부시끼가이샤 | 신호 처리 방법 및 장치 |
JP4797330B2 (ja) * | 2004-03-08 | 2011-10-19 | 日本電気株式会社 | ロボット |
CN1983642A (zh) * | 2006-02-09 | 2007-06-20 | 易斌宣 | 超高倍率聚光太阳能电池装置 |
US20100098266A1 (en) * | 2007-06-01 | 2010-04-22 | Ikoa Corporation | Multi-channel audio device |
US9302630B2 (en) * | 2007-11-13 | 2016-04-05 | Tk Holdings Inc. | System and method for receiving audible input in a vehicle |
JP2009124540A (ja) * | 2007-11-16 | 2009-06-04 | Toyota Motor Corp | 車両用通話装置、通話方法 |
JP2010023534A (ja) * | 2008-07-15 | 2010-02-04 | Panasonic Corp | 騒音低減装置 |
US8229126B2 (en) * | 2009-03-13 | 2012-07-24 | Harris Corporation | Noise error amplitude reduction |
2011
- 2011-12-03 WO PCT/JP2011/077996 patent/WO2012096073A1/ja active Application Filing
- 2011-12-03 US US13/978,446 patent/US20130282370A1/en not_active Abandoned
- 2011-12-03 JP JP2012552642A patent/JP5936070B2/ja active Active
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2014187685A (ja) * | 2013-01-24 | 2014-10-02 | Nippon Telegr & Teleph Corp <Ntt> | 収音装置 |
JP2020041958A (ja) * | 2018-09-13 | 2020-03-19 | 日本電気株式会社 | 音響特性計測装置、音響特性計測方法、およびプログラム |
JP7127448B2 (ja) | 2018-09-13 | 2022-08-30 | 日本電気株式会社 | 音響特性計測装置、音響特性計測方法、およびプログラム |
CN115223327A (zh) * | 2021-07-14 | 2022-10-21 | 广州汽车集团股份有限公司 | 一种车内活体保护方法及系统 |
Also Published As
Publication number | Publication date |
---|---|
US20130282370A1 (en) | 2013-10-24 |
JPWO2012096073A1 (ja) | 2014-06-09 |
JP5936070B2 (ja) | 2016-06-15 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
121 | Ep: the epo has been informed by wipo that ep was designated in this application |
Ref document number: 11855732 Country of ref document: EP Kind code of ref document: A1 |
|
ENP | Entry into the national phase |
Ref document number: 2012552642 Country of ref document: JP Kind code of ref document: A |
|
WWE | Wipo information: entry into national phase |
Ref document number: 13978446 Country of ref document: US |
|
NENP | Non-entry into the national phase |
Ref country code: DE |
|
122 | Ep: pct application non-entry in european phase |
Ref document number: 11855732 Country of ref document: EP Kind code of ref document: A1 |