JP5538249B2 - Stereo headset - Google Patents

Stereo headset

Info

Publication number
JP5538249B2
JP5538249B2 (application JP2011009938A)
Authority
JP
Japan
Prior art keywords
microphone
sound
signal
wearer
output
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
JP2011009938A
Other languages
Japanese (ja)
Other versions
JP2012151745A (en)
Inventor
学 岡本
陽一 羽田
Original Assignee
日本電信電話株式会社
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 日本電信電話株式会社
Priority to JP2011009938A
Publication of JP2012151745A
Application granted
Publication of JP5538249B2
Application status: Active
Anticipated expiration

Description

  The present invention relates to a stereo headset in which sound-collecting microphones are attached to headphones or earphones used for listening to a stereo received signal.

  Headsets, in which microphones are attached to headphones or earphones, are used in the broadcasting and communication fields: while the wearer listens to the sound from the far end (the connection destination), the wearer's voice is collected and transmitted to the far end or recorded.

  A typical headset has speakers placed in contact with one or both of the wearer's ears and a single microphone arranged near the wearer's mouth.

  FIG. 7 shows an example of a conventional headset configuration. In this example, the headset includes two speakers 11L and 11R for reproducing the left and right signals of a stereo signal, and a microphone 12. The speakers 11L and 11R are attached so as to be in contact with both ears of the wearer.

  The left channel reception signal and the right channel reception signal are input to the speakers 11L and 11R, respectively, so that the wearer can listen to the far-end sound reproduced by the speakers 11L and 11R. The wearer's voice is picked up by the microphone 12 and transmitted to the far end, or to a recorder, as a transmission signal. In some headsets the microphone is arranged near a speaker rather than near the mouth, but the function is the same in that the wearer's voice is collected and transmitted.

  Patent Document 1, on the other hand, describes a configuration in which an environmental sound recording microphone is arranged in the vicinity of each speaker in addition to the microphone arranged near the wearer's mouth; the environmental sound recording microphones collect the environmental sound around the wearer and send it to the far end. Since an environmental sound recording microphone is located near each of the wearer's left and right ears, transmitting the two collected signals as a 2-channel signal conveys the spatial information of the place where the wearer is located. For example, when the receiving side listens to the two channels with the left and right ears, the environmental sound at the recording point can be heard with a sense of sound image localization.

  There are also devices that do not transmit the wearer's voice to the far end but instead actively collect the environmental sound around the wearer and record or transmit it. Patent Document 2 describes a configuration in which a microphone is installed in the vicinity of the headphone to collect the environmental sound around the wearer while making howling unlikely to occur when the wearer listens to (monitors) the environmental sound.

  When the configuration described in Patent Document 2 (a binaural headphone microphone) is attached to each of the left and right ears and the signals are recorded and transmitted as a 2-channel signal, the ambient sound can be recorded and transmitted while its spatial information is maintained. Listening to the signal with the left and right ears likewise makes it possible to hear the environmental sound at the recording point with a sense of sound image localization.

  On the other hand, some headsets remove environmental sounds other than the wearer's voice from the sound collected by the headset microphone: a separate microphone is installed in addition to the microphone near the wearer's mouth, and, based on the signal collected by that separately installed microphone, components other than the wearer's voice are removed from the sound picked up by the microphone near the mouth. Non-Patent Document 1 describes such a configuration, in which the noise mixed into the voice microphone is suppressed by signal processing based on the collected signal of a noise microphone before transmission.

Patent Document 1: Japanese Patent No. 3556987
Patent Document 2: Japanese Utility Model Publication No. 60-10156

Non-Patent Document 1: Mariko Aoki and three others, "High-noise noise suppression based on the sound source separation method SAFIA: Challenge to an F1 circuit", IEICE Technical Report, IEICE, April 2004, EA2004-2, pp. 7-12

  As described above, a general headset has a single microphone and can transmit the wearer's voice, but it cannot transmit the environmental sound around the wearer while maintaining its spatial information. Similarly, the method described in Non-Patent Document 1, which removes ambient noise from the sound collected by a microphone near the mouth using the signals of other microphones, can transmit the wearer's voice but cannot transmit the surrounding environmental sound while maintaining its spatial information.

  In the configurations described in Patent Document 1 and Patent Document 2, on the other hand, the ambient environmental sound can be transmitted and recorded while its spatial information is maintained. However, when the wearer speaks, the voice is mixed into the ambient environmental sound. Since the wearer's voice is picked up at a higher level than the ambient sound, it dominates the transmitted or recorded signal, and the ambient sound becomes difficult to hear.

  In view of this problem, the object of the present invention is to provide a stereo headset that can transmit and record the environmental sound around the wearer while retaining the spatial information of the sound the wearer is listening to, and that can erase or attenuate the voice uttered by the wearer.

  According to a first aspect of the present invention, a stereo headset including two speakers for reproducing the left and right signals of a stereo signal comprises: two environmental sound collection microphones, each arranged in the vicinity of one of the speakers and acoustically isolated from it; a voice collection microphone, positioned away from the environmental sound collection microphones, that picks up the voice of the stereo headset wearer; and subtraction processing means that subtracts the signal collected by the voice collection microphone from the signal collected by each environmental sound collection microphone and outputs the result.

  According to a second aspect of the present invention, in the first aspect, the subtraction processing means comprises: a signal estimator that estimates, from the signal collected by the voice collection microphone, the signal based on the utterance included in the signal collected by the environmental sound collection microphone, and outputs the estimate; a gain setter that allows an arbitrary gain to be set and outputs the set gain; a multiplier that multiplies the output of the signal estimator by the output of the gain setter; and a subtractor that subtracts the output of the multiplier from the signal picked up by the environmental sound collection microphone.

  According to a third aspect of the present invention, in the second aspect, the subtraction processing means further includes a Fourier transformer at each input terminal for the signal collected by the environmental sound collection microphone and for the signal collected by the voice collection microphone, and an inverse Fourier transformer at the output terminal of the subtractor, so that the processing of the signal estimator, the multiplier, and the subtractor is performed in the frequency domain.

  According to a fourth aspect of the present invention, in the first aspect, the subtraction processing means comprises: an adaptive filter that receives the signal collected by the environmental sound collection microphone and the signal collected by the voice collection microphone and eliminates the signal based on the utterance included in the signal collected by the environmental sound collection microphone; a gain setter that allows an arbitrary gain to be set and outputs the set gain; a multiplier that multiplies the signal collected by the voice collection microphone by the output of the gain setter; and an adder that adds the output of the multiplier to the output of the adaptive filter.

  According to the present invention, the environmental sound around the wearer can be transmitted and recorded while the spatial information of the sound the wearer is listening to is maintained, and even when the wearer speaks, the voice can be erased or transmitted and recorded at an arbitrary level.

FIG. 1 shows the structure of an embodiment of the stereo headset according to the present invention.
FIG. 2 shows a first configuration example of the subtraction processing means in FIG. 1.
FIG. 3 shows a second configuration example of the subtraction processing means in FIG. 1.
FIG. 4 shows a third configuration example of the subtraction processing means in FIG. 1.
FIG. 5 shows a first modification of the stereo headset according to the present invention.
FIG. 6 shows a second modification of the stereo headset according to the present invention.
FIG. 7 shows a conventional configuration example of a stereo headset.

  Embodiments of the present invention will be described with reference to the drawings.

  FIG. 1 shows the structure of an embodiment of the stereo headset according to the present invention. As in the conventional example shown in FIG. 7, the stereo headset includes two speakers 11L and 11R for reproducing the left and right stereo signals. The left channel reception signal and the right channel reception signal are input to the speakers 11L and 11R, respectively. When the stereo headset is worn, the speakers 11L and 11R face the wearer's left and right ear holes.

  In this example, two environmental sound collection microphones 21L and 21R are disposed in the vicinity of the speakers 11L and 11R, acoustically isolated from them. The environmental sound collection microphones 21L and 21R collect the environmental sound around the wearer. Since these microphones are arranged near the positions of the wearer's left and right ear holes, the signals they collect retain sound image localization information; if the collected signals are reproduced as the left and right signals as they are, the listener hears the sound the wearer is listening to in the same positional relationship.

  However, when the wearer speaks, both environmental sound collection microphones 21L and 21R pick up the wearer's voice together with the ambient environmental sound. The wearer's voice reaches the microphones both through the air from the mouth and through the wearer's body, and in either case the collected sound originates from the wearer's utterance.

  In this example, in addition to the environmental sound collection microphones 21L and 21R near the speakers 11L and 11R, a voice collection microphone 22 is arranged near the wearer's mouth. The voice collection microphone 22 is located away from the environmental sound collection microphones 21L and 21R. The voice collection microphone 22 also picks up ambient sound, but because it is close to the wearer's mouth, the main component of the collected sound is the voice uttered by the wearer.

  The signal collected by the left environmental sound collection microphone 21L and the signal collected by the voice collection microphone 22 are input to the subtraction processing means 30L; similarly, the signal collected by the right environmental sound collection microphone 21R and the signal collected by the voice collection microphone 22 are input to the subtraction processing means 30R. The subtraction processing means 30L and 30R have the same configuration; they process their input signals and output the results as the left channel transmission signal and the right channel transmission signal, respectively.
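  The per-channel wiring can be summarized in the following Python sketch. The function and variable names are illustrative (the patent defines no code); each environmental microphone signal is paired with the shared voice microphone signal and passed through one instance of the subtraction processing described below.

```python
import numpy as np

def process_stereo_headset(env_left, env_right, voice, subtract):
    """Produce left/right transmission signals from the three microphone signals.

    env_left, env_right, voice : 1-D numpy arrays (same length and sample rate)
    subtract : callable implementing one of the subtraction processors sketched below
    """
    tx_left = subtract(env_left, voice)    # subtraction processing means 30L
    tx_right = subtract(env_right, voice)  # subtraction processing means 30R
    return tx_left, tx_right
```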

  FIG. 2 shows the configuration of the subtraction processing means 30 (30L, 30R). The subtraction processing means 30 consists of a signal estimator 31, a gain setter 32, a multiplier 33, and a subtractor 34. The output (collected signal) of the voice collection microphone 22 is input to the signal estimator 31, and the output (collected signal) of the environmental sound collection microphone 21L (21R) is input to the subtractor 34.

  The signal estimator 31 estimates, from the output of the voice collection microphone, the signal based on the wearer's utterance that is included in the output of the environmental sound collection microphone, and outputs the estimate. Since the difference between the sound collection characteristics of the voice collection microphone 22 and those of the environmental sound collection microphone 21L (21R) with respect to the wearer's voice is known, the signal estimator 31 estimates the utterance-based component contained in the environmental sound collection microphone output from this difference in characteristics.

  The gain setter 32 allows an arbitrary gain to be set. The gain can be set by the wearer, for example, through an adjustment knob or similar control (detailed illustration is omitted). The gain setter 32 outputs the set gain.

  The output of the signal estimator 31 and the output of the gain setter 32 are input to the multiplier 33, which multiplies the output of the signal estimator 31 by the gain output from the gain setter 32. The output of the multiplier 33 is input to the subtractor 34, which subtracts it from the output of the environmental sound collection microphone.
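  A minimal time-domain sketch of this first configuration follows. It assumes the estimator is modeled as a known FIR response from the voice microphone to the environmental microphone; the filter coefficients and names are hypothetical, not taken from the patent.

```python
import numpy as np

def subtract_time_domain(env_sig, voice_sig, est_fir, gain=1.0):
    """First configuration (FIG. 2): estimate the wearer's voice as it appears
    in the environmental microphone, scale it by the set gain, and subtract it.

    env_sig, voice_sig : 1-D numpy arrays from the two microphones
    est_fir : FIR coefficients modeling the known difference in pickup of the
              wearer's voice between the voice mic and the environmental mic
    gain    : 1.0 removes the voice entirely; smaller values only attenuate it
    """
    n = min(len(env_sig), len(voice_sig))
    # Signal estimator 31: voice component expected in the environmental mic
    estimated_voice = np.convolve(voice_sig[:n], est_fir)[:n]
    # Multiplier 33 scales the estimate; subtractor 34 removes it
    return env_sig[:n] - gain * estimated_voice
```

  With the gain set to 1.0 the wearer's voice is (ideally) cancelled; with, say, 0.5 it is only attenuated, matching the adjustable behavior described above.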

  By providing the subtraction processing means 30 (30L, 30R) that performs the above processing, the proportion of the wearer's voice contained in the output signals of the environmental sound collection microphones 21L and 21R can be adjusted arbitrarily; that is, only the wearer's voice is adjusted. As a result, the wearer's voice can be erased from the signal that picks up the ambient environmental sound while its sound image localization information is maintained, or it can be attenuated to an arbitrary level.

  FIG. 3 shows a second configuration example of the subtraction processing means. The subtraction processing means 30' of this example differs from the subtraction processing means 30 of the first configuration example shown in FIG. 2 in that Fourier transformers (FFT) 35 and 36 and an inverse Fourier transformer (IFFT) 37 are added.

  The FFTs 35 and 36 are arranged at the input terminals for the output (collected signal) of the environmental sound collection microphone and the output (collected signal) of the voice collection microphone, and the IFFT 37 is arranged at the output terminal of the subtractor 34. The output of the environmental sound collection microphone and the output of the voice collection microphone are input to the FFTs 35 and 36, respectively, and converted into frequency-domain signals, so that the processing of the signal estimator 31, the multiplier 33, and the subtractor 34 is performed in the frequency domain. That is, in this example the subtraction is performed in the frequency domain, and the time-domain signal is restored after the processing. Performing the subtraction in the frequency domain makes the voice cancellation and volume adjustment robust against phase fluctuations.
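  The sketch below illustrates one way such frequency-domain processing could look, assuming magnitude-domain (spectral-subtraction-style) processing per FFT frame as a concrete realization of the phase-robust subtraction mentioned above; the patent itself does not fix this detail, and the frame size, hop size, and transfer-function model are illustrative.

```python
import numpy as np

def subtract_freq_domain(env_sig, voice_sig, est_tf, gain=1.0,
                         frame=512, hop=256):
    """Second configuration (FIG. 3): estimate/scale/subtract per frequency bin
    on windowed FFT frames (FFT 35/36), then restore the time signal (IFFT 37).

    est_tf : complex array of length frame//2 + 1 modeling how the wearer's
             voice in the voice mic maps to the environmental mic.
    """
    n = min(len(env_sig), len(voice_sig))
    win = np.hanning(frame)
    out = np.zeros(n + frame)
    norm = np.zeros(n + frame)
    for start in range(0, n - frame + 1, hop):
        E = np.fft.rfft(win * env_sig[start:start + frame])
        V = np.fft.rfft(win * voice_sig[start:start + frame])
        est_mag = np.abs(est_tf * V)                            # signal estimator 31
        new_mag = np.maximum(np.abs(E) - gain * est_mag, 0.0)   # multiplier 33, subtractor 34
        Y = new_mag * np.exp(1j * np.angle(E))                  # keep environmental-mic phase
        out[start:start + frame] += win * np.fft.irfft(Y, frame)
        norm[start:start + frame] += win ** 2
    return out[:n] / np.maximum(norm[:n], 1e-12)                # overlap-add back to time domain
```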

  Next, the third configuration example of the subtraction processing means, shown in FIG. 4, will be described. In this example, the subtraction processing means 30″ uses an adaptive filter. The output (collected signal) of the environmental sound collection microphone and the output (collected signal) of the voice collection microphone are input to the adaptive filter 38. The adaptive filter 38 performs processing that eliminates the signal based on the wearer's utterance included in the output of the environmental sound collection microphone; that is, the wearer's voice is removed from the output of the environmental sound collection microphone by automatic adaptive processing.

  In this example, after the wearer's voice has been removed from the ambient environmental sound, the wearer's voice is added back to the ambient environmental sound at a desired volume. For this purpose, a gain setter 32, a multiplier 33, and an adder 39 are provided in addition to the adaptive filter 38.

  The gain set by the gain setter 32 is input to the multiplier 33. The multiplier 33 multiplies the output of the voice collection microphone by the gain and outputs the result to the adder 39. The adder 39 adds the output of the multiplier 33 to the output of the adaptive filter 38. In this way, the wearer's voice can be erased from the ambient environmental sound and, if necessary, added back to the ambient environmental sound at a desired volume.
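  A sketch of this third configuration using a normalized LMS (NLMS) adaptive filter is shown below. NLMS is only one possible choice; the patent does not specify the adaptation algorithm, and the filter length and step size here are illustrative.

```python
import numpy as np

def subtract_adaptive(env_sig, voice_sig, gain=0.0, taps=128, mu=0.5):
    """Third configuration (FIG. 4): adaptive filter 38 cancels the wearer's
    voice from the environmental mic; adder 39 then mixes the voice back in,
    scaled by the gain from gain setter 32 (gain=0 keeps it removed)."""
    n = min(len(env_sig), len(voice_sig))
    w = np.zeros(taps)                            # adaptive filter coefficients
    cancelled = np.zeros(n)                       # first `taps` samples left as zeros
    for i in range(taps, n):
        x = voice_sig[i - taps:i][::-1]           # reference input: voice microphone
        e = env_sig[i] - np.dot(w, x)             # error = env mic minus estimated voice
        w += mu * e * x / (np.dot(x, x) + 1e-8)   # NLMS coefficient update
        cancelled[i] = e                          # voice-cancelled environmental sound
    # Multiplier 33 and adder 39: re-add the wearer's voice at the chosen level
    return cancelled + gain * voice_sig[:n]
```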

  Various configuration examples of the subtraction processing means have been described above. The subtraction processing means may, however, consist of only a subtractor, simply subtracting the output of the voice collection microphone from the output of the environmental sound collection microphone.

  Incidentally, in the embodiment described above, the environmental sound collection microphones 21L and 21R are disposed acoustically isolated from the speakers 11L and 11R so that the sound of the speakers 11L and 11R does not enter the environmental sound collection microphones 21L and 21R; this prevents degradation of the environmental sound collection microphones' signals and prevents howling. To achieve this isolation, however, the speakers are attached so as to close off the wearer's ears, and the wearer therefore cannot directly hear the surrounding environmental sound.

  The stereo headset configuration shown in FIG. 5 addresses this problem by allowing the wearer to listen to the surrounding environmental sound. In this example, adders 40L and 40R are provided at the input terminals of the left channel reception signal and the right channel reception signal. The adders 40L and 40R add the left channel transmission signal and the right channel transmission signal output from the subtraction processing means 30L and 30R to the left channel reception signal and the right channel reception signal input to the speakers 11L and 11R, respectively. As a result, the wearer can hear the surrounding environmental sound.
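  A small sketch of this FIG. 5 modification, mixing each channel's transmission signal into the corresponding received signal before it drives the speaker (function and variable names are illustrative):

```python
import numpy as np

def speaker_feed_with_monitoring(rx_left, rx_right, tx_left, tx_right):
    """FIG. 5 modification: adders 40L/40R add the left/right transmission
    signals (processed environmental sound) to the left/right reception
    signals, so the wearer can hear the surroundings through the speakers."""
    n = min(len(rx_left), len(tx_left))
    m = min(len(rx_right), len(tx_right))
    spk_left = rx_left[:n] + tx_left[:n]      # adder 40L -> speaker 11L
    spk_right = rx_right[:m] + tx_right[:m]   # adder 40R -> speaker 11R
    return spk_left, spk_right
```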

  Next, the configuration of the stereo headset shown in FIG. 6 will be described. In this configuration, in addition to the left channel transmission signal and the right channel transmission signal output from the subtraction processing means 30L and 30R, the wearer's voice collected by the voice collection microphone 22 is output on a separate channel. If the wearer's voice is transmitted and recorded on a separate channel, the wearer's voice and the ambient environmental sound can be handled independently.

  In the foregoing, various configurations of a stereo headset have been described that can transmit and record the ambient sound the wearer is listening to while retaining the sound image localization information perceived by the wearer, and that can erase or reduce the volume of the wearer's voice so that it does not interfere with the environmental sound. If there are sounds other than the wearer's voice that should be erased, microphones that can individually collect each such sound are arranged according to the number of sounds to be erased, and the same subtraction processing as in the examples above is applied to the signals collected by the environmental sound collection microphones 21L and 21R, so that those sounds can likewise be erased or reduced.

  Note that, in the examples above, a single voice collection microphone 22, positioned away from the environmental sound collection microphones 21L and 21R, is arranged near the wearer's mouth, but a plurality of voice collection microphones 22 may be used.

  When the stereo headset according to the present invention is used for real-time communication such as a teleconference, the wearer's sound environment is shared with a remote party, who can feel immersed in the space where the wearer is, as if through a remote agent; at the same time, the wearer's voice can be heard at a volume that is comfortable for the remote listener.

  Further, if the output (transmission signal) of the stereo headset according to the present invention is recorded, the sound heard by the wearer at the site can be turned into content that includes three-dimensional sound. In this case, if the wearer's voice can be output on a separate channel as in the configuration shown in FIG. 6, the wearer's voice can be recorded individually as a guide voice, or only the wearer's voice can be recognized and converted into text, for example to attach a table of contents to the content.

Claims (2)

  1. A stereo headset having two speakers for reproducing the left and right signals of a stereo signal, comprising:
    two environmental sound collection microphones, each arranged in the vicinity of one of the speakers and acoustically isolated from the speakers;
    a voice collection microphone that is located away from the environmental sound collection microphones and collects the voice of the stereo headset wearer; and
    subtraction processing means for subtracting the signal collected by the voice collection microphone from the signal collected by the environmental sound collection microphone and outputting the result,
    wherein the subtraction processing means includes:
    an adaptive filter to which the signal collected by the environmental sound collection microphone and the signal collected by the voice collection microphone are input, and which eliminates the signal based on the utterance included in the signal collected by the environmental sound collection microphone;
    a gain setter by which an arbitrary gain can be set and which outputs the set gain;
    a multiplier that multiplies the signal collected by the voice collection microphone by the output of the gain setter; and
    an adder that adds the output of the multiplier to the output of the adaptive filter.
  2. The stereo headset of claim 1, wherein
    the subtraction processing means comprises a Fourier transformer at each of the input terminals for the signal collected by the environmental sound collection microphone and the signal collected by the voice collection microphone, and an inverse Fourier transformer at the output terminal of the adder, and
    the processing of the adaptive filter, the multiplier, and the adder is performed in the frequency domain.

Priority Applications (1)

Application Number Priority Date Filing Date Title
JP2011009938A JP5538249B2 (en) 2011-01-20 2011-01-20 Stereo headset


Publications (2)

Publication Number Publication Date
JP2012151745A JP2012151745A (en) 2012-08-09
JP5538249B2 (en) 2014-07-02

Family

ID=46793566

Family Applications (1)

Application Number Title Priority Date Filing Date
JP2011009938A Active JP5538249B2 (en) 2011-01-20 2011-01-20 Stereo headset

Country Status (1)

Country Link
JP (1) JP5538249B2 (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP6204312B2 (en) * 2014-08-28 2017-09-27 日本電信電話株式会社 Sound collector

Family Cites Families (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP3556987B2 (en) * 1995-02-07 2004-08-25 富士通株式会社 Environmental sound transmission type headset devices
JP2001036984A (en) * 1999-07-16 2001-02-09 Matsushita Electric Ind Co Ltd Acoustic reproducing device
JP4282317B2 (en) * 2002-12-05 2009-06-17 アルパイン株式会社 Voice communication device
US7099821B2 (en) * 2003-09-12 2006-08-29 Softmax, Inc. Separation of target acoustic signals in a multi-transducer arrangement
JP2008228198A (en) * 2007-03-15 2008-09-25 Sharp Corp Apparatus and method for adjusting playback sound
JP2008263383A (en) * 2007-04-11 2008-10-30 Sony Ericsson Mobilecommunications Japan Inc Apparatus and method for canceling generated sound
JP5087514B2 (en) * 2008-09-29 2012-12-05 京セラ株式会社 Mobile communication terminal

Also Published As

Publication number Publication date
JP2012151745A (en) 2012-08-09


Legal Events

Date  Code  Title  Description

2012-12-25  A621  Written request for application examination (JAPANESE INTERMEDIATE CODE: A621)
2013-09-04  A977  Report on retrieval (JAPANESE INTERMEDIATE CODE: A971007)
2013-09-10  A131  Notification of reasons for refusal (JAPANESE INTERMEDIATE CODE: A131)
2013-10-24  A521  Written amendment (JAPANESE INTERMEDIATE CODE: A523)
2014-03-04  A131  Notification of reasons for refusal (JAPANESE INTERMEDIATE CODE: A131)
2014-04-01  A521  Written amendment (JAPANESE INTERMEDIATE CODE: A523)
            TRDD  Decision of grant or rejection written
2014-04-22  A01   Written decision to grant a patent or to grant a registration (utility model) (JAPANESE INTERMEDIATE CODE: A01)
            R150  Certificate of patent (=grant) or registration of utility model (JAPANESE INTERMEDIATE CODE: R150), Ref document number: 5538249, Country of ref document: JP
2014-04-28  A61   First payment of annual fees (during grant procedure) (JAPANESE INTERMEDIATE CODE: A61)