KR101647974B1

KR101647974B1 - Smart earphone, appliance and system having smart mixing module, and method for mixing external sound and internal audio

Info

Publication number: KR101647974B1
Application number: KR1020150044533A
Authority: KR
Inventors: 신대진
Original assignee: 주식회사 이드웨어
Priority date: 2015-03-30
Filing date: 2015-03-30
Publication date: 2016-08-16

Abstract

Provided is a smart earphone with a smart mixing module that optimally mixes the device sound and the external sound.
The smart mixing module 300 includes an external sound detection unit 310, a sound adjusting unit 320, an acoustic output unit 400, an energy detection unit 311, a voice activity detection unit 312, a noise energy detection unit 313, an adaptive filtering unit 340, a sound quality enhancing unit 341, a background noise removing unit 342, a specific situation recognizing unit 343, a warning sound generating unit 344, and an acoustic mixing unit 330, so that even while device sound is being output, the user can clearly hear surrounding sounds that need special attention.

Description

TECHNICAL FIELD [0001] The present invention relates to a smart earphone equipped with a smart mixing module, a device having a smart mixing module, and a method and system for mixing an external sound and a device sound.

The present invention relates to a personal audio output device such as an earphone, an earset, a headphone, a headset, and the like. More particularly, it relates to a smart earphone having a smart mixing module that mixes the device sound and the external sound in an optimal state, so that the wearer can communicate naturally and smoothly with the surroundings. Here, the device sound refers to an internal sound that is generated in a device and output through an earphone, an earset, a headphone, a headset, or a speaker, where the device may be a smart device such as a mobile phone, a notebook, or a pad, a media player such as an MP3 player, a computer such as a PC or a notebook computer, or a sound appliance such as a cassette player, a CD player, or a home theater. An external sound refers to all sound generated in the surroundings rather than in the device. Herein, the term mixing refers to appropriately adjusting and combining the respective volumes of the external sound and the device sound, and includes completely canceling, i.e., shutting off, at least one of the external sound and the device sound.

The present invention also relates to a smart device or sound device having a smart mixing module for mixing a device sound and an external sound in an optimal state; to a smart system or sound system in which the smart mixing module is implemented across a smart earphone or speaker and a smart device or sound device; and to a method of mixing device sound and external sound in such an earphone, device, or system.

A personal audio output device collectively refers to a speaker that sounds near an individual's ear so that only that person can listen, in contrast to a loudspeaker that spreads sound so that many people can hear it at the same time.

The personal audio output device is structurally divided into the earphone and the headphone. The earphone is relatively compact because mobility matters, and is worn on the auricle or in the ear canal. The headphone is larger in size, with a relatively large speaker.

Both earphones and headphones basically have only a speaker for sound output, but they also have a microphone for sound input in order to expand their functions. An earphone with a microphone is called an earset, and a headphone with a microphone is called a headset.

Although headphones and earphones once had somewhat different uses, headphones have become smaller and earphones have diversified in use and function, so the distinction between the two is increasingly blurred.

Open headphones are similar to closed headphones, except that a duct (a sound hole) is provided so that the user can hear outside sound. Condenser-type headphones use a diaphragm speaker driven by static electricity.

Earphones are divided into open earphones and canal earphones. The open earphone is the most common type. The canal earphone fits tightly into the ear canal like a wine cork, is also called an in-ear earphone, and insulates external sound far better than an open earphone.

Herein, the term earphone should be interpreted as a generic term for all personal audio output devices including earphones, ear sets, headphones, and headsets, unless expressly defined otherwise or interpreted differently within the context.

By their structure, headphones block outside sound much more effectively than earphones, but earphones also make it difficult for the wearer to hear outside sound.

For example, an earphone wearer may not hear people nearby calling out, may fail to hear a conversation partner's voice clearly and so have the conversation disrupted, or may not hear the sound of an approaching vehicle while walking.

A person talking on a call through an earphone hears the other party clearly, but may feel awkward speaking because their own voice is blocked by the earphone.

Various techniques have been developed to address these problems by letting external sounds be heard while earphones are worn.

FIG. 9 shows a personal audio output device connected to a device.

The illustrated device 200 is a kind of smartphone including a phone module 210 and a media player 220. The device is not limited to a smartphone and may be an audio device having only the media player 220 without the phone module 210.

The device 200 includes a microphone 110 for inputting the user's outgoing voice, and a switch 230 for answering a telephone call while listening to music. The switch 230 is generally integrated with the call-answer button.

The personal audio output device may be an earphone or headphone having an audio output unit 400 composed of a speaker 410, or an earset or headset that additionally has the microphone 110 and the switch 230.

FIG. 10 shows a personal audio output device similar to that shown in FIG. 9, but including an external sound input unit 100 intended to solve the above-described problem.

One example of the external sound input unit 100 according to the related art is a duct (sound hole) through which air can pass, together with a lid for opening and closing the duct. A duct is easy to install in a relatively large structure such as a headphone, but a small structure has little room for one, and the complicated, fragile structure of a duct and lid is liable to be damaged. In addition, compared with the device sound, which reaches the ear directly through the speaker, the unamplified external sound entering through a small duct is very weak, so little effect can be expected.

Another example of the external sound input unit 100 according to the related art is a separate microphone exposed on the outer surface of the earphone housing. The sound wave input through the microphone is amplified by a separate amplifying circuit and output through a separate speaker. Operating the external sound input unit 100, including the separate microphone, amplifying circuit, and speaker, requires a separate power source such as a separate battery. In addition, unless the device sound heard through the speaker 410 of the earphone is reduced, the external sound from the separate speaker mixes with it at the ear like noise and cannot be heard clearly.

The present invention aims to solve the above problems, so that the wearer of an earphone can at any time hear people nearby calling out, hear a conversation partner's voice clearly, and immediately hear and sense dangerous situations such as the sound of an approaching vehicle.

It is also an object of the present invention to enable a user who is talking on a call through the earphone to speak while hearing his or her own voice, as if no earphone were worn.

Another object of the present invention is to allow a user to communicate smoothly with the surroundings in a state where the earphone is worn.

The above-described objects of the present invention can be achieved by a smart mixing module that mixes a device sound and an external sound in an optimal state.

According to one aspect of the present invention, there is provided an earphone for listening to a device sound, the earphone comprising: an acoustic output unit including a speaker for outputting the device sound; and a smart mixing module that detects an external sound, appropriately adjusts the device sound according to the detected external sound, mixes the detected external sound with the adjusted device sound, and outputs the mixed sound to the speaker of the acoustic output unit. Here, earphone is a representative term for all personal audio output devices, including an earphone, an earset, a headphone, and a headset.

The smart mixing module includes: an external sound detection unit configured to detect a sound wave signal input through an external sound input unit including a microphone; a sound adjusting unit configured to adjust the device sound in consideration of the energy of the detected external sound, the voice activity, and the energy of noise included in the external sound; and an acoustic mixing unit configured to mix the external sound detected by the external sound detection unit with the device sound adjusted by the sound adjusting unit and output the mixed sound to the speaker of the acoustic output unit. Here, the term "voice activity" means an acoustic component having voice characteristics, such as a sound wave pattern typical of a voice.

The external sound detection unit includes: an energy detection unit configured to detect the energy of a sound wave signal input through an external sound input unit including a microphone; a voice activity detection unit configured to detect voice activity in the sound wave signal; a noise energy detection unit configured to detect the energy of noise included in the sound wave signal; and an adaptive filtering unit configured to distinguish a voice activity component from a noise component in the sound wave signal, in consideration of the voice activity detected by the voice activity detection unit and the noise energy detected by the noise energy detection unit, and to filter the sound wave signal according to the component. Here, the voice activity component is mainly conversation, but it may also include a short utterance such as someone calling out nearby. The noise component includes all sounds other than the voice activity component: not only the surrounding background noise but also sounds such as a bicycle bell or breaking glass.

The adaptive filtering unit may include: a sound quality enhancing unit that processes the voice activity component of the external sound detected by the voice activity detection unit to improve its sound quality; a background noise removing unit that removes background noise from the noise component of the external sound detected by the noise energy detection unit; a specific situation recognizing unit that recognizes the surrounding situation from the noise component from which the background noise removing unit has removed the background noise; and a warning sound generating unit that generates a warning sound corresponding to the specific situation recognized by the specific situation recognizing unit. Background noise refers to a steady sound that does not require special attention, for example spatial noise, a buzzing sound, or an airplane engine sound. Removing background noise means eliminating such steady sounds while leaving the surrounding sounds that do require special attention, for example everyday conversation, someone calling out, or the sound of a dangerous situation during outdoor exercise, such as a bicycle bell.

The sound quality improving unit may be configured to output the sound activity component having improved sound quality to the sound mixing unit.

The sound quality improving unit may be configured to bypass the sound mixing unit and output the sound activity component having improved sound quality directly to the speaker of the sound output unit.

The sound quality enhancement unit may include a Wiener filter based on discrete wavelet transform (DWT).

The background noise canceller may include a Wiener filter based on Fast Fourier Transform (FFT).
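As a rough illustration of a frequency-domain Wiener filter for background noise removal, the sketch below applies the textbook per-bin gain G = (P - N) / P, where P is the power of the noisy frame in a bin and N is an estimate of the noise power in that bin. The patent names the technique but does not give the gain rule, the noise estimator, or any parameters, so the function names, the `floor` parameter, and the assumption that `noise_power` is already available are all hypothetical. A naive DFT stands in for the FFT only to keep the example free of third-party libraries.

```python
import cmath
import math


def dft(x):
    """Naive O(n^2) discrete Fourier transform (stand-in for an FFT)."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]


def idft(spectrum):
    """Inverse of dft(); returns the real part of each reconstructed sample."""
    n = len(spectrum)
    return [sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n)
                for k in range(n)).real / n
            for t in range(n)]


def wiener_denoise(frame, noise_power, floor=0.0):
    """Attenuate each DFT bin by the Wiener gain (P - N) / P.

    noise_power holds one assumed noise-power estimate per DFT bin, so it
    must have the same length as frame. Bins whose power does not exceed
    the noise estimate are scaled by `floor` (0.0 removes them entirely).
    """
    filtered = []
    for value, n_pow in zip(dft(frame), noise_power):
        p = abs(value) ** 2
        if p <= n_pow:
            gain = floor
        else:
            gain = max(floor, (p - n_pow) / p)
        filtered.append(gain * value)
    return idft(filtered)
```

With a steady noise estimate, bins dominated by noise are suppressed while bins that rise well above the estimate pass almost unchanged, which matches the stated goal of removing steady sounds while keeping sounds that stand out.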

The specific situation recognition unit may be configured to prepare a sound wave model corresponding to a specific situation, to compare the noise component input from the background noise removing unit with the sound wave model, and to determine from the comparison result whether the surrounding situation is the specific situation. Here, a specific situation is one selected, from among the various situations the user may encounter, as requiring the user's special attention, and for which a corresponding sound wave model is prepared and stored.
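The compare-against-stored-model idea can be sketched with a much simpler stand-in than a trained classifier: each sound wave model is taken to be a stored feature vector, and the incoming noise component is matched by cosine similarity against every model. This is only an assumed illustration. The feature representation, the model names, and the 0.9 threshold are hypothetical and do not come from the patent.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def recognize_situation(features, models, threshold=0.9):
    """Return the name of the best-matching stored model, or None.

    models maps a situation name to its stored feature vector (the
    "sound wave model"). A situation is recognized only if its
    similarity reaches the threshold.
    """
    best_name, best_score = None, threshold
    for name, template in models.items():
        score = cosine_similarity(features, template)
        if score >= best_score:
            best_name, best_score = name, score
    return best_name


# Hypothetical stored models: coarse 4-band spectral shapes.
MODELS = {
    "bicycle_bell": [0.0, 0.0, 1.0, 1.0],
    "car_horn": [1.0, 1.0, 0.0, 0.0],
}
```

Returning `None` when nothing reaches the threshold corresponds to the case where the noise component matches no selected situation, so no warning is needed.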

The specific situation recognition unit may include a deep neural network based sound classifier that compares the noise component with the sound wave model by deep learning.

The sound adjusting unit may be configured to perform, on the device sound, at least one of sound adjustment processes including volume adjustment processing, sound field effect processing, and fade-out processing, in consideration of the energy of the detected external sound, the voice activity, and the energy of noise included in the external sound, and also in consideration of the type of the device sound input from the device.

The external sound input unit may be configured to include a microphone built in a device to which the earphone is connected.

The external sound input unit may include a microphone attached to the earphone.

The earphone may be configured to connect wirelessly, by wire, or both wirelessly and by wire to a mobile device including a phone module.

The earphone may be configured to connect to a media player wirelessly, by wire, or both wirelessly and by wire.

According to another aspect of the present invention, there is provided a device configured to output a device sound through an earphone, the device comprising a smart mixing module that detects an external sound, appropriately adjusts the device sound according to the detected external sound, mixes the detected external sound with the adjusted device sound, and outputs the mixed sound to a speaker of the earphone. Here, the term "device" includes all mobile devices that output sound, regardless of product name, such as a mobile phone, a smartphone, a notebook, and a pad, and may be a media player such as an MP3 player. The device may also be a computer such as a PC or a notebook computer having an audio output unit, or a home appliance that outputs sound, such as a cassette player, a CD player, a home theater, or a TV.

The smart mixing module includes: an external sound detection unit configured to detect a sound wave signal input through an external sound input unit including a microphone; a sound adjusting unit configured to adjust the device sound in consideration of the energy of the detected external sound, the voice activity, and the energy of noise included in the external sound; and an acoustic mixing unit configured to mix the external sound detected by the external sound detection unit with the device sound adjusted by the sound adjusting unit and output the mixed sound to the speaker of the earphone.

The external sound detection unit includes: an energy detection unit configured to detect the energy of a sound wave signal input through an external sound input unit including a microphone; a voice activity detection unit configured to detect voice activity in the sound wave signal; a noise energy detection unit configured to detect the energy of noise included in the sound wave signal; and an adaptive filtering unit configured to distinguish a voice activity component from a noise component in the sound wave signal, in consideration of the voice activity detected by the voice activity detection unit and the noise energy detected by the noise energy detection unit, and to filter the sound wave signal according to the component.

The adaptive filtering unit may include: a sound quality enhancing unit that processes the voice activity component of the external sound detected by the voice activity detection unit to improve its sound quality; a background noise removing unit that removes background noise from the noise component of the external sound detected by the noise energy detection unit; a specific situation recognizing unit that recognizes the surrounding situation from the noise component from which the background noise removing unit has removed the background noise; and a warning sound generating unit that generates a warning sound corresponding to the specific situation recognized by the specific situation recognizing unit. Here, the warning sound may be the surrounding sound itself, or a separate sound generated according to the classified surrounding situation. The generated sound may be any of various tones or voices by which the wearer of the earphone can easily recognize danger or the like.

The sound quality improving unit may be configured to bypass the sound mixing unit and output the sound activity component having improved sound quality directly to the speaker of the earphone.

The specific situation recognition unit may be configured to prepare a sound wave model corresponding to a specific situation, to compare the noise component input from the background noise removing unit with the sound wave model, and to determine the surrounding situation as a specific situation according to the comparison result.

The sound adjusting unit may be configured to perform, on the device sound, at least one of sound adjustment processes including volume adjustment processing, sound field effect processing, and fade-out processing, in consideration of the energy of the detected external sound, the voice activity, and the energy of noise included in the external sound.

The external sound input unit may include a built-in microphone.

The external sound input unit may be configured to include a microphone attached to the earphone.

The device may be a mobile device including a phone module and configured to connect to the earphone wirelessly, by wire, or both wirelessly and by wire.

The device may be a media player configured to connect to the earphone wirelessly, by wire, or both wirelessly and by wire.

According to another aspect of the present invention, there is provided a method of mixing a device sound and an external sound, comprising a smart mixing step of detecting an external sound, appropriately adjusting the device sound according to the detected external sound, mixing the detected external sound with the adjusted device sound, and outputting the mixed sound to a speaker.

The smart mixing step includes: an external sound detecting step of detecting a sound wave signal input through an external sound input unit including a microphone; a sound adjusting step of adjusting the device sound in consideration of the energy of the detected external sound, the voice activity, and the energy of noise included in the external sound; and an acoustic mixing step of mixing the external sound detected in the external sound detecting step with the device sound adjusted in the sound adjusting step and outputting the mixed sound to the speaker.

The external sound detecting step may include: an energy detecting step of detecting the energy of a sound wave signal input through an external sound input unit including a microphone; a voice activity detecting step of detecting voice activity in the sound wave signal; a noise energy detecting step of detecting the energy of noise included in the sound wave signal; and an adaptive filtering step of distinguishing a voice activity component from a noise component in the sound wave signal, in consideration of the voice activity detected in the voice activity detecting step and the noise energy detected in the noise energy detecting step, and filtering the sound wave signal according to the component.

The adaptive filtering step may include: a sound quality enhancement step of processing the voice activity component of the external sound detected in the voice activity detecting step to improve its sound quality and outputting the result; a background noise removing step of removing background noise from the noise component of the external sound detected in the noise energy detecting step; a specific situation recognizing step of recognizing the surrounding situation from the noise component from which background noise was removed in the background noise removing step; and a warning sound generating step of generating a warning sound corresponding to the specific situation recognized in the specific situation recognizing step.

Alternatively, the adaptive filtering step may include: a sound quality enhancement step of processing the voice activity component of the external sound detected in the voice activity detecting step to improve its sound quality and outputting the result; a noise discrimination step of comparing the noise component of the external sound detected in the noise energy detecting step with a sound wave model to determine whether the noise component is a specific noise; a noise removal step of removing the noise component with a noise gate if it is not a specific noise; a specific situation determining step of determining that the surrounding situation corresponds to a specific situation if the noise component is a specific noise; and a warning sound generating step of generating a warning sound corresponding to the specific situation. Here, the term "specific noise" means a noise corresponding to a sound wave model for a selected specific situation, i.e., a sound having the same or at least similar characteristics as those of the sound wave model.
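The branch in this variant (gate the noise unless it matches a stored model, in which case produce a warning) can be sketched as a small dispatch function. The discriminator and the warning generator are passed in as callables because the text does not specify how either works. All names here are hypothetical, and the gate is assumed to silence the whole frame.

```python
def noise_gate(frame):
    """Noise removal step: silence the whole frame."""
    return [0.0] * len(frame)


def handle_noise_component(frame, is_specific_noise, make_warning):
    """Route one frame of the noise component.

    is_specific_noise(frame) -> bool  plays the noise discrimination step.
    make_warning(frame) -> samples    plays the warning sound generating step.
    """
    if is_specific_noise(frame):
        # Specific situation determined: replace the frame with a warning.
        return make_warning(frame)
    # Not a specific noise: remove it with the noise gate.
    return noise_gate(frame)
```

A caller might pass a model-based matcher as `is_specific_noise` and either a pass-through (so the ambient sound itself serves as the warning) or a tone generator as `make_warning`.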

The sound quality enhancement step may be configured to send the sound activity component having improved sound quality to the sound mixing stage.

The sound quality improving step may be configured to omit the sound mixing step and output the sound activity component having improved sound quality directly to the speaker.

Discrete wavelet transform (DWT) can be used for the sound quality enhancement step.

The background noise cancellation step may utilize Fast Fourier Transform (FFT).

In the specific situation recognition step, a sound wave model corresponding to a specific situation may be prepared, a noise component sent in the background noise removing step may be compared with a sound wave model, and the surrounding situation may be determined as a specific situation according to the comparison result.

The specific situation recognition step may use a deep neural network based sound classifier that compares the noise component with the sound wave model by deep learning.

The sound adjusting step may be configured to perform, on the device sound, at least one of sound adjustment processes including volume adjustment processing, sound field effect processing, and fade-out processing, in consideration of the energy of the detected external sound, the voice activity, and the energy of noise included in the external sound.

According to yet another aspect of the present invention, there is provided a system for mixing and outputting a device sound and an external sound, the system comprising a smart mixing unit that detects an external sound, appropriately adjusts the device sound according to the detected external sound, mixes the detected external sound with the adjusted device sound, and outputs the mixed sound to a speaker.

The smart mixing unit includes: an external sound detection unit that detects a sound wave signal input through an external sound input unit including a microphone; a sound adjusting unit that adjusts the device sound in consideration of the energy of the detected external sound, the voice activity, and the energy of noise included in the external sound; and an acoustic mixing unit that mixes the external sound detected by the external sound detection unit with the device sound adjusted by the sound adjusting unit and outputs the mixed sound to the speaker.

The external sound detection unit includes: an energy detection unit for detecting the energy of a sound wave signal input through an external sound input unit including a microphone; a voice activity detection unit for detecting voice activity in the sound wave signal; a noise energy detection unit for detecting the energy of noise included in the sound wave signal; and an adaptive filtering unit that distinguishes a voice activity component from a noise component in the sound wave signal, in consideration of the voice activity detected by the voice activity detection unit and the noise energy detected by the noise energy detection unit, and filters the sound wave signal according to the component.

The adaptive filtering unit may include: a sound quality enhancing unit that processes the voice activity component of the external sound detected by the voice activity detection unit to improve its sound quality; a background noise removing unit that removes background noise from the noise component of the external sound detected by the noise energy detection unit; a specific situation recognition unit that recognizes the surrounding situation from the noise component from which the background noise removing unit has removed the background noise; and a warning sound generating unit that generates a warning sound corresponding to the specific situation recognized by the specific situation recognition unit.

The sound quality improving unit may be configured to bypass the sound mixing unit and directly output the sound activity component having improved sound quality to the speaker.

The specific situation recognition unit may be configured to prepare a sound wave model corresponding to a specific situation, compare the noise component input from the background noise removing unit with the sound wave model, and determine the surrounding situation as a specific situation according to the comparison result.

The external sound input unit may include a microphone built in the device for generating the device sound.

The external sound input unit may include a microphone attached to an earphone for outputting a device sound.

The device generating the device sound may be a mobile device including a phone module, to which the earphone is configured to connect wirelessly, by wire, or both wirelessly and by wire.

The device generating the device sound may be a media player to which the earphone is configured to connect wirelessly, by wire, or both wirelessly and by wire.

With the invention as described above, whenever the wearer is listening to sound through the earphone, such as during a telephone call or while listening to music or a broadcast, the wearer can still hear sounds such as someone calling out to them, a bicycle or motorcycle sounding its bell or horn at them, and a crowd moving around them.

In addition, even while wearing the earphone, the wearer can clearly hear both the live voice of a conversation partner and their own spoken voice, so that neither the wearer nor the other party is inconvenienced and the wearer can communicate and go about daily life naturally.

The smart mixing module amplifies and/or attenuates the device sound heard through the earphone and the surrounding sound, for example reducing the device sound or increasing the ambient sound according to the volume, sound quality, type, and the like of each. This smart mixing makes it possible to hear ambient sounds clearly at any time and in any situation.

In addition, background noise in the surroundings or all meaningless sounds that do not need to be heard by the wearer of the earphone are blocked without mixing, so that the device sound heard through the earphone can be heard according to the original sound quality.

Therefore, the present invention reduces or eliminates the concern that an earphone wearer will fail to hear people nearby calling out, will be unable to hear a conversation partner clearly, will have a conversation disrupted, or will miss the sound of an approaching danger.

FIG. 1 is a block diagram showing means configured in an earphone or a device for implementing the invention according to an embodiment of the present invention,
FIG. 2 is a block diagram showing the smart mixing module shown in FIG. 1 in more detail,
FIG. 3 is a block diagram showing the device sound input unit shown in FIG. 2 in more detail,
FIG. 4 is a block diagram showing the external sound detection unit shown in FIG. 3 in more detail,
FIG. 5 is a block diagram showing the adaptive filtering unit shown in FIG. 4 in more detail,
FIG. 6 is a block diagram similar to FIG. 5, but showing a change in the configuration of the sound quality enhancing unit according to another embodiment of the present invention,
FIG. 7 is a flow chart illustrating the steps for implementing a method according to one embodiment of the present invention using the means shown in FIG. 1,
FIG. 8 is a flow chart illustrating steps for implementing a method according to another embodiment of the present invention using the means shown in FIG. 1,
FIG. 9 is a block diagram showing a configuration of a device and an earphone according to an example of the prior art,
FIG. 10 is a block diagram showing the configuration of a device and an earphone according to an example of the prior art.

In the following, the invention will be described in more detail with reference to the accompanying drawings.

FIG. 1 is a block diagram illustrating means configured in an earphone or device for implementing the invention in accordance with one embodiment of the present invention. Referring to FIG. 1, an earphone or device according to an embodiment of the present invention includes an external sound input unit 100, a device sound input unit 200, a smart mixing module 300 that mixes, in an optimum state, the device sound input from the device sound input unit 200 and the external sound input from the external sound input unit 100, and an acoustic output unit 400 that outputs the sound mixed in the smart mixing module 300. Although the figure shows each functional block as a single unit, an earphone, earset, headphone, or headset typically handles two or more channels, and it is preferable to process and output each channel of such multi-channel sound separately. This is obvious to those skilled in the art and is not described further.

FIG. 2 shows the configuration of the smart mixing module shown in FIG. 1 in more detail.

The smart mixing module 300 may include an external sound detection unit 310, a sound adjusting unit 320, and a sound mixing unit 330.

The external sound detection unit 310 is configured to detect a sound wave signal input through the external sound input unit 100 including a microphone. The microphone of the external sound input unit 100 may be a microphone built into the device to which the earphone, earset, headphone, or headset is connected, for example a mobile device such as a mobile phone, a smartphone, a notebook, or a pad, a media player such as an MP3 player, or a consumer product such as a cassette player, a CD player, a home theater, or a TV. Alternatively, the microphone of the external sound input unit 100 may be a microphone attached to the earphone, earset, headphone, or headset; more specifically, a microphone mounted somewhere along the connection cable, or a microphone mounted adjacent to or integrated with the speaker worn on the ear.

The sound adjusting unit 320 is configured to adjust the device sound in consideration of the energy of the detected external sound, the voice activity, and the energy of noise included in the external sound, and also in consideration of the type of the device sound input from the device. The sound adjusting unit 320 may be configured to perform, on the device sound input from the device, at least one of sound adjustment processes including volume adjustment processing, sound field effect processing, and fade-out processing. Here, sound field effect processing refers to processing that, for example, lets ambient sounds be clearly distinguished while music is playing, and more generally to the various kinds of processing by which each of the sounds mixed and output to the speaker of the acoustic output unit 400 is handled individually according to situation and purpose, for example audio signal processing that generates a sound effect such as reverberation and/or 3D audio effect processing.
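As one possible reading of the volume adjustment and fade-out processing, the sketch below ramps the device sound linearly down to a reduced gain over one frame when voice activity is detected in the external sound, and otherwise leaves the level alone. The linear ramp, the 0.2 ducked gain, and the function names are illustrative assumptions. The text does not specify the gain curve or the trigger conditions.

```python
def fade_out_gains(n_samples, start_gain=1.0, end_gain=0.2):
    """Linear gain ramp from start_gain to end_gain over n_samples."""
    if n_samples <= 0:
        return []
    if n_samples == 1:
        return [end_gain]
    step = (end_gain - start_gain) / (n_samples - 1)
    return [start_gain + step * i for i in range(n_samples)]


def adjust_device_sound(samples, voice_active, start_gain=1.0, ducked_gain=0.2):
    """Fade the device sound down when external voice activity is present.

    Without voice activity the samples are only scaled by start_gain.
    """
    if not voice_active:
        return [s * start_gain for s in samples]
    gains = fade_out_gains(len(samples), start_gain, ducked_gain)
    return [s * g for s, g in zip(samples, gains)]
```

A ramp rather than an abrupt gain change avoids an audible click when the device sound is reduced. A real implementation would also need to hold the reduced gain on later frames and fade back in afterwards.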

The sound mixing unit 330 mixes the external sound detected by the external sound detection unit 310 and the device sound adjusted by the sound adjustment unit 320 and outputs the mixed sound to the speaker of the sound output unit 400.
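As a minimal sketch of such mixing, assuming both sounds are available as equal-length frames of samples normalized to the range -1 to 1, the mixed output can be formed as a weighted sample-wise sum. The function name and gain parameters are hypothetical; setting either gain to zero corresponds to shutting off that sound completely.

```python
def mix(external, device, ext_gain=1.0, dev_gain=1.0):
    """Weighted sample-wise sum of two equal-length frames, clipped to [-1, 1]."""
    if len(external) != len(device):
        raise ValueError("frames must have the same length")
    return [max(-1.0, min(1.0, ext_gain * e + dev_gain * d))
            for e, d in zip(external, device)]
```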

Fig. 3 shows the configuration of the device sound input unit shown in Fig. 2 in more detail.

The device sound input unit 200 may be a mobile device including a phone module 210 configured to be connected to an earphone, earset, headphone, headset, or the like by wire, wirelessly, or both. The device sound input unit 200 may also be a media player 220 configured to be connected to an earphone, earset, headphone, headset, or the like in the same manner. The device sound input unit 200 may include both the phone module 210 and the media player 220 described above, together with a switch 230 capable of selecting the operating state and the sound output of one of them. Further, the device sound input unit 200 may be a computer such as a PC or a notebook computer, or a household appliance that outputs sound, such as a cassette player, CD player, home theater, or TV set, configured to be connected to an earphone, earset, headphone, headset, or the like by wire, wirelessly, or both.

FIG. 4 shows the configuration of the external sound detection unit shown in FIG. 3 in more detail.

The external sound detection unit 310 includes an energy detection unit 311, a voice activity detection (VAD) unit 312, a noise energy detection unit 313, and an adaptive filtering unit 340.

The energy detecting unit 311 is configured to detect the energy of a sound wave signal input through the external sound input unit 100 including the microphone.
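For illustration only, the energy of a sound wave signal is commonly measured frame by frame as the mean of the squared samples. The sketch below assumes the microphone signal is already available as a list of floating-point samples; the function names and the non-overlapping framing are assumptions of this example.

```python
def frame_energy(frame):
    """Mean-square energy of one frame of samples."""
    if not frame:
        return 0.0
    return sum(s * s for s in frame) / len(frame)


def split_frames(samples, frame_len):
    """Split a signal into consecutive non-overlapping frames (tail dropped)."""
    return [samples[i:i + frame_len]
            for i in range(0, len(samples) - frame_len + 1, frame_len)]
```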

The voice activity detection unit 312 is configured to detect voice activity in a sound wave signal.
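This description does not fix a particular voice activity detection algorithm. One simple heuristic, shown here purely as an assumed example, flags a frame as voice when its energy is well above the estimated noise energy and its zero-crossing rate is low, as is typical of voiced speech. The thresholds `energy_ratio` and `max_zcr` are illustrative values, not values taken from this document.

```python
def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / max(1, len(frame) - 1)


def is_voice(frame, noise_energy, energy_ratio=4.0, max_zcr=0.35):
    """Heuristic VAD: energetic relative to the noise and not noise-like."""
    energy = sum(s * s for s in frame) / len(frame)
    return energy > energy_ratio * noise_energy and zero_crossing_rate(frame) < max_zcr
```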

The noise energy sensing unit 313 is configured to sense the energy of the noise included in the sound wave signal. The noise energy sensing unit 313 may sense the noise level, for example, in decibels (dB), and the sensed noise level may be utilized in adaptive filtering in the adaptive filtering unit 340.
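As a hedged sketch of expressing a noise level in decibels, the example below converts mean-square energy to dB relative to a reference energy and estimates the noise level as the average energy of frames not flagged as voice. The reference of 1.0 (full scale), the floor of -120 dB, and the averaging rule are assumptions of this example.

```python
import math


def energy_to_db(energy, ref=1.0, floor_db=-120.0):
    """Convert a mean-square energy to decibels relative to ref."""
    if energy <= 0.0:
        return floor_db
    return max(floor_db, 10.0 * math.log10(energy / ref))


def estimate_noise_db(frame_energies, voice_flags):
    """Average energy of the non-voice frames in dB, or None if there are none."""
    noise = [e for e, v in zip(frame_energies, voice_flags) if not v]
    if not noise:
        return None
    return energy_to_db(sum(noise) / len(noise))
```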

The adaptive filtering unit 340 is configured to distinguish the voice activity component from the noise component of the sound wave signal, in consideration of the voice activity sensed by the voice activity detection unit 312 and the energy of the noise sensed by the noise energy detection unit 313, and to filter the sound wave signal according to each component.

Fig. 5 shows the configuration of the adaptive filtering unit shown in Fig. 4 in more detail.

The adaptive filtering unit 340 may include a sound quality enhancing unit 341, a background noise removing unit 342, a specific situation recognizing unit 343, and a warning sound generating unit 344.

The sound quality enhancing unit 341 is configured to process the voice activity component of the external sound sensed by the voice activity detection unit 312 so as to improve its sound quality, and to output the result. The sound quality enhancing unit 341 may include a Wiener filter based on a discrete wavelet transform (DWT). FIG. 5 illustrates the sound quality enhancing unit 341 outputting the voice activity component with improved sound quality to the sound mixing unit 330.
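The following is a minimal sketch of Wiener-style filtering in a wavelet domain, not the filter actually used in the embodiment. It assumes a single-level Haar transform, an even-length frame, and a known noise variance `noise_var`; each detail coefficient is scaled by the empirical Wiener gain max(0, d^2 - noise_var) / d^2, and the frame is then reconstructed.

```python
import math


def haar_dwt(x):
    """Single-level Haar DWT of an even-length frame: (approximation, detail)."""
    s = math.sqrt(2.0)
    approx = [(x[i] + x[i + 1]) / s for i in range(0, len(x) - 1, 2)]
    detail = [(x[i] - x[i + 1]) / s for i in range(0, len(x) - 1, 2)]
    return approx, detail


def haar_idwt(approx, detail):
    """Inverse of haar_dwt."""
    s = math.sqrt(2.0)
    out = []
    for a, d in zip(approx, detail):
        out.extend([(a + d) / s, (a - d) / s])
    return out


def wavelet_wiener(x, noise_var):
    """Shrink detail coefficients with an empirical Wiener gain and reconstruct."""
    approx, detail = haar_dwt(x)
    shrunk = []
    for d in detail:
        power = d * d
        gain = max(0.0, power - noise_var) / power if power > 0.0 else 0.0
        shrunk.append(gain * d)
    return haar_idwt(approx, shrunk)
```

With `noise_var` set to zero the frame is reconstructed unchanged; larger values suppress small detail coefficients, which are more likely to be noise.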

The background noise removing unit 342 is configured to remove background noise from the noise component of the external sound sensed by the noise energy detection unit 313. The background noise removing unit 342 may include a Wiener filter based on a fast Fourier transform (FFT).
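As an illustrative sketch of frequency-domain Wiener filtering, the example below applies the gain max(floor, (P - N) / P) to each frequency bin, where P is the bin power of the frame and N is an assumed per-bin noise power estimate. For self-containment it uses a plain discrete Fourier transform; an FFT computes the same transform more efficiently. All names and the source of the noise estimate are assumptions of this example.

```python
import cmath


def dft(x):
    """Plain O(n^2) discrete Fourier transform (an FFT gives the same result)."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]


def idft(spectrum):
    """Inverse DFT, returning the real part of each sample."""
    n = len(spectrum)
    return [(sum(spectrum[k] * cmath.exp(2j * cmath.pi * k * t / n)
                 for k in range(n)) / n).real
            for t in range(n)]


def spectral_wiener(frame, noise_power, gain_floor=0.0):
    """Scale each bin by a Wiener-style gain; noise_power has one value per bin."""
    filtered = []
    for value, n_pow in zip(dft(frame), noise_power):
        p = abs(value) ** 2
        gain = max(gain_floor, (p - n_pow) / p) if p > 0.0 else gain_floor
        filtered.append(gain * value)
    return idft(filtered)
```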

The specific situation recognizing unit 343 is configured to recognize the surrounding situation from the noise component from which the background noise has been removed by the background noise removing unit 342. The specific situation recognizing unit 343 prepares a sound wave model corresponding to a specific situation, compares the noise component input from the background noise removing unit 342 with the sound wave model, and determines whether the surrounding situation is the specific situation according to the comparison result. The specific situation recognizing unit 343 may include a deep neural network (DNN)-based sound classifier that compares the noise component with the sound wave model by deep learning.
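The structure of the DNN-based sound classifier is not detailed here. Purely as an assumed illustration, the sketch below shows the forward pass of a small fully connected network with ReLU hidden layers and a softmax output, returning a situation label only when the top probability reaches a confidence threshold. The feature vector, layer weights, labels, and threshold are all hypothetical and would in practice come from training.

```python
import math


def dense(x, weights, bias):
    """One fully connected layer: weights is a list of rows, one per output."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def relu(v):
    return [max(0.0, a) for a in v]


def softmax(v):
    m = max(v)
    e = [math.exp(a - m) for a in v]
    total = sum(e)
    return [a / total for a in e]


def classify(features, layers, labels, min_confidence=0.6):
    """Return the most probable label, or None if the network is not confident."""
    h = features
    for weights, bias in layers[:-1]:
        h = relu(dense(h, weights, bias))
    weights, bias = layers[-1]
    probs = softmax(dense(h, weights, bias))
    best = max(range(len(probs)), key=probs.__getitem__)
    return labels[best] if probs[best] >= min_confidence else None
```

Returning None when no class is sufficiently probable corresponds to the case in which the surrounding situation is not judged to be any specific situation.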

The warning sound generating unit 344 generates a warning sound corresponding to the specific situation recognized by the specific situation recognizing unit 343 and outputs the generated warning sound to the sound mixing unit 330. The warning sound generating unit 344 may process the original sound produced in the specific situation, with the background noise removed, so that the user can hear it and pay attention. Alternatively or additionally, the warning sound generating unit 344 may generate a simple warning sound, such as a series of beeps, whose interval, length, pitch, and/or waveform are varied so that the specific situation can be identified. Alternatively or additionally, the warning sound generating unit 344 may generate a voice warning that describes the specific situation, for example "There is a vehicle nearby," "Be careful," or "A loud sound was heard."
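A simple warning sound distinguished by interval, length, and pitch can be sketched as a sequence of sine-wave beeps separated by silence. The situation names, the pattern table, the 8 kHz sample rate, and the amplitude below are hypothetical values chosen for this example only.

```python
import math

# Hypothetical mapping: situation -> (pitch in Hz, beep seconds, gap seconds, repeats).
BEEP_PATTERNS = {
    "vehicle": (880.0, 0.10, 0.10, 3),
    "siren": (1320.0, 0.20, 0.05, 2),
}


def warning_tone(situation, sample_rate=8000, amplitude=0.5):
    """Generate the beep pattern for a situation as a list of samples."""
    pitch, beep_s, gap_s, repeats = BEEP_PATTERNS[situation]
    beep_n = round(beep_s * sample_rate)
    gap_n = round(gap_s * sample_rate)
    samples = []
    for _ in range(repeats):
        samples.extend(amplitude * math.sin(2.0 * math.pi * pitch * n / sample_rate)
                       for n in range(beep_n))
        samples.extend([0.0] * gap_n)  # silence between beeps
    return samples
```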

FIG. 6 is a block diagram illustrating the means configured in an earphone or device to implement the invention according to another embodiment of the present invention. The configuration shown in FIG. 6 is similar to that shown in FIG. 5, except that the configuration of the sound quality enhancing unit is changed. In this embodiment, the sound quality enhancing unit 341 is configured to bypass the sound mixing unit 330 and output the voice activity component directly to the speaker of the sound output unit 400.

According to one aspect of the present invention, the configuration described above, in particular the smart mixing module 300, can be implemented in an earphone, earset, headphone, headset, or the like used for listening to a device sound. In this case, the earphone, earset, headphone, or headset includes a sound output unit 400 including a speaker for outputting the device sound, and a smart mixing module 300 that detects an external sound, appropriately adjusts the device sound according to the detected external sound, mixes the detected external sound with the adjusted device sound, and outputs the mixed sound to the speaker of the sound output unit 400. The speaker of the sound output unit 400 may be configured to be worn so as to cover the ear or to be placed in the ear canal. The earphone, earset, headphone, or headset may be configured to be connected, by wire and/or wirelessly, to a device such as a mobile device including a phone module 210 and/or a media player 220, or a computer such as a PC or a notebook computer.

According to another aspect of the present invention, the configuration described above, in particular the smart mixing module 300, is implemented in a device that is configured to output sound through an earphone, earset, headphone, headset, or the like. In this case, the device includes a smart mixing module 300 that detects an external sound, appropriately adjusts the device sound according to the detected external sound, and mixes the detected external sound with the adjusted device sound, while the earphone, earset, headphone, headset, or the like may be configured simply as a speaker that outputs the sound received from the smart mixing module 300 of the device. Such a device may be a mobile device including a phone module 210 and/or a media player 220, a computer such as a PC or a notebook computer, or a household appliance such as a cassette player, CD player, home theater, or TV. The configurations of the units of the smart mixing module 300, including the sound mixing unit 330, the energy detection unit 311, the voice activity detection unit 312, the noise energy detection unit 313, the adaptive filtering unit 340, the sound quality enhancing unit 341, the background noise removing unit 342, the specific situation recognizing unit 343, and the warning sound generating unit 344, are the same as those implemented in the earphone and the like described above, so their description is omitted here to avoid duplication.

According to another aspect of the present invention, the configuration described above, in particular the smart mixing module 300, is implemented so as to be distributed among at least some of the various components of an acoustic system, and the acoustic system as a whole is configured to mix the device sound and the external sound.

Such an acoustic system may include, in any of its components, the means for detecting the external sound, appropriately adjusting the device sound according to the detected external sound, mixing the detected external sound with the adjusted device sound, and outputting the result to the speaker. For example, the external sound detection unit 310, the sound adjusting unit 320, the sound mixing unit 330, the energy detection unit 311, the voice activity detection unit 312, and the like may be implemented in a device such as a cassette player, CD player, home theater, or TV, while the adaptive filtering unit 340, the sound quality enhancing unit 341, the background noise removing unit 342, the specific situation recognizing unit 343, and the warning sound generating unit 344 may be implemented in an earphone, earset, headphone, or headset. Further, all or part of the external sound detection unit 310, the sound adjusting unit 320, the sound mixing unit 330, the energy detection unit 311, the voice activity detection unit 312, the noise energy detection unit 313, the adaptive filtering unit 340, the sound quality enhancing unit 341, the background noise removing unit 342, the specific situation recognizing unit 343, and the warning sound generating unit 344 may be implemented in a computer such as a PC, or in an appliance such as a CD player, home theater, or TV, that diffuses sound into the air. The configurations of these units are the same as those implemented in the earphone and the like described above, so their description is omitted here to avoid redundancy.

According to another aspect of the present invention, a method of mixing a device sound and an external sound is provided.

FIG. 7 is a flow diagram illustrating steps for implementing a method according to one embodiment of the present invention using the smart mixing module 300 described above.

This method includes a smart mixing step in which an external sound input from the external sound input unit 100 is detected, the device sound input from the device sound input unit 200 is appropriately adjusted according to the detected external sound, and the detected external sound and the adjusted device sound are mixed and output to the speaker of the sound output unit 400.

Preferably, the smart mixing step may include an external sound detection step, a sound adjustment step, and a sound mixing step. The external sound detection step detects a sound wave signal input through the external sound input unit 100 including the microphone. The sound adjustment step adjusts the device sound input from the device sound input unit 200 in consideration of the energy of the external sound detected in the external sound detection step, the voice activity, and the energy of the noise included in the external sound (S570). In the sound mixing step, the external sound detected in the external sound detection step and the device sound adjusted in the sound adjustment step are mixed (S580) and output to the speaker of the sound output unit 400 (S590).

The external sound detection step may include an energy detection step, a voice activity detection step, a noise energy detection step, and an adaptive filtering step. The energy detection step detects the energy of the sound wave signal input through the external sound input unit 100 including the microphone (S510). The voice activity detection step detects voice activity in the sound wave signal (S520). The noise energy detection step senses the energy of the noise included in the sound wave signal (S530). The adaptive filtering step compares the voice activity sensed in the voice activity detection step with the energy of the noise sensed in the noise energy detection step (S540), distinguishes the voice activity component from the noise component in the sound wave signal, and filters the sound wave signal according to each component.

Preferably, the adaptive filtering step may be configured to determine whether the external sound is a conversation sound (S550) and, according to the result, to branch into conversation sound processing or non-conversation sound processing. The distinction between the voice activity component and the noise component in S550 can be made, for example, by a deep learning method. The adaptive filtering step may include a sound quality enhancement step, a background noise removing step, a specific situation recognition step, and a warning sound generating step. In the sound quality enhancement step, the voice activity component of the external sound detected in the voice activity detection step is processed to improve its sound quality and is output (S561). The background noise removing step removes background noise from the noise component of the external sound detected in the noise energy detection step (S566). The specific situation recognition step recognizes the surrounding situation from the noise component from which the background noise has been removed in the background noise removing step (S568). The warning sound generating step generates a warning sound corresponding to the specific situation recognized in the specific situation recognition step (S569).

Preferably, the voice activity component whose sound quality has been improved in the sound quality enhancement step may be sent to the sound mixing step. Alternatively, the voice activity component whose sound quality has been improved in the sound quality enhancement step may skip the sound mixing step and be output directly to the speaker of the sound output unit 400. For example, the sound quality enhancement step may be configured to improve sound quality using a discrete wavelet transform (DWT). For example, the background noise removing step may be configured to remove background noise using a fast Fourier transform (FFT).

Preferably, the specific situation recognition step prepares a sound wave model corresponding to a specific situation, compares the noise component sent from the background noise removing step with the sound wave model (S567), and then, based on the comparison result, determines whether the surrounding situation is the specific situation (S568). For example, the specific situation recognition step may be configured to recognize a specific situation using a deep neural network based sound classifier that compares the noise component with the sound wave model by deep learning.

Preferably, the sound adjustment step takes into account the energy of the external sound detected at S510, the voice activity sensed at S520, and the noise energy included in the external sound sensed at S530, and performs on the device sound at least one sound adjustment process among volume adjustment processing, sound field effect processing, and fade-out processing.
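The order of the steps of the method of FIG. 7, as described above, can be summarized in the following sketch, which returns the step labels executed for one portion of external sound. The function name is hypothetical, and placing S570 to S590 after the branch is an assumption based on the step numbering rather than on an explicit statement in this description.

```python
def fig7_steps(is_conversation, matches_model=False):
    """List the step labels executed for one portion of external sound."""
    steps = ["S510", "S520", "S530", "S540", "S550"]
    if is_conversation:
        steps.append("S561")                 # sound quality enhancement
    else:
        steps += ["S566", "S567", "S568"]    # remove noise, compare, judge
        if matches_model:
            steps.append("S569")             # generate warning sound
    steps += ["S570", "S580", "S590"]        # adjust, mix, output
    return steps
```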

FIG. 8 is a flow diagram illustrating steps for implementing a method according to another embodiment of the present invention using the smart mixing module 300 described above.

The sound adjustment step and the sound mixing step constituting the smart mixing step according to this embodiment are the same as the corresponding steps of the method according to the embodiment shown in FIG. 7 described above. The energy detection step (S510), the voice activity detection step (S520), and the noise energy detection step (S530) constituting the external sound detection step are likewise the same as in the method according to the embodiment shown in FIG. 7. Their description is therefore omitted here to avoid duplication.

The adaptive filtering step constituting the external sound detecting step of the method according to this embodiment is configured as follows.

The adaptive filtering step compares the voice activity sensed in the voice activity detection step with the energy of the noise sensed in the noise energy detection step (S540), distinguishes the voice activity component from the noise component in the sound wave signal, and filters the sound wave signal according to each component.

Preferably, the adaptive filtering step may be configured to determine whether the external sound is a conversation sound (S550) and, according to the result, to branch into conversation sound processing or non-conversation sound processing. The distinction between the voice activity component and the noise component in S550 can be made, for example, by a deep learning method. The adaptive filtering step according to this embodiment may include a sound quality enhancement step, a specific noise identification step, a non-conversation sound removing step, and a warning sound generating step. In the sound quality enhancement step, the voice activity component of the external sound detected in the voice activity detection step is processed to improve its sound quality and is output (S561). In the specific noise identification step, the non-conversation sound is compared with a sound wave model (S567) to determine whether it is a specific noise (S568); if it is determined in S568 that the non-conversation sound is a specific noise matching the sound wave model, the surrounding situation is determined to be the specific situation corresponding to that sound wave model. The warning sound generating step generates a warning sound corresponding to the specific situation determined according to the result of the specific noise identification step (S569). In the non-conversation sound removing step, all external sounds determined in S550 not to be a conversation sound are removed (S566). The warning sound generating step may be configured to remove the external sound and generate the warning sound, or to generate the warning sound with the external sound as a background sound.

Preferably, the specific noise identification step may be configured to identify a specific noise using a deep neural network based sound classifier that compares the noise component with the sound wave model by deep learning.

Although the earphone, the device, the system and the method according to the preferred embodiment of the present invention have been described above, these embodiments are described for the purpose of illustrating the present invention and are not intended to limit the present invention.

Those skilled in the art will appreciate that modifications, alterations, or permutations of the configurations of the exemplary embodiments described above are possible without departing from the spirit and scope of the invention.

It is intended that the appended claims be construed to include such modifications, alterations, or substitutions within their scope of protection, insofar as they do not depart from the spirit and scope of the invention.

100: external sound input unit
200: device sound input unit
300: smart mixing module
310: external sound detection unit
320: sound adjusting unit
330: sound mixing unit
311: energy detection unit
312: voice activity detection unit
313: noise energy detection unit
340: adaptive filtering unit
341: sound quality enhancing unit
342: background noise removing unit
343: specific situation recognizing unit
344: warning sound generating unit
400: sound output unit

Claims

delete

An earphone used for listening to a device sound,
A sound output unit including a speaker for outputting a device sound,
And a smart mixing module for detecting an external sound, adjusting the device sound according to the detected external sound, mixing the detected external sound and the adjusted device sound, and outputting the mixed sound to the speaker of the sound output unit,
The smart mixing module includes:
An external sound detection unit configured to detect a sound wave signal input through an external sound input unit including a microphone,
A sound adjusting unit for adjusting the sound of the device in consideration of the energy of the detected external sound, the sound activity, and the energy of the noise included in the external sound,
And an acoustic mixing unit for mixing the external sound detected by the external sound detecting unit and the device sound adjusted by the sound adjusting unit and outputting them to the speaker of the sound output unit,
Wherein the external sound detection unit comprises:
An energy detector configured to detect energy of a sound wave signal input through an external sound input unit including a microphone,
A voice activity detection unit for detecting voice activity in the sound wave signal,
A noise energy sensing unit for sensing energy of noise included in the sound wave signal,
And an adaptive filtering unit for distinguishing a voice activity component and a noise component in the sound wave signal, in consideration of the voice activity sensed by the voice activity detection unit and the noise energy sensed by the noise energy sensing unit, and filtering the sound wave signal according to each component.

The earphone of claim 3,
Wherein the adaptive filtering unit comprises:
A sound quality enhancing unit for enhancing the sound quality by processing the sound activity component of the external sound sensed by the sound activity sensing unit,
A background noise removing unit for removing background noise from a noise component of the external sound detected by the noise energy sensing unit,
A specific situation recognizing unit for recognizing a surrounding situation by a noise component from which the background noise is removed in the background noise removing unit;
And a warning sound generating unit for generating a warning sound corresponding to a specific situation recognized by the specific situation recognition unit and outputting the alarm sound to the sound mixing unit.

The earphone of claim 4,
Wherein the sound quality enhancing unit is configured to output a voice activity component having improved sound quality to the sound mixing unit.

The earphone of claim 4,
Wherein the sound quality enhancing unit is configured to bypass the sound mixing unit and output the voice activity component having improved sound quality directly to the speaker of the sound output unit.

The earphone according to any one of claims 4 to 6,
Wherein the sound quality enhancing unit includes a Wiener filter based on a discrete wavelet transform (DWT).

The earphone according to any one of claims 4 to 6,
Wherein the background noise removing unit includes a Wiener filter based on a fast Fourier transform (FFT).

The earphone according to any one of claims 4 to 6,
Wherein the specific situation recognizing unit is configured to prepare a sound wave model corresponding to a specific situation, to compare the noise component input from the background noise removing unit with the sound wave model, and to determine the surrounding situation as a specific situation according to the comparison result.

The earphone of claim 9,
Wherein the specific situation recognizing unit includes a deep neural network based sound classifier for comparing the noise component with the sound wave model by deep learning.

The earphone according to any one of claims 3 to 6,
Wherein the sound adjusting unit is configured to perform, on the device sound input from the device, at least one sound adjustment process among volume adjustment processing, sound field effect processing, and fade-out processing, in consideration of the energy of the detected external sound, the voice activity, and the energy of the noise included in the external sound.

The earphone according to any one of claims 3 to 6,
Wherein the external sound input unit includes a microphone built into a device to which the earphone is connected.

The earphone according to any one of claims 3 to 6,
Wherein the external sound input unit includes a microphone attached to the earphone.

delete

A device configured to output sound through an earphone,
And a smart mixing module for detecting an external sound, adjusting the device sound according to the detected external sound, mixing the detected external sound and the adjusted device sound, and outputting the mixed sound to the speaker of the earphone,
The smart mixing module includes:
An external sound detection unit configured to detect a sound wave signal input through an external sound input unit including a microphone,
A sound adjusting unit for adjusting the sound of the device in consideration of the energy of the detected external sound, the sound activity, and the energy of the noise included in the external sound,
And an acoustic mixing unit for mixing the external sound detected by the external sound detecting unit and the device sound adjusted by the sound adjusting unit and outputting them to the speaker of the earphone,
Wherein the external sound detection unit comprises:
An energy detector configured to detect energy of a sound wave signal input through an external sound input unit including a microphone,
A voice activity detection unit for detecting voice activity in the sound wave signal,
A noise energy sensing unit for sensing energy of noise included in the sound wave signal,
And an adaptive filtering unit for distinguishing a voice activity component and a noise component in the sound wave signal, in consideration of the voice activity sensed by the voice activity detection unit and the noise energy sensed by the noise energy sensing unit, and filtering the sound wave signal according to each component.

The device of claim 16,
Wherein the adaptive filtering unit comprises:
A sound quality enhancing unit for enhancing the sound quality by processing the sound activity component of the external sound sensed by the sound activity sensing unit,
A background noise removing unit for removing background noise from a noise component of the external sound detected by the noise energy sensing unit,
A specific situation recognizing unit for recognizing a surrounding situation by a noise component from which the background noise is removed in the background noise removing unit;
And a warning sound generating unit for generating a warning sound corresponding to the specific situation recognized by the specific situation recognition unit and outputting the alarm sound to the sound mixing unit.

The device of claim 17,
Wherein the sound quality enhancing unit is configured to output a voice activity component having improved sound quality to the sound mixing unit.

The device of claim 17,
Wherein the sound quality enhancing unit is configured to bypass the sound mixing unit and to output the voice activity component having improved sound quality directly to the speaker of the earphone.

The device according to any one of claims 17 to 19,
Wherein the sound quality enhancing unit includes a Wiener filter based on a discrete wavelet transform (DWT).

The device according to any one of claims 17 to 19,
Wherein the background noise removing unit includes a Wiener filter based on a fast Fourier transform (FFT).

The device according to any one of claims 17 to 19,
Wherein the specific situation recognizing unit is configured to prepare a sound wave model corresponding to a specific situation, compare the noise component input from the background noise removing unit with the sound wave model, and determine the surrounding situation as a specific situation according to the comparison result.

The device of claim 22,
Wherein the specific situation recognizing unit includes a deep neural network based sound classifier for comparing the noise component with the sound wave model by deep learning.

The device according to any one of claims 16 to 19,
Wherein the sound adjusting unit is configured to perform, on the device sound, at least one sound adjustment process among volume adjustment processing, sound field effect processing, and fade-out processing, in consideration of the energy of the detected external sound, the voice activity, and the energy of the noise included in the external sound.

The device according to any one of claims 16 to 19,
Wherein the device is a mobile device including a telephone module configured to be connected to the earphone by wire, wirelessly, or both.

The device according to any one of claims 16 to 19,
Wherein the device is a media player configured to be connected to the earphone by wire, wirelessly, or both.

delete

A method of mixing a device sound and an external sound,
And a smart mixing step of detecting an external sound, adjusting the device sound according to the detected external sound, mixing the detected external sound and the adjusted device sound, and outputting the mixed sound to a speaker,
Wherein the smart mixing step comprises:
An external sound detection step of detecting a sound wave signal input through an external sound input unit including a microphone,
A sound adjusting step of adjusting the sound of the device in consideration of the energy of the detected external sound, the sound activity, and the energy of the noise included in the external sound,
And an acoustic mixing step of mixing the external sound detected in the external sound detecting step and the device sound adjusted in the sound adjusting step and outputting them to the speaker,
Wherein the external sound detecting step comprises:
An energy detecting step of detecting energy of a sound wave signal input through an external sound input unit including a microphone,
A voice activity detection step of detecting voice activity in the sound wave signal,
A noise energy sensing step of sensing energy of noise included in the sound wave signal,
And an adaptive filtering step of distinguishing a voice activity component and a noise component in the sound wave signal, in consideration of the voice activity sensed in the voice activity detection step and the noise energy sensed in the noise energy sensing step, and filtering the sound wave signal according to each component.

The method of claim 29,
Wherein the adaptive filtering step comprises:
A sound quality enhancement step of enhancing and outputting sound quality by processing a sound activity component of the external sound sensed in the sound activity sensing step,
A background noise removing step of removing background noise from a noise component of the external sound detected in the noise energy sensing step,
A specific situation recognition step of recognizing a surrounding situation by a noise component from which background noise is removed in the background noise removing step;
And a warning sound generating step of generating a warning sound corresponding to the specific situation recognized in the specific situation recognition step.

The method of claim 29,
Wherein the adaptive filtering step comprises:
A sound quality enhancement step of enhancing and outputting sound quality by processing a sound activity component of the external sound sensed in the sound activity sensing step,
A sound wave comparing step of comparing the noise component of the external sound detected in the noise energy sensing step with a sound wave model,
A noise identification step of identifying whether the noise component is a specific noise,
A noise removing step of removing the noise component if the noise component is not a specific noise,
A situation determination step of determining that a surrounding situation corresponds to the specific noise when the noise component is a specific noise,
And a warning sound generating step of generating a warning sound corresponding to the specific situation.

32. The method of claim 30,
Wherein the sound quality enhancement step is configured to skip the acoustic mixing step and directly output the sound activity component having improved sound quality to the speaker.

33. The method according to any one of claims 30 to 32,
Wherein the sound quality enhancement step uses discrete wavelet transform (DWT).
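
As a hedged illustration of how a discrete wavelet transform might be used for sound quality enhancement, the sketch below applies a single-level Haar DWT, soft-thresholds the detail coefficients, and reconstructs the signal. The choice of the Haar wavelet, the thresholding rule, and the names `haar_dwt`, `haar_idwt`, `soft_threshold`, and `enhance` are assumptions; the claim states only that DWT is used. The input length must be even.

```python
import math

SQRT2 = math.sqrt(2.0)


def haar_dwt(signal):
    """Single-level Haar DWT: returns (approximation, detail) coefficients."""
    approx = [(signal[i] + signal[i + 1]) / SQRT2 for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / SQRT2 for i in range(0, len(signal), 2)]
    return approx, detail


def haar_idwt(approx, detail):
    """Inverse of haar_dwt: rebuilds the interleaved sample sequence."""
    out = []
    for a, d in zip(approx, detail):
        out.append((a + d) / SQRT2)
        out.append((a - d) / SQRT2)
    return out


def soft_threshold(coeffs, threshold):
    """Shrink each coefficient toward zero by `threshold`."""
    return [math.copysign(max(abs(c) - threshold, 0.0), c) for c in coeffs]


def enhance(signal, threshold):
    """Suppress small high-frequency detail while keeping the coarse shape."""
    approx, detail = haar_dwt(signal)
    return haar_idwt(approx, soft_threshold(detail, threshold))
```

With a threshold of zero the signal is reconstructed unchanged; a larger threshold smooths each pair of samples toward its mean.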

34. The method according to any one of claims 30 to 32,
Wherein the background noise removing step uses Fast Fourier Transform (FFT).
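
One common FFT-based way to remove background noise is spectral subtraction, sketched below under stated assumptions. The claim says only that FFT is used; spectral subtraction, the radix-2 FFT helper, and the names `fft`, `ifft`, and `remove_background` are illustrative choices. The frame length must be a power of two, and `noise_magnitude` is assumed to be a per-bin magnitude estimate taken from a noise-only frame.

```python
import cmath


def fft(x):
    """Recursive radix-2 FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return [complex(x[0])]
    even = fft(x[0::2])
    odd = fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out


def ifft(spectrum):
    """Inverse FFT computed via the conjugate trick."""
    n = len(spectrum)
    conj = fft([v.conjugate() for v in spectrum])
    return [v.conjugate() / n for v in conj]


def remove_background(frame, noise_magnitude):
    """Subtract the estimated noise magnitude in each bin, keeping the phase."""
    cleaned = []
    for value, noise in zip(fft(frame), noise_magnitude):
        magnitude = max(abs(value) - noise, 0.0)  # floor at zero
        cleaned.append(cmath.rect(magnitude, cmath.phase(value)))
    return [v.real for v in ifft(cleaned)]
```

For example, a constant hum can be estimated from a hum-only frame with `[abs(v) for v in fft(hum)]` and then subtracted from a frame containing hum plus a tone.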

35. The method according to any one of claims 30 to 32,
Wherein the specific situation recognizing step comprises preparing a sound wave model corresponding to a specific situation, comparing the noise component sent from the background noise removing step with the sound wave model, and determining the surrounding situation as a specific situation according to the comparison result.
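
The prepare, compare, and determine sequence described above can be sketched as a nearest-model lookup. This is a minimal illustration that assumes each sound wave model is a fixed-length feature vector (for example, a magnitude spectrum) and that cosine similarity is the comparison measure; neither is stated in the claim. The labels `siren` and `car_horn`, the `threshold` value, and the function names are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def recognize_situation(features, models, threshold=0.8):
    """Return the label of the best-matching model, or None if none is close enough."""
    best_label, best_score = None, threshold
    for label, model in models.items():
        score = cosine_similarity(features, model)
        if score >= best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical sound wave models prepared in advance.
MODELS = {
    "siren": [0.0, 1.0, 0.0, 0.0],
    "car_horn": [0.0, 0.0, 1.0, 0.0],
}
```

A returned label would identify the specific situation; a result of `None` would mean that no specific situation is recognized.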

36. The method of claim 35,
Wherein the specific situation recognition step uses deep neural network based acoustic classification, which compares the noise component with the sound wave model by deep learning.

37. The method according to any one of claims 29 to 32,
Wherein the sound adjusting step performs at least one of a volume adjusting process, a sound field effect process, and a fade-out process on the device sound, in consideration of the energy of the detected external sound, the sound activity, and the energy of the noise included in the external sound.
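
A minimal sketch of the volume adjustment and fade-out of the device sound follows. The decision rule, the gain values, and the names `target_device_gain`, `fade`, `duck_gain`, and `loud_energy` are assumptions made for illustration; the claim does not prescribe how the three detected quantities are combined.

```python
def target_device_gain(external_energy, sound_active, noise_energy,
                       duck_gain=0.2, loud_energy=0.05):
    """Choose a gain for the device sound from the detected quantities.

    All thresholds and gains here are illustrative assumptions.
    """
    if sound_active:
        # Someone is speaking nearby: lower the device sound strongly.
        return duck_gain
    if external_energy - noise_energy > loud_energy:
        # A loud non-speech event stands out above the noise floor.
        return 0.5
    return 1.0


def fade(samples, start_gain, end_gain):
    """Apply a linear gain ramp from start_gain to end_gain across samples."""
    n = len(samples)
    if n == 1:
        return [samples[0] * end_gain]
    step = (end_gain - start_gain) / (n - 1)
    return [s * (start_gain + step * i) for i, s in enumerate(samples)]
```

Ramping between the previous gain and the new target with `fade` avoids an audible jump when the device sound is lowered or restored.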

delete

A system for mixing a device sound and an external sound and outputting the same,
And a smart mixing unit for detecting an external sound, adjusting the device sound according to the detected external sound, mixing the detected external sound and the adjusted device sound, and outputting the mixed sound to a speaker,
Wherein the smart mixing unit comprises:
An external sound detection unit for detecting a sound wave signal inputted through an external sound input unit including a microphone,
A sound adjusting unit for adjusting the sound of the device in consideration of the energy of the detected external sound, the sound activity and the energy of the noise included in the external sound,
And an acoustic mixing unit for mixing the external sound detected by the external sound detecting unit and the device sound adjusted by the sound adjusting unit and outputting the mixture to the speaker,
Wherein the external sound detection unit comprises:
An energy detecting unit for detecting energy of a sound wave signal inputted through an external sound input unit including a microphone,
A voice activity detection unit for detecting voice activity in the sound wave signal,
A noise energy sensing unit for sensing energy of noise included in the sound wave signal,
And an adaptive filtering unit for filtering the signal by distinguishing a sound activity component and a noise component from each other, using the sound activity detected in the sound wave signal and the noise energy detected by the noise energy sensing unit.
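
The role of the acoustic mixing unit can be sketched as a per-sample weighted sum of the external sound and the adjusted device sound. The function name `mix`, the `external_gain` parameter, and the hard clipping to the range [-1.0, 1.0] are assumptions; the claim does not specify the mixing arithmetic.

```python
def mix(external, device, device_gain, external_gain=1.0):
    """Mix external and device samples, clipping to the range [-1.0, 1.0]."""
    mixed = []
    for e, d in zip(external, device):
        sample = external_gain * e + device_gain * d
        mixed.append(max(-1.0, min(1.0, sample)))
    return mixed
```

Setting either gain to zero corresponds to shutting off that source entirely, which the description also treats as a form of mixing.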

41. The system of claim 40,
Wherein the adaptive filtering unit comprises:
A sound quality enhancing unit for enhancing the sound quality by processing the sound activity component of the external sound sensed by the sound activity sensing unit,
A background noise removing unit for removing background noise from a noise component of the external sound detected by the noise energy sensing unit,
A specific situation recognizing unit for recognizing a surrounding situation by a noise component from which the background noise is removed in the background noise removing unit;
And a warning sound generating unit for generating a warning sound corresponding to the specific situation recognized by the specific situation recognizing unit.

42. The system of claim 41,
Wherein the specific situation recognizing unit prepares a sound wave model corresponding to a specific situation, compares the noise component input from the background noise removing unit with the sound wave model, and determines the surrounding situation as a specific situation according to the comparison result.

43. The system according to any one of claims 40 to 42,
Wherein the sound adjusting unit performs at least one of a volume adjusting process, a sound field effect process, and a fade-out process on the device sound, in consideration of the energy of the detected external sound, the sound activity, and the energy of the noise included in the external sound.