CN110265056B - Sound source control method, loudspeaker device and apparatus - Google Patents

Sound source control method, loudspeaker device and apparatus

Info

Publication number
CN110265056B
Authority
CN
China
Prior art keywords
audio signal
signal
noise ratio
audio
interaction
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910501128.9A
Other languages
Chinese (zh)
Other versions
CN110265056A (en)
Inventor
叶强胜
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Anker Innovations Co Ltd
Original Assignee
Anker Innovations Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Anker Innovations Co Ltd
Priority to CN201910501128.9A
Publication of CN110265056A
Application granted
Publication of CN110265056B
Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00 Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02 Speech enhancement, e.g. noise reduction or echo cancellation
    • G10L21/0316 Speech enhancement, e.g. noise reduction or echo cancellation by changing the amplitude
    • G10L21/0324 Details of processing therefor
    • G10L21/034 Automatic adjustment
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/48 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use
    • G10L25/51 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for comparison or discrimination
    • G10L25/60 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for comparison or discrimination for measuring the quality of voice signals

Landscapes

  • Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Quality & Reliability (AREA)
  • Signal Processing (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Circuit For Audible Band Transducer (AREA)

Abstract

The invention relates to the technical field of loudspeakers and discloses a sound source control method, a loudspeaker device and an apparatus with a storage function. The control method comprises the following steps: acquiring a first audio signal with a bone conduction sensor and a second audio signal with a microphone; fusing the first audio signal and the second audio signal to obtain a fused audio signal; calculating the signal-to-noise ratio of the fused audio signal, and judging whether the signal-to-noise ratio of the fused audio signal is smaller than a signal-to-noise ratio threshold value; and if the signal-to-noise ratio of the fused audio signal is smaller than the signal-to-noise ratio threshold value, applying the first audio signal to audio signal interaction. In this way, a reasonable audio signal can be selected and applied to audio signal interaction.

Description

Sound source control method, loudspeaker device and apparatus
Technical Field
The present invention relates to the field of speaker technology, and in particular, to a method for controlling a sound source, a speaker device, and an apparatus having a storage function.
Background
At present, a user can generally wear a speaker such as an earphone and the speaker is connected with a mobile terminal such as a smart phone, so that the user can perform voice interaction with a call counterpart by using the earphone as a medium for call voice input and output.
When the environmental noise of the user is large or the noise is difficult to eliminate, the normal operation of the communication will be disturbed, and the noise reduction of the earphone communication voice is needed. Based on the current noise reduction method applied to the earphone communication voice, the noise reduction effect is poor, so that the definition of the communication voice is insufficient.
Disclosure of Invention
In view of the above, the present invention provides a method for controlling an audio source, a speaker device, and an apparatus with a storage function, which can select a reasonable audio signal for audio signal interaction.
In order to solve the technical problems, the invention adopts a technical scheme that: there is provided a control method of a sound source, the control method including: acquiring a first audio signal with a bone conduction sensor and a second audio signal with a microphone; fusing the first audio signal and the second audio signal to obtain a fused audio signal; calculating the signal-to-noise ratio of the fusion audio signal, and judging whether the signal-to-noise ratio of the fusion audio signal is smaller than a signal-to-noise ratio threshold value; and if the signal-to-noise ratio of the fusion audio signal is smaller than the signal-to-noise ratio threshold value, applying the first audio signal to audio signal interaction.
In an embodiment of the invention, the signal-to-noise ratio of the first audio signal is greater than the signal-to-noise ratio threshold and the signal-to-noise ratio of the second audio signal.
In an embodiment of the invention, the step of applying the first audio signal to the audio signal interaction comprises: performing equalization adjustment or gain processing on the first audio signal to at least increase the amplitude of the audio signal in a preset frequency band in the first audio signal; the first audio signal subjected to the equalization adjustment or the gain processing is applied to the audio signal interaction.
In an embodiment of the present invention, the step of acquiring the first audio signal by using the bone conduction sensor and the step of acquiring the second audio signal by using the microphone comprise: outputting an audio signal in a preset frequency band in the first audio signal to a user; acquiring the minimum amplitude of the audio signal in the preset frequency band which can be perceived by a user in the first audio signal; the step of performing equalization adjustment or gain processing on the first audio signal to increase at least the amplitude of the audio signal in the preset frequency band in the first audio signal includes: and increasing the amplitude of the audio signal in the preset frequency band in the first audio signal to be not less than the minimum amplitude.
In an embodiment of the present invention, the interactive contents recorded in the first audio signal and the second audio signal are the same; the step of fusing the first audio signal and the second audio signal to obtain a fused audio signal comprises: and taking the audio signal in the preset frequency band in the second audio signal as the audio signal corresponding to the preset frequency band in the first audio signal on the basis of the first audio signal to obtain a fused audio signal.
In an embodiment of the invention, the preset frequency band is defined as the audio frequency band greater than or equal to 1 kHz.
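The gain processing described in the preceding embodiments can be illustrated with a minimal sketch. It assumes one particular reading of the text: the part of the first audio signal at or above the preset band edge is isolated, its peak amplitude is measured, and a single gain is applied to that band so that the peak is not less than the minimum perceivable amplitude. The function names (`boost_band`, `tone`), the naive DFT helpers and the peak-based gain rule are hypothetical illustrations, not the method specified by the invention.

```python
import cmath
import math


def dft(samples):
    """Naive O(n^2) discrete Fourier transform, adequate for short demo frames."""
    n = len(samples)
    return [
        sum(samples[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def idft(spectrum):
    """Inverse of dft(); returns the real part of each reconstructed sample."""
    n = len(spectrum)
    return [
        (sum(spectrum[k] * cmath.exp(2j * cmath.pi * k * t / n) for k in range(n)) / n).real
        for t in range(n)
    ]


def tone(freq_hz, amplitude, n, sample_rate):
    """Test helper: n samples of a sine wave."""
    return [amplitude * math.sin(2 * math.pi * freq_hz * t / sample_rate) for t in range(n)]


def boost_band(first_audio, sample_rate, min_amplitude, cutoff_hz=1000.0, eps=1e-12):
    """Raise the band at or above cutoff_hz so its peak is at least min_amplitude.

    Assumption: "amplitude" is taken to mean the time-domain peak of the band.
    """
    n = len(first_audio)
    spectrum = dft(first_audio)
    # Keep only bins whose (folded) frequency lies in the preset band.
    masked = [
        c if min(k, n - k) * sample_rate / n >= cutoff_hz else 0j
        for k, c in enumerate(spectrum)
    ]
    band = idft(masked)
    peak = max(abs(v) for v in band)
    # Nothing to boost if the band is empty (numerical residue only) or already loud enough.
    if peak < eps or peak >= min_amplitude:
        return list(first_audio)
    gain = min_amplitude / peak
    # low band + gain * high band == original + (gain - 1) * high band
    return [x + (gain - 1.0) * h for x, h in zip(first_audio, band)]
```

In this sketch a band that contains only numerical residue is left untouched, since there is no signal content to amplify.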
In an embodiment of the present invention, the step of determining whether the signal-to-noise ratio of the fusion audio signal is smaller than the signal-to-noise ratio threshold further includes: and if the signal-to-noise ratio of the fusion audio signal is not less than the signal-to-noise ratio threshold value, applying the fusion audio signal to audio signal interaction.
In one embodiment of the invention, the audio signal interaction comprises at least one of a mobile data call and an internet call.
In order to solve the technical problem, the invention adopts another technical scheme that: there is provided a speaker device comprising a processor, a bone conduction sensor and a microphone, the bone conduction sensor and the microphone being respectively coupled to the processor, the processor being configured to: controlling the bone conduction sensor and the microphone to perform audio signal interaction; acquiring a first audio signal with a bone conduction sensor and a second audio signal with a microphone; fusing the first audio signal and the second audio signal to obtain a fused audio signal; calculating the signal-to-noise ratio of the fusion audio signal, and judging whether the signal-to-noise ratio of the fusion audio signal is smaller than a signal-to-noise ratio threshold value; and if the signal-to-noise ratio of the fusion audio signal is smaller than the signal-to-noise ratio threshold value, applying the first audio signal to audio signal interaction.
In order to solve the technical problem, the invention adopts another technical scheme that: there is provided a device having a storage function, which stores program data that can be executed to implement the method of controlling a sound source as set forth in the above embodiment.
The invention has the beneficial effects that: different from the prior art, the invention provides a sound source control method, which comprises the steps of fusing a first audio signal acquired by a bone conduction sensor and a second audio signal acquired by a microphone to obtain a fused audio signal, calculating the signal-to-noise ratio of the fused audio signal and judging whether the signal-to-noise ratio of the fused audio signal is smaller than a signal-to-noise ratio threshold value, and applying the first audio signal acquired by the bone conduction sensor (generally, the audio signal acquired by the bone conduction sensor has a higher signal-to-noise ratio) to audio signal interaction when the signal-to-noise ratio of the fused audio signal is smaller than the signal-to-noise ratio threshold value, namely, the signal-to-noise ratio of the fused audio signal does not meet the requirement of audio signal interaction, so that a reasonable audio signal can be selected to be applied to the audio signal interaction to meet the requirement of the definition of the audio signal corresponding to the audio signal interaction.
Drawings
In order to more clearly illustrate the technical solutions in the embodiments of the present invention, the drawings needed to be used in the description of the embodiments will be briefly introduced below, and it is obvious that the drawings in the following description are only some embodiments of the present invention, and it is obvious for those skilled in the art to obtain other drawings based on these drawings without creative efforts. Wherein:
FIG. 1 is a schematic flow chart of a sound source control method according to a first embodiment of the present invention;
FIG. 2 is a schematic flow chart of a sound source control method according to a second embodiment of the present invention;
FIG. 3 is a schematic flow chart of a sound source control method according to a third embodiment of the present invention;
FIG. 4 is a schematic structural diagram of an embodiment of the speaker apparatus of the present invention;
FIG. 5 is a schematic structural diagram of an embodiment of an apparatus with a storage function according to the present invention.
Detailed Description
In order to make the aforementioned objects, features and advantages of the present invention comprehensible, embodiments accompanied with figures are described in detail below. It is to be understood that the specific embodiments described herein are merely illustrative of the invention and are not limiting of the invention. It should be further noted that, for the convenience of description, only some of the structures related to the present invention are shown in the drawings, not all of the structures. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
The terms "first", "second", and the like in the present invention are used for distinguishing different objects, not for describing a particular order. Furthermore, the terms "include" and "have," as well as any variations thereof, are intended to cover non-exclusive inclusions. For example, a process, method, system, article, or apparatus that comprises a list of steps or elements is not limited to only those steps or elements listed, but may alternatively include other steps or elements not listed, or inherent to such process, method, article, or apparatus.
Reference herein to "an embodiment" means that a particular feature, structure, or characteristic described in connection with the embodiment can be included in at least one embodiment of the invention. The appearances of the phrase in various places in the specification are not necessarily all referring to the same embodiment, nor are separate or alternative embodiments mutually exclusive of other embodiments. It is explicitly and implicitly understood by one skilled in the art that the embodiments described herein can be combined with other embodiments. And features in the embodiments and examples described below may be combined with each other without conflict.
A bone conduction sensor (bone sensor) is one way to capture the user's voice. It collects the sound emitted by the user as transmitted through the user's skull, bony labyrinth, and so on. The bone conduction sensor can collect a relatively pure version of the sound emitted by the user. However, the frequency of the sound it collects is not high, which shows up as a loss of high-frequency sound and results in poor sound quality.
A microphone is another way to capture the user's voice. It collects the sound emitted by the user as transmitted through a medium such as air. The microphone can collect most of the frequency range of the sound emitted by the user, especially the high-frequency part, but it also collects the noise in the environment. When the environmental noise is large, the signal-to-noise ratio of the sound signal collected by the microphone is small, which means that the definition of the sound collected by the microphone is insufficient.
In view of this, the embodiments of the present invention provide a method for controlling an audio source in combination with a bone conduction transducer and a microphone, which can at least meet the requirement of the definition of an audio signal applied to audio signal interaction. The following is a detailed description:
Referring to FIG. 1, FIG. 1 is a schematic flow chart of a sound source control method according to a first embodiment of the present invention.
S101: acquiring a first audio signal with a bone conduction sensor and a second audio signal with a microphone;
In this embodiment, although the audio signal collected by the bone conduction sensor lacks its high-frequency part, it generally has a higher signal-to-noise ratio and a higher definition. The microphone can collect audio signals in most frequency bands, especially the high-frequency part, but it also collects noise in the environment, so that in some severely noisy environments the signal-to-noise ratio of the audio signal collected by the microphone is low and its definition is insufficient for audio signal interaction.
Therefore, the present embodiment combines the bone conduction sensor and the microphone, so that the bone conduction sensor and the microphone have complementary advantages, and an audio signal with high signal-to-noise ratio and good sound quality is fused for audio signal interaction. Specifically, a first audio signal is acquired with a bone conduction sensor and a second audio signal is acquired with a microphone.
S102: fusing the first audio signal and the second audio signal to obtain a fused audio signal;
In this embodiment, after the bone conduction sensor is used to collect the first audio signal and the microphone is used to collect the second audio signal, the first audio signal and the second audio signal are fused so that the bone conduction sensor and the microphone complement each other, yielding the fused audio signal. Generally, the fused audio signal has a high signal-to-noise ratio and good sound quality. However, in some severely noisy environments, when the frequency of the noise is higher than the maximum frequency of the audio signal that the bone conduction sensor can collect, part of the noise in the fused audio signal cannot be reduced or eliminated by fusion. The noise reduction effect is then not good enough, the signal-to-noise ratio of the fused audio signal may be small, and its definition may be insufficient to meet the requirement of audio signal interaction.
The definition of the audio signal can be understood as the degree to which the interactive content carried by the audio signal can be clearly received and recognized by the human ear. Put simply, the higher the definition of the audio signal, the more clearly the user hears it, so that the audio signal emitted by one party of the audio signal interaction can be clearly received and recognized by the other party.
S103: calculating the signal-to-noise ratio of the fusion audio signal, and judging whether the signal-to-noise ratio of the fusion audio signal is smaller than a signal-to-noise ratio threshold value;
In this embodiment, considering that in some severely noisy environments part of the noise in the fused audio signal cannot be reduced or eliminated by fusion, it is necessary to analyze the signal-to-noise ratio of the fused audio signal and determine whether the fused audio signal can satisfy the requirement of audio signal interaction.
Specifically, the signal-to-noise ratio of the fused audio signal is calculated, and whether the signal-to-noise ratio of the fused audio signal is smaller than a signal-to-noise ratio threshold is judged. The signal-to-noise ratio threshold is defined as the minimum signal-to-noise ratio corresponding to the audio signal definition requirement for audio signal interaction. When the signal-to-noise ratio of the fused audio signal is smaller than the signal-to-noise ratio threshold, the definition of the fused audio signal cannot meet the requirement of audio signal interaction; otherwise, the definition of the fused audio signal can meet the requirement of audio signal interaction.
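The text does not state how the signal-to-noise ratio is calculated. As a minimal sketch, one common definition is the ratio of mean signal power to mean noise power expressed in decibels. The sketch assumes that separate estimates of the speech component and the noise component are available (for example, a noise estimate taken while the user is not speaking); the function names `mean_power`, `snr_db` and `is_below_threshold` are hypothetical.

```python
import math


def mean_power(samples):
    """Mean squared amplitude of a non-empty sequence of samples."""
    return sum(s * s for s in samples) / len(samples)


def snr_db(signal_estimate, noise_estimate):
    """Signal-to-noise ratio in dB: 10 * log10(P_signal / P_noise).

    Assumption: the caller supplies separate signal and noise estimates.
    """
    p_noise = mean_power(noise_estimate)
    p_signal = mean_power(signal_estimate)
    if p_noise == 0.0:
        return math.inf   # no measurable noise
    if p_signal == 0.0:
        return -math.inf  # no measurable signal
    return 10.0 * math.log10(p_signal / p_noise)


def is_below_threshold(snr_value_db, snr_threshold_db):
    """True when the signal-to-noise ratio is smaller than the threshold."""
    return snr_value_db < snr_threshold_db
```

The comparison is strict, so a signal-to-noise ratio exactly equal to the threshold counts as meeting the requirement.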
S104: if the signal-to-noise ratio of the fusion audio signal is smaller than the signal-to-noise ratio threshold value, applying the first audio signal to audio signal interaction;
In this embodiment, if the signal-to-noise ratio of the fused audio signal is smaller than the signal-to-noise ratio threshold, the definition of the fused audio signal cannot meet the requirement of audio signal interaction, and therefore another reasonable audio signal needs to be selected to replace the fused audio signal for audio signal interaction. This embodiment improves the definition of the audio signal applied to the audio signal interaction by selecting an audio signal with a higher signal-to-noise ratio. Specifically, the first audio signal is applied to the audio signal interaction. Because the audio signal collected by the bone conduction sensor usually has a higher signal-to-noise ratio, its definition is higher. Applying the first audio signal collected by the bone conduction sensor to the audio signal interaction can therefore greatly improve the definition of the audio signal used for the interaction, ensuring at least that the interactive content can be clearly received and recognized by the human ear and that the interaction proceeds normally.
It can be seen from the above that, according to the sound source control method provided by the present invention, when the signal-to-noise ratio of the fusion audio signal is smaller than the signal-to-noise ratio threshold, that is, the signal-to-noise ratio of the fusion audio signal does not meet the requirement of audio signal interaction, the first audio signal acquired by the bone conduction sensor is applied to the audio signal interaction, and the requirement of the definition of the audio signal applied to the audio signal interaction is met by selecting a reasonable manner in which the audio signal is applied to the audio signal interaction.
Referring to fig. 2, fig. 2 is a schematic flow chart of a sound source control method according to a second embodiment of the present invention.
S201: carrying out audio signal interaction;
In this embodiment, audio signal interaction is performed, and this embodiment can provide an audio signal with a high signal-to-noise ratio for the audio signal interaction, so as to ensure at least that the interactive content is transmitted clearly. The audio signal interaction can be a mobile data call: a call is established, for example by dialing over an operator network, and the call voice exchanged between the calling party and the receiving party of the call is the audio signal interaction. Of course, the audio signal interaction may also be an internet call or the like, in which a network call is established over the internet; this is not limited herein.
S202: acquiring a first audio signal with a bone conduction sensor and a second audio signal with a microphone;
In this embodiment, the bone conduction sensor and the microphone are combined so that their advantages complement each other, and an audio signal with a high signal-to-noise ratio and good sound quality is obtained by fusion for audio signal interaction. Specifically, a first audio signal is acquired with a bone conduction sensor and a second audio signal is acquired with a microphone.
S203: fusing the first audio signal and the second audio signal to obtain a fused audio signal;
In this embodiment, after the bone conduction sensor is used to collect the first audio signal and the microphone is used to collect the second audio signal, the first audio signal and the second audio signal are fused so that the bone conduction sensor and the microphone complement each other, yielding the fused audio signal.
Generally, the fused audio signal has a high signal-to-noise ratio and good sound quality. Considering that the interactive contents recorded in the first audio signal and the second audio signal are the same, the specific process of fusing the first audio signal and the second audio signal may be: and taking the audio signal in the preset frequency band in the second audio signal as the audio signal corresponding to the preset frequency band in the first audio signal on the basis of the first audio signal to obtain a fused audio signal. That is to say, the audio signal in the second audio signal in the preset frequency band replaces the audio signal corresponding to the preset frequency band in the first audio signal, and the audio signal in the second audio signal in the preset frequency band makes up for the absence of the audio signal corresponding to the preset frequency band in the first audio signal, without affecting the integrity of the interactive content recorded by the first audio signal and the second audio signal, i.e., the fused audio signal still completely records the interactive content.
Based on the above, in an environment with severe noise, especially when the frequency of the noise is higher than the maximum frequency of the audio signal that can be collected by the bone conduction sensor, replacing the audio signal corresponding to the preset frequency band in the first audio signal with the audio signal in the preset frequency band in the second audio signal will fuse the noise with the frequency higher than the maximum frequency of the audio signal that can be collected by the bone conduction sensor into the fused audio signal, so that the noise in the fused audio signal cannot be eliminated, the noise reduction effect of the fused audio signal is poor, and the signal-to-noise ratio of the fused audio signal may be small, and the definition of the fused audio signal is not sufficient to meet the requirement of audio signal interaction.
It should be noted that the preset frequency band is defined as the audio frequency band greater than or equal to 1 kHz. For the frequency band below 1 kHz, the bone conduction sensor can acquire a relatively pure audio signal, namely the first audio signal, and the signal-to-noise ratio of the first audio signal is high. That is to say, in the first audio signal collected by the bone conduction sensor, the part in the frequency band below 1 kHz is largely complete, while the part in the frequency band above 1 kHz is absent. In contrast, the microphone can completely collect most of the frequency band, or even the full frequency band, of the audio signal, namely the second audio signal. However, the microphone introduces ambient noise into the second audio signal, resulting in a small signal-to-noise ratio of the second audio signal.
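The band replacement described for the fusion step can be sketched as follows. This is only an illustration under stated assumptions: both signals are time-aligned frames of equal length, the replacement is done per frequency bin with a naive DFT (a stand-in chosen for self-containedness, not a technique named in the text), and the names `fuse` and `tone` are hypothetical.

```python
import cmath
import math


def dft(samples):
    """Naive O(n^2) discrete Fourier transform, adequate for short demo frames."""
    n = len(samples)
    return [
        sum(samples[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def idft(spectrum):
    """Inverse of dft(); returns the real part of each reconstructed sample."""
    n = len(spectrum)
    return [
        (sum(spectrum[k] * cmath.exp(2j * cmath.pi * k * t / n) for k in range(n)) / n).real
        for t in range(n)
    ]


def tone(freq_hz, amplitude, n, sample_rate):
    """Test helper: n samples of a sine wave."""
    return [amplitude * math.sin(2 * math.pi * freq_hz * t / sample_rate) for t in range(n)]


def fuse(first_audio, second_audio, sample_rate, cutoff_hz=1000.0):
    """Keep the first (bone conduction) signal below cutoff_hz and take the
    second (microphone) signal at or above cutoff_hz."""
    if len(first_audio) != len(second_audio):
        raise ValueError("both frames must have the same length")
    n = len(first_audio)
    first_spec = dft(first_audio)
    second_spec = dft(second_audio)
    fused_spec = []
    for k in range(n):
        # Bins above n/2 are mirror images; fold them so that both halves
        # of the spectrum are treated alike and the output stays real.
        freq_hz = min(k, n - k) * sample_rate / n
        fused_spec.append(second_spec[k] if freq_hz >= cutoff_hz else first_spec[k])
    return idft(fused_spec)
```

In this sketch, any noise the microphone picks up at or above the band edge passes into the fused frame unchanged, which matches the limitation described for severely noisy environments.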
S204: calculating the signal-to-noise ratio of the fusion audio signal, and judging whether the signal-to-noise ratio of the fusion audio signal is smaller than a signal-to-noise ratio threshold value;
In this embodiment, if the signal-to-noise ratio of the fused audio signal is smaller than the signal-to-noise ratio threshold, step S205 is executed; if the signal-to-noise ratio of the fused audio signal is not less than the signal-to-noise ratio threshold, step S206 is executed.
Considering that in some severely noisy environments part of the noise in the fused audio signal cannot be reduced or eliminated by fusion, the signal-to-noise ratio of the fused audio signal needs to be analyzed to determine whether the fused audio signal can meet the requirement of audio signal interaction.
Specifically, the signal-to-noise ratio of the fusion audio signal is calculated, and whether the signal-to-noise ratio of the fusion audio signal is smaller than a signal-to-noise ratio threshold is judged. When the signal-to-noise ratio of the fusion audio signal is smaller than the signal-to-noise ratio threshold value, the definition of the fusion audio signal cannot meet the requirement of audio signal interaction; and when the signal-to-noise ratio of the fusion audio signal is not less than the signal-to-noise ratio threshold, the definition of the fusion audio signal can meet the requirement of audio signal interaction.
S205: applying the first audio signal to the audio signal interaction;
In this embodiment, if the signal-to-noise ratio of the fused audio signal is smaller than the signal-to-noise ratio threshold, the definition of the fused audio signal cannot meet the requirement of audio signal interaction, and therefore another reasonable audio signal needs to be selected to replace the fused audio signal for audio signal interaction. This embodiment improves the definition of the audio signal applied to the audio signal interaction by selecting an audio signal with a higher signal-to-noise ratio. Specifically, the first audio signal is applied to the audio signal interaction, and then the flow ends.
Since the audio signal collected by the bone conduction sensor generally has a higher signal-to-noise ratio, the signal-to-noise ratio of the first audio signal is generally greater than both the signal-to-noise ratio threshold and the signal-to-noise ratio of the second audio signal, and its definition is higher. For the same reason, the fusion of the first audio signal and the second audio signal described above takes the first audio signal, with its high signal-to-noise ratio, as the basis and substitutes the second audio signal into the missing high-frequency part of the first audio signal. Applying the first audio signal collected by the bone conduction sensor to the audio signal interaction can greatly improve the definition of the audio signal used for the interaction, ensuring at least that the interactive content can be clearly received and recognized by the human ear and that the interaction proceeds normally.
S206: applying the fused audio signal to the audio signal interaction;
In this embodiment, if the signal-to-noise ratio of the fused audio signal is not less than the signal-to-noise ratio threshold, the definition of the fused audio signal can meet the requirement of audio signal interaction, and therefore the fused audio signal is applied to audio signal interaction. Compared with the first audio signal, the fused audio signal combines the middle-low frequency part (namely, the frequency band below the preset frequency band) of the first audio signal, which has a higher signal-to-noise ratio, with the high-frequency part (namely, the preset frequency band) of the second audio signal. The fused audio signal therefore has a signal-to-noise ratio sufficient for audio signal interaction while keeping the high-frequency audio complete, and so has good sound quality.
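The two branches of this embodiment (steps S205 and S206) amount to a single comparison. A minimal sketch, assuming the signal-to-noise ratio of the fused audio signal has already been computed; the function name `select_interaction_signal` and the string labels are hypothetical.

```python
def select_interaction_signal(first_audio, fused_audio, fused_snr_db, snr_threshold_db):
    """Choose the signal applied to audio signal interaction.

    Returns a (label, samples) pair: the first (bone conduction) signal when
    the fused signal's SNR is smaller than the threshold (S205), otherwise
    the fused signal (S206).
    """
    if fused_snr_db < snr_threshold_db:
        return "first", first_audio
    return "fused", fused_audio
```

For example, with a threshold of 10 dB, a fused signal measured at 5 dB would be replaced by the first audio signal, while one measured at exactly 10 dB would be used as is, because the condition is "not less than".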
Referring to fig. 3, fig. 3 is a schematic flow chart of a sound source control method according to a third embodiment of the present invention. The present embodiment is different from the foregoing embodiments in that the first audio signal collected by the bone conduction sensor is adaptively processed, and the purpose is to pull up the audio of the high frequency part in the first audio signal, so as to improve the sound quality of the first audio signal. The following is a detailed description:
s301: carrying out audio signal interaction;
in the present embodiment, audio signal interaction is performed, and the present embodiment can provide an audio signal with a high signal-to-noise ratio for audio signal interaction, so as to ensure at least clear transmission of interactive contents.
S302: acquiring a first audio signal with a bone conduction sensor and a second audio signal with a microphone;
in the embodiment, by combining the bone conduction sensor and the microphone, the advantages of the bone conduction sensor and the microphone are complemented so as to fuse the audio signal with high signal-to-noise ratio and good sound quality for audio signal interaction. Specifically, a first audio signal is acquired with a bone conduction sensor and a second audio signal is acquired with a microphone.
S303: fusing the first audio signal and the second audio signal to obtain a fused audio signal;
In this embodiment, after the first audio signal is collected with the bone conduction sensor and the second audio signal is collected with the microphone, the two signals are fused so that the advantages of the bone conduction sensor and the microphone complement each other, yielding the fused audio signal.
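The fusion in S303 can be pictured as a band replacement: the first audio signal is kept as the base, and its content in the preset frequency band is replaced by the same band of the second audio signal. The following is only a minimal sketch under stated assumptions, not the patented implementation. The function names `dft`, `idft` and `fuse`, the naive transform, and the default band start of 1 kHz (the value named in claim 4) are illustrative choices.

```python
import cmath
import math


def dft(samples):
    """Naive discrete Fourier transform (fine for short illustrative frames)."""
    n = len(samples)
    return [sum(samples[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]


def idft(spectrum):
    """Naive inverse transform; returns the real part of each sample."""
    n = len(spectrum)
    return [(sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n).real
            for t in range(n)]


def fuse(first, second, sample_rate, band_start_hz=1000.0):
    """Keep `first` below band_start_hz and take `second` at or above it.

    `first` is assumed to be the bone conduction frame and `second` the
    microphone frame, both of the same length and time-aligned.
    """
    if len(first) != len(second):
        raise ValueError("both frames must have the same length")
    n = len(first)
    first_spec, second_spec = dft(first), dft(second)
    fused_spec = []
    for k in range(n):
        # Bins above n/2 mirror the negative frequencies, so use the mirrored index.
        freq_hz = min(k, n - k) * sample_rate / n
        fused_spec.append(second_spec[k] if freq_hz >= band_start_hz else first_spec[k])
    return idft(fused_spec)
```

Selecting whole bins symmetrically keeps the output real-valued. A practical device would more likely use crossover filters on streaming audio; the hard spectral split here is only meant to show which signal contributes which band.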
S304: calculating the signal-to-noise ratio of the fused audio signal, and judging whether the signal-to-noise ratio of the fused audio signal is smaller than a signal-to-noise ratio threshold;
In this embodiment, if the signal-to-noise ratio of the fused audio signal is smaller than the signal-to-noise ratio threshold, step S305 is executed; if the signal-to-noise ratio of the fused audio signal is not less than the signal-to-noise ratio threshold, step S307 is executed.
In some harsh, noisy environments, fusion cannot reduce or eliminate part of the noise in the fused audio signal. The signal-to-noise ratio of the fused audio signal therefore needs to be analyzed to determine whether the fused audio signal can meet the requirement of audio signal interaction.
Specifically, the signal-to-noise ratio of the fused audio signal is calculated and compared with the signal-to-noise ratio threshold. When the signal-to-noise ratio of the fused audio signal is smaller than the threshold, the clarity of the fused audio signal cannot meet the requirement of audio signal interaction; when it is not less than the threshold, the clarity of the fused audio signal can meet the requirement.
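The text does not state how the signal-to-noise ratio is computed. The sketch below assumes one common estimate: a noise-only frame (for example, captured while the user is silent) gives the noise power, and the speech power is the power of the noisy frame minus that noise power. The names `mean_power`, `snr_db` and `next_step`, and the use of decibels, are assumptions made for illustration.

```python
import math


def mean_power(samples):
    """Average power of a frame of samples."""
    return sum(s * s for s in samples) / len(samples)


def snr_db(noisy_speech, noise_only):
    """Estimate SNR in dB from a noisy speech frame and a noise-only frame."""
    p_noise = max(mean_power(noise_only), 1e-12)  # guard against division by zero
    p_speech = max(mean_power(noisy_speech) - p_noise, 1e-12)
    return 10.0 * math.log10(p_speech / p_noise)


def next_step(fused_snr_db, threshold_db):
    """Branch of S304: below the threshold go to S305, otherwise go to S307."""
    return "S305" if fused_snr_db < threshold_db else "S307"
```

Note that a signal-to-noise ratio exactly equal to the threshold takes the S307 branch, matching the "not less than" wording of the step.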
S305: carrying out equalization adjustment or gain processing on the first audio signal;
In this embodiment, if the signal-to-noise ratio of the fused audio signal is smaller than the signal-to-noise ratio threshold, the clarity of the fused audio signal cannot meet the requirement of audio signal interaction, so another suitable audio signal must be selected to replace it. This embodiment improves the clarity of the audio signal applied to the interaction by selecting an audio signal with a higher signal-to-noise ratio. Specifically, the first audio signal is applied to the audio signal interaction. Because the audio signal collected by the bone conduction sensor usually has a higher signal-to-noise ratio, applying the first audio signal greatly improves the clarity of the audio signal used for the interaction and ensures at least that the interactive content can be clearly received and recognized by the human ear, so that the interaction proceeds normally.
Further, the sound quality of the first audio signal is poor because its audio in the preset frequency band is missing. In view of this, in this embodiment the first audio signal is subjected to equalization adjustment or gain processing to at least increase the amplitude of its audio in the preset frequency band, so that this audio can be perceived by the receiving party of the audio signal interaction. This ensures that the interaction proceeds normally and mitigates the poor sound quality caused by the loss of high-frequency audio in the first audio signal.
Equalization adjustment means adjusting the first audio signal with an equalizer (EQ). The equalization adjustment may target only the audio in the preset frequency band of the first audio signal, increasing its amplitude. In the frequency bands outside the preset frequency band, the first audio signal already has a high signal-to-noise ratio and is sufficiently clear, so it may be left unadjusted, which reduces the adjustment workload and the burden on the device. Of course, in other embodiments of the present invention, the equalization adjustment may also be applied to the first audio signal across the full frequency band to further ensure clear transmission of the interactive content and the quality of the audio signal interaction, which is not limited herein.
Gain processing means increasing the gain of the first audio signal across the full frequency band by using a gain circuit such as a signal amplification circuit. This at least increases the amplitude of the audio in the preset frequency band of the first audio signal; the gain in the frequency bands outside the preset frequency band is increased as well, which further ensures clear transmission of the interactive content and the quality of the audio signal interaction.
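The difference between the two options in S305 can be shown on a simplified model in which the first audio signal is described by one amplitude per frequency band. This is only a sketch under that assumption. The dictionary layout, the function names `equalize` and `apply_gain`, the linear boost factors, and the 1 kHz band start (taken from claim 4) are illustrative and not part of the source.

```python
def equalize(band_amplitudes, boost, band_start_hz=1000.0):
    """Equalization: scale only the bands at or above band_start_hz."""
    return {freq: amp * boost if freq >= band_start_hz else amp
            for freq, amp in band_amplitudes.items()}


def apply_gain(band_amplitudes, gain):
    """Gain processing: scale every band by the same factor."""
    return {freq: amp * gain for freq, amp in band_amplitudes.items()}


# Hypothetical band amplitudes of a bone conduction frame (Hz -> amplitude).
first_signal = {250.0: 1.0, 500.0: 0.8, 2000.0: 0.1, 4000.0: 0.05}
equalized = equalize(first_signal, boost=4.0)
amplified = apply_gain(first_signal, gain=4.0)
```

With equalization the bands below 1 kHz keep their original amplitudes, while with gain processing every band is raised, which is the trade-off between less processing and a uniformly louder signal described above.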
Further, the amount by which the equalization adjustment or gain processing raises the audio in the preset frequency band of the first audio signal may be determined before the audio signal interaction, so that the adjusted first audio signal has good sound quality.
Specifically, the audio signal in the preset frequency band of the first audio signal is output to the user, and the minimum amplitude at which the user can perceive that audio signal is then obtained. Here, the amplitude is defined as the amplitude of the harmonic corresponding to the audio signal.
For example, consider the sound source control method of this embodiment applied to a mobile terminal and a speaker device. The mobile terminal may be a smartphone or the like, and the speaker device may be an earphone or the like. The earphone is connected to the mobile terminal for input and output of call voice. The adjustment amount may be determined through cooperation between an application program on the mobile terminal (which may be an equalizer) and the earphone. For example, the earphone outputs the audio signal in the preset frequency band of the first audio signal to the user, and the user adjusts the volume of that audio signal in the application program on the mobile terminal. The minimum volume at which the user can perceive the audio signal is thereby determined; because this minimum volume corresponds to a minimum amplitude, the minimum amplitude at which the user can perceive the audio signal in the preset frequency band of the first audio signal is determined as well.
As described above, the first audio signal is subjected to equalization adjustment or gain processing to at least increase the amplitude of its audio in the preset frequency band. Specifically, that amplitude may be increased to no less than the minimum amplitude, which ensures that the audio in the preset frequency band of the first audio signal can be perceived by the user and thereby improves the sound quality of the first audio signal.
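The calibration described above amounts to two small computations: finding the lowest test level the user can still perceive, and deriving the boost that lifts the current band amplitude to at least that level. The sketch below assumes the user's responses are available as a callable and that amplitudes are linear values; `minimum_perceivable`, `required_boost` and the sample levels are hypothetical names and numbers used only for illustration.

```python
def minimum_perceivable(levels, user_can_hear):
    """Return the smallest tested amplitude the user reports hearing, or None."""
    heard = [level for level in levels if user_can_hear(level)]
    return min(heard) if heard else None


def required_boost(current_amplitude, minimum_amplitude):
    """Linear factor that raises current_amplitude to at least minimum_amplitude.

    Returns 1.0 when the band is already at or above the minimum, so the
    signal is never attenuated by this step.
    """
    if current_amplitude <= 0:
        raise ValueError("current amplitude must be positive")
    return max(1.0, minimum_amplitude / current_amplitude)


# Hypothetical calibration run: the user hears levels of 0.25 and above.
min_amp = minimum_perceivable([0.1, 0.2, 0.3, 0.4], lambda level: level >= 0.25)
boost = required_boost(0.1, min_amp)
```

The resulting factor could then be used as the boost for the preset frequency band in the equalization adjustment, or as the full-band gain in the gain processing.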
S306: applying the first audio signal subjected to the equalization adjustment or the gain processing to the audio signal interaction;
In this embodiment, the equalization adjustment or gain processing improves the sound quality of the first audio signal. When the processed first audio signal is applied to the audio signal interaction, it therefore has a high signal-to-noise ratio, which ensures that it can be clearly recognized, and it also has good sound quality. The flow then ends.
S307: applying the fused audio signal to the audio signal interaction;
In this embodiment, if the signal-to-noise ratio of the fused audio signal is not less than the signal-to-noise ratio threshold, the clarity of the fused audio signal meets the requirement of audio signal interaction, and the fused audio signal is therefore applied to the audio signal interaction. The fused audio signal combines the middle-low frequency part of the first audio signal, which has a higher signal-to-noise ratio, with the high-frequency part of the second audio signal. As a result, the fused audio signal has a signal-to-noise ratio sufficient for audio signal interaction while preserving the integrity of the high-frequency audio, so it also has good sound quality.
In summary, in the sound source control method provided by the present invention, when the signal-to-noise ratio of the fused audio signal is smaller than the signal-to-noise ratio threshold, that is, when it does not meet the requirement of audio signal interaction, the first audio signal acquired by the bone conduction sensor is applied to the audio signal interaction. By selecting a suitable audio signal in this way, the clarity requirement for the audio signal applied to the interaction is met.
Referring to FIG. 4, FIG. 4 is a schematic structural diagram of a speaker device according to an embodiment of the present invention.
In an embodiment, the speaker device 1 comprises a processor 11, a bone conduction sensor 12 and a microphone 13, the bone conduction sensor 12 and the microphone 13 being respectively coupled to the processor 11. The speaker device 1 may be an earphone or the like, such as a headphone, an in-ear earphone, or the like, and is not limited herein.
The processor 11 is configured to: control the bone conduction sensor 12 and the microphone 13 to perform audio signal interaction; acquire a first audio signal with the bone conduction sensor 12 and a second audio signal with the microphone 13; fuse the first audio signal and the second audio signal to obtain a fused audio signal; calculate the signal-to-noise ratio of the fused audio signal and judge whether it is smaller than a signal-to-noise ratio threshold; and, if the signal-to-noise ratio of the fused audio signal is smaller than the signal-to-noise ratio threshold, apply the first audio signal to the audio signal interaction.
The processor 11 can implement the sound source control method described in the above embodiments, which is not described again here.
Referring to FIG. 5, FIG. 5 is a schematic structural diagram of an embodiment of a device with a storage function according to the present invention.
In an embodiment, the device 2 with a storage function stores program data 21, and the program data 21 can be executed to implement the sound source control method described in the above embodiments, which is not described again here.
In the several embodiments provided in the present invention, it should be understood that the disclosed system, apparatus and method may be implemented in other manners. For example, the apparatus embodiments described above are merely illustrative. The division into modules or units is only a division by logical function, and other divisions are possible in actual implementation; for example, multiple units or components may be combined or integrated into another system, or some features may be omitted or not implemented. In addition, the mutual coupling, direct coupling or communication connection shown or discussed may be an indirect coupling or communication connection through some interfaces, devices or units, and may be in an electrical, mechanical or other form.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units; they may be located in one place or distributed over a plurality of network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the embodiment.
In addition, functional units in the embodiments of the present invention may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, and can also be realized in a form of a software functional unit.
The integrated unit, if implemented in the form of a software functional unit and sold or used as a stand-alone product, may be stored in a computer-readable storage medium. Based on such understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a storage device and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) or a processor to execute all or part of the steps of the method according to the embodiments of the present invention. The aforementioned apparatus having a storage function includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, an optical disk, or a server.
The above description is only an embodiment of the present invention and is not intended to limit its scope. All equivalent structures or equivalent process modifications made using the contents of this specification and drawings, whether applied directly or indirectly in other related technical fields, are likewise included in the scope of protection of the present invention.

Claims (10)

1. A sound source control method, the method comprising:
acquiring a first audio signal with a bone conduction sensor and a second audio signal with a microphone;
fusing the first audio signal and the second audio signal to obtain a fused audio signal;
calculating the signal-to-noise ratio of the fused audio signal, and judging whether the signal-to-noise ratio of the fused audio signal is smaller than a signal-to-noise ratio threshold;
if the signal-to-noise ratio of the fused audio signal is smaller than the signal-to-noise ratio threshold, applying the first audio signal to audio signal interaction;
wherein a signal-to-noise ratio of the first audio signal is greater than both the signal-to-noise ratio threshold and a signal-to-noise ratio of the second audio signal; the interactive contents recorded by the first audio signal and the second audio signal are the same; and
the step of fusing the first audio signal and the second audio signal to obtain a fused audio signal comprises:
on the basis of the first audio signal, taking the audio signal in a preset frequency band in the second audio signal as the audio signal corresponding to the preset frequency band in the first audio signal to obtain the fused audio signal.
2. The control method of claim 1, wherein the step of applying the first audio signal to an audio signal interaction comprises:
performing equalization adjustment or gain processing on the first audio signal to at least increase the amplitude of an audio signal in a preset frequency band in the first audio signal;
applying the first audio signal subjected to the equalization adjustment or the gain processing to the audio signal interaction.
3. The control method of claim 2, wherein the steps of acquiring the first audio signal with the bone conduction sensor and acquiring the second audio signal with the microphone are preceded by:
outputting the audio signal in the preset frequency band in the first audio signal to a user;
acquiring the minimum amplitude of the audio signal which is in the preset frequency band and can be perceived by a user;
the step of performing equalization adjustment or gain processing on the first audio signal to at least increase the amplitude of the audio signal in the preset frequency band in the first audio signal includes:
increasing the amplitude of the audio signal in the preset frequency band in the first audio signal to be not less than the minimum amplitude.
4. The control method according to any one of claims 2 to 3, wherein the preset frequency band is defined as an audio frequency band greater than or equal to 1 kHz.
5. The control method of claim 1, wherein the step of judging whether the signal-to-noise ratio of the fused audio signal is smaller than a signal-to-noise ratio threshold further comprises:
if the signal-to-noise ratio of the fused audio signal is not less than the signal-to-noise ratio threshold, applying the fused audio signal to the audio signal interaction.
6. The control method of claim 1, wherein the audio signal interaction comprises at least one of a mobile data call and an internet call.
7. A speaker device comprising a processor, a bone conduction sensor, and a microphone, the bone conduction sensor and the microphone being respectively coupled to the processor, the processor being configured to:
controlling the bone conduction sensor and the microphone to perform audio signal interaction;
acquiring a first audio signal with the bone conduction sensor and a second audio signal with the microphone;
fusing the first audio signal and the second audio signal to obtain a fused audio signal;
calculating the signal-to-noise ratio of the fused audio signal, and judging whether the signal-to-noise ratio of the fused audio signal is smaller than a signal-to-noise ratio threshold;
if the signal-to-noise ratio of the fused audio signal is smaller than the signal-to-noise ratio threshold, applying the first audio signal to audio signal interaction;
wherein a signal-to-noise ratio of the first audio signal is greater than both the signal-to-noise ratio threshold and a signal-to-noise ratio of the second audio signal; the interactive contents recorded by the first audio signal and the second audio signal are the same; and
when the first audio signal and the second audio signal are fused to obtain a fused audio signal, the processor is further configured to:
on the basis of the first audio signal, taking the audio signal in a preset frequency band in the second audio signal as the audio signal corresponding to the preset frequency band in the first audio signal to obtain the fused audio signal.
8. The speaker device of claim 7, wherein the processor, when applying the first audio signal to an audio signal interaction, is further configured to:
performing equalization adjustment or gain processing on the first audio signal to at least increase the amplitude of an audio signal in a preset frequency band in the first audio signal;
applying the first audio signal subjected to the equalization adjustment or the gain processing to the audio signal interaction.
9. The speaker device of claim 7, wherein the processor, upon judging whether the signal-to-noise ratio of the fused audio signal is smaller than a signal-to-noise ratio threshold, is further configured to:
if the signal-to-noise ratio of the fused audio signal is not less than the signal-to-noise ratio threshold, applying the fused audio signal to the audio signal interaction.
10. An apparatus having a storage function, wherein the apparatus having a storage function stores program data executable to implement the sound source control method according to any one of claims 1 to 6.
CN201910501128.9A 2019-06-11 2019-06-11 Sound source control method, loudspeaker device and apparatus Active CN110265056B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910501128.9A CN110265056B (en) 2019-06-11 2019-06-11 Sound source control method, loudspeaker device and apparatus

Publications (2)

Publication Number Publication Date
CN110265056A CN110265056A (en) 2019-09-20
CN110265056B true CN110265056B (en) 2021-09-17

Family

ID=67917517

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910501128.9A Active CN110265056B (en) 2019-06-11 2019-06-11 Sound source control method, loudspeaker device and apparatus

Country Status (1)

Country Link
CN (1) CN110265056B (en)

Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113053371A (en) * 2019-12-27 2021-06-29 阿里巴巴集团控股有限公司 Voice control system and method, voice suite, bone conduction and voice processing device
CN111327985A (en) * 2020-03-06 2020-06-23 华勤通讯技术有限公司 Earphone noise reduction method and device
CN112261229B (en) * 2020-09-11 2022-01-18 歌尔科技有限公司 Bone conduction call equipment testing method, device and system
CN113301468B (en) * 2021-05-12 2024-05-31 广东喜中喜科技有限公司 TWS Bluetooth headset capable of realizing call noise reduction
CN113421580B (en) * 2021-08-23 2021-11-05 深圳市中科蓝讯科技股份有限公司 Noise reduction method, storage medium, chip and electronic device
CN113421583B (en) * 2021-08-23 2021-11-05 深圳市中科蓝讯科技股份有限公司 Noise reduction method, storage medium, chip and electronic device

Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20060293887A1 (en) * 2005-06-28 2006-12-28 Microsoft Corporation Multi-sensory speech enhancement using a speech-state model
CN102770909A (en) * 2010-02-24 2012-11-07 高通股份有限公司 Voice activity detection based on plural voice activity detectors
EP2518724B1 (en) * 2011-04-26 2013-10-02 Parrot Microphone/headphone audio headset comprising a means for suppressing noise in a speech signal, in particular for a hands-free telephone system
CN103813227A (en) * 2012-11-12 2014-05-21 上海耐普微电子有限公司 Method for improving signal-to-noise ratio of microphone and high signal-to-noise ratio microphone
CN104780486A (en) * 2014-01-13 2015-07-15 Dsp集团有限公司 Use of microphones with vsensors for wearable devices
CN105632512A (en) * 2016-01-14 2016-06-01 华南理工大学 Dual-sensor voice enhancement method based on statistics model and device
CN106791122A (en) * 2016-12-27 2017-05-31 广东小天才科技有限公司 Call control method of wearable device and wearable device
US20180367882A1 (en) * 2017-06-16 2018-12-20 Cirrus Logic International Semiconductor Ltd. Earbud speech estimation
CN109195042A (en) * 2018-07-16 2019-01-11 恒玄科技(上海)有限公司 The high-efficient noise-reducing earphone and noise reduction system of low-power consumption
CN109767783A (en) * 2019-02-15 2019-05-17 深圳市汇顶科技股份有限公司 Sound enhancement method, device, equipment and storage medium

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
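Body conducted speech enhancement by equalization and signal fusion; Dekens T et al.; IEEE Transactions on Audio, Speech, and Language Processing; 2013-07-24; Vol. 21 (No. 12); full text *
Construction of a speech enhancement system combining bone conduction and air conduction; Li Minjie (李敏杰); China Master's Theses Full-text Database, Information Science and Technology; 2017-02-15; full text *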

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant