CN111402913A - Noise reduction method, device, equipment and storage medium - Google Patents


Info

Publication number
CN111402913A
Authority
CN
China
Prior art keywords
sound
target
audio signal
noise
audio signals
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202010111706.0A
Other languages
Chinese (zh)
Other versions
CN111402913B (en
Inventor
冯大航
陈孝良
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing SoundAI Technology Co Ltd
Original Assignee
Beijing SoundAI Technology Co Ltd
Application filed by Beijing SoundAI Technology Co Ltd
Priority to CN202010111706.0A
Publication of CN111402913A
Application granted
Publication of CN111402913B

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00 Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02 Speech enhancement, e.g. noise reduction or echo cancellation
    • G10L21/0208 Noise filtering
    • G10L21/0216 Noise filtering characterised by the method used for estimating noise
    • G10L21/0232 Processing in the frequency domain
    • G10L2021/02161 Number of inputs available containing the signal or the noise to be suppressed
    • G10L2021/02166 Microphone arrays; Beamforming

Landscapes

  • Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Quality & Reliability (AREA)
  • Signal Processing (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Circuit For Audible Band Transducer (AREA)

Abstract

The application provides a noise reduction method, apparatus, device and storage medium, and belongs to the field of speech technology. The method comprises: acquiring the audio signals that two sound collection devices obtain by collecting the same sound emitted by a target sound source; filtering the audio signals collected by each of the two sound collection devices separately; and then filtering the resulting audio signals corresponding to the two sound collection devices again. Because the audio signals corresponding to the two sound collection devices undergo a second round of noise reduction, the proportion of noise in the resulting target audio signal is smaller, its signal-to-noise ratio is higher, and the noise reduction effect is better.

Description

Noise reduction method, device, equipment and storage medium
Technical Field
The present application relates to the field of speech technologies, and in particular, to a noise reduction method, apparatus, device, and storage medium.
Background
Wireless earphone products are now very popular with users because of their convenience. A wireless earphone integrates a microphone and a receiver. To make a call, the user wears the wireless earphone connected to a mobile phone and talks through it instead of holding the phone, which is more convenient. However, noise is inevitably present in the environment during a call and degrades the call quality.
In the related art, a wireless earphone usually provides a noise reduction function: it processes the collected voice signal to strengthen the voice and weaken the noise.
Before TWS (True Wireless Stereo) headsets appeared, usually only one wireless earphone was used, so only one group of voice input signals could be acquired, and the noise reduction achieved by processing that group was not good enough. A TWS headset now generally comprises two wireless earphones, each of which can collect a group of voice input signals. However, the related art offers no noise reduction method that makes use of both groups of voice input signals, so a noise reduction method suitable for TWS headsets is urgently needed.
Disclosure of Invention
In view of this, embodiments of the present application provide a noise reduction method, apparatus, device and storage medium, which can reduce noise of audio signals of two sound collection devices, and can improve noise reduction effect.
In one aspect, a method for noise reduction is provided, the method comprising:
acquiring the audio signals obtained by two sound collection devices collecting the same sound emitted by a target sound source;
filtering the audio signals collected by the two sound collection devices separately to obtain the audio signal corresponding to each sound collection device;
and treating the two sound collection devices as two microphones in the same microphone array, and filtering the audio signals corresponding to the two sound collection devices to obtain the noise-reduced target audio signal of the same sound emitted by the target sound source.
Optionally, the filtering the audio signals corresponding to the two sound collection devices to obtain the target audio signal emitted by the target sound source after the noise of the same sound is reduced includes:
and carrying out weighted summation on the audio signals corresponding to the two sound acquisition devices to obtain the target audio signal which is emitted by the target sound source and subjected to noise reduction on the same sound.
Optionally, regarding the two sound collection devices as two microphones in the same microphone array, filtering the audio signals corresponding to the two sound collection devices to obtain the target audio signal after the noise reduction of the same sound emitted by the target sound source, including:
carrying out weighted summation on the audio signals corresponding to the two sound acquisition devices to obtain candidate target audio signals which are emitted by the target sound source and subjected to noise reduction on the same sound;
and filtering audio signals except the target direction in the candidate target audio signals to obtain the target audio signal which is emitted by the target sound source and subjected to noise reduction.
Optionally, filtering out the audio signals other than those from the target direction in the candidate target audio signal to obtain the noise-reduced target audio signal of the same sound emitted by the target sound source includes:
filtering out, from the audio signal corresponding to each sound collection device, the audio signal from the target direction to obtain the noise signal corresponding to each sound collection device, wherein the target direction is the direction from the target sound source to the midpoint between the two sound collection devices;
performing weighted summation on the noise signals corresponding to the two sound collection devices to obtain a noise signal;
and removing the noise signal from the candidate target audio signal to obtain the noise-reduced target audio signal of the same sound emitted by the target sound source.
Optionally, performing weighted summation on the noise signals corresponding to the two sound collection devices to obtain a noise signal, and removing the noise signal from the candidate target audio signal to obtain the noise-reduced target audio signal of the same sound emitted by the target sound source, includes:
performing weighted summation on the noise signals corresponding to the two sound collection devices according to the weight corresponding to each sound collection device to obtain a noise signal;
removing the noise signal from the candidate target audio signal to obtain the noise-reduced target audio signal of the same sound emitted by the target sound source;
and adjusting the weights according to the correlation between the target audio signal and an expected audio signal obtained from the candidate target audio signal, and repeating the steps of obtaining and removing the noise signal with the adjusted weights until a target condition is met, to obtain the noise-reduced target audio signal of the same sound emitted by the target sound source.
Optionally, the filtering the audio signals collected by the two sound collection devices respectively to obtain the audio signal corresponding to each sound collection device includes:
according to the time delay of collecting the sound by different collecting units in each sound collecting device, carrying out time delay compensation on the audio signal collected by each collecting unit in each sound collecting device to obtain a candidate audio signal of each collecting unit;
and filtering the candidate audio signals of the acquisition units in each sound acquisition device to obtain the audio signal corresponding to each sound acquisition device.
Optionally, the method further includes:
detecting the states of the two sound collection devices;
when both sound collection devices are in a working state, executing the steps of acquiring the audio signals of the two sound collection devices, filtering them separately, and filtering them as the same microphone array;
when either sound collection device is not in a working state, acquiring the audio signal that the sound collection device in the working state collected from the sound emitted by the target sound source, and filtering that audio signal to obtain the noise-reduced target audio signal of the sound emitted by the target sound source.
In one aspect, a noise reduction apparatus is provided, the apparatus comprising:
the acquisition module is used for acquiring audio signals acquired by the two sound acquisition devices for acquiring the same sound emitted by the target sound source;
the first filtering module is used for respectively filtering the audio signals acquired by the two sound acquisition devices to obtain an audio signal corresponding to each sound acquisition device;
and the second filtering module is used for filtering the audio signals corresponding to the two sound acquisition devices to obtain the target audio signal which is emitted by the target sound source and subjected to the noise reduction of the same sound.
Optionally, the second filtering module includes:
and the second weighting module is used for weighting and summing the audio signals corresponding to the two sound acquisition devices to obtain the target audio signal which is emitted by the target sound source and subjected to the noise reduction of the same sound.
Optionally, the second filtering module includes:
the second weighting module is used for carrying out weighted summation on the audio signals corresponding to the two sound acquisition devices to obtain candidate target audio signals which are emitted by the target sound source and subjected to noise reduction on the same sound;
and the second filtering module is used for filtering audio signals except for the target direction in the candidate target audio signals to obtain the target audio signal which is emitted by the target sound source and subjected to noise reduction by the same sound.
Optionally, the second filtering module includes:
the second filtering submodule is used for respectively filtering the audio signals in the target direction in the audio signals corresponding to each sound acquisition device to obtain the noise signals corresponding to each sound acquisition device, and the target direction is the direction in which the target sound source points to the middle position of the two sound acquisition devices;
the second weighting submodule is used for weighting and summing the noise signals corresponding to the two sound acquisition devices to obtain a noise signal;
and the second removing module is used for removing the noise signal in the candidate target audio signal to obtain the target audio signal which is emitted by the target sound source and subjected to noise reduction by the same sound.
Optionally, the second weighting submodule is specifically configured to perform weighted summation on the noise signals corresponding to the two sound collection devices according to the weight corresponding to each sound collection device, so as to obtain a noise signal;
the second filtering module further comprises a second adjusting module for adjusting the weight according to the correlation between the target audio signal and an expected audio signal obtained based on the candidate target audio signal;
and the second filtering module is further configured to continue to execute the steps of acquiring and removing the noise signal based on the adjusted weight until a target condition is met, and obtain a target audio signal, which is emitted by the target sound source and subjected to noise reduction, of the same sound.
Optionally, the first filtering module includes:
the time delay module is used for carrying out time delay compensation on the audio signal acquired by each acquisition unit in each sound acquisition device according to the time delay of the sound acquired by different acquisition units in each sound acquisition device to obtain a candidate audio signal of each acquisition unit;
and the first filtering submodule is used for filtering the candidate audio signals of the acquisition units in each sound acquisition device to obtain the audio signals corresponding to each sound acquisition device.
Optionally, the apparatus further comprises:
the detection module is used for detecting the states of the two sound acquisition devices;
the apparatus is further configured to:
when the two sound collection devices are both in a working state, executing the steps of obtaining audio signals corresponding to the two sound collection devices, respectively filtering and using the audio signals as a same microphone array;
when any sound collection equipment is not in a working state, acquiring an audio signal collected by the sound collection equipment in the working state, which is sent by the target sound source, and filtering the audio signal to obtain the target audio signal sent by the target sound source after the same sound is denoised.
In one aspect, a computer device is provided that includes one or more processors and one or more memories having at least one instruction stored therein, the instruction being loaded and executed by the one or more processors to implement operations performed by the noise reduction method.
In one aspect, a computer-readable storage medium having at least one instruction stored therein is provided, which is loaded and executed by a processor to implement operations performed by the noise reduction method.
The beneficial effects brought by the technical scheme provided by the embodiment of the application at least can comprise:
In the embodiments of the present application, the audio signals that two sound collection devices obtain by collecting the same sound emitted by a target sound source are acquired, the audio signals collected by the two devices are filtered separately, and the resulting audio signals corresponding to the two devices are then filtered again. The audio signals corresponding to the two sound collection devices thus undergo a second round of noise reduction, so that in the resulting target audio signal the proportion of noise is smaller, the signal-to-noise ratio is higher, and the noise reduction effect is better.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the application.
Drawings
To illustrate the technical solutions in the embodiments of the present application more clearly, the drawings used in the description of the embodiments are briefly introduced below. The drawings described below show only some embodiments of the present application; those skilled in the art can obtain other drawings from them without creative effort.
Fig. 1 is a schematic diagram of an implementation environment of a noise reduction method provided in an embodiment of the present application;
Fig. 2 is a flowchart of a noise reduction method provided in an embodiment of the present application;
Fig. 3 is a flowchart of another noise reduction method provided in an embodiment of the present application;
Fig. 4 is a block diagram of a noise reduction method provided in an embodiment of the present application;
Fig. 5 is a block diagram of another noise reduction method provided in an embodiment of the present application;
Fig. 6 is a block diagram of another noise reduction method provided in an embodiment of the present application;
Fig. 7 is a block diagram of another noise reduction method provided in an embodiment of the present application;
Fig. 8 is a schematic structural diagram of a noise reduction apparatus provided in an embodiment of the present application;
Fig. 9 is a schematic structural diagram of a terminal provided in an embodiment of the present application.
Detailed Description
To make the objects, technical solutions and advantages of the present application more clear, embodiments of the present application will be described in further detail below with reference to the accompanying drawings.
It should be noted that the embodiments described below are some embodiments of the present application, and not all embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
In this application, the terms "first," "second," and the like are used to distinguish between identical or similar items that have substantially the same function. It should be understood that "first," "second," and "nth" imply no logical or temporal dependency and place no limit on number or order of execution.
Fig. 1 is a schematic diagram of an implementation environment of a noise reduction method provided by an embodiment of the present application, and referring to fig. 1, the implementation environment may include two sound collection devices 101 and a target sound source 102.
The target sound source 102 may be a human mouth or a speaker, and the sound collection device 101 may be a wireless earphone or a terminal device capable of collecting sound. The target sound source 102 emits sound, and the two sound collection devices 101 collect the same sound emitted by the target sound source 102, so as to obtain an audio signal emitted by the target sound source 102.
In this embodiment, the two sound collection devices 101 may collect sounds emitted by a target sound source, and perform noise reduction processing on the collected audio signals to obtain target audio signals.
Fig. 2 is a flowchart of a noise reduction method provided in an embodiment of the present application, and referring to fig. 2, the method includes:
201. Acquire the audio signals obtained by two sound collection devices collecting the same sound emitted by a target sound source.
202. Filter the audio signals collected by the two sound collection devices separately to obtain the audio signal corresponding to each sound collection device.
203. Treat the two sound collection devices as two microphones in the same microphone array, and filter the audio signals corresponding to the two sound collection devices to obtain the noise-reduced target audio signal of the same sound emitted by the target sound source.
In a possible implementation manner, the filtering the audio signals corresponding to the two sound collection devices to obtain a target audio signal, which is emitted by the target sound source and subjected to noise reduction, includes:
and carrying out weighted summation on the audio signals corresponding to the two sound acquisition devices to obtain the target audio signal which is emitted by the target sound source and subjected to noise reduction on the same sound.
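The weighted summation described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the function name `weighted_sum`, the equal default weights, and the toy signals are all assumptions introduced here, since the text does not specify how the weights are chosen.

```python
from typing import List, Sequence


def weighted_sum(left: Sequence[float], right: Sequence[float],
                 w_left: float = 0.5, w_right: float = 0.5) -> List[float]:
    """Combine the two per-device signals sample by sample."""
    if len(left) != len(right):
        raise ValueError("both signals must have the same length")
    return [w_left * a + w_right * b for a, b in zip(left, right)]


# Toy data (assumed): the same speech reaches both devices,
# while the noise has opposite sign at the two devices.
speech = [1.0, -1.0, 0.5, -0.5]
noise = [0.25, -0.25, 0.25, -0.25]
left = [s + n for s, n in zip(speech, noise)]
right = [s - n for s, n in zip(speech, noise)]

# With equal weights the in-phase speech is kept and the noise cancels.
target = weighted_sum(left, right)
```

In this contrived case the noise cancels completely; with real recordings the summation would only attenuate components that are not in phase across the two devices.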
In a possible implementation manner, the taking the two sound collection devices as two microphones in the same microphone array, and filtering audio signals corresponding to the two sound collection devices to obtain a target audio signal emitted by the target sound source after the same sound is denoised, includes:
carrying out weighted summation on the audio signals corresponding to the two sound acquisition devices to obtain a candidate target audio signal which is emitted by the target sound source and subjected to noise reduction on the same sound;
and filtering audio signals except the target direction in the candidate target audio signals to obtain the target audio signal which is emitted by the target sound source and subjected to noise reduction by the same sound.
In a possible implementation manner, the filtering out audio signals other than the target direction from the candidate target audio signals to obtain a noise-reduced target audio signal of the same sound emitted by the target sound source includes:
respectively filtering audio signals in a target direction in the audio signals corresponding to each sound collection device to obtain noise signals corresponding to each sound collection device, wherein the target direction is a direction in which the target sound source points to the middle position of the two sound collection devices;
carrying out weighted summation on the noise signals corresponding to the two sound acquisition devices to obtain a noise signal;
and removing the noise signal in the candidate target audio signal to obtain the target audio signal which is emitted by the target sound source and subjected to noise reduction by the same sound.
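One way to picture these three steps is the sketch below. It assumes that sound from the target direction reaches both devices in phase (the source is equidistant from them), so that subtracting the two-device average from each device signal removes the target and leaves a per-device noise reference. That blocking rule, the function names `block_target` and `remove_noise`, and the fixed weights are illustrative assumptions; the text does not state how the target-direction signal is filtered out.

```python
from typing import List, Sequence, Tuple


def block_target(left: Sequence[float],
                 right: Sequence[float]) -> Tuple[List[float], List[float]]:
    """Remove the in-phase (target direction) part from each device signal."""
    mean = [(a + b) / 2 for a, b in zip(left, right)]
    noise_left = [a - m for a, m in zip(left, mean)]
    noise_right = [b - m for b, m in zip(right, mean)]
    return noise_left, noise_right


def remove_noise(candidate: Sequence[float], noise_left: Sequence[float],
                 noise_right: Sequence[float],
                 w_left: float, w_right: float) -> List[float]:
    """Subtract the weighted sum of the noise references from the candidate."""
    noise = [w_left * nl + w_right * nr
             for nl, nr in zip(noise_left, noise_right)]
    return [c - n for c, n in zip(candidate, noise)]


# Toy data (assumed): interference reaches only the left device.
speech = [1.0, -1.0, 0.5, -0.5]
interference = [0.5, 0.5, -0.5, -0.5]
left = [s + i for s, i in zip(speech, interference)]
right = list(speech)

candidate = [(a + b) / 2 for a, b in zip(left, right)]  # weighted sum, equal weights
noise_left, noise_right = block_target(left, right)
target = remove_noise(candidate, noise_left, noise_right, w_left=1.0, w_right=0.0)
```

The weights here are hand-picked so that the residual interference in the candidate signal is removed exactly; in practice they would have to be estimated.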
In a possible implementation manner, the noise signals corresponding to the two sound collection devices are subjected to weighted summation to obtain a noise signal; removing the noise signal in the candidate target audio signal to obtain a noise-reduced target audio signal of the same sound emitted by the target sound source, including:
according to the weight corresponding to each sound acquisition device, carrying out weighted summation on the noise signals corresponding to the two sound acquisition devices to obtain a noise signal;
removing the noise signal in the candidate target audio signal to obtain a target audio signal which is emitted by the target sound source and subjected to noise reduction;
and adjusting the weights according to the correlation between the target audio signal and an expected audio signal obtained from the candidate target audio signal, and repeating the steps of obtaining and removing the noise signal with the adjusted weights until the target audio signal meets the target condition, to obtain the noise-reduced target audio signal of the same sound emitted by the target sound source.
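The iterate-until-a-condition-holds loop can be illustrated with a least-mean-squares style update. This is a swapped-in technique, named plainly as an assumption: the text only says the weights are adjusted from a correlation with an expected audio signal and gives no formula. Here the weights are nudged by the correlation between the current output and each noise reference, and the "target condition" is taken to be a weight change below a tolerance. The names `adapt_weights`, `step`, `tol`, and `max_iter` are hypothetical.

```python
from typing import List, Sequence, Tuple


def subtract_noise(candidate: Sequence[float],
                   noise_refs: Sequence[Sequence[float]],
                   weights: Sequence[float]) -> List[float]:
    """Return the candidate minus the weighted sum of the noise references."""
    return [c - sum(w * ref[t] for w, ref in zip(weights, noise_refs))
            for t, c in enumerate(candidate)]


def adapt_weights(candidate: Sequence[float],
                  noise_refs: Sequence[Sequence[float]],
                  step: float = 0.5, tol: float = 1e-9,
                  max_iter: int = 200) -> Tuple[List[float], List[float]]:
    """Adjust one weight per noise reference until the update is negligible."""
    weights = [0.0] * len(noise_refs)
    n = len(candidate)
    for _ in range(max_iter):
        output = subtract_noise(candidate, noise_refs, weights)
        # Correlation of the current output with each noise reference.
        corr = [sum(o * r for o, r in zip(output, ref)) / n
                for ref in noise_refs]
        weights = [w + step * c for w, c in zip(weights, corr)]
        if max(abs(c) for c in corr) * step < tol:  # assumed target condition
            break
    return weights, subtract_noise(candidate, noise_refs, weights)


# Toy data (assumed): speech and the two noise references are mutually orthogonal.
speech = [1.0, 1.0, -1.0, -1.0]
ref_a = [1.0, -1.0, 1.0, -1.0]
ref_b = [1.0, -1.0, -1.0, 1.0]
candidate = [s + 0.5 * a + 0.25 * b for s, a, b in zip(speech, ref_a, ref_b)]

weights, target = adapt_weights(candidate, [ref_a, ref_b])
```

With these inputs the weights approach 0.5 and 0.25, the amounts of each reference mixed into the candidate. The step size of 0.5 is stable only because the references have unit mean power; other data would need a different value.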
In a possible implementation manner, the filtering the audio signals collected by the two sound collection devices respectively to obtain the audio signal corresponding to each sound collection device includes:
according to the time delay of collecting the sound by different collecting units in each sound collecting device, carrying out time delay compensation on the audio signal collected by each collecting unit in each sound collecting device to obtain a candidate audio signal of each collecting unit;
and filtering the candidate audio signals of the acquisition units in each sound acquisition device to obtain the audio signal corresponding to each sound acquisition device.
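Delay compensation followed by filtering within one device can be sketched as a delay-and-sum step. The sketch assumes integer sample delays relative to a reference unit and uses plain averaging as the filter; both are simplifying assumptions, and `compensate_delay` and `delay_and_sum` are hypothetical names.

```python
from typing import List, Sequence


def compensate_delay(signal: Sequence[float], delay: int) -> List[float]:
    """Advance a signal by `delay` samples, zero-padding the end."""
    if delay < 0:
        raise ValueError("delay must be non-negative")
    return list(signal[delay:]) + [0.0] * delay


def delay_and_sum(unit_signals: Sequence[Sequence[float]],
                  delays: Sequence[int]) -> List[float]:
    """Align every collection unit to the reference, then average them."""
    aligned = [compensate_delay(s, d) for s, d in zip(unit_signals, delays)]
    count = len(aligned)
    return [sum(samples) / count for samples in zip(*aligned)]


# Toy data (assumed): mic1 hears the same pulse two samples after mic2.
mic2 = [0.0, 1.0, 2.0, 1.0, 0.0, 0.0]  # reference unit, delay 0
mic1 = [0.0, 0.0, 0.0, 1.0, 2.0, 1.0]  # delayed by 2 samples

device_signal = delay_and_sum([mic2, mic1], delays=[0, 2])
```

After alignment the pulse adds coherently, whereas sound arriving with a different delay would be smeared and attenuated by the averaging.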
In one possible embodiment, the method further comprises:
detecting the states of the two sound collection devices;
when both sound collection devices are in a working state, the steps of acquiring the audio signals of the two sound collection devices, filtering them separately, and filtering them as the same microphone array are executed;
when either sound collection device is not in a working state, the audio signal that the sound collection device in the working state collected from the sound emitted by the target sound source is acquired, and that audio signal is filtered to obtain the noise-reduced target audio signal of the sound emitted by the target sound source.
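The state check amounts to a dispatch between a two-device path and a single-device fallback. The sketch below assumes that a device that is not working is represented by `None` and that the per-device and array filters are passed in as callables; these conventions and the name `reduce_noise` are illustrative, not taken from the text.

```python
from typing import Callable, List, Optional, Sequence

Signal = Sequence[float]


def reduce_noise(left: Optional[Signal], right: Optional[Signal],
                 single_filter: Callable[[Signal], List[float]],
                 array_filter: Callable[[Signal, Signal], List[float]]) -> List[float]:
    """Use both devices when available, otherwise fall back to one device."""
    if left is not None and right is not None:
        # Both working: filter each device, then treat them as one array.
        return array_filter(single_filter(left), single_filter(right))
    working = left if left is not None else right
    if working is None:
        raise RuntimeError("no sound collection device is in a working state")
    # Only one device working: per-device filtering only.
    return single_filter(working)


# Placeholder filters (assumed) so the dispatch can be exercised.
def passthrough(signal: Signal) -> List[float]:
    return list(signal)


def average(a: Signal, b: Signal) -> List[float]:
    return [(x + y) / 2 for x, y in zip(a, b)]


both = reduce_noise([1.0, 3.0], [3.0, 1.0], passthrough, average)
only_right = reduce_noise(None, [3.0, 1.0], passthrough, average)
```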
Fig. 3 is a flowchart of a noise reduction method according to an embodiment of the present application. Referring to fig. 3, the method includes:
301. The computer device acquires the audio signals obtained by two sound collection devices collecting the same sound emitted by a target sound source.
The embodiments of the present application provide a method in which two sound collection devices collect sound and the collected sound is denoised. Each sound collection device may include at least two collection units for collecting the audio signal of the target sound source. For example, the sound collection device may be an earphone and the collection unit may be a microphone. In a specific scenario, two microphones may be installed on each earphone, and sound may be collected by the four microphones of the two earphones.
In each sound collection device, the collection units may be arranged linearly or in a ring, which is not limited in the present application; the following description takes a linear arrangement as an example. For example, as shown in Fig. 1, the two sound collection devices may be two earphones, each including two microphones: one earphone includes mic1 and mic2, and the other includes mic3 and mic4. The target sound source may be a human mouth. When using the earphones for a call, the user wears one earphone in each ear, and all four microphones on the two earphones can collect the sound emitted by the mouth.
The audio signals collected by each sound collection device may form an audio signal set, which includes the audio signal collected by each collection unit in that device. Because the sound emitted by the target sound source is collected by two sound collection devices, more audio signals are obtained, which provides enough data for the subsequent noise reduction.
In one possible implementation, the computer device may be a device to which the two sound collection devices are connected. For example, the two sound collection devices collect sound, obtain audio signals, and transmit them to the connected computer device, which processes them. The computer device may be, for example, the mobile phone the user is calling on: the earphones acquire the audio signals and send them to the mobile phone, which performs noise reduction and then transmits the processed audio signal.
In another possible implementation, the computer device may be the two sound collection devices themselves: after collecting the audio signals, they perform noise reduction and then send or play the result.
302. The computer device filters the audio signals collected by the two sound collection devices separately to obtain the audio signal corresponding to each sound collection device.
The audio signals collected by each sound collection device include the sound signal of the target sound source and noise signals from the environment. Filtering the audio signals collected by the two sound collection devices means filtering out the noise signals in them, which yields the noise-reduced audio signal corresponding to each sound collection device. For example, take the sound collection device to be an earphone and the collection unit to be a microphone, with the earphone including mic1 and mic2 as the first earphone and the earphone including mic3 and mic4 as the second earphone. The audio signals collected by mic1 and mic2 are filtered to remove the noise signals in them, which yields the noise-reduced audio signal corresponding to the first earphone. Likewise, the audio signals collected by mic3 and mic4 are filtered to remove the noise signals in them, which yields the noise-reduced audio signal corresponding to the second earphone.
In a possible embodiment, the filtering may use the time delays with which the collection units of a sound collection device pick up the sound to remove the noise signal. Specifically, the computer device may perform delay compensation on the audio signal collected by each collection unit in each sound collection device, according to the time delays with which the different collection units in that device collect the sound, to obtain a candidate audio signal for each collection unit. It then filters the candidate audio signals of the collection units in each sound collection device to obtain the audio signal corresponding to that device.
For each acquisition unit, the time delay may be a time delay of the audio signal acquired by the acquisition unit with respect to the audio signal acquired by the reference array element. Specifically, the computer device may analyze the audio signal collected by each sound collection device, and obtain a time delay of the audio signal of the target sound source collected by each collection unit in each sound collection device relative to the reference array element.
In each sound collection device, the distances between the acquisition units and the target sound source may differ, so the acquisition units may collect the audio signal of the target sound source at different times, and the phases of the collected audio signals may also differ. The acquisition unit closest to the target sound source collects the audio signal of the target sound source first, and the acquisition unit farthest from the target sound source collects it last. For example, as shown in fig. 1, when a user talks while wearing two earphones, mic2 and mic4 collect the sound emitted from the user's mouth first, and mic1 and mic3 collect it after a certain time delay. In the first earphone, mic2, which is closer to the mouth, may be used as the reference array element; in the second earphone, mic4, which is closer to the mouth, may be used as the reference array element. By analyzing the audio signals collected by mic1 and mic2, the time delay of mic1 relative to mic2 in collecting the sound emitted by the mouth can be obtained. By analyzing the audio signals collected by mic3 and mic4, the time delay of mic3 relative to mic4 in collecting the sound emitted by the mouth can be obtained.
Specifically, the computer device may obtain the cross-correlation coefficients between the audio signals collected by the acquisition units and, from these coefficients, determine the time delay with which each acquisition unit collects the sound relative to the reference array element. Other methods may also be used to obtain the time delay; the embodiment of the present application does not limit the specific manner. Analyzing the audio signals collected by each acquisition unit in this way yields the time delay of the target sound source's audio signal at each acquisition unit relative to the reference array element, which provides the basis for the subsequent time delay compensation.
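The cross-correlation approach described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of the application: the function name `estimate_delay`, the `max_lag` search window, the use of an unnormalized cross-correlation, and the sample values are all hypothetical, and the delay is expressed in whole samples.

```python
def estimate_delay(reference, signal, max_lag):
    """Return the lag (in samples) at which `signal` best matches `reference`.

    A positive result means `signal` arrives later than `reference`.
    """
    best_lag, best_score = 0, float("-inf")
    n = len(reference)
    for lag in range(-max_lag, max_lag + 1):
        # Correlate reference[t] with signal[t + lag] over the overlapping samples.
        score = 0.0
        for t in range(n):
            u = t + lag
            if 0 <= u < len(signal):
                score += reference[t] * signal[u]
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


# Hypothetical example: mic2 is the reference array element and
# mic1 receives the same pulse 3 samples later.
mic2 = [0.0] * 20
mic2[5], mic2[6], mic2[7] = 1.0, 2.0, 1.0
mic1 = [0.0] * 3 + mic2[:-3]
delay_mic1 = estimate_delay(mic2, mic1, max_lag=8)
```

The lag with the largest correlation is taken as the delay of the acquisition unit relative to the reference array element; a real system would typically work on windowed frames and may need sub-sample resolution.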
After the computer equipment obtains the time delay, the time delay compensation can be respectively carried out on the audio signals collected by each collecting unit according to the time delay, and candidate audio signals of each collecting unit are obtained.
For each sound collection device, if the time of collecting the audio signal of the target sound source by each collection unit is different, the phases of the audio signals of the target sound source collected by each collection unit in each sound collection device may be different at the same time, and under such a situation, it is difficult to analyze the audio signals collected by a plurality of collection units in a comprehensive manner.
The time delay compensation process for multiple acquisition units may be as shown in FIG. 4. Let $x_{i,m}(t)$ denote the audio signal collected by the m-th acquisition unit in the i-th sound collection device; the candidate audio signal of the m-th acquisition unit can then be expressed as $x'_{i,m}(t) = x_{i,m}(t + \tau_{i,m})$, where $\tau_{i,m}$ is the time delay corresponding to the m-th acquisition unit in the i-th sound collection device, and i and m are positive integers. After obtaining the candidate audio signal of each acquisition unit, the computer device may perform weighted summation on the candidate audio signals of the acquisition units in each sound collection device to obtain the audio signal corresponding to that sound collection device.
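The compensation step x'(t) = x(t + τ) can be illustrated with a short sketch. It assumes integer sample delays and zero-padding at the frame edge; the helper name `compensate_delay` and the sample values are hypothetical.

```python
def compensate_delay(signal, delay):
    """Return x'(t) = x(t + delay), zero-padding samples that fall outside the frame."""
    n = len(signal)
    return [signal[t + delay] if 0 <= t + delay < n else 0.0 for t in range(n)]


# Hypothetical frame: mic2 is the reference (delay 0); mic1 hears the pulse 2 samples later.
mic2 = [0.0, 0.0, 1.0, 2.0, 1.0, 0.0, 0.0, 0.0]
mic1 = [0.0, 0.0, 0.0, 0.0, 1.0, 2.0, 1.0, 0.0]
aligned_mic1 = compensate_delay(mic1, 2)
```

After compensation the target component of mic1 lines up sample for sample with the reference, which is what makes the subsequent weighted summation coherent.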
Through the weighted summation, the audio signal from the direction of the target sound source is strengthened, which realizes beamforming toward the target sound source and weakens the noise signal relative to the sound signal. Time delay compensation makes the phases of the target sound source's audio signal the same across the acquisition units at any given moment, so in the audio signals of the acquisition units the target sound source's signal is coherent while the noise signals may be incoherent. For example, after the computer device obtains the candidate audio signals of mic1 and mic2 of the first earphone, it may perform weighted summation on them; in the resulting audio signal corresponding to the first earphone, the audio signal from the direction of the mouth is enhanced while noise signals from other directions are not, which achieves noise reduction of the audio signals collected by mic1 and mic2.
In one possible implementation, the computer device may perform weighted summation on the candidate audio signals of the plurality of acquisition units of each sound acquisition device by the following formula:
$$s_i(t) = W_a^T X'_i(t)$$
where $W_a = [w_{a1}, w_{a2}, \ldots, w_{am}]^T$ is the weight vector corresponding to the acquisition units, $w_{am}$ is the weight of the m-th acquisition unit, $X'_i(t) = [x'_{i,1}(t), x'_{i,2}(t), \ldots, x'_{i,m}(t)]^T$ is the vector of candidate audio signals of the m acquisition units in the i-th sound collection device after time delay compensation, and $s_i(t)$ is the audio signal corresponding to the i-th sound collection device.
It should be noted that, when the candidate audio signals of the acquisition units are weighted and summed, the weight of each acquisition unit may be a fixed weight preset by a relevant technician according to actual needs, experience, or experimental results. For example, if each sound collection device includes two acquisition units, the weight of each acquisition unit may be 0.5. The weight may also be adjusted according to the filtering result, for example according to the quality of the obtained audio signal corresponding to each sound collection device.
In a possible implementation, as shown in fig. 5, the computer device may perform weighted summation on the candidate audio signals of the acquisition units in each sound collection device according to the first weight corresponding to each acquisition unit, obtaining the candidate audio signal corresponding to each sound collection device. The computer device then filters out, from the candidate audio signal of each acquisition unit, the audio signal coming from the direction of the target sound source, obtaining the noise signal corresponding to each acquisition unit, and performs weighted summation on the noise signals of the acquisition units in each sound collection device according to the second weight corresponding to each acquisition unit, obtaining the noise signal corresponding to each sound collection device. Subtracting this noise signal from the candidate audio signal corresponding to each sound collection device gives the audio signal corresponding to that sound collection device.
The candidate audio signal corresponding to each sound collection device is obtained by enhancing the audio signal from the direction of the target sound source, which is equivalent to attenuating noise signals from other directions, but it still contains both the target sound source's audio signal and noise. The noise signal corresponding to each sound collection device can therefore be subtracted from the candidate audio signal to filter out the remaining noise and achieve further noise reduction. That noise signal is obtained by filtering out the audio signal from the direction of the target sound source from the audio signal of each acquisition unit, which gives the noise signal corresponding to each acquisition unit, and then performing weighted summation on the noise signals of the acquisition units in each sound collection device.
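A fixed-weight summation of delay-compensated channels might look like the sketch below. The function name `weighted_sum` and the data are hypothetical, and the example is deliberately contrived: the noise has opposite sign on the two channels so that the cancellation is exact, which will not hold for real noise.

```python
def weighted_sum(channels, weights):
    """Return s(t) = sum over m of w_m * x'_m(t) for time-aligned channels."""
    if len(channels) != len(weights):
        raise ValueError("one weight per channel is required")
    length = len(channels[0])
    return [sum(w * ch[t] for w, ch in zip(weights, channels)) for t in range(length)]


# Contrived example: the target is identical (coherent) on both aligned channels,
# while the noise has opposite sign on the two channels.
target = [0.0, 1.0, 0.0, -1.0]
noise = [0.25, -0.25, 0.25, -0.25]
mic1_aligned = [s + n for s, n in zip(target, noise)]
mic2_aligned = [s - n for s, n in zip(target, noise)]
device_signal = weighted_sum([mic1_aligned, mic2_aligned], [0.5, 0.5])
```

With weights of 0.5 the coherent target keeps its amplitude while the noise terms partly or, in this contrived case, fully cancel.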
For example, after the computer device performs weighted summation on candidate audio signals of the mic1 and the mic2 in the first headphone to obtain candidate audio signals corresponding to the first headphone, audio signals from the direction of the human mouth in the audio signals collected by the mic1 and the mic2 may be filtered respectively to obtain noise signals corresponding to the mic1 and the mic2, and then the noise signals of the mic1 and the mic2 are subjected to weighted summation to obtain noise signals corresponding to the first headphone. And subtracting the noise signal from the candidate audio signal corresponding to the first earphone, so as to filter the noise signal from the candidate audio signal corresponding to the first earphone, thereby obtaining the audio signal corresponding to the first earphone.
The computer equipment obtains the noise signal corresponding to each sound acquisition equipment by filtering the audio signal from the direction of the target sound source, and makes a difference between the candidate audio signal corresponding to each sound acquisition equipment and the noise signal, so that the filtering of the noise signal in the candidate audio signal of each sound acquisition equipment is realized, and a better noise reduction effect is achieved.
In a possible implementation manner, as shown in fig. 5, the computer device may implement a filtering process on the audio signal from the direction of the target sound source in the audio signal of each acquisition unit by using a blocking matrix, so as to obtain an audio signal from a direction other than the direction of the target sound source in the audio signal of each acquisition unit, that is, a noise signal corresponding to each acquisition unit.
For example, the computer device may obtain the audio signal corresponding to each sound collection device through the following formula.
$$s'_i(t) = W_a^T X'_i(t)$$
$$U_i(t) = B X'_i(t)$$
$$z_i(t) = W_b^T U_i(t)$$
$$s_i(t) = s'_i(t) - z_i(t)$$
where $s'_i(t)$ is the candidate audio signal corresponding to the i-th sound collection device, $W_a = [w_{a1}, w_{a2}, \ldots, w_{am}]^T$ is the first weight vector corresponding to the acquisition units, $X'_i(t) = [x'_{i,1}(t), x'_{i,2}(t), \ldots, x'_{i,m}(t)]^T$ is the vector of candidate audio signals of the m acquisition units in the i-th sound collection device after time delay compensation, $U_i(t)$ is the array signal obtained by processing that vector with the blocking matrix $B$, $W_b = [w_{b1}, w_{b2}, \ldots, w_{bm}]^T$ is the second weight vector corresponding to the acquisition units, $w_{bm}$ is the adaptive weight of the m-th acquisition unit, $z_i(t)$ is the noise signal corresponding to the i-th sound collection device, and i and m are positive integers. In order to filter out the audio signals from the direction of the target sound source from the audio signals of each acquisition unit, the blocking matrix $B$ may be
Figure BDA0002390252100000131
It should be noted that, when the candidate audio signals corresponding to the acquisition units are weighted and summed, the first weight and the second weight may be fixed weights preset by a relevant technician according to actual needs, experience, or experimental results; for example, the first weight and the second weight of each acquisition unit may be 0.5. The first weight and the second weight may also be adjusted according to the filtering result, for example according to the quality of the obtained audio signal corresponding to the sound collection device.
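The candidate / blocking / subtraction structure described above can be sketched for a single time sample as follows. The blocking matrix in the application is given only as a figure that is not reproduced here, so the pairwise-difference matrix `B = [[1, -1]]` below is an assumption chosen for illustration, as are the function names and values. With this one-row matrix the second weight vector has a single entry, which is also an assumption of the sketch.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def gsc_output(x_aligned, w_a, blocking, w_b):
    """Return (s, s_candidate, u) for one sample of delay-compensated inputs."""
    s_candidate = dot(w_a, x_aligned)                # s'(t) = W_a^T X'(t)
    u = [dot(row, x_aligned) for row in blocking]    # U(t)  = B X'(t)
    z = dot(w_b, u)                                  # z(t)  = W_b^T U(t)
    return s_candidate - z, s_candidate, u           # s(t)  = s'(t) - z(t)


# Hypothetical blocking matrix: the difference of the two aligned channels
# cancels any component that is identical on both (the target).
B = [[1.0, -1.0]]

# Target sample 1.0 on both channels; extra noise of 0.5 on the first channel only.
x = [1.5, 1.0]
s, s_candidate, u = gsc_output(x, [0.5, 0.5], B, [0.5])
```

Here the blocked path `u` contains only noise, and the second weight 0.5 happens to remove the noise left in the candidate signal exactly; in practice that weight would be tuned or adapted.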
In one possible implementation, the second weight may be an updatable weight. After the computer device obtains the audio signal corresponding to each sound collection device, the second weight may be adjusted according to the correlation between the audio signal corresponding to each sound collection device and the expected audio signal obtained based on the audio signal corresponding to each sound collection device, and the steps of obtaining and removing the noise signal are continuously performed based on the adjusted second weight until the target condition is met, so as to obtain the audio signal corresponding to each sound collection device.
In the audio signal obtained by subtracting the noise signal from the candidate audio signal corresponding to each sound collection device, there may be residual noise that was not filtered out. Accordingly, the second weight used for the weighted summation of the noise signals of the acquisition units may be an updatable weight, so that the resulting audio signal meets the target condition.
Through this iterative process, the second weight is adaptively adjusted, which improves the noise reduction effect. The target condition may be that the correlation converges, that the correlation is greater than a target threshold, that the difference between the correlation and the target threshold is less than a difference threshold, or that the number of iterations reaches a target number; the embodiment of the present application does not limit this.
For example, after the candidate audio signal and the noise signal of the first headphone are subtracted to obtain the audio signal corresponding to the first headphone, the second weights of the mic1 and the mic2 may be updated, so that the noise signal of the first headphone obtained according to the noise signals of the mic1 and the mic2 can be better filtered out from the candidate audio signal of the first headphone.
The computer device may update the second weights using a modified least mean square algorithm to minimize the output power of the noise signal of each sound collection device.
Figure BDA0002390252100000141
where $W_b(n+1)$ is the updated second weight vector.
By updating the second weight of each acquisition unit according to the correlation between the audio signal corresponding to each sound collection device and the expected audio signal obtained based on that audio signal, the noise signal of each sound collection device is updated adaptively, so the noise signal is filtered out of the candidate audio signal corresponding to each sound collection device more and more effectively, improving the noise reduction effect.
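The exact update formula of the application is given as a figure that is not reproduced here. As a stand-in, the sketch below uses the standard normalized least mean square (NLMS) update, which is an assumption and may differ from the "modified least mean square algorithm" of the application. The names `nlms_update` and `adapt`, the step size, and the toy data are hypothetical, and the loop stops after a fixed number of iterations, which corresponds to one of the target conditions listed above.

```python
def nlms_update(w_b, u, output, step=0.5, eps=1e-8):
    """Return W_b(n+1) = W_b(n) + step * output * U(n) / (eps + ||U(n)||^2)."""
    norm = eps + sum(v * v for v in u)
    return [w + step * output * v / norm for w, v in zip(w_b, u)]


def adapt(candidates, blocked, w_b, iterations):
    """Run the subtract-then-update loop for a fixed number of samples."""
    outputs = []
    for n in range(iterations):
        u = blocked[n]
        z = sum(w * v for w, v in zip(w_b, u))   # noise estimate z(n)
        out = candidates[n] - z                  # output s(n) = s'(n) - z(n)
        outputs.append(out)
        w_b = nlms_update(w_b, u, out)           # reduce the output power
    return w_b, outputs


# Noise-only toy segment: the candidate signal contains 0.5 times the blocked-path noise.
blocked = [[1.0 if n % 2 == 0 else -1.0] for n in range(40)]
candidates = [0.5 * u[0] for u in blocked]
w_final, outputs = adapt(candidates, blocked, [0.0], iterations=40)
```

In this noise-only segment the weight approaches 0.5 and the output decays toward zero; when target speech is present, the blocked path ideally contains no target, so the same update removes noise without cancelling the target.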
In a possible implementation manner, after the sound collection device collects the audio signal, the computer device may also convert the collected audio signal into a frequency domain signal, and then perform the next data processing.
Any sound is essentially a sound wave generated by the vibration of an object. An acquisition unit collects an audio signal by converting this sound wave into an electrical signal; the collected electrical signal is a voltage that varies with time, and this voltage represents the variation of the sound to a certain extent. The time domain describes how a variable changes over time, so the audio signal collected by the acquisition unit is a time-domain signal. A time-domain audio signal is a superposition of many signals and is difficult to separate in the time domain, but in the frequency domain it can be split into components of different frequencies, so that a complex audio signal is decomposed into several simple ones that are easier to analyze. The present application does not limit the specific conversion method.
303. The computer device treats the two sound collection devices as two microphones in the same microphone array and filters the audio signals corresponding to the two sound collection devices to obtain the noise-reduced target audio signal of the same sound emitted by the target sound source.
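Since the application leaves the conversion method open, the following sketch uses a plain discrete Fourier transform purely as an illustration of how a frame separates into frequency components. The function name `dft`, the frame length, and the two test tones are assumptions; a practical system would normally use a windowed FFT.

```python
import cmath
import math


def dft(frame):
    """Return the discrete Fourier transform of a real-valued frame."""
    n = len(frame)
    return [
        sum(frame[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


# Hypothetical 16-sample frame: a tone in bin 2 plus a weaker tone in bin 5.
N = 16
frame = [
    math.sin(2 * math.pi * 2 * t / N) + 0.5 * math.sin(2 * math.pi * 5 * t / N)
    for t in range(N)
]
magnitudes = [abs(c) for c in dft(frame)]
```

The two tones, which overlap in every time-domain sample, show up as separate peaks in bins 2 and 5 (and their mirror bins), while the other bins are essentially zero.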
After the computer equipment respectively reduces the noise of the audio signals collected by the two sound collection devices, the two sound collection devices can be used as two microphones in the same microphone array to further reduce the noise.
Specifically, the noise reduction process may be as follows: the audio signals corresponding to the two sound collection devices are filtered to remove noise signals from directions other than the target direction, yielding the noise-reduced target audio signal of the sound emitted by the target sound source. For example, the audio signals corresponding to the two earphones may be treated as the audio signals collected by mic2 and mic4, respectively, so that mic2 and mic4 act as two microphones in the same microphone array. Correspondingly, the target direction may be the direction from the mouth to the midpoint between mic2 and mic4; that is, the incident angle of the sound wave from the mouth onto the microphone array formed by mic2 and mic4 can be regarded as 90 degrees. During filtering, audio signals from directions other than this direction are filtered out, thereby achieving noise reduction.
Further noise reduction of the audio signals corresponding to the sound collection devices makes the proportion of noise in the final target audio signal smaller; that is, the signal-to-noise ratio of the target audio signal is higher and the noise reduction effect is better. When the two earphones are worn on the two ears, the times at which the sound wave from the user's mouth reaches the two earphones can be considered equal, that is, the two sound collection devices can be considered to collect the sound wave emitted by the target sound source at the same time. The time delay therefore need not be considered when filtering the audio signals corresponding to the two sound collection devices, which simplifies the noise reduction method. Of course, the computer device may also calculate the time delays with which the two sound collection devices collect the sound and perform filtering after time delay compensation based on these delays. The embodiment of the present application does not limit which implementation is adopted.
The further noise reduction may be performed in several ways. In a possible implementation, as shown in fig. 6, the computer device may perform weighted summation on the audio signals corresponding to the two sound collection devices, thereby filtering them, to obtain the noise-reduced target audio signal of the same sound emitted by the target sound source.
Through the weighted summation, the audio signal from the direction of the target sound source is strengthened, which realizes beamforming toward the target sound source, so that the noise signal is attenuated relative to the sound signal. In the audio signal corresponding to each sound collection device, the target sound source's audio signal is coherent, whereas the noise signals may be incoherent. For example, after the computer device obtains the audio signals of the first earphone and the second earphone, it may perform weighted summation on them; in the resulting target audio signal, the audio signal from the direction of the mouth is enhanced while noise signals from other directions are not, which achieves further noise reduction of the audio signals of the first earphone and the second earphone.
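As an illustrative aside, and as an assumption not stated in the application, the equal-arrival-time argument can be checked against the usual far-field plane-wave model. For two microphones separated by a distance $d$, a wave arriving at angle $\theta$ measured from the line joining them, and sound speed $c$, the arrival-time difference is commonly written as:

$$\tau = \frac{d\cos\theta}{c}, \qquad \theta = 90^\circ \;\Rightarrow\; \tau = \frac{d\cos 90^\circ}{c} = 0$$

Under this model a 90-degree incident angle gives zero delay between the two earphones, which is consistent with skipping delay compensation in this step. The mouth is actually close to the array, so the plane-wave model is only an approximation; the symmetric position of the mouth relative to the two ears leads to the same conclusion.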
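The benefit of summing a coherent target with incoherent noise can be made concrete with a short worked example. It rests on assumptions not made in the application: both device signals contain the same target component $a(t)$, the noise terms $n_1(t)$ and $n_2(t)$ are zero-mean and uncorrelated with equal power $\sigma^2$, and both weights are 0.5.

$$y(t) = \tfrac{1}{2}s_1(t) + \tfrac{1}{2}s_2(t) = a(t) + \tfrac{1}{2}\big(n_1(t) + n_2(t)\big)$$

$$E\Big[\big(\tfrac{1}{2}(n_1 + n_2)\big)^2\Big] = \tfrac{1}{4}\big(\sigma^2 + \sigma^2\big) = \tfrac{\sigma^2}{2}$$

The target keeps its amplitude while the noise power is halved, so the signal-to-noise ratio doubles (about 3 dB) under these assumptions. Real noise at the two ears is usually partly correlated, in which case the gain is smaller.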
In one possible implementation manner, the computer device may perform weighted summation on the audio signal corresponding to each sound collection device by the following formula:
$$y(t) = W_a^T [s_1(t), s_2(t), \ldots, s_i(t)]^T$$
where $W_a = [w_{a1}, w_{a2}, \ldots, w_{ai}]^T$ is the weight vector corresponding to the sound collection devices, $w_{ai}$ is the weight of the i-th sound collection device, $s_i(t)$ is the audio signal corresponding to the i-th sound collection device, and $y(t)$ is the target audio signal.
It should be noted that, when the audio signals corresponding to the two sound collection devices are weighted and summed, the weight of each sound collection device may be a fixed weight preset by a relevant technician according to actual needs, experience, or experimental results; for example, the fixed weight of each sound collection device may be 0.5. The weights may also be adjusted according to the filtering result, for example according to the quality of the resulting target audio signal.
In a possible implementation, as shown in fig. 7, the computer device may perform weighted summation on the audio signals corresponding to the two sound collection devices to obtain a candidate target audio signal, that is, a noise-reduced version of the same sound emitted by the target sound source, and then filter out the audio signals from directions other than the target direction in the candidate target audio signal to obtain the noise-reduced target audio signal of the same sound emitted by the target sound source.
The computer device filters out the audio signals from directions other than the target direction in the candidate target audio signal, that is, the noise signals in the candidate target audio signal. This can be done in two ways: one is to filter out the noise signals in the candidate target audio signal directly to obtain the target audio signal; the other is to obtain a noise signal from the audio signals corresponding to the two sound collection devices and subtract it from the candidate target audio signal to obtain the target audio signal.
In a possible implementation, the computer device may directly filter the candidate target audio signal again, filtering out the audio signals from directions other than the target direction to obtain the target audio signal in the target direction. The present application does not limit the specific filtering method.
In a possible implementation, as shown in fig. 7, the computer device may perform weighted summation on the audio signals corresponding to the two sound collection devices according to the first weight corresponding to each sound collection device, obtaining a candidate target audio signal, that is, a noise-reduced version of the same sound emitted by the target sound source. The computer device then filters out the audio signal in the target direction from the audio signal corresponding to each sound collection device, obtaining the noise signal corresponding to each sound collection device, and performs weighted summation on the noise signals of the two sound collection devices according to the second weight corresponding to each sound collection device, obtaining the noise signal. Subtracting the noise signal from the candidate target audio signal gives the noise-reduced target audio signal of the same sound emitted by the target sound source.
Wherein the target direction may be a direction in which the target sound source points to an intermediate position of the two sound collection devices. The candidate target audio signal is obtained by enhancing the audio signal from the target direction, but still includes the audio signal and the noise signal of the target sound source, so that the candidate target audio signal and the noise signal can be subtracted, thereby filtering the noise signal in the candidate target audio signal and realizing further noise reduction. The noise signal can be obtained by filtering the audio signal of the target direction in the audio signal corresponding to each sound collection device to obtain the noise signal corresponding to each sound collection device, and performing weighted summation on the noise signals of the two sound collection devices.
For example, after the audio signals corresponding to the first earphone and the second earphone are weighted and summed to obtain the candidate target audio signal, the audio signals coming from the direction from the mouth to the midpoint between the first earphone and the second earphone may be filtered out of the audio signals corresponding to the first earphone and the second earphone, respectively, to obtain the noise signals corresponding to the first earphone and the second earphone; these noise signals are then weighted and summed to obtain the noise signal. Subtracting the noise signal from the candidate target audio signal filters out the noise in the candidate target audio signal and yields the target audio signal.
The computer equipment obtains the noise signal by filtering the audio signal in the target direction, and makes a difference between the candidate target audio signal and the noise signal, so that the noise signal in the candidate target audio signal is filtered, and a better noise reduction effect is achieved.
In a possible implementation manner, as shown in fig. 7, the computer device may use the blocking matrix to filter audio signals in a target direction from the audio signals corresponding to each sound collection device, so as to obtain audio signals in directions other than the target direction from the audio signals of each sound collection device, that is, noise signals of each sound collection device.
The target audio signal can be obtained by, for example, the following formula.
$$y'(t) = W_a^T S_i(t)$$
$$U(t) = B S_i(t)$$
$$z(t) = W_b^T U(t)$$
$$y(t) = y'(t) - z(t)$$
where $y'(t)$ is the candidate target audio signal, $W_a = [w_{a1}, w_{a2}, \ldots, w_{ai}]^T$ is the first weight vector corresponding to the sound collection devices, $S_i(t) = [s_1(t), s_2(t), \ldots, s_i(t)]^T$ is the audio signal vector of the i sound collection devices, $U(t)$ is the array signal obtained by processing that vector with the blocking matrix $B$, $W_b = [w_{b1}, w_{b2}, \ldots, w_{bi}]^T$ is the second weight vector corresponding to the sound collection devices, $w_{bi}$ is the second weight of the i-th sound collection device, $z(t)$ is the noise signal, and $y(t)$ is the target audio signal. In order to filter out the audio signal in the target direction from the audio signal of each sound collection device, the blocking matrix $B$ may be
Figure BDA0002390252100000173
It should be noted that, when the audio signals corresponding to the two sound collection devices are weighted and summed, the first weight and the second weight may be fixed weights preset by a relevant technician according to actual needs, experience, or experimental results; for example, the first weight and the second weight of each sound collection device may be 0.5. The first weight and the second weight may also be adjusted according to the filtering result, for example according to the quality of the obtained target audio signal.
In one possible implementation, the second weight may be an updatable weight. The computer device performs weighted summation on the noise signals corresponding to the two sound collection devices according to the second weight corresponding to each sound collection device to obtain the noise signal, and removes the noise signal from the candidate target audio signal to obtain a target audio signal. It then adjusts the second weight according to the correlation between the target audio signal and the expected audio signal obtained based on the candidate target audio signal, and continues to execute the steps of obtaining and removing the noise signal based on the adjusted second weight until the target audio signal meets the target condition, which yields the noise-reduced target audio signal of the same sound emitted by the target sound source.
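A small worked example may help show why the blocking path leaves the target untouched. The blocking matrix of the application is not reproduced here, so the example assumes the pairwise-difference matrix $B = [1, -1]$ for two sound collection devices; it also assumes that both device signals contain the same target component, $s_1(t) = a(t) + n_1(t)$ and $s_2(t) = a(t) + n_2(t)$, that the first weights are 0.5, and that the second weight reduces to a single scalar $w_b$.

$$y'(t) = \tfrac{1}{2}s_1(t) + \tfrac{1}{2}s_2(t) = a(t) + \tfrac{1}{2}\big(n_1(t) + n_2(t)\big)$$

$$U(t) = s_1(t) - s_2(t) = n_1(t) - n_2(t)$$

$$y(t) = y'(t) - w_b\,U(t) = a(t) + \tfrac{1}{2}\big(n_1(t) + n_2(t)\big) - w_b\big(n_1(t) - n_2(t)\big)$$

Under these assumptions $a(t)$ appears in $y(t)$ unchanged for every value of $w_b$, so the second weight only controls how much residual noise is removed, which is why it can be adjusted freely.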
In the target audio signal obtained by subtracting the candidate target audio signal and the noise signal, there may be a residual noise signal that is not filtered, and accordingly, the second weight for performing weighted summation on the noise signals of the two sound collection devices may be an updatable weight, so that the obtained target audio signal meets the target condition.
Through this iterative process, the second weight is adaptively adjusted, which improves the noise reduction effect. The target condition may be that the correlation converges, that the correlation is greater than a target threshold, that the difference between the correlation and the target threshold is less than a difference threshold, or that the number of iterations reaches a target number; the embodiment of the present application does not limit this.
For example, the computer device may perform a difference between the candidate target audio signal and the noise signal to obtain the target audio signal, and then update the second weight, so that the noise signal obtained according to the noise signals corresponding to the first earphone and the second earphone can be better filtered out from the candidate target audio signal.
The computer device may update the second weights using a modified least mean square algorithm to minimize the output power of the noise signal.
Figure BDA0002390252100000181
where $W_b(n+1)$ is the updated second weight vector.
The second weight is updated according to the correlation between the target audio signal and the expected audio signal obtained based on the candidate target audio signal, so that the noise signal can be updated in a self-adaptive manner, the filtering effect of the noise signal in the candidate target audio signal is better and better, and the noise reduction effect is improved.
In one possible embodiment, the noise reduction method further includes:
The computer device detects the states of the two sound collection devices. When both sound collection devices are in the working state, it executes the steps of acquiring the audio signals corresponding to the two sound collection devices, filtering them separately, and treating the two devices as the same microphone array. When either sound collection device is not in the working state, it acquires the audio signal that the sound collection device in the working state collected from the sound emitted by the target sound source, and filters that audio signal to obtain the noise-reduced target audio signal of the sound emitted by the target sound source.
Before the noise reduction method is implemented, the computer device detects whether the two sound acquisition devices are simultaneously started or not, and then executes different noise reduction methods, so that when only one sound acquisition device is started, the acquired audio signals can be subjected to noise reduction, a certain noise reduction effect is achieved, and the situation that the audio signals cannot be subjected to noise reduction due to the fact that necessary data are lacked is avoided.
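The state check described above amounts to a small dispatch step, sketched below. The dictionary layout, the function names `reduce_noise` and `average`, and the use of simple averaging in place of the actual filters are all assumptions made for illustration.

```python
def reduce_noise(devices, denoise_device, combine_devices):
    """Pick the processing path from the working state of the two devices.

    `devices` is a list of two dicts with keys "working" (bool)
    and "mics" (list of per-microphone sample lists).
    """
    working = [d for d in devices if d["working"]]
    if not working:
        raise ValueError("no sound collection device is in the working state")
    per_device = [denoise_device(d["mics"]) for d in working]
    if len(working) == 2:
        return combine_devices(per_device)   # two-stage path
    return per_device[0]                     # single-device fallback


def average(channels):
    """Placeholder filter: sample-wise mean of the given channels."""
    return [sum(samples) / len(samples) for samples in zip(*channels)]


left = {"working": True, "mics": [[1.0, 3.0], [3.0, 1.0]]}
right = {"working": False, "mics": [[0.0, 0.0], [0.0, 0.0]]}
single = reduce_noise([left, right], average, average)
```

When only one device is working, the result is that device's own filtered signal, so some noise reduction is still applied even though the second stage is skipped.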
The embodiment of the application provides a noise reduction method in which a computer device acquires the audio signals obtained by two sound collection devices collecting the same sound emitted by a target sound source, filters the audio signals collected by the two sound collection devices separately, and then further filters the resulting audio signals corresponding to the two sound collection devices. This applies a second stage of noise reduction to the audio signals corresponding to the two sound collection devices, so that the proportion of noise in the resulting target audio signal is smaller, the signal-to-noise ratio of the target audio signal is higher, and the noise reduction effect is better.
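The overall two-stage flow can be summarized in a compact sketch. It covers only the fixed-weight summation variant with integer sample delays and omits the blocking-matrix and adaptive-weight branches; all function names and data are hypothetical.

```python
def shift(signal, delay):
    """Return x(t + delay) with zero-padding at the frame edge."""
    n = len(signal)
    return [signal[t + delay] if 0 <= t + delay < n else 0.0 for t in range(n)]


def weighted_sum(channels, weights):
    """Sample-wise weighted sum of equally long channels."""
    return [sum(w * c for w, c in zip(weights, col)) for col in zip(*channels)]


def denoise_device(mics, delays, weights):
    """Stage 1: align the microphones of one device, then sum them."""
    aligned = [shift(x, d) for x, d in zip(mics, delays)]
    return weighted_sum(aligned, weights)


def denoise_pair(first, second, device_weights=(0.5, 0.5)):
    """Stage 2: treat the two device outputs as a two-microphone array."""
    s1 = denoise_device(*first)
    s2 = denoise_device(*second)
    return weighted_sum([s1, s2], device_weights)


# Hypothetical data: in each earphone the far microphone hears the pulse 1 sample late.
pulse = [0.0, 1.0, 2.0, 1.0, 0.0, 0.0, 0.0, 0.0]
late = [0.0] + pulse[:-1]
first = ([late, pulse], [1, 0], [0.5, 0.5])
second = ([late, pulse], [1, 0], [0.5, 0.5])
target = denoise_pair(first, second)
```

No delay compensation is applied between the two devices, reflecting the assumption that the sound reaches both earphones at the same time.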
Fig. 8 is a schematic structural diagram of a noise reduction device provided in an embodiment of the present application, and referring to fig. 8, the device includes:
an obtaining module 801, configured to obtain the audio signals collected by two sound collection devices from the same sound emitted by a target sound source.
The first filtering module 802 is configured to filter the audio signals acquired by the two sound acquisition devices respectively to obtain an audio signal corresponding to each sound acquisition device.
The second filtering module 803 is configured to filter the audio signals corresponding to the two sound collection devices to obtain the noise-reduced target audio signal of the same sound emitted by the target sound source.
In a possible implementation, the second filtering module 803 includes:
and the second weighting module is used for weighting and summing the audio signals corresponding to the two sound acquisition devices to obtain the noise-reduced target audio signal of the same sound emitted by the target sound source.
In a possible implementation, the second filtering module 803 includes:
and the second weighting module is used for weighting and summing the audio signals corresponding to the two sound acquisition devices to obtain a noise-reduced candidate target audio signal of the same sound emitted by the target sound source.
And the second filtering module is used for filtering out, from the candidate target audio signal, audio signals from directions other than the target direction, to obtain the noise-reduced target audio signal of the same sound emitted by the target sound source.
In a possible embodiment, the second filtering module comprises:
and the second filtering submodule is used for respectively filtering the audio signals in the target direction in the audio signals corresponding to each sound acquisition device to obtain the noise signals corresponding to each sound acquisition device, and the target direction is the direction in which the target sound source points to the middle position of the two sound acquisition devices.
And the second weighting submodule is used for weighting and summing the noise signals corresponding to the two sound acquisition devices to obtain a noise signal.
And the second removing module is used for removing the noise signal from the candidate target audio signal to obtain the noise-reduced target audio signal of the same sound emitted by the target sound source.
In a possible implementation manner, the second weighting submodule is specifically configured to perform weighted summation on the noise signals corresponding to the two sound collection devices according to the weight corresponding to each sound collection device, so as to obtain a noise signal.
The second filtering module further includes a second adjusting module, configured to adjust the weight according to a correlation between the target audio signal and an expected audio signal obtained based on the candidate target audio signal.
The second filtering module is further configured to continue to perform the steps of obtaining and removing the noise signal based on the adjusted weight until a target condition is met, so as to obtain the noise-reduced target audio signal of the same sound emitted by the target sound source.
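The weighting, removing, and adjusting steps described for the second filtering module resemble an adaptive noise canceller. The sketch below uses a standard LMS (least mean squares) update as a stand-in; the application only states that the weight is adjusted according to a correlation, so the LMS rule, the step size, the zero initial weights, and the name `adaptive_cancel` are assumptions of this example rather than the claimed procedure.

```python
def adaptive_cancel(candidate, noise_refs, step=0.05, weights=None):
    """LMS-style sketch of the weight / remove / adjust loop.

    candidate  -- candidate target audio signal (list of samples)
    noise_refs -- one noise signal per sound collection device
    Returns the noise-reduced samples and the final weights.
    """
    if weights is None:
        weights = [0.0] * len(noise_refs)
    else:
        weights = list(weights)
    output = []
    for t, sample in enumerate(candidate):
        refs = [ref[t] for ref in noise_refs]
        # Weighted sum of the per-device noise signals.
        noise = sum(w * r for w, r in zip(weights, refs))
        # Remove the noise estimate from the candidate signal.
        residual = sample - noise
        output.append(residual)
        # Adjust each weight along the product of the output and its
        # noise reference (the LMS correlation term).
        weights = [w + step * residual * r for w, r in zip(weights, refs)]
    return output, weights
```

In this sketch the loop simply runs over all samples; a target condition such as a bound on the weight change could end it earlier. The step size controls the trade-off between convergence speed and stability and would need tuning for real signals.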
In one possible implementation, the first filtering module 802 includes:
and the time delay module is used for carrying out time delay compensation on the audio signal acquired by each acquisition unit in each sound acquisition device according to the time delay of the sound acquired by the different acquisition units in each sound acquisition device so as to obtain the candidate audio signal of each acquisition unit.
And the first filtering submodule is used for filtering the candidate audio signals of the acquisition units in each sound acquisition device to obtain the audio signals corresponding to each sound acquisition device.
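The delay compensation and per-device filtering can be illustrated with a delay-and-sum sketch. It assumes integer sample delays that are already known and uses plain averaging as the filter; the application does not specify the filter type or how the delays are estimated, and the names `compensate_delay` and `delay_and_sum` are hypothetical.

```python
def compensate_delay(signal, delay):
    """Advance a signal by `delay` samples, padding the tail with zeros."""
    if delay < 0:
        raise ValueError("delay must be non-negative")
    n = len(signal)
    return list(signal[delay:]) + [0.0] * min(delay, n)


def delay_and_sum(unit_signals, delays):
    """Align the signals of all acquisition units, then average them.

    unit_signals -- one list of samples per acquisition unit
    delays       -- arrival delay of each unit in samples, relative
                    to the earliest unit
    """
    aligned = [compensate_delay(s, d) for s, d in zip(unit_signals, delays)]
    count = len(aligned)
    # After alignment the target sound adds coherently across units.
    return [sum(samples) / count for samples in zip(*aligned)]
```

For example, `delay_and_sum([[0.0, 1.0, 0.0, 0.0, 0.0], [0.0, 0.0, 0.0, 1.0, 0.0]], [0, 2])` lines up the two pulses before averaging, so the pulse keeps its full amplitude in the output.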
In one possible embodiment, the apparatus further comprises:
and the detection module is used for detecting the states of the two sound acquisition devices.
The apparatus is also configured to:
and when the two sound acquisition devices are both in working states, executing the steps of acquiring the audio signals corresponding to the two sound acquisition devices, filtering them respectively, and taking the two sound acquisition devices as the same microphone array.
When either sound acquisition device is not in a working state, acquiring the audio signal collected, by the sound acquisition device that is in the working state, from the sound emitted by the target sound source, and filtering the audio signal to obtain the noise-reduced target audio signal of the same sound emitted by the target sound source.
The embodiment of the application provides a noise reduction device. The device obtains the audio signals collected by two sound collection devices from the same sound emitted by a target sound source, filters the audio signals collected by the two sound collection devices respectively, and then further filters the resulting audio signals corresponding to the two sound collection devices. The audio signals are thereby noise-reduced a second time, so that in the obtained target audio signal the proportion of the noise signal is smaller, the signal-to-noise ratio is higher, and the noise reduction effect is better.
It should be noted that when the noise reduction device provided in the above embodiment performs noise reduction, the division into the above functional modules is used only as an example. In practical applications, the functions may be allocated to different functional modules as needed, that is, the internal structure of the device may be divided into different functional modules to complete all or part of the functions described above. In addition, the noise reduction device and the noise reduction method provided by the above embodiments belong to the same concept; the specific implementation process is described in detail in the method embodiments and is not repeated here.
Referring to fig. 9, the terminal 900 may be a smart phone, a tablet computer, an MP3 (Moving Picture Experts Group Audio Layer III) player, an MP4 (Moving Picture Experts Group Audio Layer IV) player, a notebook computer, or a desktop computer. The terminal 900 may also be referred to as user equipment, a portable terminal, a laptop terminal, a desktop terminal, or by other names.
In general, terminal 900 includes: one or more processors 901 and one or more memories 902.
The processor 901 may include one or more processing cores, such as a 4-core processor or a 9-core processor. The processor 901 may be implemented in at least one hardware form of a DSP (Digital Signal Processing), an FPGA (Field-Programmable Gate Array), or a PLA (Programmable Logic Array). The processor 901 may also include a main processor and a coprocessor: the main processor, also called a CPU (Central Processing Unit), processes data in the awake state, and the coprocessor is a low-power processor that processes data in the standby state. In some embodiments, the processor 901 may be integrated with a GPU (Graphics Processing Unit), which is responsible for rendering and drawing the content to be displayed on the display screen. In some embodiments, the processor 901 may also include an AI (Artificial Intelligence) processor for handling computing operations related to machine learning.
Memory 902 may include one or more computer-readable storage media, which may be non-transitory. The memory 902 may also include high-speed random access memory, as well as non-volatile memory, such as one or more magnetic disk storage devices, flash memory storage devices. In some embodiments, a non-transitory computer readable storage medium in memory 902 is used to store at least one instruction for execution by processor 901 to implement the noise reduction methods provided by the method embodiments herein.
In some embodiments, terminal 900 can also optionally include: a peripheral interface 903 and at least one peripheral. The processor 901, memory 902, and peripheral interface 903 may be connected by buses or signal lines. Various peripheral devices may be connected to the peripheral interface 903 via a bus, signal line, or circuit board. Specifically, the peripheral device includes: at least one of a radio frequency circuit 904, a display screen 905, a camera assembly 906, an audio circuit 907, a positioning assembly 908, and a power supply 909.
The peripheral interface 903 may be used to connect at least one peripheral related to I/O (Input/Output) to the processor 901 and the memory 902. In some embodiments, the processor 901, memory 902, and peripheral interface 903 are integrated on the same chip or circuit board; in some other embodiments, any one or two of the processor 901, the memory 902 and the peripheral interface 903 may be implemented on a separate chip or circuit board, which is not limited by this embodiment.
The Radio Frequency circuit 904 is used for receiving and transmitting RF (Radio Frequency) signals, also called electromagnetic signals. The radio frequency circuitry 904 communicates with communication networks and other communication devices via electromagnetic signals. The radio frequency circuit 904 converts an electrical signal into an electromagnetic signal to transmit, or converts a received electromagnetic signal into an electrical signal. Optionally, the radio frequency circuit 904 comprises: an antenna system, an RF transceiver, one or more amplifiers, a tuner, an oscillator, a digital signal processor, a codec chipset, a subscriber identity module card, and so forth. The radio frequency circuit 904 may communicate with other terminals via at least one wireless communication protocol. The wireless communication protocols include, but are not limited to: metropolitan area networks, various generation mobile communication networks (2G, 3G, 4G, and 5G), Wireless local area networks, and/or WiFi (Wireless Fidelity) networks. In some embodiments, the radio frequency circuit 904 may also include NFC (Near Field Communication) related circuits, which are not limited in this application.
The display screen 905 is used to display a UI (User Interface). The UI may include graphics, text, icons, video, and any combination thereof. When the display screen 905 is a touch display screen, it can also capture touch signals on or above its surface; such a touch signal may be input to the processor 901 as a control signal for processing. In this case, the display screen 905 may also be used to provide virtual buttons and/or a virtual keyboard, also referred to as soft buttons and/or a soft keyboard. In some embodiments, there may be one display screen 905, disposed on the front panel of the terminal 900; in other embodiments, there may be at least two display screens 905, each disposed on a different surface of the terminal 900 or in a folded design; in still other embodiments, the display screen 905 may be a flexible display screen disposed on a curved surface or a folded surface of the terminal 900. The display screen 905 may even be set in a non-rectangular irregular shape, that is, a shaped screen. The display screen 905 may be made of materials such as an LCD (Liquid Crystal Display) or an OLED (Organic Light-Emitting Diode).
The camera assembly 906 is used to capture images or video. Optionally, camera assembly 906 includes a front camera and a rear camera. Generally, a front camera is disposed at a front panel of the terminal, and a rear camera is disposed at a rear surface of the terminal. In some embodiments, the number of the rear cameras is at least two, and each rear camera is any one of a main camera, a depth-of-field camera, a wide-angle camera and a telephoto camera, so that the main camera and the depth-of-field camera are fused to realize a background blurring function, and the main camera and the wide-angle camera are fused to realize panoramic shooting and VR (Virtual Reality) shooting functions or other fusion shooting functions. In some embodiments, camera assembly 906 may also include a flash. The flash may be a single-color-temperature flash or a dual-color-temperature flash. A dual-color-temperature flash is a combination of a warm-light flash and a cold-light flash, and can be used for light compensation at different color temperatures.
Audio circuit 907 may include a microphone and a speaker. The microphone is used for collecting sound waves of a user and the environment, converting the sound waves into electric signals, and inputting the electric signals to the processor 901 for processing, or inputting the electric signals to the radio frequency circuit 904 for realizing voice communication. For stereo sound acquisition or noise reduction purposes, the microphones may be multiple and disposed at different locations of the terminal 900. The microphone may also be an array microphone or an omni-directional pick-up microphone. The speaker is used to convert electrical signals from the processor 901 or the radio frequency circuit 904 into sound waves. The loudspeaker can be a traditional film loudspeaker or a piezoelectric ceramic loudspeaker. When the speaker is a piezoelectric ceramic speaker, the speaker can be used for purposes such as converting an electric signal into a sound wave audible to a human being, or converting an electric signal into a sound wave inaudible to a human being to measure a distance. In some embodiments, audio circuit 907 may also include a headphone jack.
The positioning component 908 is used to locate the current geographic location of the terminal 900 to implement navigation or LBS (Location Based Service). The positioning component 908 may be a positioning component based on the GPS (Global Positioning System) of the United States, the BeiDou system of China, the GLONASS system of Russia, or the Galileo system of the European Union.
Power supply 909 is used to provide power to the various components in terminal 900. The power source 909 may be alternating current, direct current, disposable or rechargeable. When power source 909 comprises a rechargeable battery, the rechargeable battery may support wired or wireless charging. The rechargeable battery may also be used to support fast charge technology.
In some embodiments, terminal 900 can also include one or more sensors 910. The one or more sensors 910 include, but are not limited to: acceleration sensor 911, gyro sensor 912, pressure sensor 913, fingerprint sensor 914, optical sensor 915, and proximity sensor 916.
The acceleration sensor 911 can detect the magnitude of acceleration in three coordinate axes of the coordinate system established with the terminal 900. For example, the acceleration sensor 911 may be used to detect the components of the gravitational acceleration in three coordinate axes. The processor 901 can control the display screen 905 to display the user interface in a landscape view or a portrait view according to the gravitational acceleration signal collected by the acceleration sensor 911. The acceleration sensor 911 may also be used for acquisition of motion data of a game or a user.
The gyro sensor 912 may detect a body direction and a rotation angle of the terminal 900, and the gyro sensor 912 may cooperate with the acceleration sensor 911 to acquire a 3D motion of the user on the terminal 900. The processor 901 can implement the following functions according to the data collected by the gyro sensor 912: motion sensing (such as changing the UI according to a user's tilting operation), image stabilization at the time of photographing, game control, and inertial navigation.
The pressure sensor 913 may be disposed on a side bezel of the terminal 900 and/or underneath the display 905. When the pressure sensor 913 is disposed on the side frame of the terminal 900, the user's holding signal of the terminal 900 may be detected, and the processor 901 performs left-right hand recognition or shortcut operation according to the holding signal collected by the pressure sensor 913. When the pressure sensor 913 is disposed at a lower layer of the display screen 905, the processor 901 controls the operability control on the UI interface according to the pressure operation of the user on the display screen 905. The operability control comprises at least one of a button control, a scroll bar control, an icon control and a menu control.
The fingerprint sensor 914 is used to collect the user's fingerprint. Either the processor 901 identifies the user according to the fingerprint collected by the fingerprint sensor 914, or the fingerprint sensor 914 itself identifies the user according to the collected fingerprint. When the user's identity is identified as trusted, the processor 901 authorizes the user to perform relevant sensitive operations, including unlocking the screen, viewing encrypted information, downloading software, making payments, and changing settings. The fingerprint sensor 914 may be disposed on the front, back, or side of the terminal 900. When a physical key or a manufacturer logo is provided on the terminal 900, the fingerprint sensor 914 may be integrated with the physical key or the manufacturer logo.
The optical sensor 915 is used to collect ambient light intensity. In one embodiment, the processor 901 may control the display brightness of the display screen 905 based on the ambient light intensity collected by the optical sensor 915. Specifically, when the ambient light intensity is high, the display brightness of the display screen 905 is increased; when the ambient light intensity is low, the display brightness of the display screen 905 is reduced. In another embodiment, the processor 901 can also dynamically adjust the shooting parameters of the camera assembly 906 according to the ambient light intensity collected by the optical sensor 915.
Proximity sensor 916, also known as a distance sensor, is typically disposed on the front panel of terminal 900. The proximity sensor 916 is used to collect the distance between the user and the front face of the terminal 900. In one embodiment, when the proximity sensor 916 detects that the distance between the user and the front face of the terminal 900 gradually decreases, the processor 901 controls the display screen 905 to switch from the bright screen state to the dark screen state; when the proximity sensor 916 detects that the distance between the user and the front face of the terminal 900 gradually increases, the processor 901 controls the display screen 905 to switch from the dark screen state to the bright screen state.
Those skilled in the art will appreciate that the configuration shown in fig. 9 does not constitute a limitation of terminal 900, and may include more or fewer components than those shown, or may combine certain components, or may employ a different arrangement of components.
In an exemplary embodiment, a computer-readable storage medium, such as a memory, including instructions executable by a processor to perform the noise reduction method in the above-described embodiments is also provided. For example, the computer-readable storage medium may be a Read-Only Memory (ROM), a Random Access Memory (RAM), a Compact Disc Read-Only Memory (CD-ROM), a magnetic tape, a floppy disk, an optical data storage device, and the like.
It will be understood by those skilled in the art that all or part of the steps for implementing the above embodiments may be implemented by hardware, or may be implemented by a program instructing relevant hardware, and the program may be stored in a computer-readable storage medium, and the above-mentioned storage medium may be a read-only memory, a magnetic disk or an optical disk, etc.
The above description is intended to be exemplary only, and not to limit the present application, and any modifications, equivalents, improvements, etc. made within the spirit and scope of the present application are intended to be included therein.

Claims (10)

1. A method of noise reduction, the method comprising:
acquiring an audio signal acquired by acquiring the same sound emitted by a target sound source by two sound acquisition devices;
respectively filtering the audio signals collected by the two sound collection devices to obtain the audio signal corresponding to each sound collection device;
and taking the two sound collection devices as two microphones in the same microphone array, and filtering audio signals corresponding to the two sound collection devices to obtain the target audio signal which is emitted by the target sound source and subjected to the noise reduction of the same sound.
2. The method according to claim 1, wherein the filtering the audio signals corresponding to the two sound collection devices to obtain a noise-reduced target audio signal of the same sound emitted by the target sound source comprises:
and carrying out weighted summation on the audio signals corresponding to the two sound acquisition devices to obtain the target audio signal which is emitted by the target sound source and subjected to noise reduction on the same sound.
3. The method according to claim 1, wherein the step of filtering audio signals corresponding to the two sound collecting devices by using the two sound collecting devices as two microphones in a same microphone array to obtain a noise-reduced target audio signal of the same sound emitted by the target sound source comprises:
carrying out weighted summation on the audio signals corresponding to the two sound acquisition devices to obtain candidate target audio signals which are emitted by the target sound source and subjected to noise reduction on the same sound;
and filtering audio signals except the target direction in the candidate target audio signals to obtain the target audio signal which is emitted by the target sound source and subjected to noise reduction.
4. The method according to claim 3, wherein said filtering out audio signals other than the target direction from the candidate target audio signals to obtain a noise-reduced target audio signal of the same sound emitted by the target sound source comprises:
respectively filtering audio signals of a target direction in the audio signals corresponding to each sound collection device to obtain noise signals corresponding to each sound collection device, wherein the target direction is a direction in which the target sound source points to the middle position of the two sound collection devices;
carrying out weighted summation on the noise signals corresponding to the two sound acquisition devices to obtain a noise signal;
and removing the noise signal in the candidate target audio signal to obtain the target audio signal which is emitted by the target sound source and subjected to noise reduction.
5. The method according to claim 4, wherein the noise signals corresponding to the two sound collection devices are weighted and summed to obtain a noise signal; removing the noise signal in the candidate target audio signal to obtain the noise-reduced target audio signal of the same sound emitted by the target sound source, including:
according to the weight corresponding to each sound acquisition device, carrying out weighted summation on the noise signals corresponding to the two sound acquisition devices to obtain a noise signal;
removing the noise signal in the candidate target audio signal to obtain a target audio signal which is emitted by the target sound source and subjected to noise reduction;
and adjusting the weight according to the correlation between the target audio signal and an expected audio signal obtained based on the candidate target audio signal, and continuing to execute the steps of obtaining and removing the noise signal based on the adjusted weight until a target condition is met, so as to obtain the noise-reduced target audio signal of the same sound emitted by the target sound source.
6. The method according to claim 1, wherein the filtering the audio signals collected by the two sound collection devices respectively to obtain the audio signal corresponding to each sound collection device comprises:
according to the time delay of collecting the sound by different collecting units in each sound collecting device, carrying out time delay compensation on the audio signal collected by each collecting unit in each sound collecting device to obtain a candidate audio signal of each collecting unit;
and filtering the candidate audio signals of the acquisition units in each sound acquisition device to obtain the audio signal corresponding to each sound acquisition device.
7. The method of claim 1, further comprising:
detecting the states of the two sound collection devices;
when the two sound collection devices are both in a working state, executing the steps of acquiring the audio signals corresponding to the two sound collection devices, filtering them respectively, and taking the two sound collection devices as the same microphone array;
when any sound collection device is not in a working state, acquiring the audio signal collected, by the sound collection device in the working state, from the sound emitted by the target sound source, and filtering the audio signal to obtain the noise-reduced target audio signal of the same sound emitted by the target sound source.
8. A noise reduction apparatus, characterized in that the apparatus comprises a plurality of functional modules for performing the noise reduction method of any one of claims 1 to 7.
9. A computer device comprising one or more processors and one or more memories having stored therein at least one instruction that is loaded and executed by the one or more processors to implement operations performed by the noise reduction method of any of claims 1-7.
10. A computer-readable storage medium having stored therein at least one instruction, which is loaded and executed by a processor to perform operations performed by the noise reduction method of any one of claims 1 to 7.
CN202010111706.0A 2020-02-24 2020-02-24 Noise reduction method, device, equipment and storage medium Active CN111402913B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010111706.0A CN111402913B (en) 2020-02-24 2020-02-24 Noise reduction method, device, equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010111706.0A CN111402913B (en) 2020-02-24 2020-02-24 Noise reduction method, device, equipment and storage medium

Publications (2)

Publication Number Publication Date
CN111402913A true CN111402913A (en) 2020-07-10
CN111402913B CN111402913B (en) 2023-09-12

Family

ID=71413851

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010111706.0A Active CN111402913B (en) 2020-02-24 2020-02-24 Noise reduction method, device, equipment and storage medium

Country Status (1)

Country Link
CN (1) CN111402913B (en)

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9741360B1 (en) * 2016-10-09 2017-08-22 Spectimbre Inc. Speech enhancement for target speakers
WO2018127483A1 (en) * 2017-01-03 2018-07-12 Koninklijke Philips N.V. Audio capture using beamforming
CN108091344A (en) * 2018-02-28 2018-05-29 科大讯飞股份有限公司 A kind of noise-reduction method, apparatus and system
US20190287546A1 (en) * 2018-03-19 2019-09-19 Bose Corporation Echo control in binaural adaptive noise cancellation systems in headsets
CN108922554A (en) * 2018-06-04 2018-11-30 南京信息工程大学 The constant Wave beam forming voice enhancement algorithm of LCMV frequency based on logarithm Power estimation

Cited By (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112185336A (en) * 2020-09-28 2021-01-05 苏州臻迪智能科技有限公司 Noise reduction method, device and equipment
CN112785998B (en) * 2020-12-29 2022-11-15 展讯通信(上海)有限公司 Signal processing method, equipment and device
CN112785998A (en) * 2020-12-29 2021-05-11 展讯通信(上海)有限公司 Signal processing method, equipment and device
CN114697812B (en) * 2020-12-29 2023-06-20 华为技术有限公司 Sound collection method, electronic equipment and system
CN114697812A (en) * 2020-12-29 2022-07-01 华为技术有限公司 Sound collection method, electronic equipment and system
WO2022142833A1 (en) * 2020-12-29 2022-07-07 展讯通信(上海)有限公司 Signal processing method, device and apparatus
CN112837703A (en) * 2020-12-30 2021-05-25 深圳市联影高端医疗装备创新研究院 Method, apparatus, device and medium for acquiring voice signal in medical imaging device
CN113539291A (en) * 2021-07-09 2021-10-22 北京声智科技有限公司 Method and device for reducing noise of audio signal, electronic equipment and storage medium
CN113539291B (en) * 2021-07-09 2024-06-25 北京声智科技有限公司 Noise reduction method and device for audio signal, electronic equipment and storage medium
CN113766385A (en) * 2021-09-24 2021-12-07 维沃移动通信有限公司 Earphone noise reduction method and device
CN113766385B (en) * 2021-09-24 2023-12-22 维沃移动通信有限公司 Earphone noise reduction method and device
CN115132220A (en) * 2022-08-25 2022-09-30 深圳市友杰智新科技有限公司 Method, device, equipment and storage medium for restraining double-microphone awakening of television noise
CN115132220B (en) * 2022-08-25 2023-02-28 深圳市友杰智新科技有限公司 Method, device, equipment and storage medium for restraining double-microphone awakening of television noise

Also Published As

Publication number Publication date
CN111402913B (en) 2023-09-12

Similar Documents

Publication Publication Date Title
CN111402913B (en) Noise reduction method, device, equipment and storage medium
CN111050250B (en) Noise reduction method, device, equipment and storage medium
CN110764730B (en) Method and device for playing audio data
CN108156561B (en) Audio signal processing method and device and terminal
CN109887494B (en) Method and apparatus for reconstructing a speech signal
CN111445901B (en) Audio data acquisition method and device, electronic equipment and storage medium
CN111696570B (en) Voice signal processing method, device, equipment and storage medium
CN112133332B (en) Method, device and equipment for playing audio
CN110797042B (en) Audio processing method, device and storage medium
CN110956580A (en) Image face changing method and device, computer equipment and storage medium
CN112614500A (en) Echo cancellation method, device, equipment and computer storage medium
CN111613213B (en) Audio classification method, device, equipment and storage medium
CN112233689A (en) Audio noise reduction method, device, equipment and medium
CN109360577B (en) Method, apparatus, and storage medium for processing audio
CN110473562B (en) Audio data processing method, device and system
CN114120950B (en) Human voice shielding method and electronic equipment
CN114384466A (en) Sound source direction determining method, sound source direction determining device, electronic equipment and storage medium
CN111916105A (en) Voice signal processing method and device, electronic equipment and storage medium
CN113539291B (en) Noise reduction method and device for audio signal, electronic equipment and storage medium
CN116233696B (en) Airflow noise suppression method, audio module, sound generating device and storage medium
CN115334413B (en) Voice signal processing method, system and device and electronic equipment
CN110660031B (en) Image sharpening method and device and storage medium
CN110910893B (en) Audio processing method, device and storage medium
CN111091512B (en) Image processing method and device and computer readable storage medium
CN113990340A (en) Audio signal processing method and device, terminal and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant