CN113689873A - Noise suppression method, device, electronic equipment and storage medium - Google Patents
- Publication number
- CN113689873A (CN202111045463.6A)
- Authority
- CN
- China
- Prior art keywords
- sound
- category
- interest
- obtaining
- noise reduction
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/02—Speech enhancement, e.g. noise reduction or echo cancellation
- G10L21/0208—Noise filtering
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/02—Speech enhancement, e.g. noise reduction or echo cancellation
- G10L21/0208—Noise filtering
- G10L21/0216—Noise filtering characterised by the method used for estimating noise
- G10L2021/02161—Number of inputs available containing the signal or the noise to be suppressed
- G10L2021/02166—Microphone arrays; Beamforming
Landscapes
- Engineering & Computer Science (AREA)
- Computational Linguistics (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Human Computer Interaction (AREA)
- Physics & Mathematics (AREA)
- Acoustics & Sound (AREA)
- Multimedia (AREA)
- Quality & Reliability (AREA)
- Signal Processing (AREA)
- Circuit For Audible Band Transducer (AREA)
- Obtaining Desirable Characteristics In Audible-Bandwidth Transducers (AREA)
Abstract
The embodiments of the application disclose a method and an apparatus for improving directional noise suppression in an electronic device having a microphone array composed of a plurality of microphones, as well as an electronic device and a storage medium. A direction of interest is obtained; a receive beam of the microphone array is formed based on the direction of interest so as to focus on the direction of interest and suppress signals from sound sources outside it; an audio signal is collected based on the receive beamforming; and the collected audio signal is processed by a noise reduction engine, which filters noise signals out of the collected audio signal, to generate an output audio signal. Because the collected audio signal is denoised by the noise reduction engine after it has been collected through receive beamforming, noise inside the receive beam is attenuated and the audio recording quality is improved.
Description
Technical Field
The present application relates to the field of audio processing technologies, and in particular, to a method and an apparatus for improving directional noise suppression in an electronic device having a microphone array composed of multiple microphones, an electronic device, and a storage medium.
Background
When recording audio, current portable electronic devices such as smartphones and tablet computers use a beamforming sound-focusing scheme that amplifies sound inside the beam and attenuates sound outside it. However, when a noise source lies inside the beam, the noise is amplified together with the wanted sound, so the audio recording quality is poor.
Disclosure of Invention
The application aims to provide a method and an apparatus for improving directional noise suppression in an electronic device having a microphone array composed of a plurality of microphones, as well as an electronic device and a storage medium, by means of the following technical solutions:
A method of improving directional noise suppression in an electronic device having a microphone array composed of a plurality of microphones, the method comprising:
obtaining a direction of interest;
forming a receive beam of the microphone array based on the direction of interest to focus on the direction of interest and suppress signals of sound sources other than the direction of interest;
acquiring an audio signal based on the receive beamforming;
and processing the collected audio signals based on a noise reduction engine to generate audio signals, wherein the noise reduction engine is used for filtering noise signals in the collected audio signals.
The above method, preferably, further comprises:
obtaining a target sound category in the attention direction;
the processing the captured audio signal based on a noise reduction engine to generate an audio signal comprises:
processing the collected audio signal based on the sound model corresponding to the target sound category, and extracting from it the sound signals that match the characteristics of the target sound category as the audio signal.
In the above method, preferably, each sound model corresponds to at least one sound category, the sound categories corresponding to different sound models are at least partially different, and the target sound category includes one or more sound categories.
In the above method, preferably, the obtaining of the target sound category in the attention direction includes:
identifying the collected audio signals and obtaining at least one sound category corresponding to the collected audio signals;
obtaining a selection operation on one of the displayed selection objects, each of which represents one of the at least one sound category;
a target sound category is determined based on the selection operation.
In the above method, preferably, the obtaining of the target sound category in the attention direction includes:
capturing an image through a camera of the electronic device, wherein the capture direction of the camera is the direction of interest;
identifying one or more capture objects in the captured image, and determining the category of each capture object;
obtaining a selection operation for an acquisition object;
a target sound category is determined based on the selection operation.
In the above method, preferably, obtaining the direction of interest includes:
if the target camera of the electronic equipment is turned on, the acquisition direction of the target camera is the attention direction;
or,
positioning the position of a sound source in a space range where the electronic equipment is located through the microphone array;
displaying the position of the sound source;
selecting a sound source at a target position, the direction of the target position relative to the electronic device being the direction of interest.
The above method, preferably, further comprises:
when video recording/sound recording is started, forming the receive beam of the microphone array based on the direction of interest and obtaining the audio signal through the noise reduction engine;
or,
when a video call/voice call is initiated, forming the receive beam of the microphone array based on the direction of interest and obtaining the audio signal through the noise reduction engine.
An apparatus for improving directional noise suppression in an electronic device having a microphone array composed of a plurality of microphones, the apparatus comprising:
an obtaining module for obtaining a direction of interest;
a beam forming module for forming receiving beams of the microphone array based on the attention direction so as to focus on the attention direction and suppress signals of sound sources except the attention direction;
an acquisition module to acquire an audio signal based on the receive beamforming;
a generating module for processing the collected audio signal based on a noise reduction engine to generate an audio signal, the noise reduction engine being used for filtering noise signals out of the collected audio signal.
An electronic device, comprising:
a memory for storing a program;
a processor for invoking and executing the program in the memory, the program being executed to implement the steps of any of the above methods of enhancing directional noise suppression in an electronic device having a microphone array comprising a plurality of microphones.
A readable storage medium on which a computer program is stored, which, when executed by a processor, implements the steps of any one of the methods described above.
With the method and apparatus for improving directional noise suppression in an electronic device having a microphone array composed of a plurality of microphones, the electronic device, and the storage medium provided by the above solutions, a direction of interest is obtained; a receive beam of the microphone array is formed based on the direction of interest so as to focus on the direction of interest and suppress signals from sound sources outside it; an audio signal is collected based on the receive beamforming; and the collected audio signal is processed by a noise reduction engine, which filters noise signals out of the collected audio signal, to generate an output audio signal. Because the collected audio signal is denoised by the noise reduction engine after it has been collected through receive beamforming, noise inside the receive beam is attenuated and the audio recording quality is improved.
Drawings
To illustrate the technical solutions of the embodiments of the present application more clearly, the drawings used in the embodiments are briefly described below. The drawings described below show only some embodiments of the present application, and those skilled in the art can derive other drawings from them without creative effort.
FIG. 1 is a flowchart of one implementation of a method for improving directional noise suppression in an electronic device having a microphone array composed of multiple microphones according to an embodiment of the present application;
FIG. 2 is a schematic structural diagram of an optimization scheme provided by an embodiment of the present application;
FIG. 3 is a flowchart of one implementation of obtaining a target sound category in a direction of interest according to an embodiment of the present application;
FIG. 4 is a flowchart of another implementation of obtaining a target sound category in a direction of interest according to an embodiment of the present application;
FIG. 5 is a flowchart of another implementation of obtaining a direction of interest according to an embodiment of the present application;
FIG. 6 is a graph comparing the effects of the beamforming noise reduction scheme of the present application and the prior art, provided by an embodiment of the present application;
FIG. 7 is a schematic structural diagram of an apparatus for improving directional noise suppression in an electronic device having a microphone array composed of a plurality of microphones according to an embodiment of the present application;
FIG. 8 is a schematic structural diagram of an electronic device according to an embodiment of the present application.
The terms "first," "second," "third," "fourth," and the like in the description and in the claims, as well as in the drawings described above, if any, are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used is interchangeable under appropriate circumstances such that the embodiments of the application described herein are capable of operation in other sequences than described or illustrated herein.
Detailed Description
The technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only a part of the embodiments of the present application, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without inventive step, are within the scope of the present disclosure.
The scheme provided by the embodiments of the application is used in an electronic device that has a microphone array composed of a plurality of microphones. The electronic device may be a smartphone, a tablet computer, or another electronic device such as a video camera.
The scheme of the application can be used for audio recording, wherein the audio recording can be pure audio recording or audio recording in the video recording process.
FIG. 1 is a flowchart of one implementation of the method for improving directional noise suppression in an electronic device having a microphone array composed of multiple microphones according to an embodiment of the present application. The method may include:
Step S101: A direction of interest is obtained.
The direction of interest is the direction the user cares about when recording audio/video. It may be determined by the microphone array or in other ways, for example by a camera. Specific implementations are described in the subsequent embodiments.
Step S102: receive beams of the microphone array are formed based on the direction of interest to focus on the direction of interest and suppress signals of sound sources other than the direction of interest.
The receive beam of the microphone array defines the sound pickup range of the array. Because this range lies in the direction of interest, sound coming from the direction of interest can be amplified and sound coming from other directions can be suppressed.
Step S103: audio signals are acquired based on receive beamforming.
Sound surrounding the electronic device may be collected by the microphone array and then the sound within the receive beam may be amplified while the sound outside the receive beam may be attenuated or suppressed to form a captured audio signal.
Step S104: the collected audio signals are processed based on a noise reduction engine to generate audio signals, and the noise reduction engine is used for filtering noise signals in the collected audio signals.
In the application, after the audio signal is collected based on the receiving beam forming, the collected audio signal is subjected to noise reduction processing based on the noise reduction engine, so that the attenuation of noise in the receiving beam is realized, and the audio recording effect is improved.
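Steps S101 to S104 can be sketched as a small pipeline. The following is a minimal illustration, not the implementation described in the application: it assumes a linear array with microphones on the x axis, a far-field source, integer-sample delay-and-sum beamforming, and a simple amplitude gate standing in for the noise reduction engine. All function names, the speed-of-sound constant, and the threshold parameter are hypothetical.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s in air; assumed value for this sketch


def steering_delays(mic_x, direction_deg, sample_rate):
    """Integer sample delays that align a far-field source at direction_deg.

    mic_x holds microphone positions (metres) along the x axis; the angle is
    measured from the positive x axis. Both conventions are assumptions.
    """
    theta = math.radians(direction_deg)
    raw = [x * math.cos(theta) / SPEED_OF_SOUND * sample_rate for x in mic_x]
    earliest = min(raw)
    return [round(r - earliest) for r in raw]


def delay_and_sum(channels, delays):
    """Step S102/S103: delay each channel, then average across microphones."""
    n = len(channels[0])
    out = []
    for i in range(n):
        acc = 0.0
        for channel, delay in zip(channels, delays):
            j = i - delay
            if 0 <= j < n:
                acc += channel[j]
        out.append(acc / len(channels))
    return out


def noise_gate(signal, threshold):
    """Step S104 stand-in: zero every sample whose magnitude is below threshold."""
    return [s if abs(s) >= threshold else 0.0 for s in signal]


def capture(channels, mic_x, direction_deg, sample_rate, threshold):
    """Beamform toward the direction of interest, then denoise the result."""
    delays = steering_delays(mic_x, direction_deg, sample_rate)
    return noise_gate(delay_and_sum(channels, delays), threshold)
```

Steering the beam toward the source adds the channels coherently, while steering away spreads the energy so that the gate removes it. A real noise reduction engine would be far more selective than an amplitude gate; the gate only marks where that stage sits in the chain.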
Generally, when recording audio or video, the main goal in terms of sound is to record the voice of the subject being shot (e.g., a host or the main character of a short video), so the audio signal obtained by filtering noise out of the collected audio signal may contain only the subject's voice. In real life, however, there are other sound sources around the subject, and in some scenes not every non-speech sound is noise to the user who is recording. For example, when filming a baby, the baby's crying or laughing is not noise, and at a concert the music is not noise. To meet users' different noise reduction needs in different scenes, the scheme can be further optimized.
FIG. 2 shows the architecture of an optimization scheme provided by an embodiment of the present application. The beamforming noise reduction engine forms the collected audio signal from the audio signals picked up by the microphone array. The noise classification module identifies each sound category present in the collected audio signal, determines the target sound category, and sends a noise reduction control signal to the AI noise reduction module; the control signal instructs the AI noise reduction module to extract the audio signal of the target sound category from the collected audio signal and filter out the other audio signals. The AI noise reduction module performs the extraction according to the control signal, and its output is the noise-reduced audio signal.
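The data flow between the noise classification module and the AI noise reduction module can be illustrated with a toy sketch. It assumes that the collected audio has already been split into frames and that each frame carries a category label; in a real system that label would be inferred by a classifier. The `Frame` and `NoiseReductionControl` types and both functions are hypothetical names introduced for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Frame:
    samples: tuple
    category: str  # a real classifier would infer this; given here for illustration


@dataclass(frozen=True)
class NoiseReductionControl:
    """Hypothetical control signal: the sound categories that must be kept."""
    keep: frozenset


def classify(frames):
    """Noise classification stand-in: list the categories present, in first-seen order."""
    seen = []
    for frame in frames:
        if frame.category not in seen:
            seen.append(frame.category)
    return seen


def ai_noise_reduction(frames, control):
    """Keep frames of the target categories and replace all others with silence."""
    output = []
    for frame in frames:
        if frame.category in control.keep:
            output.append(frame.samples)
        else:
            output.append(tuple(0.0 for _ in frame.samples))
    return output
```

Passing the target categories as an explicit control object mirrors the separation in the text: the classifier decides what to keep, and the noise reduction stage only acts on that decision.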
Based on the foregoing architecture, the method for improving directional noise suppression in an electronic device having a microphone array composed of multiple microphones according to an embodiment of the present application may further include:
a target sound category in the direction of interest is obtained.
In the embodiments of the present application, the target sound category is a sound category that must not be filtered out, that is, a sound category that needs to be retained. The sound categories may include, but are not limited to, speech, music, baby crying and/or laughing, animal calls, and so on.
Accordingly, generating an audio signal based on the processing of the captured audio signal by the noise reduction engine may include:
Processing the collected audio signal based on the sound model corresponding to the target sound category, and extracting from it the sound signals that match the characteristics of the target sound category as the audio signal.
In the embodiments of the present application, a plurality of sound models are established in advance. The sound models may be neural network models or other types of models, for example models built on digital signal processing (DSP).
Each sound model corresponds to at least one sound category, and different sound models may correspond to the same or different numbers of sound categories.
The sound categories corresponding to different sound models may be partially different, or the sound categories corresponding to different sound models may be completely different.
Each sound model can extract the audio signals of the corresponding sound category from the collected audio signals, and attenuate or filter the audio signals of other sound categories except the corresponding sound category.
By way of example, the sound models in the embodiments of the present application may include, but are not limited to, at least some of the following: a model that retains only speech; only music; only animal calls; only baby crying and/or laughing; only natural sounds (wind, rain, thunder, etc.); only speech and music; only speech and baby crying and/or laughing; only speech and animal calls; only music and baby crying and/or laughing; only natural sounds and animal calls; and so on.
The following takes a sound model that retains only music and a sound model that retains only speech and animal calls as examples. A sound model that retains only music filters out every sound other than music as noise when it denoises the collected audio signal, so the noise-reduced audio signal contains music and nothing else. Similarly, a sound model that retains only speech and animal calls filters out every sound other than speech and animal calls as noise, so the noise-reduced audio signal contains only speech and animal calls.
The target sound category may comprise at least one sound category, i.e. the target sound category may comprise one sound category, or the target sound category may comprise a plurality of sound categories (i.e. at least two sound categories).
When each sound model corresponds to only one sound category and the target sound category includes several sound categories, the sound models corresponding to the target sound category are the models for each of the included categories; that is, several sound models correspond to the target sound category.
When a sound model corresponds to several sound categories and the target sound category includes several sound categories, only one sound model corresponds to the target sound category if the included categories are exactly the categories of a single sound model; otherwise, at least two sound models correspond to the target sound category.
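The selection rule above can be sketched as follows. The text only states when one model suffices and when several are needed; the greedy way of combining models used here is an assumption, as are the function name and the representation of models as a dictionary from model name to the set of categories it retains.

```python
def select_models(models, target_categories):
    """Return the names of the sound models to use for the target categories.

    models maps a model name to the set of categories that model retains.
    """
    target = frozenset(target_categories)

    # One model is enough when its categories are exactly the target categories.
    for name, categories in models.items():
        if frozenset(categories) == target:
            return [name]

    # Otherwise combine models that retain nothing outside the target
    # (greedy choice, largest first; this strategy is an assumption).
    candidates = sorted(
        (name for name, categories in models.items() if frozenset(categories) <= target),
        key=lambda name: (-len(models[name]), name),
    )
    chosen, covered = [], set()
    for name in candidates:
        if not set(models[name]) <= covered:
            chosen.append(name)
            covered |= set(models[name])
        if covered == target:
            return chosen
    raise ValueError("no combination of sound models covers the target categories")
```

For example, with single-category models for speech, music, and animal calls plus a combined speech-and-music model, a target of speech and music resolves to the combined model alone, while a target of speech and animal calls resolves to two models.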
In an alternative embodiment, a flowchart for obtaining the target sound category in the attention direction is shown in fig. 3, and may include:
Step S301: Identify the collected audio signal to obtain at least one sound category corresponding to it.
The collected audio signal may be identified by a sound recognition engine to obtain the sound categories it contains. The sound recognition engine may be a pre-trained neural network model that recognizes the sound category corresponding to an audio signal.
Step S302: a selection operation for a selection object displaying each of the at least one sound category is obtained.
After at least one sound category corresponding to the collected audio signal is identified, an interactive interface can be displayed, a plurality of selection objects can be displayed on the interactive interface, each selection object represents one sound category, and a user can select the sound category to be reserved according to own needs. The user may perform a selection operation on a selection object to be selected, and the selection operation may include, but is not limited to, any of the following: single or double click, etc.
Step S303: a target sound category is determined based on the selection operation.
The sound class characterized by the selection object for which the selection operation is directed may be determined as the target sound class.
In another optional embodiment, a flowchart for obtaining the target sound category in the direction of interest is shown in FIG. 4, and may include:
Step S401: Capture an image through a camera of the electronic device, where the capture direction of the camera is the direction of interest.
When only audio is being recorded, the image captured by the camera need not be displayed; when video is being recorded, the captured image may be the image that the electronic device captures and displays in real time.
Step S402: Identify the capture objects in the captured image and determine the category of each capture object, where there may be one or more capture objects.
In the present application, identifying a captured object in a captured image refers to identifying which classes of objects are in the captured image.
The categories of the acquisition object may include, but are not limited to: non-infants, animals, musical instruments, vehicles, toys, and the like.
The class of the captured object may be determined by identifying the captured object in the captured image based on an image recognition engine, which may be a pre-trained neural network model for identifying the class of the object in the image.
Step S403: a selection operation for an acquisition object is obtained.
After the category of each capture object is determined, an interactive interface can be displayed that shows the identified capture objects, so that the user can select the capture object corresponding to the sound category to be retained. The user performs a preset selection operation (such as a single click or double click) on a capture object to select it.
Step S404: the target sound category is determined based on the selection operation.
The collection object selected by the user may be determined according to the selection operation, and then the sound category corresponding to the collection object selected by the user may be determined as the target sound category.
In the embodiment of the application, the corresponding relation between the collection object and the sound type is stored in advance, and after the collection object selected by a user is determined, the target sound type is determined according to the corresponding relation.
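The pre-stored correspondence can be pictured as a lookup table. The entries below are illustrative assumptions and are not taken from the application, and the function name is hypothetical.

```python
# Hypothetical correspondence between capture-object categories and sound categories.
OBJECT_TO_SOUND = {
    "infant": "baby crying/laughing",
    "animal": "animal calls",
    "musical instrument": "music",
}


def target_sound_categories(selected_objects, mapping=OBJECT_TO_SOUND):
    """Map the capture objects the user selected to the sound categories to retain."""
    result = []
    for obj in selected_objects:
        category = mapping.get(obj)
        # Skip objects with no known sound category and avoid duplicates.
        if category is not None and category not in result:
            result.append(category)
    return result
```

Objects without an entry, such as a toy that makes no sound of interest, simply contribute no target category in this sketch.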
In an optional embodiment, after the categories of the capture objects are determined, the interactive interface may display each capture object together with its corresponding sound category so that the user can make a selection. Each capture object and its sound category are displayed in association, for example close to each other or in the same color.
In an optional embodiment, in the case of recording a video, the interface displayed in real time may be directly determined as an interactive interface, and the corresponding sound category is marked on the interactive interface for each recognized collection object.
In an alternative embodiment, one implementation manner of obtaining the attention direction may be:
If the target camera of the electronic device is turned on, the capture direction of the target camera is the direction of interest.
When video is being recorded, the capture direction of the camera that is turned on can be taken as the direction of interest. When the electronic device has several cameras (for example, a front camera and a rear camera), the device knows which camera is turned on. The orientation of the device itself (in the world coordinate system) and the orientation of each camera on the device are also known, so the device can determine the orientation of each camera (in the world coordinate system).
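Combining the device orientation with the camera's orientation on the device can be sketched in one dimension. This simplification considers only the yaw angle in the horizontal plane; the offsets assigned to the front and rear cameras and the function name are assumptions made for illustration.

```python
# Hypothetical yaw of each camera relative to the device's own heading, in degrees.
CAMERA_OFFSETS_DEG = {"front": 0.0, "rear": 180.0}


def direction_of_interest(device_yaw_deg, active_camera):
    """World-frame yaw of the camera that is turned on, in the range [0, 360)."""
    return (device_yaw_deg + CAMERA_OFFSETS_DEG[active_camera]) % 360.0
```

A full treatment would compose three-dimensional rotations rather than add yaw angles, but the principle is the same: device orientation in the world frame plus camera orientation in the device frame gives camera orientation in the world frame.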
Alternatively, whether recording pure audio or video, the direction of interest may be obtained by a microphone array. Optionally, as shown in fig. 5, another implementation flowchart for obtaining the attention direction may include:
Step S501: Locate the positions of the sound sources in the space around the electronic device by means of the microphone array.
Existing localization methods can be used for this step. Because localization is not the focus of this scheme, it is not described in detail here.
Step S502: the position of the sound source is displayed.
In the embodiment of the application, the interactive interface can be displayed, and the positions of the sound sources are displayed in the interactive interface according to the relative position relationship between the sound sources and the electronic equipment, so that a user can conveniently select the position where the sound source needing to keep the sound is located.
Step S503: A sound source at a target position is selected, and the direction of the target position relative to the electronic device is taken as the direction of interest.
After the user selects the target position, the direction of the target position relative to the electronic device is taken as the attention direction.
Alternatively, if there is only one position of the sound source located in step S501, step S502 to step S503 may not be executed, and the direction of the located position with respect to the electronic device may be directly used as the attention direction. Based on this, after step S501 is executed, the number of positions of the recognized sound sources may be determined, and if there is only one position, the direction of the located position relative to the electronic device is directly used as the attention direction, otherwise, steps S502 to S503 are executed.
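The branch between a single located source and several located sources can be sketched as follows. Source positions are assumed to be two-dimensional coordinates relative to the device, and the `choose` callback stands in for displaying the positions and obtaining the user's selection; both are assumptions made for illustration.

```python
import math


def bearing_deg(position):
    """Direction of a position (x, y) relative to the device at the origin, in [0, 360)."""
    x, y = position
    return math.degrees(math.atan2(y, x)) % 360.0


def direction_from_sources(source_positions, choose):
    """Return the direction of interest from the located sound sources.

    choose(positions) returns the index the user picked; it is only called
    when more than one source was located (steps S502 and S503).
    """
    if not source_positions:
        raise ValueError("no sound source was located")
    if len(source_positions) == 1:
        target = source_positions[0]
    else:
        target = source_positions[choose(source_positions)]
    return bearing_deg(target)
```

With a single source the callback is never invoked, which corresponds to skipping steps S502 and S503.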
Optionally, the method for improving directional noise suppression in an electronic device having a microphone array composed of multiple microphones according to the embodiment of the present application may further include:
When video recording/sound recording is started, the receive beam of the microphone array is formed based on the direction of interest and the audio signal is obtained through the noise reduction engine.
That is, the method for improving directional noise suppression in an electronic device having a microphone array composed of multiple microphones according to the embodiments of the present application may be started when video recording or sound recording is started.
Optionally, the embodiment shown in Fig. 1, or the refinements of its steps, may be executed when a sound recording instruction or a video recording instruction is obtained, and recording of the audio or video may begin after those steps have been executed.
Optionally, during recording of the audio or video, it may be detected whether the direction of interest has changed; if a change is detected, steps S102 to S104 above, or steps S102 to S104 together with their refinements, are executed again. A change in the direction of interest may include, but is not limited to: switching between cameras facing different directions on the electronic device; a change in the relative position between the photographed subject (the sound source) and the electronic device; and the like.
Optionally, the embodiment shown in Fig. 1 and the refinements of its steps may be executed when a sound recording instruction or a video recording instruction is obtained, and recording of the audio or video may begin after the steps have been executed. During recording, it is detected whether the direction of interest has changed; if a change is detected, steps S102 to S104, or steps S102 to S104 together with their refinements, are executed again based on the changed direction of interest.
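The re-execution on a change of the direction of interest can be illustrated with a small monitoring loop. This is a sketch under assumptions: the application does not say how a change is detected, so the angular tolerance, the stream of direction samples, and the `reconfigure` callback (standing in for re-running steps S102 to S104) are hypothetical.

```python
from typing import Callable, Iterable, List, Optional


def angular_difference(a_deg: float, b_deg: float) -> float:
    """Smallest absolute difference between two azimuths, in degrees [0, 180]."""
    return abs((a_deg - b_deg + 180.0) % 360.0 - 180.0)


def track_direction(
    direction_samples: Iterable[float],
    reconfigure: Callable[[float], None],
    tolerance_deg: float = 5.0,  # assumed threshold, not taken from the application
) -> List[float]:
    """Re-run the beam forming steps whenever the direction of interest changes.

    `direction_samples` stands for the direction observed over time during a
    recording. `reconfigure` is called for the first sample and again each
    time the direction moves by more than `tolerance_deg` from the direction
    currently in use. The directions that were applied are returned.
    """
    applied: List[float] = []
    current: Optional[float] = None
    for direction in direction_samples:
        if current is None or angular_difference(direction, current) > tolerance_deg:
            reconfigure(direction)  # stands in for re-executing S102 to S104
            current = direction
            applied.append(direction)
    return applied
```

A tolerance is used here so that small jitter in the detected direction does not trigger a re-configuration on every sample; a camera switch would typically show up as a large jump.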
Optionally, the method for improving directional noise suppression in an electronic device having a microphone array composed of a plurality of microphones provided by the embodiments of the present application may further include:
starting a video call or voice call, and obtaining an audio signal by forming the receive beam of the microphone array based on the direction of interest and by means of the noise reduction engine.
That is, the method for improving directional noise suppression in an electronic device having a microphone array composed of a plurality of microphones provided by the embodiments of the present application may be started when a video call or voice call is started.
Optionally, the embodiment shown in Fig. 1, or the refinements of its steps, may be executed when a video call start instruction or a voice call start instruction is obtained, and the video call or voice call is started after those steps have been executed.
Optionally, during the video call or voice call, it may be detected whether the direction of interest has changed; if a change is detected, steps S102 to S104 above, or steps S102 to S104 together with their refinements, are executed again. A change in the direction of interest may include, but is not limited to: switching between cameras facing different directions on the electronic device; a change in the relative position between the photographed subject (the sound source) and the electronic device; and the like.
Optionally, the embodiment shown in Fig. 1 and the refinements of its steps may be executed when a video call start instruction or a voice call start instruction is obtained, and the video call or voice call is started after the steps have been executed. During the video call or voice call, it is detected whether the direction of interest has changed; if a change is detected, steps S102 to S104, or steps S102 to S104 together with their refinements, are executed again based on the changed direction of interest.
Fig. 6 compares the noise reduction effect of the prior-art beamforming-based sound noise reduction scheme with that of the scheme of the present application, as provided by the embodiments of the present application. Panel (a) illustrates the noise reduction effect of the prior-art beamforming-based scheme, and panel (b) illustrates the noise reduction effect of the scheme of the present application. The prior-art beamforming-based scheme can filter out only the noise outside the beam, so the noise inside the beam remains. The scheme of the present application filters out the noise inside the beam in addition to the noise outside it. Moreover, in the present application, different noise inside the beam can be filtered out according to different user requirements.
Corresponding to the method embodiments, an embodiment of the present application further provides an apparatus for improving directional noise suppression in an electronic device having a microphone array composed of a plurality of microphones. A schematic structural diagram of the apparatus is shown in Fig. 7, and the apparatus may include:
an obtaining module 701, a beam forming module 702, an acquisition module 703, and a generating module 704, wherein:
the obtaining module 701 is configured to obtain a direction of interest;
the beam forming module 702 is configured to form a receive beam of the microphone array based on the direction of interest, so as to focus on the direction of interest and suppress signals from sound sources outside the direction of interest;
the acquisition module 703 is configured to collect an audio signal based on the formed receive beam;
the generating module 704 is configured to process the collected audio signal with a noise reduction engine to generate an audio signal, where the noise reduction engine is configured to filter out noise signals in the collected audio signal.
With the apparatus for improving directional noise suppression in an electronic device having a microphone array composed of a plurality of microphones provided by the embodiments of the present application, after the audio signal is collected based on the formed receive beam, the collected audio signal is further denoised by the noise reduction engine, so that noise inside the receive beam is attenuated and the audio recording effect is improved.
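The four-stage flow (direction of interest, receive beam, collection, noise reduction engine) can be sketched as below. The application does not name a beamforming algorithm; integer-sample delay-and-sum on a linear array is swapped in here purely for illustration. The far-field plane-wave model, the microphone coordinates, the function names, and the pluggable `noise_reduction_engine` callable are all assumptions.

```python
import math
from typing import Callable, List, Sequence


def steering_delays(
    mic_x: Sequence[float],       # assumed: microphone positions on one axis, metres
    azimuth_deg: float,           # direction of interest, 0 degrees along that axis
    sample_rate: float,
    speed_of_sound: float = 343.0,
) -> List[int]:
    """Per-microphone delays (in samples) that align a far-field source."""
    cos_theta = math.cos(math.radians(azimuth_deg))
    # A microphone further along the arrival direction hears the wave earlier.
    advance = [x * cos_theta / speed_of_sound * sample_rate for x in mic_x]
    base = min(advance)
    return [round(a - base) for a in advance]


def delay_and_sum(channels: Sequence[Sequence[float]], delays: Sequence[int]) -> List[float]:
    """Delay each channel by its steering delay and average the channels."""
    length = len(channels[0])
    output = []
    for n in range(length):
        total = 0.0
        for channel, delay in zip(channels, delays):
            if n - delay >= 0:
                total += channel[n - delay]
        output.append(total / len(channels))
    return output


def generate_audio(
    channels: Sequence[Sequence[float]],
    mic_x: Sequence[float],
    direction_of_interest_deg: float,
    sample_rate: float,
    noise_reduction_engine: Callable[[List[float]], List[float]],
) -> List[float]:
    """Form the beam, collect the beamformed signal, then run the engine on it."""
    delays = steering_delays(mic_x, direction_of_interest_deg, sample_rate)
    collected = delay_and_sum(channels, delays)
    return noise_reduction_engine(collected)
```

Signals arriving from the steered direction add coherently, while signals from other directions are averaged out of phase and attenuated. Whatever noise remains inside the beam is left to the engine passed in as the last argument.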
In an optional embodiment, the apparatus may further include:
a category obtaining module, configured to obtain a target sound category in the direction of interest;
the generating module 704 is specifically configured to process the collected audio signal based on the sound model corresponding to the target sound category, and to extract, as the audio signal, the sound signal in the collected audio signal that matches the features of the target sound category.
In an optional embodiment, each sound model corresponds to at least one sound category, the sound categories corresponding to different sound models are at least partially different, and the target sound category includes one or more sound categories.
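The relationship between sound models and sound categories can be pictured as a lookup from each model to the set of categories it covers. The model names and category labels below are hypothetical examples chosen for illustration; the application does not list concrete models.

```python
from typing import Dict, List, Set

# Hypothetical registry: each model covers at least one category, and the
# category sets of different models are at least partially different.
SOUND_MODELS: Dict[str, Set[str]] = {
    "human_voice_model": {"speech", "singing"},
    "animal_model": {"dog_bark", "bird_song"},
    "instrument_model": {"piano", "singing"},
}


def models_for_target(target_categories: Set[str], models: Dict[str, Set[str]]) -> List[str]:
    """Return the names of the models that cover at least one target category."""
    return sorted(name for name, covered in models.items() if covered & target_categories)
```

For a target sound category made up of several categories, more than one model may be selected, for example `models_for_target({"speech", "dog_bark"}, SOUND_MODELS)` picks both the voice model and the animal model in this made-up registry.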
In an optional embodiment, the category obtaining module may include:
a sound recognition module, configured to recognize the collected audio signal and obtain at least one sound category corresponding to the collected audio signal;
a first selection module, configured to obtain a selection operation on a selection object that displays each of the at least one sound category;
a first determination module, configured to determine the target sound category based on the selection operation.
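The step from a selection operation to a target sound category can be sketched as follows. It assumes, hypothetically, that the recognized categories are shown as an ordered list and that the selection operation is reported as the indices the user picked; neither detail comes from the application.

```python
from typing import List, Sequence


def determine_target_categories(
    recognized: Sequence[str],
    selected_indices: Sequence[int],
) -> List[str]:
    """Map the user's selection onto the recognized sound categories.

    `recognized` holds the categories identified in the collected audio
    signal, in display order. Duplicate selections are ignored and the
    order of first selection is preserved.
    """
    targets: List[str] = []
    for index in selected_indices:
        if not 0 <= index < len(recognized):
            raise IndexError(f"selection {index} is outside the displayed categories")
        category = recognized[index]
        if category not in targets:
            targets.append(category)
    if not targets:
        raise ValueError("no sound category was selected")
    return targets
```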
In an optional embodiment, the category obtaining module may include:
an image acquisition module, configured to acquire a captured image through a camera of the electronic device, the capture direction of the camera being the direction of interest;
an image recognition module, configured to recognize the captured objects in the captured image and determine the category of each captured object, there being one or more captured objects;
a second selection module, configured to obtain a selection operation on a captured object;
a second determination module, configured to determine the target sound category based on the selection operation.
In an optional embodiment, the obtaining module 701 may be configured to:
if a target camera of the electronic device is turned on, take the capture direction of the target camera as the direction of interest;
or,
locate, through the microphone array, the positions of the sound sources within the space in which the electronic device is located; display the positions of the sound sources; and select the sound source at a target position, the direction of the target position relative to the electronic device being the direction of interest.
In an optional embodiment, the apparatus may further include:
a recording start module, configured to start video recording or sound recording, so that the obtaining module 701, the beam forming module 702, the acquisition module 703, and the generating module 704 execute their corresponding functions.
In an optional embodiment, the apparatus may further include:
a call start module, configured to start a video call or voice call, so that the obtaining module 701, the beam forming module 702, the acquisition module 703, and the generating module 704 execute their corresponding functions.
Corresponding to the method embodiments, the present application further provides an electronic device, a schematic structural diagram of which is shown in Fig. 8. The electronic device may include: at least one processor 1, at least one communication interface 2, at least one memory 3, and at least one communication bus 4.
In the embodiments of the present application, there is at least one each of the processor 1, the communication interface 2, the memory 3, and the communication bus 4, and the processor 1, the communication interface 2, and the memory 3 communicate with one another through the communication bus 4.
The processor 1 may be a central processing unit (CPU), an application-specific integrated circuit (ASIC), or one or more integrated circuits configured to implement the embodiments of the present application, or the like.
The memory 3 may include high-speed RAM, and may further include non-volatile memory, such as at least one disk memory.
The memory 3 stores a program, and the processor 1 may call the program stored in the memory 3, the program being configured to:
obtain a direction of interest;
form a receive beam of the microphone array based on the direction of interest, so as to focus on the direction of interest and suppress signals from sound sources outside the direction of interest;
collect an audio signal based on the formed receive beam;
process the collected audio signal with a noise reduction engine to generate an audio signal, where the noise reduction engine is configured to filter out noise signals in the collected audio signal.
Optionally, for the detailed functions and extended functions of the program, refer to the description above.
Embodiments of the present application further provide a storage medium, which may store a program suitable for execution by a processor, the program being configured to:
obtain a direction of interest;
form a receive beam of the microphone array based on the direction of interest, so as to focus on the direction of interest and suppress signals from sound sources outside the direction of interest;
collect an audio signal based on the formed receive beam;
process the collected audio signal with a noise reduction engine to generate an audio signal, where the noise reduction engine is configured to filter out noise signals in the collected audio signal.
Optionally, for the detailed functions and extended functions of the program, refer to the description above.
Those of ordinary skill in the art will appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware or combinations of computer software and electronic hardware. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the implementation. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present application.
In the several embodiments provided in the present application, it should be understood that the disclosed system, apparatus and method may be implemented in other ways. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, devices or units, and may be in an electrical, mechanical or other form.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit.
It should be understood that the features of the embodiments and of the claims may be combined with one another to solve the technical problems described above.
If the functions are implemented in the form of software functional units and sold or used as a stand-alone product, they may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present application, or the part of it that contributes over the prior art, or a part of the technical solution, may be embodied in the form of a software product that is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the methods of the embodiments of the present application. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disk.
The previous description of the disclosed embodiments is provided to enable any person skilled in the art to make or use the present application. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the application. Thus, the present application is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.
Claims (10)
1. A method of enhancing directional noise suppression in an electronic device having a microphone array comprised of a plurality of microphones, the method comprising:
obtaining a direction of interest;
forming a receive beam of the microphone array based on the direction of interest to focus on the direction of interest and suppress signals of sound sources other than the direction of interest;
collecting an audio signal based on the formed receive beam;
and processing the collected audio signal based on a noise reduction engine to generate an audio signal, wherein the noise reduction engine is used for filtering out noise signals in the collected audio signal.
2. The method of claim 1, further comprising:
obtaining a target sound category in the direction of interest;
wherein the processing the collected audio signal based on a noise reduction engine to generate an audio signal comprises:
processing the collected audio signal based on the sound model corresponding to the target sound category, and extracting, as the audio signal, the sound signal in the collected audio signal that matches the features of the target sound category.
3. The method of claim 2, wherein each sound model corresponds to at least one sound category, the sound categories corresponding to different sound models are at least partially different, and the target sound category comprises one or more sound categories.
4. The method of claim 3, wherein the obtaining a target sound category in the direction of interest comprises:
recognizing the collected audio signal and obtaining at least one sound category corresponding to the collected audio signal;
obtaining a selection operation on a selection object that displays each of the at least one sound category;
determining the target sound category based on the selection operation.
5. The method of claim 3, wherein the obtaining a target sound category in the direction of interest comprises:
acquiring a captured image through a camera of the electronic device, wherein the capture direction of the camera is the direction of interest;
recognizing the captured objects in the captured image and determining the category of each captured object, wherein there are one or more captured objects;
obtaining a selection operation on a captured object;
determining the target sound category based on the selection operation.
6. The method of claim 1, wherein the obtaining a direction of interest comprises:
if a target camera of the electronic device is turned on, taking the capture direction of the target camera as the direction of interest;
or,
locating, through the microphone array, the positions of the sound sources within the space in which the electronic device is located;
displaying the positions of the sound sources;
selecting the sound source at a target position, the direction of the target position relative to the electronic device being the direction of interest.
7. The method of claim 6, further comprising:
starting video recording or sound recording, and obtaining an audio signal by forming the receive beam of the microphone array based on the direction of interest and by means of the noise reduction engine;
or,
initiating a video call or voice call, and obtaining an audio signal by forming the receive beam of the microphone array based on the direction of interest and by means of the noise reduction engine.
8. An apparatus for enhancing directional noise suppression in an electronic device having a microphone array comprised of a plurality of microphones, the apparatus comprising:
an obtaining module for obtaining a direction of interest;
a beam forming module for forming a receive beam of the microphone array based on the direction of interest, so as to focus on the direction of interest and suppress signals of sound sources other than the direction of interest;
an acquisition module for collecting an audio signal based on the formed receive beam;
a generating module for processing the collected audio signal based on a noise reduction engine to generate an audio signal, wherein the noise reduction engine is used for filtering out noise signals in the collected audio signal.
9. An electronic device, comprising:
a memory for storing a program;
a processor for calling and executing the program in the memory, the program, when executed, implementing the steps of the method of enhancing directional noise suppression in an electronic device having a microphone array comprised of a plurality of microphones according to any one of claims 1-7.
10. A readable storage medium on which a computer program is stored which, when being executed by a processor, carries out the steps of the information processing method according to any one of claims 1 to 7.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111045463.6A CN113689873A (en) | 2021-09-07 | 2021-09-07 | Noise suppression method, device, electronic equipment and storage medium |
Publications (1)
Publication Number | Publication Date |
---|---|
CN113689873A (en) | 2021-11-23 |
Family
ID=78585561
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202111045463.6A Pending CN113689873A (en) | 2021-09-07 | 2021-09-07 | Noise suppression method, device, electronic equipment and storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113689873A (en) |
Patent Citations (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20160064002A1 (en) * | 2014-08-29 | 2016-03-03 | Samsung Electronics Co., Ltd. | Method and apparatus for voice recording and playback |
CN111724823A (en) * | 2016-03-29 | 2020-09-29 | 联想(北京)有限公司 | Information processing method and device and electronic equipment |
CN107360387A (en) * | 2017-07-13 | 2017-11-17 | 广东小天才科技有限公司 | Video recording method and device and terminal equipment |
CN110556113A (en) * | 2018-05-15 | 2019-12-10 | 上海博泰悦臻网络技术服务有限公司 | Vehicle control method based on voiceprint recognition and cloud server |
CN109874096A (en) * | 2019-01-17 | 2019-06-11 | 天津大学 | A kind of ears microphone hearing aid noise reduction algorithm based on intelligent terminal selection output |
CN110781850A (en) * | 2019-10-31 | 2020-02-11 | 深圳金信诺高新技术股份有限公司 | Semantic segmentation system and method for road recognition, and computer storage medium |
CN111078185A (en) * | 2019-12-26 | 2020-04-28 | 珠海格力电器股份有限公司 | Method and equipment for recording sound |
CN111292510A (en) * | 2020-01-16 | 2020-06-16 | 广州华铭电力科技有限公司 | Recognition early warning method for urban cable damaged by external force |
US20200213728A1 (en) * | 2020-03-10 | 2020-07-02 | Intel Corportation | Audio-based detection and tracking of emergency vehicles |
CN112289326A (en) * | 2020-12-25 | 2021-01-29 | 浙江弄潮儿智慧科技有限公司 | Bird identification comprehensive management system with noise removal function and noise removal method thereof |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114863943A (en) * | 2022-07-04 | 2022-08-05 | 杭州兆华电子股份有限公司 | Self-adaptive positioning method and device for environmental noise source based on beam forming |
CN114863943B (en) * | 2022-07-04 | 2022-11-04 | 杭州兆华电子股份有限公司 | Self-adaptive positioning method and device for environmental noise source based on beam forming |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||