CN111863012A - Audio signal processing method and device, terminal and storage medium - Google Patents

Publication number
CN111863012A
Authority
CN
China
Prior art keywords
sound source
source signals
directions
null
original sound
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010763471.3A
Other languages
Chinese (zh)
Inventor
李炯亮
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Xiaomi Pinecone Electronic Co Ltd
Original Assignee
Beijing Xiaomi Pinecone Electronic Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Xiaomi Pinecone Electronic Co Ltd filed Critical Beijing Xiaomi Pinecone Electronic Co Ltd
Priority to CN202010763471.3A priority Critical patent/CN111863012A/en
Publication of CN111863012A publication Critical patent/CN111863012A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02Speech enhancement, e.g. noise reduction or echo cancellation
    • G10L21/0272Voice signal separating
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02Speech enhancement, e.g. noise reduction or echo cancellation
    • G10L21/0208Noise filtering

Landscapes

  • Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Quality & Reliability (AREA)
  • Signal Processing (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Circuit For Audible Band Transducer (AREA)
  • Obtaining Desirable Characteristics In Audible-Bandwidth Transducers (AREA)

Abstract

The present disclosure relates to an audio signal processing method, an audio signal processing apparatus, a terminal, and a storage medium, wherein the audio signal processing method includes: converting original sound source signals received by at least two microphones into original sound source signals in a plurality of wave beam directions; the original sound source signal in one wave beam direction has a null direction, and the null directions of the original sound source signals in different wave beam directions are different; superposing original sound source signals in a plurality of wave beam directions based on a null direction to obtain at least two preprocessed sound source signals; wherein at least one of the preprocessed sound source signals has at least two null directions that suppress interference. The method can inhibit interference or reduce reverberation, and improve the quality of the audio signal of the target sound source.

Description

Audio signal processing method and device, terminal and storage medium
Technical Field
The present disclosure relates to the field of communications technologies, and in particular, to an audio signal processing method, an audio signal processing apparatus, a terminal, and a storage medium.
Background
In the related art, a smart device may improve the quality of an audio signal by applying a beamforming technique to the signals of a plurality of microphones. When the number of microphones is limited, the gain of the beam is also limited, and it is difficult to obtain an ideal spatial gain, so the quality of the audio signal cannot be improved well. Moreover, in a scene with a plurality of interfering sound sources, the interference includes not only the signals emitted by the interfering sources but also the reverberant signals produced by reflection, refraction, etc. at walls and other surfaces, which makes it even more difficult for the device to obtain a high-quality audio signal of the target sound source.
Disclosure of Invention
The present disclosure provides an audio signal processing method, apparatus, terminal and storage medium.
According to a first aspect of embodiments of the present disclosure, there is provided an audio signal processing method, the method comprising:
converting original sound source signals received by at least two microphones into original sound source signals in a plurality of wave beam directions; the original sound source signal in one wave beam direction has a null direction, and the null directions of the original sound source signals in different wave beam directions are different;
superposing original sound source signals in a plurality of wave beam directions based on a null direction to obtain at least two preprocessed sound source signals; wherein at least one of the preprocessed sound source signals has at least two null directions that suppress interference.
In the above scheme, the method further comprises:
and performing blind source separation on at least two preprocessed sound source signals to obtain at least two target sound source signals.
In the above scheme, the method further comprises:
and obtaining at least two pieces of corrected target sound source information based on the at least two target sound source signals and their respective weight coefficients.
In the foregoing solution, the converting original sound source signals received by at least two microphones into original sound source signals in multiple beam directions includes:
converting the original sound source signal into original sound source signals of a plurality of beam directions each pointing to a direction of a target sound source;
the superimposing original sound source signals of a plurality of beam directions based on a null direction to obtain at least two of the preprocessed sound source signals comprises:
superposing, based on the null directions, original sound source signals in at least two beam directions among the original sound source signals in the plurality of beam directions to obtain one preprocessed sound source signal with at least two null directions and at least one preprocessed sound source signal with one null direction; or,
and superposing original sound source signals in at least two beam directions in the original sound source signals in the plurality of beam directions based on the null directions to obtain at least two preprocessed sound source signals with at least two null directions.
In the foregoing solution, the superimposing original sound source signals in at least two beam directions from among original sound source signals in a plurality of beam directions based on a null direction to obtain at least two preprocessed sound source signals with at least two null directions includes:
dividing original sound source signals in a plurality of wave beam directions into at least two parts;
and respectively superposing the original sound source signals in at least two wave beam directions based on the null direction to obtain at least two preprocessed sound source signals with at least two null directions.
In the above solution, the converting original sound source signals received by at least two microphones into original sound source signals in a plurality of beam directions includes:
converting the original sound source signal into original sound source signals in a plurality of wave beam directions pointing to fixed directions, wherein the fixed directions corresponding to the original sound source signals in different wave beam directions are different; the fixed direction corresponding to the original sound source signal in one wave beam direction is the direction pointing to the target sound source;
the superimposing original sound source signals of a plurality of beam directions based on a null direction to obtain at least two of the preprocessed sound source signals comprises:
superimposing original sound source signals of at least two beam directions, the fixed directions of which are not directions pointing to a target sound source, based on null directions to obtain at least one preprocessed sound source signal having at least two null directions;
an original sound source signal of which the fixed direction is a beam direction directed to the direction of the target sound source is taken as one of the preprocessed sound source signals.
In the above scheme, if the at least two microphones form a linear array, one null direction of the preprocessed sound source signal means that the two opposite phase angles along one direction are both null directions;
if the at least two microphones form a circular array, one null direction of the preprocessed sound source signal means that a single phase angle along one direction is a null direction.
According to a second aspect of embodiments of the present disclosure, there is provided an audio signal processing apparatus, the apparatus comprising:
the conversion module is used for converting original sound source signals received by at least two microphones into original sound source signals in a plurality of wave beam directions; the original sound source signal in one wave beam direction has a null direction, and the null directions of the original sound source signals in different wave beam directions are different;
the superposition module is used for superposing the original sound source signals in the multiple wave beam directions on the basis of the null direction so as to obtain at least two preprocessed sound source signals; wherein at least one of the preprocessed sound source signals has at least two null directions that suppress interference.
In the above scheme, the apparatus further comprises:
and the separation module is used for carrying out blind source separation on the at least two preprocessed sound source signals so as to obtain at least two target sound source signals.
In the above scheme, the apparatus further comprises:
and the correcting module is used for obtaining at least two corrected target sound source information based on at least two target sound source signals and respective weight coefficients.
In the foregoing solution, the converting module is configured to convert the original sound source signal into original sound source signals in a plurality of beam directions all pointing to a direction of a target sound source;
the superposition module is configured to superpose, based on the null directions, original sound source signals in at least two beam directions among the original sound source signals in the plurality of beam directions, so as to obtain one preprocessed sound source signal with at least two null directions and at least one preprocessed sound source signal with one null direction; or,
and superposing original sound source signals in at least two beam directions in the original sound source signals in the plurality of beam directions based on the null directions to obtain at least two preprocessed sound source signals with at least two null directions.
In the above scheme, the superposition module is configured to equally divide original sound source signals in multiple beam directions into at least two parts; and respectively superposing the original sound source signals in at least two wave beam directions based on the null direction to obtain at least two preprocessed sound source signals with at least two null directions.
In the foregoing solution, the converting module is configured to convert the original sound source signal into original sound source signals in a plurality of beam directions pointing to fixed directions, where the fixed directions corresponding to the original sound source signals in different beam directions are different; the fixed direction corresponding to the original sound source signal in one wave beam direction is the direction pointing to the target sound source;
the superposition module is used for superposing original sound source signals of at least two wave beam directions of which the fixed directions are not the directions pointing to a target sound source on the basis of the null directions so as to obtain at least one preprocessed sound source signal with at least two null directions; an original sound source signal of which the fixed direction is a beam direction directed to the direction of the target sound source is taken as one of the preprocessed sound source signals.
In the above scheme, if the at least two microphones form a linear array, one null direction of the preprocessed sound source signal means that the two opposite phase angles along one direction are both null directions;
if the at least two microphones form a circular array, one null direction of the preprocessed sound source signal means that a single phase angle along one direction is a null direction.
According to a third aspect of the embodiments of the present disclosure, there is provided a terminal, including:
a processor;
a memory for storing processor-executable instructions;
wherein the processor is configured to: when the executable instructions are executed, the audio signal processing method according to any embodiment of the disclosure is realized.
According to a fourth aspect of embodiments of the present disclosure, there is provided a computer-readable storage medium storing an executable program, wherein the executable program, when executed by a processor, implements the audio signal processing method according to any of the embodiments of the present disclosure.
The technical scheme provided by the embodiment of the disclosure can have the following beneficial effects:
in the embodiment of the disclosure, original sound source signals received by at least two microphones may be converted into original sound source signals in a plurality of beam directions, where the original sound source signal in each beam direction has one null direction; the original sound source signals in the plurality of beam directions are then superposed based on the null directions to obtain at least two preprocessed sound source signals, at least one of which has at least two null directions for suppressing interference. Thus, the embodiment of the present disclosure can suppress interfering sound sources from at least two directions, so that the influence of interference or reverberation on the target sound source can be greatly reduced, and the quality of the audio signal of the target sound source can be improved.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the disclosure.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the invention and together with the description, serve to explain the principles of the invention.
Fig. 1 is a schematic diagram illustrating an application scenario of an audio signal processing method.
Fig. 2 is a flow chart illustrating a method of audio signal processing according to an exemplary embodiment.
Fig. 3 is a beam schematic of an original sound source signal illustrating one beam direction according to an exemplary embodiment.
Fig. 4 is a beam schematic of an original sound source signal illustrating one beam direction according to an exemplary embodiment.
Fig. 5 is a beam schematic of an original sound source signal illustrating one beam direction according to an exemplary embodiment.
Fig. 6 is a beam schematic of an original sound source signal for one beam direction shown according to an exemplary embodiment.
Fig. 7 is a beam schematic of an original sound source signal for one beam direction shown according to an exemplary embodiment.
Fig. 8 is a beam schematic of an original sound source signal for one beam direction shown according to an exemplary embodiment.
Fig. 9 is a flowchart illustrating an audio signal processing method according to an exemplary embodiment.
Fig. 10 is a flowchart illustrating an audio signal processing method according to an exemplary embodiment.
Fig. 11 is a flowchart illustrating an audio signal processing method according to an exemplary embodiment.
Fig. 12 is a flowchart illustrating an audio signal processing method according to an exemplary embodiment.
Fig. 13 is a schematic diagram illustrating an audio signal processing method according to an exemplary embodiment.
Fig. 14 is a schematic diagram illustrating an audio signal processing apparatus according to an exemplary embodiment.
Fig. 15 is a block diagram illustrating a terminal according to an example embodiment.
Detailed Description
Reference will now be made in detail to the exemplary embodiments, examples of which are illustrated in the accompanying drawings. When the following description refers to the accompanying drawings, like numbers in different drawings represent the same or similar elements unless otherwise indicated. The embodiments described in the following exemplary embodiments do not represent all embodiments consistent with the present invention. Rather, they are merely examples of apparatus and methods consistent with certain aspects of the invention, as detailed in the appended claims.
Fig. 1 is a schematic diagram of an application scenario of an audio signal processing method. As shown in fig. 1, the application scenario contains a target sound source and a plurality of interfering sound sources, and the interfering sound sources include: television interference, human voice interference, outdoor noise, background noise, household noise, music echo, room reverberation and the like. Here, the room reverberation may be the signal of the target sound source after it is reflected or refracted by the cavity of the room or by an obstacle in the room; it may also be the signal of an interfering source, such as human voice interference or television interference, after it is reflected or refracted in the same way. Under the interference of so many interfering sound sources, the quality of the audio signal of the target sound source collected by the microphones is not high.
Fig. 2 is a flow chart illustrating a method of audio signal processing according to an exemplary embodiment, the method comprising the steps of, as shown in fig. 2:
step S21: converting original sound source signals received by at least two microphones into original sound source signals in a plurality of wave beam directions;
the original sound source signal in one wave beam direction has a null direction, and the null directions of the original sound source signals in different wave beam directions are different;
step S22: superposing original sound source signals in a plurality of wave beam directions based on a null direction to obtain at least two preprocessed sound source signals;
wherein at least one of the preprocessed sound source signals has at least two null directions that suppress interference.
Here, the null direction is a direction indicating an interfering sound source; that is, the beam forms a null in the direction of the interfering sound source.
Here, the interfering sound sources include a first type of interfering sound source and a second type of interfering sound source. The first type of interfering sound source may be sound emitted by various interfering sources, for example, by a person, an electronic device, or the like. The second type of interfering sound source may be reverberation or echo, for example, the echo or reverberation produced when sound from a first-type interfering sound source or from the target sound source is reflected or refracted by an obstacle.
Here, the interference sound source includes, but is not limited to, at least one of: a human voice interference sound source, a television interference sound source, background noise, household noise, echo and reverberation.
Here, the at least two microphones may be two or more. For example, the number of microphones is 2, 4, or 6, etc.
Here, the at least two microphones may form a linear array or, alternatively, a circular array. For example, as shown in fig. 3, there are 4 microphones, and the 4 microphones form a linear array. As another example, as shown in fig. 4, there are 6 microphones, and the 6 microphones form a circular array.
Here, the original sound source signals received by the at least two microphones are original sound source signals in all directions; for example, as shown in fig. 5, the original sound source signals received by the at least two microphones are 360 degrees original sound source signals.
In one embodiment, the number of original sound source signals in the plurality of beam directions is more than 2. For example, the original sound source signal of 360 degrees in fig. 5 may be converted into original sound source signals of 4, 6, or 8 equal beam directions. Here, the angular range covered by each beam direction may be the same or different.
Here, one way to implement step S21 is to: determining a direction indicating each interfering sound source; original sound source signals of a beam direction are obtained based on the direction of an interfering sound source, wherein the original sound source signals of a beam direction form nulls in the direction of the interfering sound source.
Exemplarily, as shown in fig. 6, a 360-degree original sound source signal is converted into original sound source signals of 4 beam directions, wherein the original sound source signal in one beam direction is represented by one beam; the original sound source signals in the 4 beam directions are beam 1, beam 2, beam 3, and beam 4, respectively. Wherein, the original sound source signal of 360 degrees has an interference sound source at 90 degrees, 135 degrees, 180 degrees and 315 degrees, and then the null direction of the beam 1 is 90 degrees, the null direction of the beam 2 is 135 degrees, the null direction of the beam 3 is 180 degrees, and the null direction of the beam 4 is 315 degrees.
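The null-steering idea in this first way of implementing step S21 can be sketched as follows. This is only a minimal illustration, not the method of the disclosure: it assumes a hypothetical two-microphone linear array, a single narrowband frequency, far-field sources, and angles measured from the array axis. The names `steering_vector`, `null_steering_weights` and `beam_response`, the 4 cm spacing and the 1 kHz frequency are all assumptions made for the example. The weights are chosen so that the beam has unit gain toward the target (0 degrees) and a null toward one interfering sound source (90 degrees).

```python
import cmath
import math

SPEED_OF_SOUND = 343.0  # m/s (assumed)
FREQ_HZ = 1000.0        # narrowband analysis frequency (assumed)
SPACING_M = 0.04        # microphone spacing (assumed)


def steering_vector(angle_deg):
    """Far-field response of a 2-microphone linear array to a plane wave.

    The angle is measured from the array axis, so the inter-microphone
    phase depends on cos(angle).
    """
    phase = (2 * math.pi * FREQ_HZ * SPACING_M
             * math.cos(math.radians(angle_deg)) / SPEED_OF_SOUND)
    return [1 + 0j, cmath.exp(-1j * phase)]


def null_steering_weights(target_deg, null_deg):
    """Solve the 2x2 system: response 1 at the target, 0 at the null."""
    t = steering_vector(target_deg)
    n = steering_vector(null_deg)
    det = t[0] * n[1] - t[1] * n[0]
    if abs(det) < 1e-12:
        raise ValueError("target and null directions are not separable")
    # Cramer's rule for [t0 t1; n0 n1] @ [w0, w1] = [1, 0]
    return [n[1] / det, -n[0] / det]


def beam_response(weights, angle_deg):
    """Complex gain of the beam toward angle_deg."""
    a = steering_vector(angle_deg)
    return sum(w * x for w, x in zip(weights, a))


# Beam 1 of the example: pointing at 0 degrees, null at 90 degrees.
beam1 = null_steering_weights(0.0, 90.0)
for angle in (0.0, 90.0, 135.0, 270.0):
    print(angle, round(abs(beam_response(beam1, angle)), 6))
```

With this assumed geometry the response depends on the angle only through its cosine, so the null placed at 90 degrees also appears at 270 degrees, which is consistent with the linear-array behavior described for beam 1 in this description.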
Here, another way to implement step S21 is: the original sound source signals in the 360-degree direction are equally divided into original sound source signals in a plurality of angle ranges, wherein the original sound source signal in each angle range is the original sound source signal in one beam direction.
Illustratively, as shown in fig. 7, an original sound source signal of 360 degrees is converted into original sound source signals of 6 angle ranges; the 6 angle ranges are: 330 to 30 degrees (including 330 to 360 degrees and 0 to 30 degrees), 30 to 90 degrees, 90 to 150 degrees, 150 to 210 degrees, 210 to 270 degrees, and 270 to 330 degrees; the original sound source signals corresponding to the 6 angle ranges are beam 1, beam 2, beam 3, beam 4, beam 5, and beam 6, respectively. The null direction of beam 1 is 180 degrees, the null direction of beam 2 is 240 degrees, the null direction of beam 3 is 300 degrees, the null direction of beam 4 is 0 degrees, the null direction of beam 5 is 60 degrees, and the null direction of beam 6 is 120 degrees.
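The equal division in the fig. 7 example can be reproduced with a short sketch. The function name `equal_sector_beams` and the rule that each null lies opposite the centre of its angle range are assumptions read off the numbers in the example; the description itself does not state a general rule.

```python
def equal_sector_beams(num_beams):
    """Split 0..360 degrees into num_beams equal ranges centred on k * width."""
    if num_beams < 2:
        raise ValueError("at least two beams are required")
    width = 360.0 / num_beams
    beams = []
    for k in range(num_beams):
        center = k * width
        beams.append({
            "beam": k + 1,
            "start": (center - width / 2) % 360,
            "end": (center + width / 2) % 360,
            # Assumption taken from the fig. 7 numbers:
            # the null is placed opposite the centre of the range.
            "null": (center + 180) % 360,
        })
    return beams


for beam in equal_sector_beams(6):
    print(beam)
```

For 6 beams this yields the ranges 330 to 30, 30 to 90, and so on, with nulls at 180, 240, 300, 0, 60 and 120 degrees, matching the values listed for beam 1 to beam 6.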
Here, another way to realize step S21 is: dividing original sound source signals in the 360-degree direction into original sound source signals in a plurality of angle ranges, wherein the angle ranges are unequal angle ranges.
Illustratively, a 360-degree original sound source signal is divided into original sound source signals of 4 unequal angle ranges; the 4 angle ranges are: 0 to 80 degrees, 80 to 180 degrees, 180 to 250 degrees, and 250 to 360 degrees.
In some embodiments, if the at least two microphones form a linear array, one null direction of the preprocessed sound source signal means that the two opposite phase angles along one direction are both null directions;
if the at least two microphones form a circular array, one null direction of the preprocessed sound source signal means that a single phase angle along one direction is a null direction.
Illustratively, the at least two microphones form a linear array, and the original sound source signals received by the at least two microphones are converted into original sound source signals in a plurality of beam directions, as shown in fig. 6. Here, the original sound source signals in the 4 beam directions are beam 1, beam 2, beam 3, and beam 4, respectively. For each of the 4 beams, one null direction covers the two opposite phase angles along one direction. For example, beam 1 has two opposite phase angles along the 90-degree direction, namely 90 degrees and 270 degrees; both the 90-degree and the 270-degree phase angles are null directions. That is, one null direction consists of two angles along one direction that differ by 180 degrees.
Illustratively, the at least two microphones form a circular array, and the original sound source signals received by the at least two microphones are converted into original sound source signals in a plurality of beam directions, as shown in fig. 8. Here, for each of the 4 beams, one null direction is a single phase angle along one direction. For example, beam 1 has two opposite phase angles along the 90-degree direction, namely 90 degrees and 270 degrees, but beam 1 has a null only at the phase angle of 90 degrees.
Here, owing to the way a linear microphone array collects sound source signals, one null direction covers the two opposite phase angles along one direction; owing to the way a circular microphone array collects sound source signals, one null direction is a single phase angle along one direction. Therefore, the beam pattern of the original sound source signal in a beam direction can be determined according to the actual acquisition situation.
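The distinction between the two array types can be captured in a small helper. The function name `null_angles` and the string labels `"linear"` and `"circular"` are hypothetical; the sketch only encodes the rule stated above and does not model any actual beam pattern.

```python
def null_angles(direction_deg, array_type):
    """Return the phase angles covered by one null direction.

    Linear array: both opposite phase angles along the direction.
    Circular array: only the single phase angle itself.
    """
    direction = direction_deg % 360
    if array_type == "linear":
        return [direction, (direction + 180) % 360]
    if array_type == "circular":
        return [direction]
    raise ValueError("array_type must be 'linear' or 'circular'")


print(null_angles(90, "linear"))    # beam 1 with a linear array
print(null_angles(90, "circular"))  # beam 1 with a circular array
```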
In the embodiment of the disclosure, original sound source signals received by at least two microphones may be converted into original sound source signals in a plurality of beam directions, where the original sound source signal in each beam direction has one null direction; the original sound source signals in the plurality of beam directions are then superposed based on the null directions to obtain at least two preprocessed sound source signals, at least one of which has at least two null directions for suppressing interference. Thus, the embodiment of the present disclosure can suppress interfering sound sources from at least two directions, reduce the influence of the interfering sound sources on the target sound source, and thereby improve the quality of the audio signal of the target sound source.
In addition, the interfering sound source in the present disclosure may be sound emitted by various interfering sources, or echo or reverberation produced when sound emitted by the interfering sources or by the target sound source is reflected or refracted; as such, the embodiments of the present disclosure can suppress interference from various interfering sound sources and reduce reverberation.
As shown in fig. 9, in some embodiments, the step S21 includes:
step S211: converting the original sound source signal into original sound source signals of a plurality of beam directions each pointing to a direction of a target sound source;
the step S22 includes:
step S221: superposing, based on the null directions, original sound source signals in at least two beam directions among the original sound source signals in the plurality of beam directions to obtain one preprocessed sound source signal with at least two null directions and at least one preprocessed sound source signal with one null direction; or,
and superposing original sound source signals in at least two beam directions in the original sound source signals in the plurality of beam directions based on the null directions to obtain at least two preprocessed sound source signals with at least two null directions.
Here, the original sound source signal of each beam direction is directed to the direction of the target sound source. For example, as shown in fig. 6, the direction of the target sound source is 0 degrees, and each of the beam 1, the beam 2, the beam 3, and the beam 4 is directed to 0 degrees.
Here, the original sound source signal of each beam direction has one null direction, and the null directions of the original sound source signals of different beam directions are different. For example, referring again to fig. 6, the null direction for beam 1 is 90 degrees, the null direction for beam 2 is 135 degrees, the null direction for beam 3 is 180 degrees, and the null direction for beam 4 is 315 degrees.
Here, in step S221, the original sound source signals of at least two beam directions among the original sound source signals of the plurality of beam directions are superimposed based on the null directions; the original sound source signals of the remaining beam directions may or may not be superimposed.
Here, one way to realize the superposition based on the null direction may be: superimposing the original sound source signals of any at least two beam directions, each of which has one null direction.
For example, as shown in fig. 6, beam 1 and beam 2 may be superimposed, and beam 3 and beam 4 are not superimposed; or, the beam 1, the beam 2 and the beam 3 are superposed, and the beam 4 is not superposed; or, beam 1 and beam 3 are superimposed, and beam 2 and beam 4 are superimposed; and so on.
Another way to realize the superposition based on the null direction may be: superimposing the original sound source signals of those beam directions for which the angle between any two of their null directions is smaller than a predetermined threshold. For example, if the angle difference between the null directions of beam 1 and beam 2 is smaller than a threshold angle of 90 degrees, it is determined that beam 1 and beam 2 are superimposed; for another example, if the largest angle difference between the null directions of any two of beam 1, beam 2, and beam 3 is smaller than the threshold angle of 90 degrees, it is determined that beam 1, beam 2, and beam 3 are superimposed.
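The threshold rule can be sketched as a simple grouping procedure. The greedy, first-fit grouping order and the names `angle_difference` and `group_beams_by_null` are assumptions for illustration; the description only states the pairwise threshold condition, not how groups are formed.

```python
def angle_difference(a_deg, b_deg):
    """Smallest absolute difference between two directions, in degrees."""
    diff = abs(a_deg - b_deg) % 360
    return min(diff, 360 - diff)


def group_beams_by_null(null_dirs, threshold_deg):
    """Greedily group beams whose null directions all differ by < threshold.

    null_dirs maps a beam number to its null direction in degrees.
    A beam joins the first group in which every member satisfies the
    threshold condition; otherwise it starts a new group.
    """
    groups = []
    for beam, null in null_dirs.items():
        for group in groups:
            if all(angle_difference(null, null_dirs[other]) < threshold_deg
                   for other in group):
                group.append(beam)
                break
        else:
            groups.append([beam])
    return groups


# Null directions of beam 1 to beam 4 from the fig. 6 example.
FIG6_NULLS = {1: 90, 2: 135, 3: 180, 4: 315}
print(group_beams_by_null(FIG6_NULLS, 90))
```

With the fig. 6 null directions and a 90-degree threshold, beam 1 and beam 2 (45 degrees apart) end up in one group, while beam 3 stays separate because its null is exactly 90 degrees from that of beam 1, which is not smaller than the threshold.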
The case where beam 1 and beam 2 are superimposed is described below as an example. Before superposition, beam 1 and beam 2 both correspond to 0 dB in the 0-degree direction; beam 1 is attenuated by 40 dB in the 90-degree direction (i.e., -40 dB as shown in fig. 6) and by about 8 dB in the 135-degree direction (i.e., -8 dB as shown in fig. 6); beam 2 is attenuated by about 8 dB in the 90-degree direction (i.e., -8 dB as shown in fig. 6) and by 40 dB in the 135-degree direction (i.e., -40 dB as shown in fig. 6). After superposition, the beam formed by superimposing beam 1 and beam 2 is not attenuated in the 0-degree direction, and has at least about 8 dB of attenuation in the 90-degree direction and also at least about 8 dB of attenuation in the 135-degree direction; in this way, the superimposed beam is attenuated in both the 90-degree and 135-degree directions, i.e., the superimposed beam can attenuate two interfering sound sources.
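The attenuation figures above can be checked with a small calculation. The sketch below assumes, purely for illustration, that the superimposed beam is the average of the two beam outputs and that the quoted dB values are amplitude responses; the text does not state either. Because the relative phase of the two beams is unknown, the triangle inequality is used to obtain only an upper bound on the response of the superimposed beam.

```python
import math


def db_to_gain(db):
    """Convert an amplitude response in dB to a linear gain."""
    return 10.0 ** (db / 20.0)


def gain_to_db(gain):
    """Convert a linear amplitude gain to dB."""
    return 20.0 * math.log10(gain)


def averaged_beam_upper_bound_db(db_a, db_b):
    """Upper bound (in dB) on the response of the average of two beams.

    Uses |a + b| / 2 <= (|a| + |b|) / 2, so the true response can only be
    lower than (i.e. more attenuated than) the returned value.
    """
    return gain_to_db((db_to_gain(db_a) + db_to_gain(db_b)) / 2.0)


# Responses of beam 1 and beam 2 quoted in the text for fig. 6.
at_0 = averaged_beam_upper_bound_db(0.0, 0.0)
at_90 = averaged_beam_upper_bound_db(-40.0, -8.0)
at_135 = averaged_beam_upper_bound_db(-8.0, -40.0)
```

Under these assumptions the bound in the 90-degree and 135-degree directions lies below -8 dB, which is consistent with the "at least about 8 dB" of attenuation stated above; the 0 dB value in the 0-degree direction is reached only if the two beams are in phase there.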
Therefore, the embodiment of the disclosure can suppress a plurality of interfering sound sources on the premise of ensuring that the target sound source is not attenuated; compared with the prior art, in which only one interfering sound source can be suppressed, it can greatly increase the number of interfering sound sources that are suppressed and can reduce reverberation. For example, the plurality of interfering sound sources may include human voices, a television, and the like; the embodiment of the disclosure can suppress not only such interfering sound sources, background noise, and the like, but also the echo, reverberation, and the like caused by reflection, refraction, and the like of the target sound source and the various interfering sound sources, thereby improving the quality of the audio signal of the target sound source.
In some embodiments, said superimposing original sound source signals of at least two beam directions of the original sound source signals of the plurality of beam directions based on the null direction to obtain at least two of said preprocessed sound source signals having at least two null directions comprises:
dividing original sound source signals in a plurality of wave beam directions into at least two parts;
respectively superposing the at least two parts of original sound source signals based on the null directions to obtain at least two preprocessed sound source signals with at least two null directions.
Here, the original sound source signals of the plurality of beam directions may be divided, as equally as possible, into two, three, four, or five parts, and so on.
For example, as shown in fig. 6, 4 beams may be divided into two, one of which is beam 1 and beam 2, and the other of which is beam 3 and beam 4; beams 1 and 2, and beams 3 and 4 are subsequently superimposed.
For another example, if there are 5 beams from beam 1 to beam 5 in the original sound source signal in the beam direction, the beams 1 to 5 can be divided into two parts, wherein one part is beam 1 and beam 2, and the other part is beam 3, beam 4 and beam 5.
For another example, if the original sound source signal in the beam direction has 9 beams from beam 1 to beam 9, the beams 1 to 9 can be divided into three parts, wherein the first part is from beam 1 to beam 3, the second part is from beam 4 to beam 6, and the third part is from beam 7 to beam 9.
Thus, in the embodiment of the present disclosure, the sound source signals in the multiple beam directions may be equally divided as much as possible to obtain at least two preprocessed sound source signals with substantially the same number of null directions; on one hand, each preprocessed sound source signal can inhibit audio signals of a plurality of interference sound sources as much as possible, and on the other hand, separation of the interference sound sources and the target sound source is facilitated during subsequent blind source separation.
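The division into near-equal parts can be sketched as follows. The function name `split_beams` is hypothetical, and placing any surplus beam in the later parts is an assumption chosen only so that the result matches the 5-beam example above (beams 1 and 2 in one part, beams 3 to 5 in the other).

```python
def split_beams(beams, parts):
    """Split a list of beams into `parts` contiguous groups of near-equal size.

    When the beams cannot be divided evenly, the later groups receive one
    extra beam each (an assumption that mirrors the 5-beam example).
    """
    if parts < 1 or parts > len(beams):
        raise ValueError("parts must be between 1 and the number of beams")
    base, extra = divmod(len(beams), parts)
    groups = []
    start = 0
    for index in range(parts):
        # The last `extra` groups are one beam larger than the others.
        size = base + (1 if index >= parts - extra else 0)
        groups.append(beams[start:start + size])
        start += size
    return groups


four = split_beams([1, 2, 3, 4], 2)
five = split_beams([1, 2, 3, 4, 5], 2)
nine = split_beams(list(range(1, 10)), 3)
```

Each returned group would then be superimposed into one preprocessed sound source signal.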
In an embodiment, the superimposing original sound source signals of at least two beam directions among the original sound source signals of the plurality of beam directions based on a null direction to obtain at least two preprocessed sound source signals having at least two null directions includes:
equally dividing a plurality of original sound source signals in the beam direction into two parts;
respectively superposing the two parts of original sound source signals based on the null directions to obtain two preprocessed sound source signals with at least two null directions.
In the embodiment of the present disclosure, because only two preprocessed sound source signals are obtained, the original sound source signals of as many beam directions as possible are superimposed into each of them, so that each preprocessed sound source signal suppresses as many interfering sound sources as possible; moreover, when the preprocessed sound source signals are subsequently subjected to blind source separation, the input consists of only two signals, which is as few as possible, so the computational complexity of the blind source separation can be greatly reduced.
As shown in fig. 10, in some embodiments, the step S21 includes:
step S212: converting the original sound source signal into original sound source signals in a plurality of wave beam directions pointing to fixed directions, wherein the fixed directions corresponding to the original sound source signals in different wave beam directions are different; the fixed direction corresponding to the original sound source signal in one wave beam direction is the direction pointing to the target sound source;
the step S22 includes:
step S222: superimposing original sound source signals of at least two beam directions, the fixed directions of which are not directions pointing to a target sound source, based on null directions to obtain at least one preprocessed sound source signal having at least two null directions; an original sound source signal of which the fixed direction is a beam direction directed to the direction of the target sound source is taken as one of the preprocessed sound source signals.
Here, the fixed direction may be any specified direction.
For example, if the original sound source signal is divided into original sound source signals of 6 beam directions, it can be determined that the 6 beam original sound source signals point in the fixed directions of 30 degrees, 90 degrees, 150 degrees, 210 degrees, 270 degrees, and 330 degrees, respectively. The 360 degrees may be equally divided into beam directions of 6 beams, and the 6 beams may cover 0 to 60 degrees, 60 to 120 degrees, 120 to 180 degrees, 180 to 240 degrees, 240 to 300 degrees, and 300 to 360 degrees, respectively.
Here, the fixed direction of the original sound source signal of one beam direction among the original sound source signals of the plurality of beam directions is a direction pointing to the target sound source.
For example, if the original sound source signal is divided into original sound source signals of 6 beam directions and the direction of the target sound source is taken as 0 degrees, then one of the fixed directions is 0 degrees, and the other 5 fixed directions may be 60 degrees, 120 degrees, 180 degrees, 240 degrees, and 300 degrees. The 360 degrees may then be equally divided among the beam directions of the 6 beams; as shown in fig. 7, the 6 beams may be: beam 1 covering 330 to 360 degrees and 0 to 30 degrees, beam 2 covering 30 to 90 degrees, beam 3 covering 90 to 150 degrees, beam 4 covering 150 to 210 degrees, beam 5 covering 210 to 270 degrees, and beam 6 covering 270 to 330 degrees.
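The equal division of 360 degrees can be sketched as below. The sketch assumes that each beam covers a sector centred on its fixed direction, which agrees with both 6-beam examples above but is not stated as a general rule in the text; `beam_sectors` is a hypothetical helper name.

```python
def beam_sectors(num_beams, first_direction_deg=0.0):
    """Return (fixed_direction, sector_start, sector_end) for each beam.

    The circle is divided equally, and each sector is assumed to be centred
    on the fixed direction of its beam. Angles are in degrees, in [0, 360).
    """
    width = 360.0 / num_beams
    sectors = []
    for k in range(num_beams):
        centre = (first_direction_deg + k * width) % 360.0
        start = (centre - width / 2.0) % 360.0
        end = (centre + width / 2.0) % 360.0
        sectors.append((centre, start, end))
    return sectors


# Target sound source taken as 0 degrees: fixed directions 0, 60, ..., 300.
target_at_zero = beam_sectors(6)
# Fixed directions 30, 90, ..., 330, as in the earlier 6-beam example.
offset_by_thirty = beam_sectors(6, first_direction_deg=30.0)
```

A sector whose start is larger than its end, such as (330, 30), wraps around 0 degrees.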
Of course, in other embodiments, the fixed direction of one original sound source signal in the beam direction in the original sound source signals in the multiple beam directions may also be the direction pointing to the target sound source; the null directions of the original sound source signals of the other beam directions should correspond as far as possible to the directions of the interfering sound sources.
Of course, in other embodiments, the angular ranges covered by the original sound source signals of the plurality of beam directions pointing to different fixed directions may also be unequal; for example, 360 degrees may be divided unequally among the beam directions of 6 beams, such as beam 1 covering 0 to 65 degrees, beam 2 covering 65 to 95 degrees, beam 3 covering 95 to 182 degrees, beam 4 covering 182 to 220 degrees, beam 5 covering 220 to 300 degrees, and beam 6 covering 330 to 360 degrees.
Thus, the original sound source signal from all directions, i.e., 360 degrees, can be converted into an original sound source signal of one beam direction pointing to the direction of the target sound source and original sound source signals of a plurality of beam directions each having one null direction; the target sound source can thereby be preliminarily separated.
Here, in the step S222, the original sound source signals of at least two beam directions whose fixed directions are not directions pointing to the target sound source are superimposed based on the null direction, and may be:
superposing, based on the null directions, all of the original sound source signals of the beam directions whose fixed directions are not the direction pointing to the target sound source; or,
dividing the original sound source signals of the beam directions whose fixed directions are not the direction pointing to the target sound source into at least two parts, and respectively superposing the at least two parts of original sound source signals.
For example, as shown in fig. 7, the fixed direction of the beam 1 is a direction pointing to a target sound source; the fixed direction of beams 2 to 6 is not a direction pointing to the target sound source; beams 2 through 6 may be superimposed; alternatively, the beams 2 to 6 may be divided into two, the beam 2 and the beam 3 may be superimposed, and the beam 4, the beam 5, and the beam 6 may be superimposed.
Here, when the beams 2 to 6 are superimposed, the superimposed beams are attenuated at least in the directions of 90 degrees, 150 degrees, 210 degrees, 270 degrees, and 330 degrees.
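A minimal sketch of this variant of step S222 is given below. It assumes that each beam is available as a list of time-domain samples of equal length and that "superimposing" means a sample-wise sum; both are assumptions for illustration, and `preprocess_with_target_beam` is a hypothetical name.

```python
def preprocess_with_target_beam(beams, target_index):
    """Keep the target-direction beam and superimpose all other beams.

    beams: list of equal-length sample lists, one per beam direction.
    target_index: index of the beam whose fixed direction points to the
        target sound source.
    Returns [target_beam, superimposed_other_beams].
    """
    others = [beam for i, beam in enumerate(beams) if i != target_index]
    if len(others) < 2:
        raise ValueError("at least two non-target beams are required")
    # Sample-wise sum of the non-target beams (assumed form of superposition).
    superimposed = [sum(samples) for samples in zip(*others)]
    return [list(beams[target_index]), superimposed]


# Toy data: beam 1 (index 0) points to the target, beams 2 to 6 do not.
toy_beams = [[1, 1], [2, 0], [3, 0], [4, 0], [5, 0], [6, 0]]
preprocessed = preprocess_with_target_beam(toy_beams, target_index=0)
```

The two returned signals would then serve as the preprocessed sound source signals passed on to later processing.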
Of course, in other embodiments, in step S222, the original sound source signals of the plurality of beam directions may be directly divided into at least two parts, and the at least two parts of original sound source signals may be respectively superimposed. Here, the original sound source signal of the beam direction pointing to the direction of the target sound source need not be extracted separately; the original sound source signals of the plurality of beam directions may be grouped and superimposed directly, which can also suppress a plurality of interfering sound sources.
In the disclosed embodiment, by obtaining at least one preprocessed sound source signal with at least two null directions, a plurality of interfering sound sources can be suppressed, and the influence of reverberation, echo, interfering sources and the like on a target sound source can be reduced. In addition, an original sound source signal with a fixed direction being a beam direction pointing to the target sound source can be extracted, so that the target sound source can be extracted preliminarily, and convenience is provided for subsequent further processing of the target sound source.
As shown in fig. 11, in some embodiments, the method further comprises:
step S23: performing blind source separation on at least two preprocessed sound source signals to obtain at least two target sound source signals.
Here, the at least two target sound source signals include a first type target sound source signal and at least one second type target sound source signal; wherein the first type of target sound source signal is an audio signal comprising a target sound source and the second type of target sound source signal is an audio signal comprising an interfering sound source.
Here, one implementation of step S23 is: obtaining at least two mixed observation signals based on at least two of the preprocessed sound source signals; obtaining an estimation matrix based on the preprocessed sound source signals; obtaining a separation matrix at each frequency domain point based on the estimation matrix; obtaining a target separation matrix based on the separation matrix at each frequency domain point; and obtaining at least two target sound source signals based on the target separation matrix and the at least two observation signals.
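Only the last of these steps, applying a separation matrix to the observation signals at each frequency domain point, is simple enough to sketch without further detail from the source. The sketch below assumes the observation signals are already in the frequency domain and that one square separation matrix is available per frequency point; how that matrix is estimated is not shown, and the data layout and the name `separate` are hypothetical.

```python
def separate(separation_matrices, observations):
    """Apply a per-frequency separation matrix to frequency-domain observations.

    separation_matrices[f]: square matrix (list of rows) for frequency point f.
    observations[f]: list of frames; each frame is a list with one complex
        value per observation signal.
    Returns separated[f][t], the matrix-vector product for each frame.
    """
    separated = []
    for matrix, frames in zip(separation_matrices, observations):
        separated.append([
            [sum(row[j] * frame[j] for j in range(len(frame))) for row in matrix]
            for frame in frames
        ])
    return separated


# Toy data: two observation signals, two frequency points, one frame each.
identity = [[1, 0], [0, 1]]
swap = [[0, 1], [1, 0]]
toy_observations = [[[1 + 2j, 3 + 0j]], [[0.5j, -1 + 0j]]]
toy_separated = separate([identity, swap], toy_observations)
```

With two input signals each matrix is 2 x 2, so the per-frame cost stays small as the number of preprocessed sound source signals is kept low.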
Of course, in other embodiments, the step S23 may also adopt any other blind source separation technique, which only needs to separate the preprocessed sound source signal into the target sound source signal of each sound source; the target sound source signal may be an audio signal of an interfering sound source or an audio signal of a target sound source.
In the embodiment of the present disclosure, the sound source signals of a plurality of sound sources may be separated based on a blind source separation technique, and the audio signal of the target sound source may be separated therefrom, so that the audio signal of the target sound source with high quality may be obtained.
Of course, in the above embodiment, if the number of preprocessed sound source signals obtained in step S22 is 2, or is less than a predetermined number, the computational complexity of the blind source separation in step S23 can be reduced. For example, if the number of preprocessed sound source signals is 2, the blind source separation has only 2 input signals (i.e., the preprocessed sound source signals), so the amount of calculation in the blind source separation can be greatly reduced.
Referring again to fig. 11, in some embodiments, the method further comprises:
step S24: acquiring at least two corrected target sound source signals based on the at least two target sound source signals and their respective weight coefficients.
Here, if the target sound source signal includes an audio signal of a target sound source, its weight coefficient is a first weight coefficient; if the target sound source signal includes an audio signal of an interfering sound source, its weight coefficient is a second weight coefficient; wherein the first weight coefficient is greater than the second weight coefficient.
For example, there are 2 target sound source signals, S1 and S2, where S1 is the audio signal of the target sound source and S2 is the audio signal of the interfering sound source; it can be determined that the first weight coefficient of S1 is 80% and the second weight coefficient of S2 is 20%. In the corrected target sound source signals, the audio signal of the target sound source is thus weighted 4 times as heavily as the audio signal of the interfering sound source, so the signal-to-noise ratio can be greatly improved; and even if the audio signal of the target sound source still contains a small amount of interfering sound source signal, the influence on the target sound source is very small.
Of course, if the target sound source signal includes audio signals of a plurality of interfering sound sources, for example, the number of the target sound source signals is 3, which are S1, S2, and S3; wherein, the S1 is an audio signal of a target sound source, and the S2 and the S3 are both audio signals of an interference sound source. Then, the corresponding weighting coefficients of S1, S2, and S3 may also be determined, for example, the first weighting coefficient of S1 is 80%, the second weighting coefficient of S2 is 15%, and the second weighting coefficient of S3 is 5%; it is only necessary that the first weight coefficient is greater than the second weight coefficient.
Thus, in the embodiment of the present disclosure, the signal-to-noise ratio of the target sound source signal can be adjusted by determining the corresponding weight for each target sound source signal; if the weighting coefficient of the audio signal of the target sound source is greater than that of the interference sound source, the signal-to-noise ratio of the target sound source signal can be greatly improved, and therefore the audio signal of the target sound source with higher quality can be obtained.
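The weighting of step S24 can be illustrated with the 80% / 20% example. The sketch assumes the weight coefficients are applied as amplitude gains to each sample, which the text does not specify; the function names are hypothetical.

```python
import math


def apply_weights(signals, weights):
    """Scale each target sound source signal by its weight coefficient."""
    if len(signals) != len(weights):
        raise ValueError("one weight coefficient is needed per signal")
    return [[w * sample for sample in signal] for signal, w in zip(signals, weights)]


def weight_ratio_db(first_weight, second_weight):
    """Level difference, in dB, between two amplitude weight coefficients."""
    return 20.0 * math.log10(first_weight / second_weight)


# Toy signals: S1 (target sound source) and S2 (interfering sound source).
s1 = [1.0, -1.0]
s2 = [0.5, 0.5]
corrected = apply_weights([s1, s2], [0.8, 0.2])
ratio_db = weight_ratio_db(0.8, 0.2)
```

If the coefficients are amplitude gains as assumed, a 4:1 weight ratio corresponds to a level difference of 20·log10(4) dB between the target and the interfering signal.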
One specific example is provided below in connection with any of the embodiments described above:
as shown in fig. 12, an embodiment of the present disclosure discloses an audio signal processing method applied to a terminal, the method including the following steps:
step S41: acquiring original sound source signals received by 2 microphones;
as shown in fig. 13, in an alternative embodiment, the terminal acquires two original sound source signals (i.e., X_1 and X_2) and inputs the two original sound source signals to the beamforming module.
Step S42: converting the original sound source signal into original sound source signals in 6 wave beam directions; dividing original sound source signals in 6 wave beam directions into two parts; respectively superposing the original sound source signals in the two wave beam directions to obtain two preprocessed sound source signals;
here, the original sound source signal of one beam direction has one null direction; the null directions of different original sound source signals are different. One preprocessed sound source signal has three null directions.
Referring to fig. 13 again, in an alternative embodiment, in the beamforming module, the two original sound source signals (i.e., X_1 and X_2) are converted into original sound source signals in 6 beam directions; the original sound source signals in the 6 beam directions are beam 1 (beam_1), beam 2 (beam_2), beam 3 (beam_3), beam 4 (beam_4), beam 5 (beam_5), and beam 6 (beam_6), respectively; beam_1, beam_2, and beam_3 are superimposed to obtain one preprocessed sound source signal (Y1); and beam_4, beam_5, and beam_6 are superimposed to obtain another preprocessed sound source signal (Y2).
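The grouping of the six beams into Y1 and Y2 can be sketched as follows. As an assumption for illustration, each beam is represented as a list of samples of equal length and superposition is taken to be a sample-wise sum; the function names are hypothetical.

```python
def superimpose(beams):
    """Sample-wise sum of several equal-length beam signals."""
    return [sum(samples) for samples in zip(*beams)]


def preprocess_six_beams(beams):
    """Form Y1 from beams 1 to 3 and Y2 from beams 4 to 6."""
    if len(beams) != 6:
        raise ValueError("exactly six beam signals are expected")
    y1 = superimpose(beams[:3])
    y2 = superimpose(beams[3:])
    return y1, y2


# Toy samples standing in for beam_1 ... beam_6.
toy = [[1, 0], [2, 0], [3, 0], [10, 1], [20, 1], [30, 1]]
y1, y2 = preprocess_six_beams(toy)
```

The pair (Y1, Y2) is what the blind source separation module receives in the next step.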
Step S43: carrying out blind source separation on the two preprocessed sound source signals to obtain two target sound source signals;
referring again to fig. 13, in an alternative embodiment, the Y1 and Y2 are input to the blind source separation module. In the blind source separation module, the Y1 and the Y2 are subjected to blind source separation, and two target sound source signals, S1 and S2, are output.
Step S44: obtaining two corrected target sound source signals based on the two target sound source signals and their respective weight coefficients.
Referring again to fig. 13, in an alternative embodiment, S1 and S2 are input to a post-processing module. In the post-processing module, the weight coefficients of S1 and S2 are obtained; a corrected target sound source signal is obtained based on S1 and the weight coefficient of S1, and another corrected target sound source signal is obtained based on S2 and the weight coefficient of S2.
In the embodiment of the present disclosure, the original sound source signal is converted into original sound source signals in a plurality of beam directions, which are superimposed to obtain preprocessed sound source signals having a plurality of null directions, so that interfering sound sources in more directions can be suppressed, that is, the influence of interfering sound sources, echo, reverberation, and the like on the target sound source can be reduced. In addition, the audio signal of the target sound source can be separated through blind source separation, so that a high-quality audio signal of the target sound source can be acquired. Moreover, because the signal of each target sound source can be further corrected by the post-processing module, the audio signal of the target sound source is further enhanced and the audio signals of the interfering sound sources are further attenuated, which improves the signal-to-noise ratio of the separated target sound source and further improves the quality of its audio signal.
Fig. 14 is a block diagram of an audio signal processing apparatus according to an exemplary embodiment. Referring to fig. 14, the audio signal processing apparatus includes:
a conversion module 61, configured to convert original sound source signals received by at least two microphones into original sound source signals in multiple beam directions; the original sound source signal in one wave beam direction has a null direction, and the null directions of the original sound source signals in different wave beam directions are different;
a superposition module 62, configured to superpose original sound source signals in multiple beam directions based on a null direction to obtain at least two preprocessed sound source signals; wherein at least one of the preprocessed sound source signals has at least two null directions that suppress interference.
Referring again to fig. 14, in some embodiments, the apparatus further comprises:
a separation module 63, configured to perform blind source separation on at least two of the preprocessed sound source signals to obtain at least two target sound source signals.
Referring again to fig. 14, in some embodiments, the apparatus further comprises:
a correcting module 64, configured to obtain at least two corrected target sound source signals based on the at least two target sound source signals and their respective weight coefficients.
In some embodiments, the converting module 61 is configured to convert the original sound source signal into original sound source signals of a plurality of beam directions each pointing to a direction of a target sound source;
the superposition module 62 is configured to superimpose the original sound source signals of at least two beam directions among the original sound source signals of the plurality of beam directions based on the null directions, so as to obtain at least one preprocessed sound source signal with at least two null directions and at least one preprocessed sound source signal with one null direction; or,
to superimpose the original sound source signals of at least two beam directions among the original sound source signals of the plurality of beam directions based on the null directions, so as to obtain at least two preprocessed sound source signals with at least two null directions.
In some embodiments, the superposition module 62 is configured to divide the original sound source signals of multiple beam directions into at least two parts; and respectively superposing the original sound source signals in at least two wave beam directions based on the null direction to obtain at least two preprocessed sound source signals with at least two null directions.
In some embodiments, the converting module 61 is configured to convert the original sound source signal into original sound source signals of a plurality of beam directions pointing to fixed directions, where the fixed directions corresponding to the original sound source signals of different beam directions are different; the fixed direction corresponding to the original sound source signal in one wave beam direction is the direction pointing to the target sound source;
the superposition module 62 is configured to superpose original sound source signals in at least two beam directions, of which fixed directions are not directions pointing to a target sound source, based on a null direction, so as to obtain at least one preprocessed sound source signal having at least two null directions; an original sound source signal of which the fixed direction is a beam direction directed to the direction of the target sound source is taken as one of the preprocessed sound source signals.
In some embodiments, if the at least two microphones form a linear array, one null direction of the preprocessed sound source signal means that: two opposite phase angles of the preprocessed sound source signal in one direction are both null directions;
if the at least two microphones form a circular array, one null direction of the preprocessed sound source signal means that: one phase angle of the preprocessed sound source signal in one direction is a null direction.
With regard to the apparatus in the above-described embodiment, the specific manner in which each module performs the operation has been described in detail in the embodiment related to the method, and will not be elaborated here.
An embodiment of the present disclosure further provides a terminal, which includes:
a processor;
a memory for storing processor-executable instructions;
wherein the processor is configured to implement the audio signal processing method according to any embodiment of the disclosure when executing the executable instructions.
The memory may include various types of storage media, which are non-transitory computer storage media capable of retaining the information stored thereon after the communication device has been powered down.
The processor may be connected to the memory via a bus or the like for reading the executable program stored on the memory, for example, for implementing at least one of the methods shown in fig. 2, 8 to 12.
Embodiments of the present disclosure also provide a computer-readable storage medium storing an executable program, wherein the executable program, when executed by a processor, implements the audio signal processing method according to any embodiment of the present disclosure. For example, at least one of the methods shown in fig. 2, 9 to 12 is implemented.
Fig. 15 is a block diagram illustrating a terminal 800 according to an example embodiment. For example, the terminal 800 may be a mobile phone, a computer, a digital broadcast terminal, a messaging device, a game console, a tablet device, a medical device, a fitness device, a personal digital assistant, and the like.
Referring to fig. 15, terminal 800 may include one or more of the following components: processing component 802, memory 804, power component 806, multimedia component 808, audio component 810, input/output (I/O) interface 812, sensor component 814, and communication component 816.
The processing component 802 generally controls overall operation of the terminal 800, such as operations associated with display, telephone calls, data communications, camera operations, and recording operations. The processing components 802 may include one or more processors 820 to execute instructions to perform all or a portion of the steps of the methods described above. Further, the processing component 802 can include one or more modules that facilitate interaction between the processing component 802 and other components. For example, the processing component 802 can include a multimedia module to facilitate interaction between the multimedia component 808 and the processing component 802.
The memory 804 is configured to store various types of data to support operation at the device 800. Examples of such data include instructions for any application or method operating on terminal 800, contact data, phonebook data, messages, pictures, videos, and so forth. The memory 804 may be implemented by any type or combination of volatile or non-volatile memory devices such as Static Random Access Memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, magnetic or optical disks.
Power components 806 provide power to the various components of terminal 800. Power components 806 may include a power management system, one or more power sources, and other components associated with generating, managing, and distributing power for terminal 800.
The multimedia component 808 includes a screen providing an output interface between the terminal 800 and the user. In some embodiments, the screen may include a Liquid Crystal Display (LCD) and a Touch Panel (TP). If the screen includes a touch panel, the screen may be implemented as a touch screen to receive an input signal from a user. The touch panel includes one or more touch sensors to sense touch, slide, and gestures on the touch panel. The touch sensor may not only sense the boundary of a touch or slide action, but also detect the duration and pressure associated with the touch or slide operation. In some embodiments, the multimedia component 808 includes a front facing camera and/or a rear facing camera. The front-facing camera and/or the rear-facing camera may receive external multimedia data when the device 800 is in an operating mode, such as a shooting mode or a video mode. Each front camera and rear camera may be a fixed optical lens system or have a focal length and optical zoom capability.
The audio component 810 is configured to output and/or input audio signals. For example, the audio component 810 includes a Microphone (MIC) configured to receive external audio signals when the terminal 800 is in an operational mode, such as a call mode, a recording mode, and a voice recognition mode. The received audio signals may further be stored in the memory 804 or transmitted via the communication component 816. In some embodiments, audio component 810 also includes a speaker for outputting audio signals.
The I/O interface 812 provides an interface between the processing component 802 and peripheral interface modules, which may be keyboards, click wheels, buttons, etc. These buttons may include, but are not limited to: a home button, a volume button, a start button, and a lock button.
Sensor assembly 814 includes one or more sensors for providing various aspects of state assessment for terminal 800. For example, sensor assembly 814 can detect the open/closed state of device 800, the relative positioning of components, such as a display and keypad of terminal 800, sensor assembly 814 can also detect a change in position of terminal 800 or a component of terminal 800, the presence or absence of user contact with terminal 800, orientation or acceleration/deceleration of terminal 800, and a change in temperature of terminal 800. Sensor assembly 814 may include a proximity sensor configured to detect the presence of a nearby object without any physical contact. The sensor assembly 814 may also include a light sensor, such as a CMOS or CCD image sensor, for use in imaging applications. In some embodiments, the sensor assembly 814 may also include an acceleration sensor, a gyroscope sensor, a magnetic sensor, a pressure sensor, or a temperature sensor.
Communication component 816 is configured to facilitate communications between terminal 800 and other devices in a wired or wireless manner. The terminal 800 may access a wireless network based on a communication standard, such as WiFi, 2G or 3G, or a combination thereof. In an exemplary embodiment, the communication component 816 receives a broadcast signal or broadcast related information from an external broadcast management system via a broadcast channel. In an exemplary embodiment, the communication component 816 further includes a Near Field Communication (NFC) module to facilitate short-range communications. For example, the NFC module may be implemented based on Radio Frequency Identification (RFID) technology, infrared data association (IrDA) technology, Ultra Wideband (UWB) technology, Bluetooth (BT) technology, and other technologies.
In an exemplary embodiment, the terminal 800 may be implemented by one or more Application Specific Integrated Circuits (ASICs), Digital Signal Processors (DSPs), Digital Signal Processing Devices (DSPDs), Programmable Logic Devices (PLDs), Field Programmable Gate Arrays (FPGAs), controllers, micro-controllers, microprocessors or other electronic components for performing the above-described methods.
In an exemplary embodiment, a non-transitory computer-readable storage medium comprising instructions, such as the memory 804 comprising instructions, executable by the processor 820 of the terminal 800 to perform the above-described method is also provided. For example, the non-transitory computer readable storage medium may be a ROM, a Random Access Memory (RAM), a CD-ROM, a magnetic tape, a floppy disk, an optical data storage device, and the like.
Other embodiments of the invention will be apparent to those skilled in the art from consideration of the specification and practice of the invention disclosed herein. This application is intended to cover any variations, uses, or adaptations of the invention following, in general, the principles of the invention and including such departures from the present disclosure as come within known or customary practice within the art to which the invention pertains. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the invention being indicated by the following claims.
It will be understood that the invention is not limited to the precise arrangements described above and shown in the drawings and that various modifications and changes may be made without departing from the scope thereof. The scope of the invention is limited only by the appended claims.

Claims (16)

1. A method of audio signal processing, the method comprising:
converting original sound source signals received by at least two microphones into original sound source signals in a plurality of beam directions; the original sound source signal in one beam direction has a null direction, and the null directions of the original sound source signals in different beam directions are different;
superposing the original sound source signals in the plurality of beam directions based on the null directions to obtain at least two preprocessed sound source signals; wherein at least one of the preprocessed sound source signals has at least two null directions that suppress interference.
2. The method of claim 1, further comprising:
performing blind source separation on the at least two preprocessed sound source signals to obtain at least two target sound source signals.
3. The method according to claim 1 or 2, characterized in that the method further comprises:
acquiring at least two pieces of corrected target sound source information based on the at least two target sound source signals and their respective weight coefficients.
4. The method according to claim 1 or 2,
the converting original sound source signals received by at least two microphones into original sound source signals of a plurality of beam directions includes:
converting the original sound source signal into original sound source signals of a plurality of beam directions each pointing to a direction of a target sound source;
the superimposing original sound source signals of a plurality of beam directions based on a null direction to obtain at least two of the preprocessed sound source signals comprises:
superposing original sound source signals in at least two beam directions among the original sound source signals in the plurality of beam directions, based on the null directions, to obtain one preprocessed sound source signal with at least two null directions and at least one preprocessed sound source signal with one null direction; or,
superposing original sound source signals in at least two beam directions among the original sound source signals in the plurality of beam directions, based on the null directions, to obtain at least two preprocessed sound source signals with at least two null directions.
5. The method according to claim 4, wherein said superimposing original sound source signals of at least two beam directions among original sound source signals of a plurality of beam directions based on a null direction to obtain at least two of said preprocessed sound source signals having at least two null directions comprises:
dividing the original sound source signals in the plurality of beam directions into at least two parts;
respectively superposing the original sound source signals in at least two beam directions based on the null directions to obtain at least two preprocessed sound source signals with at least two null directions.
6. The method according to claim 1 or 2,
the converting original sound source signals received by at least two microphones into original sound source signals in a plurality of beam directions comprises:
converting the original sound source signals into original sound source signals in a plurality of beam directions pointing to fixed directions, wherein the fixed directions corresponding to the original sound source signals in different beam directions are different, and the fixed direction corresponding to the original sound source signal in one beam direction is the direction pointing to the target sound source;
the superimposing original sound source signals of a plurality of beam directions based on a null direction to obtain at least two of the preprocessed sound source signals comprises:
superimposing, based on the null directions, original sound source signals in at least two beam directions whose fixed directions are not the direction pointing to the target sound source, to obtain at least one preprocessed sound source signal having at least two null directions;
taking the original sound source signal in the beam direction whose fixed direction is the direction pointing to the target sound source as one of the preprocessed sound source signals.
7. The method according to claim 1 or 2, wherein if the at least two microphones form a linear array, a null direction of the preprocessed sound source signal means that two opposite phase angles of the preprocessed sound source signal in one direction are both null directions;
if the at least two microphones form a circular array, a null direction of the preprocessed sound source signal means that one phase angle of the preprocessed sound source signal in one direction is a null direction.
8. An audio signal processing apparatus, characterized in that the apparatus comprises:
a conversion module, configured to convert original sound source signals received by at least two microphones into original sound source signals in a plurality of beam directions; the original sound source signal in one beam direction has a null direction, and the null directions of the original sound source signals in different beam directions are different;
a superposition module, configured to superpose the original sound source signals in the plurality of beam directions based on the null directions to obtain at least two preprocessed sound source signals; wherein at least one of the preprocessed sound source signals has at least two null directions that suppress interference.
9. The apparatus of claim 8, further comprising:
a separation module, configured to perform blind source separation on the at least two preprocessed sound source signals to obtain at least two target sound source signals.
10. The apparatus of claim 8 or 9, further comprising:
a correction module, configured to obtain at least two pieces of corrected target sound source information based on the at least two target sound source signals and their respective weight coefficients.
11. The apparatus according to claim 8 or 9,
the conversion module is configured to convert the original sound source signals into original sound source signals in a plurality of beam directions each pointing to the direction of a target sound source;
the superposition module is configured to superpose original sound source signals in at least two beam directions among the original sound source signals in the plurality of beam directions, based on the null directions, to obtain one preprocessed sound source signal with at least two null directions and at least one preprocessed sound source signal with one null direction; or,
to superpose original sound source signals in at least two beam directions among the original sound source signals in the plurality of beam directions, based on the null directions, to obtain at least two preprocessed sound source signals with at least two null directions.
12. The apparatus of claim 11,
the superposition module is configured to equally divide the original sound source signals in the plurality of beam directions into at least two parts, and to respectively superpose the original sound source signals in at least two beam directions based on the null directions to obtain at least two preprocessed sound source signals with at least two null directions.
13. The apparatus according to claim 8 or 9,
the conversion module is configured to convert the original sound source signals into original sound source signals in a plurality of beam directions pointing to fixed directions, wherein the fixed directions corresponding to the original sound source signals in different beam directions are different, and the fixed direction corresponding to the original sound source signal in one beam direction is the direction pointing to the target sound source;
the superposition module is configured to superpose, based on the null directions, original sound source signals in at least two beam directions whose fixed directions are not the direction pointing to the target sound source, to obtain at least one preprocessed sound source signal with at least two null directions; and to take the original sound source signal in the beam direction whose fixed direction is the direction pointing to the target sound source as one of the preprocessed sound source signals.
14. The apparatus according to claim 8 or 9,
if the at least two microphones form a linear array, a null direction of the preprocessed sound source signal means that two opposite phase angles of the preprocessed sound source signal in one direction are both null directions;
if the at least two microphones form a circular array, a null direction of the preprocessed sound source signal means that one phase angle of the preprocessed sound source signal in one direction is a null direction.
15. A terminal, comprising:
a processor;
a memory for storing processor-executable instructions;
wherein the processor is configured to implement the audio signal processing method of any one of claims 1 to 7 when executing the executable instructions.
16. A computer-readable storage medium, characterized in that the readable storage medium stores an executable program, wherein the executable program, when executed by a processor, implements the audio signal processing method of any one of claims 1 to 7.
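By way of illustration only, the null direction of a single beam (claim 1) and the pair of opposite null angles obtained with a linear array (claim 7) may be sketched as follows. This is a minimal sketch that assumes a two-microphone delay-and-subtract beamformer, a far-field plane wave at a single frequency, and angles measured from the array axis. The function names, the 0.04 m spacing, the 1 kHz frequency and the 343 m/s speed of sound are assumptions chosen for the example and are not taken from the disclosure; the sketch does not reproduce the claimed superposition step.

```python
import cmath
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed value for air at room temperature


def steering_phase(freq_hz, spacing_m, angle_deg):
    """Phase lag of microphone 2 relative to microphone 1 for a plane wave.

    The angle is measured from the array axis, so the lag depends only on
    cos(angle): +angle and -angle produce the same lag on a linear array.
    """
    delay = spacing_m * math.cos(math.radians(angle_deg)) / SPEED_OF_SOUND
    return 2.0 * math.pi * freq_hz * delay


def null_beam_response(freq_hz, spacing_m, null_deg, arrival_deg):
    """Magnitude response of a delay-and-subtract beam with a null at null_deg."""
    phi_null = steering_phase(freq_hz, spacing_m, null_deg)
    phi_arrival = steering_phase(freq_hz, spacing_m, arrival_deg)
    x1 = 1.0 + 0.0j                    # unit-amplitude wave at microphone 1
    x2 = cmath.exp(-1j * phi_arrival)  # same wave, delayed, at microphone 2
    # Align microphone 2 for the null direction, then subtract to cancel it.
    y = x1 - cmath.exp(1j * phi_null) * x2
    return abs(y)


def beam_bank(freq_hz, spacing_m, null_angles_deg, arrival_deg):
    """Responses of several beams, each with a different null direction."""
    return [
        null_beam_response(freq_hz, spacing_m, null_deg, arrival_deg)
        for null_deg in null_angles_deg
    ]


if __name__ == "__main__":
    freq, spacing = 1000.0, 0.04  # hypothetical operating point
    for arrival in (60.0, -60.0, 0.0):
        response = null_beam_response(freq, spacing, 60.0, arrival)
        print(f"arrival {arrival:6.1f} deg -> |response| = {response:.6f}")
    print(beam_bank(freq, spacing, [30.0, 90.0, 150.0], 0.0))
```

Because the inter-microphone delay of a linear array depends only on the cosine of the arrival angle, a null steered to 60 degrees also appears at -60 degrees. A circular array does not in general share this symmetry, which is consistent with the single null angle recited for that case.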
CN202010763471.3A 2020-07-31 2020-07-31 Audio signal processing method and device, terminal and storage medium Pending CN111863012A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010763471.3A CN111863012A (en) 2020-07-31 2020-07-31 Audio signal processing method and device, terminal and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010763471.3A CN111863012A (en) 2020-07-31 2020-07-31 Audio signal processing method and device, terminal and storage medium

Publications (1)

Publication Number Publication Date
CN111863012A true CN111863012A (en) 2020-10-30

Family

ID=72954127

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010763471.3A Pending CN111863012A (en) 2020-07-31 2020-07-31 Audio signal processing method and device, terminal and storage medium

Country Status (1)

Country Link
CN (1) CN111863012A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113132519A (en) * 2021-04-14 2021-07-16 Oppo广东移动通信有限公司 Electronic device, voice recognition method for electronic device, and storage medium
CN113314135A (en) * 2021-05-25 2021-08-27 北京小米移动软件有限公司 Sound signal identification method and device

Citations (16)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20050074129A1 (en) * 2001-08-01 2005-04-07 Dashen Fan Cardioid beam with a desired null based acoustic devices, systems and methods
US20050195988A1 (en) * 2004-03-02 2005-09-08 Microsoft Corporation System and method for beamforming using a microphone array
JP2009201075A (en) * 2008-02-25 2009-09-03 Kyocera Corp Base station, and wireless communication method
CN102324237A (en) * 2011-05-30 2012-01-18 深圳市华新微声学技术有限公司 Microphone array voice wave beam formation method, speech signal processing device and system
GB201322975D0 (en) * 2013-11-07 2014-02-12 Continental Automotive Systems Cotalker nulling based on multi super directional beamformer
US20140049596A1 (en) * 2012-08-20 2014-02-20 Abdel-Aziz El-Solh Localization Algorithm for Conferencing
CN105979442A (en) * 2016-07-22 2016-09-28 北京地平线机器人技术研发有限公司 Noise suppression method and device and mobile device
CN106887239A (en) * 2008-01-29 2017-06-23 高通股份有限公司 For the enhanced blind source separation algorithm of the mixture of height correlation
US9930448B1 (en) * 2016-11-09 2018-03-27 Northwestern Polytechnical University Concentric circular differential microphone arrays and associated beamforming
CN108631851A (en) * 2017-10-27 2018-10-09 西安电子科技大学 The Adaptive beamformer method deepened based on uniform linear array null
CN108694957A (en) * 2018-04-08 2018-10-23 湖北工业大学 The echo cancelltion design method formed based on circular microphone array beams
CN109102822A (en) * 2018-07-25 2018-12-28 出门问问信息科技有限公司 A kind of filtering method and device formed based on fixed beam
CN109119092A (en) * 2018-08-31 2019-01-01 广东美的制冷设备有限公司 Beam position switching method and apparatus based on microphone array
CN110265020A (en) * 2019-07-12 2019-09-20 大象声科(深圳)科技有限公司 Voice awakening method, device and electronic equipment, storage medium
CN110310651A (en) * 2018-03-25 2019-10-08 深圳市麦吉通科技有限公司 Adaptive voice processing method, mobile terminal and the storage medium of Wave beam forming
CN111327984A (en) * 2020-02-27 2020-06-23 北京声加科技有限公司 Earphone auxiliary listening method based on null filtering and ear-worn equipment

Patent Citations (16)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20050074129A1 (en) * 2001-08-01 2005-04-07 Dashen Fan Cardioid beam with a desired null based acoustic devices, systems and methods
US20050195988A1 (en) * 2004-03-02 2005-09-08 Microsoft Corporation System and method for beamforming using a microphone array
CN106887239A (en) * 2008-01-29 2017-06-23 高通股份有限公司 For the enhanced blind source separation algorithm of the mixture of height correlation
JP2009201075A (en) * 2008-02-25 2009-09-03 Kyocera Corp Base station, and wireless communication method
CN102324237A (en) * 2011-05-30 2012-01-18 深圳市华新微声学技术有限公司 Microphone array voice wave beam formation method, speech signal processing device and system
US20140049596A1 (en) * 2012-08-20 2014-02-20 Abdel-Aziz El-Solh Localization Algorithm for Conferencing
GB201322975D0 (en) * 2013-11-07 2014-02-12 Continental Automotive Systems Cotalker nulling based on multi super directional beamformer
CN105979442A (en) * 2016-07-22 2016-09-28 北京地平线机器人技术研发有限公司 Noise suppression method and device and mobile device
US9930448B1 (en) * 2016-11-09 2018-03-27 Northwestern Polytechnical University Concentric circular differential microphone arrays and associated beamforming
CN108631851A (en) * 2017-10-27 2018-10-09 西安电子科技大学 The Adaptive beamformer method deepened based on uniform linear array null
CN110310651A (en) * 2018-03-25 2019-10-08 深圳市麦吉通科技有限公司 Adaptive voice processing method, mobile terminal and the storage medium of Wave beam forming
CN108694957A (en) * 2018-04-08 2018-10-23 湖北工业大学 The echo cancelltion design method formed based on circular microphone array beams
CN109102822A (en) * 2018-07-25 2018-12-28 出门问问信息科技有限公司 A kind of filtering method and device formed based on fixed beam
CN109119092A (en) * 2018-08-31 2019-01-01 广东美的制冷设备有限公司 Beam position switching method and apparatus based on microphone array
CN110265020A (en) * 2019-07-12 2019-09-20 大象声科(深圳)科技有限公司 Voice awakening method, device and electronic equipment, storage medium
CN111327984A (en) * 2020-02-27 2020-06-23 北京声加科技有限公司 Earphone auxiliary listening method based on null filtering and ear-worn equipment

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
MOHAMMAD J. TAGHIZADEH, ET AL., IEEE XPLORE, 1 June 2011 (2011-06-01) *
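陈小燕: "Research on robust microphone array beamforming speech enhancement algorithms in reverberant environments", China Master's Theses Full-text Database, Engineering Science and Technology I, 15 March 2018 (2018-03-15) *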

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113132519A (en) * 2021-04-14 2021-07-16 Oppo广东移动通信有限公司 Electronic device, voice recognition method for electronic device, and storage medium
CN113314135A (en) * 2021-05-25 2021-08-27 北京小米移动软件有限公司 Sound signal identification method and device
CN113314135B (en) * 2021-05-25 2024-04-26 北京小米移动软件有限公司 Voice signal identification method and device

Similar Documents

Publication Publication Date Title
KR102175602B1 (en) Audio focusing via multiple microphones
EP3576430B1 (en) Audio signal processing method and device, and storage medium
KR102150013B1 (en) Beamforming method and apparatus for sound signal
JP6964666B2 (en) Multi-beam selection method and equipment
CN111128221B (en) Audio signal processing method and device, terminal and storage medium
CN112866894B (en) Sound field control method and device, mobile terminal and storage medium
CN111863012A (en) Audio signal processing method and device, terminal and storage medium
CN112185388B (en) Speech recognition method, device, equipment and computer readable storage medium
CN111179960A (en) Audio signal processing method and device and storage medium
CN110392334B (en) Microphone array audio signal self-adaptive processing method, device and medium
CN113053406B (en) Voice signal identification method and device
CN113506582A (en) Sound signal identification method, device and system
CN112447184A (en) Voice signal processing method and device, electronic equipment and storage medium
CN116095254B (en) Audio processing method and device
CN113488066B (en) Audio signal processing method, audio signal processing device and storage medium
CN116405774A (en) Video processing method and electronic equipment
WO2018090343A1 (en) Microphone, and method and device for audio processing
CN112752191A (en) Audio acquisition method, device and storage medium
WO2021027049A1 (en) Sound acquisition method and device, and medium
CN112099364A (en) Intelligent interaction method for Internet of things household equipment
CN116705047B (en) Audio acquisition method, device and storage medium
CN112804462B (en) Multi-point focusing imaging method and device, mobile terminal and storage medium
CN113223548B (en) Sound source positioning method and device
CN113223543B (en) Speech enhancement method, device and storage medium
WO2023286680A1 (en) Electronic device, program, and system

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination