CN111261178A - Beam forming method and device - Google Patents

Info

Publication number
CN111261178A
Authority
CN
China
Prior art keywords
frequency
coefficients
coefficient
time domain
mth microphone
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201811453561.1A
Other languages
Chinese (zh)
Inventor
耿岭
陈宇
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Jingdong Century Trading Co Ltd
Beijing Jingdong Shangke Information Technology Co Ltd
Original Assignee
Beijing Jingdong Century Trading Co Ltd
Beijing Jingdong Shangke Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Jingdong Century Trading Co Ltd, Beijing Jingdong Shangke Information Technology Co Ltd
Priority to CN201811453561.1A
Publication of CN111261178A

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00 Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02 Speech enhancement, e.g. noise reduction or echo cancellation
    • G10L21/0208 Noise filtering
    • G10L21/0216 Noise filtering characterised by the method used for estimating noise
    • G10L21/0232 Processing in the frequency domain
    • G10L2021/02161 Number of inputs available containing the signal or the noise to be suppressed
    • G10L2021/02166 Microphone arrays; Beamforming

Abstract

The present disclosure provides a beamforming method and apparatus. The beamforming apparatus divides a designated frequency range to obtain a predetermined number of frequency bands, applies the same response constraint in the time domain to the center frequency of each frequency band to obtain corresponding time domain coefficients, converts the time domain coefficients from the time domain to the frequency domain to obtain corresponding frequency domain coefficients, band-pass filters the frequency domain coefficients so that only the coefficient associated with the corresponding center frequency point is retained, yielding corresponding filter coefficients, synthesizes the filter coefficients to obtain corresponding beamforming coefficients, and performs beamforming processing using the beamforming coefficients. The present disclosure imposes no requirement on the number of microphones in the array and is applicable to microphone arrays of different array types.

Description

Beam forming method and device
Technical Field
The present disclosure relates to the field of information processing, and in particular, to a method and an apparatus for beamforming.
Background
A speech signal is a broadband signal, and for a beamforming algorithm based on narrowband signals, the beam response to a speech signal changes with the signal frequency. To make the beam response invariant to changes in signal frequency, especially over the frequency range of interest in the main lobe direction, two approaches are mainly used. The first approach uses the relationship between frequency and aperture in beamforming and changes the effective aperture of the array so that the beams corresponding to different frequencies remain constant. The second approach makes the synthesized beam approximate the desired beam according to a predetermined criterion.
Disclosure of Invention
The inventors have found through studies that the first approach described above requires the array shape to strictly satisfy the requirement of the extended structure, requires a large array size, and requires a large number of array elements. The second approach can use an array of any shape, but the Bessel series needs to be retained to more than ten orders for good accuracy, so that more array elements are required.
To this end, the present disclosure provides a beamforming scheme to accommodate microphone arrays of different numbers of elements.
In accordance with an aspect of one or more embodiments of the present disclosure, there is provided a beamforming method including: dividing the designated frequency range to obtain a predetermined number of frequency bands; applying the same response constraint to the center frequency of each frequency band in the time domain to obtain a corresponding time domain coefficient; converting the time domain coefficients from the time domain to the frequency domain to obtain corresponding frequency domain coefficients; band-pass filtering the frequency domain coefficients so as to retain only the coefficients associated with the corresponding center frequency points, thereby obtaining corresponding filter coefficients; synthesizing the filter coefficients to obtain corresponding beamforming coefficients; and performing beam forming processing by using the beam forming coefficient.
In some embodiments, converting the time domain coefficients from the time domain to the frequency domain comprises: expanding the time domain coefficients associated with the mth microphone by zero padding to obtain the expansion coefficient of the mth microphone, wherein the expansion coefficient of the mth microphone is an N-dimensional vector, N is the number of frequency bands, 1 ≤ m ≤ M, and M is the total number of microphones; and converting the expansion coefficient of the mth microphone from the time domain to the frequency domain to obtain the frequency domain coefficient of the mth microphone at the center frequency point of the corresponding frequency band.
In some embodiments, band-pass filtering the frequency domain coefficients comprises: for the mth microphone, setting the frequency domain coefficients corresponding to all frequency points other than the nth frequency point to zero to obtain the filter coefficient of the mth microphone at the nth frequency point, wherein 1 ≤ n ≤ N.
In some embodiments, synthesizing the filter coefficients comprises: synthesizing the filter coefficients of the mth microphone at each frequency point to obtain a synthesis coefficient of the mth microphone; and synthesizing the synthesis coefficients of the microphones to obtain corresponding beam forming coefficients.
In some embodiments, the response constraints include: the response is a first specified value in a main lobe region of the sound wave incident angle theta, and the response is a second specified value in a side lobe region of the sound wave incident angle theta, wherein the first specified value is larger than the second specified value.
In some embodiments, the first specified value is 1 and the second specified value is 0.
In some embodiments, the response constraint imposed in the time domain for the center frequency of each frequency band is associated with the center frequency of the respective frequency band, the acoustic incident angle θ, and the corresponding time domain coefficients.
In accordance with another aspect of one or more embodiments of the present disclosure, there is provided a beamforming apparatus including: a frequency band dividing module configured to divide a specified frequency range to obtain a predetermined number of frequency bands; a time domain coefficient processing module configured to apply the same response constraint in the time domain to the center frequency of each frequency band to obtain a corresponding time domain coefficient; a conversion module configured to convert the time domain coefficients from the time domain to the frequency domain to obtain corresponding frequency domain coefficients; a filtering module configured to band-pass filter the frequency domain coefficients, so as to retain only coefficients associated with corresponding center frequency points, thereby obtaining corresponding filter coefficients; a synthesis module configured to synthesize the filter coefficients to obtain corresponding beamforming coefficients; and the beam forming processing module is configured to perform beam forming processing by using the beam forming coefficient.
In some embodiments, the conversion module is configured to expand the time domain coefficients associated with the mth microphone by zero padding to obtain the expansion coefficient of the mth microphone, where the expansion coefficient of the mth microphone is an N-dimensional vector, N is the number of frequency bands, 1 ≤ m ≤ M, and M is the total number of microphones; and to convert the expansion coefficient of the mth microphone from the time domain to the frequency domain to obtain the frequency domain coefficient of the mth microphone at the center frequency point of the corresponding frequency band.
In some embodiments, the filtering module is configured to set to zero, for the mth microphone, the frequency domain coefficients corresponding to all frequency points other than the nth frequency point to obtain the filter coefficient of the mth microphone at the nth frequency point, wherein 1 ≤ n ≤ N.
In some embodiments, the synthesis module is configured to synthesize the filter coefficients of the mth microphone at each frequency point to obtain a synthesis coefficient of the mth microphone, and to synthesize the synthesis coefficients of the microphones to obtain corresponding beamforming coefficients.
In some embodiments, the response constraints include a response at a first specified value in a main lobe region of the acoustic angle of incidence θ and a response at a second specified value in a side lobe region of the acoustic angle of incidence θ, the first specified value being greater than the second specified value.
In some embodiments, the first specified value is 1 and the second specified value is 0.
In some embodiments, the response constraint imposed in the time domain for the center frequency of each frequency band is associated with the center frequency of the respective frequency band, the acoustic incident angle θ, and the corresponding time domain coefficients.
In accordance with another aspect of one or more embodiments of the present disclosure, there is provided a beamforming apparatus including: a memory configured to store instructions; a processor coupled to the memory, the processor configured to perform a method according to any of the embodiments described above based on instructions stored in the memory.
According to another aspect of one or more embodiments of the present disclosure, there is provided a computer-readable storage medium, wherein the computer-readable storage medium stores computer instructions, which when executed by a processor, implement a method as described above in relation to any one of the embodiments.
Other features of the present disclosure and advantages thereof will become apparent from the following detailed description of exemplary embodiments thereof, which proceeds with reference to the accompanying drawings.
Drawings
In order to illustrate the embodiments of the present disclosure or the technical solutions in the prior art more clearly, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. The drawings in the following description show only some embodiments of the present disclosure, and those skilled in the art can obtain other drawings from them without inventive effort.
Fig. 1 is an exemplary flow chart of a beamforming method of one embodiment of the present disclosure;
Fig. 2 is an exemplary block diagram of a beamforming apparatus of one embodiment of the present disclosure;
Fig. 3 is an exemplary block diagram of a beamforming apparatus of another embodiment of the present disclosure;
Fig. 4 is a schematic diagram of a beamforming scheme of one embodiment of the present disclosure.
Detailed Description
The technical solutions in the embodiments of the present disclosure will be clearly and completely described below with reference to the drawings in the embodiments of the present disclosure, and it is obvious that the described embodiments are only a part of the embodiments of the present disclosure, and not all of the embodiments. The following description of at least one exemplary embodiment is merely illustrative in nature and is in no way intended to limit the disclosure, its application, or uses. All other embodiments, which can be derived by a person skilled in the art from the embodiments disclosed herein without making any creative effort, shall fall within the protection scope of the present disclosure.
The relative arrangement of the components and steps, the numerical expressions, and numerical values set forth in these embodiments do not limit the scope of the present disclosure unless specifically stated otherwise.
Meanwhile, it should be understood that the sizes of the respective portions shown in the drawings are not drawn in an actual proportional relationship for the convenience of description.
Techniques, methods, and apparatus known to those of ordinary skill in the relevant art may not be discussed in detail but are intended to be part of the specification where appropriate.
In all examples shown and discussed herein, any particular value should be construed as merely illustrative, and not limiting. Thus, other examples of the exemplary embodiments may have different values.
It should be noted that: like reference numbers and letters refer to like items in the following figures, and thus, once an item is defined in one figure, further discussion thereof is not required in subsequent figures.
Fig. 1 is an exemplary flowchart of a beamforming method according to an embodiment of the present disclosure. In some embodiments, the method steps of the present embodiment may be performed by a beamforming device.
In step 101, a specified frequency range is divided to obtain a predetermined number of frequency bands.
In some embodiments, the center frequency of each band is:
Figure BDA0001887187490000051
where 1 ≤ n ≤ N, N is the number of frequency bands, and Fs is the sampling frequency. For example, the sampling frequency is 16000 Hz. Since the audio frequency range that can be perceived by the human ear is limited, only the first 100 frequency bands need be considered here, i.e. the frequency range is 0 to 3093 Hz.
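The quoted upper limit of 3093 Hz is consistent with uniformly spaced band centers. The following is a minimal sketch under two assumptions that are not stated in the text above: the center frequencies take the form f_n = (n - 1)·Fs/N, and N = 512. Both are chosen only because they reproduce the quoted range; the function name `band_centers` is hypothetical.

```python
import numpy as np

def band_centers(fs: float, n_bands: int) -> np.ndarray:
    """Assumed form of the band centers: f_n = (n - 1) * fs / n_bands, n = 1..n_bands."""
    return np.arange(n_bands) * fs / n_bands

FS = 16000.0     # sampling frequency given in the text
N_BANDS = 512    # assumption: the text does not state N

centers = band_centers(FS, N_BANDS)
first_100 = centers[:100]   # the bands considered in the example
```

Under these assumptions the 100th center frequency is 99 · 16000 / 512 = 3093.75 Hz, which matches the stated range.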
In step 102, the same response constraint is applied in the time domain for the center frequency of each band to obtain the corresponding time domain coefficients.
In some embodiments, the response constraint imposed in the time domain for the center frequency of each frequency band is associated with the center frequency of the respective frequency band, the acoustic incident angle θ, and the corresponding time domain coefficients.
For example, let the position of the mth microphone be $P_m = [x_m, y_m]^T$, $0 \le m \le M-1$, where $M$ is the number of microphones, and let the sound incidence angle be $\theta$. The beam response at frequency $f_n$ is then defined as:
Figure BDA0001887187490000052
where J is the number of filter taps, e.g., J = 200.
Figure BDA0001887187490000053
is the coefficient to be designed, where the superscript t denotes the time domain, i.e. the coefficient is real. Ts is the sampling period. $\tau_m(\theta)$ represents the time delay of the sound wave arriving at the mth microphone relative to the array origin. The corresponding calculation formula is as follows:
Figure BDA0001887187490000054
where V is the propagation speed of sound in air, V = 340 m/s. Let
Figure BDA0001887187490000055
Figure BDA0001887187490000056
Figure BDA0001887187490000057
Figure BDA0001887187490000058
where the symbol $\otimes$ denotes the Kronecker product. Thus, the beam response can be expressed as:
Figure BDA0001887187490000061
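Since the formulas themselves are not reproduced here, the following is only a sketch of how such a beam response could be evaluated. It assumes a far-field plane wave with delay τ_m(θ) = -(x_m cos θ + y_m sin θ)/V, a filter-and-sum response Σ_m Σ_j w_{m,j} e^{-i2πf(τ_m(θ) + j·Ts)}, and a microphone-major ordering of the stacked vectors. The sign conventions and the names `delays` and `beam_response` are assumptions, not taken from the source.

```python
import numpy as np

V = 340.0  # speed of sound in air (m/s), as given in the text

def delays(positions: np.ndarray, theta: float) -> np.ndarray:
    """Assumed far-field delay of each microphone relative to the array origin.

    positions: (M, 2) array of [x_m, y_m] in meters; theta in radians.
    """
    direction = np.array([np.cos(theta), np.sin(theta)])
    return -(positions @ direction) / V

def beam_response(w_t: np.ndarray, positions: np.ndarray,
                  f: float, theta: float, ts: float) -> complex:
    """Response of an M x J real filter-and-sum beamformer at frequency f, angle theta."""
    n_taps = w_t.shape[1]
    spatial = np.exp(-2j * np.pi * f * delays(positions, theta))   # length M
    temporal = np.exp(-2j * np.pi * f * np.arange(n_taps) * ts)    # length J
    steering = np.kron(spatial, temporal)                          # length M*J
    return complex(w_t.ravel() @ steering)

# Usage: 4-microphone linear array along x, one tap of weight 1/M per microphone.
positions = np.array([[-0.06, 0.0], [-0.02, 0.0], [0.02, 0.0], [0.06, 0.0]])
w_t = np.zeros((4, 8))
w_t[:, 0] = 0.25
broadside = beam_response(w_t, positions, 1000.0, np.pi / 2, 1.0 / 16000.0)
```

With these weights the array behaves as a plain delay-free average, so the broadside response has unit magnitude. The Kronecker product builds the stacked steering vector in the same order as `w_t.ravel()`.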
Let the width of the main lobe be $\theta_{width}$ (e.g., a main lobe width of 30°, although this may be adjusted as necessary), and divide the entire angular range
Figure BDA0001887187490000062
into a main lobe region $\Theta_m = [\theta - 0.5\theta_{width},\ \theta + 0.5\theta_{width}]$ and a side lobe region $\Theta_S = \Theta - \Theta_m$.
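The partition into main lobe and side lobe regions can be sketched on a discrete angle grid. The sketch assumes the full range is [0°, 360°) sampled in 1° steps with wrap-around at 360°; the actual range and grid are not reproduced in the text, and `split_regions` is a hypothetical helper.

```python
import numpy as np

def split_regions(theta_deg: float, width_deg: float, step_deg: float = 1.0):
    """Split an assumed [0, 360) degree grid into main lobe and side lobe angles."""
    grid = np.arange(0.0, 360.0, step_deg)
    # Smallest angular distance to the look direction, with wrap-around at 360 degrees.
    diff = np.abs((grid - theta_deg + 180.0) % 360.0 - 180.0)
    in_main = diff <= 0.5 * width_deg
    return grid[in_main], grid[~in_main]

# Usage: look direction 90 degrees, main lobe width 30 degrees (value from the text).
main, side = split_regions(theta_deg=90.0, width_deg=30.0)
```

The response constraint would then be evaluated on `main` and `side` separately.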
In some embodiments, the response constraint is: the response is a first specified value in a main lobe region of the sound wave incident angle θ, and the response is a second specified value in a side lobe region of the sound wave incident angle θ, wherein the first specified value is larger than the second specified value. For example, the first specified value is 1 and the second specified value is 0. This can be formulated as the following quadratic programming problem:
Figure BDA0001887187490000063
Figure BDA0001887187490000064
α is a balancing factor, e.g., 0.01, although it may be adjusted as needed.
To facilitate the use of tools to solve, the above equation can be further written as:
Figure BDA0001887187490000065
$C^T w^t = f$ (12)
Figure BDA0001887187490000066
where $C = [\mathrm{Re}\{s(f_n,\theta)\},\ \mathrm{Im}\{s(f_n,\theta)\}]$ and $f = [1\ 0]^T$. $\mathrm{Re}\{\cdot\}$ denotes the real part of a complex number, $\mathrm{Im}\{\cdot\}$ denotes the imaginary part of a complex number, and the superscript H denotes the conjugate transpose.
Since the solution to the quadratic programming problem is known to those skilled in the art, it is not described here.
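For orientation only, a quadratic objective with a linear equality constraint of the form $C^T w^t = f$ has a closed-form solution via Lagrange multipliers. The sketch below assumes the objective can be written as w^T Q w with a symmetric positive definite matrix Q. The exact objective of the source is not reproduced here, and Q, `solve_equality_qp`, and the random test data are hypothetical.

```python
import numpy as np

def solve_equality_qp(q: np.ndarray, c: np.ndarray, f: np.ndarray) -> np.ndarray:
    """Minimize w^T Q w subject to C^T w = f, for symmetric positive definite Q.

    Closed form: w = Q^{-1} C (C^T Q^{-1} C)^{-1} f.
    """
    q_inv_c = np.linalg.solve(q, c)                  # Q^{-1} C
    multipliers = np.linalg.solve(c.T @ q_inv_c, f)  # (C^T Q^{-1} C)^{-1} f
    return q_inv_c @ multipliers

# Usage with random stand-in data (not derived from any array geometry).
rng = np.random.default_rng(0)
a = rng.standard_normal((8, 8))
q = a @ a.T + 8.0 * np.eye(8)       # symmetric positive definite
c = rng.standard_normal((8, 2))     # stands in for [Re{s}, Im{s}]
f = np.array([1.0, 0.0])
w = solve_equality_qp(q, c, f)
```

A general-purpose quadratic programming solver would be needed if inequality constraints were added.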
In step 103, the time domain coefficients are converted from the time domain to the frequency domain to obtain corresponding frequency domain coefficients.
In some embodiments, converting the time domain coefficients from the time domain to the frequency domain comprises: expanding the time domain coefficients associated with the mth microphone by zero padding to obtain the expansion coefficient of the mth microphone, wherein the expansion coefficient of the mth microphone is an N-dimensional vector, N is the number of frequency bands, 1 ≤ m ≤ M, and M is the total number of microphones.
Since $N > J$, the time domain coefficients need to be zero-padded before the transform. For example, the time domain coefficients associated with the mth microphone are expanded by zero padding to obtain the expansion coefficient $W_{m,exp}$ of the mth microphone:
$W_{m,exp} = [\omega_{m,0}, \ldots, \omega_{m,J-1}, 0, \ldots, 0]^T$ (14)
where $W_{m,exp}$ is a column vector of dimension N, representing the expansion coefficient of the mth microphone. Next, the following formula is used:
$W_{m,n} = \mathrm{FFT}(W_{m,exp})$ (15)
to obtain the frequency domain coefficient $W_{m,n}$ of the mth microphone at the nth center frequency point, where FFT denotes the fast Fourier transform.
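Equations (14) and (15) can be sketched directly in NumPy. The values J = 200 and N = 512 below are illustrative (J is the example value from the text, N is an assumption), and `to_frequency_domain` is a hypothetical name.

```python
import numpy as np

def to_frequency_domain(w_m: np.ndarray, n_points: int) -> np.ndarray:
    """Zero-pad J time domain coefficients to length N (eq. 14) and apply the FFT (eq. 15)."""
    if n_points < w_m.size:
        raise ValueError("n_points must be at least the number of taps")
    w_exp = np.concatenate([w_m, np.zeros(n_points - w_m.size)])
    return np.fft.fft(w_exp)

# Usage with random stand-in coefficients for one microphone.
rng = np.random.default_rng(1)
w_m = rng.standard_normal(200)           # J = 200 real taps
w_freq = to_frequency_domain(w_m, 512)   # N = 512 frequency points
```

The same result is obtained with `np.fft.fft(w_m, n=512)`, which pads with zeros internally.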
In step 104, the frequency domain coefficients are band-pass filtered so that only the coefficients associated with the corresponding center frequency point are retained, resulting in corresponding filter coefficients.
In some embodiments, for the mth microphone, the frequency domain coefficients corresponding to all frequency points other than the nth frequency point are set to zero to obtain the filter coefficient of the mth microphone at the nth frequency point, wherein 1 ≤ n ≤ N.
For example, for the coefficient of the mth microphone at the nth frequency point:
Figure BDA0001887187490000071
the coefficients after band-pass filtering are:
Figure BDA0001887187490000072
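The band-pass step described above amounts to masking all bins except one. A minimal sketch, in which `keep_single_bin` is a hypothetical helper and the index n is 0-based (the text counts frequency points from 1):

```python
import numpy as np

def keep_single_bin(w_freq: np.ndarray, n: int) -> np.ndarray:
    """Zero every frequency domain coefficient except the one at index n (0-based)."""
    filtered = np.zeros_like(w_freq)
    filtered[n] = w_freq[n]
    return filtered

# Usage on stand-in frequency domain coefficients of one microphone.
w_freq = np.fft.fft(np.arange(1.0, 9.0))   # 8 frequency points
filtered = keep_single_bin(w_freq, 2)
```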
In step 105, the filter coefficients are synthesized to obtain corresponding beamforming coefficients.
In some embodiments, synthesizing the filter coefficients comprises: synthesizing the filter coefficients of the mth microphone at each frequency point to obtain a synthesis coefficient of the mth microphone, and synthesizing the synthesis coefficients of the microphones to obtain corresponding beamforming coefficients.
For example, after the coefficients for each frequency point are calculated, the coefficient synthesis may be performed as follows. For the mth microphone, the synthesis result is:
Figure BDA0001887187490000073
where
Figure BDA0001887187490000074
is the coefficient of the mth microphone at the nth frequency point. Thus, the resulting beamforming coefficients are:
Figure BDA0001887187490000075
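One plausible reading of the synthesis, offered as an assumption because the synthesis formulas are not reproduced here, is that the single-bin filter coefficients are summed per microphone, so that bin n of the result comes from the design made for band n, and the per-microphone results are then stacked. The array layout and the name `synthesize` are hypothetical.

```python
import numpy as np

def synthesize(designs: np.ndarray) -> np.ndarray:
    """Combine per-band designs into one coefficient matrix.

    designs[n, m, :] holds the N frequency domain coefficients of microphone m
    obtained from the design for band n (0-based). Returns an (M, N) array whose
    column n is taken from the design for band n; undesigned bins stay zero.
    """
    n_bands, n_mics, n_points = designs.shape
    out = np.zeros((n_mics, n_points), dtype=designs.dtype)
    for n in range(n_bands):
        out[:, n] = designs[n, :, n]
    return out

# Usage with random stand-in data: 3 designed bands, 2 microphones, 8 frequency points.
rng = np.random.default_rng(2)
designs = rng.standard_normal((3, 2, 8)) + 1j * rng.standard_normal((3, 2, 8))
coeffs = synthesize(designs)
```

Summing the masked vectors and copying one bin per band give the same result, because the masked vectors never overlap.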
In step 106, beamforming processing is performed using the beamforming coefficients.
Since the corresponding beamforming process using beamforming coefficients is known to those skilled in the art, it is not described herein.
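As one common way to apply frequency domain coefficients, a single frame can be processed by frequency domain filter-and-sum. This is a sketch under assumptions: the coefficients multiply the microphone spectra directly (some formulations use the conjugate), only bins 0 to N/2 are used so that the output is real, and windowing and overlap handling between frames are omitted. `beamform_frame` is a hypothetical name.

```python
import numpy as np

def beamform_frame(frames: np.ndarray, coeffs: np.ndarray) -> np.ndarray:
    """Frequency domain filter-and-sum for one frame.

    frames: (M, N) real time domain samples, one row per microphone.
    coeffs: (M, N) complex frequency domain beamforming coefficients.
    Returns the N-sample real output frame.
    """
    n_points = frames.shape[1]
    spectra = np.fft.rfft(frames, axis=1)          # bins 0 .. N/2
    half = coeffs[:, : n_points // 2 + 1]
    out_spectrum = np.sum(half * spectra, axis=0)  # weight and sum across microphones
    return np.fft.irfft(out_spectrum, n=n_points)

# Usage: identical signals on 4 microphones and uniform weights 1/M.
rng = np.random.default_rng(3)
x = rng.standard_normal(512)
frames = np.tile(x, (4, 1))
coeffs = np.full((4, 512), 0.25, dtype=complex)
y = beamform_frame(frames, coeffs)
```

With uniform weights and identical inputs the output reproduces the input frame, which is a quick sanity check of the transform pair.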
In the beamforming method provided in the foregoing embodiment of the present disclosure, a specified frequency range is divided to obtain a predetermined number of frequency bands; the same response constraint is applied in the time domain to the center frequency of each frequency band to obtain corresponding time domain coefficients; the time domain coefficients are converted from the time domain to the frequency domain to obtain corresponding frequency domain coefficients; the frequency domain coefficients are band-pass filtered so that only the coefficient associated with the corresponding center frequency point is retained, yielding corresponding filter coefficients; the filter coefficients are synthesized to obtain corresponding beamforming coefficients; and beamforming processing is performed using the beamforming coefficients. The present disclosure imposes no requirement on the number of microphones in the array and is applicable to microphone arrays of different array types. For example, the present disclosure is applicable to circular microphone arrays as well as to other arrays such as linear microphone arrays.
Fig. 2 is an exemplary block diagram of a beamforming apparatus according to an embodiment of the present disclosure. As shown in fig. 2, the beamforming apparatus includes a band division module 21, a time domain coefficient processing module 22, a conversion module 23, a filtering module 24, a synthesis module 25, and a beamforming processing module 26.
The band division module 21 is configured to divide the specified frequency range to obtain a predetermined number of frequency bands.
In some embodiments, the center frequency of each band is as shown in equation (1) above.
The time domain coefficient processing module 22 is configured to apply the same response constraint in the time domain for the center frequency of each frequency band to obtain the corresponding time domain coefficient.
In some embodiments, the response constraint imposed in the time domain for the center frequency of each frequency band is associated with the center frequency of the respective frequency band, the acoustic incident angle θ, and the corresponding time domain coefficients.
In some embodiments, the response constraint is: the response is a first specified value in a main lobe region of the sound wave incident angle θ, and the response is a second specified value in a side lobe region of the sound wave incident angle θ, wherein the first specified value is larger than the second specified value. For example, the first specified value is 1 and the second specified value is 0. For example, the above constraint can be formulated as the quadratic programming problem described in equations (9) and (10) above. Since the solution to the quadratic programming problem is known to those skilled in the art, it is not described here.
The conversion module 23 is configured to convert the time domain coefficients from the time domain to the frequency domain to obtain corresponding frequency domain coefficients.
In some embodiments, the conversion module 23 is configured to expand the time domain coefficients associated with the mth microphone by zero padding to obtain the expansion coefficient of the mth microphone, where the expansion coefficient of the mth microphone is an N-dimensional vector, N is the number of frequency bands, 1 ≤ m ≤ M, and M is the total number of microphones; and to convert the expansion coefficient of the mth microphone from the time domain to the frequency domain to obtain the frequency domain coefficient of the mth microphone at the center frequency point of the corresponding frequency band.
For example, the time domain coefficients associated with the mth microphone may be extended by zero padding according to the above equation (14), and the time-frequency domain conversion may be performed according to the above equation (15).
The filtering module 24 is configured to band-pass filter the frequency domain coefficients, so as to retain only the coefficients associated with the corresponding center frequency point, thereby obtaining corresponding filter coefficients.
In some embodiments, the filtering module 24 is configured to set to zero, for the mth microphone, the frequency domain coefficients corresponding to all frequency points other than the nth frequency point to obtain the filter coefficient of the mth microphone at the nth frequency point, where 1 ≤ n ≤ N.
For example, the coefficients of the mth microphone at the nth frequency point may be band-pass filtered using the above equations (16), (17).
The synthesis module 25 is configured to synthesize the filter coefficients to obtain corresponding beamforming coefficients.
In some embodiments, the synthesis module 25 is configured to synthesize the filter coefficients of the mth microphone at each frequency point to obtain a synthesis coefficient of the mth microphone, and synthesize the synthesis coefficients of the microphones to obtain a corresponding beamforming coefficient.
For example, the coefficient synthesis may be performed according to the above equations (18), (19) to obtain the corresponding beamforming coefficients.
The beamforming processing module 26 is configured to perform beamforming processing using beamforming coefficients.
Since the corresponding beamforming process using beamforming coefficients is known to those skilled in the art, it is not described herein.
In the beamforming apparatus provided in the foregoing embodiment of the present disclosure, a specified frequency range is divided to obtain a predetermined number of frequency bands; the same response constraint is applied in the time domain to the center frequency of each frequency band to obtain corresponding time domain coefficients; the time domain coefficients are converted from the time domain to the frequency domain to obtain corresponding frequency domain coefficients; the frequency domain coefficients are band-pass filtered so that only the coefficient associated with the corresponding center frequency point is retained, yielding corresponding filter coefficients; the filter coefficients are synthesized to obtain corresponding beamforming coefficients; and beamforming processing is performed using the beamforming coefficients. The present disclosure imposes no requirement on the number of microphones in the array and is applicable to microphone arrays of different array types. For example, the present disclosure is applicable to circular microphone arrays as well as to other arrays such as linear microphone arrays.
Fig. 3 is an exemplary block diagram of a beamforming apparatus according to another embodiment of the present disclosure. As shown in fig. 3, the beamforming apparatus comprises a memory 31 and a processor 32.
The memory 31 is used for storing instructions, the processor 32 is coupled to the memory 31, and the processor 32 is configured to execute the method according to any embodiment in fig. 1 based on the instructions stored in the memory.
As shown in fig. 3, the beam forming apparatus further includes a communication interface 33 for information interaction with other devices. Meanwhile, the device also comprises a bus 34, and the processor 32, the communication interface 33 and the memory 31 are communicated with each other through the bus 34.
The memory 31 may comprise a high-speed RAM memory, and may further include a non-volatile memory, such as at least one disk memory. The memory 31 may also be a memory array. The memory 31 may also be partitioned into blocks, and the blocks may be combined into virtual volumes according to certain rules.
Further, the processor 32 may be a central processing unit CPU, or may be an application specific integrated circuit ASIC, or one or more integrated circuits configured to implement embodiments of the present disclosure.
The present disclosure also relates to a computer-readable storage medium, wherein the computer-readable storage medium stores computer instructions, and the instructions, when executed by a processor, implement the method according to any one of the embodiments in fig. 1.
Fig. 4 is a schematic diagram of a beamforming scheme of one embodiment of the present disclosure. As shown in fig. 4, a predetermined number of frequency bands are obtained by dividing a specified frequency range. The same response constraint is applied in the time domain to the center frequency of each band to obtain the corresponding time domain coefficients. The time domain coefficients are transformed from the time domain to the frequency domain by the FFT to obtain corresponding frequency domain coefficients, which are then band-pass filtered in the frequency domain so that only the coefficient associated with the corresponding center frequency point is retained, yielding corresponding filter coefficients. Finally, the filter coefficients are synthesized to obtain the beamforming coefficients over the whole frequency range of interest, and beamforming processing is performed using the beamforming coefficients. The scheme is therefore applicable to microphone arrays of different array types, such as linear arrays.
In some embodiments, the functional unit modules described above may be implemented as a general purpose Processor, a Programmable Logic Controller (PLC), a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or other Programmable Logic device, discrete gate or transistor Logic, discrete hardware components, or any suitable combination thereof for performing the functions described in this disclosure.
It will be understood by those skilled in the art that all or part of the steps for implementing the above embodiments may be implemented by hardware, or may be implemented by a program instructing relevant hardware, where the program may be stored in a computer-readable storage medium, and the above-mentioned storage medium may be a read-only memory, a magnetic disk or an optical disk, etc.
The description of the present disclosure has been presented for purposes of illustration and description, and is not intended to be exhaustive or limited to the disclosure in the form disclosed. Many modifications and variations will be apparent to practitioners skilled in this art. The embodiment was chosen and described in order to best explain the principles of the disclosure and the practical application, and to enable others of ordinary skill in the art to understand the disclosure for various embodiments with various modifications as are suited to the particular use contemplated.

Claims (16)

1. A method of beamforming, comprising:
dividing the designated frequency range to obtain a predetermined number of frequency bands;
applying the same response constraint to the center frequency of each frequency band in the time domain to obtain a corresponding time domain coefficient;
converting the time domain coefficients from the time domain to the frequency domain to obtain corresponding frequency domain coefficients;
band-pass filtering the frequency domain coefficients so as to retain only the coefficients associated with the corresponding center frequency points, thereby obtaining corresponding filter coefficients;
synthesizing the filter coefficients to obtain corresponding beamforming coefficients;
and performing beamforming processing using the beamforming coefficients.
2. The method of claim 1, wherein converting the time domain coefficients from the time domain to the frequency domain comprises:
expanding, by zero padding, the time domain coefficient associated with the mth microphone to obtain an expansion coefficient of the mth microphone, wherein the expansion coefficient of the mth microphone is an N-dimensional vector, N is the number of frequency bands, 1 ≤ m ≤ M, and M is the total number of microphones;
and converting the expansion coefficient of the mth microphone from the time domain to the frequency domain to obtain the frequency domain coefficient of the mth microphone at the central frequency point of the corresponding frequency band.
3. The method of claim 2, wherein band-pass filtering the frequency domain coefficients comprises:
for the mth microphone, setting the frequency domain coefficients corresponding to all frequency points other than the nth frequency point to zero to obtain the filter coefficient of the mth microphone at the nth frequency point, wherein 1 ≤ n ≤ N.
4. The method of claim 3, wherein synthesizing the filter coefficients comprises:
synthesizing the filter coefficients of the mth microphone at each frequency point to obtain a synthesis coefficient of the mth microphone;
and synthesizing the synthesis coefficients of the microphones to obtain corresponding beam forming coefficients.
5. The method of any one of claims 1-4, wherein,
the response constraints include: the response is a first specified value in a main lobe region of the acoustic incident angle θ, and the response is a second specified value in a side lobe region of the acoustic incident angle θ, wherein the first specified value is larger than the second specified value.
6. The method of claim 5, wherein,
the first specified value is 1 and the second specified value is 0.
7. The method of claim 5, wherein,
the response constraints imposed in the time domain for the center frequency of each frequency band are associated with the center frequency of the respective frequency band, the acoustic incident angle θ, and the corresponding time domain coefficients.
8. A beamforming apparatus, comprising:
a frequency band dividing module configured to divide a specified frequency range to obtain a predetermined number of frequency bands;
a time domain coefficient processing module configured to apply the same response constraint in the time domain to the center frequency of each frequency band to obtain a corresponding time domain coefficient;
a conversion module configured to convert the time domain coefficients from the time domain to the frequency domain to obtain corresponding frequency domain coefficients;
a filtering module configured to band-pass filter the frequency domain coefficients, so as to retain only coefficients associated with corresponding center frequency points, thereby obtaining corresponding filter coefficients;
a synthesis module configured to synthesize the filter coefficients to obtain corresponding beamforming coefficients;
and a beamforming processing module configured to perform beamforming processing using the beamforming coefficients.
9. The apparatus of claim 8, wherein,
the conversion module is configured to expand, by zero padding, the time domain coefficient associated with the mth microphone to obtain an expansion coefficient of the mth microphone, wherein the expansion coefficient of the mth microphone is an N-dimensional vector, N is the number of frequency bands, 1 ≤ m ≤ M, and M is the total number of microphones; and to convert the expansion coefficient of the mth microphone from the time domain to the frequency domain to obtain the frequency domain coefficient of the mth microphone at the central frequency point of the corresponding frequency band.
10. The apparatus of claim 9, wherein,
the filtering module is configured to set, for the mth microphone, the frequency domain coefficients corresponding to all frequency points other than the nth frequency point to zero to obtain a filter coefficient of the mth microphone at the nth frequency point, wherein 1 ≤ n ≤ N.
11. The apparatus of claim 10, wherein,
the synthesis module is configured to synthesize the filter coefficients of the mth microphone at each frequency point to obtain a synthesis coefficient of the mth microphone, and synthesize the synthesis coefficients of the microphones to obtain a corresponding beamforming coefficient.
12. The apparatus of any one of claims 8-11, wherein,
the response constraints include: the response is a first specified value in a main lobe region of the acoustic incident angle θ, and the response is a second specified value in a side lobe region of the acoustic incident angle θ, wherein the first specified value is larger than the second specified value.
13. The apparatus of claim 12, wherein,
the first specified value is 1 and the second specified value is 0.
14. The apparatus of claim 12, wherein,
the response constraints imposed in the time domain for the center frequency of each frequency band are associated with the center frequency of the respective frequency band, the acoustic incident angle θ, and the corresponding time domain coefficients.
15. A beamforming apparatus, comprising:
a memory configured to store instructions;
a processor coupled to the memory, the processor configured to perform the method of any one of claims 1-7 based on instructions stored in the memory.
16. A computer-readable storage medium, wherein the computer-readable storage medium stores computer instructions which, when executed by a processor, implement the method of any one of claims 1-7.
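The response constraint recited in claims 5-7 (a first specified value in the main lobe region and a second specified value in the side lobe region of the incident angle θ, evaluated at the center frequency of a band) can be illustrated with the following sketch. It does not reproduce the constraint formula of the disclosure. It assumes a far-field plane wave, a uniform linear array, and a standard FIR filter-and-sum response; the function names, the microphone spacing, the sampling rate, the speed of sound, and the convention that θ = π/2 is broadside are all hypothetical.

```python
import numpy as np


def beam_response(h, freq_hz, theta_rad, spacing_m=0.05, fs_hz=16000.0, c=343.0):
    """Response of FIR filter-and-sum coefficients at one frequency.

    h: real array of shape (n_mics, n_taps), time domain coefficients.
    theta_rad: incident angle(s) measured from the array axis (assumed convention).
    Returns a complex array with one response value per angle.
    """
    n_mics, n_taps = h.shape
    theta = np.atleast_1d(theta_rad)

    # Propagation delay to each microphone of a uniform linear array.
    mic_delay = np.arange(n_mics)[:, None] * spacing_m * np.cos(theta)[None, :] / c
    # Delay contributed by each filter tap.
    tap_delay = np.arange(n_taps) / fs_hz

    phase = -2j * np.pi * freq_hz * (
        mic_delay[:, None, :] + tap_delay[None, :, None]
    )
    # Sum over microphones (m) and taps (l) for every angle (a).
    return np.einsum("ml,mla->a", h, np.exp(phase))


def desired_response(theta_rad, main_lobe, first_value=1.0, second_value=0.0):
    """Target response: first_value inside the main lobe, second_value elsewhere."""
    theta = np.atleast_1d(theta_rad)
    low, high = main_lobe
    return np.where((theta >= low) & (theta <= high), first_value, second_value)
```

A design procedure would then choose the time domain coefficients so that `beam_response` at each band's center frequency stays close to `desired_response` over a grid of angles; how that optimization is carried out is outside the scope of this sketch.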
CN201811453561.1A 2018-11-30 2018-11-30 Beam forming method and device Pending CN111261178A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811453561.1A CN111261178A (en) 2018-11-30 2018-11-30 Beam forming method and device

Publications (1)

Publication Number Publication Date
CN111261178A (en) 2020-06-09

Family

ID=70950143

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811453561.1A Pending CN111261178A (en) 2018-11-30 2018-11-30 Beam forming method and device

Country Status (1)

Country Link
CN (1) CN111261178A (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101192411A (en) * 2007-12-27 2008-06-04 北京中星微电子有限公司 Large distance microphone array noise cancellation method and noise cancellation system
US20130287226A1 (en) * 2012-04-30 2013-10-31 Conexant System, Inc. Reduced-delay subband signal processing system and method
CN104768099A (en) * 2014-01-02 2015-07-08 中国科学院声学研究所 Modal beam former for circular array and frequency-domain broadband implementation method
CN106782590A (en) * 2016-12-14 2017-05-31 南京信息工程大学 Based on microphone array Beamforming Method under reverberant ambiance
US20170164100A1 (en) * 2014-08-22 2017-06-08 Fraunhofer-Gesellschaft Zur Foerderung Der Angewandten Forschung E.V. FIR Filter Coefficient Calculation for Beam-forming Filters

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
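Yan Shefeng et al.: "Time-domain constant-beamwidth beamforming for arbitrary sensor arrays based on second-order cone programming" (基于二阶锥规划的任意传感器阵列时域恒定束宽波束形成), Acta Acustica (Chinese edition), vol. 04, pages 309 - 316 *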

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112492452A (en) * 2020-11-26 2021-03-12 北京字节跳动网络技术有限公司 Beam coefficient storage method, device, equipment and storage medium
CN112492452B (en) * 2020-11-26 2022-08-26 北京字节跳动网络技术有限公司 Beam coefficient storage method, device, equipment and storage medium

Similar Documents

Publication Publication Date Title
CN105590631B (en) Signal processing method and device
US9113281B2 (en) Reconstruction of a recorded sound field
JP4066197B2 (en) Microphone device
CN109285557B (en) Directional pickup method and device and electronic equipment
WO2017002525A1 (en) Signal processing device, signal processing method, and signal processing program
CN105679304B (en) Variable bandwidth non-delay sub-band algorithm for broadband active noise control system
JPH073936B2 (en) Frequency domain block adaptive digital filter
JP2015228643A5 (en)
CN111261178A (en) Beam forming method and device
US11482239B2 (en) Joint source localization and separation method for acoustic sources
US9036752B2 (en) Low-delay filtering
AU2014329890A1 (en) Adaptive diffuse signal generation in an upmixer
JP6567216B2 (en) Signal processing device
JP6644356B2 (en) Sound source separation system, method and program
Xu et al. A study of the virtual microphone algorithm for ANC system working in audio interference environment
JP4948019B2 (en) Adaptive signal processing apparatus and adaptive signal processing method thereof
CN111785289B (en) Residual echo cancellation method and device
CN108702558B (en) Method and device for estimating direction of arrival and electronic equipment
CN104952455B (en) The method and apparatus for realizing reverberation
CN115588438B (en) WLS multi-channel speech dereverberation method based on bilinear decomposition
WO2022234822A1 (en) Signal processing device, signal processing method, and program
JP6080557B2 (en) Signal processing apparatus, method thereof and program thereof
CN112584300B (en) Audio upmixing method, device, electronic equipment and storage medium
Shekarchi et al. Compression of head-related transfer function using autoregressive-moving-average models and Legendre polynomials
Yang et al. Design of 2-D recursive digital filters using nonsymmetric half-plane allpass filters

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination