WO2015039439A1 - Audio signal processing method and apparatus, and differential beamforming method and apparatus (音频信号处理方法及装置、差分波束形成方法及装置) - Google Patents


Info

Publication number
WO2015039439A1
WO2015039439A1 (PCT/CN2014/076127)
Authority
WO
WIPO (PCT)
Prior art keywords
signal
super
weight coefficient
audio
directional differential
Prior art date
Application number
PCT/CN2014/076127
Other languages
English (en)
French (fr)
Inventor
李海婷 (Li Haiting)
张德明 (Zhang Deming)
Original Assignee
华为技术有限公司 (Huawei Technologies Co., Ltd.)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 华为技术有限公司 (Huawei Technologies Co., Ltd.)
Publication of WO2015039439A1
Priority to US15/049,515 (US9641929B2)

Classifications

    • H04R 1/406: arrangements for obtaining a desired directional characteristic only, by combining a number of identical transducers (microphones)
    • G10L 21/02: speech enhancement, e.g. noise reduction or echo cancellation
    • G10L 21/0364: speech enhancement by changing the amplitude, for improving intelligibility
    • G10L 2021/02082: noise filtering, the noise being echo or reverberation of the speech
    • G10L 2021/02161: number of inputs available containing the signal or the noise to be suppressed
    • G10L 2021/02166: microphone arrays; beamforming
    • H04R 2201/025: transducer mountings or cabinet supports enabling variable orientation of transducer or cabinet
    • H04R 2201/405: non-uniform arrays of transducers, or a plurality of uniform arrays with different transducer spacing
    • H04R 2430/21: direction finding using a differential microphone array (DMA)

Definitions

  • the present invention relates to the field of audio technologies, and in particular, to an audio signal processing method and apparatus, and a differential beamforming method and apparatus.

Background
  • the application of audio signal collection using a microphone array is increasingly extensive; for example, it can be applied to application scenarios such as high-definition calls, audio and video conferencing, voice interaction, and spatial sound field recording, and it will gradually be applied to a wider range of scenarios such as in-vehicle systems, home media systems, and video conferencing systems.
  • in the prior art, a microphone array performs audio signal acquisition, and the audio signal collected by the microphone array is processed to output a mono signal; that is, an audio signal processing system with mono output can only obtain a mono signal and cannot be applied to scenarios that require a two-channel signal, such as spatial sound field recording.
  • an apparatus for processing an audio signal includes a weight coefficient storage module, a signal acquisition module, a beamforming processing module, and a signal output module, where:
  • the weight coefficient storage module is configured to store a super-directional differential beamforming weight coefficient
  • the signal acquisition module is configured to obtain an audio input signal and output the audio input signal to the beamforming processing module, and is further configured to determine the current application scenario and the output signal type required by the current application scenario, and to send the current application scenario and the required output signal type to the beamforming processing module;
  • the beamforming processing module is configured to acquire, from the weight coefficient storage module, a weight coefficient corresponding to the current application scenario according to the output signal type required by the current application scenario, perform super-directional differential beamforming processing on the audio input signal using the obtained weight coefficient to obtain a super-directional differential beamforming signal, and transmit the super-directional differential beamforming signal to the signal output module;
  • the signal output module is configured to output the super-directional differential beamforming signal.
  • the beam forming processing module is specifically configured to:
  • the signal output module is specifically configured to:
  • the left channel super-directional differential beamforming signal and the right channel super-directional differential beamforming signal are output.
  • the beamforming processing module is specifically configured to:
  • the signal output module is specifically configured to:
  • the mono super-directional differential beamforming signal is output.
  • the audio signal processing apparatus further includes a microphone array adjustment module, where:
  • the microphone array adjustment module is configured to adjust the microphone array into a first sub-array and a second sub-array, where the end-fire direction of the first sub-array is different from the end-fire direction of the second sub-array;
  • the first sub-array and the second sub-array respectively collect original audio signals, and transmit the original audio signals as audio input signals to the signal acquisition module.
  • the audio signal processing apparatus further includes a microphone array adjustment module, where:
  • the microphone array adjustment module is configured to adjust the end-fire direction of the microphone array so that the end-fire direction points to the target sound source;
  • the microphone array collects an original audio signal from the target sound source and transmits the original audio signal as an audio input signal to the signal acquisition module.
  • the audio signal processing apparatus further includes a weight coefficient update module, where:
  • the weight coefficient update module is specifically configured to:
  • if the audio collection area is adjusted, determine the geometry of the microphone array, the speaker position, and the adjusted effective audio collection area; adjust the beam shape according to the effective audio collection area, or according to the effective audio collection area and the speaker position, to obtain an adjusted beam shape;
  • the weight coefficient storage module is specifically configured to store the adjusted weight coefficient.
  • the audio signal processing apparatus further includes an echo cancellation module, where
  • the echo cancellation module is specifically configured to:
  • cache the speaker playback signal, perform echo cancellation on the original audio signal collected by the microphone array using the cached playback signal, obtain an echo-cancelled audio signal, and transmit the echo-cancelled audio signal to the signal acquisition module as the audio input signal;
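The echo cancellation module above caches the loudspeaker playback signal and subtracts its echo from the microphone signal. The patent does not fix a particular algorithm; as an illustrative sketch only, a standard NLMS adaptive filter can perform this step. The tap count, step size, and the toy one-tap echo path below are assumptions for the example:

```python
import numpy as np

def nlms_echo_cancel(mic, ref, taps=64, mu=0.5, eps=1e-8):
    """Estimate the echo of the cached playback signal `ref` present in the
    microphone signal `mic` with an NLMS adaptive filter, and subtract it."""
    w = np.zeros(taps)           # adaptive filter taps (echo-path estimate)
    buf = np.zeros(taps)         # most recent reference samples, newest first
    out = np.empty(len(mic))
    for n in range(len(mic)):
        buf = np.roll(buf, 1)
        buf[0] = ref[n]
        y = w @ buf              # predicted echo sample
        e = mic[n] - y           # echo-cancelled output sample
        w += mu * e * buf / (buf @ buf + eps)   # normalized LMS update
        out[n] = e
    return out

# demo: the microphone picks up only a delayed, attenuated copy of the playback
rng = np.random.default_rng(0)
ref = rng.standard_normal(4000)
mic = np.zeros(4000)
mic[3:] = 0.5 * ref[:-3]         # assumed echo path: 3-sample delay, gain 0.5
clean = nlms_echo_cancel(mic, ref)
```

After the filter converges, the residual in `clean` is far below the echo level, and `clean` would be passed on as the audio input signal.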
  • the signal output module is specifically configured to:
  • the echo-cancelled super-directional differential beamforming signal is output.
  • the audio signal processing apparatus further includes an echo suppression module and a noise suppression module, where
  • the echo suppression module is configured to perform echo suppression processing on the super-directional differential beamforming signal output by the beamforming processing module, or on the noise-suppressed super-directional differential beamforming signal output by the noise suppression module, to obtain an echo-suppressed super-directional differential beamforming signal, and to transmit the echo-suppressed super-directional differential beamforming signal to the signal output module;
  • the noise suppression module is configured to perform noise suppression processing on the super-directional differential beamforming signal output by the beamforming processing module, or on the echo-suppressed super-directional differential beamforming signal output by the echo suppression module, to obtain a noise-suppressed super-directional differential beamforming signal, and to transmit the noise-suppressed super-directional differential beamforming signal to the signal output module;
  • the signal output module is specifically configured to:
  • the echo-suppressed super-directional differential beamforming signal or the noise-suppressed super-directional differential beamforming signal is output.
  • the beamforming processing module is further configured to:
  • forming at least one beamforming signal as a reference noise signal in a direction other than the sound source direction among the adjustable end-fire directions of the microphone array, and transmitting the reference noise signal to the noise suppression module.
  • an audio signal processing method including:
  • acquiring, according to the output signal type required by the current application scenario, a weight coefficient corresponding to the current application scenario, performing super-directional differential beamforming processing on the audio input signal using the obtained weight coefficient to obtain a super-directional differential beamforming signal, and outputting the super-directional differential beamforming signal, specifically includes:
  • outputting the left channel super-directional differential beamforming signal and the right channel super-directional differential beamforming signal.
  • acquiring, according to the output signal type required by the current application scenario, a weight coefficient corresponding to the current application scenario, performing super-directional differential beamforming processing on the audio input signal using the obtained weight coefficient to obtain a super-directional differential beamforming signal, and outputting the super-directional differential beamforming signal, specifically includes:
  • when the output signal type required by the current application scenario is a mono signal, acquiring the mono super-directional differential beamforming weight coefficient of the current application scenario that forms a mono signal;
  • before acquiring the audio input signal, the method further includes:
  • adjusting the microphone array into a first sub-array and a second sub-array, where the end-fire direction of the first sub-array is different from the end-fire direction of the second sub-array;
  • collecting the original audio signal separately by the first sub-array and the second sub-array, and using the original audio signal as the audio input signal.
  • before acquiring the audio input signal, the method further includes:
  • the original audio signal of the target sound source is collected and the original audio signal is used as an audio input signal.
  • the method further includes:
  • if the audio collection area is adjusted, determining the geometry of the microphone array, the speaker position, and the adjusted effective audio collection area;
  • performing super-directional differential beamforming processing on the audio input signal using the adjusted weight coefficients.
  • the method further includes:
  • performing echo cancellation on the super-directional differential beamforming signal.
  • the method further includes:
  • performing echo suppression processing and/or noise suppression processing on the super-directional differential beamforming signal.
  • the method further includes:
  • forming at least one beamforming signal as a reference noise signal in a direction other than the sound source direction among the adjustable end-fire directions of the microphone array;
  • performing noise suppression processing on the super-directional differential beamforming signal using the reference noise signal.
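The reference-noise beam described above can drive a per-frequency-bin suppression gain on the main beam. The patent does not specify the gain rule; the following is a sketch of one common choice, a spectral-subtraction-style gain with a floor, where the floor value and the toy two-bin spectra are assumptions:

```python
import numpy as np

def suppress_noise(beam_fft, ref_noise_fft, floor=0.1):
    """Per-bin spectral gain: the reference-noise beam supplies the noise
    power estimate; `floor` bounds the attenuation to limit artifacts."""
    sig_pow = np.abs(beam_fft) ** 2
    noise_pow = np.abs(ref_noise_fft) ** 2
    gain = np.maximum(1.0 - noise_pow / (sig_pow + 1e-12), floor)
    return gain * beam_fft

# demo: bin 0 is signal-dominated (kept), bin 1 is pure noise (floored)
out = suppress_noise(np.array([2.0 + 0j, 1.0 + 0j]),
                     np.array([0.0 + 0j, 1.0 + 0j]))
```

The floor trades residual noise against "musical noise" artifacts; a smoothed noise estimate would normally replace the instantaneous one used here.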
  • a differential beamforming method including:
  • the process of determining a differential beamforming weight coefficient includes: determining D(ω, θ) and β according to the geometry of the microphone array and the set effective audio collection area, and computing the weight coefficient via the minimum-norm solution h(ω) = D^H(ω, θ) [D(ω, θ) D^H(ω, θ)]^(-1) β, where:
  • h(ω) is the weight coefficient;
  • D(ω, θ) is the steering matrix corresponding to a microphone array of any geometric shape, determined by the relative delays with which a sound source at different incident angles reaches the microphones in the array;
  • D^H(ω, θ) represents the conjugate transpose of D(ω, θ);
  • ω is the frequency of the audio signal;
  • θ is the incident angle of the sound source;
  • β is the response vector when the incident angle is θ.
  • determining D(ω, θ) and β according to the geometry of the microphone array and the set effective audio collection area specifically includes: converting, according to the output signal type required by different application scenarios, the set effective audio collection area into a pole direction and zero directions;
  • determining D(ω, θ) and β for the different application scenarios according to the converted pole direction and zero directions; the pole direction is an incident angle at which the super-directional differential beam has a response value of 1, and a zero direction is an incident angle at which the super-directional differential beam has a response value of 0.
  • determining D(ω, θ) and β according to the geometry of the microphone array, the set effective audio area, and the speaker position specifically includes: converting, according to the output signal type required by different application scenarios, the set effective audio area into a pole direction and zero directions, and converting the speaker position into a zero direction;
  • determining D(ω, θ) and β for the different application scenarios according to the converted pole direction and zero directions; the pole direction is an incident angle at which the super-directional differential beam has a response value of 1, and a zero direction is an incident angle at which the super-directional differential beam has a response value of 0.
  • converting the set effective audio area into a pole direction and zero directions according to the output signal type required by different application scenarios specifically includes:
  • when the output signal type required by the application scenario is a mono signal, setting the end-fire direction of the microphone array as the pole direction and setting M zero directions, where M ≤ N − 1 and N is the number of microphones in the microphone array;
  • when the output signal type required by the application scenario is a two-channel signal,
  • setting the 0-degree direction of the microphone array as the pole direction and the 180-degree direction as the zero direction, to determine the super-directional differential beamforming weight coefficient corresponding to one of the channels.
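The pole/zero conversion above can be sketched as a small mapping from the required output signal type to constraint angles and the response vector β. This is an illustration only: `scene_constraints` is a hypothetical helper name, and the particular null placements chosen for the mono case are assumptions (the claim only requires M ≤ N − 1 zero directions):

```python
import numpy as np

def scene_constraints(output_type, n_mics):
    """Map a required output signal type to constraint angles (radians) and
    the response vector beta: the pole direction gets response 1, the zero
    directions get response 0."""
    if output_type == 'mono':
        m = n_mics - 1                            # M <= N - 1 zero directions
        nulls = np.linspace(np.pi / 2, np.pi, m)  # assumed null placements
        angles = np.concatenate(([0.0], nulls))   # end-fire pole at 0 degrees
        beta = np.concatenate(([1.0], np.zeros(m)))
    elif output_type in ('left', 'right'):
        # one channel of a two-channel scene: pole at 0 deg, zero at 180 deg
        angles = np.array([0.0, np.pi])
        beta = np.array([1.0, 0.0])
    else:
        raise ValueError(f'unknown output type: {output_type}')
    return angles, beta

angles, beta = scene_constraints('mono', 4)
```

The returned pair (angles, β) is exactly what the weight-coefficient formula consumes as constraint directions and desired responses.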
  • a fourth aspect provides a differential beamforming apparatus, including a weight coefficient determining unit and a beamforming processing unit, where:
  • the weight coefficient determining unit is configured to determine a differential beamforming weight coefficient according to the geometry of the microphone array and the set effective audio collection area, and to transmit the determined weight coefficient to the beamforming processing unit; or to determine a differential beamforming weight coefficient according to the geometry of the microphone array, the set effective audio collection area, and the speaker position, and to transmit the determined weight coefficient to the beamforming processing unit;
  • the beamforming processing unit acquires, from the weight coefficient determining unit, a weight coefficient corresponding to the current application scenario according to the output signal type required by the current application scenario, and performs differential beamforming processing on the audio input signal using the obtained weight coefficient.
  • the weight coefficient determining unit is specifically configured to:
  • determine the weight coefficient as h(ω) = D^H(ω, θ) [D(ω, θ) D^H(ω, θ)]^(-1) β, where h(ω) is the weight coefficient;
  • D(ω, θ) is the steering matrix corresponding to a microphone array of any geometric shape, determined by the relative delays with which a sound source at different incident angles reaches the microphones in the array;
  • D^H(ω, θ) represents the conjugate transpose of D(ω, θ), ω is the frequency of the audio signal, θ is the incident angle of the sound source, and β is the response vector when the incident angle is θ.
  • the weight coefficient determining unit is specifically configured to:
  • convert the set effective audio area into a pole direction and zero directions according to the output signal type required by different application scenarios, and determine D(ω, θ) and β for the different application scenarios according to the obtained pole direction and zero directions; or convert, according to the output signal type required by different application scenarios, the set effective audio area into a pole direction and zero directions and the speaker position into a zero direction, and determine D(ω, θ) and β for the different application scenarios according to the obtained pole direction and zero directions;
  • where the pole direction is an incident angle at which the super-directional differential beam has a response value of 1, and a zero direction is an incident angle at which the super-directional differential beam has a response value of 0.
  • the weight coefficient determining unit is specifically configured to:
  • when the output signal type required by the application scenario is a two-channel signal,
  • set the 0-degree direction of the microphone array as the pole direction and the 180-degree direction as the zero direction, to determine the super-directional differential beamforming weight coefficient corresponding to one of the channels.
  • the beamforming processing module acquires, from the weight coefficient storage module, the weight coefficient corresponding to the current application scenario according to the output signal type required by the current application scenario, and performs super-directional differential beamforming processing on the acquired audio input signal using the obtained weight coefficient, to form the super-directional differential beam of the current application scenario.
  • the super-directional differential beamforming signal can then be processed correspondingly to obtain the final desired audio output signal, which can meet the requirements of different application scenarios for different audio signal processing modes.
  • FIG. 1 is a flowchart of an audio signal processing method according to an embodiment of the present invention;
  • FIGS. 2A-2F are schematic diagrams showing arrangements of a linear microphone array according to an embodiment of the present invention;
  • FIGS. 3A-3C are schematic diagrams of a microphone array according to an embodiment of the present invention;
  • FIGS. 4A-4B are schematic diagrams showing the relationship between the end-fire direction of the microphone array and the angle of the speaker according to an embodiment of the present invention;
  • FIG. 5 is a schematic diagram of the angles of a microphone array for forming two audio signals according to an embodiment of the present invention;
  • FIG. 6 is a schematic diagram of a microphone array split into two sub-arrays according to an embodiment of the present invention;
  • FIG. 7 is a schematic diagram of human-computer interaction and high-definition voice according to an embodiment of the present invention;
  • FIG. 8 is a flowchart of an audio signal processing method in a spatial sound field recording process according to an embodiment of the present invention;
  • FIG. 9 is a flowchart of an audio signal processing method during a stereo call according to an embodiment of the present invention;
  • FIG. 10A is a flowchart of an audio signal processing method in a spatial sound field recording process;
  • FIG. 10B is a flowchart of an audio signal processing method during a stereo call;
  • FIGS. 11A-11E are schematic diagrams showing the structure of an audio signal processing apparatus according to an embodiment of the present invention;
  • FIG. 12 is a schematic diagram of a differential beamforming process according to an embodiment of the present invention;
  • FIG. 13 is a schematic structural diagram of a differential beamforming apparatus according to an embodiment of the present disclosure;
  • FIG. 14 is a schematic structural diagram of a controller according to an embodiment of the present invention.

Detailed Description
  • An embodiment of the present invention provides an audio signal processing method. As shown in FIG. 1, the method includes: S101: Determine a super-directional differential beamforming weight coefficient.
  • the application scenarios involved in the embodiments of the present invention may include multiple application scenarios, such as a high-definition call, an audio and video conference, a voice interaction, and a spatial sound field recording, and different audio signal processing modes may be determined according to different application scenarios.
  • in the embodiment of the present invention, the super-directional differential beamforming weight coefficient is constructed according to the geometry of the microphone array and a preset beam shape.
  • S102 Acquire an audio input signal required by the current application scenario, and determine a current application scenario and an output signal type required by the current application scenario.
  • whether the original audio signal collected by the microphone array needs echo cancellation may be determined according to the current application scenario, in order to determine the corresponding audio input signal; the audio input signal may be the echo-cancelled version of the original audio signal collected by the microphone array, determined according to the current application scenario, or the original audio signal collected by the microphone array itself.
  • the output signal types required by different application scenarios are different. For example, in the application scenario of human-computer interaction and high-definition voice communication, a mono signal is required. In the space sound field recording and stereo call application scenarios, a two-channel signal is required. In the embodiment of the invention, according to the determined current application scenario, the output signal type required by the current application scenario is determined.
  • S103 Acquire, according to the output signal type required by the current application scenario, the weight coefficient corresponding to the current application scenario.
  • when the output signal type required by the current application scenario is a two-channel signal, the left channel super-directional differential beamforming weight coefficient corresponding to the current application scenario and the right channel super-directional differential beamforming weight coefficient corresponding to the current application scenario are obtained; when the output signal type required by the current application scenario is a mono signal, the mono super-directional differential beamforming weight coefficient of the current application scenario that forms the mono signal is obtained.
  • S104 Perform super-directional differential beamforming processing on the audio input signal obtained in S102 by using the weight coefficient obtained in S103, to obtain a super-directional differential beamforming signal.
  • when the output signal type required by the current application scenario is a two-channel signal, the left channel super-directional differential beamforming weight coefficient and the right channel super-directional differential beamforming weight coefficient corresponding to the current application scenario are obtained; super-directional differential beamforming processing is performed on the audio input signal according to the left channel weight coefficient, to obtain the left channel super-directional differential beamforming signal corresponding to the current application scenario, and according to the right channel weight coefficient, to obtain the right channel super-directional differential beamforming signal corresponding to the current application scenario.
  • when the output signal type required by the current application scenario is a mono signal, the mono super-directional differential beamforming weight coefficient corresponding to the current application scenario is obtained, and super-directional differential beamforming processing is performed on the audio input signal according to the acquired weight coefficient, to form a mono super-directional differential beamforming signal.
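Steps S103 to S104 amount to applying one weight vector per frequency bin to the multichannel spectrum, once per output channel. A minimal sketch in the STFT domain, assuming precomputed per-bin weights (the array sizes and the all-ones demo frame are illustrative, not from the patent):

```python
import numpy as np

def apply_weights(frame_fft, weights):
    """Apply the per-bin weight vector h(omega_k) to one multichannel STFT
    frame: output bin k is h_k^H x_k (a sum over the microphones).
    frame_fft, weights: (n_bins, n_mics) complex arrays."""
    return np.einsum('km,km->k', np.conj(weights), frame_fft)

def stereo_frame(frame_fft, w_left, w_right):
    """Form the left- and right-channel super-directional beam outputs."""
    return apply_weights(frame_fft, w_left), apply_weights(frame_fft, w_right)

# demo with a hypothetical 3-microphone, 4-bin frame of all ones
frame = np.ones((4, 3), dtype=complex)
w = np.full((4, 3), 1.0 / 3.0, dtype=complex)   # trivial averaging weights
left, right = stereo_frame(frame, w, w)
```

For the mono case, a single `apply_weights` call with the mono weight matrix produces the one-channel output instead.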
  • S105 Output the super-directional differential beamforming signal obtained in S104.
  • after the super-directional differential beamforming signal obtained in S104 is output, it can be further processed to obtain the final audio signal required by the current application scenario; the super-directional differential beamforming signal can be processed according to the signal processing method required by the current application scenario,
  • for example, noise suppression processing or echo suppression processing, to finally obtain the audio signal required in the current application scenario.
  • the embodiment of the present invention predetermines the super-directional differential beamforming weight coefficients in different application scenarios; when the audio signals of different application scenarios need to be processed, the weight coefficient determined for the current application scenario and the audio input signal of the current application scenario can be used to form a super-directional differential beam in the current application scenario, and the super-directional differential beam can then be processed correspondingly to obtain the final desired audio signal, which meets the requirements of different application scenarios for different audio signal processing methods.
  • each of the super-directional differential beamforming weight coefficients corresponding to different output signal types in different application scenarios may be determined according to the geometry of the microphone array and the set beam shape, wherein the beam shape is determined according to the beam-shape requirements of different application scenarios for different output signal types, or according to those beam-shape requirements together with the speaker position.
  • when determining the super-directional differential beamforming weight coefficient, it is necessary to construct a microphone array for collecting audio signals and, according to the geometry of the microphone array, determine the relative delays with which a sound source at different incident angles reaches the microphones in the array.
  • each super-directional differential beamforming weight coefficient corresponding to different output signal types in different application scenarios can be determined according to the following formula: h(ω) = D^H(ω, θ) [D(ω, θ) D^H(ω, θ)]^(-1) β, where:
  • h(ω) is the weight coefficient;
  • D(ω, θ) is the steering matrix corresponding to a microphone array of any geometric shape, determined by the relative delays with which a sound source at different incident angles reaches the microphones in the array;
  • D^H(ω, θ) represents the conjugate transpose of D(ω, θ);
  • ω is the frequency of the audio signal;
  • θ is the incident angle of the sound source;
  • β is the response vector when the incident angle is θ.
• the frequency ω is generally discretized, that is, a number of frequency points are sampled discretely in the effective frequency band of the signal, and the corresponding weight coefficients h(ω_k) are obtained for the different frequencies respectively, composing the coefficient matrix; the value range of k is related to the number of effective frequency points in super-directional differential beamforming.
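As an illustrative sketch rather than the patented implementation, the weight-coefficient formula above can be evaluated numerically. The Python/NumPy fragment below assumes a uniform linear array with a hypothetical spacing `d = 0.05` m (the function names, spacing, and frequency grid are ours, not the patent's) and solves h(ω) = D^H(ω,θ)[D(ω,θ)D^H(ω,θ)]^{-1}β at a few discretized frequency points:

```python
import numpy as np

def steering_matrix_linear(omega, angles_rad, n_mics, d, c=343.0):
    """Steering matrix D(omega, theta) for a uniform linear array:
    row i is the steering vector at set incident angle theta_i."""
    n = np.arange(n_mics)                                 # microphone indices 0..N-1
    delays = d * np.cos(angles_rad)[:, None] * n / c      # relative delays, M x N
    return np.exp(-1j * omega * delays)

def beamforming_weights(D, beta):
    """h(omega) = D^H (D D^H)^{-1} beta: minimum-norm solution of D h = beta."""
    return D.conj().T @ np.linalg.solve(D @ D.conj().T, beta)

# discretize omega over the effective band and solve per frequency point
freqs = np.linspace(100.0, 4000.0, 8)                     # example frequency points (Hz)
angles = np.deg2rad([0.0, 180.0])                         # pole at 0 deg, zero at 180 deg
beta = np.array([1.0, 0.0])                               # response vector
H = np.column_stack([
    beamforming_weights(
        steering_matrix_linear(2 * np.pi * f, angles, n_mics=4, d=0.05), beta)
    for f in freqs
])                                                        # coefficient matrix, N x K
```

Each column of `H` is one group of weight coefficients h(ω_k); by construction every column satisfies the constraint D(ω_k,θ)h(ω_k) = β exactly.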
• the geometry of the microphone array constructed in the embodiment of the present invention can be flexibly set, and the geometry of the specifically constructed microphone array is not limited, as long as the relative delays with which the sound source reaches the microphones in the microphone array at different incident angles can determine D(ω,θ); the weight coefficient is then determined from the set beam shape by the above formula.
• different weight coefficients need to be determined according to the different output signal types required by different application scenarios.
• when the output signal required for the application scenario is a two-channel signal, the left channel super-directional differential beamforming weight coefficient and the right channel super-directional differential beamforming weight coefficient are determined according to the above formula; when the output signal required is a mono signal, the mono super-directional differential beamforming weight coefficient forming the mono signal needs to be determined according to the above formula.
• the method further includes: determining whether the audio collection area is adjusted; if the audio collection area is adjusted, determining the geometry of the microphone array, the speaker position, and the adjusted audio collection effective area; adjusting the beam shape according to the adjusted audio collection effective area, or according to the adjusted audio collection effective area and the speaker position, to obtain an adjusted beam shape; and then determining the super-directional differential beamforming weight coefficient according to the geometry of the microphone array and the adjusted beam shape, obtaining an adjustment weight coefficient to perform super-directional differential beamforming processing on the audio input signal by using the adjustment weight coefficient.
• different D(ω,θ) can be obtained according to the different geometric shapes of the constructed microphone array, which will be exemplified below.
• a linear array including a plurality of microphones can be constructed, as set in the embodiment of the present invention; the microphones and the speakers can be arranged in a plurality of different manners in the linear microphone array; to realize adjustment of the end-fire direction of the microphones, the microphones are disposed on a rotatable platform, as shown in FIG. 2A: the speakers are placed on both sides, the portion between the two speakers is divided into two layers, the upper layer is rotatable, and N microphones are placed thereon, N being a positive integer greater than or equal to 2; the N microphones can be arranged linearly at equal distances, or linearly at non-equal distances.
• FIG. 2A and FIG. 2B are schematic views showing the first arrangement of microphones and speakers, in which the openings of the microphones face upward, wherein FIG. 2A is a plan view of the microphones and the speakers, and FIG. 2B is a front view of the microphones and the speakers.
• FIG. 2C and FIG. 2D are top and front views of another microphone and speaker deployment according to the present invention, which differ from FIG. 2A and FIG. 2B in that the openings of the microphones face directly in front.
• FIG. 2E and FIG. 2F are top and front views of a third type of microphone and speaker deployment proposed by the present invention, which differs from the first two cases in that the openings of the microphones are on the side line of the upper portion.
  • the microphone array may be a microphone array of other geometric shapes than a linear array, such as a circular array, a triangular array, a rectangular array or other polygonal array.
• the arrangement positions of the microphones and the speakers in the embodiment of the present invention are not limited to the above cases, which are merely examples.
• there are different ways of determining D(ω,θ); for example, when the microphone array is a linear array including N microphones, as shown in FIG. 3A, the relative delay of the nth microphone for incident angle θ_i is determined by the distance of that microphone from the first microphone along the array axis, and the steering matrix is composed of the steering vectors for the M set incident angles;
• θ_i is the ith set sound source incident angle;
• the superscript T indicates transposition;
• c is the sound velocity;
• ω is the frequency of the audio signal;
• N is the number of microphones in the microphone array;
• M is the number of set sound source incident angles, M ≤ N;
• β_i, i = 1, 2, ..., M, is the response value corresponding to the ith set sound source incident angle.
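To make the pole/zero behaviour of such a linear-array beam concrete, the hedged Python sketch below (uniform linear array, assumed 5 cm spacing and 1 kHz frequency, values not taken from the patent) constrains the response to 1 at the 0-degree pole and 0 at the 180-degree zero, then scans the resulting beampattern magnitude over incident angles:

```python
import numpy as np

c, d, N = 343.0, 0.05, 4                  # assumed sound speed, spacing, mic count
omega = 2 * np.pi * 1000.0                # one discretized frequency point

def sv(theta):
    """Linear-array steering vector at incident angle theta (radians)."""
    return np.exp(-1j * omega * d * np.cos(theta) * np.arange(N) / c)

# constraints: pole at 0 degrees (response 1), zero at 180 degrees (response 0)
D = np.array([sv(0.0), sv(np.pi)])
beta = np.array([1.0, 0.0])
h = D.conj().T @ np.linalg.solve(D @ D.conj().T, beta)   # minimum-norm weights

grid = np.deg2rad(np.arange(0, 181, 5))                  # scan 0..180 degrees
pattern = np.abs(np.array([sv(t) @ h for t in grid]))    # beampattern magnitude
```

The pattern equals 1 exactly in the pole direction and vanishes in the zero direction, with attenuated response at intermediate angles.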
• when the microphone array is a uniform circular array including N microphones, as shown in FIG. 3B, let r be the radius of the uniform circular array, θ be the incident angle of the sound source, and ρ be the distance between the sound source and the center position of the microphone array; the sampling frequency of the microphone array acquisition signal is f_s, and c is the speed of sound.
• the projection of the sound source position S on the plane of the uniform circular array is S', and the angle between S' and the first microphone is called the horizontal angle and is recorded as φ; the horizontal angle of the nth microphone is then φ_n;
• θ_i is the ith set sound source incident angle;
• ρ is the distance between the sound source and the center position of the microphone array, φ is the angle between the projection of the set sound source position on the plane of the uniform circular array and the first microphone;
• c is the speed of sound;
• ω is the frequency of the audio signal;
• the superscript T indicates transposition;
• N is the number of microphones in the microphone array;
• M is the number of set sound source incident angles;
• β_i, i = 1, 2, ..., M, is the response value corresponding to the ith set sound source incident angle.
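A minimal sketch of the circular-array geometry, under a simplifying far-field, in-plane assumption that is ours rather than the text's (the text describes a near-field source at distance ρ): the relative delay of the nth microphone with respect to the array center is taken as -(r/c)·cos(φ - φ_n), with the microphones uniformly spaced at horizontal angles φ_n = 2πn/N:

```python
import numpy as np

def circular_steering_vector(omega, phi, n_mics, r, c=343.0):
    """Far-field, in-plane simplification of the uniform circular array:
    delay of mic n relative to the array center is -(r/c)*cos(phi - phi_n),
    where phi_n = 2*pi*n/N is the horizontal angle of the nth microphone."""
    phi_n = 2 * np.pi * np.arange(n_mics) / n_mics
    tau = -(r / c) * np.cos(phi - phi_n)
    return np.exp(-1j * omega * tau)

# example: 8 microphones on an assumed 4 cm radius, source at horizontal angle 0.3 rad
v = circular_steering_vector(2 * np.pi * 1000.0, 0.3, 8, 0.04)
```

By symmetry, rotating the source by one inter-microphone angle simply permutes the steering vector entries, which is a quick sanity check on the geometry.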
• when the microphone array is a uniform rectangular array including N microphones, as shown in FIG. 3C, taking the geometric center of the rectangular array as the origin, it is assumed that the coordinates of the nth microphone of the microphone array are (x_n, y_n), the set sound source incident angle is θ_i, and the distance between the sound source and the center of the microphone array is ρ;
• x_n is the abscissa of the nth microphone in the microphone array, and y_n is the ordinate of the nth microphone in the microphone array;
• θ_i is the ith set sound source incident angle, ρ is the distance between the sound source and the center position of the microphone array;
• ω is the frequency of the audio signal;
• c is the speed of sound;
• N is the number of microphones in the microphone array;
• M is the number of set sound source incident angles.
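Analogously, a hedged far-field, in-plane sketch for the rectangular array uses the microphone coordinates (x_n, y_n) about the geometric center; the delay -(x_n·cosφ + y_n·sinφ)/c is an illustrative simplification of the near-field geometry described above (coordinates and spacing below are example values, not the patent's):

```python
import numpy as np

def rect_steering_vector(omega, phi, coords, c=343.0):
    """coords: (N, 2) microphone coordinates (x_n, y_n) about the geometric
    center. Far-field, in-plane simplification of the relative delays:
    tau_n = -(x_n*cos(phi) + y_n*sin(phi)) / c."""
    tau = -(coords[:, 0] * np.cos(phi) + coords[:, 1] * np.sin(phi)) / c
    return np.exp(-1j * omega * tau)

# example: 2x2 uniform rectangular array, 8 cm side, centered at the origin
coords = 0.04 * np.array([[-1, -1], [1, -1], [1, 1], [-1, 1]], dtype=float)
v = rect_steering_vector(2 * np.pi * 500.0, 0.0, coords)
```

For a source along the x-axis, microphones mirrored about the y-axis see delays of equal magnitude and opposite sign, so their steering-vector entries are complex conjugates, which again checks the geometry.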
• determining the differential beamforming weight coefficient in the embodiment of the present invention is divided into determining with the speaker position considered and determining without the speaker position considered; when the speaker position is not considered, D(ω,θ) and β are determined according to the geometry of the microphone array and the set audio collection effective area; when the speaker position is considered, they can be determined according to the geometry of the microphone array, the set audio collection effective area and the speaker position.
• without the speaker position considered, the set audio collection effective area is converted into the pole direction and the zero point direction according to the output signal type required by the different application scenarios; D(ω,θ) and β in the different application scenarios are then determined according to the converted pole direction and zero point direction; wherein the pole direction is the incident angle at which the response of the super-directional differential beam in this direction is 1, and the zero point direction is the incident angle at which the response of the super-directional differential beam in this direction is 0.
• with the speaker position considered, when determining D(ω,θ) and β, the set audio collection effective area is converted into the pole direction and the zero point direction according to the output signal type required by the different application scenarios, and the speaker position is converted into a zero point direction; D(ω,θ) and β in the different application scenarios are then determined according to the converted pole direction and zero point directions; wherein the pole direction is the incident angle at which the response of the super-directional differential beam in this direction is 1, and the zero point direction is the incident angle at which the response of the super-directional differential beam in this direction is 0.
• converting the set audio collection effective area into the pole direction and the zero point direction according to the output signal type required by different application scenarios specifically includes: when the required output signal type of the application scenario is a two-channel signal, setting the 0-degree direction of the microphone array to the pole direction and setting the 180-degree direction of the microphone array to the zero point direction, to determine the super-directional differential beamforming weight coefficient corresponding to one of the channels; in the beam response vector, the response at the pole direction is set to 1 and the responses at the zero point directions are set to 0, and the angle of each zero point can also be set differently.
• a linear array in which the microphone array consists of N microphones is taken as an example for description: the number of zero points for beam formation is L, and the angle of each zero point is θ_l, l = 1, ..., L, L ≤ N-1; any angle can be taken, and since the cosine function has symmetry, generally only angles between (0, 180) are taken.
• the end-fire direction of the microphone array can be adjusted such that the end-fire direction is toward a set direction, for example toward the sound source direction, and the adjustment method may be manual.
• FIG. 3A is a schematic diagram of the adjusted direction of the microphone array.
• the end-fire direction of the microphone array, that is, the 0-degree direction, is taken as the pole direction with response value 1, and the steering matrix D(ω,θ) becomes:
• the end-fire direction can also be set to the pole direction with response value 1, the first zero point being θ_1, and the remaining zero point positions being determined based on the set zero point spacing.
• that is, each subsequent zero point is offset from the previous one by the set zero point spacing; the maximum value of L is N-2.
• the angle of the speaker in the zero point direction may be preset, and in the embodiment of the present invention, the speaker may be internal to the device or a peripheral speaker may be adopted; FIG. 4A is a schematic diagram of the relationship between the end-fire direction of the microphone array and the angle of the speakers when the internal speakers of the device are used in the embodiment of the present invention; assuming the counterclockwise rotation angle of the microphone array is recorded, after rotation the angles of the speakers relative to the microphone array change from 0 degrees and 180 degrees to the rotation angle and 180 degrees minus the rotation angle; these two angles are the default zero points, with response value 0.
• since these two angles are set as zero points, the number of angle values that can be set when setting the zero points is reduced by two, at which point the steering matrix D(ω,θ) becomes:
• M is a positive integer.
• FIG. 4B is a schematic diagram showing the relationship between the end-fire direction of the microphone array and the angles of the speakers when external speakers of the device are used in the embodiment of the present invention; the angle between the left speaker and the original position of the microphone array and the angle between the right speaker and the original position of the microphone array are set; after the microphone array is rotated counterclockwise, the angle of the left speaker relative to the microphone array is increased by the rotation angle, and the angle of the right speaker relative to the microphone array is decreased by the rotation angle; these two angles are the default zero points, with response value 0.
• these two angles can be set as zero points, so the number of settable angle values is reduced by two, and the steering matrix D(ω,θ) becomes:
• M is a positive integer.
• the above process of determining the weight coefficient is applicable to the case where the required output signal type of the application scenario is a mono signal, forming a mono super-directional differential beamforming weight coefficient; when the output signal type required by the application scenario is a two-channel signal, for determining the super-directional differential beamforming weight coefficient of the left channel corresponding to the current application scenario and the super-directional differential beamforming weight coefficient of the right channel corresponding to the current application scenario, the steering matrix D(ω,θ) can be determined as follows:
• FIG. 5 is a schematic diagram of the angles of a microphone array for forming a two-channel audio signal according to an embodiment of the present invention.
• for one channel, the 0-degree direction is taken as the pole direction with response value 1, and the 180-degree direction as the zero point direction with response value 0; the steering matrix D(ω,θ) becomes:
• the response matrix becomes: [1 0].
• for the other channel, the 180-degree direction is the pole direction with response value 1, and the 0-degree direction is the zero point direction with response value 0; the steering matrix D(ω,θ) becomes:
• the response matrix becomes: [1 0].
• since the zero point direction and the pole direction of the super-directional differential beams of the left and right channels are symmetrical, only the weight coefficients of the left channel or of the right channel need to be calculated; the other, uncalculated channel can use the same weight coefficients, simply inputting the microphone signals in reverse order when using them.
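The symmetry argument above can be checked numerically. In this hedged sketch (uniform linear array with assumed 5 cm spacing; names and values are illustrative, not the patent's), the left-channel weights are computed once, and the right-channel beam is obtained simply by feeding the microphone signals in reverse order:

```python
import numpy as np

c, d, N = 343.0, 0.05, 4                  # assumed sound speed, spacing, mic count
omega = 2 * np.pi * 1000.0                # one discretized frequency point

def sv(theta):
    """Linear-array steering vector at incident angle theta (radians)."""
    return np.exp(-1j * omega * d * np.cos(theta) * np.arange(N) / c)

# left-channel weights: pole at 0 degrees (response 1), zero at 180 degrees (response 0)
D = np.array([sv(0.0), sv(np.pi)])
h_left = D.conj().T @ np.linalg.solve(D @ D.conj().T, np.array([1.0, 0.0]))

def response(h, theta, reverse=False):
    """Beam response magnitude; reverse=True reuses the same weights with the
    microphone signals input in reverse order (the mirrored channel)."""
    v = sv(theta)[::-1] if reverse else sv(theta)
    return abs(v @ h)
```

With `reverse=True` the pole and zero swap between 0 and 180 degrees, i.e. the same coefficients yield the mirrored right-channel beam.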
  • the beam shape set at the time of determining the weight coefficient may be a preset beam shape or an adjusted beam shape.
  • the super-directional differential beamforming process is performed to obtain a super-directional differential beamforming signal.
  • the super-directional differential beamforming signal in the current application scenario is formed according to the obtained weight coefficient and the audio input signal.
  • the audio input signals are different in different application scenarios.
• depending on the current application scenario, the audio input signals are either the original audio signals collected by the microphone array after echo cancellation, or the original audio signals collected by the microphone array used directly as the audio input signals.
• the super-directional differential beamforming processing is performed according to the determined weight coefficient and the audio input signal, and the processed super-directional differential beam forms an output signal.
• FFT_LEN is the transform length of the fast discrete Fourier transform;
• h(k) is the kth group weight coefficient;
• X(k) = [X_1(k), X_2(k), ..., X_N(k)]^T;
• X_1(k) is the frequency domain signal corresponding to the first audio signal of the echo-canceled original audio signals of the microphone array, or the frequency domain signal corresponding to the first original audio signal collected by the microphone array.
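A small NumPy sketch of the per-bin combination of the weight coefficients with X(k); the uniform weights used here are placeholders of our own choosing (so the beamformer degenerates to simple channel averaging), selected only so the result is easy to verify, not the patent's coefficients:

```python
import numpy as np

rng = np.random.default_rng(0)
N, FFT_LEN = 4, 256
frames = rng.standard_normal((N, FFT_LEN))   # one frame of samples per microphone
X = np.fft.rfft(frames, axis=1)              # X(k) = [X_1(k), ..., X_N(k)]^T per bin k
K = X.shape[1]

H = np.ones((N, K)) / N                      # placeholder weights h(k): uniform, so the
                                             # beamformer reduces to channel averaging
Y = np.einsum('nk,nk->k', H, X)              # combine h(k) with X(k) for every bin
y = np.fft.irfft(Y, n=FFT_LEN)               # inverse transform: time-domain output
```

Because the FFT is linear, uniform weights make `y` exactly the sample-wise mean of the microphone frames, a convenient sanity check on the per-bin pipeline.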
• when the output signal required for the application scenario is a mono signal, the mono super-directional differential beamforming weight coefficient of the current application scenario is obtained, and super-directional differential beamforming processing is performed on the audio input signal according to the acquired mono super-directional differential beamforming weight coefficient to form a mono super-directional differential beamforming signal; when the signal required for the application scenario is a two-channel signal, the left channel super-directional differential beamforming weight coefficient corresponding to the current application scenario and the right channel super-directional differential beamforming weight coefficient corresponding to the current application scenario are obtained respectively; super-directional differential beamforming processing is performed on the audio input signal according to the acquired left channel weight coefficient to obtain the left channel super-directional differential beamforming signal corresponding to the current application scenario; and super-directional differential beamforming processing is performed on the audio input signal according to the acquired right channel weight coefficient to obtain the corresponding right channel super-directional differential beamforming signal.
• in order to better collect the original audio signal, when the output signal type required by the current application scenario is a mono signal, the end-fire direction of the microphone array is adjusted to point to the target sound source, the original audio signal of the target sound source is collected, and the collected original audio signal of the sound source is used as the audio input signal.
• when the signal required for the application scenario is a two-channel signal, such as spatial sound field recording and stereo recording, the microphone array can be split into two sub-arrays, namely a first sub-array and a second sub-array, wherein the end-fire direction of the first sub-array is different from the end-fire direction of the second sub-array; the original audio signals are separately collected by using the first sub-array and the second sub-array, and the super-directional differential beamforming signals in the current application scenario are formed according to the original audio signals collected by the two sub-arrays together with the left channel and right channel super-directional differential beamforming weight coefficients, or according to the echo-canceled versions of the original audio signals collected by the two sub-arrays together with those weight coefficients; a schematic diagram after the microphone array is split into two sub-arrays is shown in FIG. 6, where the audio signal collected by one sub-array is used for the formation of the left channel super-directional differential beamforming signal, and the audio signal of the other sub-array is used for the formation of the right channel super-directional differential beamforming signal.
• whether to perform noise suppression and/or echo suppression processing on the super-directional differential beam may be selected according to the actual application scenario, and the specific noise suppression processing manner and echo suppression processing manner may have multiple implementations.
• in order to achieve a better directional suppression effect, in the embodiment of the present invention, when forming the super-directional differential beam, Q weight coefficients different from the above super-directional differential beam weight coefficients may be calculated for any directions other than the sound source direction to which the end-fire direction of the microphone array can be adjusted; Q beamforming signals obtained using these super-directional differential beam weight coefficients serve as reference noise signals, where Q is an integer not less than 1, and noise suppression is performed to achieve a better directional noise suppression effect.
• the audio signal processing method provided by the embodiment of the present invention can flexibly set the geometry of the microphone array when determining the weight coefficient of the super-directional differential beam, and does not need to set up multiple sets of microphone arrays, because there is not much requirement for the deployment mode of the microphone array; the cost of the microphones is thus reduced, and when the audio collection area is adjusted, the weight coefficient is re-determined according to the adjusted audio collection effective area and the super-directional differential beamforming processing is performed according to the adjusted weight coefficient, thereby improving the user experience.
• an audio signal processing method in human-computer interaction requiring a mono signal and in a high-definition voice communication process is exemplified; a flowchart of the audio signal processing method includes:
• S701: Adjust the microphone array so that the end-fire direction of the microphone array is directed to the target speaker, that is, the sound source.
• the microphone array adjustment may be manual, or may be automatic according to a preset rotation angle; the microphone array may also be used to detect the speaker orientation, after which the end-fire direction of the microphone array is turned toward the target speaker.
• There are many methods for speaker orientation detection using a microphone array, such as sound source localization based on the MUSIC algorithm, the SRP-PHAT steered response power phase transform technique, or the GCC-PHAT generalized cross-correlation phase transform.
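Of the localization methods mentioned, GCC-PHAT is the most straightforward to sketch. The illustrative implementation below (the function name and parameters are ours, not the patent's) whitens the cross-spectrum of two microphone signals and reads the inter-microphone delay off the peak of its inverse transform:

```python
import numpy as np

def gcc_phat(sig, ref, fs):
    """Estimate the time delay of `sig` relative to `ref` via GCC-PHAT:
    whiten the cross-spectrum, then locate the peak of its inverse FFT."""
    n = sig.size + ref.size                        # zero-padded FFT length
    S = np.fft.rfft(sig, n) * np.conj(np.fft.rfft(ref, n))
    S /= np.maximum(np.abs(S), 1e-12)              # PHAT weighting: keep phase only
    cc = np.fft.irfft(S, n)
    max_shift = n // 2
    cc = np.concatenate((cc[-max_shift:], cc[:max_shift + 1]))  # center zero lag
    return (np.argmax(np.abs(cc)) - max_shift) / fs             # delay in seconds

fs = 16000
rng = np.random.default_rng(1)
x = rng.standard_normal(1024)                      # signal at the reference microphone
y = np.concatenate((np.zeros(5), x[:-5]))          # same signal delayed by 5 samples
tau = gcc_phat(y, x, fs)                           # estimated delay in seconds
```

The estimated delay, combined with the known microphone spacing, then gives the direction of arrival used to steer the array.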
• S702: Determine whether the user has adjusted the audio collection effective area. When the user has adjusted the audio collection effective area, S703 re-determines the super-directional differential beamforming weight coefficient; otherwise the super-directional differential beamforming weight coefficient is not updated, and the predetermined super-directional differential beamforming weight coefficient is used for S704.
• S703: Re-determine the super-directional differential beamforming weight coefficient according to the audio collection effective area set by the user and the microphone and speaker positions.
• when the user resets the audio collection effective area, the super-directional differential beamforming weight coefficient may be re-determined according to the weight coefficient calculation method for the super-directional differential beam described in the second embodiment.
• the embodiment of the invention utilizes a microphone array including N microphones, collects the original audio signals picked up by the N microphones, synchronously buffers the data signal played by the speaker, uses the data signal played by the speaker as a reference signal for echo suppression and echo cancellation, and performs framing processing on the signals.
• super-directional differential beam processing is performed on the frequency domain signals of the echo-canceled audio input signals to obtain the super-directional differential beamforming signal in the frequency domain; the super-directional differential beamforming signal in the frequency domain is transformed into the time domain by the inverse fast discrete Fourier transform, obtaining the output signal of the super-directional differential beamforming.
• the Q beamforming signals may be obtained as reference noise signals in the same manner for any directions other than the target speaker direction, but the weight coefficients used to generate the Q super-directional differential beams corresponding to the Q reference noise signals need to be recalculated, and the calculation method is similar to the above method: a selected direction other than the direction of the target speaker is used as the pole direction of the beam with response value 1, and the direction opposite to the pole direction as the zero point direction with response value 0; the weight coefficients of the Q super-directional differential beamforming operations can thus be calculated according to the Q selected directions.
• S707: Perform noise suppression processing.
• noise suppression processing is performed on the output signal of the super-directional differential beamforming to obtain a noise-suppressed signal; the Q reference noise signals may be used for further noise suppression processing to achieve a better directional noise suppression effect.
• the echo suppression processing is performed based on the synchronously buffered speaker playback data and the noise-suppressed signal to obtain the final output signal.
• S708 is optional, and the echo suppression processing may or may not be performed.
• the execution order of S707 and S706 is not fixed: the noise suppression processing may be performed first and then the echo suppression processing, or the echo suppression processing first and then the noise suppression processing.
• the execution order of S705 and S706 is also interchangeable.
• in that case, the audio input signal is no longer the echo-canceled signal but the super-directional differential beamforming output signal obtained from the original audio signals collected by the microphones, so that the original N-channel processing can be reduced to single-channel processing during the echo suppression processing.
• zero points need to be set at the positions of the left and right speakers to avoid the influence of the echo signal on the noise suppression performance.
• if the audio output signal after the above processing in the embodiment of the present invention is applied in high-definition voice communication, the final output signal is encoded and transmitted to the other party; if it is applied to human-computer interaction, the final output signal is further processed as a front-end acquisition signal for speech recognition.
  • an audio signal processing method in spatial sound field recording requiring a two-channel signal is exemplified.
  • a flow chart of an audio signal processing method in a spatial sound field recording process includes:
• S801: Collect the original audio signals.
• the original signals picked up by the N microphones are collected, and framing processing is performed on the signals.
• S802: Perform left channel super-directional differential beamforming processing and right channel super-directional differential beamforming processing, respectively.
• the super-directional differential beamforming weight coefficient of the left channel corresponding to the current application scenario and the super-directional differential beamforming weight coefficient of the right channel corresponding to the current application scenario are pre-calculated and stored; using these stored weight coefficients together with the original audio acquisition signals from S801, the left channel super-directional differential beamforming processing and the right channel super-directional differential beamforming processing corresponding to the current application scenario are performed respectively, obtaining the left channel super-directional differential beamforming signal corresponding to the current application scenario and the right channel super-directional differential beamforming signal y_R(n) corresponding to the current application scenario.
• the super-directional differential beamforming weight coefficient of the left channel and the super-directional differential beamforming weight coefficient of the right channel in this embodiment are determined by the weight coefficient determination method described in the second embodiment for the case where the output signal type required for the application scenario is a two-channel signal, and will not be described again here.
• the audio input signals are the original audio signals of the collected N microphones, and the weight coefficients are the super-directional differential beamforming weight coefficients corresponding to the left channel or the right channel, respectively.
• multi-channel joint noise suppression is adopted, and the left channel super-directional differential beamforming signal and the right channel super-directional differential beamforming signal y_R(n) are used as input signals for multi-channel joint noise suppression; in this way the sound image of the non-noise signal can be prevented from drifting while the noise is suppressed, and the residual noise of the left and right channels does not affect the sense of hearing of the processed stereo signal.
• multi-channel joint noise suppression is optional; without multi-channel joint noise suppression, the left channel super-directional differential beamforming signal and the right channel super-directional differential beamforming signal can directly constitute a stereo signal that is output as the final spatial sound field recording signal.
  • an audio signal processing method in a stereo call is exemplified.
  • a flow chart of a method for processing an audio signal in a stereo call according to an embodiment of the present invention includes:
• S901: Acquire the original audio signals picked up by the N microphones, and synchronously buffer the speaker playback data as reference signals for multi-channel joint echo suppression and multi-channel joint echo cancellation, performing framing processing on the original audio signals and the reference signals.
• Q is the number of channels of the speaker playback data.
• S903: Perform left channel super-directional differential beamforming and right channel super-directional differential beamforming processing, respectively. Specifically, the left channel super-directional differential beamforming and the right channel super-directional differential beamforming are performed in the embodiment of the present invention, and the left channel super-directional differential beamforming signal and the right channel super-directional differential beamforming signal y_R(n) are obtained after the processing.
  • the process of performing the multi-channel joint noise suppression process in the embodiment of the present invention is the same as the process of the S803 in the fourth embodiment, and details are not described herein again.
• the echo suppression processing is performed according to the synchronously buffered speaker playback data and the multi-channel joint noise-suppressed signal, and the final output signal is obtained.
  • the multi-channel joint echo suppression processing in the embodiment of the present invention is optional, and may or may not be performed.
  • the execution sequence of the multi-channel joint echo suppression processing process and the multi-channel joint noise suppression processing process are not required, and the multi-channel joint noise suppression processing may be performed first, and then the multi-channel joint echo suppression processing may be performed, or Multi-channel joint echo suppression processing is performed first and then multi-channel joint noise suppression processing is performed.
  • This embodiment of the present invention provides an audio signal processing method applied to spatial sound field recording and stereo calls.
  • The sound field pickup mode can be adjusted according to the needs of the user: before the audio signal is acquired, the microphone array is split into two sub-arrays, the end-fire directions of the sub-arrays are adjusted separately, and the original audio signals are acquired through the two split sub-arrays.
  • When the microphone array is split into two sub-arrays and the end-fire directions of the sub-arrays are adjusted separately, the adjustment may be made manually by the user, may be made automatically according to an angle set by the user, or may use a preset rotation angle.
  • For example, the rotation angle may be set so that the left sub-array rotates 45 degrees counterclockwise and the right sub-array rotates 45 degrees clockwise; of course, the angle may also be adjusted according to user settings.
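As a toy illustration of the preset-rotation case described above, the two sub-array end-fire directions can be derived from a common base direction. The angle convention and the function name below are illustrative assumptions, not taken from the patent:

```python
def split_endfire_directions(base_deg, rotation_deg=45.0):
    """Rotate the two sub-arrays' end-fire directions away from a common
    base direction: the left sub-array counterclockwise, the right
    sub-array clockwise (45 degrees by default, adjustable by the user)."""
    left = (base_deg + rotation_deg) % 360.0
    right = (base_deg - rotation_deg) % 360.0
    return left, right

print(split_endfire_directions(90.0))  # (135.0, 45.0)
```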
  • FIG. 10A is a flowchart of an audio signal processing method during spatial sound field recording, and FIG. 10B is a flowchart of an audio signal processing method during a stereo call.
  • Embodiment 7 of the present invention provides an audio signal processing apparatus.
  • The apparatus includes a weight coefficient storage module 1101, a signal acquiring module 1102, a beamforming processing module 1103, and a signal output module 1104, where:
  • the weight coefficient storage module 1101 is configured to store super-directional differential beamforming weight coefficients;
  • the signal acquiring module 1102 is configured to acquire an audio input signal and transmit the acquired audio input signal to the beamforming processing module 1103, and is further configured to determine a current application scenario and an output signal type required by the current application scenario, and transmit the current application scenario and the output signal type required by the current application scenario to the beamforming processing module 1103;
  • the beamforming processing module 1103 is configured to select, from the weight coefficient storage module 1101 according to the output signal type required by the current application scenario, a weight coefficient corresponding to the current application scenario, perform super-directional differential beamforming processing on the audio input signal output by the signal acquiring module 1102 by using the selected weight coefficient to obtain a super-directional differential beamforming signal, and transmit the super-directional differential beamforming signal to the signal output module 1104; and
  • the signal output module 1104 is configured to output the super-directional differential beamforming signal transmitted by the beamforming processing module 1103.
  • The beamforming processing module 1103 is specifically configured to:
  • when the output signal type required by the current application scenario is a two-channel signal, acquire a left-channel super-directional differential beamforming weight coefficient and a right-channel super-directional differential beamforming weight coefficient from the weight coefficient storage module 1101, perform super-directional differential beamforming processing on the audio input signal according to the acquired left-channel super-directional differential beamforming weight coefficient to obtain a left-channel super-directional differential beamforming signal, perform super-directional differential beamforming processing on the audio input signal according to the right-channel super-directional differential beamforming weight coefficient to obtain a right-channel super-directional differential beamforming signal, and transmit the left-channel super-directional differential beamforming signal and the right-channel super-directional differential beamforming signal to the signal output module 1104.
  • The signal output module 1104 is specifically configured to:
  • output the left-channel super-directional differential beamforming signal and the right-channel super-directional differential beamforming signal.
  • The beamforming processing module 1103 is specifically configured to:
  • when the output signal type required by the current application scenario is a mono signal, acquire, from the weight coefficient storage module 1101, a mono super-directional differential beamforming weight coefficient corresponding to the current application scenario for forming the mono signal;
  • perform super-directional differential beamforming processing on the audio input signal according to the mono super-directional differential beamforming weight coefficient to form one mono super-directional differential beamforming signal, and transmit the mono super-directional differential beamforming signal to the signal output module 1104.
  • The signal output module 1104 is specifically configured to: output the mono super-directional differential beamforming signal.
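Applying a stored weight coefficient to the audio input signal can be sketched as a per-frequency-bin weighted sum across microphones. The convention below (no conjugation of the weights) is one common choice and is an assumption, since the patent does not spell out the arithmetic:

```python
def apply_beamformer(weights, mic_spectra):
    """Form one beam output per frequency bin: y[k] = sum_n h_n[k] * X_n[k],
    where weights[n][k] is mic n's weight in bin k and mic_spectra[n][k]
    is mic n's input spectrum in bin k."""
    num_bins = len(mic_spectra[0])
    return [sum(h[k] * x[k] for h, x in zip(weights, mic_spectra))
            for k in range(num_bins)]

# Two mics, one bin: equal weights of 0.5 average the two inputs.
y = apply_beamformer([[0.5], [0.5]], [[2 + 0j], [4 + 0j]])
print(y)  # [(3+0j)]
```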
  • The apparatus further includes a microphone array adjustment module 1105, as shown in FIG. 11B, where:
  • the microphone array adjustment module 1105 is configured to adjust the microphone array into a first sub-array and a second sub-array, where an end-fire direction of the first sub-array is different from an end-fire direction of the second sub-array; and
  • the first sub-array and the second sub-array separately acquire original audio signals, and the original audio signals are transmitted to the signal acquiring module 1102 as audio input signals.
  • The microphone array is adjusted into two sub-arrays whose end-fire directions point in different directions, so as to separately acquire the original audio signals required for the left-channel super-directional differential beamforming processing and the right-channel super-directional differential beamforming processing.
  • Alternatively, the apparatus includes a microphone array adjustment module 1105 configured to adjust the end-fire direction of the microphone array so that the end-fire direction points to a target sound source; the microphone array acquires the original audio signal from the target sound source, and the original audio signal is transmitted to the signal acquiring module 1102 as the audio input signal.
  • The apparatus further includes a weight coefficient update module 1106, as shown in FIG. 11C, where: the weight coefficient update module 1106 is configured to determine whether the audio acquisition area has been adjusted; if the audio acquisition area has been adjusted, determine the geometry of the microphone array, the loudspeaker position, and the adjusted effective audio acquisition area; adjust the beam shape according to the effective audio acquisition area, or adjust the beam shape according to the effective audio acquisition area and the loudspeaker position, to obtain an adjusted beam shape; determine super-directional differential beamforming weight coefficients according to the geometry of the microphone array and the adjusted beam shape, to obtain adjusted weight coefficients; and transmit the adjusted weight coefficients to the weight coefficient storage module 1101.
  • The weight coefficient storage module 1101 is specifically configured to: store the adjusted weight coefficients.
  • The weight coefficient update module 1106 is specifically configured to determine the super-directional differential beamforming weight coefficient according to the formula: h(ω) = D^H(ω,θ)[D(ω,θ)D^H(ω,θ)]^(-1)β, where:
  • h(ω) is the weight coefficient;
  • D(ω,θ) is the steering matrix corresponding to a microphone array of any geometry, determined by the relative delays with which the sound source, at different incident angles, reaches the individual microphones in the array;
  • D^H(ω,θ) denotes the conjugate transpose matrix of D(ω,θ);
  • ω is the frequency of the audio signal;
  • θ is the incident angle of the sound source; and
  • β is the response vector when the incident angle is θ.
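For a concrete (hypothetical) two-microphone case, the least-squares relation h(ω) = D^H(ω,θ)[D(ω,θ)D^H(ω,θ)]^(-1)β reduces to solving D·h = β directly, since D is square. The array spacing, speed of sound, and all function names below are illustrative assumptions, not values from the patent:

```python
import cmath
import math

C = 343.0  # assumed speed of sound (m/s)

def steering_vector(omega, theta_deg, mic_positions):
    """Far-field steering vector for a linear array along the x-axis:
    element n has phase exp(-j * omega * d_n * cos(theta) / c)."""
    theta = math.radians(theta_deg)
    return [cmath.exp(-1j * omega * d * math.cos(theta) / C)
            for d in mic_positions]

def solve_2x2(a, b):
    """Solve the 2x2 complex system a @ x = b by Cramer's rule."""
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    x0 = (b[0] * a[1][1] - a[0][1] * b[1]) / det
    x1 = (a[0][0] * b[1] - b[0] * a[1][0]) / det
    return [x0, x1]

def differential_weights(omega, mic_positions, pole_deg, zero_deg):
    """Weights h with response 1 at the pole direction and 0 at the zero
    direction: the rows of D are steering vectors at the constraint
    angles, and h solves D h = beta with beta = [1, 0]."""
    D = [steering_vector(omega, pole_deg, mic_positions),
         steering_vector(omega, zero_deg, mic_positions)]
    return solve_2x2(D, [1.0, 0.0])

def response(h, omega, theta_deg, mic_positions):
    """Beam response at a given incident angle."""
    d = steering_vector(omega, theta_deg, mic_positions)
    return sum(hi * di for hi, di in zip(h, d))

mics = [0.0, 0.02]               # two mics, 2 cm apart (assumed)
omega = 2 * math.pi * 1000.0     # 1 kHz
h = differential_weights(omega, mics, pole_deg=0, zero_deg=180)
print(abs(response(h, omega, 0, mics)))    # ~1.0 at the pole
print(abs(response(h, omega, 180, mics)))  # ~0.0 at the zero
```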
  • The weight coefficient update module 1106 is specifically configured to:
  • determine D(ω,θ) and β according to the geometry of the microphone array and the set effective audio acquisition area, or determine D(ω,θ) and β according to the geometry of the microphone array, the set effective audio acquisition area, and the loudspeaker position;
  • specifically, according to the output signal types required by different application scenarios, convert the set effective audio area into pole directions and zero directions, and determine D(ω,θ) and β for the different application scenarios according to the obtained pole directions and zero directions; or, according to the output signal types required by different application scenarios, convert the set effective audio area into pole directions and zero directions, convert the loudspeaker position into a zero direction, and determine D(ω,θ) and β for the different application scenarios according to the obtained pole directions and zero directions;
  • where a pole direction is an incident angle at which the response value of the formed super-directional differential beam is 1, and a zero direction is an incident angle at which the response value of the formed super-directional differential beam is 0.
  • The weight coefficient update module 1106 is specifically configured to:
  • when the output signal type required by the application scenario is a mono signal, set the end-fire direction of the microphone array as the pole direction and set M zero directions, where M ≤ N − 1 and N is the number of microphones in the microphone array; and
  • when the output signal type required by the application scenario is a two-channel signal, set the 0-degree direction of the microphone array as the pole direction and the 180-degree direction of the microphone array as the zero direction to determine the super-directional differential beamforming weight coefficient corresponding to one channel, and set the 180-degree direction of the microphone array as the pole direction and the 0-degree direction of the microphone array as the zero direction to determine the super-directional differential beamforming weight coefficient corresponding to the other channel.
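The mono and two-channel rules above can be expressed as a small mapping from the required output signal type to pole/zero constraint sets. The function name and data representation are illustrative assumptions:

```python
def constraint_directions(signal_type, num_mics, endfire_deg=0.0, zero_dirs=None):
    """Return a list of (pole_direction, zero_directions) tuples, one per
    output channel, following the pole/zero rule described above."""
    if signal_type == "mono":
        zero_dirs = zero_dirs or []
        if len(zero_dirs) > num_mics - 1:   # at most M <= N - 1 zeros
            raise ValueError("at most N-1 zero directions for N microphones")
        return [(endfire_deg, list(zero_dirs))]
    if signal_type == "stereo":
        return [(0.0, [180.0]),    # one channel: pole at 0 deg, zero at 180 deg
                (180.0, [0.0])]    # other channel: pole at 180 deg, zero at 0 deg
    raise ValueError("unsupported output signal type")

print(constraint_directions("stereo", 4))
# [(0.0, [180.0]), (180.0, [0.0])]
```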
  • The apparatus further includes an echo cancellation module 1107, as shown in FIG. 11D, where: the echo cancellation module 1107 is configured to buffer the loudspeaker playback signal and perform echo cancellation on the original audio signal acquired by the microphone array to obtain an echo-cancelled audio signal, and transmit the echo-cancelled audio signal to the signal acquiring module 1102 as the audio input signal; or perform echo cancellation on the super-directional differential beamforming signal output by the beamforming processing module 1103 to obtain an echo-cancelled super-directional differential beamforming signal, and transmit the echo-cancelled super-directional differential beamforming signal to the signal output module 1104.
  • The signal output module 1104 is specifically configured to: output the echo-cancelled super-directional differential beamforming signal.
  • The audio input signal required by the signal acquiring module 1102 is the original audio signal acquired by the microphone array after echo cancellation by the echo cancellation module 1107, or the original audio signal acquired by the microphone array.
  • The apparatus further includes an echo suppression module 1108 and a noise suppression module 1109, as shown in FIG. 11E, where:
  • the echo suppression module 1108 is configured to perform echo suppression processing on the super-directional differential beamforming signal output by the beamforming processing module 1103, and the noise suppression module 1109 is configured to perform noise suppression processing on the echo-suppressed super-directional differential beamforming signal output by the echo suppression module 1108; or
  • the noise suppression module 1109 is configured to perform noise suppression processing on the super-directional differential beamforming signal output by the beamforming processing module 1103, and the echo suppression module 1108 is configured to perform echo suppression processing on the noise-suppressed super-directional differential beamforming signal output by the noise suppression module 1109; or
  • the echo suppression module 1108 is configured to perform echo suppression processing on the super-directional differential beamforming signal output by the beamforming processing module 1103, and the noise suppression module 1109 is configured to perform noise suppression processing on the super-directional differential beamforming signal output by the beamforming processing module 1103.
  • The signal output module 1104 is specifically configured to:
  • output the echo-suppressed super-directional differential beamforming signal or the noise-suppressed super-directional differential beamforming signal.
  • The beamforming processing module 1103 is further configured to:
  • when the apparatus includes the noise suppression module 1109, form at least one beamforming signal as a reference noise signal in a direction, other than the sound source direction, among the end-fire directions that the microphone array can adjust, and transmit the formed reference noise signal to the noise suppression module 1109.
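One hedged way such a reference noise beam could drive noise suppression is a per-bin magnitude-subtraction gain. The patent does not fix a specific suppression rule, so the following is only a sketch under that assumption:

```python
def noise_suppress(beam_mag, ref_noise_mag, gain_floor=0.1):
    """Attenuate each frequency bin of the beam output by the fraction the
    reference noise beam explains, never dropping the gain below gain_floor."""
    out = []
    for s, n in zip(beam_mag, ref_noise_mag):
        gain = 1.0 - n / s if s > 0 else 0.0
        out.append(s * max(gain, gain_floor))
    return out

print(noise_suppress([1.0, 2.0], [0.5, 0.2]))  # [0.5, 1.8]
```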
  • The super-directional differential beam used is a differential beam constructed according to the geometry of the microphone array and the set beam shape.
  • In the audio signal processing apparatus provided in this embodiment of the present invention, the beamforming processing module selects a corresponding weight coefficient from the weight coefficient storage module according to the output signal type required by the current application scenario, and performs super-directional differential beam processing, by using the selected weight coefficient, on the audio input signal output by the signal acquiring module to form a super-directional differential beam in the current application scenario; the super-directional differential beam is then processed correspondingly to obtain the final desired audio signal, which can meet the requirement that different application scenarios need different audio signal processing manners.
  • The audio signal processing apparatus may be an independent component or may be integrated into another component.
  • An embodiment of the present invention provides a differential beamforming method, as shown in FIG. 12, including:
  • S1201: Determine differential beamforming weight coefficients according to the geometry of the microphone array and the set effective audio acquisition area, and store them; or determine differential beamforming weight coefficients according to the geometry of the microphone array, the set effective audio acquisition area, and the loudspeaker position, and store them.
  • S1202: Acquire, according to the output signal type required by the current application scenario, the differential beamforming weight coefficient corresponding to the current application scenario, and perform differential beamforming processing on the audio input signal by using the acquired weight coefficient to obtain a super-directional differential beam.
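The two steps S1201 and S1202 amount to a precompute-and-store stage followed by a lookup-and-apply stage. A minimal sketch follows; the class, method, and key names are illustrative assumptions:

```python
class DifferentialBeamformer:
    """Store per-scenario weight coefficients (S1201), then look them up by
    scenario and apply them to a microphone frame (S1202)."""

    def __init__(self):
        self._weights = {}  # scenario -> list of per-channel weight vectors

    def store(self, scenario, channel_weights):
        self._weights[scenario] = channel_weights

    def process(self, scenario, mic_frame):
        """Return one weighted-sum output per channel for this scenario."""
        return [sum(w * x for w, x in zip(h, mic_frame))
                for h in self._weights[scenario]]

bf = DifferentialBeamformer()
bf.store("stereo", [[1.0, 0.0], [0.0, 1.0]])  # toy left/right weights
print(bf.process("stereo", [3.0, 4.0]))       # [3.0, 4.0]
```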
  • The process of determining a differential beamforming weight coefficient includes:
  • determining D(ω,θ) and β, and then determining the super-directional differential beamforming weight coefficient according to the formula: h(ω) = D^H(ω,θ)[D(ω,θ)D^H(ω,θ)]^(-1)β;
  • where h(ω) is the weight coefficient;
  • D(ω,θ) is the steering matrix corresponding to a microphone array of any geometry, determined by the relative delays with which the sound source, at different incident angles, reaches the individual microphones in the array;
  • D^H(ω,θ) denotes the conjugate transpose matrix of D(ω,θ);
  • ω is the frequency of the audio signal;
  • θ is the incident angle of the sound source; and
  • β is the response vector when the incident angle is θ.
  • Determining D(ω,θ) and β according to the geometry of the microphone array and the set effective audio acquisition area, or determining D(ω,θ) and β according to the geometry of the microphone array, the set effective audio acquisition area, and the loudspeaker position, specifically includes:
  • according to the output signal types required by different application scenarios, converting the set effective audio area into pole directions and zero directions, and determining D(ω,θ) and β for the different application scenarios according to the obtained pole directions and zero directions; or, according to the output signal types required by different application scenarios, converting the set effective audio area into pole directions and zero directions, converting the loudspeaker position into a zero direction, and determining D(ω,θ) and β for the different application scenarios according to the obtained pole directions and zero directions; where a pole direction is an incident angle at which the response value of the formed super-directional differential beam is 1, and a zero direction is an incident angle at which the response value of the formed super-directional differential beam is 0.
  • When the output signal type required by the application scenario is a mono signal, the end-fire direction of the microphone array is set as the pole direction and M zero directions are set, where M ≤ N − 1 and N is the number of microphones in the microphone array; when the output signal type required by the application scenario is a two-channel signal, the 0-degree direction of the microphone array is set as the pole direction and the 180-degree direction is set as the zero direction to determine the super-directional differential beamforming weight coefficient corresponding to one channel, and the 180-degree direction is set as the pole direction and the 0-degree direction as the zero direction to determine the super-directional differential beamforming weight coefficient corresponding to the other channel.
  • The differential beamforming method provided in this embodiment of the present invention can determine different weight coefficients according to the audio signal output types required by different scenarios, so the differential beam formed by the differential beam processing is highly adaptable and can meet the requirements of different scenarios on the generated beam shape.
  • An embodiment of the present invention provides a differential beamforming apparatus, as shown in FIG. 13, including a weight coefficient determining unit 1301 and a beamforming processing unit 1302, where:
  • the weight coefficient determining unit 1301 is configured to determine differential beamforming weight coefficients according to the geometry of the microphone array and the set effective audio acquisition area, and transmit the determined differential beamforming weight coefficients to the beamforming processing unit 1302; or determine differential beamforming weight coefficients according to the geometry of the microphone array, the set effective audio acquisition area, and the loudspeaker position, and transmit the determined differential beamforming weight coefficients to the beamforming processing unit 1302; and
  • the beamforming processing unit 1302 selects, from the weight coefficient determining unit 1301 according to the output signal type required by the current application scenario, a corresponding weight coefficient, and performs differential beamforming processing on the audio input signal by using the selected weight coefficient.
  • The weight coefficient determining unit 1301 is specifically configured to:
  • determine D(ω,θ) and β according to the geometry of the microphone array and the set effective audio area, or determine D(ω,θ) and β according to the geometry of the microphone array, the set effective audio area, and the loudspeaker position; and determine the super-directional differential beamforming weight coefficient according to the determined D(ω,θ) and β by using the formula: h(ω) = D^H(ω,θ)[D(ω,θ)D^H(ω,θ)]^(-1)β;
  • where h(ω) is the weight coefficient; D(ω,θ) is the steering matrix corresponding to a microphone array of any geometry, determined by the relative delays with which the sound source, at different incident angles, reaches the individual microphones in the array; D^H(ω,θ) denotes the conjugate transpose matrix of D(ω,θ); ω is the frequency of the audio signal; θ is the incident angle of the sound source; and β is the response vector when the incident angle is θ.
  • The weight coefficient determining unit 1301 is specifically configured to:
  • convert the set effective audio area into pole directions and zero directions according to the output signal types required by different application scenarios, where a pole direction is an incident angle at which the response value of the super-directional differential beam to be formed is 1, and a zero direction is an incident angle at which the response value of the super-directional differential beam to be formed is 0.
  • The weight coefficient determining unit 1301 is further specifically configured to:
  • when the output signal type required by the application scenario is a mono signal, set the end-fire direction of the microphone array as the pole direction and set M zero directions, where M ≤ N − 1 and N is the number of microphones in the microphone array; and
  • when the output signal type required by the application scenario is a two-channel signal, set the 0-degree direction of the microphone array as the pole direction and the 180-degree direction as the zero direction to determine the super-directional differential beamforming weight coefficient corresponding to one channel, and set the 180-degree direction as the pole direction and the 0-degree direction as the zero direction to determine the super-directional differential beamforming weight coefficient corresponding to the other channel.
  • The differential beamforming apparatus provided in this embodiment of the present invention can determine different weight coefficients according to the audio signal output types required by different scenarios, and the differential beam formed by the differential beam processing is highly adaptable and can meet the requirements of different scenarios on the generated beam shape.
  • For the differential beamforming process involved in the differential beamforming apparatus in this embodiment of the present invention, reference may further be made to the description of the differential beamforming process in the related method embodiments, and details are not described herein again.
  • For the audio signal processing method and apparatus and the differential beamforming method and apparatus provided in the embodiments of the present invention, an embodiment of the present invention further provides a controller.
  • As shown in FIG. 14, the controller includes a processor 1401 and an I/O interface 1402, where:
  • the processor 1401 is configured to determine and store super-directional differential beamforming weight coefficients corresponding to different output signal types in different application scenarios, acquire an audio input signal, determine a current application scenario and the output signal type required by the current application scenario, acquire the weight coefficient corresponding to the current application scenario according to the output signal type required by the current application scenario, perform super-directional differential beamforming processing on the acquired audio input signal by using the acquired weight coefficient to obtain a super-directional differential beamforming signal, and transfer the super-directional differential beamforming signal to the I/O interface 1402; and
  • the I/O interface 1402 is configured to output the super-directional differential beamforming signal obtained by the processor 1401.
  • The controller provided in this embodiment of the present invention acquires a corresponding weight coefficient according to the output signal type required by the current application scenario, performs super-directional differential beam processing on the audio input signal by using the acquired weight coefficient to form a super-directional differential beam in the current application scenario, and processes the super-directional differential beam correspondingly to obtain the final desired audio signal, which can meet the requirement that different application scenarios need different audio signal processing manners.
  • The foregoing controller may be a separate component or may be integrated into another component.
  • Embodiments of the present invention can be provided as a method, a system, or a computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware. Moreover, the invention can take the form of a computer program product embodied on one or more computer-usable storage media (including but not limited to disk storage, CD-ROM, and optical storage) containing computer-usable program code.
  • These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing device to operate in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture comprising an instruction apparatus that implements the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
  • These computer program instructions may also be loaded onto a computer or other programmable data processing device, such that a series of operational steps are performed on the computer or other programmable device to produce computer-implemented processing, and the instructions executed on the computer or other programmable device provide steps for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.


Abstract

An audio signal processing method and apparatus, and a differential beamforming method and apparatus, intended to solve the problem that existing audio signal processing systems cannot handle audio signal processing in multiple application scenarios at the same time. The method includes: determining super-directional differential beamforming weight coefficients (S101); acquiring an audio input signal, and determining the current application scenario and the audio output signal required by the current application scenario (S102); acquiring the weight coefficient corresponding to the current application scenario according to the output signal type required by the current application scenario (S103); performing super-directional differential beamforming processing on the audio input signal by using the acquired weight coefficient to obtain a super-directional differential beamforming signal in the current application scenario (S104); and processing the formed signal to obtain the final audio signal required by the current application scenario. The method can meet the requirement that different application scenarios need different audio signal processing manners.

Description

Audio Signal Processing Method and Apparatus, and Differential Beamforming Method and Apparatus

Technical Field

The present invention relates to the field of audio technologies, and in particular, to an audio signal processing method and apparatus and a differential beamforming method and apparatus.

Background
With the continuous development of microphone array processing technologies, microphone arrays are used for audio signal acquisition in an increasingly wide range of applications, for example, in high-definition calls, audio/video conferences, voice interaction, spatial sound field recording, and other application scenarios, and will gradually be applied in wider application scenarios such as in-vehicle systems, home media systems, and video conference systems.

Generally, different application scenarios have different audio signal processing apparatuses and use different microphone array processing technologies. For example, in high-performance human-machine interaction and high-definition voice communication scenarios that require a mono signal, a microphone array based on adaptive beamforming technology is generally used to acquire audio signals, and a mono signal is output after the audio signals acquired by the microphone array are processed. That is, such an audio signal processing system for mono signal output can obtain only a mono signal and cannot be applied in a scenario requiring a two-channel signal; for example, it cannot implement recording of a spatial sound field.

With the development of integration, terminals that integrate multiple functions such as high-definition calls, audio/video conferences, voice interaction, and spatial sound field recording have been put into use. When such a terminal works in different application scenarios, different microphone array processing systems are needed to process audio signals to obtain different output signals, and the technical implementation is relatively complex. Therefore, designing an audio signal processing apparatus that can simultaneously satisfy multiple application scenarios such as high-definition voice communication, audio/video conferences, voice interaction, and spatial sound field recording is a research direction of microphone array processing technologies.

Summary
Embodiments of the present invention provide an audio signal processing method and apparatus and a differential beamforming method and apparatus, to solve the problem that an existing audio signal processing apparatus cannot simultaneously satisfy audio signal processing in multiple application scenarios. According to a first aspect, an audio signal processing apparatus is provided, including a weight coefficient storage module, a signal acquiring module, a beamforming processing module, and a signal output module, where:

the weight coefficient storage module is configured to store super-directional differential beamforming weight coefficients;

the signal acquiring module is configured to acquire an audio input signal and output the audio input signal to the beamforming processing module, and is further configured to determine a current application scenario and an output signal type required by the current application scenario, and transmit the current application scenario and the output signal type required by the current application scenario to the beamforming processing module;

the beamforming processing module is configured to acquire, from the weight coefficient storage module according to the output signal type required by the current application scenario, a weight coefficient corresponding to the current application scenario, perform super-directional differential beamforming processing on the audio input signal by using the acquired weight coefficient to obtain a super-directional differential beamforming signal, and transmit the super-directional differential beamforming signal to the signal output module; and

the signal output module is configured to output the super-directional differential beamforming signal.

With reference to the first aspect, in a first possible implementation manner, the beamforming processing module is specifically configured to:

when the output signal type required by the current application scenario is a two-channel signal, acquire a left-channel super-directional differential beamforming weight coefficient and a right-channel super-directional differential beamforming weight coefficient from the weight coefficient storage module;

perform super-directional differential beamforming processing on the audio input signal according to the left-channel super-directional differential beamforming weight coefficient to obtain a left-channel super-directional differential beamforming signal;

perform super-directional differential beamforming processing on the audio input signal according to the right-channel super-directional differential beamforming weight coefficient to obtain a right-channel super-directional differential beamforming signal; and

transmit the left-channel super-directional differential beamforming signal and the right-channel super-directional differential beamforming signal to the signal output module; and

the signal output module is specifically configured to:

output the left-channel super-directional differential beamforming signal and the right-channel super-directional differential beamforming signal.
With reference to the first aspect, in a second possible implementation manner, the beamforming processing module is specifically configured to:

when the output signal type required by the current application scenario is a mono signal, acquire, from the weight coefficient storage module, a mono super-directional differential beamforming weight coefficient corresponding to the current application scenario;

perform super-directional differential beamforming processing on the audio input signal according to the mono super-directional differential beamforming weight coefficient to form one mono super-directional differential beamforming signal; and

transmit the mono super-directional differential beamforming signal to the signal output module; and

the signal output module is specifically configured to:

output the mono super-directional differential beamforming signal.

With reference to the first aspect, in a third possible implementation manner, the audio signal processing apparatus further includes a microphone array adjustment module, where:

the microphone array adjustment module is configured to adjust the microphone array into a first sub-array and a second sub-array, where an end-fire direction of the first sub-array is different from an end-fire direction of the second sub-array; and

the first sub-array and the second sub-array separately acquire original audio signals, and the original audio signals are transmitted to the signal acquiring module as audio input signals.

With reference to the first aspect, in a fourth possible implementation manner, the audio signal processing apparatus further includes a microphone array adjustment module, where:

the microphone array adjustment module is configured to adjust the end-fire direction of the microphone array so that the end-fire direction points to a target sound source; and

the microphone array acquires the original audio signal emitted by the target sound source, and the original audio signal is transmitted to the signal acquiring module as the audio input signal.
With reference to the first aspect, the first possible implementation manner of the first aspect, or the second possible implementation manner of the first aspect, in a fifth possible implementation manner, the audio signal processing apparatus further includes a weight coefficient update module, where the weight coefficient update module is specifically configured to:

determine whether the audio acquisition area has been adjusted;

if the audio acquisition area has been adjusted, determine the geometry of the microphone array, the loudspeaker position, and the adjusted effective audio acquisition area; adjust the beam shape according to the effective audio acquisition area, or adjust the beam shape according to the effective audio acquisition area and the loudspeaker position, to obtain an adjusted beam shape; and

determine super-directional differential beamforming weight coefficients according to the geometry of the microphone array and the adjusted beam shape, to obtain adjusted weight coefficients, and transmit the adjusted weight coefficients to the weight coefficient storage module; and

the weight coefficient storage module is specifically configured to: store the adjusted weight coefficients.

With reference to the first aspect, in a sixth possible implementation manner, the audio signal processing apparatus further includes an echo cancellation module, where the echo cancellation module is specifically configured to:

buffer the loudspeaker playback signal, perform echo cancellation on the original audio signal acquired by the microphone array to obtain an echo-cancelled audio signal, and transmit the echo-cancelled audio signal to the signal acquiring module as the audio input signal; or

perform echo cancellation on the super-directional differential beamforming signal output by the beamforming processing module to obtain an echo-cancelled super-directional differential beamforming signal, and transmit the echo-cancelled super-directional differential beamforming signal to the signal output module; and

the signal output module is specifically configured to:

output the echo-cancelled super-directional differential beamforming signal.

With reference to the first aspect, in a seventh possible implementation manner, the audio signal processing apparatus further includes an echo suppression module and a noise suppression module, where:

the echo suppression module is configured to perform echo suppression processing on the super-directional differential beamforming signal output by the beamforming processing module, or perform echo suppression processing on the noise-suppressed super-directional differential beamforming signal output by the noise suppression module, to obtain an echo-suppressed super-directional differential beamforming signal, and transmit the echo-suppressed super-directional differential beamforming signal to the signal output module;

the noise suppression module is configured to perform noise suppression processing on the super-directional differential beamforming signal output by the beamforming processing module, or perform noise suppression processing on the echo-suppressed super-directional differential beamforming signal output by the echo suppression module, to obtain a noise-suppressed super-directional differential beamforming signal, and transmit the noise-suppressed super-directional differential beamforming signal to the signal output module; and

the signal output module is specifically configured to:

output the echo-suppressed super-directional differential beamforming signal or the noise-suppressed super-directional differential beamforming signal.

With reference to the seventh possible implementation manner of the first aspect, in an eighth possible implementation manner, the beamforming processing module is further configured to:

form at least one beamforming signal as a reference noise signal in a direction, other than the sound source direction, among the end-fire directions that the microphone array can adjust, and transmit the reference noise signal to the noise suppression module.
According to a second aspect, an audio signal processing method is provided, including:

determining super-directional differential beamforming weight coefficients;

acquiring an audio input signal, and determining a current application scenario and an output signal type required by the current application scenario; and

acquiring, according to the output signal type required by the current application scenario, a weight coefficient corresponding to the current application scenario, performing super-directional differential beamforming processing on the audio input signal by using the acquired weight coefficient to obtain a super-directional differential beamforming signal, and outputting the super-directional differential beamforming signal.

With reference to the second aspect, in a first possible implementation manner, the acquiring, according to the output signal type required by the current application scenario, a weight coefficient corresponding to the current application scenario, performing super-directional differential beamforming processing on the audio input signal by using the acquired weight coefficient to obtain a super-directional differential beamforming signal, and outputting the super-directional differential beamforming signal specifically includes:

when the output signal type required by the current application scenario is a two-channel signal, acquiring a left-channel super-directional differential beamforming weight coefficient and a right-channel super-directional differential beamforming weight coefficient;

performing super-directional differential beamforming processing on the audio input signal according to the left-channel super-directional differential beamforming weight coefficient to obtain a left-channel super-directional differential beamforming signal;

performing super-directional differential beamforming processing on the audio input signal according to the right-channel super-directional differential beamforming weight coefficient to obtain a right-channel super-directional differential beamforming signal; and

outputting the left-channel super-directional differential beamforming signal and the right-channel super-directional differential beamforming signal.

With reference to the second aspect, in a second possible implementation manner, the acquiring, according to the output signal type required by the current application scenario, a weight coefficient corresponding to the current application scenario, performing super-directional differential beamforming processing on the audio input signal by using the acquired weight coefficient to obtain a super-directional differential beamforming signal, and outputting the super-directional differential beamforming signal specifically includes:

when the output signal type required by the current application scenario is a mono signal, acquiring a mono super-directional differential beamforming weight coefficient of the current application scenario for forming a mono signal; and

performing super-directional differential beamforming processing on the audio input signal according to the acquired mono super-directional differential beamforming weight coefficient to form one mono super-directional differential beamforming signal, and outputting the mono super-directional differential beamforming signal.

With reference to the second aspect, in a third possible implementation manner, before the acquiring an audio input signal, the method further includes:

adjusting the microphone array into a first sub-array and a second sub-array, where an end-fire direction of the first sub-array is different from an end-fire direction of the second sub-array; and

separately acquiring original audio signals by using the first sub-array and the second sub-array, and using the original audio signals as audio input signals.

With reference to the second aspect, in a fourth possible implementation manner, before the acquiring an audio input signal, the method further includes:

adjusting the end-fire direction of the microphone array so that the end-fire direction points to a target sound source; and

acquiring the original audio signal of the target sound source, and using the original audio signal as the audio input signal.

With reference to the second aspect, the first possible implementation manner of the second aspect, or the second possible implementation manner of the second aspect, in a fifth possible implementation manner, before the acquiring, according to the output signal type required by the current application scenario, a weight coefficient corresponding to the current application scenario, the method further includes:

determining whether the audio acquisition area has been adjusted;

if the audio acquisition area has been adjusted, determining the geometry of the microphone array, the loudspeaker position, and the adjusted effective audio acquisition area;

adjusting the beam shape according to the effective audio acquisition area, or adjusting the beam shape according to the effective audio acquisition area and the loudspeaker position, to obtain an adjusted beam shape; determining super-directional differential beamforming weight coefficients according to the geometry of the microphone array and the adjusted beam shape, to obtain adjusted weight coefficients; and

performing super-directional differential beamforming processing on the audio input signal by using the adjusted weight coefficients.

With reference to the second aspect, in a sixth possible implementation manner, the method further includes:

performing echo cancellation on the original audio signal acquired by the microphone array; or

performing echo cancellation on the super-directional differential beamforming signal.

With reference to the second aspect, in a seventh possible implementation manner, after the super-directional differential beamforming signal is formed, the method further includes:

performing echo suppression processing and/or noise suppression processing on the super-directional differential beamforming signal. With reference to the second aspect, in an eighth possible implementation manner, the method further includes:

forming at least one beamforming signal as a reference noise signal in a direction, other than the sound source direction, among the end-fire directions that the microphone array can adjust; and

performing noise suppression processing on the super-directional differential beamforming signal by using the reference noise signal.
According to a third aspect, a differential beamforming method is provided, including:

determining differential beamforming weight coefficients according to the geometry of the microphone array and a set effective audio acquisition area, and storing the weight coefficients; or determining differential beamforming weight coefficients according to the geometry of the microphone array, a set effective audio acquisition area, and the loudspeaker position, and storing the weight coefficients; and

acquiring, according to the output signal type required by the current application scenario, a weight coefficient corresponding to the current application scenario, and performing differential beamforming processing on the audio input signal by using the acquired weight coefficient to obtain a super-directional differential beam.

With reference to the third aspect, in a first possible implementation manner, the process of determining the differential beamforming weight coefficients specifically includes:

determining D(ω,θ) and β according to the geometry of the microphone array and the set effective audio acquisition area; or determining D(ω,θ) and β according to the geometry of the microphone array, the set effective audio acquisition area, and the loudspeaker position; and

determining the super-directional differential beamforming weight coefficient according to the determined D(ω,θ) and β by using the formula: h(ω) = D^H(ω,θ)[D(ω,θ)D^H(ω,θ)]^(-1)β;

where h(ω) is the weight coefficient; D(ω,θ) is the steering matrix corresponding to a microphone array of any geometry, determined by the relative delays with which the sound source, at different incident angles, reaches the individual microphones in the array; D^H(ω,θ) denotes the conjugate transpose matrix of D(ω,θ); ω is the frequency of the audio signal; θ is the incident angle of the sound source; and β is the response vector when the incident angle is θ.

With reference to the first possible implementation manner of the third aspect, in a second possible implementation manner, the determining D(ω,θ) and β according to the geometry of the microphone array and the set effective audio acquisition area specifically includes:

converting the set effective audio area into pole directions and zero directions according to the output signal types required by different application scenarios; and

determining D(ω,θ) and β for the different application scenarios according to the converted pole directions and zero directions;

where a pole direction is an incident angle at which the response value of the super-directional differential beam in that direction is 1, and a zero direction is an incident angle at which the response value of the super-directional differential beam in that direction is 0.

With reference to the first possible implementation manner of the third aspect, in a third possible implementation manner, the determining D(ω,θ) and β according to the geometry of the microphone array, the set effective audio acquisition area, and the loudspeaker position specifically includes:

converting the set effective audio area into pole directions and zero directions according to the output signal types required by different application scenarios, and converting the loudspeaker position into a zero direction; and

determining D(ω,θ) and β for the different application scenarios according to the converted pole directions and zero directions;

where a pole direction is an incident angle at which the response value of the super-directional differential beam in that direction is 1, and a zero direction is an incident angle at which the response value of the super-directional differential beam in that direction is 0.

With reference to the second possible implementation manner of the third aspect, or with reference to the third possible implementation manner of the third aspect, in a fourth possible implementation manner, the converting the set effective audio area into pole directions and zero directions according to the output signal types required by different application scenarios specifically includes:

when the output signal type required by the application scenario is a mono signal, setting the end-fire direction of the microphone array as the pole direction and setting M zero directions, where M ≤ N − 1 and N is the number of microphones in the microphone array; and

when the output signal type required by the application scenario is a two-channel signal, setting the 0-degree direction of the microphone array as the pole direction and the 180-degree direction of the microphone array as the zero direction to determine the super-directional differential beamforming weight coefficient corresponding to one channel, and setting the 180-degree direction of the microphone array as the pole direction and the 0-degree direction of the microphone array as the zero direction to determine the super-directional differential beamforming weight coefficient corresponding to the other channel.
According to a fourth aspect, a differential beamforming apparatus is provided, including a weight coefficient determining unit and a beamforming processing unit, where:

the weight coefficient determining unit is configured to determine differential beamforming weight coefficients according to the geometry of the microphone array and a set effective audio acquisition area, and transmit the determined weight coefficients to the beamforming processing unit; or determine differential beamforming weight coefficients according to the geometry of the microphone array, a set effective audio acquisition area, and the loudspeaker position, and transmit the determined weight coefficients to the beamforming processing unit; and

the beamforming processing unit acquires, from the weight coefficient determining unit according to the output signal type required by the current application scenario, a weight coefficient corresponding to the current application scenario, and performs differential beamforming processing on the audio input signal by using the acquired weight coefficient.

With reference to the fourth aspect, in a first possible implementation manner, the weight coefficient determining unit is specifically configured to:

determine D(ω,θ) and β according to the geometry of the microphone array and the set effective audio acquisition area; or determine D(ω,θ) and β according to the geometry of the microphone array, the set effective audio acquisition area, and the loudspeaker position; and determine the super-directional differential beamforming weight coefficient according to the determined D(ω,θ) and β by using the formula:

h(ω) = D^H(ω,θ)[D(ω,θ)D^H(ω,θ)]^(-1)β;

where h(ω) is the weight coefficient; D(ω,θ) is the steering matrix corresponding to a microphone array of any geometry, determined by the relative delays with which the sound source, at different incident angles, reaches the individual microphones in the array; D^H(ω,θ) denotes the conjugate transpose matrix of D(ω,θ); ω is the frequency of the audio signal; θ is the incident angle of the sound source; and β is the response vector when the incident angle is θ.

With reference to the first possible implementation manner of the fourth aspect, in a second possible implementation manner, the weight coefficient determining unit is specifically configured to:

convert the set effective audio area into pole directions and zero directions according to the output signal types required by different application scenarios, and determine D(ω,θ) and β for the different application scenarios according to the obtained pole directions and zero directions; or convert the set effective audio area into pole directions and zero directions according to the output signal types required by different application scenarios, convert the loudspeaker position into a zero direction, and determine D(ω,θ) and β for the different application scenarios according to the obtained pole directions and zero directions;

where a pole direction is an incident angle at which the response value of the super-directional differential beam in that direction is 1, and a zero direction is an incident angle at which the response value of the super-directional differential beam in that direction is 0.

With reference to the second possible implementation manner of the fourth aspect, in a third possible implementation manner, the weight coefficient determining unit is specifically configured to:

when the output signal type required by the application scenario is a mono signal, set the end-fire direction of the microphone array as the pole direction and set M zero directions, where M ≤ N − 1 and N is the number of microphones in the microphone array; and

when the output signal type required by the application scenario is a two-channel signal, set the 0-degree direction of the microphone array as the pole direction and the 180-degree direction of the microphone array as the zero direction to determine the super-directional differential beamforming weight coefficient corresponding to one channel, and set the 180-degree direction of the microphone array as the pole direction and the 0-degree direction of the microphone array as the zero direction to determine the super-directional differential beamforming weight coefficient corresponding to the other channel.
本发明提供的音频信号处理装置, 波束形成处理模块根据当前应用场景 所需的输出信号类型, 从权系数存储模块获取与当前应用场景对应的权系数, 并利用获取的权系数对信号获取模块输出的音频输入信号进行超指向差分波 束处理, 形成当前应用场景下的超指向差分波束形成信号, 对超指向差分波 束进行相应的处理即可得到最终所需的音频输出信号, 能够满足不同应用场 景需要不同音频信号处理方式的需求。 附图说明
图 1为本发明实施例提供的音频信号处理方法流程图;
图 2A-图 2F为本发明实施例提供的直线形麦克风布放示意图;
图 3A-图 3C为本发明实施例提供的麦克风阵列示意图;
图 4A-图 4B为本发明实施例提供的麦克风阵列端射方向与扬声器角度相 关性示意图;
图 5为本发明实施例中形成两路音频信号麦克风阵列角度示意图; 图 6为本发明实施例麦克风阵列拆分为两个子阵列后的示意图; 图 7 为本发明实施例人机交互和高清话音通信过程中音频信号处理方法 流程图;
图 8 为本发明实施例提供的空间声场录制过程中音频信号处理方法流程 图;
图 9为本发明实施例提供的立体声通话中音频信号处理方法流程图; 图 10A为空间声场录制过程中音频信号的处理方法;
图 10B为立体声通话过程中音频信号处理方法流程图;
图 11A-图 11E为本发明实施例提供的音频信号处理装置结构示意图; 图 12为本发明实施例提供的差分波束形成流程示意图;
图 13为本发明实施例提供的差分波束形成装置构成示意图;
图 14为本发明实施例提供的控制器构成示意图。 具体实施方式
下面将结合本发明实施例中的附图, 对本发明实施例中的技术方案进行 清楚、 完整地描述, 显然, 所描述的实施例仅仅是本发明一部分实施例, 并 不是全部的实施例。 基于本发明中的实施例, 本领域普通技术人员在没有做 出创造性劳动前提下所获得的所有其他实施例 , 都属于本发明保护的范围。
实施例一
本发明实施例一提供一种音频信号处理方法, 如图 1所示, 包括: S101 : 确定超指向差分波束形成权系数。
具体的, 本发明实施例中涉及的应用场景可以包括高清通话、 音视频会 议、 语音交互、 空间声场录制等多种应用场景, 根据不同应用场景所需的音 频信号处理方式, 则可确定不同的超指向差分波束形成权系数, 本发明实施 例中超指向差分波束为根据麦克风阵列的几何形状、 预先设定的波束形状, 构建的差分波束。
S102: 获取当前应用场景所需的音频输入信号, 并确定当前应用场景以 及当前应用场景所需输出信号类型。
具体的, 本发明实施例中形成超指向差分波束时, 可根据当前应用场景 下是否需要对麦克风阵列釆集的原始音频信号进行回声消除处理, 确定不同 的音频输入信号, 该音频输入信号可以是根据当前应用场景确定的麦克风阵 列采集的原始音频信号经过回声消除的音频信号, 或者麦克风阵列采集的原 始音频信号。
不同应用场景需要的输出信号类型是不同的, 比如人机交互和高清话音 通信应用场景下需要的是单声道信号 , 在空间声场录制以及立体声通话应用 场景下, 则需要双声道信号, 本发明实施例中根据确定的当前应用场景, 确 定当前应用场景所需输出信号类型。
S103 : 获取当前应用场景对应的权系数。
具体的, 本发明实施例中根据当前应用场景所需输出信号类型获取对应 的权系数, 在当前应用场景所需输出信号类型为双声道信号时, 获取当前应 用场景对应的左声道超指向差分波束形成权系数以及当前应用场景对应的右 声道超指向差分波束形成权系数; 在当前应用场景所需输出信号类型为单声 道信号时, 获取形成单声道信号的当前应用场景的单声道超指向差分波束形 成权系数。
S104: 利用 S103中获取的权系数对 S102中获取的音频输入信号进行超 指向差分波束形成处理, 得到超指向差分波束形成信号。
具体的, 本发明实施例中在当前应用场景所需输出信号类型为双声道信 号时, 获取当前应用场景对应的左声道超指向差分波束形成权系数以及当前 应用场景对应的右声道超指向差分波束形成权系数; 根据当前应用场景对应 的左声道超指向差分波束形成权系数对音频输入信号进行超指向差分波束形 成处理, 得到当前应用场景对应的左声道超指向差分波束形成信号; 以及根 据当前应用场景对应的右声道超指向差分波束形成权系数对音频输入信号进 行超指向差分波束形成处理, 得到当前应用场景对应的右声道超指向差分波 束形成信号。
本发明实施例中, 在当前应用场景所需输出信号类型为单声道信号时, 获取单声道信号的当前应用场景对应的超指向差分波束形成权系数, 根据获 取的差分波束形成权系数, 对音频输入信号进行超指向差分波束形成处理, 形成一路单声道超指向差分波束形成信号。
S105: 输出 S104中得到的超指向差分波束形成信号。
具体的,本发明实施例中输出 S104中得到的超指向差分波束形成信号后, 可对超指向差分波束形成信号进行处理, 得到当前应用场景所需的最终音频 信号, 可以按照当前应用场景下所需的信号处理方式对超指向差分波束形成 信号进行处理, 例如对超指向差分波束形成信号进行噪声抑制处理、 回声抑 制处理等, 最终得到当前应用场景下所需的音频信号。
本发明实施例预先确定不同应用场景下超指向差分波束形成权系数, 在 需要对不同应用场景的音频信号进行处理时, 可以利用确定的当前应用场景 下超指向差分波束形成权系数以及当前应用场景的音频输入信号, 形成当前 应用场景下的超指向差分波束, 对超指向差分波束进行相应的处理即可得到 最终所需的音频信号, 能够满足不同应用场景需要不同音频信号处理方式的 需求。
实施例二
本发明以下将结合附图对实施例一涉及的音频信号处理方法进行详细说 明。
一、 确定超指向差分波束形成权系数
本发明实施例中可根据麦克风阵列的几何形状以及设定的波束形状确定 不同输出信号类型在不同应用场景对应的各超指向差分波束形成权系数, 其 中, 波束形状为根据不同输出信号类型在不同应用场景下对波束形状的要求 确定, 或者根据不同输出信号类型在不同应用场景下对波束形状的要求和扬 声器位置确定。
本发明实施例中, 进行超指向差分波束形成权系数的确定时, 需要构建用于采集音频信号的麦克风阵列, 根据麦克风阵列的几何形状得到不同入射角度下声源到达麦克风阵列中各麦克风间的相对时延, 并根据设定的波束形状, 确定超指向差分波束形成权系数。
根据全指向麦克风阵列的几何形状以及设定的波束形状确定不同输出信 号类型在不同应用场景对应的各超指向差分波束形成权系数, 可按照如下公 式进行计算:
h(ω)=D^H(ω,θ)[D(ω,θ)D^H(ω,θ)]^(-1)β
其中, h(ω)为权系数, D(ω,θ)为任意几何形状的麦克风阵列所对应的转向矩阵, 由不同入射角度下声源到达麦克风阵列中各麦克风间的相对时延决定的, D^H(ω,θ)表示 D(ω,θ)的共轭转置矩阵, ω为音频信号的频率, θ为声源入射角度, β为入射角度为 θ时的响应向量。
在具体应用时, 一般对频率 ω 进行离散化处理, 也就是在信号的有效频带内离散地采样一些频率点, 对于不同的频率 ω_k, 分别求取对应的权系数 h(ω_k), 组成系数矩阵。k 的取值范围与超指向差分波束形成时有效频点数有关。假设超指向差分波束形成时快速离散傅里叶变换的长度为 FFT_LEN, 则有效频点个数为 FFT_LEN/2+1。假设信号的采样率为 f_s Hz, 则 ω_k = 2πk·f_s/FFT_LEN, k=0,1,...,FFT_LEN/2。
进一步的, 本发明实施例中构建的麦克风阵列几何形状可灵活设置, 具体构建的麦克风阵列几何形状并不做限定, 只要能够得到不同入射角度下声源到达麦克风阵列中各麦克风间的相对时延, 确定 D(ω,θ), 然后根据设定的波束形状, 通过上述公式即可确定权系数。
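为直观说明按公式 h(ω)=D^H(ω,θ)[D(ω,θ)D^H(ω,θ)]^(-1)β 求取权系数的过程, 下面给出一个最小化的 Python 示意 (仅为说明性草图, 并非本发明的实现: 假设 2 个麦克风、M=2 个约束, 即一个极点方向和一个零点方向, 其中函数名、麦克风位置等均为演示而设的假设):

```python
import cmath, math

def steering(omega, positions, cos_theta, c=340.0):
    # d(omega, cos(theta)): 直线阵各麦克风在入射角 theta 下的相对相位向量
    return [cmath.exp(-1j * omega * p * cos_theta / c) for p in positions]

def weights_2constraints(omega, positions, cos_pole, cos_null, c=340.0):
    # D 的每一行为 d^H(omega, cos(theta_i)); 此处 M=2: 一个极点、一个零点
    rows = [[z.conjugate() for z in steering(omega, positions, ct, c)]
            for ct in (cos_pole, cos_null)]
    def dot(u, v):                      # u · conj(v)
        return sum(a * b.conjugate() for a, b in zip(u, v))
    # G = D·D^H 为 2x2 矩阵, 可手工求逆
    g11, g12 = dot(rows[0], rows[0]), dot(rows[0], rows[1])
    g21, g22 = dot(rows[1], rows[0]), dot(rows[1], rows[1])
    det = g11 * g22 - g12 * g21
    inv = [[g22 / det, -g12 / det], [-g21 / det, g11 / det]]
    beta = [1.0, 0.0]                   # 极点方向响应为 1, 零点方向响应为 0
    t = [sum(inv[i][j] * beta[j] for j in range(2)) for i in range(2)]
    # h = D^H · (D·D^H)^(-1) · beta
    return [sum(rows[i][k].conjugate() * t[i] for i in range(2))
            for k in range(len(positions))]

def response(omega, positions, h, cos_theta, c=340.0):
    # 波束在入射角 theta 上的响应: d^H(omega, cos(theta)) · h
    d = steering(omega, positions, cos_theta, c)
    return sum(dk.conjugate() * hk for dk, hk in zip(d, h))
```

按此构造, D·h = β 严格成立, 即极点方向响应为 1、零点方向响应为 0; 对每个离散频点 ω_k 重复该求解, 即可组成系数矩阵。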
具体的, 本发明实施例中根据不同应用场景所需输出信号类型需要确定 不同的权系数, 在应用场景所需输出信号为双声道信号时, 则需要按照上述 公式确定左声道超指向差分波束形成权系数以及右声道超指向差分波束形成 权系数。 在应用场景所需输出信号为单声道信号时, 则需要按照上述公式确 定形成单声道信号的单声道超指向差分波束形成权系数。
进一步的, 本发明实施例中选择对应的权系数之前, 还包括: 判断音频采集区域是否被调整; 若音频采集区域被调整, 则确定麦克风阵列的几何形状、 扬声器位置以及调整后的音频采集有效区域; 根据调整后的音频采集有效区域调整波束形状, 或者根据调整后的音频采集有效区域和扬声器位置调整波束形状, 得到调整的波束形状; 然后根据麦克风阵列的几何形状、 调整的波束形状, 按照公式 h(ω)=D^H(ω,θ)[D(ω,θ)D^H(ω,θ)]^(-1)β 确定超指向差分波束形成权系数, 得到调整权系数, 以利用调整权系数对音频输入信号进行超指向差分波束形成处理。
本发明实施例中根据构建的麦克风阵列的几何形状不同, 可得到不同的 D(ω,θ), 以下举例进行说明。
本发明中可以构建包括 Ν个麦克风的直线形阵列, 本发明实施例中设置 的直线形麦克风阵列中麦克风与扬声器的布放方式可以有很多种不同的方 式, 本发明实施例为能实现麦克风端射方向的调整, 将麦克风设置在可转动 的平台上, 如图 2A-图 2F所示, 将扬声器放置在两侧, 两个扬声器之间的部 分分两层, 上层为可转动的, 并在其上面布放 N个麦克风, N为大于等于 2 的正整数, 并且 N个麦克风可以是直线型等间距的, 可以是直线型非等间距 的。
图 2A和图 2B为第一种麦克风与扬声器布放的示意图, 麦克风的开孔朝正上方, 其中图 2A为麦克风与扬声器布放的俯视图, 图 2B为麦克风与扬声器布放的正面示意图。
图 2C和图 2D为本发明提出的另一种麦克风与扬声器布放的俯视图和正面示意图, 与图 2A和图 2B相比, 不同之处在于麦克风的开孔朝向正前方。
图 2E和图 2F为本发明提出的第三种麦克风与扬声器布放的俯视图和正面示意图, 与前两种情况相比, 不同之处在于麦克风的开孔在上层部分的边线上。
本发明实施例中麦克风阵列可以是除直线形阵列以外的其他几何形状的 麦克风阵列, 如圆形阵列、 三角形阵列、 矩形阵列或其他多边形阵列, 当然, 本发明实施例中麦克风与扬声器的布放位置不限于以上几个情况, 这里只是 举例说明。
本发明实施例中根据构建的麦克风阵列几何形状的不同, 则有不同的确定 D(ω,θ)的方式, 例如:
本发明实施例中当麦克风阵列为包括 N 个麦克风的直线形阵列时, 如图 3A 所示, 可采用如下公式进行 D(ω,θ)和 β 的确定, 其中:
D(ω,θ) = [d^H(ω,cosθ_1); d^H(ω,cosθ_2); ...; d^H(ω,cosθ_M)]
其中, d(ω,cosθ_i) = [e^(-jωd_1·cosθ_i/c), e^(-jωd_2·cosθ_i/c), ..., e^(-jωd_N·cosθ_i/c)]^T, i = 1,2,...,M;
其中, θ_i 为第 i 个设定的声源入射角度, 上角标 T 表示转置, c 为声速, 一般可以取 342m/s 或者 340m/s, d_k 为第 k 个麦克风与设定的阵列原点位置之间的距离, 一般情况下, 麦克风阵列的原点位置取阵列的几何中心, 也可以取阵列中的某一个麦克风位置为原点 (如第一个麦克风), ω 为音频信号的频率, N 为麦克风阵列中麦克风的数量, M 为设定的声源入射角度的个数, M≤N。 响应向量 β 的公式:
β = [β_1 β_2 ... β_M]^T
其中 β_i, i = 1,2,...,M, 为第 i 个设定的声源入射角度对应的响应值。
当麦克风阵列为包括 N 个麦克风的均匀圆形阵列时, 如图 3B 所示, 假设 r 为均匀圆阵的半径, θ 为声源入射角度, r_s 为声源与麦克风阵列中心位置之间的距离, 麦克风阵列采集信号的采样频率为 f, c 为声速。假定感兴趣声源的位置为 S, 则位置 S 在均匀圆阵所在平面上的投影为 S′, S′与第一个麦克风之间的夹角称为水平角, 记作 α_1, 那么第 n 个麦克风的水平角 α_n 为:
α_n = α_1 + 2π(n-1)/N, n = 1,2,...,N
则声源 S 距离麦克风阵列第 n 个麦克风的距离为 r_n:
r_n = √(r_s² + r² - 2·r·r_s·cosθ·cosα_n)
则时延调整参数为:
T = [τ_1, τ_2, ..., τ_N] = [(r_s-r_1)·f/c, (r_s-r_2)·f/c, ..., (r_s-r_N)·f/c]
超指向差分波束形成权系数的设计方法计算权系数的公式如下:
h(ω)=D^H(ω,θ)[D(ω,θ)D^H(ω,θ)]^(-1)β
其中转向阵 D(ω,θ)的公式:
D(ω,θ) = [d^H(ω,θ_1); d^H(ω,θ_2); ...; d^H(ω,θ_M)], 其中 d(ω,θ_i) 由入射角 θ_i 下的时延调整参数构成: d(ω,θ_i) = [e^(jωτ_1), e^(jωτ_2), ..., e^(jωτ_N)]^T, i = 1,2,...,M
响应向量 β 的公式: β = [β_1 β_2 ... β_M]^T。
其中, r 为均匀圆阵的半径, θ_i 为第 i 个设定的声源入射角度, r_s 为声源与麦克风阵列中心位置之间的距离, α_1 为设定声源位置在均匀圆阵所在平面上的投影与第一个麦克风之间的夹角, c 为声速, ω 为音频信号的频率, 上角标 T 表示转置, N 为麦克风阵列中麦克风的数量, M 为设定的声源入射角度的个数; 其中 β_i, i = 1,2,...,M, 为第 i 个设定的声源入射角度对应的响应值。
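上述均匀圆阵的几何量 (水平角 α_n、距离 r_n 与时延调整参数 τ_n) 的计算可用如下 Python 片段示意 (距离公式按本文几何关系的一种重构给出, 假设 θ 为声源方向相对阵列平面的夹角; 函数名与参数均为示意性假设):

```python
import math

def circular_delays(N, r, r_s, theta, alpha1, f, c=340.0):
    """按文中公式计算均匀圆阵各麦克风的水平角、声源距离与时延调整参数 (单位: 采样点)."""
    taus = []
    for n in range(1, N + 1):
        alpha_n = alpha1 + 2 * math.pi * (n - 1) / N          # α_n = α_1 + 2π(n-1)/N
        r_n = math.sqrt(r_s ** 2 + r ** 2
                        - 2 * r * r_s * math.cos(theta) * math.cos(alpha_n))
        taus.append((r_s - r_n) * f / c)                      # τ_n = (r_s - r_n)·f/c
    return taus
```

例如当 θ=0 且 α_1=0 时, 第一个麦克风正对声源投影方向, 此时 r_1 = r_s − r, 对应 τ_1 = r·f/c。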
当麦克风阵列为包括 N 个麦克风的均匀矩形阵列时, 如图 3C 所示, 以矩形阵列的几何中心为原点, 假设麦克风阵列的第 n 个麦克风的坐标为 (x_n, y_n), 设定的声源的入射角度为 θ, 声源与麦克风阵列中心位置的距离为 r_s,
则声源 S 距离麦克风阵列第 n 个阵元的距离为 r_n:
r_n = √((r_s·cosθ - x_n)² + (r_s·sinθ - y_n)²)
则时延调整参数为:
T = [τ_1, τ_2, ..., τ_N] = [(r_s-r_1)·f/c, (r_s-r_2)·f/c, ..., (r_s-r_N)·f/c]
超指向差分波束形成权系数的设计方法计算权系数的公式如下:
h(ω)=D^H(ω,θ)[D(ω,θ)D^H(ω,θ)]^(-1)β
其中转向阵 D(ω,θ)的公式:
D(ω,θ) = [d^H(ω,θ_1); d^H(ω,θ_2); ...; d^H(ω,θ_M)], d(ω,θ_i) = [e^(jωτ_1), e^(jωτ_2), ..., e^(jωτ_N)]^T, i = 1,2,...,M
响应向量 β 的公式: β = [β_1 β_2 ... β_M]^T
其中, x_n 为麦克风阵列中第 n 个麦克风的横坐标, y_n 为麦克风阵列中第 n 个麦克风的纵坐标, θ_i 为第 i 个设定的声源入射角度, r_s 为声源与麦克风阵列中心位置之间的距离, ω 为音频信号的频率, c 为声速, N 为麦克风阵列中麦克风的数量, M 为设定的声源入射角度的个数, β_i, i = 1,2,...,M, 为第 i 个设定的声源入射角度对应的响应值。
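类似地, 矩形阵列中 r_n 与 τ_n 的计算可示意如下 (坐标与参数均为演示用假设, 函数名非文中所定义):

```python
import math

def rect_delays(coords, r_s, theta, f, c=340.0):
    """coords 为各麦克风坐标 (x_n, y_n); 按文中公式计算 r_n 与 τ_n (单位: 采样点)."""
    sx, sy = r_s * math.cos(theta), r_s * math.sin(theta)   # 声源坐标 (r_s·cosθ, r_s·sinθ)
    taus = []
    for (x, y) in coords:
        r_n = math.hypot(sx - x, sy - y)    # r_n = √((r_s·cosθ-x_n)² + (r_s·sinθ-y_n)²)
        taus.append((r_s - r_n) * f / c)    # τ_n = (r_s - r_n)·f/c
    return taus
```

例如位于原点的麦克风其 r_n = r_s, 时延为 0; 位于声源方向上的麦克风时延为正, 与其到原点的距离成正比。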
进一步的, 本发明实施例中进行差分波束形成权系数确定时, 通过考虑扬声器位置和不考虑扬声器位置两种方式来确定。当不考虑扬声器位置时, 可根据麦克风阵列的几何形状和设定的音频采集有效区域, 确定 D(ω,θ)和 β。当考虑扬声器位置时, 可根据麦克风阵列的几何形状、 设定的音频采集有效区域和扬声器位置, 确定 D(ω,θ)和 β。
具体的, 本发明实施例中根据麦克风阵列的几何形状和设定的音频采集有效区域, 确定 D(ω,θ)和 β时, 根据不同应用场景所需输出信号类型, 将设定的音频有效区域转换为极点方向以及零点方向; 根据转换的极点方向以及零点方向, 确定不同应用场景下的 D(ω,θ)和 β; 其中, 极点方向为使超指向差分波束在该方向上的响应值为 1 的入射角度, 零点方向为使超指向差分波束在该方向上的响应值为 0 的入射角度。
进一步的, 本发明实施例中根据麦克风阵列的几何形状、 设定的音频采集有效区域和扬声器位置, 确定 D(ω,θ)和 β时, 根据不同应用场景所需输出信号类型, 将设定的音频有效区域转换为极点方向以及零点方向, 将扬声器位置转换为零点方向; 根据转换的极点方向以及零点方向, 确定不同应用场景下的 D(ω,θ)和 β; 其中, 极点方向为使超指向差分波束在该方向上的响应值为 1 的入射角度, 零点方向为使超指向差分波束在该方向上的响应值为 0 的入射角度。
更进一步的, 本发明实施例中才艮据不同应用场景所需输出信号类型, 将 设定的音频有效区域转换为极点方向以及零点方向, 具体包括:
当应用场景所需输出信号类型为单声道信号时, 设定麦克风阵列的端射 方向为极点方向, 并设定 Μ个零点方向, 其中 M≤N- 1 , N为麦克风阵列中 的麦克风数量;
当应用场景所需输出信号类型为双声道信号时, 设定麦克风阵列的 0度 方向为极点方向, 并将麦克风阵列的 180度方向设定为零点方向, 以确定其 中一个声道对应的超指向差分波束形成权系数, 并设定麦克风阵列的 180度 方向为极点方向, 并将麦克风阵列的 0度方向设定为零点方向, 以确定另一 个声道对应的超指向差分波束形成权系数。
本发明实施例中进行波束形状设置时, 可设定波束响应值为 1 的角度、 波束响应值为 0 的角度的个数 (以下简称零点的个数) 以及每一个零点的角度, 也可以设置不同角度下的响应程度, 或者是设定感兴趣区域的角度范围。 本发明实施例中以麦克风阵列为 N 个麦克风的直线形阵列为例进行说明。
假设设定波束形成零点的个数为 L, 每一个零点的角度为 θ_l, l = 1,...,L, L ≤ N-1。根据余弦函数的周期性, 可以取任意角度, 由于余弦函数具有对称性, 一般只取 (0, 180] 之间的角度。
进一步的, 当麦克风阵列为包括 N 个麦克风的直线形阵列时, 可调整麦克风阵列的端射方向, 使端射方向朝向设定的方向, 比如使端射方向朝向声源方向。调整方法可以是手动调整, 也可以是自动调整: 可以预先设定一个旋转角度, 比较常见的旋转角度为顺时针旋转 90 度, 当然也可以利用麦克风阵列进行声源方位检测, 然后将麦克风阵列的端射方向转向声源。如图 3A 所示为调整后的麦克风阵列方向示意图。本发明实施例中取麦克风阵列的端射方向即 0 度方向作为极点方向, 响应值为 1, 此时转向阵 D(ω,θ)变为:
D(ω,θ) = [d^H(ω,1); d^H(ω,cosθ_1); ...; d^H(ω,cosθ_L)]
响应向量 β 变为: β = [1 0 ... 0]^T。
假设设定感兴趣区域的角度范围为 [γ, 180] 时, γ 为 0 度到 180 度之间的角度, 此时, 可设定端射方向为极点方向, 响应值为 1, 第 1 个零点为 γ, 即 θ_1 = γ, 其余的零点为 θ_{z+1} = z·(180-γ)/(N-1) + γ, z = 1,2,...,K, K ≤ N-2。此时转向阵 D(ω,θ)变为:
D(ω,θ) = [d^H(ω,1); d^H(ω,cosθ_1); ...; d^H(ω,cosθ_{K+1})]
响应向量 β 变为: β = [1 0 ... 0]^T。
当设定感兴趣区域的角度范围为 [γ, 180] 时, 也可设定端射方向为极点方向, 响应值为 1, 第 1 个零点为 γ, 即 θ_1 = γ, 其余的零点个数和零点位置根据预先设定的零点间距 σ 确定: θ_{z+1} = σ·z + γ, 且需满足 θ_{z+1} ≤ 180; 若该条件始终满足, 则 z 的最大取值截止到 N-2。
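按上文"第 1 个零点为 γ, 其余零点按 θ_{z+1} = σ·z + γ 依次放置"的规则, 零点列表的生成可写成如下示意函数 (其中 180 度上限与 N-2 个数上限的具体处理方式, 为按原文含义的一种假设性实现):

```python
def place_nulls(gamma, sigma, N):
    """第 1 个零点为 γ, 其余零点为 θ_{z+1} = σ·z + γ (z = 1, 2, ...);
    角度不超过 180 度, 除首零点外最多再放置 N-2 个零点."""
    nulls = [gamma]
    z = 1
    while z <= N - 2:
        theta = sigma * z + gamma
        if theta > 180:          # 超出 (0, 180] 范围则停止
            break
        nulls.append(theta)
        z += 1
    return nulls
```

例如 γ=60、σ=30、N=6 时得到零点 60/90/120/150/180, 共 N-1 个; 若 σ 较大使角度提前越过 180 度, 则零点个数相应减少。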
进一步的, 本发明实施例中为了有效去除扬声器播放声音引起的回声问题对整个装置性能的影响, 可以预先设定扬声器的角度为零点方向的角度, 并且本发明实施例中扬声器可采用装置内部的扬声器, 也可采用外设的扬声器。
如图 4A 所示为本发明实施例中采用装置内部扬声器时, 麦克风阵列端射方向与扬声器角度相关性示意图。假设麦克风阵列逆时针旋转角度记为 φ, 则旋转以后扬声器相对于麦克风阵列的角度就从原来的 0 度和 180 度变为 φ 度和 180-φ 度。这个时候 φ 度和 180-φ 度就为默认的零点, 响应值为 0, 则进行零点设置时, 可将 φ 度和 180-φ 度设置为零点, 即在进行零点个数设定时, 可设定的角度值就减少了 2 个, 此时转向阵 D(ω,θ)变为在极点方向对应行之外、 将 φ 度与 180-φ 度作为零点方向加入后的转向矩阵, 其行数 M≤N, M 为正整数。
如图 4B 所示为本发明实施例中采用装置外部扬声器时, 麦克风阵列端射方向与扬声器角度相关性示意图。假设左侧扬声器与麦克风阵列原始位置水平线之间夹角为 α_1, 右侧扬声器与麦克风阵列原始位置水平线之间夹角为 α_2, 麦克风阵列逆时针旋转角度为 φ, 则麦克风阵列旋转以后左侧扬声器相对于麦克风阵列的角度就从原来的 -α_1 度变为 φ+α_1 度, 右侧扬声器相对于麦克风阵列的角度就从原来的 180-α_2 度变为 180-φ-α_2 度, 则 φ+α_1 和 180-φ-α_2 就为默认的零点, 响应值为 0, 则进行零点设置时, 可将 φ+α_1 度和 180-φ-α_2 度设置为零点, 即在进行零点个数设定时, 可设定的角度值就减少了 2 个, 此时转向阵 D(ω,θ)变为在极点方向对应行之外、 将 φ+α_1 度与 180-φ-α_2 度作为零点方向加入后的转向矩阵, 其行数 M≤N, M 为正整数。
需要说明的是, 本发明实施例中上述确定权系数的过程适用于应用场景 所需输出信号类型为单声道信号的情况下, 形成单声道超指向差分波束形成 权系数。
在应用场景所需输出信号类型为双声道信号时, 确定当前应用场景对应的左声道的超指向差分波束形成权系数和当前应用场景对应的右声道的超指向差分波束形成权系数时, 可采用如下方式确定转向阵 D(ω,θ):
如图 5 所示, 为本发明实施例中用于形成双声道音频信号麦克风阵列角度示意图。对于当前应用场景对应的左声道的超指向差分波束形成权系数, 设计时取 0 度方向为极点方向, 响应值为 1, 180 度方向为零点方向, 响应值为 0。此时转向阵 D(ω,θ)变为:
D(ω,θ) = [d^H(ω,1); d^H(ω,-1)]
响应向量 β 变为: β = [1 0]^T。
对于当前应用场景对应的右声道的超指向差分波束形成权系数, 设计时取 180 度方向为极点方向, 响应值为 1, 0 度方向为零点方向, 响应值为 0。此时转向阵 D(ω,θ)变为:
D(ω,θ) = [d^H(ω,-1); d^H(ω,1)]
响应向量 β 变为: β = [1 0]^T。
进一步的, 由于左右声道的超指向差分波束的零点方向与极点方向是相对称的, 因此可以只计算左声道或右声道的权系数, 另一声道可复用同样的权系数, 只不过在使用的时候将输入各路麦克风信号的顺序变为逆序即可。
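上述"左右声道共用一组权系数、仅将麦克风输入顺序逆序"的做法, 依赖于直线阵麦克风位置关于阵列中心对称这一事实: 此时把转向向量的元素逆序, 等价于把入射角 θ 镜像为 180°-θ (即 cosθ → -cosθ)。下面用一个小片段验证这一关系 (函数名与阵列位置均为示意性假设):

```python
import cmath, math

def steering(omega, positions, cos_theta, c=340.0):
    # d(omega, cos(theta)): 直线阵各麦克风的相对相位向量
    return [cmath.exp(-1j * omega * p * cos_theta / c) for p in positions]

def mirrored_equals_reversed(omega, positions, cos_theta, c=340.0):
    # 位置关于中心对称时: 逆序的 d(omega, cos(theta)) 等于 d(omega, -cos(theta))
    a = steering(omega, positions, cos_theta, c)
    b = steering(omega, positions, -cos_theta, c)
    return all(abs(x - y) < 1e-12 for x, y in zip(a[::-1], b))
```

由于 0 度与 180 度互为镜像, 用左声道权系数处理逆序排列的麦克风信号, 所得波束形状即为右声道所需的镜像波束。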
需要说明的是, 本发明实施例中进行权系数确定时上述设定的波束形状, 可以是预先设定的波束形状, 也可以是调整的波束形状。
二、 进行超指向差分波束形成处理, 得到超指向差分波束形成信号 本发明实施例中根据获取的权系数以及音频输入信号, 形成当前应用场 景下的超指向差分波束形成信号。 其中, 不同的应用场景下音频输入信号不 同, 当应用场景需要对麦克风阵列采集的原始音频信号进行回声消除处理, 则音频输入信号为根据当前应用场景确定的麦克风阵列采集的原始音频信号 经过回声消除后的音频信号, 当应用场景不需要对麦克风阵列釆集的原始音 频信号进行回声消除处理, 则将麦克风阵列采集的原始音频信号作为音频输 入信号。
进一步的, 当确定了音频输入信号和权系数后, 则根据确定的权系数和 音频输入信号, 进行超指向差分波束形成处理, 得到处理后的超指向差分波 束形成输出信号。
具体的, 一般对音频输入信号进行快速离散傅里叶变换, 得到每一路音频输入信号对应的频域信号 X_i(k), i=1,2,...,N, k=1,2,...,FFT_LEN, 其中, FFT_LEN 为快速离散傅里叶变换的变换长度。根据离散傅里叶变换的性质, 变换后的信号具有复对称特性, X_i(FFT_LEN+2-k) = X_i*(k), k=2,...,FFT_LEN/2, 其中, * 表示共轭。因此离散傅里叶变换后得到信号的有效频点数为 FFT_LEN/2+1。一般情况下只存储有效频点对应的超指向差分波束形成的权系数。按照公式: Y(k) = h^H(k)·X(k), k=1,2,...,FFT_LEN/2+1, 和 Y(FFT_LEN+2-k) = Y*(k), k=2,...,FFT_LEN/2, 对频域上的音频输入信号进行超指向差分波束处理, 得到频域上的超指向差分波束形成信号。其中, Y(k) 为频域上的超指向差分波束形成信号, h(k) 为第 k 组权系数, X(k) = [X_1(k), X_2(k), ..., X_N(k)]^T, X_i(k) 为麦克风阵列采集的原始音频信号经过回声消除的第 i 路音频信号对应的频域信号, 或者麦克风阵列采集的第 i 路原始音频信号对应的频域信号。
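上述"逐有效频点加权、再利用复对称性恢复实信号"的处理流程, 可用如下 Python 片段示意 (为保持自包含, 此处用朴素 DFT/IDFT 代替快速傅里叶变换; 权系数的使用方式按 Y(k)=h^H(k)·X(k) 的一种常见理解实现, 函数名为示意性假设):

```python
import cmath

def dft(x):
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]

def idft(X):
    n = len(X)
    return [sum(X[k] * cmath.exp(2j * cmath.pi * k * t / n) for k in range(n)).real / n
            for t in range(n)]

def beamform_frame(channels, weights):
    # channels: N 路等长实数时域帧; weights[k]: 第 k 个有效频点 (k=0..L/2) 的 N 维权向量
    L = len(channels[0])
    spectra = [dft(ch) for ch in channels]
    Y = [0j] * L
    for k in range(L // 2 + 1):               # 只对有效频点加权: Y(k) = h^H(k)·X(k)
        Y[k] = sum(w.conjugate() * s[k] for w, s in zip(weights[k], spectra))
    for k in range(1, L // 2):                # 复对称: Y(FFT_LEN+2-k) = Y*(k)
        Y[L - k] = Y[k].conjugate()
    return idft(Y)                            # 反变换回时域, 得到实数输出帧
```

只存储并计算 L/2+1 个有效频点的权系数, 其余频点由复对称性补齐, 这也是上文只保存有效频点权系数的原因。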
进一步的, 本发明实施例中当应用场景需要的声道信号为单声道信号时, 则获取当前应用场景形成单声道信号的单声道超指向差分波束形成权系数, 根据获取的单声道超指向差分波束形成权系数, 对音频输入信号进行超指向 差分波束形成处理, 形成一路单声道超指向差分波束形成信号; 在应用场景 所需声道信号为双声道信号时, 则分别获取当前应用场景对应的左声道超指 向差分波束形成权系数以及当前应用场景对应的右声道超指向差分波束形成 权系数; 根据获取的当前应用场景对应的左声道超指向差分波束形成权系数 对音频输入信号进行超指向差分波束形成处理, 得到当前应用场景对应的左 声道超指向差分波束形成信号; 根据获取的当前应用场景对应的右声道超指 向差分波束形成权系数对音频输入信号进行超指向差分波束形成处理, 得到 当前应用场景对应的右声道超指向差分波束形成信号。
进一步的, 本发明实施例中为较好地采集原始音频信号, 在当前应用场景所需输出信号类型为单声道信号时, 调整麦克风阵列的端射方向, 使端射方向指向目标声源, 采集目标声源的原始音频信号, 并将采集的原始音频信号作为音频输入信号。
更进一步的, 本发明实施例中当应用场景所需声道信号为双声道信号时, 例如空间声场录制以及立体声录制时, 可将麦克风阵列拆分为两个子阵列, 分别为第一子阵列和第二子阵列, 第一子阵列的端射方向与第二子阵列的端射方向不同, 利用第一子阵列与所述第二子阵列分别采集原始音频信号, 根据两个子阵列采集的原始音频信号与左声道超指向差分波束形成权系数以及右声道超指向差分波束形成权系数, 或根据对两个子阵列采集的原始音频信号进行回声消除后的音频信号与左声道超指向差分波束形成权系数以及右声道超指向差分波束形成权系数, 形成当前应用场景下的超指向差分波束形成信号。麦克风阵列拆分为两个子阵列后的示意图, 如图 6 所示, 其中一个子阵列采集的音频信号用于左声道超指向差分波束形成信号的形成, 另一个子阵列采集的音频信号用于右声道超指向差分波束形成信号的形成。
三、 对形成的超指向差分波束进行处理
本发明实施例中在形成超指向差分波束后, 可根据实际的应用场景选择 是否对超指向差分波束进行噪声抑制和 /或回声抑制处理, 具体的噪声抑制处 理方式和回声抑制处理方式可釆用多种实现方式。
本发明实施例中为达到更高的方向性抑制效果, 本发明实施例中在形成 超指向差分波束的时候, 可以计算出不同于上述形成超指向差分波束权系数 的 Q个权系数, 以在麦克风阵列能够调整的端射方向中、 除声源方向以外的 其他任意方向, 同样利用超指向差分波束权系数得到 Q个波束形成信号作为 参考噪声信号, 其中 Q为不小于 1的整数, 进行噪声抑制, 以达到更好的方 向性噪声抑制效果。
本发明实施例提供的音频信号处理方法, 确定超指向差分波束的权系数时, 可灵活设置麦克风阵列的几何形状, 并且无需设置多组麦克风阵列, 由于对麦克风阵列的布放方式没有太大要求, 降低了麦克风布放的成本, 并且在调整了音频采集区域时, 根据调整的音频采集有效区域重新确定权系数, 根据调整权系数进行超指向差分波束形成处理, 能够提升体验。
本发明实施例以下结合具体的应用场景, 例如人机交互、 高清话音通信、 空间声场录制以及立体声通话等应用场景, 对应用上述音频信号处理方法进 行举例说明, 当然并不因以为限。
实施例三
本发明实施例中以需要单声道信号的人机交互和高清话音通信过程中的 音频信号处理方法进行举例说明。
如图 7 所示, 为本发明实施例提供的人机交互和高清话音通信过程中音 频信号处理方法流程图, 包括:
S701 : 调整麦克风阵列, 使麦克风阵列端射方向指向目标说话人即声源。 本发明实施例中进行麦克风阵列调整时可以是手动调整, 也可以是根据 预先设定的旋转角度自动调整, 还可以利用麦克风阵列进行说话人方位检测, 然后将麦克风阵列的端射方向转向目标说话人。 利用麦克风阵列进行说话人 方位检测的方法有很多种, 如基于 MUSIC算法的声源定位技术、 SRP-PHAT 转向响应能量相位变换技术或者 GCC-PHAT广义互相关相位变换等技术。
S702: 判断用户是否调整了音频采集有效区域, 当用户调整了音频采集 有效区域, 则转 S703重新确定超指向差分波束形成权系数, 否则不进行超指 向差分波束权系数的更新, 利用预先确定的超指向差分波束形成权系数进行 S704.„
S703: 根据用户设定的音频釆集有效区域与麦克风扬声器位置, 重新确 定超指向差分波束形成权系数。
本发明实施例中当用户重新设定了音频采集有效区域, 则可按照实施例 二中涉及的确定超指向差分波束的权系数计算方法重新确定超指向差分波束 形成权系数。
S704: 采集原始音频信号。
本发明实施例利用包括 N 个麦克风的麦克风阵列, 采集 N 路麦克风拾取到的原始音频信号, 并同步缓存扬声器播放的数据信号, 以扬声器播放的数据信号作为回声抑制和回声消除的参考信号, 并对信号进行分帧处理。设 N 路麦克风拾取到的原始音频信号为 x_i(n), i=1,2,...,N, 同步缓存扬声器播放的数据为 ref_j(n), j=1,2,...,Q, Q 为扬声器播放数据的声道数。
S705: 进行回声消除处理。
本发明实施例中对麦克风阵列中, 每一个麦克风拾取到的原始音频信号, 根据同步缓存的扬声器播放数据, 进行回声消除, 回声消除后的每一路音频信号记为 x'_i(n), i=1,2,...,N, 具体的回声消除算法这里不再赘述, 可采用多种实现方式。
需要说明的是, 本发明实施例中如果扬声器播放数据的声道数大于 1, 这 个时候需要采用多声道回声消除算法进行处理; 如果扬声器播放数据的声道 数等于 1, 这个时候可以使用单声道回声消除算法进行处理。
S706: 形成超指向差分波束。
本发明实施例中对每一路回声消除后的信号分别进行快速离散傅里叶变换, 得到每一路回声消除后的信号对应的频域信号 X'_i(k), i=1,2,...,N, k=1,2,...,FFT_LEN。FFT_LEN 为快速离散傅里叶变换的变换长度。根据离散傅里叶变换的性质, 变换后的信号具有复对称特性, X'_i(FFT_LEN+2-k) = X'_i*(k), k=2,...,FFT_LEN/2, 其中, * 表示共轭。因此离散傅里叶变换后得到信号的有效频点数为 FFT_LEN/2+1 点。一般情况下只存储有效频点对应的超指向差分波束形成的权系数。按照公式:
Y(k) = h^H(k)·X'(k), k=1,2,...,FFT_LEN/2+1,
Y(FFT_LEN+2-k) = Y*(k), k=2,...,FFT_LEN/2,
对回声消除后的音频输入信号的频域信号进行超指向差分波束处理, 得到频域上的超指向差分波束形成信号。其中, Y(k) 为频域上的超指向差分波束形成信号, h(k) 为第 k 组权系数, X'(k) = [X'_1(k), X'_2(k), ..., X'_N(k)]^T。最后将频域上的超指向差分波束形成信号利用快速离散傅里叶变换的反变换变换到时域, 得到超指向差分波束形成的输出信号 y(n)。
进一步的, 本发明实施例中还可以在除目标说话人方向以外的其他任意 方向, 利用同样的方式得到 Q个波束形成信号作为参考噪声信号, 但是用于 生成 Q个参考噪声信号所对应的 Q个超指向差分波束形成的权系数需要重新 计算, 计算方法与上面的方法类似。 例如, 可以将选定的除目标说话人方向 以外的方向作为波束的极点方向, 响应向量为 1, 与极点方向相反的方向为零 点方向, 响应向量为 0, 根据选取的 Q个方向就可以计算出 Q组超指向差分 波束形成的权系数。 S707: 进行噪声抑制处理。
对超指向差分波束形成的输出信号 y(n) 进行噪声抑制处理, 得到噪声抑制后的信号 y'(n)。
进一步的, 本发明实施例中若 S706中在形成超指向差分波束的同时, 形 成了 Q个参考噪声信号, 则可以利用 Q个参考噪声信号做进一步的噪声抑制 处理, 以达到更好的方向性噪声抑制的效果。
S708: 进行回声抑制处理。
根据同步缓存的扬声器播放数据和噪声抑制后的信号 y'(n) 进行回声抑制处理, 得到最终的输出信号 z(n)。
需要说明的是, 本发明实施例中 S708为可选的项, 可以进行回声抑制处 理, 也可以不进行回声抑制处理。 另外, 本发明实施例中 S707和 S706的执 行顺序不作要求, 可以先进行噪声抑制处理然后进行回声抑制处理, 也可以 是先进行回声抑制处理然后再进行噪声抑制处理。
进一步的, 本发明实施例中, S705 和 S706 的执行顺序也可互换, 此时, 进行超指向差分波束形成时, 音频输入信号由每一路回声消除后的信号 x'_i(n) 变为采集到的原始音频信号 x_i(n), i=1,2,...,N, 进行超指向差分波束形成处理后, 得到的不再是根据 N 路回声消除后的信号得到的超指向差分波束形成输出信号, 而是根据 N 路采集到的原始音频信号得到的超指向差分波束形成输出信号 y(n)。另外, 进行回声消除处理时, 输入信号由采集到的 N 路原始音频信号 x_i(n), i=1,2,...,N 变为超指向差分波束形成信号 y(n)。
上述音频信号的处理方式,在进行回声抑制处理过程中,可以将原来的 N 路处理降低为一路处理。
需要说明的是, 如果使用超指向差分波束形成的方法产生 Q个参考噪声信 号, 则需要将零点设置在左右扬声器的位置, 避免回声信号对于噪声抑制性 能的影响。
本发明实施例中经过上述处理后的音频输出信号, 如果应用在高清话音 通信中, 则将最终的输出信号进行编码, 并传输到通话另一方。 如果是应用 在人机交互, 则将最终的输出信号作为语音识别的前端釆集信号进行进一步 处理。
实施例四
本发明实施例中以需要双声道信号的空间声场录制中的音频信号处理方 法进行举例说明。
如图 8 所示, 为本发明实施例提供的空间声场录制过程中音频信号处理 方法流程图, 包括:
S801: 采集原始音频信号。
具体的, 本发明实施例中采集 N 路麦克风拾取到的原始信号, 并对信号进行分帧处理, 作为原始音频信号, 设 N 路原始音频信号为 x_i(n), i=1,2,...,N。
S802: 分别进行左声道超指向差分波束形成处理和右声道差分波束形成 处理。
本发明实施例中当前应用场景对应的左声道的超指向差分波束形成权系数和当前应用场景对应的右声道的超指向差分波束形成权系数是预先计算好并存储下来的, 利用存储的当前应用场景对应的左声道的超指向差分波束形成权系数和当前应用场景对应的右声道的超指向差分波束形成权系数, 以及 S801 中的原始音频采集信号, 分别进行当前应用场景对应的左声道超指向差分波束形成处理和当前应用场景对应的右声道差分波束形成处理, 则可得到当前应用场景对应的左声道超指向差分波束形成信号 y_L(n) 以及当前应用场景对应的右声道超指向差分波束形成信号 y_R(n)。
具体的, 本发明是实施例中左声道的超指向差分波束形成权系数和右声 道的超指向差分波束形成权系数可釆用实施例二中应用场景所需输出信号类 型为双声道信号时, 确定权系数的方法进行确定, 在此不再赘述。
进一步的, 本发明实施例中进行左声道超指向差分波束形成和右声道差分波束形成处理过程与上述实施例涉及的超指向波束形成处理过程相似, 音频输入信号为采集到的 N 路麦克风的原始音频信号 x_i(n), 权系数则分别为左声道或右声道对应的超指向差分波束形成权系数。
S803: 进行多通道联合噪声抑制。
本发明实施例中采用多通道联合噪声抑制, 以左声道超指向差分波束形成信号 y_L(n) 以及右声道超指向差分波束形成信号 y_R(n) 为输入信号进行多通道联合噪声抑制, 能够在噪声抑制的同时, 使非背景噪声信号的声像不发生漂移, 并且保证左右声道残留噪声不会影响处理后的立体声信号的听感。
需要说明的是, 本发明实施例中进行多通道联合噪声抑制是可选的, 可以不进行多通道联合噪声抑制, 直接将左声道超指向差分波束形成信号 y_L(n) 以及右声道超指向差分波束形成信号 y_R(n) 组成立体声信号, 作为最终的空间声场录制信号输出。
实施例五
本发明实施例中以立体声通话中的音频信号处理方法进行举例说明。 如图 9 所示, 为本发明实施例提供的立体声通话中音频信号处理方法流 程图, 包括:
S901: 采集 N 路麦克风拾取到的原始音频信号, 并同步缓存扬声器播放数据, 作为多通道联合回声抑制和多通道联合回声消除的参考信号, 并对原始音频信号和参考信号进行分帧处理。设 N 路麦克风拾取到的原始音频信号为 x_i(n), i=1,2,...,N, 同步缓存扬声器播放的数据为 ref_j(n), j=1,2,...,Q, Q 为扬声器播放数据的声道数, 本发明实施例中 Q=2。
S902: 进行多通道联合回声消除。
对每一路麦克风拾取到的原始音频信号, 根据同步缓存的扬声器播放数据 ref_j(n), j=1,2, 进行多通道联合回声消除, 每一路回声消除后的信号记为 x'_i(n), i=1,2,...,N。
S903: 分别进行左声道超指向差分波束形成和右声道差分波束形成处理。 具体的, 本发明实施例中进行左声道超指向差分波束形成和右声道差分波束形成处理的过程, 与实施例四中空间声场录制处理流程中的 S802 相似, 只不过输入信号变为每一路回声消除后的信号 x'_i(n), i=1,2,...,N。处理后得到左声道超指向差分波束形成信号 y_L(n) 以及右声道超指向差分波束形成信号 y_R(n)。
S904: 进行多通道联合噪声抑制处理。
具体的, 本发明实施例中进行多通道联合噪声抑制处理过程与实施例四 中 S803过程相同, 在此不再赘述。
S905: 进行多通道联合回声抑制处理。
具体的, 本发明实施例中根据同步緩存的扬声器播放数据和多通道联合 噪声抑制后的信号进行回声抑制处理, 得到最终的输出信号。
需要说明的是, 本发明实施例中进行多通道联合回声抑制处理是可选的, 可以进行此项处理, 也可以不进行此项处理。 另外, 本发明实施例中对于多 通道联合回声抑制处理过程与多通道联合噪声抑制处理过程的执行顺序并不 作要求, 可以先进行多通道联合噪声抑制处理再进行多通道联合回声抑制处 理, 也可以是先进行多通道联合回声抑制处理再进行多通道联合噪声抑制处 理。
实施例六
本发明实施例提供一种音频信号处理方法, 应用于空间声场录制以及立 体声通话中, 本发明实施例中可以根据用户的需要进行声场釆集方式的调整, 在进行音频信号采集之前, 将麦克风阵列拆分为两个子阵列, 分别调整子阵 列的端射方向, 以通过拆分的两个子阵列进行原始音频信号的采集。
具体的, 本发明实施例中, 将麦克风阵列拆分为两个子阵列, 分别调整子阵列的端射方向。调整方法可以是用户进行手动调整, 也可以是根据用户设定角度后进行自动调整, 还可以预先设定一个旋转角度, 当装置启动空间声场录制功能后将麦克风阵列拆分为 2 个子阵列, 并将子阵列的端射方向自动调整为预先设定的方向。一般的, 可将旋转角度设定为左侧逆时针旋转 45 度, 右侧顺时针旋转 45 度, 当然也可以根据用户设定任意调整。麦克风阵列拆分后形成两个子阵列, 一个子阵列采集到的信号用于左声道超指向差分波束形成, 采集到的原始信号记为 x_i(n), i=1,2,...,N_1。另一个子阵列采集到的信号用于右声道超指向差分波束形成, 采集到的原始信号记为 x_j(n), j=1,2,...,N_2, 其中 N_1+N_2=N。
本发明实施例中将麦克风拆分为两个子阵列的音频信号处理方法, 如图 10A和图 10B所示, 图 10A为空间声场录制过程中音频信号的处理方法, 图 10B为立体声通话过程中音频信号处理方法流程图。
实施例七
本发明实施例七提供一种音频信号处理装置, 如图 11A所示, 该装置包 括权系数存储模块 1101、信号获取模块 1102、波束形成处理模块 1103和信号 输出模块 1104, 其中:
权系数存储模块 1101, 用于存储超指向差分波束形成权系数;
信号获取模块 1102 ,用于获取音频输入信号,并向波束形成处理模块 1103 传输获取到的音频输入信号; 还用于确定当前应用场景以及当前应用场景所 需输出信号类型, 并向波束形成处理模块 1103传输当前应用场景以及当前应 用场景所需输出信号类型。
波束形成处理模块 1103 , 用于根据当前应用场景所需输出信号类型从权 系数存储模块 1101中选取与当前应用场景对应的权系数, 利用选取的权系数 对信号获取模块 1102输出的音频输入信号进行超指向差分波束形成处理, 得 到超指向差分波束形成信号, 并向信号输出模块 1104传输超指向差分波束形 成信号 ;
信号输出模块 1104,用于输出波束形成处理模块 1103传输的超指向差分 波束形成信号。
其中, 波束形成处理模块 1103 , 具体用于:
在当前应用场景所需输出信号类型为双声道信号时, 从权系数存储模块 1101 获取左声道超指向差分波束形成权系数以及右声道超指向差分波束形成 权系数, 并根据获取的左声道超指向差分波束形成权系数对音频输入信号进 行超指向差分波束形成处理, 得到左声道超指向差分波束形成信号, 以及根 据右声道超指向差分波束形成权系数对音频输入信号进行超指向差分波束形 成处理, 得到右声道超指向差分波束形成信号, 向信号输出模块 1104传输左 声道超指向差分波束形成信号和右声道超指向差分波束形成信号。
信号输出模块 1104, 具体用于:
输出左声道超指向差分波束形成信号和右声道超指向差分波束形成信 号。
其中, 波束形成处理模块 1103 , 具体用于:
在当前应用场景所需输出信号类型为单声道信号时, 从权系数存储模块 1101 获取形成单声道信号的当前应用场景对应的单声道超指向差分波束形成 权系数, 当获取到单声道超指向差分波束形成权系数时, 根据单声道超指向 差分波束形成权系数对音频输入信号进行超指向差分波束形成处理, 形成一 路单声道超指向差分波束形成信号; 向信号输出模块 1104传输得到的一路单 声道超指向差分波束形成信号。
信号输出模块 1104, 具体用于:
输出一路单声道超指向差分波束形成信号。
进一步的, 该装置还包括麦克风阵列调整模块 1105 , 如图 11B所示, 其 中:
麦克风阵列调整模块 1105 , 用于调整麦克风阵列为第一子阵列与第二子 阵列, 第一子阵列的端射方向与第二子阵列的端射方向不同; 第一子阵列与 第二子阵列分别采集原始音频信号, 并将原始音频信号作为音频输入信号向 信号获取模块 1102传输。
在当前应用场景所需输出信号类型为双声道信号时, 调整麦克风阵列为 两个子阵列, 并使调整得到的两个子阵列的端射方向指向不同的方向, 以分 别采集用于进行左声道超指向差分波束形成处理与右声道超指向差分波束形 成处理所需的原始音频采集信号。
更进一步的, 该装置包括的麦克风阵列调整模块 1105, 用于调整麦克风 阵列的端射方向, 使端射方向指向目标声源, 麦克风阵列采集目标声源发出 的原始音频信号, 并将原始音频信号作为音频输入信号向信号获取模块 1102 传输。
进一步的, 该装置还包括权系数更新模块 1106, 如图 11C所示, 其中: 权系数更新模块 1106, 用于判断音频采集区域是否被调整; 若音频采集 区域被调整, 则确定麦克风阵列的几何形状、 扬声器位置以及调整后的音频 釆集有效区域; 根据音频釆集有效区域调整波束形状, 或者根据音频釆集有 效区域和所述扬声器位置调整波束形状, 得到调整的波束形状; 根据麦克风 阵列的几何形状、 调整的波束形状, 确定超指向差分波束形成权系数, 得到 调整权系数, 并将调整权系数向权系数存储模块 1101传输;
权系数存储模块 1101 , 具体用于: 存储调整权系数。
其中, 权系数更新模块 1106, 具体用于:
根据麦克风阵列的几何形状和设定的音频采集有效区域, 确定 D(ω,θ)和 β; 或根据麦克风阵列的几何形状、 设定的音频采集有效区域和扬声器位置, 确定 D(ω,θ)和 β; 根据确定的 D(ω,θ)和 β, 按照公式: h(ω)=D^H(ω,θ)[D(ω,θ)D^H(ω,θ)]^(-1)β, 确定超指向差分波束形成的权系数;
其中, h(ω)为权系数, D(ω,θ)为任意几何形状的麦克风阵列所对应的转向矩阵, 由不同入射角度下声源到达麦克风阵列中各麦克风间的相对时延决定, D^H(ω,θ)表示 D(ω,θ)的共轭转置矩阵, ω为音频信号的频率, θ为声源入射角度, β为入射角度为 θ时的响应向量。
其中, 权系数更新模块 1106, 具体用于:
在根据麦克风阵列的几何形状和设定的音频采集有效区域, 确定 D(ω,θ)和 β时, 或根据麦克风阵列的几何形状、 设定的音频采集有效区域和扬声器位置, 确定 D(ω,θ)和 β时, 根据不同应用场景所需输出信号类型, 将设定的音频有效区域转换为极点方向以及零点方向, 并根据得到的极点方向以及零点方向, 确定不同应用场景下的 D(ω,θ)和 β; 或者根据不同应用场景所需输出信号类型, 将设定的音频有效区域转换为极点方向以及零点方向, 将扬声器位置转换为零点方向, 并根据得到的极点方向以及零点方向, 确定不同应用场景下的 D(ω,θ)和 β;
其中, 极点方向为使超指向差分波束在该方向上响应值为 1 的入射角度, 零点方向为使超指向差分波束在该方向上响应值为 0 的入射角度。
其中, 权系数更新模块 1106, 具体用于:
根据得到的极点方向以及零点方向, 确定不同应用场景下的 D(ω,θ)和 β时, 当应用场景所需输出信号类型为单声道信号时, 设定麦克风阵列的端射方向为极点方向, 并设定 M 个零点方向, 其中 M≤N-1, N 为麦克风阵列中的麦克风数量;
当应用场景所需输出信号类型为双声道信号时, 设定麦克风阵列的 0度 方向为极点方向, 并将麦克风阵列的 180度方向设定为零点方向, 以确定其 中一个声道对应的超指向差分波束形成权系数, 并设定麦克风阵列的 180度 方向为极点方向, 并将麦克风阵列的 0度方向设定为零点方向, 以确定另一 个声道对应的超指向差分波束形成权系数。
进一步的, 该装置还包括回声消除模块 1107 , 如图 11D所示, 其中: 回声消除模块 1107, 用于緩存扬声器播放信号, 对麦克风阵列釆集的原 始音频信号进行回声消除, 得到回声消除音频信号, 并将回声消除音频信号 作为音频输入信号向信号获取模块 1102传输; 或者对波束形成处理模块 1103 输出的超指向差分波束形成信号进行回声消除, 得到回声消除超指向差分波 束形成信号,并向信号输出模块 1104传输回声消除超指向差分波束形成信号。
信号输出模块 1104 , 具体用于: 输出回声消除超指向差分波束形成信号。
其中, 信号获取模块 1102获取的当前应用场景所需音频输入信号为: 麦克风阵列采集的原始音频信号经过回声消除模块 1107进行回声消除后 的音频信号, 或者麦克风阵列采集的原始音频信号;
进一步的, 该装置还包括: 回声抑制模块 1108和噪声抑制模块 1109, 如 图 11E所示, 其中:
回声抑制模块 1108 ,用于对波束形成处理模块 1103输出的超指向差分波 束形成信号进行回声抑制处理;
噪声抑制模块 1109 ,用于对回声抑制模块 1108输出的回声抑制处理后的 超指向差分波束形成信号进行噪声抑制处理。 或者
噪声抑制模块 1109 ,用于对波束形成处理模块 1103输出的超指向差分波 束形成信号进行噪声抑制处理;
回声抑制模块 1108 ,用于对噪声抑制模块 1109输出的噪声抑制处理后的 超指向差分波束形成信号进行回声抑制处理。
进一步的, 回声抑制模块 1108, 用于对波束形成处理模块 1103输出的超 指向差分波束形成信号进行回声抑制处理;
噪声抑制模块 1109 ,用于对波束形成处理模块 1103输出的超指向差分波 束形成信号进行噪声抑制处理。
信号输出模块 1104, 具体用于:
输出回声抑制超指向差分波束形成信号或者噪声抑制超指向差分波束形 成信号。
具体的, 波束形成处理模块 1103, 还用于:
在信号输出模块 1104 包括噪声抑制模块 1109时, 在麦克风阵列能够调 整的端射方向中、 除声源方向以外的其它方向上, 形成至少一个波束形成信 号作为参考噪声信号, 并将形成的参考噪声信号向噪声抑制模块 1109传输。
进一步的, 波束形成处理模块 1103进行超指向差分波束形成处理时, 所 用的超指向差分波束为: 根据克风阵列的几何形状、 设定的波束形状, 构建 的差分波束。
本发明实施例提供的音频信号处理装置, 波束形成处理模块根据当前应 用场景所需的输出信号类型, 在权系数存储模块中选择对应的权系数, 并利 用选择的权系数对信号获取模块输出的音频输入信号进行超指向差分波束处 理, 形成当前应用场景下的超指向差分波束, 对超指向差分波束进行相应的 处理即可得到最终所需的音频信号, 能够满足不同应用场景需要不同音频信 号处理方式的需求。
需要说明的是, 本发明实施例中上述音频信号处理装置, 可以是独立的 部件, 也可以是集成于其他部件中。
进一步需要说明的是, 本发明实施例中上述音频信号处理装置中各个模 块 /单元的功能实现以及交互方式可以进一步参照相关方法实施例的描述。
实施例八
本发明实施例提供一种差分波束形成方法, 如图 12所示, 包括:
S1201 : 根据麦克风阵列的几何形状和设定的音频采集有效区域, 确定差 分波束形成权系数并存储; 或者根据麦克风阵列的几何形状、 设定的音频采 集有效区域和扬声器位置, 确定差分波束形成权系数并存储;
S1202: 才艮据当前应用场景所需输出信号类型获取当前应用场景对应的差 分波束形成权系数, 利用获取的权系数对音频输入信号进行差分波束形成处 理, 得到超指向差分波束。
其中, 确定差分波束形成权系数的过程, 具体包括:
根据麦克风阵列的几何形状和设定的音频采集有效区域, 确定 D(ω,θ)和 β; 或根据麦克风阵列的几何形状、 设定的音频采集有效区域和扬声器位置, 确定 D(ω,θ)和 β; 根据确定的 D(ω,θ)和 β, 按照公式:
h(ω)=D^H(ω,θ)[D(ω,θ)D^H(ω,θ)]^(-1)β
确定超指向差分波束形成的权系数;
其中, h(ω)为权系数, D(ω,θ)为任意几何形状的麦克风阵列所对应的转向矩阵, 由不同入射角度下声源到达麦克风阵列中各麦克风间的相对时延决定的, D^H(ω,θ)表示 D(ω,θ)的共轭转置矩阵, ω为音频信号的频率, θ为声源入射角度, β为入射角度为 θ时的响应向量。
其中, 根据麦克风阵列的几何形状和设定的音频采集有效区域, 确定 D(ω,θ)和 β, 或根据麦克风阵列的几何形状、 设定的音频采集有效区域和扬声器位置, 确定 D(ω,θ)和 β时, 具体包括:
根据不同应用场景所需输出信号类型, 将设定的音频有效区域转换为极点方向以及零点方向, 并根据得到的极点方向以及零点方向, 确定不同应用场景下的 D(ω,θ)和 β; 或者根据不同应用场景所需输出信号类型, 将设定的音频有效区域转换为极点方向以及零点方向, 将扬声器位置转换为零点方向, 并根据得到的极点方向以及零点方向, 确定不同应用场景下的 D(ω,θ)和 β; 其中, 极点方向为使超指向差分波束在该方向上响应值为 1 的入射角度, 零点方向为使超指向差分波束在该方向上响应值为 0 的入射角度。
具体的, 根据得到的极点方向以及零点方向, 确定不同应用场景下的
D(ω,θ)和 β, 具体包括:
当应用场景所需输出信号类型为单声道信号时, 设定麦克风阵列的端射 方向为极点方向, 并设定 Μ个零点方向, 其中 M≤N- 1 , N为麦克风阵列中 的麦克风数量;
当应用场景所需输出信号类型为双声道信号时, 设定麦克风阵列的 0度 方向为极点方向, 并将麦克风阵列的 180度方向设定为零点方向, 以确定其 中一个声道对应的超指向差分波束形成权系数, 并设定麦克风阵列的 180度 方向为极点方向, 并将麦克风阵列的 0度方向设定为零点方向, 以确定另一 个声道对应的超指向差分波束形成权系数。
本发明实施例中提供的差分波束形成方法, 能够根据不同场景所需的音 频信号输出类型, 确定不同的权系数, 进行差分波束处理后形成的差分波束 具有较高的适应性, 可满足不同场景对于所产生的波束形状的要求。
需要说明的是, 本发明实施例中差分波束形成的过程, 可进一步参照相 关方法实施例中对于差分波束形成过程的描述, 在此不再赘述。
实施例九
本发明实施例提供一种差分波束形成装置, 如图 13所示, 包括: 权系数 确定单元 1301和波束形成处理单元 1302;
权系数确定单元 1301 , 用于根据全指向麦克风阵列的几何形状、 设定的 音频采集有效区域, 确定差分波束形成权系数, 并将形成的差分波束形成权 系数向波束形成处理单元 1302传输; 或者用于根据全指向麦克风阵列的几何 形状、 设定的音频采集有效区域和扬声器位置, 确定差分波束形成权系数, 并将形成的差分波束形成权系数向波束形成处理单元 1302传输。
波束形成处理单元 1302, 根据当前应用场景所需输出信号类型在权系数 确定单元 1301中选择对应的权系数, 利用选择的权系数对音频输入信号进行 差分波束形成处理。
其中, 权系数确定单元 1301 , 具体用于:
根据麦克风阵列的几何形状和设定的音频采集有效区域, 确定 D(ω,θ)和 β; 或根据麦克风阵列的几何形状、 设定的音频采集有效区域和扬声器位置, 确定 D(ω,θ)和 β; 根据确定的 D(ω,θ)和 β, 按照公式:
h(ω)=D^H(ω,θ)[D(ω,θ)D^H(ω,θ)]^(-1)β
确定超指向差分波束形成的权系数;
其中, h(ω)为权系数, D(ω,θ)为任意几何形状的麦克风阵列所对应的转向矩阵, 由不同入射角度下声源到达麦克风阵列中各麦克风间的相对时延决定的, D^H(ω,θ)表示 D(ω,θ)的共轭转置矩阵, ω为音频信号的频率, θ为声源入射角度, β为入射角度为 θ时的响应向量。
其中, 权系数确定单元 1301 , 具体用于:
根据不同应用场景所需输出信号类型, 将设定的音频有效区域转换为极点方向以及零点方向, 并根据得到的极点方向以及零点方向, 确定不同应用场景下的 D(ω,θ)和 β;
其中, 极点方向为使待形成超指向差分波束响应值为 1 的入射角度, 零 点方向为使待形成超指向差分波束响应值为 0的入射角度。
进一步的, 权系数确定单元 1301 , 具体用于:
当应用场景所需输出信号类型为单声道信号时, 设定麦克风阵列的端射 方向为极点方向, 并设定 Μ个零点方向, 其中 M≤N- 1, N为麦克风阵列中 的麦克风数量;
当应用场景所需输出信号类型为双声道信号时, 设定麦克风阵列的 0度 方向为极点方向, 并将麦克风阵列的 180度方向设定为零点方向, 以确定其 中一个声道对应的超指向差分波束形成权系数, 并设定麦克风阵列的 180度 方向为极点方向, 并将麦克风阵列的 0度方向设定为零点方向, 以确定另一 个声道对应的超指向差分波束形成权系数。
本发明实施例中提供的差分波束形成装置, 能够根据不同场景所需的音 频信号输出类型, 确定不同的权系数, 进行差分波束处理后形成的差分波束 具有较高的适应性, 可满足不同场景对于所产生的波束形状的要求。
需要说明的是, 本发明实施例中差分波束形成装置涉及的差分波束形成 过程, 可进一步参照相关方法实施例中对于差分波束形成过程的描述, 在此 不再赘述。
实施例十
基于本发明实施例提供的音频信号处理方法及装置、 差分波束形成方法 及装置, 本发明实施例提供了一种控制器, 如图 14所示, 该控制器包括处理 器 1401和 I/O接口 1402 , 其中:
处理器 1401 , 用于确定不同输出信号类型在不同应用场景对应的各超指 向差分波束形成权系数并进行存储, 当获取到音频输入信号, 并确定了当前 应用场景以及当前应用场景所需输出信号类型时, 根据当前应用场景所需输 出信号类型获取与当前应用场景对应的权系数, 利用获取的权系数对获取到 的音频输入信号进行超指向差分波束形成处理, 得到超指向差分波束形成信 号, 并将该超指向差分波束形成信号传输至 I/O接口 1402。
I/O接口 1402, 用于将处理器 1401处理后得到的超指向差分波束形成信 号进行输出。
本发明实施例提供的控制器, 根据当前应用场景所需的输出信号类型, 获取对应的权系数, 并利用获取的权系数对音频输入信号进行超指向差分波 束处理, 形成当前应用场景下的超指向差分波束, 对超指向差分波束进行相 应的处理即可得到最终所需的音频信号, 能够满足不同应用场景需要不同音 频信号处理方式的需求。
需要说明的是, 本发明实施例中上述控制器, 可以是独立的部件, 也可 以是集成于其他部件中。
进一步需要说明的是, 本发明实施例中上述控制器各个模块 /单元的功能 实现以及交互方式可以进一步参照相关方法实施例的描述。
本领域内的技术人员应明白, 本发明的实施例可提供为方法、 系统、 或 计算机程序产品。 因此, 本发明可釆用完全硬件实施例、 完全软件实施例、 或结合软件和硬件方面的实施例的形式。 而且, 本发明可采用在一个或多个 其中包含有计算机可用程序代码的计算机可用存储介质 (包括但不限于磁盘 存储器、 CD-ROM、 光学存储器等) 上实施的计算机程序产品的形式。
本发明是参照根据本发明实施例的方法、 设备(系统)、 和计算机程序产 品的流程图和 /或方框图来描述的。 应理解可由计算机程序指令实现流程图 和 /或方框图中的每一流程和 /或方框、 以及流程图和 /或方框图中的流程 和 /或方框的结合。 可提供这些计算机程序指令到通用计算机、 专用计算机、 嵌入式处理机或其他可编程数据处理设备的处理器以产生一个机器, 使得通 过计算机或其他可编程数据处理设备的处理器执行的指令产生用于实现在流 程图一个流程或多个流程和 /或方框图一个方框或多个方框中指定的功能的 装置。 这些计算机程序指令也可存储在能引导计算机或其他可编程数据处理设 备以特定方式工作的计算机可读存储器中, 使得存储在该计算机可读存储器 中的指令产生包括指令装置的制造品, 该指令装置实现在流程图一个流程或 多个流程和 /或方框图一个方框或多个方框中指定的功能。
这些计算机程序指令也可装载到计算机或其他可编程数据处理设备上, 使得在计算机或其他可编程设备上执行一系列操作步骤以产生计算机实现的 处理, 从而在计算机或其他可编程设备上执行的指令提供用于实现在流程图 一个流程或多个流程和 /或方框图一个方框或多个方框中指定的功能的步 骤。
尽管已描述了本发明的优选实施例, 但本领域内的技术人员一旦得知了 基本创造性概念, 则可对这些实施例作出另外的变更和修改。 所以, 所附权 利要求意欲解释为包括优选实施例以及落入本发明范围的所有变更和修改。
显然, 本领域的技术人员可以对本发明实施例进行各种改动和变型而不 脱离本发明实施例的精神和范围。 这样, 倘若本发明实施例的这些修改和变 型属于本发明权利要求及其等同技术的范围之内, 则本发明也意图包含这些 改动和变型在内。

Claims

权 利 要 求
1、 一种音频信号处理装置, 其特征在于, 包括权系数存储模块、 信号获 取模块、 波束形成处理模块和信号输出模块, 其中:
所述权系数存储模块, 用于存储超指向差分波束形成权系数;
所述信号获取模块, 用于获取音频输入信号, 并向所述波束形成处理模 块输出所述音频输入信号, 还用于确定当前应用场景以及当前应用场景所需 输出信号类型, 并向所述波束形成处理模块传输所述当前应用场景以及当前 应用场景所需输出信号类型;
所述波束形成处理模块, 用于根据当前应用场景所需输出信号类型从所 述权系数存储模块获取与当前应用场景对应的权系数, 利用获取的所述权系 数对所述音频输入信号进行超指向差分波束形成处理, 得到超指向差分波束 形成信号, 并向所述信号输出模块传输所述超指向差分波束形成信号;
所述信号输出模块, 用于输出所述超指向差分波束形成信号。
2、 如权利要求 1所述的装置, 其特征在于,
所述波束形成处理模块, 具体用于:
当所述当前应用场景所需输出信号类型为双声道信号时, 从所述权系数 存储模块获取左声道超指向差分波束形成权系数以及右声道超指向差分波束 形成权系数;
根据所述左声道超指向差分波束形成权系数对所述音频输入信号进行超 指向差分波束形成处理, 得到左声道超指向差分波束形成信号; 以及
根据所述右声道超指向差分波束形成权系数对所述音频输入信号进行超 指向差分波束形成处理, 得到右声道超指向差分波束形成信号;
向所述信号输出模块传输所述左声道超指向差分波束形成信号和所述右 声道超指向差分波束形成信号;
所述信号输出模块, 具体用于:
输出所述左声道超指向差分波束形成信号和所述右声道超指向差分波束 形成信号。
3、 如权利要求 1所述的装置, 其特征在于,
所述波束形成处理模块, 具体用于:
当所述当前应用场景所需输出信号类型为单声道信号时, 从所述权系数 存储模块获取当前应用场景对应的单声道超指向差分波束形成权系数;
根据所述单声道超指向差分波束形成权系数对所述音频输入信号进行超 指向差分波束形成处理, 形成一路单声道超指向差分波束形成信号;
向所述信号输出模块传输所述一路单声道超指向差分波束形成信号; 所述信号输出模块, 具体用于:
输出所述一路单声道超指向差分波束形成信号。
4、 如权利要求 1所述的装置, 其特征在于, 所述音频信号处理装置还包 括麦克风阵列调整模块, 其中:
所述麦克风阵列调整模块, 用于调整麦克风阵列为第一子阵列与第二子 阵列, 所述第一子阵列的端射方向与所述第二子阵列的端射方向不同;
所述第一子阵列与所述第二子阵列分别采集原始音频信号, 并将所述原 始音频信号作为音频输入信号向所述信号获取模块传输。
5、 如权利要求 1所述的装置, 其特征在于, 所述音频信号处理装置还包 括麦克风阵列调整模块, 其中:
所述麦克风阵列调整模块, 用于调整麦克风阵列的端射方向, 使所述端 射方向指向目标声源;
所述麦克风阵列采集所述目标声源发出的原始音频信号, 并将所述原始 音频信号作为音频输入信号向所述信号获取模块传输。
6、 如权利要求 1-3任一项所述的装置, 其特征在于, 所述音频信号处理 装置还包括权系数更新模块, 其中,
所述权系数更新模块, 具体用于:
判断音频釆集区域是否被调整;
若所述音频采集区域被调整, 则确定麦克风阵列的几何形状、 扬声器位 置以及调整后的音频采集有效区域;
根据所述音频釆集有效区域调整波束形状, 或者根据所述音频采集有效 区域和所述扬声器位置调整波束形状, 得到调整的波束形状;
根据所述麦克风阵列的几何形状、 所述调整的波束形状, 确定超指向差 分波束形成权系数, 得到调整权系数, 并将所述调整权系数向所述权系数存 储模块传输;
所述权系数存储模块, 具体用于: 存储所述调整权系数。
7、 如权利要求 1所述的装置, 其特征在于, 所述音频信号处理装置还包 括回声消除模块, 其中,
所述回声消除模块, 具体用于:
緩存扬声器播放信号, 对麦克风阵列采集的原始音频信号进行回声消除, 得到回声消除音频信号, 并将所述回声消除音频信号作为音频输入信号向所 述信号获取模块传输; 或者
对波束形成处理模块输出的超指向差分波束形成信号进行回声消除, 得 到回声消除超指向差分波束形成信号, 并向所述信号输出模块传输所述回声 消除超指向差分波束形成信号;
所述信号输出模块, 具体用于:
输出所述回声消除超指向差分波束形成信号。
8、 如权利要求 1所述的装置, 其特征在于, 所述音频信号处理装置还包 括回声抑制模块和噪声抑制模块, 其中,
所述回声抑制模块, 用于对所述波束形成处理模块输出的超指向差分波 束形成信号进行回声抑制处理, 或者对所述噪声抑制模块输出的噪声抑制超 指向差分波束形成信号进行回声抑制处理, 得到回声抑制超指向差分波束形 成信号, 并向所述信号输出模块传输所述回声抑制超指向差分波束形成信号; 所述噪声抑制模块, 用于对波束形成处理模块输出的超指向差分波束形 成信号进行噪声抑制处理, 或者对所述回声抑制模块输出的所述回声抑制超 指向差分波束形成信号进行噪声抑制处理, 得到噪声抑制超指向差分波束形 成信号, 并向所述信号输出模块传输所述噪声抑制超指向差分波束形成信号; 所述信号输出模块, 具体用于:
输出所述回声抑制超指向差分波束形成信号或者所述噪声抑制超指向差 分波束形成信号。
9、 如权利要求 8所述的装置, 其特征在于, 所述波束形成处理模块, 还 用于:
在麦克风阵列能够调整的端射方向中、 除声源方向以外的其它方向上, 形成至少一个波束形成信号作为参考噪声信号, 并向所述噪声抑制模块传输 所述参考噪声信号。
10、 一种音频信号处理方法, 其特征在于, 包括:
确定超指向差分波束形成权系数;
获取音频输入信号, 并确定当前应用场景以及当前应用场景所需输出信 号类型;
根据当前应用场景所需输出信号类型获取当前应用场景对应的权系数, 利用获取的所述权系数对所述音频输入信号进行超指向差分波束形成处理, 得到超指向差分波束形成信号, 并输出所述超指向差分波束形成信号。
11、 如权利要求 10所述的音频信号处理方法, 其特征在于, 所述根据当 前应用场景所需输出信号类型获取当前应用场景对应的权系数, 利用获取的 所述权系数对所述音频输入信号进行超指向差分波束形成处理, 得到超指向 差分波束形成信号, 并输出所述超指向差分波束形成信号, 具体包括:
在当前应用场景所需输出信号类型为双声道信号时, 获取左声道超指向 差分波束形成权系数以及右声道超指向差分波束形成权系数;
根据所述左声道超指向差分波束形成权系数对所述音频输入信号进行超 指向差分波束形成处理, 得到左声道超指向差分波束形成信号;
根据所述右声道超指向差分波束形成权系数对所述音频输入信号进行超 指向差分波束形成处理, 得到右声道超指向差分波束形成信号;
输出所述左声道超指向差分波束形成信号和所述右声道超指向差分波束 形成信号。
12、 如权利要求 10所述的音频信号处理方法, 其特征在于, 所述根据当 前应用场景所需输出信号类型获取当前应用场景对应的权系数, 利用获取的 所述权系数对所述音频输入信号进行超指向差分波束形成处理, 得到超指向 差分波束形成信号, 并输出所述超指向差分波束形成信号, 具体包括:
在当前应用场景所需输出信号类型为单声道信号时, 获取当前应用场景 形成单声道信号的单声道超指向差分波束形成权系数;
根据获取的单声道超指向差分波束形成权系数, 对所述音频输入信号进 行超指向差分波束形成处理, 形成一路单声道超指向差分波束形成信号, 并 输出所述一路单声道超指向差分波束形成信号。
13、 如权利要求 10所述的音频信号处理方法, 其特征在于, 获取音频输 入信号之前, 该方法还包括:
调整麦克风阵列为第一子阵列与第二子阵列, 所述第一子阵列的端射方 向与所述第二子阵列的端射方向不同;
利用所述第一子阵列与所述第二子阵列分别采集原始音频信号, 将所述 原始音频信号作为音频输入信号。
14、 如权利要求 10所述的音频信号处理方法, 其特征在于, 获取音频输 入信号之前, 该方法还包括:
调整麦克风阵列的端射方向, 使所述端射方向指向目标声源;
采集目标声源的原始音频信号, 并将所述原始音频信号作为音频输入信 号。
15、 如权利要求 10-12任一项所述的音频信号处理方法, 其特征在于, 根 据当前应用场景所需输出信号类型获取当前应用场景对应的权系数之前, 该 方法还包括:
判断音频采集区域是否被调整;
若所述音频采集区域被调整, 则确定麦克风阵列的几何形状、 扬声器位 置以及调整后的音频采集有效区域; 根据所述音频采集有效区域调整波束形状, 或者根据所述音频采集有效 区域和所述扬声器位置调整波束形状, 得到调整的波束形状;
根据所述麦克风阵列的几何形状、 所述调整的波束形状, 确定超指向差 分波束形成权系数, 得到调整权系数;
利用所述调整权系数对所述音频输入信号进行超指向差分波束形成处 理。
16、 如权利要求 10所述的音频信号处理方法, 其特征在于, 该方法还包 括:
对麦克风阵列釆集的原始音频信号进行回声消除; 或者
对所述超指向差分波束形成信号进行回声消除。
17、 如权利要求 10所述的音频信号处理方法, 其特征在于, 形成超指向 差分波束形成信号之后, 该方法还包括:
对所述超指向差分波束形成信号进行回声抑制处理,和 /或噪声抑制处理。
18、 如权利要求 10所述的音频信号处理方法, 其特征在于, 该方法还包 括:
在麦克风阵列能够调整的端射方向中、 除声源方向以外的其它方向上, 形成至少一个波束形成信号作为参考噪声信号;
利用所述参考噪声信号对所述超指向差分波束形成信号进行噪声抑制处 理。
19、 一种差分波束形成方法, 其特征在于, 包括:
根据麦克风阵列的几何形状和设定的音频采集有效区域, 确定差分波束 形成权系数并存储; 或者根据麦克风阵列的几何形状、 设定的音频釆集有效 区域和扬声器位置, 确定差分波束形成权系数并存储;
根据当前应用场景所需输出信号类型获取当前应用场景对应的权系数, 利用获取的所述权系数对音频输入信号进行差分波束形成处理, 得到超指向 差分波束。
20、 如权利要求 19所述的方法, 其特征在于, 所述确定差分波束形成权 系数的过程, 具体包括:
根据麦克风阵列的几何形状和设定的音频采集有效区域, 确定 D(ω,θ)和 β; 或根据麦克风阵列的几何形状、 设定的音频采集有效区域和扬声器位置, 确定 D(ω,θ)和 β; 根据确定的 D(ω,θ)和 β, 按照公式:
h(ω)=D^H(ω,θ)[D(ω,θ)D^H(ω,θ)]^(-1)β
确定超指向差分波束形成的权系数;
其中, h(ω)为权系数, D(ω,θ)为任意几何形状的麦克风阵列所对应的转向矩阵, 由不同入射角度下声源到达麦克风阵列中各麦克风间的相对时延决定的, D^H(ω,θ)表示 D(ω,θ)的共轭转置矩阵, ω为音频信号的频率, θ为声源入射角度, β为入射角度为 θ时的响应向量。
21、 如权利要求 20所述的方法, 其特征在于, 所述根据麦克风阵列的几何形状和设定的音频采集有效区域, 确定 D(ω,θ)和 β, 具体包括:
根据不同应用场景所需输出信号类型, 将设定的音频有效区域转换为极 点方向以及零点方向;
根据转换的所述极点方向以及所述零点方向, 确定不同应用场景下的 D(ω,θ)和 β; 其中, 所述极点方向为使超指向差分波束在该方向上响应值为 1 的入射角度, 所述零点方向为使超指向差分波束在该方向上响应值为 0 的入射角度。
22、 如权利要求 20所述的方法, 其特征在于, 所述根据麦克风阵列的几何形状、 设定的音频采集有效区域和扬声器位置, 确定 D(ω,θ)和 β, 具体包括: 根据不同应用场景所需输出信号类型, 将设定的音频有效区域转换为极点方向以及零点方向, 将扬声器位置转换为零点方向;
根据转换的所述极点方向以及所述零点方向, 确定不同应用场景下的 D(ω,θ)和 β; 其中, 所述极点方向为使超指向差分波束在该方向上响应值为 1 的入射角度, 所述零点方向为使超指向差分波束在该方向上响应值为 0 的入射角度。
23、 如权利要求 21或 22所述的方法, 其特征在于, 所述根据不同应用 场景所需输出信号类型, 将设定的音频有效区域转换为极点方向以及零点方 向, 具体包括:
当应用场景所需输出信号类型为单声道信号时, 设定麦克风阵列的端射 方向为极点方向, 并设定 M个零点方向, 其中 M≤N-1 , N为麦克风阵列中 的麦克风数量;
当应用场景所需输出信号类型为双声道信号时, 设定麦克风阵列的 0度 方向为极点方向, 并将麦克风阵列的 180度方向设定为零点方向, 以确定其 中一个声道对应的超指向差分波束形成权系数, 并设定麦克风阵列的 180度 方向为极点方向, 并将麦克风阵列的 0度方向设定为零点方向, 以确定另一 个声道对应的超指向差分波束形成权系数。
24、 一种差分波束形成装置, 其特征在于, 包括: 权系数确定单元和波 束形成处理单元;
所述权系数确定单元, 用于根据麦克风阵列的几何形状和设定的音频采 集有效区域, 确定差分波束形成权系数, 并将形成的所述权系数向所述波束 形成处理单元传输; 或根据麦克风阵列的几何形状、 设定的音频采集有效区 域和扬声器位置, 确定差分波束形成权系数, 并将形成的所述权系数向所述 波束形成处理单元传输;
所述波束形成处理单元, 根据当前应用场景所需输出信号类型从所述权 系数确定单元获取当前应用场景对应的权系数, 利用获取的所述权系数对音 频输入信号进行差分波束形成处理。
25、 如权利要求 24所述的装置, 其特征在于, 所述权系数确定单元, 具 体用于:
根据麦克风阵列的几何形状和设定的音频采集有效区域, 确定 D(ω,θ)和 β; 或根据麦克风阵列的几何形状、 设定的音频采集有效区域和扬声器位置, 确定 D(ω,θ)和 β; 根据确定的 D(ω,θ)和 β, 按照公式: h(ω)=D^H(ω,θ)[D(ω,θ)D^H(ω,θ)]^(-1)β 确定超指向差分波束形成的权系数;
其中, h(ω)为权系数, D(ω,θ)为任意几何形状的麦克风阵列所对应的转向矩阵, 由不同入射角度下声源到达麦克风阵列中各麦克风间的相对时延决定的, D^H(ω,θ)表示 D(ω,θ)的共轭转置矩阵, ω为音频信号的频率, θ为声源入射角度, β为入射角度为 θ时的响应向量。
26、 如权利要求 25所述的装置, 其特征在于, 所述权系数确定单元, 具 体用于:
根据不同应用场景所需输出信号类型, 将设定的音频有效区域转换为极点方向以及零点方向, 并根据得到的极点方向以及零点方向, 确定不同应用场景下的 D(ω,θ)和 β; 或者根据不同应用场景所需输出信号类型, 将设定的音频有效区域转换为极点方向以及零点方向, 将扬声器位置转换为零点方向, 并根据得到的极点方向以及零点方向, 确定不同应用场景下的 D(ω,θ)和 β; 其中, 所述极点方向为使超指向差分波束在该方向上响应值为 1 的入射角度, 所述零点方向为使超指向差分波束在该方向上响应值为 0 的入射角度。
27、 如权利要求 26所述的装置, 其特征在于, 所述权系数确定单元, 具 体用于:
当应用场景所需输出信号类型为单声道信号时, 设定麦克风阵列的端射 方向为极点方向, 并设定 M个零点方向, 其中 M≤N-1 , N为麦克风阵列中 的麦克风数量;
当应用场景所需输出信号类型为双声道信号时, 设定麦克风阵列的 0度 方向为极点方向, 并将麦克风阵列的 180度方向设定为零点方向, 以确定其 中一个声道对应的超指向差分波束形成权系数, 并设定麦克风阵列的 180度 方向为极点方向, 并将麦克风阵列的 0度方向设定为零点方向, 以确定另一 个声道对应的超指向差分波束形成权系数。
PCT/CN2014/076127 2013-09-18 2014-04-24 音频信号处理方法及装置、差分波束形成方法及装置 WO2015039439A1 (zh)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US15/049,515 US9641929B2 (en) 2013-09-18 2016-02-22 Audio signal processing method and apparatus and differential beamforming method and apparatus

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201310430978.7 2013-09-18
CN201310430978.7A CN104464739B (zh) Audio signal processing method and apparatus, and differential beamforming method and apparatus

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US15/049,515 Continuation US9641929B2 (en) 2013-09-18 2016-02-22 Audio signal processing method and apparatus and differential beamforming method and apparatus

Publications (1)

Publication Number Publication Date
WO2015039439A1 true WO2015039439A1 (zh) 2015-03-26

Family

ID=52688156

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2014/076127 WO2015039439A1 (zh) 2014-04-24 2015-03-26 Audio signal processing method and apparatus, and differential beamforming method and apparatus

Country Status (3)

Country Link
US (1) US9641929B2 (zh)
CN (1) CN104464739B (zh)
WO (1) WO2015039439A1 (zh)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107041012A (zh) * 2016-02-03 2017-08-11 Beijing Samsung Telecommunication Technology Research Co., Ltd. Random access method based on differential beams, base station device, and user equipment
US10643634B2 (en) 2017-06-15 2020-05-05 Goertek Inc. Multichannel echo cancellation circuit and method and smart device

Families Citing this family (60)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR102224568B1 (ko) * 2014-08-27 2021-03-08 Samsung Electronics Co., Ltd. Audio data processing method and electronic device supporting the same
US9565493B2 (en) 2015-04-30 2017-02-07 Shure Acquisition Holdings, Inc. Array microphone system and method of assembling the same
US9554207B2 (en) 2015-04-30 2017-01-24 Shure Acquisition Holdings, Inc. Offset cartridge microphones
US10410650B2 (en) 2015-05-20 2019-09-10 Huawei Technologies Co., Ltd. Method for locating sound emitting position and terminal device
CN106325142A (zh) * 2015-06-30 2017-01-11 Yutou Technology (Hangzhou) Co., Ltd. Robot system and control method thereof
CN105120421B (zh) * 2015-08-21 2017-06-30 Beijing Times Tuoling Technology Co., Ltd. Method and apparatus for generating virtual surround sound
US9788109B2 (en) * 2015-09-09 2017-10-10 Microsoft Technology Licensing, Llc Microphone placement for sound source direction estimation
US9804599B2 (en) 2015-11-04 2017-10-31 Zoox, Inc. Active lighting control for communicating a state of an autonomous vehicle to entities in a surrounding environment
US9878664B2 (en) * 2015-11-04 2018-01-30 Zoox, Inc. Method for robotic vehicle communication with an external environment via acoustic beam forming
US9494940B1 (en) 2015-11-04 2016-11-15 Zoox, Inc. Quadrant configuration of robotic vehicles
US10993057B2 (en) * 2016-04-21 2021-04-27 Hewlett-Packard Development Company, L.P. Electronic device microphone listening modes
JP6634354B2 (ja) * 2016-07-20 2020-01-22 Hosiden Corporation Hands-free call device for an emergency call system
CN106448693B (zh) * 2016-09-05 2019-11-29 Huawei Technologies Co., Ltd. Speech signal processing method and apparatus
CN107888237B (zh) * 2016-09-30 2022-06-21 Beijing Samsung Telecommunication Technology Research Co., Ltd. Initial access and random access methods, base station device, and user equipment
US10405125B2 (en) * 2016-09-30 2019-09-03 Apple Inc. Spatial audio rendering for beamforming loudspeaker array
US9930448B1 (en) * 2016-11-09 2018-03-27 Northwestern Polytechnical University Concentric circular differential microphone arrays and associated beamforming
CN106548783B (zh) * 2016-12-09 2020-07-14 Xi'an TCL Software Development Co., Ltd. Speech enhancement method and apparatus, smart speaker, and smart TV
RU2759715C2 (ru) * 2017-01-03 2021-11-17 Koninklijke Philips N.V. Audio recording using beamforming
US10367948B2 (en) 2017-01-13 2019-07-30 Shure Acquisition Holdings, Inc. Post-mixing acoustic echo cancellation systems and methods
US10366700B2 (en) 2017-02-08 2019-07-30 Logitech Europe, S.A. Device for acquiring and processing audible input
US10229667B2 (en) 2017-02-08 2019-03-12 Logitech Europe S.A. Multi-directional beamforming device for acquiring and processing audible input
US10362393B2 (en) 2017-02-08 2019-07-23 Logitech Europe, S.A. Direction detection device for acquiring and processing audible input
US10366702B2 (en) 2017-02-08 2019-07-30 Logitech Europe, S.A. Direction detection device for acquiring and processing audible input
CN107170462A (zh) * 2017-03-19 2017-09-15 Linjing Acoustics Technology Jiangsu Co., Ltd. MVDR-based sound hiding method
CN107248413A (zh) * 2017-03-19 2017-10-13 Linjing Acoustics Technology Jiangsu Co., Ltd. Sound hiding method based on differential beamforming
JP2018191145A (ja) * 2017-05-08 2018-11-29 Olympus Corporation Sound pickup device, sound pickup method, sound pickup program, and dictation method
CN108228577A (zh) * 2018-01-31 2018-06-29 Beijing Baidu Netcom Science and Technology Co., Ltd. Online translation method, apparatus, device, and computer-readable medium
CN108091344A (zh) * 2018-02-28 2018-05-29 iFLYTEK Co., Ltd. Noise reduction method, apparatus, and system
US11523212B2 (en) 2018-06-01 2022-12-06 Shure Acquisition Holdings, Inc. Pattern-forming microphone array
US11297423B2 (en) 2018-06-15 2022-04-05 Shure Acquisition Holdings, Inc. Endfire linear array microphone
CN109104683B (zh) * 2018-07-13 2021-02-02 Shenzhen Xiaorui Technology Co., Ltd. Dual-microphone phase measurement correction method and correction system
WO2020034095A1 (zh) * 2018-08-14 2020-02-20 Alibaba Group Holding Limited Audio signal processing apparatus and method
CN109119092B (zh) * 2018-08-31 2021-08-20 GD Midea Air-Conditioning Equipment Co., Ltd. Beam pointing switching method and apparatus based on a microphone array
EP3854108A1 (en) 2018-09-20 2021-07-28 Shure Acquisition Holdings, Inc. Adjustable lobe shape for array microphones
CN111383655B (zh) * 2018-12-29 2023-08-04 Jianan Mingxin (Beijing) Technology Co., Ltd. Beamforming method and apparatus, and computer-readable storage medium
EP3942845A1 (en) 2019-03-21 2022-01-26 Shure Acquisition Holdings, Inc. Auto focus, auto focus within regions, and auto placement of beamformed microphone lobes with inhibition functionality
US11558693B2 (en) 2019-03-21 2023-01-17 Shure Acquisition Holdings, Inc. Auto focus, auto focus within regions, and auto placement of beamformed microphone lobes with inhibition and voice activity detection functionality
CN113841419A (zh) 2019-03-21 2021-12-24 Shure Acquisition Holdings, Inc. Housing and associated design features for a ceiling array microphone
CN110095755B (zh) * 2019-04-01 2021-03-12 Unisound Intelligent Technology Co., Ltd. Sound source localization method
US11445294B2 (en) 2019-05-23 2022-09-13 Shure Acquisition Holdings, Inc. Steerable speaker array, system, and method for the same
JP2022535229A (ja) 2019-05-31 2022-08-05 Shure Acquisition Holdings, Inc. Low-latency automixer integrated with voice and noise activity detection
CN110383378B (zh) * 2019-06-14 2023-05-19 Shenzhen Goodix Technology Co., Ltd. Differential beamforming method and module, signal processing method and apparatus, and chip
EP3994689B1 (en) * 2019-07-02 2024-01-03 Dolby International AB Methods and apparatus for representation, encoding, and decoding of discrete directivity data
US11565426B2 (en) * 2019-07-19 2023-01-31 Lg Electronics Inc. Movable robot and method for tracking position of speaker by movable robot
CN114467312A (zh) 2019-08-23 2022-05-10 Shure Acquisition Holdings, Inc. Two-dimensional microphone array with improved directivity
CN110677786B (zh) * 2019-09-19 2020-09-01 Nanjing University Beamforming method for improving the spatial impression of a compact sound reproduction system
US10904657B1 (en) * 2019-10-11 2021-01-26 Plantronics, Inc. Second-order gradient microphone system with baffles for teleconferencing
CN110767247B (zh) * 2019-10-29 2021-02-19 Alipay (Hangzhou) Information Technology Co., Ltd. Speech signal processing method, sound collection apparatus, and electronic device
CN111081233B (zh) * 2019-12-31 2023-01-06 Lenovo (Beijing) Co., Ltd. Audio processing method and electronic device
US11552611B2 (en) 2020-02-07 2023-01-10 Shure Acquisition Holdings, Inc. System and method for automatic adjustment of reference gain
US11277689B2 (en) 2020-02-24 2022-03-15 Logitech Europe S.A. Apparatus and method for optimizing sound quality of a generated audible signal
USD944776S1 (en) 2020-05-05 2022-03-01 Shure Acquisition Holdings, Inc. Audio device
CN113645546B (zh) * 2020-05-11 2023-02-28 Alibaba Group Holding Limited Speech signal processing method and system, and audio/video communication device
WO2021243368A2 (en) 2020-05-29 2021-12-02 Shure Acquisition Holdings, Inc. Transducer steering and configuration systems and methods using a local positioning system
CN112073873B (zh) * 2020-08-17 2021-08-10 Nanjing University of Aeronautics and Astronautics Optimized design method for a first-order tunable differential array without redundant array elements
KR20220097075A (ko) * 2020-12-31 2022-07-07 LG Display Co., Ltd. Vehicle sound control system, vehicle including the same, and vehicle sound control method
CN116918351A (zh) 2021-01-28 2023-10-20 Shure Acquisition Holdings, Inc. Hybrid audio beamforming system
WO2023065317A1 (zh) * 2021-10-22 2023-04-27 Alibaba Damo (Hangzhou) Technology Co., Ltd. Conference terminal and echo cancellation method
CN113868583B (zh) * 2021-12-06 2022-03-04 Hangzhou Zhaohua Electronics Co., Ltd. Sound source distance calculation method and system based on subarray beam focusing
CN115038014A (zh) * 2022-06-02 2022-09-09 Shenzhen Changfeng Imaging Equipment Co., Ltd. Audio signal processing method and apparatus, electronic device, and storage medium

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1267445A * 1997-08-20 2000-09-20 Phonak AG Method for electronically beamforming acoustic signals and acoustic sensor apparatus
WO2005004532A1 * 2003-06-30 2005-01-13 Harman Becker Automotive Systems Gmbh Handsfree system for use in a vehicle
CN1753084A * 2004-09-23 2006-03-29 Harman Becker Automotive Systems GmbH Multi-channel adaptive speech signal processing with noise reduction
CN101964934A * 2010-06-08 2011-02-02 Zhejiang University Binary microphone micro-array speech beamforming method
CN102164328A * 2010-12-29 2011-08-24 Institute of Acoustics, Chinese Academy of Sciences Microphone-array-based audio input system for a home environment
CN102474680A * 2009-07-24 2012-05-23 Koninklijke Philips Electronics N.V. Audio beamforming
CN103065639A * 2011-09-30 2013-04-24 Skype Processing signals

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8842848B2 (en) * 2009-09-18 2014-09-23 Aliphcom Multi-modal audio system with automatic usage mode detection and configuration capability
CH702399B1 (fr) 2009-12-02 2018-05-15 Veovox SA Apparatus and method for capturing and processing speech
US20130343549A1 (en) * 2012-06-22 2013-12-26 Verisilicon Holdings Co., Ltd. Microphone arrays for generating stereo and surround channels, method of operation thereof and module incorporating the same
US9294859B2 (en) * 2013-03-12 2016-03-22 Google Technology Holdings LLC Apparatus with adaptive audio adjustment based on surface proximity, surface type and motion
US9462379B2 (en) * 2013-03-12 2016-10-04 Google Technology Holdings LLC Method and apparatus for detecting and controlling the orientation of a virtual microphone


Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107041012A (zh) * 2016-02-03 2017-08-11 Beijing Samsung Telecommunication Technology Research Co., Ltd. Random access method based on differential beams, base station device, and user equipment
CN107041012B (zh) * 2016-02-03 2022-11-22 Beijing Samsung Telecommunication Technology Research Co., Ltd. Random access method based on differential beams, base station device, and user equipment
US10643634B2 (en) 2017-06-15 2020-05-05 Goertek Inc. Multichannel echo cancellation circuit and method and smart device

Also Published As

Publication number Publication date
CN104464739A (zh) 2015-03-25
CN104464739B (zh) 2017-08-11
US20160173978A1 (en) 2016-06-16
US9641929B2 (en) 2017-05-02

Similar Documents

Publication Publication Date Title
WO2015039439A1 (zh) Audio signal processing method and apparatus, and differential beamforming method and apparatus
EP3320692B1 (en) Spatial audio processing apparatus
WO2015035785A1 (zh) Speech signal processing method and apparatus
CN106664501B (zh) System, apparatus, and method for consistent acoustic scene reproduction based on informed spatial filtering
JP5886304B2 (ja) Systems, methods, apparatus, and computer-readable media for directionally sensitive recording control
KR101547035B1 (ko) Three-dimensional sound capture and reproduction with multiple microphones
KR101724514B1 (ko) Sound signal processing method and apparatus
TWI555412B (zh) Apparatus and method for merging geometry-based spatial audio coding streams
US9485574B2 (en) Spatial interference suppression using dual-microphone arrays
KR101555416B1 (ko) Apparatus and method for spatially selective sound acquisition by acoustic triangulation
CN103026734B (zh) Electronic device for generating beamformed audio signals with steerable nulls
US20170365255A1 (en) Far field automatic speech recognition pre-processing
US9838646B2 (en) Attenuation of loudspeaker in microphone array
JP2014501945A (ja) Apparatus and method for geometry-based spatial audio coding
CN103004233A (zh) Electronic device for generating a modified wideband audio signal based on two or more wideband microphone signals
JP2020500480A5 (zh)
CN108141665A (zh) Signal processing apparatus, signal processing method, and program
JP2008252625A (ja) Directional speaker system
KR101678305B1 (ko) Hybrid 3D microphone array system for telepresence and operation method thereof
KR20130109615A (ko) Method and apparatus for generating virtual stereo sound
Wan et al. Robust and low complexity localization algorithm based on head-related impulse responses and interaural time difference
Peled et al. Objective performance analysis of spherical microphone arrays for speech enhancement in rooms
JP2010161735A (ja) Sound reproducing apparatus and sound reproducing method
CN104735582A (zh) Sound signal processing method, apparatus, and device
Shabtai et al. Spherical array beamforming for binaural sound reproduction

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 14846445

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 14846445

Country of ref document: EP

Kind code of ref document: A1