CN206349145U - Audio signal processing apparatus

Info

Publication number: CN206349145U
Application number: CN201621455822.XU
Authority: CN
Inventors: 徐荣强
Original assignee: Beijing Horizon Information Technology Co Ltd
Current assignee: Beijing Horizon Information Technology Co Ltd
Priority date: 2016-12-28
Filing date: 2016-12-28
Publication date: 2017-07-21
Anticipated expiration: 2026-12-28

Abstract

An audio signal processing apparatus is disclosed. The apparatus includes: a loudspeaker; a microphone array including multiple directional microphones with different pickup areas, each directional microphone collecting a branch input signal within its own pickup area, the branch input signal including an attention signal component from a signal source and an echo signal component from the loudspeaker; a multiplexer for merging the branch input signals collected by the directional microphones into a total input signal; an auditory localization device for determining the positions of the signal source and the loudspeaker; and a gain control mechanism for adjusting the gain of each directional microphone according to the positions of the signal source and the loudspeaker, so as to maximize the signal-to-echo ratio between the power of the attention signal component received from the signal source and the power of the echo signal component received from the loudspeaker in the total input signal. Lossless attention signal enhancement and echo signal suppression can thereby be achieved.

Description

Audio signal processing apparatus

Technical field

The present application relates to the field of audio technology and, more particularly, to an audio signal processing apparatus.

Background

Both intelligent speech recognition systems (for example, smart appliances and robots) and traditional voice communication systems (for example, conference systems and voice over Internet Protocol (VoIP) systems) face the problem of echo cancellation.

For example, in single-talk mode: in smart-device scenarios, the device should not let wake-up words or recognition words contained in its own playback re-enter its own recognition system, which would cause false triggers, degrade the user experience, and waste resources; in traditional communication systems, the remote user does not want to hear an echo of their own speech. In double-talk mode: in smart-device scenarios, the device should hear the user's speech without being disturbed by the content it is playing; in traditional communication systems, clear communication quality and high intelligibility should be ensured even when the near-end and far-end users speak simultaneously. All of these are very important scenarios for the voice experience, and they remain difficult problems in current audio signal processing.

Existing echo cancellation technology is based on the combination of a single microphone and an echo suppression algorithm. The echo suppression algorithm processes the input signal only from the time-domain and frequency-domain perspectives, so it also damages speech while processing the echo, which degrades the subsequent recognition rate. Moreover, when the echo is large, either the echo is not removed cleanly, which degrades the recognition rate, or the algorithm suppresses too strongly and damages the speech component; both impair recognition.

Summary of the utility model

The present application is proposed to solve the above technical problem. Embodiments of the present application provide an audio signal processing apparatus that can achieve lossless attention signal enhancement and echo signal suppression by exploiting the characteristics of a directional microphone array.

According to one aspect of the present application, an audio signal processing apparatus is provided. The apparatus includes: a loudspeaker; a microphone array including multiple directional microphones with different pickup areas, each directional microphone collecting a branch input signal within its own pickup area, the branch input signal including an attention signal component from a signal source and an echo signal component from the loudspeaker; a multiplexer, electrically connected to each directional microphone, for merging the branch input signals collected by the directional microphones into a total input signal; an auditory localization device for determining the position of the signal source and the position of the loudspeaker; and a gain control mechanism, electrically connected to the auditory localization device and to each directional microphone, for adjusting the gain of each directional microphone according to the position of the signal source and the position of the loudspeaker, so as to maximize the signal-to-echo ratio between the power of the attention signal component received from the signal source and the power of the echo signal component received from the loudspeaker in the total input signal.

In one embodiment of the present application, the auditory localization device includes: a signal source positioning device for detecting whether a signal source that is outputting an attention signal exists in the current scene, the number of such signal sources, and their respective positions; and a loudspeaker positioning device for detecting whether a loudspeaker that is playing a sound signal exists in the current scene, the number of such loudspeakers, and their respective positions.

In one embodiment of the present application, the signal source positioning device includes: a camera for capturing a scene image of the current scene; and an image recognition unit for recognizing the signal source in the scene image, determining the number of signal sources, and determining the relative position between the signal source and a reference position of the audio signal processing apparatus.

In one embodiment of the present application, the image recognition unit determines the relative position between the signal source and a reference position of the signal source positioning device according to the position of the signal source in the scene image, and then determines the relative position between the signal source and the reference position of the audio signal processing apparatus according to the registration relationship between the reference position of the signal source positioning device and the reference position of the audio signal processing apparatus.

In one embodiment of the present application, the signal source positioning device includes: a signal separation unit for receiving at least two branch input signals collected by at least two directional microphones and separating the attention signal component from the signal source out of the at least two branch input signals; and an acoustic recognition unit for determining the relative position of the signal source and the audio signal processing apparatus according to the phase of the separated attention signal component of the signal source.

In one embodiment of the present application, the loudspeaker positioning device includes: a signal separation unit for receiving at least two branch input signals collected by at least two directional microphones and separating the attention signal component from the loudspeaker out of the at least two branch input signals; and an acoustic recognition unit for determining the relative position of the loudspeaker and the audio signal processing apparatus according to the phase of the separated attention signal component of the loudspeaker.

In one embodiment of the present application, the gain control mechanism includes: a comparing unit for comparing, in response to the presence of one or more signal sources that are outputting an attention signal and the absence of any loudspeaker that is playing a sound signal, a first positional relationship between the one or more signal sources and the pickup area of each directional microphone; and a gain adjusting unit for adjusting the gain of each directional microphone according to the first positional relationship, so as to maximize the power of the attention signal component received from the one or more signal sources in the total input signal.

In one embodiment of the present application, the gain adjusting unit increases the gain of the one or more directional microphones in whose pickup areas the one or more signal sources are located, so that the power of the attention signal component received from the one or more signal sources in the total input signal is maximized without distorting any attention signal component.

In one embodiment of the present application, the gain adjusting unit further reduces the gain of the microphones in the microphone array other than the one or more directional microphones, so as to reduce the power of the noise component received from noise sources in the total input signal.

In one embodiment of the present application, the gain control mechanism includes: a comparing unit for comparing, in response to the absence of any signal source that is outputting an attention signal and the presence of one or more loudspeakers that are playing a sound signal, a second positional relationship between the one or more loudspeakers and the pickup area of each directional microphone; and a gain adjusting unit for adjusting the gain of each directional microphone according to the second positional relationship, so as to minimize the power of the echo signal component received from the one or more loudspeakers in the total input signal.

In one embodiment of the present application, the gain adjusting unit reduces the gain of the one or more directional microphones in whose pickup areas the one or more loudspeakers are located.

In one embodiment of the present application, the gain control mechanism includes: a comparing unit for comparing, in response to the simultaneous presence of one or more signal sources that are outputting an attention signal and one or more loudspeakers that are playing a sound signal, a first positional relationship between the one or more signal sources and the pickup area of each directional microphone and a second positional relationship between the one or more loudspeakers and the pickup area of each directional microphone; and a gain adjusting unit for adjusting the gain of each directional microphone according to the first positional relationship and the second positional relationship, so as to maximize the signal-to-echo ratio between the power of the attention signal component received from the one or more signal sources and the power of the echo signal component received from the one or more loudspeakers in the total input signal.
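The three cases handled by the gain control mechanism (signal source only, loudspeaker only, both present) can be illustrated with a minimal sketch. Everything named here is an assumption for illustration: the function `choose_gains`, the gain levels `boost`, `nominal`, and `cut`, and the choice to leave a microphone at the nominal gain when its pickup area contains both a signal source and a loudspeaker are not taken from the source text.

```python
def choose_gains(source_mics, speaker_mics, num_mics=8,
                 boost=2.0, nominal=1.0, cut=0.25):
    """Return one gain per directional microphone (0-based indices).

    source_mics:  set of microphones whose pickup area contains an active signal source.
    speaker_mics: set of microphones whose pickup area contains an active loudspeaker.
    boost / nominal / cut are hypothetical gain levels chosen for this sketch.
    """
    gains = [nominal] * num_mics
    for k in range(num_mics):
        in_source = k in source_mics
        in_speaker = k in speaker_mics
        if in_source and not in_speaker:
            gains[k] = boost      # attention signal only: raise the gain
        elif in_speaker and not in_source:
            gains[k] = cut        # echo only: lower the gain
        elif not in_source and not in_speaker and source_mics and not speaker_mics:
            gains[k] = cut        # source-only scene: lower the rest to reduce noise
        # A microphone that hears both stays at nominal (assumption of this sketch).
    return gains
```

For instance, `choose_gains({0}, set())` raises MIC1 and lowers the other seven microphones, while `choose_gains(set(), {4})` lowers only MIC5. A real device would also have to cap the boosted gain so that no attention signal component is distorted.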

In one embodiment of the present application, the apparatus further includes: an adaptive filter for performing echo cancellation, in the time domain and/or the frequency domain, on the gain-adjusted total input signal according to the sound played by the loudspeaker.
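The source does not say which adaptive filter is used. As one possible illustration, the sketch below uses a time-domain normalized least-mean-squares (NLMS) filter, a common choice that is substituted here as an assumption. The function name `nlms_echo_cancel` and the parameters `taps`, `mu`, and `eps` are hypothetical.

```python
def nlms_echo_cancel(total_input, playback, taps=4, mu=0.5, eps=1e-8):
    """Subtract an adaptively estimated echo of `playback` from `total_input`.

    Returns (residual, weights): the echo-cancelled samples and the final
    estimate of the echo path. NLMS is an assumed algorithm, not the source's.
    """
    weights = [0.0] * taps        # estimated echo-path coefficients
    history = [0.0] * taps        # most recent playback samples, newest first
    residual = []
    for d, x in zip(total_input, playback):
        history = [x] + history[:-1]
        echo_estimate = sum(w * h for w, h in zip(weights, history))
        e = d - echo_estimate     # what remains after removing the estimated echo
        norm = sum(h * h for h in history) + eps
        weights = [w + mu * e * h / norm for w, h in zip(weights, history)]
        residual.append(e)
    return residual, weights
```

Because the sound played by the loudspeaker is known at playback time, it can serve directly as the reference input `playback`. If the total input consists only of the playback delayed by one sample and scaled by 0.5, the weights should settle near `[0, 0.5, 0, 0]` and the residual should approach zero.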

Compared with the prior art, the audio signal processing apparatus according to the embodiments of the present application can adjust the gain of each directional microphone in the microphone array according to the position of the signal source and the position of the loudspeaker, so as to maximize the signal-to-echo ratio between the power of the attention signal component received from the signal source and the power of the echo signal component received from the loudspeaker in the total input signal collected by the microphone array. Lossless attention signal enhancement and echo signal suppression can therefore be achieved by exploiting the characteristics of the directional microphone array.

Brief description of the drawings

The above and other objects, features, and advantages of the present application will become more apparent from the more detailed description of its embodiments given with reference to the accompanying drawings. The drawings provide a further understanding of the embodiments, constitute a part of the specification, and serve together with the embodiments to explain the present application; they do not limit it. In the drawings, identical reference numerals generally denote identical components or steps.

Fig. 1 is a schematic structural diagram of the audio signal processing apparatus according to an embodiment of the present application.

Fig. 2 is a schematic structural diagram of the microphone array according to an embodiment of the present application.

Fig. 3 is a schematic structural diagram of the auditory localization device according to an embodiment of the present application.

Fig. 4 is a schematic structural diagram of the gain control mechanism according to an embodiment of the present application.

Fig. 5 is a schematic diagram of an example positional relationship between the audio signal processing apparatus and signal sources according to an embodiment of the present application.

Fig. 6 is a schematic flowchart of the audio signal processing method according to an embodiment of the present application.

Fig. 7 is a block diagram of the electronic device according to an embodiment of the present application.

Detailed description of the embodiments

Example embodiments of the present application are described in detail below with reference to the accompanying drawings. Obviously, the described embodiments are only some, not all, of the embodiments of the present application, and it should be understood that the present application is not limited by the example embodiments described here.

Overview of the application

As described above, the traditional echo cancellation scheme that combines a single microphone with an echo suppression algorithm processes the input signal collected by the microphone from the time-domain and frequency-domain perspectives. When the loudspeaker is strongly coupled to the microphone, this scheme faces a dilemma: if the echo suppression algorithm suppresses too strongly, the speech signal of interest is attenuated excessively, which damages the speech signal and lowers the recognition rate; if the algorithm is too weak, most of the echo signal cannot be eliminated and acts on the speech signal as a new non-stationary noise, which likewise lowers the recognition rate.

For example, in smart-device scenarios, smart devices such as televisions, speakers, and robots typically use relatively high loudspeaker power to achieve a far-field effect. As a result, the sound played by the loudspeaker is picked up again by the microphone and produces a large echo. Conventional adaptive filtering algorithms have difficulty eliminating this echo: the residual echo after cancellation is large, and the algorithm also damages the speech considerably, so the recognition rate of the speech signal and the communication quality are both low.

To address this technical problem, the basic idea of the present application is to propose an audio signal processing apparatus, an audio signal processing method, an electronic device, a computer program product, and a computer-readable storage medium that, based on the combination of a microphone array and an echo suppression algorithm, enhance the attention signal (for example, a speech signal) and eliminate the echo signal in the spatial domain. Spatial-domain enhancement does minimal damage to the attention signal, and the subsequent echo suppression algorithm then needs only its linear echo suppression part to eliminate the echo signal well, so echo cancellation capability is improved without affecting the recognition rate. Compared with an omnidirectional microphone array, a directional microphone array exploits the characteristics of the microphones themselves rather than introducing a spatial-domain algorithm, so it damages the attention signal less. In addition, the algorithm is configured according to the principle of maximizing the ratio of the attention signal to the echo signal: different gains are assigned to the microphones of the directional microphone array so that the signal-to-echo ratio (SER) between the attention signal power and the echo signal power is maximized. Speech recognition intelligibility, voice communication quality, and the like can thus be adaptively maximized, improving the user experience.

Having described the basic principle of the present application, various non-limiting embodiments are now introduced in detail with reference to the accompanying drawings.

Exemplary audio signal processing apparatus

As shown in Fig. 1, the audio signal processing apparatus 100 according to an embodiment of the present application includes: a loudspeaker 110, a microphone array 120, a multiplexer 130, an auditory localization device 140, and a gain control mechanism 150.

In one embodiment, the loudspeaker 110 is used to play a sound signal; it may be a single loudspeaker or a loudspeaker array composed of multiple loudspeakers. The sound signal is known at the time of playback.

For example, the loudspeaker 110 may be a 2.1 speaker system, composed of one woofer (commonly called a subwoofer) and a pair of full-range speakers with weaker bass (commonly called satellite speakers). The pair of speakers includes a left (L) channel loudspeaker and a right (R) channel loudspeaker, thereby producing stereo playback. Obviously, the present application is not limited to this. For example, the loudspeaker 110 may also be a 2.0 speaker system, a 5.1 speaker system, or the like.

In one embodiment, the microphone array 120 may include multiple directional microphones with different pickup areas. Each directional microphone collects a branch input signal within its own pickup area, and the branch input signal includes an attention signal component from a signal source and an echo signal component from the loudspeaker.

For example, the microphone array 120 is a system composed of a certain number of microphones that samples and processes the spatial characteristics of a sound field. The directivity of a microphone describes the pattern of its sensitivity to sound arriving from each direction in space, and is one of its important attributes. According to directivity, microphones can be divided into omnidirectional microphones and directional microphones. An omnidirectional microphone has essentially the same sensitivity to sound from different angles; its capsule is designed on the pressure-sensing principle, and its diaphragm receives pressure only from the outside. A directional microphone is designed mainly on the pressure-gradient principle: through small openings at the rear of the capsule cavity, the diaphragm receives pressure on both its front and back faces, so the pressure on the diaphragm differs with direction and the microphone becomes directional. Compared with an omnidirectional microphone array, a directional microphone array exploits the characteristics of the microphones themselves rather than introducing a spatial-domain algorithm, so it damages speech less.

For example, depending on the relative positions of the microphones, the microphone array 120 can be: a linear array, whose element centers lie on the same straight line; a planar array, whose element centers are distributed in one plane; or a spatial array, whose element centers are distributed in three-dimensional space.

For example, the microphone array 120 may include multiple directional microphones MIC1 to MICn with different pickup areas, where n is a natural number greater than or equal to 2. In the following example, the microphone array is described taking a planar array as an example.

As shown in Fig. 2 for example, be equipped with the microphone array 120 of a plane on audio signal processing apparatus 100, The microphone array 120 includes with same central point and centrosymmetric 8 directional microphone MIC1 to MIC8 is presented.Institute It is used in the pickup area of itself gather branch input signal after stating 8 directional microphones parallel connections.

Specifically, the directional microphones MIC1 to MIC8 are arranged in the same plane, and the distance between them is set according to actual requirements and the algorithm used. Adjacent directional microphones are uniformly distributed around the center point in the two-dimensional plane, at 45° to one another. As shown in Fig. 2, assuming MIC1 lies in the reference direction of the audio signal processing apparatus 100, that is, the 0° direction, MIC2 lies in the 45° direction, MIC3 in the 90° direction, MIC4 in the 135° direction, MIC5 in the 180° direction, MIC6 in the 225° direction, MIC7 in the 270° direction, and MIC8 in the 315° direction.

Of course, the present application is not limited to this. In other embodiments, the microphone array may be another planar array, a linear array, a three-dimensional spatial array, or the like. The directional microphones in the array may be placed in the same plane or in different planes according to actual requirements; they may be uniformly distributed around the center point to obtain as large a range of pickup directions as possible, or non-uniformly distributed to focus on sound sources in certain directions. The directional microphones may also be arranged in a non-paired manner, for example individually or in groups.

MIC1 to MIC8 may each have a pickup area facing directly ahead of itself, that is, pickup areas facing the 0°, 45°, 90°, 135°, 180°, 225°, 270°, and 315° directions respectively. To avoid missing signals, adjacent pickup areas may overlap. Each of MIC1 to MIC8 collects its own branch input signal within its own pickup area: when a signal source is in its pickup area, the branch input signal includes the attention signal component from the signal source; when a loudspeaker is in its pickup area, the branch input signal includes the echo signal component from the loudspeaker; when a signal source and a loudspeaker are both in its pickup area, the branch input signal includes both the attention signal component from the signal source and the echo signal component from the loudspeaker; and when neither a signal source nor a loudspeaker is in its pickup area, the branch input signal is zero.
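The relationship between a sound source's angle and the overlapping pickup areas of the 8 uniformly spaced microphones can be sketched as follows. The function names and the parameter `half_width_deg` (the half-width of one pickup area) are assumptions for illustration; the source gives no numeric width, only that adjacent areas may overlap.

```python
def angular_distance(a_deg, b_deg):
    """Smallest absolute difference between two bearings, in degrees (0 to 180)."""
    d = (a_deg - b_deg) % 360.0
    return min(d, 360.0 - d)


def covering_mics(source_deg, num_mics=8, half_width_deg=30.0):
    """Return the 1-based indices of microphones whose pickup area contains source_deg.

    Microphone k (1-based) faces (k - 1) * 360 / num_mics degrees, matching
    MIC1 at 0 degrees and MIC8 at 315 degrees. half_width_deg is hypothetical;
    for 8 microphones any value above 22.5 makes adjacent areas overlap.
    """
    spacing = 360.0 / num_mics
    return [k + 1 for k in range(num_mics)
            if angular_distance(source_deg, k * spacing) <= half_width_deg]
```

With these assumed values, a source at 0° falls only in the pickup area of MIC1, while a source at 25° falls in the overlap between MIC1 and MIC2.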

In one embodiment, the multiplexer 130 is electrically connected to each directional microphone and merges the branch input signals collected by the directional microphones into a total input signal.

For example, the multiplexer may simply be an adder that aligns the branch input signals in the time domain and superimposes them into a single total input signal. Alternatively, the multiplexer may be a weighted adder that applies different weights to different branch input signals during superposition, so that the branch input signals of interest have higher peaks in the total input signal.
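Both variants of the multiplexer can be sketched in a few lines, assuming the branch input signals have already been aligned in time and are represented as equal-length lists of samples. The function name `merge_branches` is hypothetical.

```python
def merge_branches(branches, weights=None):
    """Merge time-aligned branch input signals into one total input signal.

    branches: list of equal-length sample lists, one per directional microphone.
    weights:  optional per-branch weights; None gives the plain adder.
    """
    if not branches:
        return []
    length = len(branches[0])
    if any(len(b) != length for b in branches):
        raise ValueError("branch signals must be time-aligned to equal length")
    if weights is None:
        weights = [1.0] * len(branches)
    if len(weights) != len(branches):
        raise ValueError("one weight per branch is required")
    return [sum(w * b[i] for w, b in zip(weights, branches))
            for i in range(length)]
```

Calling `merge_branches([[1, 2], [3, 4]])` behaves as the plain adder, while passing `weights=[2.0, 0.0]` emphasizes the first branch and discards the second.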

In one embodiment, the auditory localization device 140 is used to determine the position of the signal source and the position of the loudspeaker. The signal source and the loudspeaker can be located in various ways.

As shown in Fig. 3, the auditory localization device 140 may include: a signal source positioning device 141 for detecting whether a signal source that is outputting an attention signal exists in the current scene, the number of such signal sources, and their respective positions; and a loudspeaker positioning device 142 for detecting whether a loudspeaker that is playing a sound signal exists in the current scene, the number of such loudspeakers, and their respective positions.

Here, the term "position" refers mainly to the angle of the signal source or the loudspeaker relative to the reference direction of the audio signal processing apparatus (for example, the 0° direction in Fig. 2).

In a first example, the signal source positioning device 141 may include: a camera for capturing a scene image of the current scene; and an image recognition unit for recognizing the signal source in the scene image, determining the number of signal sources, and determining the relative position between the signal source and a reference position of the audio signal processing apparatus.

For example, the camera may be used to capture a scene image of the current scene (for example, covering at least the pickup areas of all directional microphones); it may be a single camera or a camera array. The scene image collected by the camera may be, for example, a single frame, a sequence of consecutive frames (that is, a video stream), or a sequence of discrete frames (that is, a set of image data sampled at predetermined sampling times). The camera may be, for example, a monocular, binocular, or multi-view camera, and it may capture grayscale images or color images carrying color information. Of course, any other type of camera known in the art or developed in the future can be applied to the present application; the present application places no particular limit on how the image is captured, as long as grayscale or color information of the input image can be obtained. To reduce the amount of computation in subsequent operations, in one embodiment the color image may be converted to grayscale before it is analyzed and processed.

For example, the imaging device may continuously capture image frames, and the captured frames may be continuously analyzed and processed to recognize the signal source in them. For example, in a speech recognition scenario of an intelligent electronic device (for example, a smart appliance or robot), the signal source may be a user interacting with the electronic device. In that case, the signal source can be recognized using algorithms such as human body recognition, face recognition, and mouth recognition. In the simplest case, the user is judged to be a signal source whenever a user is recognized in the current scene; for greater accuracy, the user is judged to be a signal source only when a user is recognized in the current scene and the user's lips are opening and closing.

It should be noted that the signal source emitting the attention signal is not limited to a user; it may be any other possible source, for example a television, a vehicle, or an animal. Correspondingly, the signal source recognition algorithm can be changed to television recognition, vehicle recognition, animal recognition, or the like.

Next, the image recognition unit determines the relative position between the signal source and a reference position of the signal source positioning device according to the position of the signal source in the scene image, and then determines the relative position between the signal source and the reference position of the audio signal processing apparatus according to the registration relationship between the reference position of the signal source positioning device and the reference position of the audio signal processing apparatus.

For example, the image recognition unit can determine the position of the recognized signal source (for example, the user or the user's mouth) in the image coordinate system and convert it into a position in the world coordinate system according to the extrinsic parameter matrix of the camera. The image recognition unit can then obtain the pre-calibrated mapping between the reference direction of the camera and the reference direction of the audio signal processing apparatus 100 (for example, the reference direction of the microphone array), and convert the position of the signal source in the world coordinate system into the sound coordinate system, thereby obtaining the angle between the signal source and the reference direction (that is, the 0° direction) of the microphone array.
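A heavily simplified version of this coordinate conversion is sketched below, restricted to the horizontal bearing. It assumes an ideal pinhole camera and reduces the pre-calibrated mapping to a single yaw offset between the camera's optical axis and the microphone array's 0° direction. The function name and the parameters `cx_px`, `focal_px`, and `camera_yaw_deg` are hypothetical, and the sketch assumes array angles increase in the same rotational sense as pixel columns.

```python
import math


def pixel_to_array_angle(u_px, cx_px, focal_px, camera_yaw_deg):
    """Bearing of an image point relative to the microphone array's 0-degree direction.

    u_px:           pixel column of the recognized signal source.
    cx_px:          pixel column of the camera's principal point (assumed known).
    focal_px:       focal length in pixels (assumed known from calibration).
    camera_yaw_deg: assumed angle of the camera's optical axis in the array's
                    coordinate system, standing in for the pre-calibrated mapping.
    """
    # Horizontal angle between the optical axis and the ray through the pixel.
    bearing_in_camera = math.degrees(math.atan2(u_px - cx_px, focal_px))
    # Rotate into the array's coordinate system and wrap into [0, 360).
    return (camera_yaw_deg + bearing_in_camera) % 360.0
```

A full implementation would apply the complete extrinsic matrix and would need depth information (for example, from a binocular camera) when the camera and the microphone array are not co-located.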

In a second example, the signal source positioning device 141 may include: a signal separation unit for receiving at least two branch input signals collected by at least two directional microphones and separating the attention signal component from the signal source out of the at least two branch input signals; and an acoustic recognition unit for determining the relative position of the signal source and the audio signal processing apparatus according to the phase of the separated attention signal component of the signal source.

For example, because the sound signal currently being played by the loudspeaker is known, the signal separation unit can remove that sound signal component (which corresponds to the echo signal component) from the branch input signals collected by the microphones, in the time domain and/or the frequency domain, and obtain the attention signal component that comes only from the signal source. In this case, the signal separation unit may simply be a subtracter. The acoustic recognition unit can then use the attention signal components separated from at least two branches, together with any existing or future sound localization method, to obtain directly the angle between the signal source and the reference direction (that is, the 0° direction) of the microphone array.
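The two steps above can be sketched as follows. The source names phase as the localization cue but does not fix a method, so this sketch substitutes a plain time-domain cross-correlation delay estimate between two microphones and a far-field conversion to an angle. All function names, `echo_gain`, `mic_spacing_m`, and the assumption that the echo reaches the microphone as a scaled, undelayed copy of the playback are illustrative only.

```python
import math


def subtract_playback(branch, playback, echo_gain=1.0):
    """Subtracter: remove the known playback signal from one branch input signal."""
    return [x - echo_gain * p for x, p in zip(branch, playback)]


def estimate_delay(a, b, max_lag):
    """Lag in samples by which b trails a, found by maximizing cross-correlation."""
    best_lag, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        score = sum(a[i] * b[i + lag]
                    for i in range(len(a))
                    if 0 <= i + lag < len(b))
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


def delay_to_angle(lag, sample_rate_hz, mic_spacing_m, speed_of_sound=343.0):
    """Far-field arrival angle in degrees from broadside for a two-microphone pair."""
    s = lag / sample_rate_hz * speed_of_sound / mic_spacing_m
    return math.degrees(math.asin(max(-1.0, min(1.0, s))))
```

In practice the echo path includes delay and room reflections, so a plain subtracter is only the simplest case, and the resulting pairwise angle still has to be mapped onto the array's 0° reference direction.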

Obviously, the present application is not limited to the two examples above; any method that can be used to determine the position of the signal source can be applied here and therefore falls within the scope of the present application. For example, the first and second examples above may be combined: a signal source is judged to exist in a given direction only when a user whose lips are opening and closing is recognized in the current scene and a speech signal is also detected in the corresponding direction, which yields more accurate signal source detection and localization.

In addition, in one example, the loudspeaker positioning device 142 includes: a signal separation unit for receiving at least two branch input signals collected by at least two directional microphones and separating the attention signal component from the loudspeaker out of the at least two branch input signals; and an acoustic recognition unit for determining the relative position of the loudspeaker and the audio signal processing apparatus according to the phase of the separated attention signal component of the loudspeaker.

Because this example structure of the loudspeaker positioning device 142 is the same as the structure of the signal source positioning device 141 in the second example, its description is omitted here for brevity. Furthermore, to save cost and space, the loudspeaker positioning device 142 may share the same signal separation unit and acoustic recognition unit with the signal source positioning device 141.

In another example, the position of the loudspeaker array within the audio signal processing apparatus 100 is usually preset and fixed, and information on the position of the loudspeaker relative to the microphone array is usually included in the factory settings. For simplicity, the loudspeaker positioning device 142 can therefore use this position information directly to determine the angle between each of the one or more loudspeakers and the reference direction (that is, the 0° direction) of the microphone array.

In this case, the loudspeaker positioning device 142 includes a position acquisition unit for reading the position of the loudspeaker relative to the audio signal processing apparatus.

Obviously, the present application is not limited to the two examples above; any method that can determine the position of a loudspeaker may be applied here and therefore falls within the scope of the present application. For example, the two examples may be combined: because the actual loudspeaker position may be offset from the preset position, the relative position between the loudspeaker and the microphone array may first be determined roughly from the preset position, and the sound localization method may then be used to adaptively find the deviation in actual operation.

In one embodiment, the gain control device 150 is electrically connected to the sound source localization device 140 and to each directional microphone, and adjusts the gain of each directional microphone according to the position of the signal source and the position of the loudspeaker, so as to maximize the signal-to-echo ratio (SER), that is, the ratio of the power of the attention signal component received from the signal source to the power of the echo signal component received from the loudspeaker in the total input signal.

As shown in Fig. 4, the gain control device 150 may include: a comparing unit 151 for comparing the positions of the signal source and the loudspeaker with the pickup area of each directional microphone; and a gain adjusting unit 152 for adjusting the gain of each directional microphone according to that positional relationship, so as to maximize the signal-to-echo ratio between the power of the attention signal component received from the signal source and the power of the echo signal component received from the loudspeaker in the total input signal.

For example, the comparing unit 151 may simply be a comparator. After the sound source localization device has detected the angle between the signal source and the reference direction (that is, the 0° direction) of the microphone array and the angle between the loudspeaker and that reference direction, the comparator determines in which directional microphone's pickup area, or areas, the signal source and the loudspeaker are respectively located.
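The comparison performed by the comparing unit 151 can be sketched as follows. This is only a minimal illustration: it assumes that each pickup area is a sector centered on the microphone's pointing direction with a hypothetical half-width of 45°, and it uses a hypothetical four-microphone layout at 0°, 90°, 180°, and 270°. The function names are illustrative and are not part of the application.

```python
def angular_distance(a_deg, b_deg):
    """Smallest absolute angle between two directions, in degrees (0 to 180)."""
    diff = abs(a_deg - b_deg) % 360.0
    return min(diff, 360.0 - diff)


def mics_covering(direction_deg, mic_axes_deg, half_width_deg=45.0):
    """Indices of the microphones whose (assumed) pickup sector contains the direction."""
    return [
        index
        for index, axis in enumerate(mic_axes_deg)
        if angular_distance(direction_deg, axis) <= half_width_deg
    ]


# Hypothetical layout: four microphones pointing at 0, 90, 180 and 270 degrees.
mic_axes = [0.0, 90.0, 180.0, 270.0]
source_mics = mics_covering(135.0, mic_axes)   # microphones facing a source at 135 degrees
speaker_mics = mics_covering(45.0, mic_axes)   # microphones facing a loudspeaker at 45 degrees
```

With this assumed sector width, a direction lying exactly between two pointing directions is attributed to both neighboring microphones.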

For example, the gain adjusting unit 152 may be an analog amplifier, a digital amplifier, or both. It generates a gain factor for each directional microphone based on the above positional relationship and amplifies or attenuates the branch input signal collected by each directional microphone according to that gain factor, so as to enhance the power of the attention signal (for example, a voice signal from a user) while suppressing the power of the echo signal.

Below, the gain adjustment process is described for several specific scenes.

In the first scene, it is assumed that one or more signal sources are outputting attention signals and that no loudspeaker is playing a sound signal.

In this case, the comparing unit 151 may compare a first positional relationship between the one or more signal sources and the pickup area of each directional microphone. The gain adjusting unit 152 may adjust the gain of each directional microphone according to the first positional relationship, so that the power of the attention signal component received from the one or more signal sources in the total input signal is maximized.

For example, the gain adjusting unit 152 may increase the gain of the one or more directional microphones in whose pickup areas the one or more signal sources are located, so that the power of the attention signal component received from the one or more signal sources in the total input signal is maximized without any attention signal component being distorted.

Further, the gain adjusting unit 152 may also reduce the gain of the other microphones in the microphone array, that is, those other than the one or more directional microphones above, to reduce the power of noise components received from noise sources in the total input signal, or to reduce the likelihood of receiving noise components from potential noise sources. For example, the gain of the other microphones may be reduced to 0, which disables them, to reduce noise input and save power. However, a disabled microphone can no longer perform real-time detection, so the gain of the other microphones may alternatively be reduced to a predetermined value that meets a minimum energy requirement Emin, striking a balance between power saving and real-time detection.
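The first-scene adjustment can be sketched as follows. This is a hedged illustration rather than the implementation of the gain adjusting unit 152: the boost value, the floor gain standing in for the predetermined value that meets Emin, and the clipping cap used to avoid distortion are all hypothetical parameters introduced here.

```python
def first_scene_gains(n_mics, source_mics, boost=2.0, floor=0.1,
                      peak_levels=None, full_scale=1.0):
    """Per-microphone gains for the case 'signal source present, no loudspeaker playing'.

    source_mics: indices of microphones whose pickup area contains a signal source.
    floor:       gain for all other microphones (0.0 disables them; a small
                 positive value keeps them usable for real-time detection).
    peak_levels: optional measured peak amplitude per microphone, used to cap
                 the boost so that the amplified signal stays below full_scale.
    """
    gains = []
    for index in range(n_mics):
        if index in source_mics:
            gain = boost
            if peak_levels is not None and peak_levels[index] > 0:
                # Assumed distortion guard: never amplify beyond full scale.
                gain = min(gain, full_scale / peak_levels[index])
        else:
            gain = floor
        gains.append(gain)
    return gains
```

For example, `first_scene_gains(4, {1, 2})` boosts the second and third microphones and holds the others at the floor, while `floor=0.0` corresponds to disabling the other microphones.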

In the second scene, it is assumed that no signal source is outputting an attention signal and that one or more loudspeakers are playing a sound signal.

In this case, the comparing unit 151 may compare a second positional relationship between the one or more loudspeakers and the pickup area of each directional microphone. The gain adjusting unit 152 may adjust the gain of each directional microphone according to the second positional relationship, so that the power of the echo signal component received from the one or more loudspeakers in the total input signal is minimized.

For example, the gain adjusting unit 152 may reduce the gain of the one or more directional microphones in whose pickup areas the one or more loudspeakers are located. Similarly, depending on the purpose, the gain of these microphones may be reduced to 0, or it may be reduced to a predetermined value, for example one that meets Emin.

In the third scene, it is assumed that one or more signal sources outputting attention signals and one or more loudspeakers playing sound signals are present at the same time. This scene is the combination of the first scene and the second scene.

In this case, the comparing unit 151 may compare the first positional relationship between the one or more signal sources and the pickup area of each directional microphone, as well as the second positional relationship between the one or more loudspeakers and the pickup area of each directional microphone. The gain adjusting unit 152 adjusts the gain of each directional microphone according to the first positional relationship and the second positional relationship, so as to maximize the signal-to-echo ratio between the power of the attention signal component received from the one or more signal sources and the power of the echo signal component received from the one or more loudspeakers in the total input signal.

For example, the gain adjusting unit 152 may generate a first group of gains for the directional microphones, in which the gains of the one or more directional microphones in whose pickup areas the one or more signal sources are located are increased, so that the power of the attention signal component received from the one or more signal sources in the total input signal is maximized. The gain adjusting unit 152 may then generate a second group of gains for the directional microphones, in which the gains of the one or more directional microphones in whose pickup areas the one or more loudspeakers are located are reduced, so that the power of the echo signal component received from the one or more loudspeakers in the total input signal is minimized. Next, the gain adjusting unit 152 may generate a first group of weights for the first group of gains and a second group of weights for the second group of gains, so as to maximize the signal-to-echo ratio between the power of the attention signal component received from the one or more signal sources and the power of the echo signal component received from the one or more loudspeakers in the total input signal. Finally, the gain adjusting unit 152 may use the first group of gains, the first group of weights, the second group of gains, and the second group of weights to adjust the gain of each directional microphone.

Below, the gain adjustment process in the different scenes above is described in a specific example with reference to Fig. 5.

As shown in Fig. 5, the audio signal processing apparatus 100 includes a microphone array 120. The microphone array 120 includes four directional microphones MIC1 to MIC4 that share the same center point and are arranged centrosymmetrically. Assuming that MIC1 points in the reference direction of the audio signal processing apparatus 100, that is, the 0° direction, MIC2 points in the 90° direction, MIC3 in the 180° direction, and MIC4 in the 270° direction. For simplicity, assume that the audio signal processing apparatus 100 includes only one loudspeaker 110 and that the application scene includes only one signal source 200, which may be a user interacting with the smart device. The loudspeaker 110 is located in the 45° direction relative to the reference direction (that is, the 0° direction) of the audio signal processing apparatus 100. The signal source 200 is located in the 135° direction relative to the reference direction (that is, the 0° direction) of the audio signal processing apparatus 100.

For example, the apparatus may first detect the direction of the signal source (possibly of multiple signal sources) using a signal source positioning device such as a camera, and may determine the playback state of the loudspeaker, that is, whether the loudspeaker is playing sound.

On the one hand, once it is determined that a sound source (that is, a signal source) is present and no loudspeaker is playing, the apparatus enters the first scene above, namely the pure near-end single-talk mode, in which only the near end is talking. There is then no echo E, only near-end speech S, and the apparatus only needs to be configured to capture the maximum speech energy; both a single sound source and multiple sound sources are supported.

Each directional microphone is equipped with independent gain control, which may be represented, for example, by a signal gain control vector [Gs1, Gs2, ..., Gsn] (where n is the number of microphones), so as to control the sensitivity, or sound capture capability, in the direction in which the microphone points.

Then, the sound source detection device obtains the number of sound sources and their position (direction) coordinates, which may be represented, for example, by a multi-source direction vector [S1, S2, ..., Sm] (where m is the number of sound sources). Based on the number and positions of the sound sources, the algorithm adaptively computes the gain control matrix and increases the gains of the directional microphones facing the sound sources (for example, MIC2 and MIC3 in Fig. 5), so that after the multi-source signal passes through the apparatus the signal energy in the sound source directions is maximized, that is, S is maximal, without distortion. The gains of the microphones in directions without a sound source are set to zero to reduce noise.

Then, the above process may be performed cyclically: when the sound sources change (for example, in number or in position), the multi-source direction vector is adaptively updated, and the gain control vector is adaptively updated according to the maximum-SER criterion.

On the other hand, once it is determined that no sound source is present and a loudspeaker is playing, the apparatus enters the second scene above, namely the pure near-end playback mode, in which only the loudspeaker is playing. There is then only echo E and no near-end speech S, and the apparatus only needs to be configured to capture the minimum echo energy.

Each directional microphone is equipped with independent gain control, which may be represented, for example, by an echo gain control vector [Ge1, Ge2, ..., Gen] (where n is the number of microphones), so as to control the sensitivity, or sound capture capability, in the direction in which the microphone points.

Then, the echo detection device obtains the number of loudspeakers and their position (direction) coordinates, which may be represented, for example, by a multi-echo direction vector [E1, E2, ..., El] (where l is the number of loudspeakers). For example, the factory preset mode contains the position of the loudspeaker relative to the microphone array; the algorithm starts converging from this preset and adaptively finds the deviation in actual operation. Based on the number and positions of the echo sources (that is, the loudspeakers), the algorithm adaptively computes the gain control matrix and reduces the gains of the directional microphones facing the echo (for example, MIC1 and MIC2 in Fig. 5), so that after the signal passes through the apparatus the energy in the echo direction E is small; a threshold is set so that the minimum energy requirement Emin is still met. The gains of the microphones in directions without echo remain unchanged, which ensures that the apparatus can still be woken up.
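One possible reading of the threshold described above is sketched below. It assumes that the output energy of a microphone is its measured input echo energy multiplied by the square of its gain, and that the echo-facing microphones are attenuated just far enough for their output energy to fall to Emin. Both assumptions, and all names in the sketch, are illustrative and are not stated in the application.

```python
import math


def second_scene_gains(current_gains, echo_mics, echo_energy, e_min):
    """Attenuate echo-facing microphones until their output energy equals e_min.

    current_gains: gains currently applied to each microphone.
    echo_mics:     indices of microphones whose pickup area contains a loudspeaker.
    echo_energy:   measured (unamplified) echo energy per microphone.
    e_min:         assumed minimum output energy that still allows detection.
    """
    gains = list(current_gains)
    for index in echo_mics:
        if echo_energy[index] > 0:
            # Output energy = gain**2 * input energy, so solve for the gain.
            target = math.sqrt(e_min / echo_energy[index])
            gains[index] = min(gains[index], target)  # only ever reduce the gain
    return gains
```

Microphones that do not face a loudspeaker keep their current gain, which matches the requirement that the apparatus can still be woken up from those directions.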

Then, the above process may be performed cyclically: when the loudspeakers change (for example, in number or in position), the multi-echo direction vector is adaptively updated, and the gain control vector is adaptively updated according to the maximum-SER criterion.

In yet another case, once it is determined that a sound source is present and a loudspeaker is playing, the apparatus enters the third scene above, namely the near-end/far-end double-talk mode. Both echo E and near-end speech S are then present, and the apparatus needs to be configured to obtain the maximum SER, that is, the maximum ratio of S to E.

The algorithm may set a signal weighting coefficient vector [α1, α2, ..., αn] and an echo weighting coefficient vector [β1, β2, ..., βn]. The third mode is the combination of the first mode and the second mode; the two weighting coefficient vectors belong to the first mode and the second mode respectively and are used to weight the gain control vectors of the first mode and the second mode.

That is, the α vector and the β vector weight the signal gain control vector and the echo gain control vector respectively, and the optimal values of the α vector, the β vector, the Gs vector, and the Ge vector are obtained using the maximum-SER criterion.

The α vector, the β vector, the Gs vector, and the Ge vector may then be written into the processing apparatus to perform gain control and obtain the currently optimal SER performance.
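The weighting described above can be sketched as follows. The application does not spell out how the weighted vectors are combined or how the SER is evaluated, so this sketch makes two explicit assumptions: the final gain of microphone i is αi·Gsi + βi·Gei, and the signal and echo powers of the microphones add incoherently after being scaled by the squared gain. The coarse search over a single scalar weight is likewise only a stand-in for whatever optimization the maximum-SER criterion actually uses.

```python
def combine_gains(alpha, beta, gs, ge):
    """Assumed combination rule: g_i = alpha_i * Gs_i + beta_i * Ge_i."""
    return [a * s + b * e for a, s, b, e in zip(alpha, gs, beta, ge)]


def ser(gains, signal_power, echo_power):
    """Signal-to-echo ratio of the merged signal under incoherent power summation."""
    s = sum(g * g * p for g, p in zip(gains, signal_power))
    e = sum(g * g * p for g, p in zip(gains, echo_power))
    return s / e if e > 0 else float("inf")


def best_weights(gs, ge, signal_power, echo_power, steps=10):
    """Grid search over a shared scalar weight a, with alpha = a and beta = 1 - a."""
    best = None
    for k in range(steps + 1):
        a = k / steps
        alpha = [a] * len(gs)
        beta = [1.0 - a] * len(gs)
        gains = combine_gains(alpha, beta, gs, ge)
        score = ser(gains, signal_power, echo_power)
        if best is None or score > best[0]:
            best = (score, alpha, beta, gains)
    return best
```

Because the search includes the two end points (a = 0 and a = 1), the selected weights never yield a lower SER than using the first-mode or second-mode gain vector alone.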

Then, the above process may be performed cyclically: when the sound sources change (for example, in number or in position), the multi-source direction vector is adaptively updated, and the gain control vector is adaptively updated according to the maximum-SER criterion. In addition, the above parameters may be stored so that they can be read out directly when the same scene occurs again, without performing the gain and vector calculations again, which speeds up audio signal processing.
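Storing and reusing the parameters can be sketched with a small lookup table. The way a scene is identified here, by quantizing the source and loudspeaker directions to a hypothetical 5° resolution, is an assumption made for illustration; the application only states that stored parameters are read back in the same scene.

```python
class GainCache:
    """Remembers computed gain parameters per scene so they can be reused."""

    def __init__(self, resolution_deg=5.0):
        self.resolution_deg = resolution_deg
        self._bins = int(round(360.0 / resolution_deg))
        self._store = {}

    def _quantize(self, directions):
        # Directions that fall into the same angular bin count as the same scene.
        return tuple(sorted(
            int(round((d % 360.0) / self.resolution_deg)) % self._bins
            for d in directions
        ))

    def get_or_compute(self, source_dirs, speaker_dirs, compute):
        """Return cached parameters, calling compute(...) only for a new scene."""
        key = (self._quantize(source_dirs), self._quantize(speaker_dirs))
        if key not in self._store:
            self._store[key] = compute(source_dirs, speaker_dirs)
        return self._store[key]
```

The `compute` callback stands for the full gain and weight calculation; it runs only the first time a given scene is seen.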

In one embodiment, the audio signal processing apparatus 100 may further include an adaptive filter 160 for performing echo cancellation on the gain-adjusted total input signal in the time domain and/or the frequency domain according to the sound being played by the loudspeaker.

After the above gain adjustment, the branch input signals collected by the microphones, which contain the spatially enhanced attention signal component and the suppressed echo signal component, are merged by the multiplexer 130 into a single total input signal, which then passes through the echo suppression device based on adaptive filtering.

For example, because the sound signal currently being played by the loudspeaker is known, the adaptive filter 160 can remove that sound signal component (which corresponds to the echo signal component) from the input signal collected by the microphones, leaving only the attention signal component from the signal source. Obviously, the present application is not limited to this. Any adaptive filter, whether existing or developed in the future, may be applied in the audio signal processing apparatus according to the embodiments of the present application and should also fall within the protection scope of the present application.
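As one illustration of such a filter, the sketch below uses a time-domain normalized least mean squares (NLMS) filter. The application does not name a particular adaptive algorithm, so NLMS is chosen here only because it is a common example; the tap count, step size, and function name are assumptions.

```python
def nlms_echo_cancel(mic, ref, taps=4, mu=0.5, eps=1e-8):
    """Subtract an adaptive estimate of the echo of `ref` from `mic`.

    mic: samples of the (gain-adjusted) total input signal.
    ref: samples of the signal the loudspeaker is playing.
    Returns the residual signal and the final filter weights.
    """
    weights = [0.0] * taps
    history = [0.0] * taps            # most recent reference samples, newest first
    residual = []
    for d, x in zip(mic, ref):
        history = [x] + history[:-1]
        echo_estimate = sum(w * h for w, h in zip(weights, history))
        error = d - echo_estimate     # what remains after removing the estimated echo
        norm = sum(h * h for h in history) + eps
        weights = [w + mu * error * h / norm for w, h in zip(weights, history)]
        residual.append(error)
    return residual, weights
```

When the microphone signal contains only a delayed, scaled copy of the reference, the residual decays toward zero as the weights converge; with near-end speech present, the residual approximates the attention signal component.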

Finally, depending on whether the audio signal processing apparatus is a pure near-end device or a near-end/far-end device, an audio recognition operation may be performed on the filtered signal, or the filtered signal may be sent to a remote device for telecommunication purposes.

As can be seen, the audio signal processing apparatus according to the embodiments of the present application can adjust the gain of each directional microphone in the microphone array according to the position of the signal source and the position of the loudspeaker, so as to maximize the signal-to-echo ratio between the power of the attention signal component received from the signal source and the power of the echo signal component received from the loudspeaker in the total input signal collected by the microphone array. The characteristics of the directional microphone array can therefore be used to achieve lossless attention signal enhancement and echo signal suppression.

Specifically, the embodiments of the present application have the following advantages:

1. While the sound source direction is enhanced, the echo direction can be adaptively suppressed, and the gains of the directional microphone array are adaptively adjusted to obtain the maximum SER. This suppresses echo very effectively during high-volume playback and improves the intelligibility, recognition rate, and communication quality of the signal (for example, a voice signal);

2. The characteristics of the directional microphone array make it possible to enhance the attention signal, such as speech, and suppress the echo signal losslessly. Compared with beamforming algorithms for omnidirectional microphones, exploiting the inherent characteristics of the microphones better preserves voice quality and allows multiple sound sources to be enhanced simultaneously;

3. Free switching among the three modes is supported.

Exemplary audio signal processing method

The audio signal processing method according to the embodiments of the present application can be applied to the audio signal processing apparatus 100 described with reference to Figs. 1 to 5.

As shown in Fig. 6, the audio signal processing method may include:

In step S110, a branch input signal is received from each directional microphone in a microphone array, the microphone array including multiple directional microphones with different pickup areas, each directional microphone collecting, in its own pickup area, a branch input signal that includes an attention signal component from a signal source and an echo signal component from a loudspeaker;

In step S120, the branch input signals collected by the directional microphones are merged into a total input signal;

In step S130, the position of the signal source and the position of the loudspeaker are determined; and

In step S140, the gain of each directional microphone is adjusted according to the position of the signal source and the position of the loudspeaker, so as to maximize the signal-to-echo ratio between the power of the attention signal component received from the signal source and the power of the echo signal component received from the loudspeaker in the total input signal.

In one embodiment, step S130 includes: detecting whether a signal source outputting an attention signal is present in the current scene, the number of such signal sources, and their respective positions; and detecting whether a loudspeaker playing a sound signal is present in the current scene, the number of such loudspeakers, and their respective positions.

In one embodiment, detecting whether a signal source outputting an attention signal is present in the current scene, the number of such signal sources, and their respective positions includes: receiving a scene image of the current scene captured by a camera; and recognizing the signal source in the scene image, determining the number of signal sources, and determining the relative position between the signal source and the reference position of the audio signal processing apparatus.

In one embodiment, determining the relative position between the signal source and the reference position of the audio signal processing apparatus includes: determining the relative position between the signal source and the reference position of the signal source positioning device according to the position of the signal source in the scene image; and determining the relative position between the signal source and the reference position of the audio signal processing apparatus according to the registration relationship between the reference position of the signal source positioning device and the reference position of the audio signal processing apparatus.
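The two-stage conversion can be sketched as follows, assuming a pinhole camera with a known horizontal field of view and a registration relationship that reduces to a single angular offset between the camera's optical axis and the 0° direction of the apparatus. These simplifications, the sign convention (angles increase toward the right edge of the image), and the function names are all assumptions for illustration.

```python
import math


def pixel_to_camera_angle(x_pixel, image_width, hfov_deg):
    """Horizontal angle of an image column relative to the camera's optical axis."""
    # Focal length in pixels for a pinhole model with the given field of view.
    focal_px = (image_width / 2.0) / math.tan(math.radians(hfov_deg) / 2.0)
    return math.degrees(math.atan((x_pixel - image_width / 2.0) / focal_px))


def camera_to_device_angle(camera_angle_deg, camera_axis_offset_deg):
    """Apply the assumed registration offset and wrap the result into [0, 360)."""
    return (camera_angle_deg + camera_axis_offset_deg) % 360.0
```

For instance, a signal source detected at the image center with a camera whose axis is registered at 135° maps to the 135° direction of the apparatus.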

In one embodiment, detecting whether a signal source outputting an attention signal is present in the current scene, the number of such signal sources, and their respective positions includes: receiving at least two branch input signals collected by at least two directional microphones, and separating the attention signal component from the signal source from the at least two branch input signals; and determining the position of the signal source relative to the audio signal processing apparatus according to the phase of the separated attention signal component of the signal source.
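Phase-based localization is often realized through the time delay between two channels, which for a given frequency is equivalent to a phase difference. The sketch below estimates that delay by cross-correlation and converts it to an arrival angle measured from the broadside of a two-microphone pair. The microphone spacing, sample rate, and far-field geometry are hypothetical; the application does not specify them, and its directional microphones are described as sharing a center point, so this is only a generic illustration of the principle.

```python
import math


def estimate_delay(a, b, max_lag):
    """Lag in samples by which channel b trails channel a (cross-correlation peak)."""
    n = min(len(a), len(b))
    best_lag, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        start, stop = max(0, -lag), min(n, n - lag)
        score = sum(a[i] * b[i + lag] for i in range(start, stop))
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


def delay_to_angle(lag_samples, sample_rate_hz, spacing_m, speed_of_sound=343.0):
    """Far-field arrival angle in degrees for a two-microphone pair."""
    ratio = speed_of_sound * lag_samples / (sample_rate_hz * spacing_m)
    ratio = max(-1.0, min(1.0, ratio))   # guard against delays beyond the spacing
    return math.degrees(math.asin(ratio))
```

A zero delay corresponds to a source on the broadside of the pair; larger delays move the estimate toward the axis joining the two microphones.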

In one embodiment, detecting whether a loudspeaker playing a sound signal is present in the current scene, the number of such loudspeakers, and their respective positions includes: receiving at least two branch input signals collected by at least two directional microphones, and separating, from the at least two branch input signals, the signal component of interest originating from the loudspeaker; and determining the position of the loudspeaker relative to the audio signal processing apparatus according to the phase of the separated loudspeaker signal component.

In one embodiment, step S140 includes: in response to one or more signal sources outputting attention signals being present and no loudspeaker playing a sound signal being present, comparing a first positional relationship between the one or more signal sources and the pickup area of each directional microphone; and adjusting the gain of each directional microphone according to the first positional relationship, so that the power of the attention signal component received from the one or more signal sources in the total input signal is maximized.

In one embodiment, adjusting the gain of each directional microphone according to the first positional relationship includes: increasing the gain of the one or more directional microphones in whose pickup areas the one or more signal sources are located, so that the power of the attention signal component received from the one or more signal sources in the total input signal is maximized without any attention signal component being distorted.

In one embodiment, adjusting the gain of each directional microphone according to the first positional relationship further includes: reducing the gain of the other microphones in the microphone array, that is, those other than the one or more directional microphones, to reduce the power of the noise component received from noise sources in the total input signal.

In one embodiment, step S140 includes: in response to no signal source outputting an attention signal being present and one or more loudspeakers playing a sound signal being present, comparing a second positional relationship between the one or more loudspeakers and the pickup area of each directional microphone; and adjusting the gain of each directional microphone according to the second positional relationship, so that the power of the echo signal component received from the one or more loudspeakers in the total input signal is minimized.

In one embodiment, adjusting the gain of each directional microphone according to the second positional relationship includes: reducing the gain of the one or more directional microphones in whose pickup areas the one or more loudspeakers are located.

In one embodiment, step S140 includes: in response to one or more signal sources outputting attention signals and one or more loudspeakers playing sound signals being present at the same time, comparing the first positional relationship between the one or more signal sources and the pickup area of each directional microphone and the second positional relationship between the one or more loudspeakers and the pickup area of each directional microphone; and adjusting the gain of each directional microphone according to the first positional relationship and the second positional relationship, so as to maximize the signal-to-echo ratio between the power of the attention signal component received from the one or more signal sources and the power of the echo signal component received from the one or more loudspeakers in the total input signal.

In one embodiment, the audio signal processing method may further include: in step S150, performing echo cancellation on the gain-adjusted total input signal in the time domain and/or the frequency domain according to the sound being played by the loudspeaker.

The specific functions and operations of the steps in the above audio signal processing method have already been described in detail in connection with the audio signal processing apparatus 100 described with reference to Figs. 1 to 5, so their repeated description is omitted.

Example electronic device

Below, an electronic device according to the embodiments of the present application is described with reference to Fig. 7. The electronic device may be an intelligent speech recognition system (for example, a smart appliance or a robot), or a near-end device or a far-end device in a traditional voice communication system (for example, a conference system or a Voice over Internet Protocol (VoIP) system).

As shown in Fig. 7, the electronic device 10 includes one or more processors 11 and a memory 12.

The processor 11 may be a central processing unit (CPU) or another form of processing unit with data processing capability and/or instruction execution capability, and may control the other components in the electronic device 10 to perform desired functions.

The memory 12 may include one or more computer program products, which may include various forms of computer-readable storage media, such as volatile memory and/or non-volatile memory. The volatile memory may include, for example, random access memory (RAM) and/or cache memory. The non-volatile memory may include, for example, read-only memory (ROM), a hard disk, or flash memory. One or more computer program instructions may be stored on the computer-readable storage medium, and the processor 11 may run the program instructions to implement the audio signal processing methods of the embodiments of the present application described above and/or other desired functions. Information such as the position of the signal source, the position of the loudspeaker, the signal gain control vector, the echo gain control vector, the signal weighting coefficient vector, and the echo weighting coefficient vector may also be stored in the computer-readable storage medium.

In one example, the electronic device 10 may further include an input device 13 and an output device 14, and these components are interconnected through a bus system and/or another form of connection mechanism (not shown).

For example, the input device 13 may include a keyboard, a mouse, and a communication network together with the remote input devices connected to it. Alternatively or additionally, the input device 13 may be the microphone array 120 described above, which includes multiple directional microphones with different pickup areas, each directional microphone collecting a branch input signal in its own pickup area.

The output device 14 may output various information to the outside (for example, to a user), including the adjusted gain of each directional microphone and the total input signal after echo cancellation. The output device 14 may include, for example, a display, a printer, and a communication network together with the remote output devices connected to it. Alternatively or additionally, the output device 14 may be the loudspeaker 110 described above, which plays sound and may be a single loudspeaker or a loudspeaker array composed of multiple loudspeakers.

Of course, for simplicity, Fig. 7 shows only some of the components of the electronic device 10 that are relevant to the present application, and components such as buses and input/output interfaces are omitted. It should be noted that the components and structure of the electronic device 10 shown in Fig. 7 are illustrative rather than restrictive, and the electronic device 10 may have other components and structures as needed.

Illustrative computer program product and computer-readable recording medium

In addition to the above method and apparatus, an embodiment of the present application may also be a computer program product that includes computer program instructions which, when run by a processor, cause the processor to perform the steps of the audio signal processing method according to the various embodiments of the present application described in the "Exemplary audio signal processing method" part of this specification.

The program code of the computer program product for performing the operations of the embodiments of the present application may be written in any combination of one or more programming languages, including object-oriented programming languages such as Java and C++, and conventional procedural programming languages such as the "C" language or similar programming languages. The program code may execute entirely on the user's computing device, partly on the user's device, as a stand-alone software package, partly on the user's computing device and partly on a remote computing device, or entirely on a remote computing device or server.

In addition, an embodiment of the present application may also be a computer-readable storage medium on which computer program instructions are stored, the computer program instructions, when run by a processor, causing the processor to perform the steps of the audio signal processing method according to the various embodiments of the present application described in the "Exemplary audio signal processing method" part of this specification.

The computer-readable storage medium may be any combination of one or more readable media. A readable medium may be a readable signal medium or a readable storage medium. A readable storage medium may include, for example, but is not limited to, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the above. More specific examples (a non-exhaustive list) of readable storage media include: an electrical connection with one or more wires, a portable disk, a hard disk, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), an optical fiber, portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the above.

The basic principles of the present application have been described above in connection with specific embodiments. It should be noted, however, that the merits, advantages, and effects mentioned in the present application are only examples and not limitations, and they must not be regarded as prerequisites for every embodiment of the present application. In addition, the specific details disclosed above serve only as examples and to aid understanding, and are not limiting; the above details do not restrict the present application to being implemented with those specific details.

The block diagrams of the components, apparatuses, devices, and systems involved in the present application are merely illustrative examples and are not intended to require or imply that they must be connected, arranged, or configured in the manner shown in the block diagrams. As those skilled in the art will recognize, these components, apparatuses, devices, and systems may be connected, arranged, or configured in any manner. Words such as "include", "comprise", and "have" are open-ended terms meaning "including but not limited to" and may be used interchangeably with that phrase. The words "or" and "and" as used herein mean "and/or" and may be used interchangeably with it, unless the context clearly indicates otherwise. The term "such as" as used herein means "such as, but not limited to" and may be used interchangeably with that phrase.

It should also be noted that, in the apparatuses, devices, and methods of the present application, the components or steps may be decomposed and/or recombined. Such decompositions and/or recombinations should be regarded as equivalents of the present application.

The above description of the disclosed aspects is provided to enable any person skilled in the art to make or use the present application. Various modifications to these aspects will be readily apparent to those skilled in the art, and the general principles defined herein may be applied to other aspects without departing from the scope of the present application. Therefore, the present application is not intended to be limited to the aspects shown herein, but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

The above description has been presented for purposes of illustration and description. Furthermore, this description is not intended to restrict the embodiments of the present application to the forms disclosed herein. Although a number of exemplary aspects and embodiments have been discussed above, those skilled in the art will recognize certain variations, modifications, changes, additions, and sub-combinations thereof.

Claims

1. An audio signal processing apparatus, characterized in that the apparatus comprises:

a loudspeaker;

a microphone array including a plurality of directional microphones with different pickup areas, each directional microphone being used to collect a branch input signal within its own pickup area, the branch input signal including an attention signal component from a signal source and an echo signal component from the loudspeaker;

a multiplexer, electrically connected to each directional microphone, for merging the branch input signals collected by the directional microphones into a total input signal;

an auditory localization device for determining the position of the signal source and the position of the loudspeaker; and

a gain control mechanism, electrically connected to the auditory localization device and to each directional microphone, for adjusting the gain of each directional microphone according to the position of the signal source and the position of the loudspeaker, so as to maximize the signal-to-echo ratio between the power of the attention signal component received from the signal source and the power of the echo signal component received from the loudspeaker in the total input signal.
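
By way of non-limiting illustration, the signal-to-echo ratio of the total input signal can be sketched as a function of the per-microphone gains. The sketch below is not taken from the application: the names `signal_to_echo_ratio` and `best_gains`, the per-microphone amplitude lists, the assumption that components add in phase across microphones, and the brute-force search over a small gain grid are all hypothetical choices made only to show the quantity being maximized.

```python
from itertools import product


def signal_to_echo_ratio(gains, signal_amp, echo_amp, eps=1e-12):
    """Power ratio of the attention component to the echo component
    in the summed (total) input signal.

    Assumption: each component adds in phase across microphones, so the
    summed amplitude is the gain-weighted sum of per-microphone amplitudes.
    """
    s = sum(g * a for g, a in zip(gains, signal_amp))
    e = sum(g * a for g, a in zip(gains, echo_amp))
    return (s * s) / (e * e + eps)  # eps avoids division by zero


def best_gains(signal_amp, echo_amp, levels=(0.0, 0.5, 1.0)):
    """Exhaustively search a small grid of per-microphone gains
    (hypothetical levels) for the highest signal-to-echo ratio."""
    candidates = product(levels, repeat=len(signal_amp))
    return max(
        candidates,
        key=lambda g: signal_to_echo_ratio(g, signal_amp, echo_amp),
    )
```

In this toy model, a microphone whose pickup area mostly contains the loudspeaker ends up with zero gain, while the microphone facing the signal source keeps a nonzero gain. An exhaustive search is practical only for a handful of microphones and gain levels.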

2. The apparatus as claimed in claim 1, characterized in that the auditory localization device comprises:

a signal source positioning device for detecting whether the signal source outputting an attention signal is present in a current scene, the number of signal sources, and their respective positions; and

a loudspeaker positioning device for detecting whether a loudspeaker playing a sound signal is present in the current scene, the number of loudspeakers, and their respective positions.

3. The apparatus as claimed in claim 2, characterized in that the signal source positioning device comprises:

a camera for capturing a scene image of the current scene; and

an image recognition unit for recognizing the signal source in the scene image, determining the number of signal sources, and determining the relative position between the signal source and a reference position of the audio signal processing apparatus.

4. The apparatus as claimed in claim 3, characterized in that the image recognition unit determines the relative position between the signal source and a reference position of the signal source positioning device according to the position of the signal source in the scene image, and determines the relative position between the signal source and the reference position of the audio signal processing apparatus according to the registration relationship between the reference position of the signal source positioning device and the reference position of the audio signal processing apparatus.
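
By way of non-limiting illustration, the registration step can be sketched as a change of coordinates from the frame of the signal source positioning device to the frame of the audio signal processing apparatus. The application does not specify the form of the registration relationship; the planar rigid transform (a rotation `yaw_rad` plus a translation `offset`) and the function names below are assumptions made only for this sketch.

```python
import math


def to_apparatus_frame(p_cam, yaw_rad, offset):
    """Map a 2-D position from the camera frame to the apparatus frame.

    Assumption: the registration relationship is a planar rigid transform,
    i.e. a rotation by yaw_rad followed by a translation by offset.
    """
    x, y = p_cam
    c, s = math.cos(yaw_rad), math.sin(yaw_rad)
    return (c * x - s * y + offset[0], s * x + c * y + offset[1])


def bearing_deg(p):
    """Bearing of a position as seen from the frame origin, in degrees."""
    return math.degrees(math.atan2(p[1], p[0]))
```

Under these assumptions, a source seen 1 m straight ahead of a camera that is rotated by 90 degrees and shifted by 0.1 m relative to the apparatus reference position maps to approximately (0.1, 1.0) in the apparatus frame.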

5. The apparatus as claimed in claim 2, characterized in that the signal source positioning device comprises:

a signal separation unit for receiving at least two branch input signals collected by at least two directional microphones, and separating the attention signal component from the signal source out of the at least two branch input signals; and

an acoustic recognition unit for determining the relative position of the signal source with respect to the audio signal processing apparatus according to the phase of the separated attention signal component of the signal source.

6. The apparatus as claimed in claim 2, characterized in that the loudspeaker positioning device comprises:

a signal separation unit for receiving at least two branch input signals collected by at least two directional microphones, and separating the attention signal component from the loudspeaker out of the at least two branch input signals; and

an acoustic recognition unit for determining the relative position of the loudspeaker with respect to the audio signal processing apparatus according to the phase of the separated attention signal component of the loudspeaker.
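
By way of non-limiting illustration, one common way to derive a direction from phase is to convert the phase difference observed between two microphones at a single frequency into a time delay and then into an arrival angle. The claims do not specify how the phase is used; the far-field, narrowband two-microphone model, the function name `arrival_angle_deg`, and its parameters are assumptions of this sketch, which is unambiguous only when the microphone spacing is below half a wavelength.

```python
import math


def arrival_angle_deg(phase_diff_rad, freq_hz, spacing_m, speed_of_sound=343.0):
    """Estimate the arrival angle (degrees from broadside) of a plane wave
    from the phase difference between two microphones.

    Assumptions: far-field source, a single frequency, and a microphone
    spacing below half a wavelength so the phase does not wrap.
    """
    delay = phase_diff_rad / (2.0 * math.pi * freq_hz)   # seconds
    sin_theta = speed_of_sound * delay / spacing_m
    sin_theta = max(-1.0, min(1.0, sin_theta))           # guard against rounding
    return math.degrees(math.asin(sin_theta))
```

The same calculation could be applied to the separated component of a signal source or of a loudspeaker, since only the phase of the separated component enters the estimate.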

7. The apparatus as claimed in claim 1, characterized in that the gain control mechanism comprises:

a comparing unit for determining by comparison, in response to the presence of one or more signal sources outputting an attention signal and the absence of any loudspeaker playing a sound signal, a first positional relationship between the one or more signal sources and the pickup area of each directional microphone; and

a gain adjusting unit for adjusting the gain of each directional microphone according to the first positional relationship, so as to maximize the power of the attention signal component received from the one or more signal sources in the total input signal.

8. The apparatus as claimed in claim 7, characterized in that the gain adjusting unit increases the gain of the one or more directional microphones in whose pickup areas the one or more signal sources are located, such that the power of the attention signal component received from the one or more signal sources in the total input signal is maximized while no attention signal component is distorted.

9. The apparatus as claimed in claim 8, characterized in that the gain adjusting unit further reduces the gain of the microphones in the microphone array other than the one or more directional microphones, so as to reduce the power of the noise component received from a noise source in the total input signal.

10. The apparatus as claimed in claim 1, characterized in that the gain control mechanism comprises:

a comparing unit for determining by comparison, in response to the absence of any signal source outputting an attention signal and the presence of one or more loudspeakers playing a sound signal, a second positional relationship between the one or more loudspeakers and the pickup area of each directional microphone; and

a gain adjusting unit for adjusting the gain of each directional microphone according to the second positional relationship, so as to minimize the power of the echo signal component received from the one or more loudspeakers in the total input signal.

11. The apparatus as claimed in claim 10, characterized in that the gain adjusting unit reduces the gain of the one or more directional microphones in whose pickup areas the one or more loudspeakers are located.

12. The apparatus as claimed in claim 1, characterized in that the gain control mechanism comprises:

a comparing unit for determining by comparison, in response to the simultaneous presence of one or more signal sources outputting an attention signal and one or more loudspeakers playing a sound signal, a first positional relationship between the one or more signal sources and the pickup area of each directional microphone and a second positional relationship between the one or more loudspeakers and the pickup area of each directional microphone; and

a gain adjusting unit for adjusting the gain of each directional microphone according to the first positional relationship and the second positional relationship, so as to maximize the signal-to-echo ratio between the power of the attention signal component received from the one or more signal sources and the power of the echo signal component received from the one or more loudspeakers in the total input signal.
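
By way of non-limiting illustration, the three cases of claims 7 to 12 (signal source only, loudspeaker only, and both at once) can be sketched as a single position-based rule. The gain levels `HIGH`, `BASE`, and `LOW`, the description of each pickup area as a center bearing with a half-width in degrees, and the choice to attenuate a microphone whose pickup area contains both a signal source and a loudspeaker are hypothetical and are not specified in the claims.

```python
HIGH, BASE, LOW = 1.0, 0.5, 0.1  # hypothetical gain levels


def covers(center_deg, half_width_deg, bearing_deg):
    """True if a bearing falls inside a pickup area given as center +/- half-width."""
    diff = (bearing_deg - center_deg + 180.0) % 360.0 - 180.0  # wrap to [-180, 180)
    return abs(diff) <= half_width_deg


def adjust_gains(mics, source_bearings, speaker_bearings):
    """Return one gain per microphone.

    mics: list of (center_deg, half_width_deg) pickup areas.
    """
    gains = []
    for center, half_width in mics:
        hears_source = any(covers(center, half_width, b) for b in source_bearings)
        hears_speaker = any(covers(center, half_width, b) for b in speaker_bearings)
        if hears_speaker:
            gain = LOW    # attenuate microphones facing a loudspeaker (claim 11)
        elif hears_source:
            gain = HIGH   # boost microphones facing a signal source (claim 8)
        elif source_bearings:
            gain = LOW    # attenuate the remaining microphones (claim 9)
        else:
            gain = BASE   # no signal source present: leave at the base gain
        gains.append(gain)
    return gains
```

With four microphones centered at 0, 90, 180, and 270 degrees, a signal source at 10 degrees, and a loudspeaker at 170 degrees, this rule boosts only the first microphone and attenuates the rest.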

13. The apparatus as claimed in claim 1, characterized in that the apparatus further comprises:

an adaptive filter for performing echo cancellation on the gain-adjusted total input signal, in the time domain and/or the frequency domain, according to the sound played by the loudspeaker.
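
By way of non-limiting illustration, a time-domain adaptive echo canceller can be sketched with the normalized least-mean-squares (NLMS) algorithm. The claim does not name a particular adaptive algorithm; NLMS, the function name `nlms_echo_cancel`, and the parameters `taps`, `mu`, and `eps` are assumptions chosen only for this sketch.

```python
def nlms_echo_cancel(far_end, mic, taps=4, mu=0.5, eps=1e-8):
    """Remove the loudspeaker echo from the microphone signal with NLMS.

    far_end: samples played by the loudspeaker (reference signal).
    mic:     gain-adjusted total input signal containing the echo.
    Returns the residual (echo-reduced) signal, one sample per input sample.
    """
    weights = [0.0] * taps   # estimate of the echo path impulse response
    history = [0.0] * taps   # most recent far-end samples, newest first
    residual = []
    for x, d in zip(far_end, mic):
        history = [x] + history[:-1]
        echo_estimate = sum(w * h for w, h in zip(weights, history))
        err = d - echo_estimate
        norm = sum(h * h for h in history) + eps   # input power normalization
        weights = [w + mu * err * h / norm for w, h in zip(weights, history)]
        residual.append(err)
    return residual
```

The filter learns the echo path from the loudspeaker reference, so the residual shrinks as the weights converge. A frequency-domain variant would apply the same idea per frequency bin.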