CN106531179B - A multi-channel speech enhancement method with selective attention based on semantic priors - Google Patents

Info

Publication number
CN106531179B
Authority
CN
China
Prior art keywords
voice
signal
activation
activation word
target
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201510574907.3A
Other languages
Chinese (zh)
Other versions
CN106531179A (en)
Inventor
付强
王晓飞
国雁萌
颜永红
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Institute of Acoustics CAS
Original Assignee
Institute of Acoustics CAS
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Institute of Acoustics CAS
Priority to CN201510574907.3A
Publication of CN106531179A
Application granted
Publication of CN106531179B
Active
Anticipated expiration

Landscapes

  • Circuit For Audible Band Transducer (AREA)
  • Obtaining Desirable Characteristics In Audible-Bandwidth Transducers (AREA)

Abstract

The present invention provides a multi-channel speech enhancement method with selective attention based on semantic priors. The method comprises: a multi-microphone array picks up speech signals from any direction in a reverberant environment, and the multi-channel speech signals are collected and pre-processed; an activation-word speech recognition model is used to detect a specific activation word present in the pre-processed speech signals; the untrimmed signal containing the activation-word segment is processed to obtain the complete activation-word segment; the activation-word segment is analyzed with a reverberation-robust multi-channel phase-difference sound source localization method to obtain the direction of arrival of the target sound source; speech from that direction is enhanced, while noise from other directions and room reverberation in far-field (distant-talking) scenarios are suppressed, yielding the enhanced speech from the target direction. The method of the invention can be used wherever far-field voice input and interaction are needed, such as smart appliances, smart homes, in-vehicle systems and wearable devices, and is especially suited to environments with complex acoustic noise and interference.

Description

A multi-channel speech enhancement method with selective attention based on semantic priors
Technical field
The present invention relates to the field of speech processing, and in particular to a multi-channel speech enhancement method with selective attention based on semantic priors.
Background art
As voice communication and human-machine voice interaction systems become more widespread, people increasingly hope to do away with cumbersome equipment such as microphones and headsets and to communicate with machines as naturally as in human conversation. However, speech is a sound wave, and it is affected in many ways as it travels through air: the sound wave attenuates, it is reflected repeatedly by walls and obstacles (reverberation), and other sound sources and ambient noise are present at the same time. When multiple voice systems and multiple speakers share the same environment, whether the system can correctly receive the speech information further determines whether voice systems can become practical. Speech enhancement is an effective means of extracting the target speech signal in a complex noise environment, and is divided into single-channel speech enhancement and multi-channel speech enhancement.
Single-channel speech enhancement removes noise mainly by exploiting the difference between the time-frequency distributions of speech and noise. Its two key problems are noise estimation and a priori SNR estimation; the former is the key factor in reducing noise, while the latter determines the level of residual "musical noise". Single-channel enhancement algorithms can significantly improve the signal-to-noise ratio in many cases, and are particularly effective at removing stationary noise (white noise, car noise).
Multi-channel speech enhancement exploits the ability of a microphone array to pick up spatial information. By combining time-domain, frequency-domain and spatial information, it obtains spatially selective reception. In general, multi-channel speech enhancement needs prior information on the azimuth of the incoming wave in order to form a reliable steering vector, and uses spatial filtering theory to suppress interfering sound from non-target directions. Compared with single-channel speech enhancement, multi-channel speech enhancement has better noise suppression capability.
Human hearing can cope with multiple sound sources and reverberation, and can even detect and track the speech of interest while several people are talking, mainly because it has a distinct capacity for selective attention. When a person is interested in a target sound, he or she can, depending on the task and the environment, choose the features that best distinguish the target speech from the ambient sound, compare and screen them against prior knowledge, exclude interfering sounds and obtain the target speech.
In voice applications, the noise and interference that may be present in real scenarios such as the home, the vehicle and the outdoors are highly varied. Existing speech enhancement or separation methods all find it very difficult to pick up the target speech without distortion while at the same time eliminating or suppressing non-target signals, especially when multiple coherent sound sources are present simultaneously, the reverberation is strong and the signal-to-noise ratio is low.
Speech enhancement based on multiple channels (a microphone array) uses the amplitude and phase differences of the signals received by the microphones to form spatial selectivity toward the signal from the target direction, so that beamforming (Beamforming, BM) and directional speech activity detection (Directive speech activity detection, DSAD) algorithms point toward the target direction and thereby suppress or reject interfering signals from non-target directions. However, the direction of arrival (DOA) of the target sound source still cannot be known in advance. Under a single-source assumption, sound source localization (Source Location, SL) techniques can determine the DOA of the target sound source, but this assumption rarely holds in real application environments. In most cases multiple sound sources are present simultaneously and their number is unknown. In a reverberant field with room reflections the situation is more complicated still, so that the target sound source is subject to excessive noise.
Summary of the invention
The object of the present invention is to overcome the above drawbacks of current multi-channel speech enhancement methods. By combining semantics-based sound source identification with signal-processing-based sound source localization, and incorporating the "spatial filtering" property of a microphone array, it proposes a multi-channel speech enhancement method with selective attention based on semantic priors that can effectively overcome noise and interference.
To achieve the above object, the present invention provides a multi-channel speech enhancement method with selective attention based on semantic priors, the method comprising: a multi-microphone array picks up speech signals from any direction in a reverberant environment, and the multi-channel speech signals are collected and pre-processed; an activation-word speech recognition model is used to detect a specific activation word present in the pre-processed speech signals; the untrimmed signal containing the activation-word segment is processed to obtain the complete activation-word segment; the activation-word segment is processed with a reverberation-robust multi-channel phase-difference sound source localization method to obtain the direction of arrival of the target sound source; speech from that direction is enhanced, while noise from other directions and room reverberation in far-field scenarios are suppressed, yielding the enhanced speech from the target direction.
In the above technical solution, the method specifically includes:
Step 1) a multi-microphone array picks up speech signals from any direction in a reverberant environment, and multi-channel speech signals are collected;
Step 2) the multi-channel speech signals collected in step 1) are pre-processed;
Step 3) an activation-word speech recognition model is used to detect whether a specific activation word is present in the pre-processed speech signals; if the detection result is positive, the untrimmed signal containing the activation-word segment is retained and the method proceeds to step 4); otherwise, it returns to step 1);
Step 4) voice activity detection is performed on the untrimmed signal containing the activation-word segment to obtain the complete activation-word segment; the activation-word segment is analyzed with a reverberation-robust multi-channel phase-difference sound source localization method to obtain the direction of arrival of the target sound source; speech from that direction is enhanced, while residual directional noise, diffuse noise from the environment and room reverberation in far-field scenarios are suppressed, yielding the enhanced speech from the target direction.
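The control flow of steps 1) to 4) can be sketched as follows. This is only a minimal illustration: the function name `run_pipeline` and the four callables passed to it are hypothetical placeholders for the steps, not names taken from the patent.

```python
from typing import Callable, Iterable, Iterator, List, Sequence

# One block of multi-channel samples, laid out as channels x samples.
Block = Sequence[Sequence[float]]


def run_pipeline(
    blocks: Iterable[Block],
    preprocess: Callable[[Block], Block],
    detect_activation_word: Callable[[Block], bool],
    enhance: Callable[[Block], List[float]],
) -> Iterator[List[float]]:
    """Yield enhanced speech only for blocks in which the activation word is found.

    All callables are hypothetical stand-ins for steps 2) to 4).
    """
    for block in blocks:                      # step 1: multi-channel acquisition
        clean = preprocess(block)             # step 2: pre-processing
        if not detect_activation_word(clean):
            continue                          # step 3 negative: return to step 1
        yield enhance(clean)                  # step 4: VAD, localization, enhancement
```

The generator form reflects the "otherwise, return to step 1)" branch: blocks without the activation word are simply skipped and acquisition continues.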
In the above technical solution, the detailed process of step 2) is: if acoustic echo is present in the multi-channel speech signals, echo cancellation, diffuse background noise suppression and gain control are applied to the picked-up multi-channel speech signals; otherwise, only diffuse background noise suppression and gain control are applied to the multi-channel speech signals.
In the above technical solution, the detailed process in step 3) of using the activation-word speech recognition model to detect whether a specific activation word is present in the pre-processed speech signals is: a speaker-dependent or speaker-independent activation-word speech recognition model is trained from a large amount of prior activation-word data or from the data of a specific speaker; a recognition decoding strategy is used to detect the activation-word content and compute a confidence score, thereby completing the classification decision; speech recognition is combined with a keyword retrieval algorithm to detect the activation word.
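The conditional pre-processing of step 2) might look like the sketch below. The patent does not name an echo-cancellation algorithm; a normalized LMS (NLMS) adaptive filter is assumed here purely for illustration, gain control is reduced to a common peak normalization, and diffuse background noise suppression is omitted. The names `nlms_echo_cancel`, `preprocess`, `taps`, `mu` and `target_peak` are hypothetical.

```python
import numpy as np


def nlms_echo_cancel(mic, ref, taps=32, mu=0.5, eps=1e-8):
    """Remove the echo of `ref` (loudspeaker signal) from `mic` with an NLMS filter.

    NLMS is an assumed choice; `mic` and `ref` must have the same length.
    """
    mic = np.asarray(mic, dtype=float)
    ref = np.asarray(ref, dtype=float)
    w = np.zeros(taps)                            # adaptive estimate of the echo path
    padded = np.concatenate([np.zeros(taps - 1), ref])
    out = np.empty_like(mic)
    for n in range(len(mic)):
        x = padded[n:n + taps][::-1]              # latest reference samples, newest first
        e = mic[n] - w @ x                        # residual after removing estimated echo
        w += mu * e * x / (x @ x + eps)           # normalized LMS update
        out[n] = e
    return out


def preprocess(channels, ref=None, target_peak=0.5):
    """Echo-cancel each channel only if a loudspeaker reference exists, then apply gain."""
    x = np.asarray(channels, dtype=float)
    if ref is not None:                           # acoustic echo present
        x = np.stack([nlms_echo_cancel(ch, ref) for ch in x])
    peak = np.max(np.abs(x))
    # One common gain for all channels keeps inter-channel level differences intact.
    return x if peak == 0 else x * (target_peak / peak)
```

A single gain shared by all channels is used so that the amplitude and phase relations needed by the later localization step are not disturbed.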
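The confidence-based classification decision can be illustrated with a small sketch. The patent does not specify how the confidence is computed; the version below assumes a per-frame log-likelihood ratio between a keyword model and a filler (background) model, squashed to the range (0, 1). The function names, the filler model and the threshold value are all hypothetical.

```python
import math


def activation_confidence(keyword_loglik, filler_loglik, num_frames):
    """Map a per-frame log-likelihood ratio to a confidence in (0, 1).

    Assumed scoring rule; the patent only states that a confidence is computed.
    """
    llr = (keyword_loglik - filler_loglik) / max(num_frames, 1)
    return 1.0 / (1.0 + math.exp(-llr))


def is_activation_word(keyword_loglik, filler_loglik, num_frames, threshold=0.6):
    """Classification decision: accept the activation word if confidence is high enough."""
    return activation_confidence(keyword_loglik, filler_loglik, num_frames) >= threshold
```

Normalizing by the number of frames keeps the score comparable between short and long utterances of the activation word.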
In the above technical solution, step 4) specifically includes:
Step 4-1) the start point and end point of the activation word are detected by voice activity detection, giving the complete multi-channel activation-word segment;
Step 4-2) the activation-word segment is analyzed with the reverberation-robust multi-channel phase-difference sound source localization method; the direction-of-arrival information of the target sound source is obtained, i.e., the direction of the target speaker who uttered the specific semantic content; speech from that direction is enhanced according to the direction-of-arrival information;
Step 4-3) multi-channel post-filtering is used to further suppress residual directional noise, diffuse noise from the environment and room reverberation in far-field scenarios, yielding the enhanced speech from the target direction.
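The endpoint detection of step 4-1) can be sketched with a simple frame-energy detector. The patent does not say which voice activity detection algorithm is used, so the energy threshold, frame length and hangover below are illustrative assumptions, and `find_endpoints` is a hypothetical name.

```python
def find_endpoints(samples, frame_len=160, threshold=1e-3, hangover=2):
    """Return (start_sample, end_sample) of the active region, or None if no speech.

    Energy-based VAD is an assumed stand-in for the unspecified detector.
    """
    # Mean energy of each non-overlapping frame.
    energies = [
        sum(s * s for s in samples[i:i + frame_len]) / frame_len
        for i in range(0, len(samples) - frame_len + 1, frame_len)
    ]
    active = [k for k, e in enumerate(energies) if e > threshold]
    if not active:
        return None
    # Extend by a few frames on both sides so weak onsets and tails are kept.
    first = max(active[0] - hangover, 0)
    last = min(active[-1] + hangover, len(energies) - 1)
    return first * frame_len, (last + 1) * frame_len
```

In a multi-channel setting the endpoints would be found on one reference channel and the same sample range cut from every channel, so that the inter-channel phase differences used for localization stay aligned.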
In the above technical solution, step 4-2) specifically includes:
Step 4-2-1) the activation-word segment is transformed into the time-frequency domain, and at each frequency point the coherent part and the incoherent part of the signal are tracked separately;
Step 4-2-2) statistics are gathered on the time-frequency points occupied by the direct sound;
Step 4-2-3) among the time-frequency points occupied by the direct sound, the distribution of the signal time difference of arrival is obtained from the low-frequency part that is free of spatial aliasing;
Step 4-2-4) in the high-frequency part, the influence of spatial aliasing is removed using the time-difference-of-arrival information obtained at low frequencies, giving the time-difference-of-arrival information of the full band; the direction-of-arrival information is then obtained;
Step 4-2-5) speech from that direction is enhanced according to the direction-of-arrival information.
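Steps 4-2-3) and 4-2-4) can be illustrated for a single microphone pair. The sketch assumes a far-field plane wave, two microphones a distance `mic_dist` apart, a phase convention of phase = 2*pi*f*tau, and that the input bins have already been restricted to direct sound (steps 4-2-1 and 4-2-2 are not implemented). Below the frequency c / (2 * mic_dist) the phase difference is unambiguous; above it, the low-frequency estimate is used to pick the correct multiple of 2*pi. The function name and the use of the median are illustrative choices, not taken from the patent.

```python
import numpy as np


def estimate_doa(phase_diff, freqs, mic_dist, c=343.0):
    """Estimate TDOA (seconds) and DOA (degrees) for a two-microphone pair.

    phase_diff[k]: inter-channel phase difference in radians, wrapped to [-pi, pi],
    at frequency freqs[k] (Hz, strictly positive), from direct-sound bins only.
    """
    phase_diff = np.asarray(phase_diff, dtype=float)
    freqs = np.asarray(freqs, dtype=float)
    f_alias = c / (2.0 * mic_dist)                  # spatial aliasing starts here
    low = freqs < f_alias
    # Step 4-2-3: TDOA from the alias-free low band.
    tau_low = np.median(phase_diff[low] / (2 * np.pi * freqs[low]))
    # Step 4-2-4: choose the 2*pi multiple that brings each bin closest to tau_low.
    k = np.round((2 * np.pi * freqs * tau_low - phase_diff) / (2 * np.pi))
    tau_all = (phase_diff + 2 * np.pi * k) / (2 * np.pi * freqs)
    tau = float(np.median(tau_all))                 # full-band TDOA
    sin_theta = np.clip(c * tau / mic_dist, -1.0, 1.0)
    return tau, float(np.degrees(np.arcsin(sin_theta)))
```

The angle is measured from the broadside of the pair, so a source straight ahead gives 0 degrees.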
In the above technical solution, there are two ways of enhancing the speech in step 4-2-5):
First way: according to the direction-of-arrival information, a beamforming method is used to enhance the speech from the known direction and suppress coherent sound sources from other directions;
Second way: spatial target speech signal detection is performed using the known direction, accepting speech from the target region and rejecting sound sources from other directions.
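For the first way, the embodiment later names minimum variance distortionless response (MVDR) beamforming with diagonal loading. A minimal per-frequency-bin sketch is given below, using the standard closed form w = R^-1 d / (d^H R^-1 d). The linear-array steering vector, the loading convention (relative to average channel power) and all function and parameter names are assumptions made for illustration.

```python
import numpy as np


def steering_vector(freq, mic_positions, theta_deg, c=343.0):
    """Far-field steering vector for a linear array (positions in metres on one axis)."""
    delays = np.asarray(mic_positions, dtype=float) * np.sin(np.radians(theta_deg)) / c
    return np.exp(-2j * np.pi * freq * delays)


def mvdr_weights(R, d, loading=1e-2):
    """MVDR weights with diagonal loading for one frequency bin.

    R: (M, M) spatial covariance matrix; d: (M,) steering vector toward the DOA.
    loading: loading level relative to the average channel power (assumed convention).
    """
    R = np.asarray(R, dtype=complex)
    d = np.asarray(d, dtype=complex)
    M = R.shape[0]
    R_loaded = R + loading * (np.trace(R).real / M) * np.eye(M)
    Rinv_d = np.linalg.solve(R_loaded, d)           # R^-1 d without an explicit inverse
    return Rinv_d / (d.conj() @ Rinv_d)             # enforces w^H d = 1
```

The beamformer output for a bin is `np.vdot(w, x)`, i.e. w^H x, where `x` holds the microphone spectra at that bin. Diagonal loading trades a little interference rejection for robustness against steering and covariance estimation errors.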
The advantages of the present invention are:
1. the method of the invention can be used wherever far-field voice input and interaction are needed, such as smart appliances, smart homes, in-vehicle systems and wearable devices, and is especially suited to environments with complex acoustic noise and interference;
2. the method of the invention can selectively pick up the target signal and suppress interference and noise under far-field hands-free conditions.
Brief description of the drawings
Fig. 1 is a flow chart of the multi-channel speech enhancement method with selective attention based on semantic priors according to the invention;
Fig. 2 is a flow chart of spatial target speech signal detection using a known direction according to the invention.
Detailed description of the embodiments
Many features distinguish the target speech from other sounds. To make full use of such features for detection, priority should be given to the features for which the most, and the most reliable, prior knowledge is available. For example, when a loudspeaker is playing sound, sound correlated with the loudspeaker sound can be regarded as echo interference; if the semantics of the target speech are known, the semantics are an obvious distinguishing feature; if the direction of arrival (Direction of Arrival, DOA) of the target speech is known, detecting DOA information can be used to remove a large amount of irrelevant sound. By detecting and comparing the various kinds of distinguishing information, the influence of interfering sound can finally be suppressed and the target speech segment filtered out of the mixed sound.
The present invention is described in detail below with reference to the accompanying drawings.
As shown in Fig. 1, a multi-channel speech enhancement method with selective attention based on semantic priors includes:
Step 1) a multi-microphone array picks up speech signals from any direction in a reverberant environment, and multi-channel speech signals are collected;
Step 2) the multi-channel speech signals collected in step 1) are pre-processed;
If acoustic echo is present in the speech signals, echo cancellation, diffuse background noise suppression and gain control are applied to the picked-up multi-channel speech signals; otherwise, only diffuse background noise suppression and the necessary gain control are applied to the multi-channel speech signals;
Step 3) an activation-word speech recognition model is used to detect whether a specific activation word is present in the pre-processed speech signals; if the detection result is positive, the untrimmed signal containing the activation-word segment is retained and the method proceeds to step 4); otherwise, it returns to step 1);
A speaker-dependent or speaker-independent activation-word speech recognition model is trained from a large amount of prior activation-word data or from the data of a specific speaker; a recognition decoding strategy is used to detect the activation-word content and compute a confidence score, thereby completing the classification decision; speech recognition is combined with a keyword retrieval algorithm to detect the activation word.
Step 4) speech enhancement is performed on the untrimmed signal containing the activation-word segment; this specifically includes:
Step 4-1) the start point and end point of the activation word are detected by voice activity detection (VAD: Voice Activity Detection), giving the complete multi-channel activation-word segment;
Step 4-2) the activation-word segment is analyzed with the reverberation-robust multi-channel phase-difference sound source localization method; the DOA information of the target sound source is obtained, i.e., the direction of the target speaker who uttered the specific semantic content; this specifically includes:
Step 4-2-1) the activation-word segment is transformed into the time-frequency domain, and at each frequency point the coherent part and the incoherent part of the signal are tracked separately;
Step 4-2-2) statistics are gathered on the time-frequency points occupied by the direct sound;
Step 4-2-3) among the time-frequency points occupied by the direct sound, the distribution of the time difference of arrival (TDOA: Time Difference Of Arrival) is obtained from the low-frequency part that is free of spatial aliasing;
Step 4-2-4) in the high-frequency part, the influence of spatial aliasing is removed using the time-difference-of-arrival information obtained at low frequencies, giving the TDOA of the full-band signal; the DOA information is then obtained;
Step 4-2-5) speech from the known direction is enhanced according to the DOA information; in step 4-2-5) there are two ways of enhancing the speech from the known direction:
First way: according to the DOA information, a beamforming method is used to enhance the speech from the known direction and suppress coherent sound sources from other directions;
In this embodiment, a multi-channel minimum variance distortionless response beamforming method based on diagonal loading (Diagonal Loading) is used to suppress coherent sound sources from other directions. In other embodiments, directional interference suppression may also be realized with subband-based blind source separation (Blind Source Separation).
Second way: spatial target speech signal detection (DSAD) is performed using the known direction, accepting speech from the target region and rejecting sound sources from other directions.
As shown in Fig. 2, taking dual-channel DSAD as an example, a decision is made at each time-frequency point using the beam-to-reference energy ratio (Beam-to-Reference Ratio, BRR) and the signal-to-noise ratio (SNR). The decision threshold for the BRR is combined with a tracking mechanism for the direct-to-reverberant energy ratio (Direct-to-Reverberate Ratio, DRR), so that the detection threshold at each time-frequency point adapts to the environment and the likelihood estimate at each time-frequency point becomes more accurate. A sidelobe suppression mechanism reduces the influence of high-frequency aliasing, which in turn improves the accuracy of the full-band decision.
Step 4-3) multi-channel post-filtering is used to further suppress residual directional noise, diffuse noise from the environment and room reverberation in far-field scenarios, yielding the enhanced speech.
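The dual-channel DSAD decision described for Fig. 2 can be sketched per time-frequency point. The patent does not define how the BRR is computed, so the version below is an assumption: the "beam" is a two-microphone delay-and-sum output steered at the known direction and the "reference" is the first microphone. The DRR-adaptive threshold and the sidelobe suppression mechanism are not implemented; `brr_threshold` may be passed as a per-bin array to emulate an adaptive threshold. All names and default values are hypothetical.

```python
import numpy as np


def dsad_mask(X1, X2, steer, snr_db, brr_threshold=0.8, snr_threshold_db=3.0):
    """Boolean mask of time-frequency points accepted as target-direction speech.

    X1, X2: complex spectra of the two microphones (same shape).
    steer: expected phase factor X2 / X1 for the target direction (scalar or array).
    snr_db: estimated SNR per time-frequency point, in dB.
    """
    X1 = np.asarray(X1, dtype=complex)
    X2 = np.asarray(X2, dtype=complex)
    beam = 0.5 * (X1 + X2 * np.conj(steer))              # delay-and-sum toward the target
    brr = np.abs(beam) ** 2 / (np.abs(X1) ** 2 + 1e-12)  # assumed BRR definition
    return (brr >= brr_threshold) & (np.asarray(snr_db) >= snr_threshold_db)
```

With this definition a source exactly in the target direction gives a BRR close to 1, while sources from other directions add incoherently in the beam and give lower values, so the mask keeps only points that pass both the BRR and the SNR test.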

Claims (5)

1. A multi-channel speech enhancement method with selective attention based on semantic priors, the method comprising: a multi-microphone array picks up speech signals from any direction in a reverberant environment, and the multi-channel speech signals are collected and pre-processed; an activation-word speech recognition model is used to detect a specific activation word present in the pre-processed speech signals; the untrimmed signal containing the activation-word segment is processed to obtain the complete activation-word segment; the activation-word segment is analyzed with a reverberation-robust multi-channel phase-difference sound source localization method to obtain the direction of arrival of the target sound source; speech from that direction is enhanced, while noise from other directions and room reverberation in far-field scenarios are suppressed, yielding the enhanced speech from the target direction;
The method specifically includes:
Step 1) a multi-microphone array picks up speech signals from any direction in a reverberant environment, and multi-channel speech signals are collected;
Step 2) the multi-channel speech signals collected in step 1) are pre-processed;
Step 3) an activation-word speech recognition model is used to detect whether a specific activation word is present in the pre-processed speech signals; if the detection result is positive, the untrimmed signal containing the activation-word segment is retained and the method proceeds to step 4); otherwise, it returns to step 1);
Step 4) voice activity detection is performed on the untrimmed signal containing the activation-word segment to obtain the complete activation-word segment; the activation-word segment is analyzed with the reverberation-robust multi-channel phase-difference sound source localization method to obtain the direction of arrival of the target sound source; speech from that direction is enhanced, while residual directional noise, diffuse noise from the environment and room reverberation in far-field scenarios are suppressed, yielding the enhanced speech from the target direction;
The detailed process in step 3) of using the activation-word speech recognition model to detect whether a specific activation word is present in the pre-processed speech signals is: a speaker-dependent or speaker-independent activation-word speech recognition model is trained from a large amount of prior activation-word data or from the data of a specific speaker; a recognition decoding strategy is used to detect the activation-word content and compute a confidence score, thereby completing the classification decision; speech recognition is combined with a keyword retrieval algorithm to detect the activation word.
2. The multi-channel speech enhancement method with selective attention based on semantic priors according to claim 1, characterized in that the detailed process of step 2) is: if acoustic echo is present in the multi-channel speech signals, echo cancellation, diffuse background noise suppression and gain control are applied to the picked-up multi-channel speech signals; otherwise, only diffuse background noise suppression and gain control are applied to the multi-channel speech signals.
3. The multi-channel speech enhancement method with selective attention based on semantic priors according to claim 1, characterized in that step 4) specifically includes:
Step 4-1) the start point and end point of the activation word are detected by voice activity detection, giving the complete multi-channel activation-word segment;
Step 4-2) the activation-word segment is analyzed with the reverberation-robust multi-channel phase-difference sound source localization method; the direction-of-arrival information of the target sound source is obtained, i.e., the direction of the target speaker who uttered the specific semantic content; speech from that direction is enhanced according to the direction-of-arrival information;
Step 4-3) multi-channel post-filtering is used to further suppress residual directional noise, diffuse noise from the environment and room reverberation in far-field scenarios, yielding the enhanced speech from the target direction.
4. The multi-channel speech enhancement method with selective attention based on semantic priors according to claim 3, characterized in that step 4-2) specifically includes:
Step 4-2-1) the activation-word segment is transformed into the time-frequency domain, and at each frequency point the coherent part and the incoherent part of the signal are tracked separately;
Step 4-2-2) statistics are gathered on the time-frequency points occupied by the direct sound;
Step 4-2-3) among the time-frequency points occupied by the direct sound, the distribution of the signal time difference of arrival is obtained from the low-frequency part that is free of spatial aliasing;
Step 4-2-4) in the high-frequency part, the influence of spatial aliasing is removed using the time-difference-of-arrival information obtained at low frequencies, giving the time-difference-of-arrival information of the full band; the direction-of-arrival information is then obtained;
Step 4-2-5) speech from that direction is enhanced according to the direction-of-arrival information.
5. The multi-channel speech enhancement method with selective attention based on semantic priors according to claim 4, characterized in that there are two ways of enhancing the speech in step 4-2-5):
First way: according to the direction-of-arrival information, a beamforming method is used to enhance the speech from the known direction and suppress coherent sound sources from other directions;
Second way: spatial target speech signal detection is performed using the known direction, accepting speech from the target region and rejecting sound sources from other directions.
CN201510574907.3A 2015-09-10 2015-09-10 A kind of multi-channel speech enhancement method of the selective attention based on semantic priori Active CN106531179B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201510574907.3A CN106531179B (en) 2015-09-10 2015-09-10 A kind of multi-channel speech enhancement method of the selective attention based on semantic priori

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201510574907.3A CN106531179B (en) 2015-09-10 2015-09-10 A kind of multi-channel speech enhancement method of the selective attention based on semantic priori

Publications (2)

Publication Number Publication Date
CN106531179A 2017-03-22
CN106531179B 2019-08-20

Family

ID=58346225

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201510574907.3A Active CN106531179B (en) 2015-09-10 2015-09-10 A kind of multi-channel speech enhancement method of the selective attention based on semantic priori

Country Status (1)

Country Link
CN (1) CN106531179B (en)

Also Published As

Publication number Publication date
CN106531179A (en) 2017-03-22

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant