CN106531179B - Multi-channel speech enhancement method with selective attention based on a semantic prior - Google Patents
Multi-channel speech enhancement method with selective attention based on a semantic prior
- Publication number
- CN106531179B (application CN201510574907.3A)
- Authority
- CN
- China
- Prior art keywords
- voice
- signal
- activation
- activation word
- target
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Landscapes
- Circuit For Audible Band Transducer (AREA)
- Obtaining Desirable Characteristics In Audible-Bandwidth Transducers (AREA)
Abstract
The present invention provides a multi-channel speech enhancement method with selective attention based on a semantic prior. The method comprises: picking up speech signals arriving from any direction in a reverberant environment with a multi-microphone array, acquiring the multi-channel speech signal and pre-processing it; detecting a specific activation word in the pre-processed speech signal with an activation-word speech recognition model; processing the uncut signal that contains the activation-word segment to obtain the complete activation-word segment; analyzing the activation-word segment with a reverberation-robust multi-channel phase-difference sound source localization method to obtain the direction of arrival of the target sound source; and enhancing the speech from that direction while suppressing noise from other directions and the room reverberation of the far-field scenario, thereby obtaining the enhanced speech of the target direction. The method can be used wherever far-field voice input and interaction are required, such as smart appliances, smart homes, in-vehicle systems and wearable devices, and is particularly suited to environments with complex acoustic noise and interference.
Description
Technical field
The present invention relates to the field of speech processing, and in particular to a multi-channel speech enhancement method with selective attention based on a semantic prior.
Background art
As voice communication and human-machine voice interaction systems become ever more widespread, people increasingly expect to do away with cumbersome equipment such as microphones and earphones and to interact with machines by voice as naturally as in human conversation. Speech, however, is a sound wave and is affected in many ways as it propagates through air: the wave attenuates, walls and obstacles cause multiple reflections (reverberation), and other sound sources and ambient noise are present at the same time. When several voice systems and several talkers share the same environment, whether a system receives the speech information correctly largely determines whether voice systems become practical. Speech enhancement is an effective means of extracting the target speech signal in a complex noisy environment, and is divided into single-channel and multi-channel speech enhancement.
Single-channel speech enhancement removes noise mainly by exploiting the different time-frequency distributions of speech and noise. Its two key problems are noise estimation and a priori SNR estimation: the former is the key factor in reducing noise, while the latter governs the amount of residual "musical noise". In many cases single-channel enhancement algorithms improve the signal-to-noise ratio significantly, and they are especially effective at removing stationary noise (white noise, vehicle noise).
Multi-channel speech enhancement exploits the ability of a microphone array to capture spatial information; by combining time-domain, frequency-domain and spatial information it achieves spatially selective reception. In general, multi-channel speech enhancement needs a priori knowledge of the arrival azimuth in order to form a reliable steering vector, and uses spatial filtering theory to suppress interfering sound from non-target directions. Compared with single-channel speech enhancement, it offers better noise suppression.
The reason human hearing can cope with multiple sound sources and reverberation, and can even detect and track the voice of interest while several people are talking, is mainly that it has a specific capacity for selective attention. When a person is interested in a certain target sound, they can, depending on the task and environment, select the features that best distinguish the target sound from the ambient sound, compare and screen them against prior knowledge, exclude the interfering sounds and obtain the target sound.
In real voice-application scenarios such as the home, the vehicle and outdoors, the noise and interference that may be present are highly varied. Existing speech enhancement and separation methods all struggle to pick up the target speech without distortion while simultaneously removing or suppressing non-target signals, especially when multiple coherent sources are present at once, reverberation is strong and the signal-to-noise ratio is low.
Speech enhancement based on multiple channels (a microphone array) uses the amplitude and phase differences of the signals received by the microphones to form spatial selectivity towards the target direction, so that beamforming (BM) and directive speech activity detection (DSAD) algorithms point at the target direction and suppress or reject interfering signals from non-target directions. The direction of arrival (DOA) of the target source, however, still cannot be known in advance. Under a single-source assumption, the DOA of the target source can be determined with source localization (SL) techniques, but in real application environments this assumption is hard to satisfy. In most cases several sound sources are present simultaneously and their number is unknown. In a reverberant field with room reflections the situation is more complicated still, and the target source is heavily contaminated by noise.
Summary of the invention
The object of the present invention is to overcome the above drawbacks of current multi-channel speech enhancement methods. It combines semantics-based sound source identification with signal-processing-based sound source localization, incorporates the "spatial filtering" property of a microphone array, and proposes a multi-channel speech enhancement method with selective attention based on a semantic prior that effectively overcomes noise and interference.
To achieve the above object, the present invention provides a multi-channel speech enhancement method with selective attention based on a semantic prior. The method comprises: picking up speech signals from any direction in a reverberant environment with a multi-microphone array, acquiring the multi-channel speech signal and pre-processing it; detecting a specific activation word in the pre-processed speech signal with an activation-word speech recognition model; processing the uncut signal that contains the activation-word segment to obtain the complete activation-word segment; processing the activation-word segment with a reverberation-robust multi-channel phase-difference sound source localization method to obtain the direction of arrival of the target source; and enhancing the speech from that direction while suppressing noise from other directions and far-field room reverberation, thereby obtaining the enhanced speech of the target direction.
In the above technical solution, the method specifically comprises:
Step 1) a multi-microphone array picks up speech signals from any direction in a reverberant environment and acquires a multi-channel speech signal;
Step 2) the multi-channel speech signal acquired in step 1) is pre-processed;
Step 3) the activation-word speech recognition model is used to detect whether a specific activation word is present in the pre-processed speech signal; if the result is positive, the uncut signal containing the activation-word segment is retained and the method proceeds to step 4); otherwise it returns to step 1);
Step 4) voice activity detection is performed on the uncut signal containing the activation-word segment to obtain the complete activation-word segment; the activation-word segment is analyzed with the reverberation-robust multi-channel phase-difference sound source localization method to obtain the direction of arrival of the target source; the speech from that direction is enhanced, and residual directional noise, diffuse noise from the environment and far-field room reverberation are suppressed, yielding the enhanced speech of the target direction.
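The control flow of steps 1) to 4) can be sketched as a loop that keeps a short history of multi-channel frames, so that the uncut signal is still available once the activation word has been detected. This is only an illustrative sketch: `run_front_end`, `detect_activation`, `enhance` and `history_frames` are hypothetical names that do not appear in the patent, and the patent does not say how the uncut signal is buffered.

```python
from collections import deque

import numpy as np


def run_front_end(frames, detect_activation, enhance, history_frames=100):
    """Steps 1) to 4) as a loop over pre-processed multi-channel frames.

    frames            -- iterable of (n_mics, frame_len) arrays (output of step 2)
    detect_activation -- hypothetical callable: frame -> bool (step 3)
    enhance           -- hypothetical callable: (n_mics, n_samples) array -> result (step 4)
    history_frames    -- how many recent frames are retained as the "uncut" signal
    """
    history = deque(maxlen=history_frames)  # oldest frames drop out automatically
    for frame in frames:
        history.append(frame)
        if detect_activation(frame):
            # Keep every channel: step 4 needs the inter-channel phase differences.
            uncut = np.concatenate(list(history), axis=1)
            return enhance(uncut)
    return None  # no activation word heard; the caller keeps acquiring (step 1)
```

Retaining all channels, rather than a mono mix, matters here because the localization in step 4) works on phase differences between microphones.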
In the above technical solution, step 2) proceeds as follows: if acoustic echo is present in the multi-channel speech signal, echo cancellation, diffuse background noise suppression and gain control are applied to the picked-up multi-channel speech signal; otherwise only diffuse background noise suppression and gain control are applied.
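A minimal sketch of this branch is given below. The patent does not name an echo cancellation algorithm; a normalized LMS (NLMS) adaptive filter is used here purely as a common stand-in. The noise suppression stage is left out, and the function names, `taps`, `mu` and `target_rms` are illustrative assumptions.

```python
import numpy as np


def nlms_echo_cancel(mic, ref, taps=64, mu=0.5, eps=1e-8):
    """Remove from `mic` the component linearly predictable from the loudspeaker signal `ref`."""
    w = np.zeros(taps)      # adaptive estimate of the echo path
    buf = np.zeros(taps)    # most recent loudspeaker samples, newest first
    out = np.empty(len(mic))
    for n in range(len(mic)):
        buf = np.roll(buf, 1)
        buf[0] = ref[n]
        err = mic[n] - w @ buf                    # echo-cancelled sample
        w += mu * err * buf / (buf @ buf + eps)   # normalized LMS update
        out[n] = err
    return out


def preprocess(channels, ref=None, target_rms=0.1):
    """channels: (n_mics, n_samples); ref: loudspeaker signal, or None if there is no echo."""
    x = np.asarray(channels, dtype=float)
    if ref is not None:
        x = np.stack([nlms_echo_cancel(ch, ref) for ch in x])
    # Diffuse background noise suppression would be applied here (omitted in this sketch).
    # One common gain for all channels keeps inter-channel level and phase relations intact.
    gain = target_rms / (np.sqrt(np.mean(x ** 2)) + 1e-12)
    return gain * x
```

Applying a single gain to all channels is a deliberate choice in this sketch: per-channel gains would alter the amplitude differences that later stages may rely on.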
In the above technical solution, detecting in step 3) whether a specific activation word is present in the pre-processed speech signal proceeds as follows: a speaker-dependent or speaker-independent activation-word speech recognition model is trained from a large amount of prior activation-word data or from the data of a specific speaker; a recognition decoding strategy is used to detect the activation-word content and compute a confidence score, which completes the classification decision; speech recognition and keyword-spotting algorithms are thus combined to detect the activation word.
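The patent does not state how the confidence score is computed. One common choice in keyword spotting, assumed here only for illustration, is a duration-normalized log-likelihood ratio between the decoder score of the activation-word path and that of a competing filler (background) path, compared against a threshold. All names and the threshold value are hypothetical.

```python
def activation_confidence(keyword_loglik, filler_loglik, n_frames):
    """Per-frame log-likelihood ratio between the activation-word path and a filler path."""
    return (keyword_loglik - filler_loglik) / max(n_frames, 1)


def is_activation_word(keyword_loglik, filler_loglik, n_frames, threshold=0.5):
    """Classification decision: accept only if the confidence reaches the threshold."""
    return activation_confidence(keyword_loglik, filler_loglik, n_frames) >= threshold
```

Normalizing by the number of frames keeps the threshold comparable between short and long utterances of the activation word.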
In the above technical solution, step 4) specifically comprises:
Step 4-1) voice activity detection detects the start point and end point of the activation word, yielding the complete multi-channel activation-word segment;
Step 4-2) the activation-word segment is analyzed with the reverberation-robust multi-channel phase-difference sound source localization method; the direction-of-arrival information of the target source is obtained, that is, the direction of the target speaker who uttered the specific semantic content; the speech from that direction is enhanced according to the direction-of-arrival information;
Step 4-3) multi-channel post-filtering further suppresses residual directional noise, diffuse noise from the environment and far-field room reverberation, yielding the enhanced speech of the target direction.
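Step 4-1) can be illustrated with a simple energy-based endpoint detector. The patent only says that voice activity detection is used and does not specify the detector, so the relative energy threshold, the hangover and every parameter name below are assumptions of this sketch.

```python
import numpy as np


def find_endpoints(x, frame_len=256, hop=128, threshold_db=-30.0, hangover=3):
    """Return (start_sample, end_sample) of the active region of x, or None.

    x: (n_mics, n_samples) multi-channel signal that contains the activation word.
    """
    x = np.asarray(x, dtype=float)
    n_frames = 1 + (x.shape[1] - frame_len) // hop
    if n_frames < 1:
        return None
    # Frame energy averaged over all microphones.
    energy = np.array([np.mean(x[:, i * hop:i * hop + frame_len] ** 2)
                       for i in range(n_frames)])
    level_db = 10.0 * np.log10(energy / (energy.max() + 1e-12) + 1e-12)
    active = np.flatnonzero(level_db > threshold_db)
    if active.size == 0:
        return None
    # Hangover frames protect weak onsets and trailing sounds from being clipped.
    first = max(int(active[0]) - hangover, 0)
    last = min(int(active[-1]) + hangover, n_frames - 1)
    return first * hop, last * hop + frame_len
```

The returned sample range can be used to slice all channels at once, for example `segment = x[:, start:end]`, which gives the multi-channel activation-word segment passed on to step 4-2).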
In the above technical solution, step 4-2) specifically comprises:
Step 4-2-1) the activation-word segment is transformed to the time-frequency domain, and at each frequency bin the coherent part and the incoherent part of the signal are tracked separately;
Step 4-2-2) the time-frequency points dominated by direct sound are identified;
Step 4-2-3) among the time-frequency points dominated by direct sound, the distribution of the signal's time difference of arrival is obtained from the low-frequency part that is free of spatial aliasing;
Step 4-2-4) in the high-frequency part, the time-difference-of-arrival information obtained at low frequencies is used to remove the effect of spatial aliasing, giving full-band time-difference-of-arrival information, from which the direction-of-arrival information is then obtained;
Step 4-2-5) the speech from that direction is enhanced according to the direction-of-arrival information.
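The idea behind steps 4-2-1) to 4-2-4) can be sketched for a single microphone pair. For spacing d and sound speed c, the inter-channel phase cannot wrap below f = c / (2d), so low-frequency bins give an unambiguous coarse delay that is then used to unwrap the high-frequency bins. The sketch makes several simplifying assumptions that are not in the patent: it averages over all frames instead of tracking each time-frequency point, it replaces the coherent/incoherent tracking with a plain coherence threshold `coh_min`, and it summarizes the TDOA distribution by its median. If no low-frequency bin passes the coherence test the result is undefined.

```python
import numpy as np


def stft(x, n_fft=512, hop=256):
    """Hann-windowed short-time Fourier transform, shape (n_frames, n_fft // 2 + 1)."""
    win = np.hanning(n_fft)
    n_frames = 1 + (len(x) - n_fft) // hop
    frames = np.stack([x[i * hop:i * hop + n_fft] * win for i in range(n_frames)])
    return np.fft.rfft(frames, axis=1)


def estimate_tdoa_doa(x1, x2, fs, spacing, c=343.0, n_fft=512, hop=256, coh_min=0.9):
    """Return (tdoa_seconds, doa_degrees) for a two-microphone pair.

    tdoa > 0 means the sound reaches microphone 1 first; the DOA is measured from broadside.
    """
    X1, X2 = stft(x1, n_fft, hop), stft(x2, n_fft, hop)
    freqs = np.fft.rfftfreq(n_fft, 1.0 / fs)

    # Steps 4-2-1 / 4-2-2 (simplified): keep bins whose inter-channel coherence is high.
    cross = np.mean(X1 * np.conj(X2), axis=0)
    power = np.mean(np.abs(X1) ** 2, axis=0) * np.mean(np.abs(X2) ** 2, axis=0)
    coherence = np.abs(cross) ** 2 / (power + 1e-20)
    reliable = (coherence > coh_min) & (freqs > 0)
    phase = np.angle(cross)  # equals 2*pi*f*tdoa, wrapped into (-pi, pi]

    # Step 4-2-3: below c / (2 * spacing) the phase cannot wrap, so it maps directly to a delay.
    low = reliable & (freqs < c / (2.0 * spacing))
    coarse = np.median(phase[low] / (2.0 * np.pi * freqs[low]))

    # Step 4-2-4: choose the 2*pi multiple that agrees with the coarse delay, then use all bins.
    f, p = freqs[reliable], phase[reliable]
    wraps = np.round((2.0 * np.pi * f * coarse - p) / (2.0 * np.pi))
    tdoa = np.median((p + 2.0 * np.pi * wraps) / (2.0 * np.pi * f))

    sin_theta = np.clip(c * tdoa / spacing, -1.0, 1.0)
    return tdoa, np.degrees(np.arcsin(sin_theta))
```

With a 0.1 m spacing, for example, the aliasing-free band ends near 1.7 kHz, so most of the speech bandwidth depends on the unwrapping step.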
In the above technical solution, the speech can be enhanced in step 4-2-5) in two ways:
First way: according to the direction-of-arrival information, the speech from the known direction is enhanced by beamforming, suppressing coherent sources from other directions;
Second way: the known direction is used for spatial target speech signal detection, accepting speech from the target region and rejecting sources from other directions.
The advantages of the present invention are:
1. The method of the invention can be used wherever far-field voice input and interaction are required, such as smart appliances, smart homes, in-vehicle systems and wearable devices, and is particularly suited to environments with complex acoustic noise and interference;
2. The method of the invention can selectively pick up the target signal under far-field hands-free conditions while suppressing interference and noise.
Brief description of the drawings
Fig. 1 is a flow chart of the multi-channel speech enhancement method with selective attention based on a semantic prior according to the invention;
Fig. 2 is a flow chart of spatial target speech signal detection using a known direction according to the invention.
Detailed description of the embodiments
Many features distinguish the target speech from other sounds. To make full use of such features for detection, priority should be given to the features for which the most, and the most reliable, prior knowledge is available. For example, when a loudspeaker is playing sound, any sound correlated with the loudspeaker signal can be regarded as echo interference; if the semantic content of the target speech is known, the semantics are an obvious distinguishing feature; and if the direction of arrival (DOA) of the target speech is known, detecting DOA information can be used to remove a large amount of unrelated sound. By detecting and comparing these distinguishing cues, the influence of interfering sound can ultimately be suppressed and the target speech segments filtered out of the mixture.
The present invention is described in detail below with reference to the accompanying drawings.
As shown in Fig. 1, a multi-channel speech enhancement method with selective attention based on a semantic prior comprises:
Step 1) a multi-microphone array picks up speech signals from any direction in a reverberant environment and acquires a multi-channel speech signal;
Step 2) the multi-channel speech signal acquired in step 1) is pre-processed;
If acoustic echo is present in the speech signal, echo cancellation, diffuse background noise suppression and gain control are applied to the picked-up multi-channel speech signal; otherwise only diffuse background noise suppression and the necessary gain control are applied;
Step 3) the activation-word speech recognition model is used to detect whether a specific activation word is present in the pre-processed speech signal; if the result is positive, the uncut signal containing the activation-word segment is retained and the method proceeds to step 4); otherwise it returns to step 1);
A speaker-dependent or speaker-independent activation-word speech recognition model is trained from a large amount of prior activation-word data or from the data of a specific speaker; a recognition decoding strategy is used to detect the activation-word content and compute a confidence score, which completes the classification decision; speech recognition and keyword-spotting algorithms are thus combined to detect the activation word.
Step 4) speech enhancement is performed on the uncut signal containing the activation-word segment; specifically:
Step 4-1) voice activity detection (VAD) detects the start point and end point of the activation word, yielding the complete multi-channel activation-word segment;
Step 4-2) the activation-word segment is analyzed with the reverberation-robust multi-channel phase-difference sound source localization method; the DOA information of the target source is obtained, that is, the direction of the target speaker who uttered the specific semantic content; specifically:
Step 4-2-1) the activation-word segment is transformed to the time-frequency domain, and at each frequency bin the coherent part and the incoherent part of the signal are tracked separately;
Step 4-2-2) the time-frequency points dominated by direct sound are identified;
Step 4-2-3) among the time-frequency points dominated by direct sound, the distribution of the time difference of arrival (TDOA) is obtained from the low-frequency part that is free of spatial aliasing;
Step 4-2-4) in the high-frequency part, the TDOA information obtained at low frequencies is used to remove the effect of spatial aliasing, giving the full-band TDOA of the signal, from which the DOA information is then obtained;
Step 4-2-5) the speech from the known direction is enhanced according to the DOA information; in step 4-2-5) the speech from the known direction can be enhanced in two ways:
First way: according to the DOA information, the speech from the known direction is enhanced by beamforming, suppressing coherent sources from other directions;
In this embodiment, a multi-channel minimum variance distortionless response (MVDR) beamforming method with diagonal loading is used to suppress coherent sources from other directions. In other embodiments, directional interference can also be suppressed with subband-based blind source separation (BSS).
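For one frequency bin, an MVDR beamformer with diagonal loading can be sketched as follows. The weights are w = (R + εI)⁻¹ d / (dᴴ (R + εI)⁻¹ d), where R is the spatial covariance matrix of the microphone signals and d is the steering vector for the estimated DOA. The uniform linear array geometry, the way the loading is scaled by the average channel power and all function and parameter names are assumptions of this sketch; the patent does not give these details.

```python
import numpy as np


def steering_vector(freq, doa_deg, n_mics, spacing, c=343.0):
    """Far-field steering vector of a uniform linear array (DOA measured from broadside)."""
    delays = np.arange(n_mics) * spacing * np.sin(np.deg2rad(doa_deg)) / c
    return np.exp(-2j * np.pi * freq * delays)


def mvdr_weights(R, d, loading=1e-2):
    """MVDR weights for covariance R (n_mics x n_mics) and steering vector d.

    Diagonal loading, scaled by the average channel power, keeps the inverse well
    conditioned and makes the beamformer more tolerant of DOA errors.
    """
    n = R.shape[0]
    R_loaded = R + loading * np.real(np.trace(R)) / n * np.eye(n)
    r_inv_d = np.linalg.solve(R_loaded, d)
    return r_inv_d / (d.conj() @ r_inv_d)
```

The beamformer output for a snapshot `x` of that bin is `w.conj() @ x`; the target direction passes with unit gain while sources that dominate R from other directions are attenuated.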
Second way: the known direction is used for spatial target speech signal detection (DSAD), accepting speech from the target region and rejecting sources from other directions.
As shown in Fig. 2, taking two-channel DSAD as an example, a decision is made at each time-frequency point using the beam-to-reference ratio (BRR) and the signal-to-noise ratio (SNR). The BRR decision threshold is combined with a mechanism that tracks the direct-to-reverberant energy ratio (DRR), so that the detection threshold of each time-frequency point adapts to the environment and the likelihood estimate at each time-frequency point becomes more accurate. A sidelobe suppression mechanism reduces the influence of high-frequency aliasing, which in turn improves the accuracy of the full-band decision.
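A two-channel decision of this kind can be sketched as below. The beam is a delay-and-sum output steered to the target TDOA, the reference is the mean power of the two microphones, and a time-frequency point is accepted when both the BRR and the SNR exceed a threshold. The patent adapts the BRR threshold through DRR tracking and adds sidelobe suppression; this sketch uses fixed thresholds and omits both, and the exact BRR definition, `brr_min`, `snr_min_db` and `noise_psd` are assumptions.

```python
import numpy as np


def dsad_mask(X1, X2, freqs, tau, noise_psd, brr_min=0.8, snr_min_db=3.0):
    """Boolean mask of time-frequency points attributed to the target direction.

    X1, X2    -- STFTs of the two microphones, shape (n_frames, n_bins)
    freqs     -- bin frequencies in Hz, shape (n_bins,)
    tau       -- target TDOA in seconds (delay of microphone 2 relative to microphone 1)
    noise_psd -- noise power estimate per bin, shape (n_bins,)
    """
    align = np.exp(2j * np.pi * freqs * tau)      # undoes the target delay on channel 2
    beam = 0.5 * (X1 + X2 * align)                # delay-and-sum beam towards the target
    ref_power = 0.5 * (np.abs(X1) ** 2 + np.abs(X2) ** 2)
    brr = np.abs(beam) ** 2 / (ref_power + 1e-20)  # close to 1 for the target direction
    snr_db = 10.0 * np.log10(ref_power / (noise_psd + 1e-20) + 1e-20)
    return (brr > brr_min) & (snr_db > snr_min_db)
```

For a source away from the target direction the two aligned channels partly cancel, so the BRR drops below 1; at low frequencies the drop is small, which is one reason a fixed threshold is only a rough stand-in for the adaptive threshold described above.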
Step 4-3) multi-channel post-filtering further suppresses residual directional noise, diffuse noise from the environment and far-field room reverberation, yielding the enhanced speech.
Claims (5)
1. A multi-channel speech enhancement method with selective attention based on a semantic prior, the method comprising: picking up speech signals from any direction in a reverberant environment with a multi-microphone array, acquiring the multi-channel speech signal and pre-processing it; detecting a specific activation word in the pre-processed speech signal with an activation-word speech recognition model; processing the uncut signal that contains the activation-word segment to obtain the complete activation-word segment; analyzing the activation-word segment with a reverberation-robust multi-channel phase-difference sound source localization method to obtain the direction of arrival of the target source; and enhancing the speech from that direction while suppressing noise from other directions and far-field room reverberation, thereby obtaining the enhanced speech of the target direction;
The method specifically comprises:
Step 1) a multi-microphone array picks up speech signals from any direction in a reverberant environment and acquires a multi-channel speech signal;
Step 2) the multi-channel speech signal acquired in step 1) is pre-processed;
Step 3) the activation-word speech recognition model is used to detect whether a specific activation word is present in the pre-processed speech signal; if the result is positive, the uncut signal containing the activation-word segment is retained and the method proceeds to step 4); otherwise it returns to step 1);
Step 4) voice activity detection is performed on the uncut signal containing the activation-word segment to obtain the complete activation-word segment; the activation-word segment is analyzed with the reverberation-robust multi-channel phase-difference sound source localization method to obtain the direction of arrival of the target source; the speech from that direction is enhanced, and residual directional noise, diffuse noise from the environment and far-field room reverberation are suppressed, yielding the enhanced speech of the target direction;
In step 3), detecting whether a specific activation word is present in the pre-processed speech signal with the activation-word speech recognition model proceeds as follows: a speaker-dependent or speaker-independent activation-word speech recognition model is trained from a large amount of prior activation-word data or from the data of a specific speaker; a recognition decoding strategy is used to detect the activation-word content and compute a confidence score, which completes the classification decision; speech recognition and keyword-spotting algorithms are combined to detect the activation word.
2. The multi-channel speech enhancement method with selective attention based on a semantic prior according to claim 1, characterized in that step 2) proceeds as follows: if acoustic echo is present in the multi-channel speech signal, echo cancellation, diffuse background noise suppression and gain control are applied to the picked-up multi-channel speech signal; otherwise only diffuse background noise suppression and gain control are applied to the multi-channel speech signal.
3. The multi-channel speech enhancement method with selective attention based on a semantic prior according to claim 1, characterized in that step 4) specifically comprises:
Step 4-1) voice activity detection detects the start point and end point of the activation word, yielding the complete multi-channel activation-word segment;
Step 4-2) the activation-word segment is analyzed with the reverberation-robust multi-channel phase-difference sound source localization method; the direction-of-arrival information of the target source is obtained, that is, the direction of the target speaker who uttered the specific semantic content; the speech from that direction is enhanced according to the direction-of-arrival information;
Step 4-3) multi-channel post-filtering further suppresses residual directional noise, diffuse noise from the environment and far-field room reverberation, yielding the enhanced speech of the target direction.
4. The multi-channel speech enhancement method with selective attention based on a semantic prior according to claim 3, characterized in that step 4-2) specifically comprises:
Step 4-2-1) the activation-word segment is transformed to the time-frequency domain, and at each frequency bin the coherent part and the incoherent part of the signal are tracked separately;
Step 4-2-2) the time-frequency points dominated by direct sound are identified;
Step 4-2-3) among the time-frequency points dominated by direct sound, the distribution of the signal's time difference of arrival is obtained from the low-frequency part that is free of spatial aliasing;
Step 4-2-4) in the high-frequency part, the time-difference-of-arrival information obtained at low frequencies is used to remove the effect of spatial aliasing, giving full-band time-difference-of-arrival information, from which the direction-of-arrival information is then obtained;
Step 4-2-5) the speech from that direction is enhanced according to the direction-of-arrival information.
5. The multi-channel speech enhancement method with selective attention based on a semantic prior according to claim 4, characterized in that the speech can be enhanced in step 4-2-5) in two ways:
First way: according to the direction-of-arrival information, the speech from the known direction is enhanced by beamforming, suppressing coherent sources from other directions;
Second way: the known direction is used for spatial target speech signal detection, accepting speech from the target region and rejecting sources from other directions.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201510574907.3A CN106531179B (en) | 2015-09-10 | 2015-09-10 | Multi-channel speech enhancement method with selective attention based on a semantic prior |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201510574907.3A CN106531179B (en) | 2015-09-10 | 2015-09-10 | Multi-channel speech enhancement method with selective attention based on a semantic prior |
Publications (2)
Publication Number | Publication Date |
---|---|
CN106531179A CN106531179A (en) | 2017-03-22 |
CN106531179B true CN106531179B (en) | 2019-08-20 |
Family ID=58346225
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201510574907.3A Active CN106531179B (en) | Multi-channel speech enhancement method with selective attention based on a semantic prior | 2015-09-10 | 2015-09-10 |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN106531179B (en) |
Families Citing this family (15)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106960672B (en) * | 2017-03-30 | 2020-08-21 | 国家计算机网络与信息安全管理中心 | Bandwidth extension method and device for stereo audio |
CN107146614B (en) * | 2017-04-10 | 2020-11-06 | 北京猎户星空科技有限公司 | Voice signal processing method and device and electronic equipment |
CN108877827B (en) * | 2017-05-15 | 2021-04-20 | 福州瑞芯微电子股份有限公司 | Voice-enhanced interaction method and system, storage medium and electronic equipment |
CN107346661B (en) * | 2017-06-01 | 2020-06-12 | 伊沃人工智能技术(江苏)有限公司 | Microphone array-based remote iris tracking and collecting method |
CN108122563B (en) * | 2017-12-19 | 2021-03-30 | 北京声智科技有限公司 | Method for improving voice awakening rate and correcting DOA |
CN108447483B (en) * | 2018-05-18 | 2023-11-21 | 深圳市亿道数码技术有限公司 | speech recognition system |
CN110164423B (en) | 2018-08-06 | 2023-01-20 | 腾讯科技(深圳)有限公司 | Azimuth angle estimation method, azimuth angle estimation equipment and storage medium |
CN110875045A (en) * | 2018-09-03 | 2020-03-10 | 阿里巴巴集团控股有限公司 | Voice recognition method, intelligent device and intelligent television |
CN111081234B (en) * | 2018-10-18 | 2022-03-25 | 珠海格力电器股份有限公司 | Voice acquisition method, device, equipment and storage medium |
CN110047494B (en) * | 2019-04-15 | 2022-06-03 | 北京小米智能科技有限公司 | Device response method, device and storage medium |
CN112289335A (en) * | 2019-07-24 | 2021-01-29 | 阿里巴巴集团控股有限公司 | Voice signal processing method and device and pickup equipment |
CN110992977B (en) * | 2019-12-03 | 2021-06-22 | 北京声智科技有限公司 | Method and device for extracting target sound source |
CN113257251B (en) * | 2021-05-11 | 2024-05-24 | 深圳优地科技有限公司 | Robot user identification method, apparatus and storage medium |
CN113823311B (en) * | 2021-08-19 | 2023-11-21 | 广州市盛为电子有限公司 | Voice recognition method and device based on audio enhancement |
CN113643714B (en) * | 2021-10-14 | 2022-02-18 | 阿里巴巴达摩院(杭州)科技有限公司 | Audio processing method, device, storage medium and computer program |
Family Cites Families (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20020116196A1 (en) * | 1998-11-12 | 2002-08-22 | Tran Bao Q. | Speech recognizer |
CN102819009B (en) * | 2012-08-10 | 2014-10-01 | 香港生产力促进局 | Driver sound localization system and method for automobile |
CN204390737U (en) * | 2014-07-29 | 2015-06-10 | 科大讯飞股份有限公司 | A kind of home voice disposal system |
- 2015-09-10: CN application CN201510574907.3A filed (patent CN106531179B, status Active)
Also Published As
Publication number | Publication date |
---|---|
CN106531179A (en) | 2017-03-22 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||