WO2023065317A1 - Conference terminal and echo cancellation method - Google Patents
Conference terminal and echo cancellation method
- Publication number
- WO2023065317A1 (application PCT/CN2021/125763)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- omnidirectional
- sound
- conference terminal
- microphones
- echo
- Prior art date
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04M—TELEPHONIC COMMUNICATION
- H04M9/00—Arrangements for interconnection not involving centralised switching
- H04M9/08—Two-way loud-speaking telephone systems with means for conditioning the signal, e.g. for suppressing echoes for one or both directions of traffic
- H04M9/082—Two-way loud-speaking telephone systems with means for conditioning the signal, e.g. for suppressing echoes for one or both directions of traffic using echo cancellers
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/02—Speech enhancement, e.g. noise reduction or echo cancellation
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04R—LOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
- H04R3/00—Circuits for transducers, loudspeakers or microphones
- H04R3/02—Circuits for transducers, loudspeakers or microphones for preventing acoustic reaction, i.e. acoustic oscillatory feedback
Definitions
- the present application relates to the technical field of voice processing, in particular to a conference terminal, an echo cancellation method and device, and a sound pickup device.
- A common structure for the sound pickup device of an audio and video conferencing system places the loudspeaker at the center, with one dipole directional microphone arranged in each of the three surrounding connecting rods (a, b, c).
- Dipole directional microphones rely on acoustic design to form a dipole beam pattern that can suppress echoes from the loudspeaker and pick up speech from the target direction.
- However, because a dipole directional microphone relies on acoustic design to form its dipole beam pattern, the gain in the loudspeaker direction is not small enough to effectively suppress the echo from that direction.
- the present application provides a conference terminal to solve the problem in the prior art that the echo cancellation effect of the conference terminal is poor.
- the present application additionally provides an echo cancellation method and device, and sound pickup equipment.
- This application provides a conference terminal, including:
- At least one omnidirectional microphone set comprising at least two omnidirectional microphones
- The memory stores a program implementing the echo cancellation method. After the device is powered on and the processor runs the program, the following steps are performed:
- For the omnidirectional microphone group, determine, according to the weight vector, a weighted sum of the at least two sound signals corresponding to the at least two omnidirectional microphones as the echo-cancelled sound signal.
- the at least two omnidirectional microphones are two omnidirectional microphones.
- the at least one omnidirectional microphone group is three omnidirectional microphone groups centered on the speaker, and the three omnidirectional microphone groups cover omnidirectional target sound sources.
- the present application also provides an echo cancellation method for a conference terminal, where the conference terminal includes: a loudspeaker, at least one omnidirectional microphone group, and the omnidirectional microphone group includes at least two omnidirectional microphones;
- the method includes:
- For the omnidirectional microphone group, determine, according to the weight vector of the beamformer, the weighted sum of at least two sound signals corresponding to the at least two omnidirectional microphones as the echo-cancelled sound signal.
- determining the weight vector of a beamformer that enables the at least two omnidirectional microphones to form a dipole beam pattern includes:
- the weight vector is determined according to the noise covariance matrix and the steering vector through a minimum variance distortionless response (MVDR) beamforming algorithm.
- the noise covariance matrix is determined in the following manner:
- according to the sound signal, containing the preset sound, collected by the omnidirectional microphones, a speech autocorrelation matrix is determined and used as the noise covariance matrix.
- the autocorrelation matrix is updated according to the sound signal including the conference sound collected by the omnidirectional microphone, as an updated noise covariance matrix.
- the echo-eliminated sound signals corresponding to the at least two omnidirectional microphones of the target omnidirectional microphone group are selected.
- the present application also provides an echo cancellation device, the device is located in a conference terminal, and the conference terminal includes: a loudspeaker, at least one omnidirectional microphone group, and the omnidirectional microphone group includes at least two omnidirectional microphones;
- the device includes:
- a parameter determination unit configured to determine a weight vector of a beamformer that enables the at least two omnidirectional microphones to form a dipole beam pattern, so as to suppress the echo signal in the loudspeaker direction and enhance the sound signal in the target direction;
- a sound signal acquisition unit configured to collect sound signals through the omnidirectional microphones;
- a beamforming unit configured to, for the omnidirectional microphone group, determine the weighted sum of at least two sound signals corresponding to the at least two omnidirectional microphones according to the weight vector of the beamformer, as the echo-cancelled sound signal.
- the application also provides a sound pickup device, including:
- At least one omnidirectional microphone set comprising at least two omnidirectional microphones
- the memory is used to store a program for implementing the above echo cancellation method, and the terminal is powered on to run the program of the method through the processor.
- the present application also provides an electronic device, including:
- a processor and a memory; the memory stores a program implementing the above method, and after the device is powered on, the processor runs the program of the method.
- the present application also provides a computer-readable storage medium in which instructions are stored; when the instructions are run on a computer, they cause the computer to execute the above-mentioned methods.
- the present application also provides a computer program product including instructions, which, when run on a computer, cause the computer to execute the above-mentioned various methods.
- the conference terminal provided in the embodiment of the present application includes: a loudspeaker, and at least one omnidirectional microphone group, where the omnidirectional microphone group includes at least two omnidirectional microphones.
- By determining the weight vector of a beamformer that makes the at least two omnidirectional microphones form a dipole beam pattern, the conference terminal suppresses the echo signal in the loudspeaker direction and enhances the sound signal in the target direction; it collects multiple microphone signals through the omnidirectional microphones and, for the at least two omnidirectional microphones, determines according to the weight vector a weighted sum of the at least two microphone signals as the echo-cancelled sound signal.
- Two or more omnidirectional microphones can thus replace a dipole directional microphone. Combined with beamforming technology, the beam pattern can form a smaller gain in the loudspeaker direction, and that gain is not limited by the acoustic design of the microphone; therefore, the echo cancellation effect can be effectively improved.
- FIG. 1 is a schematic structural diagram of a conference terminal in the prior art
- Fig. 2 is a schematic structural diagram of an embodiment of a conference terminal provided by the present application.
- Fig. 3 is a schematic flowchart of an embodiment of the echo cancellation method provided by the present application.
- A conference terminal, an echo cancellation method and device, and a sound pickup device are provided.
- Various schemes are described in detail in the following examples one by one.
- the conference terminal may include: a loudspeaker, and at least one omnidirectional microphone set, where the omnidirectional microphone set includes at least two omnidirectional microphones.
- the conference terminal can be used in an audio and video conference system.
- An audio and video conferencing system is a system that lets two or more individuals or groups in different places exchange audio, video and document data through transmission lines, conference terminals and other equipment, achieving real-time, interactive communication so that a conference can be held simultaneously across sites.
- the conference terminal may be a speakerphone, or a video conference terminal including a display and a camera.
- A loudspeaker, also known as a "horn", is a transducer that converts electrical signals into acoustic signals.
- An omnidirectional microphone is a microphone that receives sound equally from all directions, such as magnetic, ceramic and electret microphones.
- the conference terminal includes a loudspeaker and multiple omnidirectional microphone groups, and the multiple omnidirectional microphone groups can be installed around the loudspeaker to cover all directions of the conference site.
- a plurality of connecting rods may be extended from the loudspeaker, and an omnidirectional microphone group is mounted on each connecting rod.
- Each omnidirectional microphone group includes at least two omnidirectional microphones, which replace the single directional microphone of the prior art.
- The core of the conference terminal provided by the embodiment of the present application is combining beamforming technology so that each omnidirectional microphone group forms a dipole beam pattern, so as to suppress the echo signal in the loudspeaker direction and enhance the sound signal in the target direction.
- Compared with a dipole directional microphone, the dipole beam pattern formed from two or more omnidirectional microphones in the embodiment of the present application has a smaller gain in the loudspeaker direction and therefore suppresses echo better.
- The reason is that the gain of a dipole directional microphone in the loudspeaker direction is fixed by its acoustic design and is not small enough, whereas two omnidirectional microphones combined by a beamforming algorithm (such as MVDR) can reach a gain smaller than what acoustic design alone achieves.
- FIG. 3 is a schematic diagram of an echo cancellation process of an embodiment of the conference terminal of the present application.
- echo cancellation includes the following processing steps:
- Step S301 Determine a weight vector of a beamformer that enables the at least two omnidirectional microphones to form a dipole beam pattern, so as to suppress the echo signal in the speaker direction and enhance the sound signal in the target direction.
- the conference terminal provided in this embodiment combines the beamforming technology to form a dipole beam pattern for each group of omnidirectional microphones.
- A variety of beamforming algorithms can be used to form a dipole beam pattern for each omnidirectional microphone group, such as the minimum variance distortionless response (MVDR) beamforming algorithm, a differential beamforming algorithm, and the like.
- each group of omnidirectional microphones is formed into a dipole beam pattern by an MVDR algorithm.
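How two omnidirectional microphones can yield a dipole-shaped pattern can be illustrated with the simplest differential beamformer, one of the algorithm families named above. The sketch below is an illustration under stated assumptions, not the method of the application: the fixed weights `[0.5, -0.5]`, the 3 cm spacing, the 1 kHz test frequency, the far-field model, and all function names are hypothetical choices. Angles are measured from the line joining the two microphones, so 90 degrees is broadside (the loudspeaker direction in this document) and 0 degrees is endfire (the target direction).

```python
import cmath
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed room-temperature value


def steering_vector(angle_deg, spacing_m, freq_hz):
    """Far-field steering vector of a two-microphone pair.

    angle_deg is measured from the line joining the microphones:
    0 degrees is endfire, 90 degrees is broadside.
    """
    delay = spacing_m * math.cos(math.radians(angle_deg)) / SPEED_OF_SOUND
    return [1.0 + 0.0j, cmath.exp(-2j * math.pi * freq_hz * delay)]


def response(weights, angle_deg, spacing_m, freq_hz):
    """Magnitude of the beamformer response |w^H d(angle)|."""
    d = steering_vector(angle_deg, spacing_m, freq_hz)
    return abs(sum(w.conjugate() * x for w, x in zip(weights, d)))


# Differential weights: subtract one microphone signal from the other.
dipole_weights = [0.5 + 0.0j, -0.5 + 0.0j]

# A broadside source reaches both microphones at the same time, so it cancels.
broadside = response(dipole_weights, 90.0, spacing_m=0.03, freq_hz=1000.0)
# An endfire source arrives with a delay between the microphones and survives.
endfire = response(dipole_weights, 0.0, spacing_m=0.03, freq_hz=1000.0)
```

With these assumed values the broadside response is numerically zero while the endfire response is not, which is the null-toward-loudspeaker behaviour the text attributes to the dipole beam pattern.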
- step S301 may include the following sub-steps:
- Step S3031 Determine the noise covariance matrix and steering vector of the conference terminal.
- the principle of the MVDR algorithm is to minimize the noise power spectrum while ensuring that the target direction is not distorted, as shown in formula 1:
- $w$ represents the weight vector
- $R$ represents the noise covariance matrix
- $w^H R w$ represents the noise power spectrum
- the objective function is $\min_w w^H R w$, which represents the minimum noise power spectrum.
- the solution result of this objective function is the weight vector w. According to the weight vector w, the weighted sum of the multiple sound signals in one group is calculated, and the result is the sound signal after the echo is suppressed.
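Formula 1 and Formula 2 are referenced in this text but not reproduced. For orientation, the textbook MVDR formulation that matches the definitions above is given here; the symbol $d$ for the steering vector and $x$ for the stacked microphone signals are assumptions, and the application's own formulas may differ in notation.

$$
\min_{w} \; w^H R w \quad \text{subject to} \quad w^H d = 1
$$

$$
w = \frac{R^{-1} d}{d^H R^{-1} d}
$$

$$
y = w^H x
$$

The constraint $w^H d = 1$ expresses that the target direction is not distorted, the closed-form solution gives the weight vector, and $y$ is the weighted sum of the sound signals of one group.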
- the noise covariance matrix can be determined in two stages.
- One stage is the initialization stage of the conference terminal, in which the initial value of the noise covariance matrix is determined. The other stage is during use of the conference terminal: when the participants in the environment where the conference terminal is located are not speaking (the target sound source is silent), the noise covariance matrix can be updated to better adapt to the conference environment. This improves the accuracy of the noise covariance matrix, and therefore of the weight vector w, which in turn improves the echo suppression effect.
- the noise covariance matrix is related to the conference environment where the conference terminal is located.
- the same conference terminal is usually used in multiple conference environments, so the noise covariance matrix can be determined when the conference terminal is initialized.
- When the conference terminal starts, preset sound data can be played; the speech autocorrelation matrix is then determined from the multi-channel sound signals, containing the preset sound, collected by the omnidirectional microphones, and is used as the noise covariance matrix. For example, the loudspeaker of the conference terminal (such as a speakerphone) plays speech for 2 to 4 seconds, and the autocorrelation matrix of this speech is calculated as the noise covariance matrix. With this processing method, the conference terminal can obtain a better echo suppression effect in different conference environments.
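The initialization described above amounts to averaging outer products of the multi-channel signal while only the loudspeaker is playing. A minimal sketch follows, assuming each snapshot holds one complex sample per microphone (for example, one STFT frequency bin of one frame); the function names and the snapshot representation are illustrative assumptions, not taken from the application.

```python
def outer_conj(x):
    """Outer product x x^H of a complex vector, as a list of rows."""
    return [[a * b.conjugate() for b in x] for a in x]


def autocorrelation_matrix(snapshots):
    """Average x x^H over all snapshots.

    Each snapshot is a list with one complex sample per microphone,
    recorded while the loudspeaker plays the preset sound.
    """
    if not snapshots:
        raise ValueError("at least one snapshot is required")
    n = len(snapshots[0])
    acc = [[0j] * n for _ in range(n)]
    for x in snapshots:
        xx = outer_conj(x)
        for i in range(n):
            for j in range(n):
                acc[i][j] += xx[i][j]
    return [[v / len(snapshots) for v in row] for row in acc]


# Two hypothetical snapshots from a two-microphone group.
noise_cov = autocorrelation_matrix([[1 + 0j, 1j], [2 + 0j, -2j]])
```

The result is Hermitian by construction, which is what the MVDR solution expects of a covariance matrix.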
- While the conference terminal is working, if the target direction is detected to be silent, the autocorrelation matrix is updated according to the multi-channel sound signals collected by the omnidirectional microphones, which contain the conference sound (the speaker's voice played by the loudspeaker), and is used as the updated noise covariance matrix.
- the speech autocorrelation matrix of this segment is calculated as the noise covariance matrix.
- a smoothing method may be used to update the previous noise covariance matrix.
- $R_{t+1} = \alpha R_{t-1} + (1-\alpha) R_t$ (Formula 3), where $\alpha$ is the smoothing factor
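The smoothing update can be sketched as a weighted blend of the previous noise covariance matrix and the newly measured one, applied only while the target direction is silent. The factor name `alpha`, the flag `target_is_silent`, and the function name are illustrative assumptions; how silence is detected is outside this sketch.

```python
def update_noise_covariance(previous, current, alpha, target_is_silent):
    """Blend two covariance matrices element by element.

    Returns alpha * previous + (1 - alpha) * current when the target
    direction is silent, and the previous matrix unchanged otherwise.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    if not target_is_silent:
        return previous
    return [
        [alpha * p + (1.0 - alpha) * c for p, c in zip(p_row, c_row)]
        for p_row, c_row in zip(previous, current)
    ]


previous = [[1.0 + 0j, 0j], [0j, 1.0 + 0j]]
current = [[3.0 + 0j, 1j], [-1j, 3.0 + 0j]]

updated = update_noise_covariance(previous, current, 0.9, True)
unchanged = update_noise_covariance(previous, current, 0.9, False)
```

A value of `alpha` close to 1 makes the estimate change slowly, which trades adaptation speed for stability.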
- Step S3033 Determine the weight vector according to the noise covariance matrix and the steering vector through the minimum variance distortionless response (MVDR) beamforming algorithm.
- the weight vector may be determined based on the noise covariance matrix and the steering vector according to Formula 2.
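For a group of two microphones, the weight computation can be sketched with the textbook MVDR closed form w = R^{-1} d / (d^H R^{-1} d). Whether this is identical to the application's Formula 2 is an assumption. The geometry (3 cm spacing, 1 kHz, loudspeaker at broadside, talker at endfire), the echo-dominated covariance matrix, and all names below are hypothetical values chosen for illustration.

```python
import cmath
import math


def invert_2x2(m):
    """Inverse of a 2x2 complex matrix given as a list of rows."""
    (a, b), (c, d) = m
    det = a * d - b * c
    if abs(det) < 1e-12:
        raise ValueError("matrix is singular; consider diagonal loading")
    return [[d / det, -b / det], [-c / det, a / det]]


def mvdr_weights(noise_cov, steering):
    """w = R^{-1} d / (d^H R^{-1} d) for a two-microphone group."""
    r_inv = invert_2x2(noise_cov)
    r_inv_d = [sum(r_inv[i][j] * steering[j] for j in range(2)) for i in range(2)]
    denom = sum(steering[i].conjugate() * r_inv_d[i] for i in range(2))
    return [v / denom for v in r_inv_d]


def gain(weights, direction):
    """Magnitude of w^H d for a given direction vector d."""
    return abs(sum(w.conjugate() * x for w, x in zip(weights, direction)))


# Assumed steering vectors: talker at endfire, loudspeaker at broadside.
phase = 2.0 * math.pi * 1000.0 * 0.03 / 343.0
target = [1.0 + 0.0j, cmath.exp(-1j * phase)]
echo = [1.0 + 0.0j, 1.0 + 0.0j]

# Echo-dominated covariance: echo outer product plus a small sensor-noise term.
noise_cov = [[1.01 + 0j, 1.0 + 0j], [1.0 + 0j, 1.01 + 0j]]

w = mvdr_weights(noise_cov, target)
target_gain = gain(w, target)  # the distortionless constraint keeps this at 1
echo_gain = gain(w, echo)      # attenuated relative to the target
```

The small diagonal term keeps the matrix invertible; a covariance built from a single direction alone would be singular.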
- Step S303 collecting multiple sound signals through the omnidirectional microphone.
- multiple omnidirectional microphones of the conference terminal can be used to collect multiple sound signals.
- Step S305 For the at least two omnidirectional microphones, according to the weight vector, determine a weighted sum of at least two sound signals as an echo-eliminated sound signal.
- The weight vector contains one weight for each of the at least two omnidirectional microphones. For example, if an omnidirectional microphone group includes two omnidirectional microphones, the weight vector includes two weights.
- For each omnidirectional microphone group, the weighted sum of the at least two sound signals is determined as the echo-cancelled sound signal. For example, if the conference terminal includes three omnidirectional microphone groups, three channels of echo-cancelled sound signals are obtained.
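The weighted sum itself is a per-sample inner product between the weight vector and the microphone signals of one group. The sketch below assumes complex-valued samples (for example, one frequency bin over time) and uses hypothetical differential weights; the function name and data layout are illustrative assumptions.

```python
def beamform(weights, signals):
    """Return y[t] = sum_m conj(w[m]) * x[m][t] for one microphone group.

    signals[m] is the sample sequence of microphone m; all sequences
    must have the same length and there must be one weight per microphone.
    """
    if len(weights) != len(signals):
        raise ValueError("one weight per microphone is required")
    length = len(signals[0])
    if any(len(channel) != length for channel in signals):
        raise ValueError("all channels must have the same length")
    return [
        sum(w.conjugate() * channel[t] for w, channel in zip(weights, signals))
        for t in range(length)
    ]


# An echo arriving from broadside is identical on both microphones.
echo_only = [[1 + 0j, 2 + 0j, 3 + 0j], [1 + 0j, 2 + 0j, 3 + 0j]]
out = beamform([0.5 + 0j, -0.5 + 0j], echo_only)
```

Running `beamform` once per group with that group's weights yields one echo-cancelled channel per group, so three groups give three channels.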
- The number of omnidirectional microphone groups is usually related to the size of the conference space. For most environments with limited space, three omnidirectional microphone groups can cover target sound sources in all directions around the conference terminal; for a large conference space, more omnidirectional microphone groups can be set up, such as four or five groups, to cover all directions of the conference site.
- When each omnidirectional microphone group includes two omnidirectional microphones, a dipole beam pattern can be formed by combining the beamforming technology. In this way, the echo cancellation effect can be improved and the equipment cost reduced.
- Each omnidirectional microphone group may also include more than two omnidirectional microphones, but this increases equipment cost.
- When each omnidirectional microphone group includes two omnidirectional microphones, the distance between the two microphones affects echo suppression performance: a spacing of 3 cm performs better than 7 cm.
- the gain facing the loudspeaker direction (broadside direction) formed by the prior art is larger than the gain facing the loudspeaker direction formed by the solution of the present application, so the solution of the present application can suppress echo better.
- the target-oriented gain (endfire direction) formed by the prior art is smaller than the target-oriented gain formed by the solution of the present application, so the solution of the present application can better enhance conference voice.
- a speaker may move in a conference site, such as a host at a party.
- the echo cancellation process may also include the following steps:
- Step S401 If the movement of the target sound source is detected, determine the signal-to-noise ratio of the omnidirectional microphone.
- Existing techniques can be used to detect that the target sound source is moving and to determine the signal-to-noise ratio (SNR) of each omnidirectional microphone.
- Step S403 According to the signal-to-noise ratio, select the echo-cancelled sound signal corresponding to the at least two omnidirectional microphones of the target omnidirectional microphone group.
- By executing step S401 and step S403, a good echo suppression effect can still be obtained when the sound source moves.
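Steps S401 and S403 can be sketched as picking the echo-cancelled channel of the group whose microphones currently show the highest SNR. The text does not say how per-microphone SNRs are combined into a per-group score; averaging the dB values, as done here, is an assumption, and the function names and example values are hypothetical.

```python
def group_snr_db(mic_snrs_db):
    """Average the per-microphone SNRs (in dB) of one group."""
    return sum(mic_snrs_db) / len(mic_snrs_db)


def select_group(echo_cancelled, mic_snrs_db_per_group):
    """Return (index, signal) of the group with the highest mean SNR."""
    if len(echo_cancelled) != len(mic_snrs_db_per_group):
        raise ValueError("one SNR list per group is required")
    scores = [group_snr_db(snrs) for snrs in mic_snrs_db_per_group]
    best = max(range(len(scores)), key=scores.__getitem__)
    return best, echo_cancelled[best]


# Three groups with two microphones each; placeholder strings stand in
# for the echo-cancelled signals.
outputs = ["group-a output", "group-b output", "group-c output"]
snrs = [[5.0, 7.0], [12.0, 10.0], [3.0, 4.0]]
index, chosen = select_group(outputs, snrs)
```

Re-running the selection whenever movement is detected lets the output follow the talker from one group's coverage area to another.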
- the conference terminal provided in the embodiments of the present application includes: a loudspeaker, and at least one omnidirectional microphone group, where the omnidirectional microphone group includes at least two omnidirectional microphones.
- By determining the weight vector of a beamformer that makes the at least two omnidirectional microphones form a dipole beam pattern, the conference terminal suppresses the echo signal in the loudspeaker direction and enhances the sound signal in the target direction; it collects sound signals through the omnidirectional microphones and, for the at least two omnidirectional microphones, determines according to the weight vector of the beamformer the weighted sum of at least two sound signals as the echo-cancelled sound signal.
- Two or more omnidirectional microphones can thus replace a dipole directional microphone.
- The beam pattern can form a smaller gain in the loudspeaker direction, and that gain is not limited by the acoustic design of the microphone; therefore, the echo cancellation effect can be effectively improved.
- a conference terminal is provided, and correspondingly, the present application also provides an echo cancellation method.
- the method corresponds to the embodiment of the above-mentioned device. Since the method embodiments are basically similar to the device embodiments, the description is relatively simple, and for relevant parts, please refer to part of the description of the device embodiments. The method embodiments described below are illustrative only.
- the present application further provides an echo cancellation method, which is used in a conference terminal, and the conference terminal includes: a loudspeaker, and at least one omnidirectional microphone group, and the omnidirectional microphone group includes at least two omnidirectional microphones.
- the method may include the following steps:
- Step S301 Determine a weight vector of a beamformer that enables the at least two omnidirectional microphones to form a dipole beam pattern, so as to suppress the echo signal in the speaker direction and enhance the sound signal in the target direction.
- Step S303 collecting sound signals through the omnidirectional microphone.
- Step S305 For the group of omnidirectional microphones, according to the weight vector of the beamformer, determine the weighted sum of at least two sound signals corresponding to the at least two omnidirectional microphones as the echo-eliminated sound signal.
- step S301 may include the following sub-steps: determine the noise covariance matrix and steering vector of the conference terminal; determine the weight vector according to the noise covariance matrix and steering vector through the minimum variance distortionless response (MVDR) beamforming algorithm.
- the noise covariance matrix can be determined in the following manner: when the conference terminal is started, preset sound data is played; according to the sound signal, containing the preset sound, collected by the omnidirectional microphones, the speech autocorrelation matrix is determined and used as the noise covariance matrix.
- the conference terminal can obtain a better echo suppression effect in different conference environments.
- the method may further include the following step: when the conference terminal is working, if the target direction is detected to be silent, updating the autocorrelation matrix according to the sound signal, including the conference sound, collected by the omnidirectional microphones, as the updated noise covariance matrix.
- this processing method can better adapt to the conference environment, improve the accuracy of the noise covariance matrix, thereby improving the accuracy of the weight vector, and further improving the echo suppression effect.
- the method may further include the following steps: if movement of the target sound source is detected, determine the signal-to-noise ratio of the omnidirectional microphones; according to the signal-to-noise ratio, select the echo-cancelled sound signal corresponding to the at least two omnidirectional microphones of the target omnidirectional microphone group.
- an echo cancellation method is provided, and correspondingly, the present application also provides an echo cancellation device.
- the device corresponds to the embodiment of the above-mentioned method. Since the device embodiment is basically similar to the method embodiment, the description is relatively simple, and for relevant parts, refer to the part of the description of the method embodiment.
- the device embodiments described below are illustrative only.
- the present application further provides an echo canceling device, the device is located in a conference terminal, and the conference terminal includes: a loudspeaker, at least one omnidirectional microphone group, and the omnidirectional microphone group includes at least two omnidirectional microphones;
- the device includes:
- a parameter determination unit configured to determine a weight vector of a beamformer that enables the at least two omnidirectional microphones to form a dipole beam pattern, so as to suppress the echo signal in the loudspeaker direction and enhance the sound signal in the target direction;
- a sound signal acquisition unit configured to collect sound signals through the omnidirectional microphones;
- a beamforming unit configured to, for the omnidirectional microphone group, determine the weighted sum of at least two sound signals corresponding to the at least two omnidirectional microphones according to the weight vector of the beamformer, as the echo-cancelled sound signal.
- the parameter determination unit may be specifically configured to determine the noise covariance matrix and steering vector of the conference terminal, and to determine the weight vector according to the noise covariance matrix and steering vector through the minimum variance distortionless response (MVDR) beamforming algorithm.
- the noise covariance matrix may be determined in the following manner: when the conference terminal is started, preset sound data is played; according to the sound signal, containing the preset sound, collected by the omnidirectional microphones, the speech autocorrelation matrix is determined and used as the noise covariance matrix.
- the conference terminal can obtain a better echo suppression effect in different conference environments.
- the device may also include:
- a noise covariance matrix update unit configured to, when the conference terminal is working and the target direction is detected to be silent, update the autocorrelation matrix according to the sound signals, including the conference sound, collected by the omnidirectional microphones, as the updated noise covariance matrix.
- This processing method adapts better to the conference environment and improves the accuracy of the noise covariance matrix, thereby improving the accuracy of the weight vector and, in turn, the echo suppression effect.
- the device may also include:
- a signal-to-noise ratio determining unit configured to determine the signal-to-noise ratio of the omnidirectional microphones if movement of the target sound source is detected while the conference terminal is working;
- the echo suppression signal selection unit is configured to select the echo-cancelled sound signal corresponding to the at least two omnidirectional microphones of the target omnidirectional microphone group according to the signal-to-noise ratio.
- an echo cancellation method is provided, and correspondingly, the present application also provides an electronic device.
- the device corresponds to an embodiment of the method described above. Since the device embodiment is basically similar to the method embodiment, the description is relatively simple, and for related parts, please refer to part of the description of the method embodiment.
- the device embodiments described below are illustrative only.
- the present application further provides an electronic device, including: a loudspeaker; at least one omnidirectional microphone set, the omnidirectional microphone set including at least two omnidirectional microphones; a processor; and a memory.
- the memory is used to store a program for implementing the above echo cancellation method, and the terminal is powered on to run the program of the method through the processor.
- the electronic device may be an audio-video conference terminal, or a sound pickup device.
- a computing device includes one or more processors (CPUs), input/output interfaces, network interfaces, and memory.
- Memory may include non-permanent storage in computer-readable media, in the form of random access memory (RAM) and/or nonvolatile memory such as read-only memory (ROM) or flash RAM. Memory is an example of computer readable media.
- Computer-readable media including permanent and non-permanent, removable and non-removable media, can be implemented by any method or technology for information storage.
- Information may be computer readable instructions, data structures, modules of a program, or other data.
- Examples of computer storage media include, but are not limited to, phase change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technology, compact disc read-only memory (CD-ROM), digital versatile disc (DVD) or other optical storage, magnetic tape cassettes, magnetic tape and magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by a computing device.
- As defined here, computer-readable media exclude transitory computer-readable media, such as modulated data signals and carrier waves.
- the embodiments of the present application may be provided as methods, systems or computer program products. Accordingly, the present application can take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including but not limited to disk storage, CD-ROM, optical storage, etc.) having computer-usable program code embodied therein.
Landscapes
- Engineering & Computer Science (AREA)
- Signal Processing (AREA)
- Physics & Mathematics (AREA)
- Health & Medical Sciences (AREA)
- Acoustics & Sound (AREA)
- General Health & Medical Sciences (AREA)
- Otolaryngology (AREA)
- Computational Linguistics (AREA)
- Quality & Reliability (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Human Computer Interaction (AREA)
- Multimedia (AREA)
- Circuit For Audible Band Transducer (AREA)
Abstract
Description
Claims (12)
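- A conference terminal, characterized by comprising: a loudspeaker; at least one omnidirectional microphone group, the omnidirectional microphone group comprising at least two omnidirectional microphones; a processor; and a memory for storing a program implementing an echo cancellation method, wherein after the device is powered on and the program of the method is run by the processor, the following steps are performed: determining a weight vector of a beamformer that causes the at least two omnidirectional microphones to form a dipole beam pattern, so as to suppress the echo signal from the loudspeaker direction and enhance the sound signal from the target direction; collecting sound signals through the omnidirectional microphones; and, for the omnidirectional microphone group, determining, according to the weight vector of the beamformer, a weighted sum of the at least two sound signals corresponding to the at least two omnidirectional microphones as the echo-cancelled sound signal.
- The conference terminal according to claim 1, characterized in that the at least two omnidirectional microphones are two omnidirectional microphones.
- The conference terminal according to claim 1, characterized in that the at least one omnidirectional microphone group is three omnidirectional microphone groups centered on the loudspeaker, the three omnidirectional microphone groups covering target sound sources in all directions.
- An echo cancellation method for a conference terminal, characterized in that the conference terminal comprises: a loudspeaker and at least one omnidirectional microphone group, the omnidirectional microphone group comprising at least two omnidirectional microphones; the method comprising: determining a weight vector of a beamformer that causes the at least two omnidirectional microphones to form a dipole beam pattern, so as to suppress the echo signal from the loudspeaker direction and enhance the sound signal from the target direction; collecting sound signals through the omnidirectional microphones; and, for the omnidirectional microphone group, determining, according to the weight vector of the beamformer, a weighted sum of the at least two sound signals corresponding to the at least two omnidirectional microphones as the echo-cancelled sound signal.
- The method according to claim 4, characterized in that determining the weight vector of the beamformer that causes the at least two omnidirectional microphones to form a dipole beam pattern comprises: determining a noise covariance matrix and a steering vector of the conference terminal; and determining the weight vector from the noise covariance matrix and the steering vector by means of a minimum variance distortionless response (MVDR) beamforming algorithm.
- The method according to claim 5, characterized in that the noise covariance matrix is determined as follows: when the conference terminal starts up, playing preset sound data; and determining, from the sound signals collected by the omnidirectional microphones that include the preset sound, a speech autocorrelation matrix as the noise covariance matrix.
- The method according to claim 6, characterized by further comprising: while the conference terminal is operating, if silence is detected in the target direction, updating the autocorrelation matrix according to the sound signals collected by the omnidirectional microphones that include conference sound, as the updated noise covariance matrix.
- The method according to claim 4, characterized by further comprising: if movement of the target sound source is detected, determining the signal-to-noise ratio of the omnidirectional microphones; and selecting, according to the signal-to-noise ratio, the echo-cancelled sound signal corresponding to the at least two omnidirectional microphones of a target omnidirectional microphone group.
- An echo cancellation apparatus, characterized in that the apparatus is located in a conference terminal, the conference terminal comprising: a loudspeaker and at least one omnidirectional microphone group, the omnidirectional microphone group comprising at least two omnidirectional microphones; the apparatus comprising: a parameter determination unit configured to determine a weight vector of a beamformer that causes the at least two omnidirectional microphones to form a dipole beam pattern, so as to suppress the echo signal from the loudspeaker direction and enhance the sound signal from the target direction; a sound signal collection unit configured to collect sound signals through the omnidirectional microphones; and a beamforming unit configured to determine, for the omnidirectional microphone group and according to the weight vector of the beamformer, a weighted sum of the at least two sound signals corresponding to the at least two omnidirectional microphones as the echo-cancelled sound signal.
- A sound pickup device, characterized by comprising: a loudspeaker; at least one omnidirectional microphone group, the omnidirectional microphone group comprising at least two omnidirectional microphones; a processor; and a memory for storing a program implementing the echo cancellation method according to any one of claims 4-8, wherein the terminal is powered on and runs the program of the method through the processor.
- A computer program, characterized by comprising computer-readable code which, when run on a computing processing device, causes the computing processing device to perform the echo cancellation method according to any one of claims 4-8.
- A computer-readable medium storing the computer program according to claim 11.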
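The method claims describe estimating a noise covariance matrix from loudspeaker playback, deriving MVDR weights from that matrix and a steering vector, and outputting a weighted sum of the microphone signals. The following is a minimal Python sketch of that chain for a single frequency bin and one two-microphone group. It is an illustration under stated assumptions, not the applicant's implementation: the far-field steering model, the 2 cm spacing, the 1 kHz bin, the diagonal loading term, and all function names (`steering_vector`, `covariance`, `mvdr_weights`, `beamform`) are hypothetical choices that the claims do not specify.

```python
import cmath
import math

SPEED_OF_SOUND = 343.0  # m/s; assumed room-temperature value


def steering_vector(freq_hz, spacing_m, angle_rad):
    """Far-field steering vector of a two-microphone pair (assumed geometry)."""
    delay = spacing_m * math.cos(angle_rad) / SPEED_OF_SOUND
    return [1 + 0j, cmath.exp(-2j * math.pi * freq_hz * delay)]


def covariance(snapshots, loading=1e-6):
    """Average of x x^H over snapshots, plus diagonal loading (an assumption)."""
    n = len(snapshots)
    r = [[0j, 0j], [0j, 0j]]
    for x in snapshots:
        for i in range(2):
            for j in range(2):
                r[i][j] += x[i] * x[j].conjugate() / n
    r[0][0] += loading
    r[1][1] += loading
    return r


def mvdr_weights(r, d):
    """MVDR solution w = R^-1 d / (d^H R^-1 d), using a closed-form 2x2 inverse."""
    det = r[0][0] * r[1][1] - r[0][1] * r[1][0]
    inv = [[r[1][1] / det, -r[0][1] / det],
           [-r[1][0] / det, r[0][0] / det]]
    rinv_d = [inv[0][0] * d[0] + inv[0][1] * d[1],
              inv[1][0] * d[0] + inv[1][1] * d[1]]
    denom = d[0].conjugate() * rinv_d[0] + d[1].conjugate() * rinv_d[1]
    return [v / denom for v in rinv_d]


def beamform(w, x):
    """Weighted sum of the two microphone signals: y = w^H x."""
    return w[0].conjugate() * x[0] + w[1].conjugate() * x[1]


# Hypothetical setup: loudspeaker at 0 rad, talker on the opposite side.
FREQ_HZ, SPACING_M = 1000.0, 0.02
d_echo = steering_vector(FREQ_HZ, SPACING_M, 0.0)
d_target = steering_vector(FREQ_HZ, SPACING_M, math.pi)

# Startup calibration: snapshots that contain only loudspeaker playback.
calibration = [[cmath.exp(1j * k) * d for d in d_echo] for k in range(50)]
weights = mvdr_weights(covariance(calibration), d_target)

echo_gain = abs(beamform(weights, d_echo))      # response toward the loudspeaker
target_gain = abs(beamform(weights, d_target))  # response toward the talker
```

In this sketch the distortionless constraint keeps `target_gain` at unity, while the covariance estimated from playback alone steers a null toward the loudspeaker. A real terminal would repeat the computation per frequency bin and refresh the covariance whenever the target direction is silent, as the dependent claims describe.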
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/CN2021/125763 WO2023065317A1 (zh) | 2021-10-22 | 2021-10-22 | Conference terminal and echo cancellation method |
CN202180101805.3A CN117981352A (zh) | 2021-10-22 | 2021-10-22 | Conference terminal and echo cancellation method |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/CN2021/125763 WO2023065317A1 (zh) | 2021-10-22 | 2021-10-22 | Conference terminal and echo cancellation method |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2023065317A1 (zh) | 2023-04-27 |
Family
ID=86058745
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/CN2021/125763 WO2023065317A1 (zh) | Conference terminal and echo cancellation method | 2021-10-22 | 2021-10-22 |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN117981352A (zh) |
WO (1) | WO2023065317A1 (zh) |
Citations (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
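JP2007251406A (ja) * | 2006-03-14 | 2007-09-27 | Yamaha Corp | Audio signal transmitting/receiving device and audio conference device |
WO2009034524A1 (en) * | 2007-09-13 | 2009-03-19 | Koninklijke Philips Electronics N.V. | Apparatus and method for audio beam forming |
CN101763858A (zh) * | 2009-10-19 | 2010-06-30 | 瑞声声学科技(深圳)有限公司 | Dual-microphone signal processing method |
CN104464739A (zh) * | 2013-09-18 | 2015-03-25 | 华为技术有限公司 | Audio signal processing method and device, and differential beamforming method and device |
CN110085247A (zh) * | 2019-05-06 | 2019-08-02 | 上海互问信息科技有限公司 | Dual-microphone noise reduction method for complex noise environments |
CN111312269A (zh) * | 2019-12-13 | 2020-06-19 | 辽宁工业大学 | Fast echo cancellation method in a smart speaker |
CN111866439A (zh) * | 2020-07-21 | 2020-10-30 | 厦门亿联网络技术股份有限公司 | Conference device and system for optimizing the audio and video experience, and operating method thereof |
CN111918169A (zh) * | 2020-06-28 | 2020-11-10 | 佳禾智能科技股份有限公司 | Conference speaker based on a multi-beamforming microphone array and sound wave pickup method thereof |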
2021
- 2021-10-22 CN CN202180101805.3A patent/CN117981352A/zh active Pending
- 2021-10-22 WO PCT/CN2021/125763 patent/WO2023065317A1/zh active Application Filing
Also Published As
Publication number | Publication date |
---|---|
CN117981352A (zh) | 2024-05-03 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US9497544B2 (en) | | Systems and methods for surround sound echo reduction |
US9913022B2 (en) | | System and method of improving voice quality in a wireless headset with untethered earbuds of a mobile device |
US8233352B2 (en) | | Audio source localization system and method |
JP6121481B2 (ja) | | Three-dimensional sound capture and reproduction using multiple microphones |
JP6703525B2 (ja) | | Method and apparatus for enhancing a sound source |
KR101456866B1 (ko) | | Method and apparatus for extracting a target sound source signal from a mixed sound |
US9438985B2 (en) | | System and method of detecting a user's voice activity using an accelerometer |
US9197974B1 (en) | | Directional audio capture adaptation based on alternative sensory input |
US20110096915A1 (en) | | Audio spatialization for conference calls with multiple and moving talkers |
WO2015035785A1 (zh) | | Voice signal processing method and device |
JP2013543987A (ja) | | Systems, methods, apparatus, and computer-readable media for far-field multi-source tracking and separation |
US11496830B2 (en) | | Methods and systems for recording mixed audio signal and reproducing directional audio |
US11122381B2 (en) | | Spatial audio signal processing |
CN109859769A (zh) | | Mask estimation method and device |
TW202312140A (zh) | | Conference terminal and feedback suppression method |
Ba et al. | | Enhanced MVDR beamforming for arrays of directional microphones |
CN111243615A (zh) | | Microphone array signal processing method and handheld device |
WO2023065317A1 (zh) | | Conference terminal and echo cancellation method |
WO2023056905A1 (zh) | | Sound source localization method, apparatus, and device |
JP7636512B2 (ja) | | Low complexity howling suppression for portable karaoke |
CN115508777B (zh) | | Speaker localization method, apparatus, and device |
Suzuki et al. | | Spot-forming method by using two shotgun microphones |
WO2022041030A1 (en) | | Low complexity howling suppression for portable karaoke |
CN117376757A (zh) | | Sound pickup method, processor, electronic device, and computer storage medium |
CN115512712A (zh) | | Echo cancellation method, apparatus, and device |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
 | 121 | EP: the EPO has been informed by WIPO that EP was designated in this application | Ref document number: 21961065; Country of ref document: EP; Kind code of ref document: A1 |
 | WWE | WIPO information: entry into national phase | Ref document number: 18685188; Country of ref document: US |
 | WWE | WIPO information: entry into national phase | Ref document number: 202180101805.3; Country of ref document: CN |
 | NENP | Non-entry into the national phase | Ref country code: DE |
 | 122 | EP: PCT application non-entry in European phase | Ref document number: 21961065; Country of ref document: EP; Kind code of ref document: A1 |