TW202234386A - Method, communication device and communication system for double-talk detection and echo cancellation - Google Patents
- Publication number
- TW202234386A
- Authority
- TW
- Taiwan
- Prior art keywords
- judgment result
- communication device
- far
- parameter group
- bilateral
- Prior art date
Landscapes
- Cable Transmission Systems, Equalization Of Radio And Reduction Of Echo (AREA)
- Telephone Function (AREA)
- Interconnected Communication Systems, Intercoms, And Interphones (AREA)
Description
The present invention relates to hands-free devices for remote audio or video conferencing, and in particular to a method, a communication device and a communication system for double-talk detection and echo cancellation.
To meet the needs of large meetings, hands-free devices used for video conferencing are usually cascaded, the speaker output volume is raised, and high-sensitivity microphones are used, so that participants can listen comfortably and the devices can pick up sound easily.
However, this approach causes the microphones of the hands-free devices to pick up more echo, which makes it difficult to achieve full duplex when the echo cancellation algorithm runs.
In view of this, the present invention provides a method, a communication device and a communication system for double-talk detection and echo cancellation, thereby solving the problem that excessive echo picked up by the microphone makes it difficult for the echo cancellation algorithm to achieve full duplex.
A method for double-talk detection and echo cancellation according to an embodiment of the present invention is applicable to a communication device that has an audio processing circuit and is configured to be electrically connected to a speaker and a microphone. The method includes: obtaining, by the communication device, a far-end signal to be played by the speaker; generating a recording by the microphone; performing, by the audio processing circuit, an audio processing procedure according to the far-end signal and the recording to generate a near-end amplitude; generating, by the audio processing circuit, a judgment result at least according to the near-end amplitude, historical information and threshold information, the judgment result indicating double-talk or single-talk; and adjusting, by the audio processing circuit, a parameter group of a dynamic equalizer according to the judgment result.
A communication device for double-talk detection and echo cancellation according to an embodiment of the present invention includes: a communication circuit for obtaining a far-end signal; and an audio processing circuit electrically connected to the communication circuit and configured to be electrically connected to a speaker and a microphone. The audio processing circuit transmits the far-end signal to the speaker, receives a recording generated by the microphone, performs an audio processing procedure according to the far-end signal and the recording to generate a near-end amplitude, generates a judgment result indicating double-talk or single-talk at least according to the near-end amplitude, historical information and threshold information, and adjusts a parameter group of a dynamic equalizer according to the judgment result, wherein the dynamic equalizer processes the far-end signal according to the parameter group.
A communication system for double-talk detection and echo cancellation according to an embodiment of the present invention includes a first communication device and a second communication device that are communicatively connected to each other. The first communication device includes: a first communication interface for obtaining a far-end signal; a first audio processing circuit electrically connected to the first communication interface and configured to be electrically connected to a first speaker and a first microphone, the first audio processing circuit transmitting the far-end signal to the first speaker, receiving a first recording generated by the first microphone, generating a first judgment result according to the far-end signal, the first recording and a second judgment result, the first judgment result indicating double-talk or single-talk, and adjusting a parameter group of a dynamic equalizer according to the first judgment result, wherein the dynamic equalizer processes the far-end signal according to the parameter group; and a first communication circuit electrically connected to the first audio processing circuit and used for sending the far-end signal processed by the dynamic equalizer and receiving the second judgment result. The second communication device includes: a second communication circuit for receiving the far-end signal sent by the first communication device; and a second audio processing circuit electrically connected to the second communication circuit and configured to be electrically connected to a second speaker and a second microphone, the second audio processing circuit transmitting the far-end signal to the second speaker, receiving a second recording generated by the second microphone, and generating the second judgment result according to the far-end signal and the second recording, the second judgment result indicating double-talk or single-talk; wherein the second communication circuit further sends the second judgment result to the first communication device.
The above description of the present disclosure and the following description of the embodiments are intended to demonstrate and explain the spirit and principle of the present invention, and to provide further explanation of the scope of the patent application of the present invention.
The detailed features and characteristics of the present invention are described in the embodiments below in sufficient detail to enable any person skilled in the relevant art to understand the technical content of the present invention and implement it accordingly. Based on the content disclosed in this specification, the scope of the patent application and the drawings, any person skilled in the relevant art can readily understand the related concepts and features of the present invention. The following embodiments further illustrate aspects of the present invention in detail, but do not limit its scope in any respect.
An applicable scenario of the present invention is, for example, playing audio or video through one or more online-conference hands-free devices (speakerphones) in a large conference room.
FIG. 1 is a hardware architecture diagram of a communication system for double-talk detection and echo cancellation according to an embodiment of the present invention. Double-talk is defined as follows: while the near-end (local) speaker plays the voice of a far-end participant, the near-end microphone picks up the voice of a near-end participant.
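In the embodiment shown in FIG. 1, the communication system includes a first communication device 10, a second communication device 20 and a third communication device 30 that are communicatively connected to one another. The three communication devices 10, 20 and 30 are connected in series in a daisy chain. In practice, the number of communication devices in the communication system can be chosen according to the size of the meeting space. If only one communication device is needed, the first communication device 10 is installed. If two or more communication devices are needed, one first communication device 10 and one or more second communication devices 20 are installed. The third communication device 30 in FIG. 1, for example, has the same hardware structure as the second communication device 20. Therefore, only the implementation details of the first and second communication devices 10 and 20 are described below.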
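The first communication device 10 includes a first communication interface 11, a first communication circuit 12, a first audio processing circuit 14 and a first conversion circuit 16. The first communication interface 11 is communicatively connected to a network N. The first audio processing circuit 14 is electrically connected to the first communication interface 11 and, through the first conversion circuit 16, to the first communication circuit 12, a first speaker 18 and a first microphone 19.
The first communication interface 11 receives a far-end signal from a far-end device over the network N. The far-end signal contains the voice information of the far-end participants.
The first communication circuit 12 connects the first and second communication devices 10 and 20 in series. The first communication circuit 12 can send the far-end signal processed by the first audio processing circuit 14 and the first conversion circuit 16, and can receive a second judgment result from the second communication device 20. The first communication circuit 12 uses, for example, low-voltage differential signaling (LVDS) and transmits data as differential signals, thereby increasing the transmission distance.
The first audio processing circuit 14 can transmit the far-end signal to the first speaker 18, receive a first recording generated by the first microphone 19, and generate a first judgment result according to the far-end signal, the first recording and the second judgment result. The first judgment result indicates double-talk or single-talk. The first audio processing circuit 14 adjusts the parameter group of a dynamic equalizer according to the first judgment result, and the dynamic equalizer processes the far-end signal according to the parameter group. The first audio processing circuit 14 is, for example, a central processing unit (CPU). The double-talk detection scheme is described later.
The first conversion circuit 16 can aggregate data and convert data formats. The first conversion circuit 16 is, for example, a field-programmable gate array (FPGA). In other embodiments, the first conversion circuit 16 may be integrated into the first audio processing circuit 14.
The first speaker 18 can play the far-end signal, as processed by the dynamic equalizer using the parameter group adjusted by the first audio processing circuit 14, to produce a first far-end sound. The first microphone 19 can record the sound of its surroundings to generate the first recording. In practice, the first speaker 18 and the first microphone 19 may be built into the first communication device 10 or connected to it externally.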
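The second communication device 20 includes a second communication circuit 22, a second audio processing circuit 24 and a second conversion circuit 26. The second communication circuit 22 is communicatively connected to the first communication circuit 12. The second audio processing circuit 24 is electrically connected, through the second conversion circuit 26, to the second communication circuit 22, a second speaker 28 and a second microphone 29.
The second communication circuit 22 can receive the far-end signal sent by the first communication device 10, and can send the second judgment result on double-talk produced by the second audio processing circuit 24 to the first communication device 10. The second communication circuit 22 is essentially identical in hardware to the first communication circuit 12.
The second audio processing circuit 24 can transmit the far-end signal to the second speaker 28, receive a second recording generated by the second microphone 29, and generate the second judgment result according to the far-end signal and the second recording. The second judgment result indicates double-talk or single-talk.
The second conversion circuit 26 is electrically connected to the second audio processing circuit 24. The second conversion circuit 26 is essentially identical in hardware to the first conversion circuit 16.
The second speaker 28 can play the far-end signal received from the first communication device 10 to produce a second far-end sound. The second microphone 29 can record the sound of its surroundings to generate the second recording. In practice, the second speaker 28 and the second microphone 29 may be built into the second communication device 20 or connected to it externally.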
FIG. 2 is a schematic diagram of the internal modules of the first and second audio processing circuits 14 and 24 according to an embodiment of the present invention. The blocks in FIG. 2 can be implemented in software or as hardware circuits. The arrows between blocks indicate the direction of data transfer. For ease of viewing, FIG. 2 omits the first communication interface 11, the first and second communication circuits 12 and 22, and the first and second conversion circuits 16 and 26.
FIG. 3 is a flowchart of a method for double-talk detection and echo cancellation according to an embodiment of the present invention, applicable to the first or second communication device 10 or 20 shown in FIG. 1. Please refer to FIG. 2 and FIG. 3 together below.
Step S31 is "the communication device obtains the far-end signal", and step S32 is "the microphone generates a recording".
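First, the first communication interface 11 obtains the far-end signal from the network N. The dynamic equalizer 141 processes the far-end signal according to a preset parameter group and then outputs it to the first and second speakers 18 and 28. The first speaker 18 plays the first far-end sound according to the processed far-end signal while the first microphone 19 records the sound around the first communication device 10 to generate the first recording. If a near-end participant speaks at this moment, the first recording contains both the participant's voice and the first far-end sound. Likewise, the second speaker 28 plays the second far-end sound and the second microphone 29 generates the second recording.
Step S33 is "the audio processing circuit performs an audio processing procedure according to the far-end signal and the recording to generate at least the near-end amplitude".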
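Referring to FIG. 2, the first recording is processed by modules including an adaptive filter 143, a nonlinear processor (NLP) 144, a noise suppressor (noise reduction, NR) 145, a mixer 146, an automatic gain controller (AGC) 147 and a compressor 148, and is finally transmitted through the network N to the far-end communication device. Among these modules, the adaptive filter 143 and the nonlinear processor 144 perform acoustic echo cancellation (AEC), the noise suppressor 145 suppresses noise, and the mixer 146 mixes the output signals of the two noise suppressors 145 and 245. The automatic gain controller 147 adjusts the loudness of the output signal of the mixer 146, and the compressor 148 prevents the sound from saturating.
The adaptive filter 243, nonlinear processor 244 and noise suppressor 245 in the second audio processing circuit 24 operate as described above. The difference is that the output signal of the noise suppressor 245 in the second audio processing circuit 24 is additionally routed to the mixer 146 in the first audio processing circuit 14.
The double-talk detectors 142 and 242 in the first and second audio processing circuits 14 and 24 detect whether double-talk is occurring around the first and second communication devices 10 and 20, respectively.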
Among multiple communication devices connected in series in a daisy chain, the communication device at the end of the chain makes a combined judgment based on the near-end amplitude, the far-end speech probability, the far-end amplitude, historical information and threshold information to produce the double-talk judgment result. The near-end amplitude reflects the volume of the near-end sound. The far-end speech probability reflects the probability that the far-end sound is speech. The far-end amplitude reflects the volume of the far-end sound. The historical information includes a near-end amplitude attenuation value, a far-end speech probability attenuation value and a far-end amplitude attenuation value; these three attenuation values reflect the previous near-end amplitude, far-end speech probability and far-end amplitude. The threshold information contains an interval formed by a high threshold and a low threshold, where the high threshold is greater than the low threshold. A communication device that is not at the end of the chain relies on the same information and additionally refers to the judgment result of the next-stage communication device to produce its own double-talk judgment result. Taking FIG. 1 as an example, the next-stage communication device of the first communication device 10 is the second communication device 20, and the next-stage communication device of the second communication device 20 is the third communication device 30.
Step S34 is "the audio processing circuit generates a judgment result indicating double-talk or single-talk at least according to the near-end amplitude, historical information and threshold information". Assuming that the communication system of an embodiment of the present invention includes only the second communication device 20 at the end of the chain and the first communication device 10 not at the end, two implementations for producing the judgment result are described below by way of example.
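In the first implementation of step S34, the double-talk detector 242 in the second communication device 20 performs the following operations:
First, the audio frame currently output by the noise suppressor 245 is divided into several blocks of K samples each. The largest amplitude among the K samples of each block is taken, and the per-block maxima of all blocks are then averaged. The average amplitude obtained in this way is the near-end amplitude A2.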
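The block-peak averaging described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the function name `near_end_amplitude`, the use of absolute sample values as the amplitude, and the handling of a short final block are all assumptions that the text does not specify.

```python
def near_end_amplitude(frame, block_size):
    """Average of the per-block peak amplitudes of one audio frame.

    frame      -- sequence of samples output by the noise suppressor
    block_size -- K, the number of samples per block (assumed > 0)
    """
    if block_size <= 0:
        raise ValueError("block_size must be positive")
    peaks = []
    for start in range(0, len(frame), block_size):
        block = frame[start:start + block_size]
        # Assumption: "amplitude" means the absolute sample value.
        peaks.append(max(abs(sample) for sample in block))
    # Assumption: an empty frame yields an amplitude of 0.0.
    return sum(peaks) / len(peaks) if peaks else 0.0
```

For a six-sample frame `[1, -3, 2, 0, 5, -1]` with K = 3, the block peaks are 3 and 5, so the sketch returns 4.0.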
Next, the near-end amplitude attenuation value A2' is obtained from the historical information; it is the product of the previous calculation result for the output of the noise suppressor 245 and an attenuation coefficient. The previous first flag F21', which is the result set by the second audio processing circuit 24 the previous time, is also obtained from the historical information. A high threshold TH and a low threshold TL are obtained from the threshold information, where TH > TL. The historical information and the threshold information are stored, for example, in a storage unit of the second audio processing circuit 24.
Then, the larger of the near-end amplitude A2 and the near-end amplitude attenuation value A2' is selected as A2max.
If TH < A2max, the first flag F21 is set to 1;
if A2max < TL, the first flag F21 is set to 0;
if TL ≤ A2max ≤ TH, the first flag F21 keeps its value, that is, the previous first flag F21' is used as the current first flag F21; and
A2max is multiplied by the attenuation coefficient to obtain the near-end amplitude attenuation value A2' for the next calculation.
When the first flag F21 is set (F21 = 1), the second judgment result indicates double-talk. When the first flag F21 is cleared (F21 = 0), the second judgment result indicates single-talk.
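The flag update above is a hysteresis comparison with a decaying peak hold. A minimal sketch follows; the function name `update_flag` and its argument names are hypothetical, and the numeric values in the usage note are illustrative only.

```python
def update_flag(value, decayed_prev, prev_flag, th_high, th_low, decay):
    """One hysteresis update of a double-talk flag.

    value        -- current measurement (for example the near-end amplitude A2)
    decayed_prev -- attenuation value from the history (A2')
    prev_flag    -- flag from the previous update (F21')
    Returns (flag, next_decayed), where next_decayed is the attenuation
    value to store for the next update.
    """
    peak = max(value, decayed_prev)   # A2max
    if peak > th_high:
        flag = 1                      # set the flag
    elif peak < th_low:
        flag = 0                      # clear the flag
    else:
        flag = prev_flag              # inside [TL, TH]: hold the previous flag
    return flag, peak * decay
```

With TH = 0.5, TL = 0.25 and an attenuation coefficient of 0.5, a single loud frame of 0.75 followed by silence sets the flag, holds it for one more frame while the decayed peak is still inside the interval, and clears it on the frame after that.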
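Refer to the operation flow described above. The noise suppressor 145 in the first communication device 10 produces the near-end amplitude A1 in the same way. The first audio processing circuit 14 obtains the near-end amplitude attenuation value A1' and the previous first flag F11' from the historical information, obtains the high threshold TH and the low threshold TL from the threshold information, and then sets, clears or holds the first flag F11 in the manner described above. When the first flag F11 is set (F11 = 1), the first judgment result indicates double-talk. The difference from the above is that when the first flag F11 in the first communication device 10 is cleared (F11 = 0) but the first flag F21 in the second communication device 20 is set (F21 = 1), the first judgment result still indicates double-talk. The first judgment result indicates single-talk only when the first flags F11 and F21 of both communication devices 10 and 20 are cleared. In short, the double-talk detector 142 performs the same operations as the double-talk detector 242 and additionally receives the second judgment result through the first communication circuit 12 as a reference.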
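In the second implementation of step S34, the second audio processing circuit 24 performs the audio processing procedure according to the far-end signal and the second recording to generate the near-end amplitude A2, the far-end speech probability P2 and the far-end amplitude F2.
The historical information obtained by the double-talk detector 242 includes the near-end amplitude attenuation value A2', the far-end speech probability attenuation value P2', the far-end amplitude attenuation value F2', the previous first flag F21', the previous second flag F22' and the previous third flag F23'. The threshold information obtained by the double-talk detector 242 includes a near-end amplitude interval R1, a far-end speech interval R2 and a far-end amplitude interval R3. Each interval Rn is formed by a high threshold THn and a low threshold TLn, that is, Rn = [TLn, THn] with THn > TLn and n ∈ {1, 2, 3}.
The double-talk detector 242 produces the second judgment result at least according to the near-end amplitude A2, the historical information and the threshold information, as follows: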
The larger of the near-end amplitude A2 and the near-end amplitude attenuation value A2' is selected as the first larger value A2max;
when the first larger value A2max is above the near-end amplitude interval R1, that is, TH1 < A2max, the first flag F21 is set to 1;
when the first larger value A2max is below the near-end amplitude interval R1, that is, A2max < TL1, the first flag F21 is cleared to 0;
when the first larger value A2max is within the near-end amplitude interval R1, that is, TL1 ≤ A2max ≤ TH1, the previous first flag F21' is used as the current first flag F21;
the larger of the far-end speech probability P2 and the far-end speech probability attenuation value P2' is selected as the second larger value P2max;
when the second larger value P2max is above the far-end speech interval R2, that is, TH2 < P2max, the second flag F22 is set to 1;
when the second larger value P2max is below the far-end speech interval R2, that is, P2max < TL2, the second flag F22 is cleared to 0;
when the second larger value P2max is within the far-end speech interval R2, that is, TL2 ≤ P2max ≤ TH2, the previous second flag F22' is used as the current second flag F22;
the larger of the far-end amplitude F2 and the far-end amplitude attenuation value F2' is selected as the third larger value F2max;
when the third larger value F2max is above the far-end amplitude interval R3, that is, TH3 < F2max, the third flag F23 is set to 1;
when the third larger value F2max is below the far-end amplitude interval R3, that is, F2max < TL3, the third flag F23 is cleared to 0;
when the third larger value F2max is within the far-end amplitude interval R3, that is, TL3 ≤ F2max ≤ TH3, the previous third flag F23' is used as the current third flag F23; and
the first, second and third larger values A2max, P2max and F2max are each multiplied by the corresponding attenuation coefficient to obtain the near-end amplitude attenuation value A2', the far-end speech probability attenuation value P2' and the far-end amplitude attenuation value F2' needed for the next operation.
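After the double-talk detector 242 completes the above operations, it applies the following logic to the three flags F21, F22 and F23: when the first flag F21 is set, or when the second flag F22 and the third flag F23 are both set, the second judgment result D2 indicates double-talk, that is, D2 = (F21 or (F22 and F23)). If D2 is 1, the second communication device 20 is currently in double-talk; if D2 is 0, the second communication device 20 is currently in single-talk.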
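The three-flag decision can be sketched by keeping one hysteresis tracker per measurement and combining the flags with the stated logic. The class `HysteresisFlag`, the function `double_talk_decision` and every numeric threshold below are hypothetical; the patent gives no concrete values.

```python
from dataclasses import dataclass


@dataclass
class HysteresisFlag:
    """Flag with a decaying peak hold; state persists between updates."""
    th_high: float        # THn
    th_low: float         # TLn
    decay: float          # attenuation coefficient (assumed per-measurement)
    flag: int = 0         # previous flag value
    decayed: float = 0.0  # stored attenuation value

    def update(self, value):
        peak = max(value, self.decayed)
        if peak > self.th_high:
            self.flag = 1
        elif peak < self.th_low:
            self.flag = 0
        # Otherwise the previous flag is kept unchanged.
        self.decayed = peak * self.decay
        return self.flag


def double_talk_decision(f_near, f_speech, f_far):
    """D = F_near or (F_speech and F_far), returned as 0 or 1."""
    return int(bool(f_near) or (bool(f_speech) and bool(f_far)))


# Illustrative thresholds only.
near = HysteresisFlag(th_high=0.5, th_low=0.25, decay=0.5)
speech = HysteresisFlag(th_high=0.75, th_low=0.5, decay=0.5)
far = HysteresisFlag(th_high=0.5, th_low=0.25, decay=0.5)

d2 = double_talk_decision(near.update(0.125), speech.update(1.0), far.update(0.75))
```

In this example the near-end flag stays cleared, but the far-end speech and far-end amplitude flags are both set, so `d2` is 1.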
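In addition, the second communication circuit 22 sends the second judgment result D2 to the double-talk detector 142 of the first communication device 10.
Likewise, following the calculation used by the double-talk detector 242, the double-talk detector 142 in the first audio processing circuit 14 computes the three flags F11, F12 and F13 of the first communication device 10 from the near-end amplitude A1, the far-end speech probability P1, the far-end amplitude F1, and the historical information and threshold information corresponding to these three quantities. Note that the logic for the first judgment result D1 additionally refers to the second judgment result D2, that is, D1 = (F11 or (F12 and F13) or D2). If D1 is 1, the first communication device 10 is currently in double-talk; if D1 is 0, the first communication device 10 is currently in single-talk.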
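Because each device that is not at the end of the chain ORs its own decision with the result of the next stage, a double-talk result anywhere downstream propagates toward the head of the daisy chain. A minimal sketch, assuming the local decisions (for example F11 or (F12 and F13)) are already available as a list ordered from the head device to the end device; the function name `chain_results` is hypothetical.

```python
def chain_results(local_results):
    """Combine per-device local decisions along a daisy chain.

    local_results[0] is the head device (e.g. device 10); the last entry
    is the device at the end of the chain. Returns the judgment result of
    each device after taking the next stage's result into account.
    """
    combined = [0] * len(local_results)
    downstream = 0  # the end device has no next stage
    for i in range(len(local_results) - 1, -1, -1):
        downstream = int(bool(local_results[i]) or bool(downstream))
        combined[i] = downstream
    return combined
```

For three devices with local decisions `[0, 1, 0]`, the sketch returns `[1, 1, 0]`: the head device reports double-talk because its next stage does, while the end device is unaffected by upstream devices.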
In the second implementation of step S34, the double-talk detectors 142 and 242 determine double-talk if any one of the following three conditions is met:
(1) the volume of the near-end sound over the two most recent samplings exceeds a threshold;
(2) the probability that the far-end sound over the two most recent playbacks is speech exceeds another threshold, and the volume of the far-end sound exceeds yet another threshold; and
(3) the next-stage communication device determines double-talk (a communication device at the end of the daisy chain does not need to consider this condition).
Step S35 is "the audio processing circuit adjusts the parameters of the dynamic equalizer according to the judgment result".
After the first judgment result D1 is generated, the first audio processing circuit 14 adjusts the parameter group of the dynamic equalizer according to the first judgment result D1.
In an embodiment, the parameter group of the dynamic equalizer can be adjusted to a single-talk parameter group or a double-talk parameter group, each containing one or more parameters. The single-talk parameter group is the default when the first communication device 10 is powered on and is used when only speech or only echo is present. The double-talk parameter group is used during double-talk.
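FIG. 4 is a flowchart of the first audio processing circuit 14 setting the parameters of the dynamic equalizer.
Step S41 is "start double-talk detection"; its details are as described above.
The flow from step S42 to step S43 covers the case where the judgment result indicates double-talk: it is determined whether the current parameter group of the dynamic equalizer 141 is the double-talk parameter group. When step S43 yields "yes", step S44 is executed and the current double-talk parameter group is kept. When step S43 yields "no", step S45 is executed and the parameter group of the dynamic equalizer 141 is gradually adjusted from the single-talk parameter group to the double-talk parameter group, until the curve of the dynamic equalizer 141 matches the double-talk parameter group.
The flow from step S42 to step S46 covers the case where the judgment result indicates single-talk: it is determined whether the current parameter group of the dynamic equalizer 141 is the single-talk parameter group. When step S46 yields "yes", step S47 is executed and the current single-talk parameter group is kept. When step S46 yields "no", step S48 is executed and the parameter group of the dynamic equalizer 141 is gradually adjusted from the double-talk parameter group to the single-talk parameter group, until the curve of the dynamic equalizer 141 matches the single-talk parameter group.
After step S44, S45, S47 or S48 is completed, the flow returns to step S41.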
One implementation of the "gradual adjustment" in steps S45 and S48 is as follows: for each configurable parameter of the dynamic equalizer, the parameter in the parameter group is adjusted by an offset once per unit of time until it equals the corresponding parameter of the single-talk or double-talk parameter group. The gain value of a parameter in the single-talk parameter group is greater than the gain value of a parameter in the double-talk parameter group. The parameters in the double-talk parameter group are set to negative gain, thereby suppressing the output volume of the first and second speakers 18 and 28. The double-talk parameter values suppress more strongly at frequencies where speaker distortion is severe, and suppress weakly or not at all at frequencies where distortion is less severe. This reduces the distortion of the far-end speech in the sound picked up by the first and second microphones 19 and 29.
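The gradual adjustment can be sketched as a per-tick step toward the target parameter group. This is a sketch under stated assumptions: each parameter is modeled as a per-band gain in dB, the same offset is used for every band, and the names `step_toward` and `eq_tick` and all numeric values are hypothetical.

```python
def step_toward(current, target, offset):
    """Move each gain by at most `offset` toward its target value."""
    if offset <= 0:
        raise ValueError("offset must be positive")
    stepped = []
    for cur, tgt in zip(current, target):
        if abs(tgt - cur) <= offset:
            stepped.append(tgt)           # close enough: land on the target
        elif tgt > cur:
            stepped.append(cur + offset)
        else:
            stepped.append(cur - offset)
    return stepped


def eq_tick(current, is_double_talk, single_params, double_params, offset):
    """One unit-time update: pick the target group, then take one step."""
    target = double_params if is_double_talk else single_params
    return step_toward(current, target, offset)


# Illustrative gains in dB for two bands; the first band is assumed to be
# the one with the more severe speaker distortion.
SINGLE = [0.0, 0.0]
DOUBLE = [-6.0, -2.0]

gains = list(SINGLE)
for _ in range(3):
    gains = eq_tick(gains, True, SINGLE, DOUBLE, offset=2.0)
```

After three ticks of double-talk the gains have reached `[-6.0, -2.0]`; once the target is reached, further ticks leave the parameter group unchanged, which corresponds to the "keep the current setting" branches of FIG. 4.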
In summary, the method, communication device and communication system for double-talk detection and echo cancellation proposed by the present invention can not only increase the pickup and listening distance of online conferences, but can also perform echo cancellation in full-duplex mode. The present invention can improve the fluency of online conferences and give all participants the sense of being in the same place. The present invention can avoid the situation in which one party's voice cannot be transmitted while the other party is speaking. By adjusting the parameters of the dynamic equalizer, the present invention can suppress the frequency bands in which the speaker exhibits nonlinear distortion, while reducing the adjustment or making none at all in other bands. The cascaded hands-free devices (speakerphones) used in the present invention therefore do not need a particularly high volume from any single speaker, which avoids exposing participants seated near a speaker in a large conference room to uncomfortably loud sound. In addition, the cascaded hands-free devices used in the present invention do not need high-sensitivity microphones.
Although the present invention is disclosed above by way of the foregoing embodiments, they are not intended to limit the present invention. Changes and modifications made without departing from the spirit and scope of the present invention fall within the scope of patent protection of the present invention. For the scope of protection defined by the present invention, please refer to the appended claims.
N: network
10: first communication device
11: first communication interface
12: first communication circuit
14: first audio processing circuit
16: first conversion circuit
18: first speaker
19: first microphone
20: second communication device
22: second communication circuit
24: second audio processing circuit
28: second speaker
29: second microphone
30: third communication device
32: third communication circuit
34: third audio processing circuit
38: third speaker
39: third microphone
141: dynamic equalizer
142, 242: double-talk detectors
143, 243: adaptive filters
144, 244: nonlinear processors
145, 245: noise suppressors
146: mixer
147: automatic gain controller
148: compressor
S31~S35, S41~S49: steps
FIG. 1 is a hardware architecture diagram of a communication system for double-talk detection and echo cancellation according to an embodiment of the present invention;
FIG. 2 is a schematic diagram of the internal modules of the first and second audio processing circuits according to an embodiment of the present invention;
FIG. 3 is a flowchart of a method for double-talk detection and echo cancellation according to an embodiment of the present invention; and
FIG. 4 is a flowchart of the first audio processing circuit setting the parameter group of the dynamic equalizer.
Claims (9)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
TW110106427A TWI778524B (en) | 2021-02-24 | 2021-02-24 | Method, communication device and communication system for double-talk detection and echo cancellation |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
TW110106427A TWI778524B (en) | 2021-02-24 | 2021-02-24 | Method, communication device and communication system for double-talk detection and echo cancellation |
Publications (2)
Publication Number | Publication Date |
---|---|
TW202234386A (en) | 2022-09-01 |
TWI778524B (en) | 2022-09-21 |
Family
ID=84957310
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
TW110106427A TWI778524B (en) | 2021-02-24 | 2021-02-24 | Method, communication device and communication system for double-talk detection and echo cancellation |
Country Status (1)
Country | Link |
---|---|
TW (1) | TWI778524B (en) |
Family Cites Families (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CA2213297A1 (en) * | 1996-10-04 | 1998-04-04 | Jeffrey Phillip Mcateer | Intelligent acoustic systems peripheral |
CN101151840B (en) * | 2005-01-10 | 2011-09-21 | 四次方有限公司 | Integrated architecture for the unified processing of visual media |
US7912211B1 (en) * | 2007-03-14 | 2011-03-22 | Clearone Communications, Inc. | Portable speakerphone device and subsystem |
GB2547063B (en) * | 2014-10-30 | 2018-01-31 | Imagination Tech Ltd | Noise estimator |
-
2021
- 2021-02-24 TW TW110106427A patent/TWI778524B/en active
Also Published As
Publication number | Publication date |
---|---|
TWI778524B (en) | 2022-09-21 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
GD4A | Issue of patent certificate for granted invention patent |