TW201028996A - Methods and apparatus for noise estimation - Google Patents
- Publication number
- TW201028996A
- Authority
- TW
- Taiwan
- Prior art keywords
- noise
- noise level
- standard deviation
- level
- average
- Prior art date
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
- G10L25/78—Detection of presence or absence of voice signals
- G10L25/84—Detection of presence or absence of voice signals for discriminating voice from noise
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
- G10L25/78—Detection of presence or absence of voice signals
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/02—Speech enhancement, e.g. noise reduction or echo cancellation
- G10L21/0208—Noise filtering
- G10L21/0216—Noise filtering characterised by the method used for estimating noise
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/02—Speech enhancement, e.g. noise reduction or echo cancellation
- G10L21/0208—Noise filtering
- G10L21/0216—Noise filtering characterised by the method used for estimating noise
- G10L21/0232—Processing in the frequency domain
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
- G10L25/48—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04R—LOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
- H04R3/00—Circuits for transducers, loudspeakers or microphones
- H04R3/005—Circuits for transducers, loudspeakers or microphones for combining the signals of two or more microphones
Landscapes
- Engineering & Computer Science (AREA)
- Acoustics & Sound (AREA)
- Physics & Mathematics (AREA)
- Signal Processing (AREA)
- Health & Medical Sciences (AREA)
- Human Computer Interaction (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Computational Linguistics (AREA)
- Multimedia (AREA)
- Quality & Reliability (AREA)
- General Health & Medical Sciences (AREA)
- Otolaryngology (AREA)
- Noise Elimination (AREA)
- Circuit For Audible Band Transducer (AREA)
- Cable Transmission Systems, Equalization Of Radio And Reduction Of Echo (AREA)
Description
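VI. Description of the Invention:

[Technical Field]

The present invention relates generally to methods and apparatus for noise level/spectrum estimation and voice activity detection and, more particularly, to the use of a probabilistic model for estimating the noise level and detecting the presence of speech.

This application claims priority from U.S. Provisional Patent Application No. 61/105,727, filed October 15, 2008, the entire contents of which are incorporated herein by reference.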
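[Prior Art]

Communication technologies continue to evolve in many fields, often presenting new challenges. With the advent of mobile phones and wireless headsets, true full-duplex conversations can now take place in very noisy environments, i.e., environments with a low signal-to-noise ratio (SNR). Signal enhancement and noise suppression become critical in these situations. The intelligibility of the desired speech is enhanced by suppressing the unwanted noisy signal before the signal is sent to the listener at the other end. Detecting the presence of speech within a noisy background is one important component of signal enhancement and noise suppression. To achieve improved speech detection, some systems divide an incoming signal into a plurality of different time/frequency frames and estimate the probability of the presence of speech in each frame.

One of the biggest challenges in detecting the presence of speech is tracking the noise floor, particularly the non-stationary noise level when a single microphone/sensor is used. Voice activity detection is widely used in modern communication devices, especially modern mobile devices that operate under low signal-to-noise ratios, such as mobile phones and wireless headset devices. In most of these devices, signal enhancement and noise suppression are performed on the noisy signal before it is sent to the listener at the other end; this is done to improve the intelligibility of the desired speech. In signal enhancement/noise suppression, a speech or voice activity detector (VAD) is used to detect the presence of the desired speech in a noise-contaminated signal. This detector may generate a binary decision on the presence or absence of speech, or may generate a probability of speech presence.

One challenge in detecting the presence of speech is determining the upper and lower bounds of the level of background noise in a signal, also known as the noise "ceiling" and "floor." This is especially true for non-stationary noise with a single microphone input. Further, it is even more challenging to track rapid variations in the noise level caused by physical movement of the device or of the person using the device.

[Summary of the Invention]

In certain embodiments, a method for estimating the noise level in a current frame of an audio signal is disclosed. The method comprises determining the noise levels of a plurality of audio frames and calculating the mean and the standard deviation of the noise levels over the plurality of audio frames. A noise level estimate of a current frame is calculated using the value of the standard deviation subtracted from the mean.

In certain embodiments, a noise determination system is disclosed. The system comprises a module configured to determine the noise levels of a plurality of audio frames, and one or more modules configured to calculate the mean and the standard deviation of the noise levels over the plurality of audio frames. The system may also include a module configured to calculate a noise level estimate of the current frame as the value of the standard deviation subtracted from the mean.

In some embodiments, a method for estimating a noise level of a signal in a plurality of time-frequency bins is disclosed; the method may be implemented on one or more computer systems. For each bin of the signal, the method determines the noise levels of a plurality of audio frames, estimates the noise level in the time-frequency bin, determines a preliminary noise level in the time-frequency bin, determines a secondary noise level in the time-frequency bin from the preliminary noise level, and determines a bounded noise level from the secondary noise level in the time-frequency bin.

Some embodiments disclose a system for estimating the noise level in a current frame of an audio signal. The system may comprise: means for determining the noise levels of a plurality of audio frames; means for calculating the mean and the standard deviation of the noise levels over the plurality of audio frames; and means for calculating a noise level estimate of the current frame as the value of the standard deviation subtracted from the mean.

In certain embodiments, a computer-readable medium is disclosed comprising instructions that, when executed on a processor, perform a method. The method comprises: determining the noise levels of a plurality of audio frames; calculating the mean and the standard deviation of the noise levels over the plurality of audio frames; and calculating a noise level estimate of a current frame as the value of the standard deviation subtracted from the mean.

[Detailed Description]

Various configurations are illustrated in the accompanying drawings by way of example and not by way of limitation.

Embodiments of the invention comprise methods and systems for determining the noise level in a signal and, in some instances, subsequently detecting speech. These embodiments include a number of significant advances over the prior art. One improvement relates to estimating the background noise in a speech signal based on the mean value of the background noise from prior and current audio frames. This differs from other systems, which calculate the present background noise level for a frame of speech based on the minimum noise values from earlier and present audio frames. Traditionally, researchers have looked at the minimum of the earlier noise values to estimate the present noise level. In one embodiment, however, the estimated noise signal level is calculated from several past frames, the mean of this ensemble is computed rather than the minimum, and a scaled standard deviation is subtracted from that mean. The resulting value advantageously provides a more accurate estimate of the noise level of the current audio frame than is typically provided by the ensemble minimum.

Furthermore, this estimated noise level can be dynamically bounded based on the incoming signal level so as to maintain a more accurate estimate of the noise. The estimated noise level may additionally be "smoothed" or "averaged" with earlier values to minimize discontinuities. The estimated noise level may then be used to identify speech in frames that have energy levels above the noise level. This may be determined by computing the a posteriori signal-to-noise ratio (SNR), which in turn may be used by a non-linear sigmoidal activation function to generate calibrated probabilities of the presence of speech.

Referring to FIG. 1, a traditional voice activity detection (VAD) system 100 receives an incoming signal 101 comprising segments having background noise and segments having both background noise and speech. The VAD system 100 breaks the time signal 101 into frames 103a to 103d. Each of these frames 103a to 103d is then passed to a classification module 104, which determines the class (noise or speech) in which to place the given frame.

The classification module 104 computes the energy of a given signal and compares that energy with a time-varying threshold corresponding to an estimate of the noise floor. That noise floor estimate may be updated with each incoming frame. In some embodiments, the frame is classified as speech activity if the estimated energy level of the frame signal is higher than the measured noise floor within the specific frame. Hence, in this module, noise spectrum estimation is the fundamental component of speech recognition and, if desired, subsequent enhancement. The robustness of such systems, particularly in low-SNR and non-stationary noise environments, is affected most by the ability to reliably track rapid variations in the noise statistics.

Conventional VAD-based noise estimation methods restrict the update of the noise estimate to periods of speech absence. However, the reliability of these VADs deteriorates severely for weak speech components and low input SNRs. Other techniques, based on power spectral density histograms, are computationally expensive, require large memory resources, perform poorly under low-SNR conditions, and are therefore unsuitable for mobile phone and Bluetooth headset applications. Minimum statistics is another method used for noise spectrum estimation; it operates by taking the minimum of a plurality of past frames as the noise estimate. Unfortunately, this method works well for stationary noise but performs poorly in non-stationary environments.

One embodiment comprises a noise spectrum estimation system and method that is very effective in tracking many kinds of unwanted audio signals, including highly non-stationary noise environments such as "party noise" or "babble noise." The system generates an accurate noise floor even in environments that are not conducive to such an estimation. This estimated noise floor is used in computing the a posteriori SNR, which in turn is used in a sigmoid function, the "logistic function," to determine the probability of the presence of speech. In some embodiments, a speech determination module is used for this function.

Let $x[n]$ and $d[n]$ denote the desired speech and the uncorrelated additive noise signals, respectively. The observed or contaminated signal $y[n]$ is simply their sum, given by:

$$y[n] = x[n] + d[n] \tag{1}$$

Two hypotheses, $H_0[n]$ and $H_1[n]$, respectively indicate speech absence and speech presence in the $n$th time frame. In some embodiments, the past energy level values of the noisy measurement may be recursively averaged during periods of speech absence. In contrast, the estimate may be held constant during speech presence. Specifically,

$$H_0[n]:\ \lambda_d[n] = \alpha_d\,\lambda_d[n-1] + (1-\alpha_d)\,\sigma_y^2[n] \tag{2}$$

$$H_1[n]:\ \lambda_d[n] = \lambda_d[n-1] \tag{3}$$

where $\sigma_y^2[n] = \sum_{i=n-100}^{n} |y[i]|^2$ is the energy of the noisy signal at time frame $n$, and $\alpha_d$ denotes a smoothing parameter between 0 and 1. However, because it is not always clear when speech is present, it may not be clear when to apply each of the methods $H_0$ or $H_1$. One may instead employ a "conditional speech presence probability," which estimates the recursive average by updating the smoothing factor $\alpha_s$ over time:

$$\lambda_d[n] = \alpha_s[n]\,\lambda_d[n-1] + (1-\alpha_s[n])\,\sigma_y^2[n] \tag{4}$$

where

$$\alpha_s[n] = \alpha_d + (1-\alpha_d)\,\mathrm{prob}[n] \tag{5}$$

In this manner, a more accurate estimate may be obtained when the presence of speech is not known.

Others have previously considered methods based on minimum statistics for noise level estimation. For example, one can look at the estimated noisy signal level $\lambda_d$ over, say, the past 100 frames, compute the minimum of the ensemble, and declare it to be the estimated noise level, i.e.,

$$\hat{\sigma}_n^2[n] = \min\big[\lambda_d(n-100:n)\big] \tag{6}$$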
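Here, $\min[x]$ denotes the minimum of the entries of vector $x$, and $\hat{\sigma}_n^2[n]$ is the estimated noise level in time frame $n$. The operation may be performed over more or fewer than 100 frames; 100 is offered here and throughout this specification only as an example range. This approach works well for stationary noise but performs poorly in non-stationary environments.

To address this and other problems, embodiments of the invention use the techniques described below to improve the overall detection efficiency of the system.

Mean Statistics

In one embodiment, the systems and methods of the invention use mean statistics, rather than minimum statistics, to calculate a noise floor. Specifically, the signal energy $\sigma_1^2$ is calculated by subtracting a scaled standard deviation $\sigma$ of the past frame values from the mean $\bar{\lambda}_d$. The current energy level $\sigma_2^2$ is then selected as the minimum of all the previously calculated signal energies $\sigma_1^2$ from the past frames:

$$\sigma_1^2[n] = \bar{\lambda}_d[n-100:n] - \alpha\,\sigma\big(\lambda_d[n-100:n]\big) \tag{7}$$

$$\sigma_2^2[n] = \min\big(\sigma_1^2[n-100:n]\big) \tag{8}$$

where $\bar{x}$ denotes the mean of the entries of vector $x$ and $\alpha$ is the factor that scales the standard deviation. Embodiments of the invention contemplate subtracting a scaled standard deviation of the estimated noise levels over a number of past frames from the mean of the estimated noise levels over the same 100 frames.

Speech Detection Using the Noise Estimate

Once the noise estimate $\hat{\sigma}_n^2$ has been calculated, speech may be inferred by identifying regions of high SNR. In particular, a mathematical model may be developed that accurately estimates the calibrated probabilities of the presence of speech using a classifier based on logistic regression. In some embodiments, a feature-based classifier may be used. Since the short-term spectra of speech are well modeled by log distributions, one may use the logarithm of the estimated a posteriori SNR, rather than the SNR itself, as the set of features, i.e.,

$$\chi[n] = 10\left\{\log_{10}\left[\sum_{i=n-100}^{n} |y[i]|^2\right] - \log_{10}\hat{\sigma}_n^2[n]\right\} \tag{9}$$

For stability, the above quantity may also be smoothed over time:

$$\hat{\chi}[n] = \beta_1\,\hat{\chi}[n-1] + (1-\beta_1)\,\chi[n], \qquad \beta_1 \in [0.75, 0.85] \tag{10}$$

A non-linear and memoryless activation function known as the logistic function may then be used for the desired speech detection. The probability of the presence of speech at time frame $n$ is given by:

$$\mathrm{prob}[n] = \frac{1}{1+\exp(-\hat{\chi}[n])} \tag{11}$$

If desired, the estimated probability can also be smoothed over time using a small forgetting factor in order to track sudden bursts of speech. To obtain a binary decision on speech absence and presence, the estimated probability ($\mathrm{prob} \in [0,1]$) can be compared with a pre-selected threshold. Higher values of prob indicate a higher probability of the presence of speech. For instance, the presence of speech in time frame $n$ may be declared if $\mathrm{prob}[n] > 0.7$. Otherwise, the frame may be considered to contain only non-speech activity. The proposed embodiments produce more accurate speech detection as a result of the more accurate noise level determination.

Improved Noise Estimation

Calculating the mean and the standard deviation requires enough memory to store the past frame estimates. This requirement may be prohibitive for certain applications/devices that have limited memory, such as certain small portable devices. In such cases, the following approximations may be used in place of the above calculations. An approximation to the mean estimate may be computed by exponentially averaging the power estimate $x(n)$ with a smoothing constant $\alpha_M$. Similarly, an approximation to the variance estimate may be computed by exponentially averaging the square of the power estimate with a smoothing constant $\alpha_V$, where $n$ denotes the frame index:

$$\bar{x}(n) = \alpha_M\,\bar{x}(n-1) + (1-\alpha_M)\,x(n) \tag{12}$$

$$\bar{v}(n) = \alpha_V\,\bar{v}(n-1) + (1-\alpha_V)\,x^2(n) \tag{13}$$

An approximation to the standard deviation estimate may then be obtained by taking the square root of the variance estimate $\bar{v}(n)$. The smoothing constants $\alpha_M$ and $\alpha_V$ may be chosen in the range [0.95, 0.99] to correspond to averaging over 20 to 100 frames. Furthermore, an approximation to $\sigma_1^2[n]$ may be obtained by computing the difference between the mean estimate and the scaled standard deviation estimate. Once the mean-minus-scaled-standard-deviation estimate is obtained, minimum statistics may be performed on that difference over a set of, say, 100 frames.

This feature alone provides superior tracking of non-stationary noise peaks as compared to minimum statistics. In some embodiments, the standard deviation of the noise level is subtracted to compensate for desired speech peaks affecting the noise level estimate. However, excessive subtraction in Equation 7 may result in an underestimated noise level. To address this problem, a long-term average may be run during speech absence, i.e.,

$$H_0[n]:\ \lambda_{d_1}[n] = \alpha_1\,\lambda_{d_1}[n-1] + (1-\alpha_1)\,\sigma_y^2[n] \tag{14}$$

$$H_1[n]:\ \lambda_{d_1}[n] = \lambda_{d_1}[n-1] \tag{15}$$

where $\alpha_1 = 0.9999$ is the smoothing factor, and the noise level is estimated as:

$$\hat{\sigma}^2[n] = \max\big(\sigma_2^2[n],\ \lambda_{d_1}[n]\big) \tag{16}$$

Noise Bounding

When the incoming signal is very clean (high SNR), the noise level is typically underestimated. One way to resolve this issue is to lower-bound the noise level so that it is at least, say, 18 dB below the desired signal level $\sigma_{desired}^2$. Lower bounding can be accomplished using the following flooring operation, in which conditions that cannot be recovered from the source are shown as [...]:

    $\sigma_{desired}^2[n] = \alpha_2\,\sigma_{desired}^2[n-1] + (1-\alpha_2)\sum_{i=n-100}^{n} |y[i]|^2$
    $\mathrm{SNR\_diff}[n] = \mathrm{SNR\_estimate}[n] - \mathrm{Longterm\_Avg\_SNR}[n]$
    If $\sum_{i=n-100}^{n} |y[i]|^2 > \Delta_1$
        If [...] $> \Delta_2$
            $\mathrm{floor}_1[n] = \sigma_{desired}^2[n] / \Delta_3$
            If $\mathrm{floor}[n-1] < \mathrm{floor}_1[n]$
                $\mathrm{floor}[n] = \mathrm{floor}_1[n]$
            Elseif $\mathrm{SNR\_diff}[n-1] > \Delta_4$
                If [...]
                    $\mathrm{floor}[n] = \mathrm{floor}_1[n]$
                End
            End
        End
    End
    $\hat{\sigma}_{noise}^2[n] = \max\big(\hat{\sigma}^2[n],\ \mathrm{floor}[n]\big)$    (17)

where the factors $\Delta_1$ to $\Delta_5$ are tunable, and SNR_estimate and Longterm_Avg_SNR are the a posteriori SNR and long-term SNR estimates obtained using the noise estimates $\hat{\sigma}_{noise}^2[n]$ and $\lambda_{d_1}[n]$, respectively. In this manner, the noise level may be bounded between 12 and 24 dB below the active desired signal level, as needed.

Frequency-Based Noise Estimation

Embodiments additionally include a more computationally involved speech detector based on frequency-domain sub-bands, which can be used in other situations. Here, each time frame is divided into a collection of the component frequencies represented in the Fourier transform of the time frame. These frequencies remain associated with their respective frame in a "time-frequency" bin. The described embodiment then estimates the probability of the presence of speech in each time-frequency bin $(k,n)$, i.e., the $k$th frequency bin and the $n$th time frame. Some applications require the probability of speech presence to be estimated both at the time-frequency atom level and at the time-frame level.

Operation of the speech detector in each time-frequency bin may be similar to the time-domain implementation described above, except that it is performed in each frequency bin. In particular, the noise level $\lambda_d$ in each time-frequency bin $(k,n)$ is estimated by interpolating, with a smoothing factor $\alpha_s[k,n]$, between the noise level $\lambda_d[k,n-1]$ in the past frame and the signal energy $\sum_{i=n-100}^{n} |Y(k,i)|^2$ over the past 100 frames in this frequency bin:

$$\lambda_d[k,n] = \alpha_s[k,n]\,\lambda_d[k,n-1] + (1-\alpha_s[k,n]) \sum_{i=n-100}^{n} |Y(k,i)|^2 \tag{18}$$

The smoothing factor $\alpha_s$ may itself depend on an interpolation between the probability of the presence of speech and 1 (i.e., how often speech may be assumed to be present).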
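$$\alpha_s[k,n] = \alpha_d + (1-\alpha_d)\,\mathrm{prob}[k,n] \tag{19}$$

In the above equations, $Y(k,i)$ is the contaminated signal in the $k$th frequency bin and the $i$th time frame. The preliminary noise level in each bin may be estimated as:

$$\sigma_1^2[k,n] = \bar{\lambda}_d[k,n-100:n] - \alpha\,\sigma\big(\lambda_d[k,n-100:n]\big) \tag{20}$$

$$\sigma_2^2[k,n] = \min\big(\sigma_1^2[k,n-100:n]\big) \tag{21}$$

Similar to the time-domain VAD, long-term averaging during speech presence $H_1$ and speech absence $H_0$ may be performed according to the following equations:

$$H_0[k,n]:\ \lambda_{d_1}[k,n] = \alpha_1\,\lambda_{d_1}[k,n-1] + (1-\alpha_1) \sum_{i=n-100}^{n} |Y(k,i)|^2 \tag{22}$$

$$H_1[k,n]:\ \lambda_{d_1}[k,n] = \lambda_{d_1}[k,n-1] \tag{23}$$

The secondary noise level in each time-frequency bin may then be estimated as:

$$\hat{\sigma}^2[k,n] = \max\big(\sigma_2^2[k,n],\ \lambda_{d_1}[k,n]\big) \tag{24}$$

To address the problem of underestimating the noise level in some high-SNR bins, the following bounding conditions and equations may be used, in which conditions that cannot be recovered from the source are shown as [...]:

    $\sigma_{desired}^2[k,n] = \alpha_2\,\sigma_{desired}^2[k,n-1] + (1-\alpha_2)\sum_{i=n-100}^{n} |Y[k,i]|^2$
    $\mathrm{SNR\_diff}[k,n] = \mathrm{SNR\_estimate}[k,n] - \mathrm{Longterm\_Avg\_SNR}[k,n]$    (25)
    If $\sum_{i=n-100}^{n} |Y[k,i]|^2 > \Delta_1$
        If [...]
            $\mathrm{floor}_1[k,n] = \sigma_{desired}^2[k,n] / \Delta_3$
            If $\mathrm{floor}[k,n-1] < \mathrm{floor}_1[k,n]$
                $\mathrm{floor}[k,n] = \mathrm{floor}_1[k,n]$
            Elseif [...]
                $\mathrm{floor}[k,n] = \mathrm{floor}_1[k,n]$
            End
        End
    End
    $\hat{\sigma}_{noise}^2[k,n] = \max\big(\hat{\sigma}^2[k,n],\ \mathrm{floor}[k,n]\big)$

where the factors $\Delta_1$ to $\Delta_5$ are tunable, and SNR_estimate and Longterm_Avg_SNR are the a posteriori SNR and long-term SNR estimates obtained using the noise estimates $\hat{\sigma}_{noise}^2[k,n]$ and $\lambda_{d_1}[k,n]$, respectively. $\hat{\sigma}_{noise}^2[k,n]$ denotes the final noise level in each time-frequency bin.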
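The chain of Equations (4)-(5), (7)-(8) and (14)-(16), which Equations (18)-(24) repeat per frequency bin, can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the implementation of the disclosure: the class name `NoiseFloorTracker`, the default `alpha_d` and `scale` values, the start-up values, the clamp at zero, and the use of a probability threshold to separate $H_0$ from $H_1$ are all hypothetical choices made for the example. The bounding step of Equations (17)/(25) is left out because its conditions are not fully legible in the source.

```python
from collections import deque
import statistics


class NoiseFloorTracker:
    """Illustrative noise-floor tracker for one channel (or one frequency bin)."""

    def __init__(self, alpha_d=0.9, scale=1.0, alpha_1=0.9999,
                 window=101, absence_threshold=0.5):
        self.alpha_d = alpha_d      # base smoothing parameter, between 0 and 1 (assumed value)
        self.scale = scale          # factor applied to the standard deviation (assumed value)
        self.alpha_1 = alpha_1      # long-term smoothing factor quoted in the text
        self.absence_threshold = absence_threshold  # hypothetical H0/H1 split
        self.lam = None             # lambda_d: recursively averaged energy
        self.lam_long = 0.0         # lambda_d1: long-term average (assumed to start at 0)
        self.lam_hist = deque(maxlen=window)      # lambda_d over frames n-100..n
        self.sigma1_hist = deque(maxlen=window)   # sigma_1^2 over frames n-100..n

    def update(self, energy, prob):
        """energy: sigma_y^2[n]; prob: current speech-presence probability in [0, 1]."""
        if self.lam is None:
            self.lam = energy       # assumed initialisation from the first frame
        # Eq. (5)/(19): smoothing factor interpolated between prob and 1.
        alpha_s = self.alpha_d + (1.0 - self.alpha_d) * prob
        # Eq. (4)/(18): recursive average of the noisy-signal energy.
        self.lam = alpha_s * self.lam + (1.0 - alpha_s) * energy
        self.lam_hist.append(self.lam)
        # Eq. (7)/(20): mean minus scaled standard deviation over the window.
        mean = statistics.fmean(self.lam_hist)
        std = statistics.pstdev(self.lam_hist)
        sigma1 = max(mean - self.scale * std, 0.0)  # clamp at 0 is an assumption
        self.sigma1_hist.append(sigma1)
        # Eq. (8)/(21): minimum of the recent sigma_1^2 values.
        sigma2 = min(self.sigma1_hist)
        # Eq. (14)-(15)/(22)-(23): update the long-term average only when speech seems absent.
        if prob < self.absence_threshold:
            self.lam_long = (self.alpha_1 * self.lam_long
                             + (1.0 - self.alpha_1) * energy)
        # Eq. (16)/(24): the long-term average guards against underestimation.
        return max(sigma2, self.lam_long)
```

In use, one tracker would be kept per channel or per bin and `update` called once per frame; when `prob` is 1 the smoothing factor becomes 1 and the recursive average is held constant, which mirrors the $H_1$ case of the text.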
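Next, the probability of the presence of speech in each time-frequency bin may be estimated using the equations of the time-domain mathematical model described above (Equations 2 to 17). In particular, the a posteriori SNR in each time-frequency atom is given by:

$$\chi[k,n] = 10\left\{\log_{10}\left[\sum_{i=n-100}^{n} |Y[k,i]|^2\right] - \log_{10}\big(\hat{\sigma}_{noise}^2[k,n]\big)\right\} \tag{26}$$

For stability, the above quantity may also be smoothed over time:

$$\hat{\chi}[k,n] = \beta_1\,\hat{\chi}[k,n-1] + (1-\beta_1)\,\chi[k,n], \qquad \beta_1 \in [0.75, 0.85] \tag{27}$$

and the probability of the presence of speech in each time-frequency atom is given by: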
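$$\mathrm{prob}[k,n] = \frac{1}{1+\exp(-\hat{\chi}[k,n])} \tag{28}$$

where $\mathrm{prob}[k,n]$ denotes the probability of the presence of speech in the $k$th frequency bin and the $n$th time frame.

Bi-Level Architecture

The mathematical model described above permits the output probabilities in each time-frequency bin to be combined flexibly and optimally to obtain an improved estimate of the probability that speech occurs in each time frame. One embodiment, for example, contemplates a bi-level architecture in which a first level of detectors operates at the time-frequency bin level and feeds its output to a second, time-frame-level speech detector.

The bi-level architecture combines the estimated probabilities in each time-frequency bin to obtain a better estimate of the probability of the presence of speech in each time frame. This approach may take advantage of the fact that speech is predominant in certain frequency bands (600 Hz to 1550 Hz). FIG. 2 illustrates a graph of a plurality of frequency weights 203 used in some embodiments. In some embodiments, these weights are used to determine a weighted mean of the bin-level probabilities, as shown below:

$$\mathrm{prob}[n] = \sum_{i=1}^{N} W_i\,\frac{1}{1+\exp(-\hat{\chi}[i,n])} \tag{29}$$

where the weight vector $W$ comprises the values shown in FIG. 2. Finally, similar to the time-domain approach, a binary decision on the presence or absence of speech in each frame can be made by comparing the estimated probability with a pre-selected threshold.

Examples

To evaluate the advantages of the embodiments described above, speech detection was performed using the time and frequency embodiments described above as well as two leading VAD systems. The ROC curves for each of these demonstrations under varying noise environments are shown in FIGS. 3 to 6. Each of the time and frequency versions of the above embodiments performed significantly better than the standard VADs. For each of the examples, the noise database used was based on the standard recommendation ETSI EG 202 396-1. This database provides standard recordings of car noise, street noise, babble noise, and so on for the purposes of voice quality and noise suppression evaluation. Additional real-world recordings were also used to evaluate VAD performance. These noise environments contain both stationary and non-stationary noise, providing a challenging corpus on which to test. An SNR of 5 dB was further chosen to make detection particularly difficult (typical office noise would be about 30 dB).

Example 1

To evaluate the proposed time-domain speech detector, receiver operating characteristic (ROC) curves were plotted under varying noise environments at an SNR of 5 dB. As illustrated in FIG. 3, the ROC curves plot the probability of detection 301 (detecting the presence of speech when it is present) against the probability of false alarm 302 (declaring the presence of speech when it is not present). It is desirable to have very low false alarms at a decent detection rate. A higher probability of detection for a given false alarm rate indicates better performance, so in general the higher curve is the better detector.

ROCs are shown for four different noises: pink noise, babble noise, traffic noise, and party noise. Pink noise is a stationary noise whose power spectral density is inversely proportional to frequency. It is commonly observed in natural physical systems and is often used for testing audio signal processing solutions. Babble noise and traffic noise are quasi-stationary in nature and are noise sources commonly encountered in mobile communication environments. Babble noise and traffic noise signals are available in the noise database provided by the ETSI EG 202 396-1 standard recommendation. Party noise is a highly non-stationary noise, and it is used as an extreme-case example for evaluating the performance of the VAD. Most single-microphone voice activity detectors produce high false alarms in the presence of party noise because of its highly non-stationary nature. The method proposed in this invention, however, produces low false alarms even with party noise.

FIG. 3 illustrates the ROC curve 303c of a first standard VAD, the ROC curve 303b of a second standard VAD, the ROC curve 303a of one of the time-based embodiments of the invention, and the ROC curve 303d of one of the frequency-based embodiments of the invention, plotted in a pink noise environment. As shown, the embodiments 303a, 303d of the invention significantly outperform each of the first VAD 303c and the second VAD 303b, consistently showing higher detection 301 as the false alarm constraint 302 is relaxed.

Example 2

FIG. 4 illustrates the ROC curve 403c of a first standard VAD, the ROC curve 403b of a second standard VAD, the ROC curve 403a of one of the time-based embodiments of the invention, and the ROC curve 403d of one of the frequency-based embodiments of the invention, plotted in a babble noise environment. As shown, the embodiments 403a, 403d of the invention significantly outperform each of the first VAD 403c and the second VAD 403b, consistently showing higher detection 401 as the false alarm constraint 402 is relaxed.

Example 3

FIG. 5 illustrates the ROC curve 503c of a first standard VAD, the ROC curve 503b of a second standard VAD, the ROC curve 503a of one of the time-based embodiments of the invention, and the ROC curve 503d of one of the frequency-based embodiments of the invention, plotted in a traffic noise environment. As shown, the embodiments 503a, 503d of the invention significantly outperform each of the first VAD 503c and the second VAD 503b, consistently showing higher detection 501 as the false alarm constraint 502 is relaxed.

Example 4

FIG. 6 illustrates the ROC curve 603c of a first standard VAD, the ROC curve 603b of a second standard VAD, the ROC curve 603a of one of the time-based embodiments of the invention, and the ROC curve 603d of one of the frequency-based embodiments of the invention, plotted in the ROC-ICASSP auditorium noise environment. As shown, the embodiments 603a, 603d of the invention significantly outperform each of the first VAD 603c and the second VAD 603b, consistently showing higher detection 601 as the false alarm constraint 602 is relaxed.

The techniques described in this disclosure may be implemented in hardware, software, firmware, or any combination thereof. Any features described as units or components may be implemented together in an integrated logic device or separately as discrete but interoperable logic devices. If implemented in software, the techniques may be realized at least in part by a computer-readable medium comprising instructions that, when executed, perform one or more of the methods described above. The computer-readable medium may form part of a computer program product, which may include packaging materials. The computer-readable medium may comprise random access memory (RAM) such as synchronous dynamic random access memory (SDRAM), read-only memory (ROM), non-volatile random access memory (NVRAM), electrically erasable programmable read-only memory (EEPROM), flash memory, magnetic or optical data storage media, and the like. Additionally or alternatively, the techniques may be realized at least in part by a computer-readable communication medium that carries or communicates code in the form of instructions or data structures and that can be accessed, read, and/or executed by a computer.

The code may be executed by one or more processors, such as one or more digital signal processors (DSPs), general-purpose microprocessors, application-specific integrated circuits (ASICs), field-programmable logic arrays (FPGAs), or other equivalent integrated or discrete logic circuitry. Accordingly, the term "processor," as used herein, may refer to any of the foregoing structures or to any other structure suitable for implementing the techniques described herein. In addition, in some aspects, the functionality described herein may be provided within dedicated software units or hardware units configured for encoding and decoding, or incorporated in a combined encoder-decoder (CODEC). The depiction of different features as units or modules is intended to highlight different functional aspects of the devices illustrated and does not necessarily imply that such units must be realized by separate hardware or software components. Rather, functionality associated with one or more units or modules may be integrated within common or separate hardware or software components. The embodiments may be implemented using a computer processor and/or electrical circuitry.

Various embodiments of the invention have been described. These and other embodiments are within the scope of the following claims.

[Brief Description of the Drawings]

FIG. 1 is a simplified block diagram of a VAD according to the principles of the present invention;
FIG. 2 is a graph illustrating the frequency-selective weighting vector of the frequency-domain VAD;
FIG. 3 is a graph illustrating the performance of the proposed time-domain VAD in a pink noise environment;
FIG. 4 is a graph illustrating the performance of the proposed time-domain VAD in a babble noise environment;
FIG. 5 is a graph illustrating the performance of the proposed time-domain VAD in a traffic noise environment; and
FIG. 6 is a graph illustrating the performance of the proposed time-domain VAD in a party noise environment.

[Description of Main Reference Numerals]

100 Voice activity detection (VAD) system
101 Incoming signal / time signal
103a Frame
103b Frame
103c Frame
103d Frame
104 Classification module
203 Frequency weights
303a ROC curve of one of the time-based embodiments of the invention
303b ROC curve of the second standard VAD
303c ROC curve of the first standard VAD
303d ROC curve of one of the frequency-based embodiments of the invention
403a ROC curve of one of the time-based embodiments of the invention
403b ROC curve of the second standard VAD
403c ROC curve of the first standard VAD
403d ROC curve of one of the frequency-based embodiments of the invention
503a ROC curve of one of the time-based embodiments of the invention
503b ROC curve of the second standard VAD
503c ROC curve of the first standard VAD
503d ROC curve of one of the frequency-based embodiments of the invention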
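The speech-detection stage of the description, Equations (9)-(11) for a single channel and Equations (26)-(29) for the bi-level architecture, can be sketched in Python as follows. This is a hedged illustration rather than the implementation of the disclosure: the function names, the small `eps` guard against taking the logarithm of zero, and the default `beta_1 = 0.8` (a value picked from the stated range [0.75, 0.85]) are assumptions made for the example, and the weights passed to `frame_probability` are placeholders, since the actual values of FIG. 2 are not reproduced in the text.

```python
import math


def log_snr(frame_energy, noise_level, eps=1e-12):
    """Eq. (9)/(26): logarithm of the a posteriori SNR (eps is an added guard)."""
    return 10.0 * (math.log10(frame_energy + eps) - math.log10(noise_level + eps))


def smooth(previous, current, beta_1=0.8):
    """Eq. (10)/(27): time smoothing with beta_1 chosen from [0.75, 0.85]."""
    return beta_1 * previous + (1.0 - beta_1) * current


def speech_probability(chi_hat):
    """Eq. (11)/(28): logistic function, split by sign to avoid overflow."""
    if chi_hat >= 0.0:
        return 1.0 / (1.0 + math.exp(-chi_hat))
    e = math.exp(chi_hat)
    return e / (1.0 + e)


def frame_probability(chi_hat_bins, weights):
    """Eq. (29): weighted combination of the per-bin probabilities."""
    if len(chi_hat_bins) != len(weights):
        raise ValueError("one weight is needed per frequency bin")
    return sum(w * speech_probability(c) for w, c in zip(weights, chi_hat_bins))


def is_speech(prob, threshold=0.7):
    """Binary decision; 0.7 is the example threshold quoted in the description."""
    return prob > threshold
```

A frame whose energy sits well above the noise estimate yields a large positive feature and a probability close to 1, while a frame at the noise level yields a feature near 0 and a probability of 0.5, which stays below the example threshold.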
Claims (1)
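VII. Claims:

1. A method for estimating the noise level in a current frame of an audio signal, comprising:
determining the noise levels of a plurality of audio frames;
calculating the mean and the standard deviation of the noise levels over the plurality of audio frames; and
calculating a noise level estimate of the current frame as the value of the standard deviation subtracted from the mean.

2. The method of claim 1, further comprising scaling the standard deviation prior to subtraction from the mean.

3. The method of claim 1, further comprising determining the noise level estimate by determining the minimum of a plurality of noise level estimates.

4. The method of claim 1, wherein the plurality of audio frames comprises about 100 frames.

5. The method of claim 1, wherein calculating the noise level estimate comprises using a smoothing factor.

6. The method of claim 5, wherein the noise level estimate is held constant during periods of voice activity.

7. The method of claim 5, wherein the smoothing factor is recursively averaged by interpolating, using a second smoothing factor, between a probability of speech in the current frame and 1.

8. The method of claim 1, wherein the noise level estimate comprises a plurality of previously determined noise levels.

9. The method of claim 1, wherein the mean of the noise levels is estimated by interpolating a previously calculated mean of the noise levels with a current noise level.

10. The method of claim 1, further comprising bounding the calculated noise level estimate to between 12 and 24 dB below a desired signal level.

11. The method of claim 1, further comprising detecting voice activity by identifying the current frame as having a non-noise segment.

12. The method of claim 11, wherein voice activity is declared when the probability of speech > T, for all T ∈ [0.2, 1].

13. A noise determination system, comprising:
a first module configured to determine the noise levels of a plurality of audio frames;
a second module configured to calculate the mean and the standard deviation of the noise levels over the plurality of audio frames; and
a third module configured to calculate a noise level estimate of a current frame as the value of the standard deviation subtracted from the mean.

14. The noise determination system of claim 13, wherein the third module is configured to scale the standard deviation prior to subtraction from the mean.

15. The noise determination system of claim 13, wherein calculating the noise level estimate comprises using a smoothing factor.

16. The noise determination system of claim 15, wherein the noise level estimate is held constant during periods of voice activity.

17. The noise determination system of claim 15, wherein the smoothing factor is recursively averaged by interpolating, using a second smoothing factor, between a probability of speech in the current frame and a value of 1.

18. A method for estimating a noise level of a signal in a plurality of time-frequency bins of the signal, comprising, for each of the bins of the signal:
determining the noise levels of a plurality of audio frames;
estimating the noise level in the time-frequency bin;
determining a preliminary noise level in the time-frequency bin;
determining a secondary noise level in the time-frequency bin from the preliminary noise level; and
determining a bounded noise level from the secondary noise level in the time-frequency bin.

19. The method of claim 18, wherein determining the bounded noise level comprises bounding the estimated noise level to between 12 and 24 dB below an active desired signal level.

20. The method of claim 18, further comprising calculating the probability of speech in the current frame by taking a weighted sum of the probabilities in each frequency bin of the current frame.

21. The method of claim 20, wherein weights within the range of 600 Hz to 1550 Hz are given a value of at least 0.02.

22. A system for estimating the noise level in a current frame of an audio signal, comprising:
means for determining the noise levels of a plurality of audio frames;
means for calculating the mean and the standard deviation of the noise levels over the plurality of audio frames; and
means for calculating a noise level estimate of the current frame as the value of the standard deviation subtracted from the mean.

23. The noise determination system of claim 22, wherein the means for calculating a noise level estimate of the current frame scales the standard deviation prior to subtraction from the mean.

24. The system of claim 22, wherein the means for determining the noise levels comprises a module configured to determine an energy level of a signal.

25. The system of claim 22, wherein the means for calculating the mean and the standard deviation of the noise levels comprises a module configured to perform mathematical operations.

26. The system of claim 22, wherein the means for calculating a noise level estimate comprises a module configured to perform mathematical operations.

27. A computer-readable medium comprising instructions that, when executed on a processor, perform a method, the method comprising:
determining the noise levels of a plurality of audio frames;
calculating the mean and the standard deviation of the noise levels over the plurality of audio frames; and
calculating a noise level estimate of a current frame as the value of the standard deviation subtracted from the mean.

28. The method of claim 27, further comprising scaling the standard deviation prior to subtraction from the mean.

29. A processor programmed to perform a method, the method comprising:
determining the noise levels of a plurality of audio frames;
calculating the mean and the standard deviation of the noise levels over the plurality of audio frames; and
calculating a noise level estimate of a current frame as the value of the standard deviation subtracted from the mean.

30. The method of claim 29, further comprising scaling the standard deviation prior to subtraction from the mean.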
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US10572708P | 2008-10-15 | 2008-10-15 | |
US12/579,322 US8380497B2 (en) | 2008-10-15 | 2009-10-14 | Methods and apparatus for noise estimation |
Publications (1)
Publication Number | Publication Date |
---|---|
TW201028996A | 2010-08-01 |
Family
ID=42099699
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
TW098134985A TW201028996A (en) | 2008-10-15 | 2009-10-15 | Methods and apparatus for noise estimation |
Country Status (7)
Country | Link |
---|---|
US (1) | US8380497B2 (zh) |
EP (1) | EP2351020A1 (zh) |
JP (1) | JP5596039B2 (zh) |
KR (3) | KR20110081295A (zh) |
CN (1) | CN102187388A (zh) |
TW (1) | TW201028996A (zh) |
WO (1) | WO2010045450A1 (zh) |
US11488406B2 (en) | 2019-09-25 | 2022-11-01 | Apple Inc. | Text detection using global geometry estimators |
US11495218B2 (en) | 2018-06-01 | 2022-11-08 | Apple Inc. | Virtual assistant operation in multi-device environments |
US11496600B2 (en) | 2019-05-31 | 2022-11-08 | Apple Inc. | Remote execution of machine-learned models |
US11532306B2 (en) | 2017-05-16 | 2022-12-20 | Apple Inc. | Detecting a trigger of a digital assistant |
US11620999B2 (en) | 2020-09-18 | 2023-04-04 | Apple Inc. | Reducing device processing of unintended audio |
US11638059B2 (en) | 2019-01-04 | 2023-04-25 | Apple Inc. | Content playback on multiple devices |
US11657813B2 (en) | 2019-05-31 | 2023-05-23 | Apple Inc. | Voice identification in digital assistant systems |
US11798547B2 (en) | 2013-03-15 | 2023-10-24 | Apple Inc. | Voice activated device for use with a voice-based digital assistant |
US12010262B2 (en) | 2013-08-06 | 2024-06-11 | Apple Inc. | Auto-activating smart responses based on activities from remote devices |
Families Citing this family (32)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
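KR101335417B1 (ko) * | 2008-03-31 | 2013-12-05 | (주)트란소노 | Method for processing a noisy speech signal, apparatus therefor, and computer-readable recording medium |
WO2010146711A1 (ja) * | 2009-06-19 | 2010-12-23 | 富士通株式会社 | Audio signal processing device and audio signal processing method |
KR101581885B1 (ko) * | 2009-08-26 | 2016-01-04 | 삼성전자주식회사 | Apparatus and method for complex spectrum noise removal |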
US9172345B2 (en) | 2010-07-27 | 2015-10-27 | Bitwave Pte Ltd | Personalized adjustment of an audio device |
US20120166117A1 (en) * | 2010-10-29 | 2012-06-28 | Xia Llc | Method and apparatus for evaluating superconducting tunnel junction detector noise versus bias voltage |
US10218327B2 (en) | 2011-01-10 | 2019-02-26 | Zhinian Jing | Dynamic enhancement of audio (DAE) in headset systems |
FR2976710B1 (fr) * | 2011-06-20 | 2013-07-05 | Parrot | Denoising method for multi-microphone audio equipment, in particular for a "hands-free" telephony system |
CN102592592A (zh) * | 2011-12-30 | 2012-07-18 | 深圳市车音网科技有限公司 | Method and device for extracting voice data |
EP2828853B1 (en) | 2012-03-23 | 2018-09-12 | Dolby Laboratories Licensing Corporation | Method and system for bias corrected speech level determination |
HUP1200197A2 (hu) | 2012-04-03 | 2013-10-28 | Budapesti Mueszaki Es Gazdasagtudomanyi Egyetem | Method and arrangement for real-time, source-selective monitoring and mapping of environmental noise |
US8842810B2 (en) * | 2012-05-25 | 2014-09-23 | Tim Lieu | Emergency communications management |
CN102820035A (zh) * | 2012-08-23 | 2012-12-12 | 无锡思达物电子技术有限公司 | Adaptive decision method for long-term time-varying noise |
WO2014043024A1 (en) * | 2012-09-17 | 2014-03-20 | Dolby Laboratories Licensing Corporation | Long term monitoring of transmission and voice activity patterns for regulating gain control |
JP6066471B2 (ja) * | 2012-10-12 | 2017-01-25 | 本田技研工業株式会社 | Dialogue system and method for discriminating utterances directed to a dialogue system |
US9449609B2 (en) * | 2013-11-07 | 2016-09-20 | Continental Automotive Systems, Inc. | Accurate forward SNR estimation based on MMSE speech probability presence |
US9449615B2 (en) * | 2013-11-07 | 2016-09-20 | Continental Automotive Systems, Inc. | Externally estimated SNR based modifiers for internal MMSE calculators |
US9449610B2 (en) * | 2013-11-07 | 2016-09-20 | Continental Automotive Systems, Inc. | Speech probability presence modifier improving log-MMSE based noise suppression performance |
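TWI573096B (zh) * | 2013-12-31 | 2017-03-01 | 智原科技股份有限公司 | Method and apparatus for image noise estimation |
KR20150105847A (ko) * | 2014-03-10 | 2015-09-18 | 삼성전기주식회사 | Method and apparatus for detecting a speech segment |
CN105336341A (zh) * | 2014-05-26 | 2016-02-17 | 杜比实验室特许公司 | Enhancing the intelligibility of speech content in an audio signal |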
US10141003B2 (en) * | 2014-06-09 | 2018-11-27 | Dolby Laboratories Licensing Corporation | Noise level estimation |
CN105336344B (zh) * | 2014-07-10 | 2019-08-20 | 华为技术有限公司 | Noise detection method and apparatus |
US9886966B2 (en) * | 2014-11-07 | 2018-02-06 | Apple Inc. | System and method for improving noise suppression using logistic function and a suppression target value for automatic speech recognition |
US9330684B1 (en) * | 2015-03-27 | 2016-05-03 | Continental Automotive Systems, Inc. | Real-time wind buffet noise detection |
JP6404780B2 (ja) * | 2015-07-14 | 2018-10-17 | 日本電信電話株式会社 | Wiener filter design device, sound enhancement device, acoustic feature selection device, and methods and programs therefor |
US10224053B2 (en) | 2017-03-24 | 2019-03-05 | Hyundai Motor Company | Audio signal quality enhancement based on quantitative SNR analysis and adaptive Wiener filtering |
US10360895B2 (en) * | 2017-12-21 | 2019-07-23 | Bose Corporation | Dynamic sound adjustment based on noise floor estimate |
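CN111063368B (zh) * | 2018-10-16 | 2022-09-27 | 中国移动通信有限公司研究院 | Method, apparatus, medium, and device for estimating noise in an audio signal |
KR102237286B1 (ko) * | 2019-03-12 | 2021-04-07 | 울산과학기술원 | Speech segment detection apparatus and method thereof |
JP7004875B2 (ja) * | 2019-12-20 | 2022-01-21 | 三菱電機株式会社 | Information processing device, calculation method, and calculation program |
CN111354378B (zh) * | 2020-02-12 | 2020-11-24 | 北京声智科技有限公司 | Voice endpoint detection method, apparatus, device, and computer storage medium |
CN113270107B (zh) * | 2021-04-13 | 2024-02-06 | 维沃移动通信有限公司 | Method and apparatus for obtaining noise loudness in an audio signal, and electronic device |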
Family Cites Families (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
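JPH0315897A (ja) * | 1989-06-14 | 1991-01-24 | Fujitsu Ltd | Discrimination threshold setting control system |
JP2966452B2 (ja) | 1989-12-11 | 1999-10-25 | 三洋電機株式会社 | Noise removal system for a speech recognition device |
CN1145928C (zh) | 1999-06-07 | 2004-04-14 | 艾利森公司 | Method and apparatus for generating comfort noise using parametric noise model statistics |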
US7117149B1 (en) * | 1999-08-30 | 2006-10-03 | Harman Becker Automotive Systems-Wavemakers, Inc. | Sound source classification |
FR2833103B1 (fr) * | 2001-12-05 | 2004-07-09 | France Telecom | System for detecting speech in noise |
JP2003316381A (ja) | 2002-04-23 | 2003-11-07 | Toshiba Corp | Noise suppression method and noise suppression program |
US7388954B2 (en) | 2002-06-24 | 2008-06-17 | Freescale Semiconductor, Inc. | Method and apparatus for tone indication |
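KR100677396B1 (ko) * | 2004-11-20 | 2007-02-02 | 엘지전자 주식회사 | Method for detecting speech segments in a speech recognition apparatus |
JP4765461B2 (ja) * | 2005-07-27 | 2011-09-07 | 日本電気株式会社 | Noise suppression system, method, and program |
CN100580770C (zh) * | 2005-08-08 | 2010-01-13 | 中国科学院声学研究所 | Speech endpoint detection method based on energy and harmonics |
CN101197130B (zh) * | 2006-12-07 | 2011-05-18 | 华为技术有限公司 | Voice activity detection method and voice activity detector |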
Family Applications
Date | Country | Application number | Publication | State | Status |
---|---|---|---|---|---|
2009-10-14 | US | US12/579,322 | US8380497B2 | active | Active |
2009-10-15 | WO | PCT/US2009/060828 | WO2010045450A1 | active | Application Filing |
2009-10-15 | KR | KR1020117011012A | KR20110081295A | active | IP Right Grant |
2009-10-15 | TW | TW098134985A | TW201028996A | unknown | |
2009-10-15 | JP | JP2011532248A | JP5596039B2 | not active | Expired - Fee Related |
2009-10-15 | EP | EP09737318A | EP2351020A1 | not active | Withdrawn |
2009-10-15 | KR | KR1020137007743A | KR20130042649A | not active | Application Discontinuation |
2009-10-15 | CN | CN2009801412129A | CN102187388A | active | Pending |
2009-10-15 | KR | KR1020137002342A | KR101246954B1 | not active | IP Right Cessation |
Cited By (174)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US10318871B2 (en) | 2005-09-08 | 2019-06-11 | Apple Inc. | Method and apparatus for building an intelligent automated assistant |
US11928604B2 (en) | 2005-09-08 | 2024-03-12 | Apple Inc. | Method and apparatus for building an intelligent automated assistant |
US11023513B2 (en) | 2007-12-20 | 2021-06-01 | Apple Inc. | Method and apparatus for searching using an active ontology |
US10381016B2 (en) | 2008-01-03 | 2019-08-13 | Apple Inc. | Methods and apparatus for altering audio output signals |
US9865248B2 (en) | 2008-04-05 | 2018-01-09 | Apple Inc. | Intelligent text-to-speech conversion |
US10108612B2 (en) | 2008-07-31 | 2018-10-23 | Apple Inc. | Mobile device having human language translation capability with positional feedback |
US10643611B2 (en) | 2008-10-02 | 2020-05-05 | Apple Inc. | Electronic devices with voice command and contextual data processing capabilities |
US11348582B2 (en) | 2008-10-02 | 2022-05-31 | Apple Inc. | Electronic devices with voice command and contextual data processing capabilities |
US11080012B2 (en) | 2009-06-05 | 2021-08-03 | Apple Inc. | Interface for a virtual digital assistant |
US10795541B2 (en) | 2009-06-05 | 2020-10-06 | Apple Inc. | Intelligent organization of tasks items |
US10741185B2 (en) | 2010-01-18 | 2020-08-11 | Apple Inc. | Intelligent automated assistant |
US10706841B2 (en) | 2010-01-18 | 2020-07-07 | Apple Inc. | Task flow identification based on user intent |
US11423886B2 (en) | 2010-01-18 | 2022-08-23 | Apple Inc. | Task flow identification based on user intent |
US10692504B2 (en) | 2010-02-25 | 2020-06-23 | Apple Inc. | User profiling for voice input processing |
US10049675B2 (en) | 2010-02-25 | 2018-08-14 | Apple Inc. | User profiling for voice input processing |
US10417405B2 (en) | 2011-03-21 | 2019-09-17 | Apple Inc. | Device access using voice authentication |
US11120372B2 (en) | 2011-06-03 | 2021-09-14 | Apple Inc. | Performing actions associated with task items that represent tasks to perform |
US11350253B2 (en) | 2011-06-03 | 2022-05-31 | Apple Inc. | Active transport based notifications |
US11069336B2 (en) | 2012-03-02 | 2021-07-20 | Apple Inc. | Systems and methods for name pronunciation |
US11269678B2 (en) | 2012-05-15 | 2022-03-08 | Apple Inc. | Systems and methods for integrating third party services with a digital assistant |
US10079014B2 (en) | 2012-06-08 | 2018-09-18 | Apple Inc. | Name recognition system |
US9971774B2 (en) | 2012-09-19 | 2018-05-15 | Apple Inc. | Voice-based media searching |
US10978090B2 (en) | 2013-02-07 | 2021-04-13 | Apple Inc. | Voice trigger for a digital assistant |
US10714117B2 (en) | 2013-02-07 | 2020-07-14 | Apple Inc. | Voice trigger for a digital assistant |
US11388291B2 (en) | 2013-03-14 | 2022-07-12 | Apple Inc. | System and method for processing voicemail |
US11798547B2 (en) | 2013-03-15 | 2023-10-24 | Apple Inc. | Voice activated device for use with a voice-based digital assistant |
US9966060B2 (en) | 2013-06-07 | 2018-05-08 | Apple Inc. | System and method for user-specified pronunciation of words for speech synthesis and recognition |
US10657961B2 (en) | 2013-06-08 | 2020-05-19 | Apple Inc. | Interpreting and acting upon commands that involve sharing information with remote devices |
US10769385B2 (en) | 2013-06-09 | 2020-09-08 | Apple Inc. | System and method for inferring user intent from speech inputs |
US11727219B2 (en) | 2013-06-09 | 2023-08-15 | Apple Inc. | System and method for inferring user intent from speech inputs |
US11048473B2 (en) | 2013-06-09 | 2021-06-29 | Apple Inc. | Device, method, and graphical user interface for enabling conversation persistence across two or more instances of a digital assistant |
US12010262B2 (en) | 2013-08-06 | 2024-06-11 | Apple Inc. | Auto-activating smart responses based on activities from remote devices |
US11314370B2 (en) | 2013-12-06 | 2022-04-26 | Apple Inc. | Method for extracting salient dialog usage from live data |
US11133008B2 (en) | 2014-05-30 | 2021-09-28 | Apple Inc. | Reducing the need for manual start/end-pointing and trigger phrases |
US10699717B2 (en) | 2014-05-30 | 2020-06-30 | Apple Inc. | Intelligent assistant for home automation |
US10657966B2 (en) | 2014-05-30 | 2020-05-19 | Apple Inc. | Better resolution when referencing to concepts |
US10083690B2 (en) | 2014-05-30 | 2018-09-25 | Apple Inc. | Better resolution when referencing to concepts |
US10417344B2 (en) | 2014-05-30 | 2019-09-17 | Apple Inc. | Exemplar-based natural language processing |
US10878809B2 (en) | 2014-05-30 | 2020-12-29 | Apple Inc. | Multi-command single utterance input method |
US10714095B2 (en) | 2014-05-30 | 2020-07-14 | Apple Inc. | Intelligent assistant for home automation |
US10497365B2 (en) | 2014-05-30 | 2019-12-03 | Apple Inc. | Multi-command single utterance input method |
US11257504B2 (en) | 2014-05-30 | 2022-02-22 | Apple Inc. | Intelligent assistant for home automation |
US9668024B2 (en) | 2014-06-30 | 2017-05-30 | Apple Inc. | Intelligent automated assistant for TV user interactions |
US10904611B2 (en) | 2014-06-30 | 2021-01-26 | Apple Inc. | Intelligent automated assistant for TV user interactions |
US10431204B2 (en) | 2014-09-11 | 2019-10-01 | Apple Inc. | Method and apparatus for discovering trending terms in speech requests |
TWI582753B (zh) * | 2014-09-30 | 2017-05-11 | 蘋果公司 | Method, system, and computer-readable storage medium for operating a virtual assistant |
US10390213B2 (en) | 2014-09-30 | 2019-08-20 | Apple Inc. | Social reminders |
US9986419B2 (en) | 2014-09-30 | 2018-05-29 | Apple Inc. | Social reminders |
US10074360B2 (en) | 2014-09-30 | 2018-09-11 | Apple Inc. | Providing an indication of the suitability of speech recognition |
US10438595B2 (en) | 2014-09-30 | 2019-10-08 | Apple Inc. | Speaker identification and unsupervised speaker adaptation techniques |
US10453443B2 (en) | 2014-09-30 | 2019-10-22 | Apple Inc. | Providing an indication of the suitability of speech recognition |
US11231904B2 (en) | 2015-03-06 | 2022-01-25 | Apple Inc. | Reducing response latency of intelligent automated assistants |
US10930282B2 (en) | 2015-03-08 | 2021-02-23 | Apple Inc. | Competing devices responding to voice triggers |
US10311871B2 (en) | 2015-03-08 | 2019-06-04 | Apple Inc. | Competing devices responding to voice triggers |
US10567477B2 (en) | 2015-03-08 | 2020-02-18 | Apple Inc. | Virtual assistant continuity |
US11087759B2 (en) | 2015-03-08 | 2021-08-10 | Apple Inc. | Virtual assistant activation |
US10529332B2 (en) | 2015-03-08 | 2020-01-07 | Apple Inc. | Virtual assistant activation |
US11468282B2 (en) | 2015-05-15 | 2022-10-11 | Apple Inc. | Virtual assistant in a communication session |
US11127397B2 (en) | 2015-05-27 | 2021-09-21 | Apple Inc. | Device voice control |
US10681212B2 (en) | 2015-06-05 | 2020-06-09 | Apple Inc. | Virtual assistant aided communication with 3rd party service in a communication session |
US10356243B2 (en) | 2015-06-05 | 2019-07-16 | Apple Inc. | Virtual assistant aided communication with 3rd party service in a communication session |
US11025565B2 (en) | 2015-06-07 | 2021-06-01 | Apple Inc. | Personalized prediction of responses for instant messaging |
US11010127B2 (en) | 2015-06-29 | 2021-05-18 | Apple Inc. | Virtual assistant for media playback |
US11126400B2 (en) | 2015-09-08 | 2021-09-21 | Apple Inc. | Zero latency digital assistant |
US10671428B2 (en) | 2015-09-08 | 2020-06-02 | Apple Inc. | Distributed personal assistant |
US11500672B2 (en) | 2015-09-08 | 2022-11-15 | Apple Inc. | Distributed personal assistant |
US10747498B2 (en) | 2015-09-08 | 2020-08-18 | Apple Inc. | Zero latency digital assistant |
US11010550B2 (en) | 2015-09-29 | 2021-05-18 | Apple Inc. | Unified language modeling framework for word prediction, auto-completion and auto-correction |
US10366158B2 (en) | 2015-09-29 | 2019-07-30 | Apple Inc. | Efficient word encoding for recurrent neural network language models |
US11526368B2 (en) | 2015-11-06 | 2022-12-13 | Apple Inc. | Intelligent automated assistant in a messaging environment |
US10691473B2 (en) | 2015-11-06 | 2020-06-23 | Apple Inc. | Intelligent automated assistant in a messaging environment |
US10354652B2 (en) | 2015-12-02 | 2019-07-16 | Apple Inc. | Applying neural network language models to weighted finite state transducers for automatic speech recognition |
US10049668B2 (en) | 2015-12-02 | 2018-08-14 | Apple Inc. | Applying neural network language models to weighted finite state transducers for automatic speech recognition |
US10942703B2 (en) | 2015-12-23 | 2021-03-09 | Apple Inc. | Proactive assistance based on dialog communication between devices |
US10223066B2 (en) | 2015-12-23 | 2019-03-05 | Apple Inc. | Proactive assistance based on dialog communication between devices |
US10446143B2 (en) | 2016-03-14 | 2019-10-15 | Apple Inc. | Identification of voice inputs providing credentials |
US9934775B2 (en) | 2016-05-26 | 2018-04-03 | Apple Inc. | Unit-selection text-to-speech synthesis based on predicted concatenation parameters |
US9972304B2 (en) | 2016-06-03 | 2018-05-15 | Apple Inc. | Privacy preserving distributed evaluation framework for embedded personalized systems |
US10249300B2 (en) | 2016-06-06 | 2019-04-02 | Apple Inc. | Intelligent list reading |
US11227589B2 (en) | 2016-06-06 | 2022-01-18 | Apple Inc. | Intelligent list reading |
US10049663B2 (en) | 2016-06-08 | 2018-08-14 | Apple Inc. | Intelligent automated assistant for media exploration |
US11069347B2 (en) | 2016-06-08 | 2021-07-20 | Apple Inc. | Intelligent automated assistant for media exploration |
US10354011B2 (en) | 2016-06-09 | 2019-07-16 | Apple Inc. | Intelligent automated assistant in a home environment |
US10192552B2 (en) | 2016-06-10 | 2019-01-29 | Apple Inc. | Digital assistant providing whispered speech |
US10490187B2 (en) | 2016-06-10 | 2019-11-26 | Apple Inc. | Digital assistant providing automated status report |
US11037565B2 (en) | 2016-06-10 | 2021-06-15 | Apple Inc. | Intelligent digital assistant in a multi-tasking environment |
US10067938B2 (en) | 2016-06-10 | 2018-09-04 | Apple Inc. | Multilingual word prediction |
US10733993B2 (en) | 2016-06-10 | 2020-08-04 | Apple Inc. | Intelligent digital assistant in a multi-tasking environment |
US10509862B2 (en) | 2016-06-10 | 2019-12-17 | Apple Inc. | Dynamic phrase expansion of language input |
US10269345B2 (en) | 2016-06-11 | 2019-04-23 | Apple Inc. | Intelligent task discovery |
US10089072B2 (en) | 2016-06-11 | 2018-10-02 | Apple Inc. | Intelligent device arbitration and control |
US10580409B2 (en) | 2016-06-11 | 2020-03-03 | Apple Inc. | Application integration with a digital assistant |
US10942702B2 (en) | 2016-06-11 | 2021-03-09 | Apple Inc. | Intelligent device arbitration and control |
US10521466B2 (en) | 2016-06-11 | 2019-12-31 | Apple Inc. | Data driven natural language event detection and classification |
US11152002B2 (en) | 2016-06-11 | 2021-10-19 | Apple Inc. | Application integration with a digital assistant |
US10297253B2 (en) | 2016-06-11 | 2019-05-21 | Apple Inc. | Application integration with a digital assistant |
US10474753B2 (en) | 2016-09-07 | 2019-11-12 | Apple Inc. | Language identification using recurrent neural networks |
US10553215B2 (en) | 2016-09-23 | 2020-02-04 | Apple Inc. | Intelligent automated assistant |
US10043516B2 (en) | 2016-09-23 | 2018-08-07 | Apple Inc. | Intelligent automated assistant |
US11281993B2 (en) | 2016-12-05 | 2022-03-22 | Apple Inc. | Model and ensemble compression for metric learning |
US10593346B2 (en) | 2016-12-22 | 2020-03-17 | Apple Inc. | Rank-reduced token representation for automatic speech recognition |
US11656884B2 (en) | 2017-01-09 | 2023-05-23 | Apple Inc. | Application integration with a digital assistant |
US11204787B2 (en) | 2017-01-09 | 2021-12-21 | Apple Inc. | Application integration with a digital assistant |
US10332518B2 (en) | 2017-05-09 | 2019-06-25 | Apple Inc. | User interface for correcting recognition errors |
US10417266B2 (en) | 2017-05-09 | 2019-09-17 | Apple Inc. | Context-aware ranking of intelligent response suggestions |
US10741181B2 (en) | 2017-05-09 | 2020-08-11 | Apple Inc. | User interface for correcting recognition errors |
US10847142B2 (en) | 2017-05-11 | 2020-11-24 | Apple Inc. | Maintaining privacy of personal information |
US11599331B2 (en) | 2017-05-11 | 2023-03-07 | Apple Inc. | Maintaining privacy of personal information |
US10726832B2 (en) | 2017-05-11 | 2020-07-28 | Apple Inc. | Maintaining privacy of personal information |
US10755703B2 (en) | 2017-05-11 | 2020-08-25 | Apple Inc. | Offline personal assistant |
US10395654B2 (en) | 2017-05-11 | 2019-08-27 | Apple Inc. | Text normalization based on a data-driven learning network |
US10789945B2 (en) | 2017-05-12 | 2020-09-29 | Apple Inc. | Low-latency intelligent automated assistant |
US10410637B2 (en) | 2017-05-12 | 2019-09-10 | Apple Inc. | User-specific acoustic models |
US11380310B2 (en) | 2017-05-12 | 2022-07-05 | Apple Inc. | Low-latency intelligent automated assistant |
US11405466B2 (en) | 2017-05-12 | 2022-08-02 | Apple Inc. | Synchronization and task delegation of a digital assistant |
US11301477B2 (en) | 2017-05-12 | 2022-04-12 | Apple Inc. | Feedback analysis of a digital assistant |
US10791176B2 (en) | 2017-05-12 | 2020-09-29 | Apple Inc. | Synchronization and task delegation of a digital assistant |
US10482874B2 (en) | 2017-05-15 | 2019-11-19 | Apple Inc. | Hierarchical belief states for digital assistants |
US10810274B2 (en) | 2017-05-15 | 2020-10-20 | Apple Inc. | Optimizing dialogue policy decisions for digital assistants using implicit feedback |
US10909171B2 (en) | 2017-05-16 | 2021-02-02 | Apple Inc. | Intelligent automated assistant for media exploration |
US11217255B2 (en) | 2017-05-16 | 2022-01-04 | Apple Inc. | Far-field extension for digital assistant services |
US10303715B2 (en) | 2017-05-16 | 2019-05-28 | Apple Inc. | Intelligent automated assistant for media exploration |
US10311144B2 (en) | 2017-05-16 | 2019-06-04 | Apple Inc. | Emoji word sense disambiguation |
US10403278B2 (en) | 2017-05-16 | 2019-09-03 | Apple Inc. | Methods and systems for phonetic matching in digital assistant services |
US10748546B2 (en) | 2017-05-16 | 2020-08-18 | Apple Inc. | Digital assistant services based on device capabilities |
US11532306B2 (en) | 2017-05-16 | 2022-12-20 | Apple Inc. | Detecting a trigger of a digital assistant |
US10657328B2 (en) | 2017-06-02 | 2020-05-19 | Apple Inc. | Multi-task recurrent neural network architecture for efficient morphology handling in neural language modeling |
US10445429B2 (en) | 2017-09-21 | 2019-10-15 | Apple Inc. | Natural language understanding using vocabularies with compressed serialized tries |
US10755051B2 (en) | 2017-09-29 | 2020-08-25 | Apple Inc. | Rule-based natural language processing |
US10636424B2 (en) | 2017-11-30 | 2020-04-28 | Apple Inc. | Multi-turn canned dialog |
US10733982B2 (en) | 2018-01-08 | 2020-08-04 | Apple Inc. | Multi-directional dialog |
US10733375B2 (en) | 2018-01-31 | 2020-08-04 | Apple Inc. | Knowledge-based framework for improving natural language understanding |
US10789959B2 (en) | 2018-03-02 | 2020-09-29 | Apple Inc. | Training speaker recognition models for digital assistants |
US10592604B2 (en) | 2018-03-12 | 2020-03-17 | Apple Inc. | Inverse text normalization for automatic speech recognition |
US10818288B2 (en) | 2018-03-26 | 2020-10-27 | Apple Inc. | Natural assistant interaction |
US11710482B2 (en) | 2018-03-26 | 2023-07-25 | Apple Inc. | Natural assistant interaction |
US10909331B2 (en) | 2018-03-30 | 2021-02-02 | Apple Inc. | Implicit identification of translation payload with neural machine translation |
US10928918B2 (en) | 2018-05-07 | 2021-02-23 | Apple Inc. | Raise to speak |
US11854539B2 (en) | 2018-05-07 | 2023-12-26 | Apple Inc. | Intelligent automated assistant for delivering content from user experiences |
US11169616B2 (en) | 2018-05-07 | 2021-11-09 | Apple Inc. | Raise to speak |
US11145294B2 (en) | 2018-05-07 | 2021-10-12 | Apple Inc. | Intelligent automated assistant for delivering content from user experiences |
US10984780B2 (en) | 2018-05-21 | 2021-04-20 | Apple Inc. | Global semantic word embeddings using bi-directional recurrent neural networks |
US10684703B2 (en) | 2018-06-01 | 2020-06-16 | Apple Inc. | Attention aware virtual assistant dismissal |
US11495218B2 (en) | 2018-06-01 | 2022-11-08 | Apple Inc. | Virtual assistant operation in multi-device environments |
US11009970B2 (en) | 2018-06-01 | 2021-05-18 | Apple Inc. | Attention aware virtual assistant dismissal |
US10892996B2 (en) | 2018-06-01 | 2021-01-12 | Apple Inc. | Variable latency device coordination |
US10403283B1 (en) | 2018-06-01 | 2019-09-03 | Apple Inc. | Voice interaction at a primary device to access call functionality of a companion device |
US10720160B2 (en) | 2018-06-01 | 2020-07-21 | Apple Inc. | Voice interaction at a primary device to access call functionality of a companion device |
US11386266B2 (en) | 2018-06-01 | 2022-07-12 | Apple Inc. | Text correction |
US11431642B2 (en) | 2018-06-01 | 2022-08-30 | Apple Inc. | Variable latency device coordination |
US10984798B2 (en) | 2018-06-01 | 2021-04-20 | Apple Inc. | Voice interaction at a primary device to access call functionality of a companion device |
US10496705B1 (en) | 2018-06-03 | 2019-12-03 | Apple Inc. | Accelerated task performance |
US10504518B1 (en) | 2018-06-03 | 2019-12-10 | Apple Inc. | Accelerated task performance |
US10944859B2 (en) | 2018-06-03 | 2021-03-09 | Apple Inc. | Accelerated task performance |
US11010561B2 (en) | 2018-09-27 | 2021-05-18 | Apple Inc. | Sentiment prediction from textual data |
US11462215B2 (en) | 2018-09-28 | 2022-10-04 | Apple Inc. | Multi-modal inputs for voice commands |
US11170166B2 (en) | 2018-09-28 | 2021-11-09 | Apple Inc. | Neural typographical error modeling via generative adversarial networks |
US10839159B2 (en) | 2018-09-28 | 2020-11-17 | Apple Inc. | Named entity normalization in a spoken dialog system |
US11475898B2 (en) | 2018-10-26 | 2022-10-18 | Apple Inc. | Low-latency multi-speaker speech recognition |
US11638059B2 (en) | 2019-01-04 | 2023-04-25 | Apple Inc. | Content playback on multiple devices |
US11348573B2 (en) | 2019-03-18 | 2022-05-31 | Apple Inc. | Multimodality in digital assistant systems |
US11423908B2 (en) | 2019-05-06 | 2022-08-23 | Apple Inc. | Interpreting spoken requests |
US11307752B2 (en) | 2019-05-06 | 2022-04-19 | Apple Inc. | User configurable task triggers |
US11217251B2 (en) | 2019-05-06 | 2022-01-04 | Apple Inc. | Spoken notifications |
US11475884B2 (en) | 2019-05-06 | 2022-10-18 | Apple Inc. | Reducing digital assistant latency when a language is incorrectly determined |
US11140099B2 (en) | 2019-05-21 | 2021-10-05 | Apple Inc. | Providing message response suggestions |
US11289073B2 (en) | 2019-05-31 | 2022-03-29 | Apple Inc. | Device text to speech |
US11657813B2 (en) | 2019-05-31 | 2023-05-23 | Apple Inc. | Voice identification in digital assistant systems |
US11237797B2 (en) | 2019-05-31 | 2022-02-01 | Apple Inc. | User activity shortcut suggestions |
US11496600B2 (en) | 2019-05-31 | 2022-11-08 | Apple Inc. | Remote execution of machine-learned models |
US11360739B2 (en) | 2019-05-31 | 2022-06-14 | Apple Inc. | User activity shortcut suggestions |
US11360641B2 (en) | 2019-06-01 | 2022-06-14 | Apple Inc. | Increasing the relevance of new available information |
US11488406B2 (en) | 2019-09-25 | 2022-11-01 | Apple Inc. | Text detection using global geometry estimators |
US11620999B2 (en) | 2020-09-18 | 2023-04-04 | Apple Inc. | Reducing device processing of unintended audio |
Also Published As
Publication number | Publication date |
---|---|
US8380497B2 (en) | 2013-02-19 |
WO2010045450A1 (en) | 2010-04-22 |
KR20130019017A (ko) | 2013-02-25 |
KR101246954B1 (ko) | 2013-03-25 |
CN102187388A (zh) | 2011-09-14 |
US20100094625A1 (en) | 2010-04-15 |
KR20110081295A (ko) | 2011-07-13 |
JP2012506073A (ja) | 2012-03-08 |
KR20130042649A (ko) | 2013-04-26 |
EP2351020A1 (en) | 2011-08-03 |
JP5596039B2 (ja) | 2014-09-24 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
TW201028996A (en) | | Methods and apparatus for noise estimation |
US10504539B2 (en) | | Voice activity detection systems and methods |
Davis et al. | | Statistical voice activity detection using low-variance spectrum estimation and an adaptive threshold |
US8239196B1 (en) | | System and method for multi-channel multi-feature speech/noise classification for noise suppression |
JP4307557B2 (ja) | | Voice activity detector |
JP4520732B2 (ja) | | Noise reduction device and noise reduction method |
EP1547061B1 (en) | | Multichannel voice detection in adverse environments |
US8135586B2 (en) | | Method and apparatus for estimating noise by using harmonics of voice signal |
CN110739005A (zh) | | Real-time speech enhancement method for transient noise suppression |
WO2017136018A1 (en) | | Babble noise suppression |
US8744846B2 (en) | | Procedure for processing noisy speech signals, and apparatus and computer program therefor |
Choi et al. | | On using acoustic environment classification for statistical model-based speech enhancement |
JP2014122939A (ja) | | Speech processing device and method, and program |
JP2011033717A (ja) | | Noise suppression device |
CN110265058A (zh) | | Estimating background noise in an audio signal |
Zhang et al. | | A novel fast nonstationary noise tracking approach based on MMSE spectral power estimator |
TW200818802A (en) | | Systems, methods, and apparatus for signal change detection |
US10229686B2 (en) | | Methods and apparatus for speech segmentation using multiple metadata |
US20230095174A1 (en) | | Noise supression for speech enhancement |
US11183172B2 (en) | | Detection of fricatives in speech signals |
KR100798056B1 (ko) | | Speech processing method for improving sound quality in highly non-stationary noise environments |
Gilg et al. | | Methodology for the design of a robust voice activity detector for speech enhancement |
Martin et al. | | Robust speech/non-speech detection based on LDA-derived parameter and voicing parameter for speech recognition in noisy environments |
US20220068270A1 (en) | | Speech section detection method |
Meddah et al. | | Speech enhancement using Rao–Blackwellized particle filtering of complex DFT coefficients |