JP2005258439A - 文字から音声への変換のための相互情報量基準を用いた大きな文字音素単位の生成 - Google Patents
文字から音声への変換のための相互情報量基準を用いた大きな文字音素単位の生成 Download PDFInfo
- Publication number
- JP2005258439A JP2005258439A JP2005063646A JP2005063646A JP2005258439A JP 2005258439 A JP2005258439 A JP 2005258439A JP 2005063646 A JP2005063646 A JP 2005063646A JP 2005063646 A JP2005063646 A JP 2005063646A JP 2005258439 A JP2005258439 A JP 2005258439A
- Authority
- JP
- Japan
- Prior art keywords
- character
- phoneme
- word
- mutual information
- computer
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Ceased
Links
Images
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L13/00—Speech synthesis; Text to speech systems
- G10L13/08—Text analysis or generation of parameters for speech synthesis out of text, e.g. grapheme to phoneme translation, prosody generation or stress or intonation determination
Abstract
【解決手段】 本発明では、単語のセットにある文字音素単位の対の相互情報量スコアが求められる。各文字音素単位は、少なくとも1つの文字を含む。相互情報量スコアに基づいて、1つの文字音素単位の対の文字音素単位が組み合わせられる。それにより、新しい文字音素単位が形成される。本発明の一態様では、相互情報量を使用して音節に区分された単語に基づいて、音節のnグラムモデルがトレーニングされる。音節のnグラムモデルを使用して、新しい単語の音声表現を音節に区分する。同様に、相互情報量を使用して形態素の一覧が形成され、形態素のnグラムがトレーニングされ、そのnグラムを使用して新しい単語を形態素の連続に区分することができる。
【選択図】 図4
Description
音 ph:f o:ow n:n e:#
が何らかの形で与えられると仮定する。そして、各単語の文字音素定義を使用して、「n」個の文字音素が連続する尤度を推定する。例えば、文字音素のトライグラム(trigram)では、3つの文字音素が連続する確率Pr(g3|g1g2)が、文字音素発音を有するトレーニング辞書から推定される。
phone: p:f h:ow o:n n:# e:#
box: b:d o:aa x:k&s
このように、初めの文字音素単位はそれぞれ、正確に1つの文字とゼロ個または1つ以上の音を有する。これらの最初の単位は、包括的にl:p*と表記することができる。
p:f h:# o:ow n:n e:#
が、最も有望な位置合わせとして選択されることができる。
130 システムメモリ
134、144 オペレーティングシステム
135、145 アプリケーションプログラム
136、146 他のプログラムモジュール
137、147 プログラムデータ
140 取り外し不能、不揮発性メモリインターフェース
150 取り外し可能、不揮発性メモリインターフェース
160 ユーザ入力インターフェース
161 ポインティングデバイス
162 キーボード
163 マイクロフォン
170 ネットワークインターフェース
171 ローカルエリアネットワーク
173 ワイドエリアネットワーク
172 モデム
180 リモートコンピュータ
185 リモートアプリケーションプログラム
190 ビデオインターフェース
191 モニタ
195 出力周辺インターフェース
196 プリンタ
197 スピーカ
Claims (17)
- 単語を構成部分に区分する方法であって、
文字音素単位の相互情報量スコアを求めるステップであって、各文字音素単位は、単語の綴りに少なくとも1つの文字を備えるステップと、
前記相互情報量スコアを使用して、文字音素単位を組み合わせてより大きな文字音素単位にするステップと、
単語を構成部分に区分して文字音素の連続を形成するステップと
を備えることを特徴とする方法。 - 文字音素を組み合わせるステップは、各文字音素の文字を組み合わせて、前記より大きな文字音素単位の文字の連続を生成し、各文字音素の音を組み合わせて、前記より大きな文字音素単位の音の連続を生成するステップを備えることを特徴とする請求項1に記載の方法。
- 前記区分された単語を使用してモデルを生成するステップをさらに備えることを特徴とする請求項1に記載の方法。
- 前記モデルは、単語中での前後関係を考慮して文字音素単位の確率を記述することを特徴とする請求項3に記載の方法。
- 前記モデルを使用して、単語の綴りを考慮して前記単語の発音を判定するステップをさらに備えることを特徴とする請求項4に記載の方法。
- 前記相互情報量スコアを使用するステップは、より大きな1つの文字音素単位について求められた少なくとも2つの相互情報量スコアを合計して強さを形成するステップを備えることを特徴とする請求項1に記載の方法。
- 単語のセットにある文字音素単位の対の相互情報量スコアを求めるステップであって、各文字音素単位は、少なくとも1つの文字を備えるステップと、
前記相互情報量スコアに基づいて、1つの文字音素単位の対の文字音素単位を組み合わせて新しい文字音素単位を形成するステップと、
部分的に前記新しい文字音素単位に基づいて、単語の文字音素単位のセットを特定するステップと
を行うコンピュータ実行可能命令を有することを特徴とするコンピュータ可読媒体。 - 前記文字音素単位を組み合わせるステップは、前記文字音素単位の文字を組み合わせて、前記新しい文字音素単位の文字の連続を形成するステップを備えることを特徴とする請求項7に記載のコンピュータ可読媒体。
- 前記文字音素単位を組み合わせるステップはさらに、前記文字音素単位の音を組み合わせて、前記新しい文字音素単位の音の連続を形成するステップを備えることを特徴とする請求項8に記載のコンピュータ可読媒体。
- 辞書中で各単語の文字音素のセットを特定するステップをさらに備えることを特徴とする請求項7に記載のコンピュータ可読媒体。
- 前記辞書中で前記単語について特定された文字音素のセットを使用してモデルをトレーニングするステップをさらに備えることを特徴とする請求項10に記載のコンピュータ可読媒体。
- 前記モデルは、文字音素単位が単語中に出現する確率を記述することを特徴とする請求項11に記載のコンピュータ可読媒体。
- 前記確率は、前記単語中の少なくとも1つの他の文字音素に基づくことを特徴とする請求項12に記載のコンピュータ可読媒体。
- 前記モデルを使用して、単語の綴りを考慮して前記単語の発音を判定するステップをさらに備えることを特徴とする請求項11に記載のコンピュータ可読媒体。
- 前記相互情報量スコアに基づいて文字音素単位を組み合わせるステップは、新しい文字音素単位に関連付けられた少なくとも2つの相互情報量スコアを合計するステップを備えることを特徴とする請求項7に記載のコンピュータ可読媒体。
- 単語を音節に区分する方法であって、
相互情報量スコアを使用して単語のセットを音声音節に区分するステップと、
前記区分された単語のセットを使用して、音節のnグラムモデルをトレーニングするステップと、
前記音節のnグラムモデルを使用して、強制的な位置合わせを介して単語の音声表現を音節に区分するステップと
を備えることを特徴とする方法。 - 単語を形態素に区分する方法であって、
相互情報量スコアを使用して単語のセットを形態素に区分するステップと、
前記区分された単語のセットを使用して、形態素のnグラムモデルをトレーニングするステップと、
前記形態素のnグラムモデルを使用して、強制的な位置合わせを介して単語を形態素に区分するステップと
を備えることを特徴とする方法。
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US10/797,358 US7693715B2 (en) | 2004-03-10 | 2004-03-10 | Generating large units of graphonemes with mutual information criterion for letter to sound conversion |
Publications (1)
Publication Number | Publication Date |
---|---|
JP2005258439A true JP2005258439A (ja) | 2005-09-22 |
Family
ID=34827631
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
JP2005063646A Ceased JP2005258439A (ja) | 2004-03-10 | 2005-03-08 | 文字から音声への変換のための相互情報量基準を用いた大きな文字音素単位の生成 |
Country Status (7)
Country | Link |
---|---|
US (1) | US7693715B2 (ja) |
EP (1) | EP1575029B1 (ja) |
JP (1) | JP2005258439A (ja) |
KR (1) | KR100996817B1 (ja) |
CN (1) | CN1667699B (ja) |
AT (1) | ATE508453T1 (ja) |
DE (1) | DE602005027770D1 (ja) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2008275731A (ja) * | 2007-04-26 | 2008-11-13 | Asahi Kasei Corp | テキスト発音記号変換辞書作成装置、認識語彙辞書作成装置、及び音声認識装置 |
Families Citing this family (227)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
AU6630800A (en) * | 1999-08-13 | 2001-03-13 | Pixo, Inc. | Methods and apparatuses for display and traversing of links in page character array |
US8645137B2 (en) | 2000-03-16 | 2014-02-04 | Apple Inc. | Fast, language-independent method for user authentication by voice |
JP3662519B2 (ja) * | 2000-07-13 | 2005-06-22 | シャープ株式会社 | 光ピックアップ |
ITFI20010199A1 (it) | 2001-10-22 | 2003-04-22 | Riccardo Vieri | Sistema e metodo per trasformare in voce comunicazioni testuali ed inviarle con una connessione internet a qualsiasi apparato telefonico |
US8677377B2 (en) | 2005-09-08 | 2014-03-18 | Apple Inc. | Method and apparatus for building an intelligent automated assistant |
US7633076B2 (en) | 2005-09-30 | 2009-12-15 | Apple Inc. | Automated response to and sensing of user activity in portable devices |
US9318108B2 (en) | 2010-01-18 | 2016-04-19 | Apple Inc. | Intelligent automated assistant |
US8977255B2 (en) | 2007-04-03 | 2015-03-10 | Apple Inc. | Method and system for operating a multi-function portable electronic device using voice-activation |
US9053089B2 (en) | 2007-10-02 | 2015-06-09 | Apple Inc. | Part-of-speech tagging using latent analogy |
US8620662B2 (en) * | 2007-11-20 | 2013-12-31 | Apple Inc. | Context-aware unit selection |
US7991615B2 (en) * | 2007-12-07 | 2011-08-02 | Microsoft Corporation | Grapheme-to-phoneme conversion using acoustic data |
US10002189B2 (en) | 2007-12-20 | 2018-06-19 | Apple Inc. | Method and apparatus for searching using an active ontology |
US9330720B2 (en) | 2008-01-03 | 2016-05-03 | Apple Inc. | Methods and apparatus for altering audio output signals |
US8065143B2 (en) | 2008-02-22 | 2011-11-22 | Apple Inc. | Providing text input using speech data and non-speech data |
US20090240501A1 (en) * | 2008-03-19 | 2009-09-24 | Microsoft Corporation | Automatically generating new words for letter-to-sound conversion |
US8996376B2 (en) | 2008-04-05 | 2015-03-31 | Apple Inc. | Intelligent text-to-speech conversion |
US10496753B2 (en) | 2010-01-18 | 2019-12-03 | Apple Inc. | Automatically adapting user interfaces for hands-free interaction |
US8464150B2 (en) | 2008-06-07 | 2013-06-11 | Apple Inc. | Automatic language identification for dynamic text processing |
US20100030549A1 (en) | 2008-07-31 | 2010-02-04 | Lee Michael M | Mobile device having human language translation capability with positional feedback |
US8768702B2 (en) | 2008-09-05 | 2014-07-01 | Apple Inc. | Multi-tiered voice feedback in an electronic device |
US8898568B2 (en) | 2008-09-09 | 2014-11-25 | Apple Inc. | Audio user interface |
US8712776B2 (en) | 2008-09-29 | 2014-04-29 | Apple Inc. | Systems and methods for selective text to speech synthesis |
US8583418B2 (en) | 2008-09-29 | 2013-11-12 | Apple Inc. | Systems and methods of detecting language and natural language strings for text to speech synthesis |
US8676904B2 (en) | 2008-10-02 | 2014-03-18 | Apple Inc. | Electronic devices with voice command and contextual data processing capabilities |
WO2010067118A1 (en) | 2008-12-11 | 2010-06-17 | Novauris Technologies Limited | Speech recognition involving a mobile device |
KR101057191B1 (ko) * | 2008-12-30 | 2011-08-16 | 주식회사 하이닉스반도체 | 반도체 소자의 미세 패턴 형성방법 |
US8862252B2 (en) * | 2009-01-30 | 2014-10-14 | Apple Inc. | Audio user interface for displayless electronic device |
US8380507B2 (en) | 2009-03-09 | 2013-02-19 | Apple Inc. | Systems and methods for determining the language to use for speech generated by a text to speech engine |
US10540976B2 (en) | 2009-06-05 | 2020-01-21 | Apple Inc. | Contextual voice commands |
US10241644B2 (en) | 2011-06-03 | 2019-03-26 | Apple Inc. | Actionable reminder entries |
US20120309363A1 (en) | 2011-06-03 | 2012-12-06 | Apple Inc. | Triggering notifications associated with tasks items that represent tasks to perform |
US9858925B2 (en) | 2009-06-05 | 2018-01-02 | Apple Inc. | Using context information to facilitate processing of commands in a virtual assistant |
US10241752B2 (en) | 2011-09-30 | 2019-03-26 | Apple Inc. | Interface for a virtual digital assistant |
CN101576872B (zh) * | 2009-06-16 | 2014-05-28 | 北京系统工程研究所 | 一种中文文本处理方法及装置 |
US9431006B2 (en) | 2009-07-02 | 2016-08-30 | Apple Inc. | Methods and apparatuses for automatic speech recognition |
KR101083455B1 (ko) * | 2009-07-17 | 2011-11-16 | 엔에이치엔(주) | 통계 데이터에 기초한 사용자 질의 교정 시스템 및 방법 |
US20110110534A1 (en) * | 2009-11-12 | 2011-05-12 | Apple Inc. | Adjustable voice output based on device status |
US8682649B2 (en) | 2009-11-12 | 2014-03-25 | Apple Inc. | Sentiment prediction from textual data |
US8600743B2 (en) | 2010-01-06 | 2013-12-03 | Apple Inc. | Noise profile determination for voice-related feature |
US8311838B2 (en) | 2010-01-13 | 2012-11-13 | Apple Inc. | Devices and methods for identifying a prompt corresponding to a voice input in a sequence of prompts |
US8381107B2 (en) | 2010-01-13 | 2013-02-19 | Apple Inc. | Adaptive audio feedback system and method |
US10553209B2 (en) | 2010-01-18 | 2020-02-04 | Apple Inc. | Systems and methods for hands-free notification summaries |
US10705794B2 (en) | 2010-01-18 | 2020-07-07 | Apple Inc. | Automatically adapting user interfaces for hands-free interaction |
US10276170B2 (en) | 2010-01-18 | 2019-04-30 | Apple Inc. | Intelligent automated assistant |
US10679605B2 (en) | 2010-01-18 | 2020-06-09 | Apple Inc. | Hands-free list-reading by intelligent automated assistant |
US8977584B2 (en) | 2010-01-25 | 2015-03-10 | Newvaluexchange Global Ai Llp | Apparatuses, methods and systems for a digital conversation management platform |
US8682667B2 (en) | 2010-02-25 | 2014-03-25 | Apple Inc. | User profiling for selecting user specific voice input processing information |
US8639516B2 (en) | 2010-06-04 | 2014-01-28 | Apple Inc. | User-specific noise suppression for voice quality improvements |
US8713021B2 (en) | 2010-07-07 | 2014-04-29 | Apple Inc. | Unsupervised document clustering using latent semantic density analysis |
US8719006B2 (en) | 2010-08-27 | 2014-05-06 | Apple Inc. | Combined statistical and rule-based part-of-speech tagging for text-to-speech synthesis |
US8719014B2 (en) | 2010-09-27 | 2014-05-06 | Apple Inc. | Electronic device with text error correction based on voice recognition data |
US20120089400A1 (en) * | 2010-10-06 | 2012-04-12 | Caroline Gilles Henton | Systems and methods for using homophone lexicons in english text-to-speech |
US10515147B2 (en) | 2010-12-22 | 2019-12-24 | Apple Inc. | Using statistical language models for contextual lookup |
US10762293B2 (en) | 2010-12-22 | 2020-09-01 | Apple Inc. | Using parts-of-speech tagging and named entity recognition for spelling correction |
US8781836B2 (en) | 2011-02-22 | 2014-07-15 | Apple Inc. | Hearing assistance system for providing consistent human speech |
US9262612B2 (en) | 2011-03-21 | 2016-02-16 | Apple Inc. | Device access using voice authentication |
US9607044B2 (en) | 2011-03-31 | 2017-03-28 | Tibco Software Inc. | Systems and methods for searching multiple related tables |
WO2012134488A1 (en) * | 2011-03-31 | 2012-10-04 | Tibco Software Inc. | Relational database joins for inexact matching |
US10672399B2 (en) | 2011-06-03 | 2020-06-02 | Apple Inc. | Switching between text data and audio data based on a mapping |
US10057736B2 (en) | 2011-06-03 | 2018-08-21 | Apple Inc. | Active transport based notifications |
US8812294B2 (en) | 2011-06-21 | 2014-08-19 | Apple Inc. | Translating phrases from one language into another using an order-based set of declarative rules |
US8706472B2 (en) | 2011-08-11 | 2014-04-22 | Apple Inc. | Method for disambiguating multiple readings in language conversion |
US8994660B2 (en) | 2011-08-29 | 2015-03-31 | Apple Inc. | Text correction processing |
US8762156B2 (en) | 2011-09-28 | 2014-06-24 | Apple Inc. | Speech recognition repair using contextual information |
US10134385B2 (en) | 2012-03-02 | 2018-11-20 | Apple Inc. | Systems and methods for name pronunciation |
US9483461B2 (en) | 2012-03-06 | 2016-11-01 | Apple Inc. | Handling speech synthesis of content for multiple languages |
US9280610B2 (en) | 2012-05-14 | 2016-03-08 | Apple Inc. | Crowd sourcing information to fulfill user requests |
US10417037B2 (en) | 2012-05-15 | 2019-09-17 | Apple Inc. | Systems and methods for integrating third party services with a digital assistant |
US8775442B2 (en) | 2012-05-15 | 2014-07-08 | Apple Inc. | Semantic search using a single-source semantic model |
US9721563B2 (en) | 2012-06-08 | 2017-08-01 | Apple Inc. | Name recognition system |
WO2013185109A2 (en) | 2012-06-08 | 2013-12-12 | Apple Inc. | Systems and methods for recognizing textual identifiers within a plurality of words |
US9495129B2 (en) | 2012-06-29 | 2016-11-15 | Apple Inc. | Device, method, and user interface for voice-activated navigation and browsing of a document |
US20140067394A1 (en) * | 2012-08-28 | 2014-03-06 | King Abdulaziz City For Science And Technology | System and method for decoding speech |
US9576574B2 (en) | 2012-09-10 | 2017-02-21 | Apple Inc. | Context-sensitive handling of interruptions by intelligent digital assistant |
US9547647B2 (en) | 2012-09-19 | 2017-01-17 | Apple Inc. | Voice-based media searching |
US8935167B2 (en) | 2012-09-25 | 2015-01-13 | Apple Inc. | Exemplar-based latent perceptual modeling for automatic speech recognition |
CN113470640B (zh) | 2013-02-07 | 2022-04-26 | 苹果公司 | 数字助理的语音触发器 |
US10642574B2 (en) | 2013-03-14 | 2020-05-05 | Apple Inc. | Device, method, and graphical user interface for outputting captions |
US10652394B2 (en) | 2013-03-14 | 2020-05-12 | Apple Inc. | System and method for processing voicemail |
US9733821B2 (en) | 2013-03-14 | 2017-08-15 | Apple Inc. | Voice control to diagnose inadvertent activation of accessibility features |
US10572476B2 (en) | 2013-03-14 | 2020-02-25 | Apple Inc. | Refining a search based on schedule items |
US9368114B2 (en) | 2013-03-14 | 2016-06-14 | Apple Inc. | Context-sensitive handling of interruptions |
US9977779B2 (en) | 2013-03-14 | 2018-05-22 | Apple Inc. | Automatic supplementation of word correction dictionaries |
EP2973002B1 (en) | 2013-03-15 | 2019-06-26 | Apple Inc. | User training by intelligent digital assistant |
CN105027197B (zh) | 2013-03-15 | 2018-12-14 | 苹果公司 | 训练至少部分语音命令系统 |
KR102057795B1 (ko) | 2013-03-15 | 2019-12-19 | 애플 인크. | 콘텍스트-민감성 방해 처리 |
US10748529B1 (en) | 2013-03-15 | 2020-08-18 | Apple Inc. | Voice activated device for use with a voice-based digital assistant |
WO2014144579A1 (en) | 2013-03-15 | 2014-09-18 | Apple Inc. | System and method for updating an adaptive speech recognition model |
US9582608B2 (en) | 2013-06-07 | 2017-02-28 | Apple Inc. | Unified ranking with entropy-weighted information for phrase-based semantic auto-completion |
WO2014197334A2 (en) | 2013-06-07 | 2014-12-11 | Apple Inc. | System and method for user-specified pronunciation of words for speech synthesis and recognition |
WO2014197336A1 (en) | 2013-06-07 | 2014-12-11 | Apple Inc. | System and method for detecting errors in interactions with a voice-based digital assistant |
WO2014197335A1 (en) | 2013-06-08 | 2014-12-11 | Apple Inc. | Interpreting and acting upon commands that involve sharing information with remote devices |
DE112014002747T5 (de) | 2013-06-09 | 2016-03-03 | Apple Inc. | Vorrichtung, Verfahren und grafische Benutzerschnittstelle zum Ermöglichen einer Konversationspersistenz über zwei oder mehr Instanzen eines digitalen Assistenten |
US10176167B2 (en) | 2013-06-09 | 2019-01-08 | Apple Inc. | System and method for inferring user intent from speech inputs |
CN105265005B (zh) | 2013-06-13 | 2019-09-17 | 苹果公司 | 用于由语音命令发起的紧急呼叫的系统和方法 |
AU2014306221B2 (en) | 2013-08-06 | 2017-04-06 | Apple Inc. | Auto-activating smart responses based on activities from remote devices |
US10296160B2 (en) | 2013-12-06 | 2019-05-21 | Apple Inc. | Method for extracting salient dialog usage from live data |
US9620105B2 (en) | 2014-05-15 | 2017-04-11 | Apple Inc. | Analyzing audio input for efficient speech and music recognition |
US10592095B2 (en) | 2014-05-23 | 2020-03-17 | Apple Inc. | Instantaneous speaking of content on touch devices |
US9502031B2 (en) | 2014-05-27 | 2016-11-22 | Apple Inc. | Method for supporting dynamic grammars in WFST-based ASR |
US9842101B2 (en) | 2014-05-30 | 2017-12-12 | Apple Inc. | Predictive conversion of language input |
US10078631B2 (en) | 2014-05-30 | 2018-09-18 | Apple Inc. | Entropy-guided text prediction using combined word and character n-gram language models |
US9760559B2 (en) | 2014-05-30 | 2017-09-12 | Apple Inc. | Predictive text input |
US10289433B2 (en) | 2014-05-30 | 2019-05-14 | Apple Inc. | Domain specific language for encoding assistant dialog |
US9785630B2 (en) | 2014-05-30 | 2017-10-10 | Apple Inc. | Text prediction using combined word N-gram and unigram language models |
US9734193B2 (en) | 2014-05-30 | 2017-08-15 | Apple Inc. | Determining domain salience ranking from ambiguous words in natural speech |
US9633004B2 (en) | 2014-05-30 | 2017-04-25 | Apple Inc. | Better resolution when referencing to concepts |
US10170123B2 (en) | 2014-05-30 | 2019-01-01 | Apple Inc. | Intelligent assistant for home automation |
US9430463B2 (en) | 2014-05-30 | 2016-08-30 | Apple Inc. | Exemplar-based natural language processing |
AU2015266863B2 (en) | 2014-05-30 | 2018-03-15 | Apple Inc. | Multi-command single utterance input method |
US9715875B2 (en) | 2014-05-30 | 2017-07-25 | Apple Inc. | Reducing the need for manual start/end-pointing and trigger phrases |
US9338493B2 (en) | 2014-06-30 | 2016-05-10 | Apple Inc. | Intelligent automated assistant for TV user interactions |
US10659851B2 (en) | 2014-06-30 | 2020-05-19 | Apple Inc. | Real-time digital assistant knowledge updates |
US10446141B2 (en) | 2014-08-28 | 2019-10-15 | Apple Inc. | Automatic speech recognition based on user feedback |
US9818400B2 (en) | 2014-09-11 | 2017-11-14 | Apple Inc. | Method and apparatus for discovering trending terms in speech requests |
US10789041B2 (en) | 2014-09-12 | 2020-09-29 | Apple Inc. | Dynamic thresholds for always listening speech trigger |
US9668121B2 (en) | 2014-09-30 | 2017-05-30 | Apple Inc. | Social reminders |
US9886432B2 (en) | 2014-09-30 | 2018-02-06 | Apple Inc. | Parsimonious handling of word inflection via categorical stem + suffix N-gram language models |
US10127911B2 (en) | 2014-09-30 | 2018-11-13 | Apple Inc. | Speaker identification and unsupervised speaker adaptation techniques |
US9646609B2 (en) | 2014-09-30 | 2017-05-09 | Apple Inc. | Caching apparatus for serving phonetic pronunciations |
US10074360B2 (en) | 2014-09-30 | 2018-09-11 | Apple Inc. | Providing an indication of the suitability of speech recognition |
US10552013B2 (en) | 2014-12-02 | 2020-02-04 | Apple Inc. | Data detection |
US9711141B2 (en) | 2014-12-09 | 2017-07-18 | Apple Inc. | Disambiguating heteronyms in speech synthesis |
US9865280B2 (en) | 2015-03-06 | 2018-01-09 | Apple Inc. | Structured dictation using intelligent automated assistants |
US10152299B2 (en) | 2015-03-06 | 2018-12-11 | Apple Inc. | Reducing response latency of intelligent automated assistants |
US9721566B2 (en) | 2015-03-08 | 2017-08-01 | Apple Inc. | Competing devices responding to voice triggers |
US10567477B2 (en) | 2015-03-08 | 2020-02-18 | Apple Inc. | Virtual assistant continuity |
US9886953B2 (en) | 2015-03-08 | 2018-02-06 | Apple Inc. | Virtual assistant activation |
US9899019B2 (en) | 2015-03-18 | 2018-02-20 | Apple Inc. | Systems and methods for structured stem and suffix language models |
US9842105B2 (en) | 2015-04-16 | 2017-12-12 | Apple Inc. | Parsimonious continuous-space phrase representations for natural language processing |
US10460227B2 (en) | 2015-05-15 | 2019-10-29 | Apple Inc. | Virtual assistant in a communication session |
US10083688B2 (en) | 2015-05-27 | 2018-09-25 | Apple Inc. | Device voice control for selecting a displayed affordance |
US10127220B2 (en) | 2015-06-04 | 2018-11-13 | Apple Inc. | Language identification from short strings |
US10101822B2 (en) | 2015-06-05 | 2018-10-16 | Apple Inc. | Language input correction |
US9578173B2 (en) | 2015-06-05 | 2017-02-21 | Apple Inc. | Virtual assistant aided communication with 3rd party service in a communication session |
US11025565B2 (en) | 2015-06-07 | 2021-06-01 | Apple Inc. | Personalized prediction of responses for instant messaging |
US10186254B2 (en) | 2015-06-07 | 2019-01-22 | Apple Inc. | Context-based endpoint detection |
US10255907B2 (en) | 2015-06-07 | 2019-04-09 | Apple Inc. | Automatic accent detection using acoustic models |
US9972300B2 (en) * | 2015-06-11 | 2018-05-15 | Genesys Telecommunications Laboratories, Inc. | System and method for outlier identification to remove poor alignments in speech synthesis |
US20160378747A1 (en) | 2015-06-29 | 2016-12-29 | Apple Inc. | Virtual assistant for media playback |
US10671428B2 (en) | 2015-09-08 | 2020-06-02 | Apple Inc. | Distributed personal assistant |
US10747498B2 (en) | 2015-09-08 | 2020-08-18 | Apple Inc. | Zero latency digital assistant |
US9697820B2 (en) | 2015-09-24 | 2017-07-04 | Apple Inc. | Unit-selection text-to-speech synthesis using concatenation-sensitive neural networks |
US10366158B2 (en) | 2015-09-29 | 2019-07-30 | Apple Inc. | Efficient word encoding for recurrent neural network language models |
US11010550B2 (en) | 2015-09-29 | 2021-05-18 | Apple Inc. | Unified language modeling framework for word prediction, auto-completion and auto-correction |
US11587559B2 (en) | 2015-09-30 | 2023-02-21 | Apple Inc. | Intelligent device identification |
US10691473B2 (en) | 2015-11-06 | 2020-06-23 | Apple Inc. | Intelligent automated assistant in a messaging environment |
US10049668B2 (en) | 2015-12-02 | 2018-08-14 | Apple Inc. | Applying neural network language models to weighted finite state transducers for automatic speech recognition |
US10223066B2 (en) | 2015-12-23 | 2019-03-05 | Apple Inc. | Proactive assistance based on dialog communication between devices |
CN105590623B (zh) * | 2016-02-24 | 2019-07-30 | 百度在线网络技术(北京)有限公司 | 基于人工智能的字母音素转换模型生成方法及装置 |
US10446143B2 (en) | 2016-03-14 | 2019-10-15 | Apple Inc. | Identification of voice inputs providing credentials |
US9934775B2 (en) | 2016-05-26 | 2018-04-03 | Apple Inc. | Unit-selection text-to-speech synthesis based on predicted concatenation parameters |
US9972304B2 (en) | 2016-06-03 | 2018-05-15 | Apple Inc. | Privacy preserving distributed evaluation framework for embedded personalized systems |
US11227589B2 (en) | 2016-06-06 | 2022-01-18 | Apple Inc. | Intelligent list reading |
US10249300B2 (en) | 2016-06-06 | 2019-04-02 | Apple Inc. | Intelligent list reading |
US10049663B2 (en) | 2016-06-08 | 2018-08-14 | Apple, Inc. | Intelligent automated assistant for media exploration |
DK179588B1 (en) | 2016-06-09 | 2019-02-22 | Apple Inc. | INTELLIGENT AUTOMATED ASSISTANT IN A HOME ENVIRONMENT |
US10490187B2 (en) | 2016-06-10 | 2019-11-26 | Apple Inc. | Digital assistant providing automated status report |
US10586535B2 (en) | 2016-06-10 | 2020-03-10 | Apple Inc. | Intelligent digital assistant in a multi-tasking environment |
US10067938B2 (en) | 2016-06-10 | 2018-09-04 | Apple Inc. | Multilingual word prediction |
US10509862B2 (en) | 2016-06-10 | 2019-12-17 | Apple Inc. | Dynamic phrase expansion of language input |
US10192552B2 (en) | 2016-06-10 | 2019-01-29 | Apple Inc. | Digital assistant providing whispered speech |
DK179049B1 (en) | 2016-06-11 | 2017-09-18 | Apple Inc | Data driven natural language event detection and classification |
DK179343B1 (en) | 2016-06-11 | 2018-05-14 | Apple Inc | Intelligent task discovery |
DK179415B1 (en) | 2016-06-11 | 2018-06-14 | Apple Inc | Intelligent device arbitration and control |
DK201670540A1 (en) | 2016-06-11 | 2018-01-08 | Apple Inc | Application integration with a digital assistant |
US10474753B2 (en) | 2016-09-07 | 2019-11-12 | Apple Inc. | Language identification using recurrent neural networks |
US10043516B2 (en) | 2016-09-23 | 2018-08-07 | Apple Inc. | Intelligent automated assistant |
US11281993B2 (en) | 2016-12-05 | 2022-03-22 | Apple Inc. | Model and ensemble compression for metric learning |
US10593346B2 (en) | 2016-12-22 | 2020-03-17 | Apple Inc. | Rank-reduced token representation for automatic speech recognition |
US11204787B2 (en) | 2017-01-09 | 2021-12-21 | Apple Inc. | Application integration with a digital assistant |
DK201770383A1 (en) | 2017-05-09 | 2018-12-14 | Apple Inc. | USER INTERFACE FOR CORRECTING RECOGNITION ERRORS |
US10417266B2 (en) | 2017-05-09 | 2019-09-17 | Apple Inc. | Context-aware ranking of intelligent response suggestions |
US10726832B2 (en) | 2017-05-11 | 2020-07-28 | Apple Inc. | Maintaining privacy of personal information |
DK201770439A1 (en) | 2017-05-11 | 2018-12-13 | Apple Inc. | Offline personal assistant |
US10395654B2 (en) | 2017-05-11 | 2019-08-27 | Apple Inc. | Text normalization based on a data-driven learning network |
DK179496B1 (en) | 2017-05-12 | 2019-01-15 | Apple Inc. | USER-SPECIFIC Acoustic Models |
DK179745B1 (en) | 2017-05-12 | 2019-05-01 | Apple Inc. | SYNCHRONIZATION AND TASK DELEGATION OF A DIGITAL ASSISTANT |
US11301477B2 (en) | 2017-05-12 | 2022-04-12 | Apple Inc. | Feedback analysis of a digital assistant |
DK201770429A1 (en) | 2017-05-12 | 2018-12-14 | Apple Inc. | LOW-LATENCY INTELLIGENT AUTOMATED ASSISTANT |
DK201770432A1 (en) | 2017-05-15 | 2018-12-21 | Apple Inc. | Hierarchical belief states for digital assistants |
DK201770431A1 (en) | 2017-05-15 | 2018-12-20 | Apple Inc. | Optimizing dialogue policy decisions for digital assistants using implicit feedback |
US20180336275A1 (en) | 2017-05-16 | 2018-11-22 | Apple Inc. | Intelligent automated assistant for media exploration |
DK179560B1 (en) | 2017-05-16 | 2019-02-18 | Apple Inc. | FAR-FIELD EXTENSION FOR DIGITAL ASSISTANT SERVICES |
US20180336892A1 (en) | 2017-05-16 | 2018-11-22 | Apple Inc. | Detecting a trigger of a digital assistant |
US10403278B2 (en) | 2017-05-16 | 2019-09-03 | Apple Inc. | Methods and systems for phonetic matching in digital assistant services |
US10311144B2 (en) | 2017-05-16 | 2019-06-04 | Apple Inc. | Emoji word sense disambiguation |
CN108962218A (zh) * | 2017-05-27 | 2018-12-07 | 北京搜狗科技发展有限公司 | 一种文字发音方法和装置 |
US10657328B2 (en) | 2017-06-02 | 2020-05-19 | Apple Inc. | Multi-task recurrent neural network architecture for efficient morphology handling in neural language modeling |
US10445429B2 (en) | 2017-09-21 | 2019-10-15 | Apple Inc. | Natural language understanding using vocabularies with compressed serialized tries |
US10755051B2 (en) | 2017-09-29 | 2020-08-25 | Apple Inc. | Rule-based natural language processing |
US10636424B2 (en) | 2017-11-30 | 2020-04-28 | Apple Inc. | Multi-turn canned dialog |
US10733982B2 (en) | 2018-01-08 | 2020-08-04 | Apple Inc. | Multi-directional dialog |
US10733375B2 (en) | 2018-01-31 | 2020-08-04 | Apple Inc. | Knowledge-based framework for improving natural language understanding |
US10789959B2 (en) | 2018-03-02 | 2020-09-29 | Apple Inc. | Training speaker recognition models for digital assistants |
US10592604B2 (en) | 2018-03-12 | 2020-03-17 | Apple Inc. | Inverse text normalization for automatic speech recognition |
US10818288B2 (en) | 2018-03-26 | 2020-10-27 | Apple Inc. | Natural assistant interaction |
US10909331B2 (en) | 2018-03-30 | 2021-02-02 | Apple Inc. | Implicit identification of translation payload with neural machine translation |
US10928918B2 (en) | 2018-05-07 | 2021-02-23 | Apple Inc. | Raise to speak |
US11145294B2 (en) | 2018-05-07 | 2021-10-12 | Apple Inc. | Intelligent automated assistant for delivering content from user experiences |
US10984780B2 (en) | 2018-05-21 | 2021-04-20 | Apple Inc. | Global semantic word embeddings using bi-directional recurrent neural networks |
DK179822B1 (da) | 2018-06-01 | 2019-07-12 | Apple Inc. | Voice interaction at a primary device to access call functionality of a companion device |
DK201870355A1 (en) | 2018-06-01 | 2019-12-16 | Apple Inc. | VIRTUAL ASSISTANT OPERATION IN MULTI-DEVICE ENVIRONMENTS |
US11386266B2 (en) | 2018-06-01 | 2022-07-12 | Apple Inc. | Text correction |
DK180639B1 (en) | 2018-06-01 | 2021-11-04 | Apple Inc | DISABILITY OF ATTENTION-ATTENTIVE VIRTUAL ASSISTANT |
US10892996B2 (en) | 2018-06-01 | 2021-01-12 | Apple Inc. | Variable latency device coordination |
US10504518B1 (en) | 2018-06-03 | 2019-12-10 | Apple Inc. | Accelerated task performance |
CN108877777B (zh) * | 2018-08-01 | 2021-04-13 | 云知声(上海)智能科技有限公司 | 一种语音识别方法及系统 |
US11010561B2 (en) | 2018-09-27 | 2021-05-18 | Apple Inc. | Sentiment prediction from textual data |
US11462215B2 (en) | 2018-09-28 | 2022-10-04 | Apple Inc. | Multi-modal inputs for voice commands |
US11170166B2 (en) | 2018-09-28 | 2021-11-09 | Apple Inc. | Neural typographical error modeling via generative adversarial networks |
US10839159B2 (en) | 2018-09-28 | 2020-11-17 | Apple Inc. | Named entity normalization in a spoken dialog system |
US11475898B2 (en) | 2018-10-26 | 2022-10-18 | Apple Inc. | Low-latency multi-speaker speech recognition |
US11638059B2 (en) | 2019-01-04 | 2023-04-25 | Apple Inc. | Content playback on multiple devices |
US11348573B2 (en) | 2019-03-18 | 2022-05-31 | Apple Inc. | Multimodality in digital assistant systems |
US11423908B2 (en) | 2019-05-06 | 2022-08-23 | Apple Inc. | Interpreting spoken requests |
DK201970509A1 (en) | 2019-05-06 | 2021-01-15 | Apple Inc | Spoken notifications |
US11475884B2 (en) | 2019-05-06 | 2022-10-18 | Apple Inc. | Reducing digital assistant latency when a language is incorrectly determined |
US11307752B2 (en) | 2019-05-06 | 2022-04-19 | Apple Inc. | User configurable task triggers |
US11140099B2 (en) | 2019-05-21 | 2021-10-05 | Apple Inc. | Providing message response suggestions |
US11289073B2 (en) | 2019-05-31 | 2022-03-29 | Apple Inc. | Device text to speech |
DK201970511A1 (en) | 2019-05-31 | 2021-02-15 | Apple Inc | Voice identification in digital assistant systems |
US11496600B2 (en) | 2019-05-31 | 2022-11-08 | Apple Inc. | Remote execution of machine-learned models |
DK180129B1 (en) | 2019-05-31 | 2020-06-02 | Apple Inc. | USER ACTIVITY SHORTCUT SUGGESTIONS |
US11360641B2 (en) | 2019-06-01 | 2022-06-14 | Apple Inc. | Increasing the relevance of new available information |
WO2021056255A1 (en) | 2019-09-25 | 2021-04-01 | Apple Inc. | Text detection using global geometry estimators |
CN113257234A (zh) * | 2021-04-15 | 2021-08-13 | 北京百度网讯科技有限公司 | 生成词典与语音识别的方法、装置 |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JPH0283594A (ja) * | 1988-09-20 | 1990-03-23 | Nec Corp | 形態素合成形英単語辞書構成方式 |
JPH09281989A (ja) * | 1996-04-09 | 1997-10-31 | Fuji Xerox Co Ltd | 音声認識装置および方法 |
JP2001324995A (ja) * | 2000-05-17 | 2001-11-22 | Alpine Electronics Inc | 音声認識方法 |
Family Cites Families (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US6067520A (en) * | 1995-12-29 | 2000-05-23 | Lee And Li | System and method of recognizing continuous mandarin speech utilizing chinese hidden markou models |
JP3033514B2 (ja) * | 1997-03-31 | 2000-04-17 | 日本電気株式会社 | 大語彙音声認識方法及び装置 |
CN1111811C (zh) * | 1997-04-14 | 2003-06-18 | 英业达股份有限公司 | 计算机语音信号的发音合成方法 |
US6185524B1 (en) * | 1998-12-31 | 2001-02-06 | Lernout & Hauspie Speech Products N.V. | Method and apparatus for automatic identification of word boundaries in continuous text and computation of word boundary scores |
JP2001249922A (ja) * | 1999-12-28 | 2001-09-14 | Matsushita Electric Ind Co Ltd | 単語分割方式及び装置 |
US6505151B1 (en) * | 2000-03-15 | 2003-01-07 | Bridgewell Inc. | Method for dividing sentences into phrases using entropy calculations of word combinations based on adjacent words |
US6973427B2 (en) | 2000-12-26 | 2005-12-06 | Microsoft Corporation | Method for adding phonetic descriptions to a speech recognition lexicon |
GB0118184D0 (en) * | 2001-07-26 | 2001-09-19 | Ibm | A method for generating homophonic neologisms |
US20030088416A1 (en) * | 2001-11-06 | 2003-05-08 | D.S.P.C. Technologies Ltd. | HMM-based text-to-phoneme parser and method for training same |
AU2003271083A1 (en) * | 2002-10-08 | 2004-05-04 | Matsushita Electric Industrial Co., Ltd. | Language model creation/accumulation device, speech recognition device, language model creation method, and speech recognition method |
DE602005026778D1 (de) * | 2004-01-16 | 2011-04-21 | Scansoft Inc | Corpus-gestützte sprachsynthese auf der basis von segmentrekombination |
-
2004
- 2004-03-10 US US10/797,358 patent/US7693715B2/en not_active Expired - Fee Related
-
2005
- 2005-03-08 DE DE602005027770T patent/DE602005027770D1/de active Active
- 2005-03-08 JP JP2005063646A patent/JP2005258439A/ja not_active Ceased
- 2005-03-08 EP EP05101790A patent/EP1575029B1/en not_active Not-in-force
- 2005-03-08 AT AT05101790T patent/ATE508453T1/de not_active IP Right Cessation
- 2005-03-10 CN CN2005100527542A patent/CN1667699B/zh not_active Expired - Fee Related
- 2005-03-10 KR KR1020050020059A patent/KR100996817B1/ko not_active IP Right Cessation
Patent Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JPH0283594A (ja) * | 1988-09-20 | 1990-03-23 | Nec Corp | 形態素合成形英単語辞書構成方式 |
JPH09281989A (ja) * | 1996-04-09 | 1997-10-31 | Fuji Xerox Co Ltd | 音声認識装置および方法 |
JP2001324995A (ja) * | 2000-05-17 | 2001-11-22 | Alpine Electronics Inc | 音声認識方法 |
Non-Patent Citations (1)
Title |
---|
JPN7011000672; Lucian Galescu, James F. Allen: 'Bi-directional Conversion Between Graphemes and Phonemes Using a Joint N-gram Model' 4th ISCA Tutorial and Research Workshop on Speech Synthesis , 20010829 * |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2008275731A (ja) * | 2007-04-26 | 2008-11-13 | Asahi Kasei Corp | テキスト発音記号変換辞書作成装置、認識語彙辞書作成装置、及び音声認識装置 |
Also Published As
Publication number | Publication date |
---|---|
DE602005027770D1 (de) | 2011-06-16 |
EP1575029B1 (en) | 2011-05-04 |
EP1575029A2 (en) | 2005-09-14 |
CN1667699B (zh) | 2010-06-23 |
US20050203739A1 (en) | 2005-09-15 |
ATE508453T1 (de) | 2011-05-15 |
CN1667699A (zh) | 2005-09-14 |
KR20060043825A (ko) | 2006-05-15 |
KR100996817B1 (ko) | 2010-11-25 |
EP1575029A3 (en) | 2009-04-29 |
US7693715B2 (en) | 2010-04-06 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
JP2005258439A (ja) | 文字から音声への変換のための相互情報量基準を用いた大きな文字音素単位の生成 | |
EP1575030B1 (en) | New-word pronunciation learning using a pronunciation graph | |
US11270687B2 (en) | Phoneme-based contextualization for cross-lingual speech recognition in end-to-end models | |
JP4528535B2 (ja) | テキストから単語誤り率を予測するための方法および装置 | |
US8392191B2 (en) | Chinese prosodic words forming method and apparatus | |
Tachbelie et al. | Using different acoustic, lexical and language modeling units for ASR of an under-resourced language–Amharic | |
KR20120038198A (ko) | 음성 인식 장치 및 방법 | |
JP5276610B2 (ja) | 言語モデル生成装置、そのプログラムおよび音声認識システム | |
JP5180800B2 (ja) | 統計的発音変異モデルを記憶する記録媒体、自動音声認識システム及びコンピュータプログラム | |
JP4826719B2 (ja) | 音声認識システム、音声認識方法、および音声認識プログラム | |
US7003740B2 (en) | Method and apparatus for minimizing weighted networks with link and node labels | |
JP3950957B2 (ja) | 言語処理装置および方法 | |
Sproat et al. | Applications of lexicographic semirings to problems in speech and language processing | |
Vu et al. | Vietnamese automatic speech recognition: The flavor approach | |
Lehnen et al. | N-grams for conditional random fields or a failure-transition (ϕ) posterior for acyclic FSTs | |
JP5137588B2 (ja) | 言語モデル生成装置及び音声認識装置 | |
Liu et al. | The effect of pruning and compression on graphical representations of the output of a speech recognizer | |
JP6879521B1 (ja) | 多言語音声認識およびテーマ−意義素解析方法および装置 | |
Demuynck et al. | Robust phone lattice decoding | |
Vertanen | Efficient computer interfaces using continuous gestures, language models, and speech | |
JP2002073077A (ja) | 単一文章文法を使用して複数組のhmmを復号する方法 | |
Day | New Content Functionality for an Automated Oral Reading Fluency Tutor | |
JP2003223185A (ja) | 音声理解方法及び装置及び音声理解プログラム及び音声理解プログラムを格納した記憶媒体 | |
Picheny et al. | LVCSR Decoding (cont’d) and Robustness |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
A621 | Written request for application examination |
Free format text: JAPANESE INTERMEDIATE CODE: A621 Effective date: 20080304 |
|
A131 | Notification of reasons for refusal |
Free format text: JAPANESE INTERMEDIATE CODE: A131 Effective date: 20110301 |
|
A601 | Written request for extension of time |
Free format text: JAPANESE INTERMEDIATE CODE: A601 Effective date: 20110531 |
|
A602 | Written permission of extension of time |
Free format text: JAPANESE INTERMEDIATE CODE: A602 Effective date: 20110603 |
|
A521 | Request for written amendment filed |
Free format text: JAPANESE INTERMEDIATE CODE: A523 Effective date: 20110701 |
|
A131 | Notification of reasons for refusal |
Free format text: JAPANESE INTERMEDIATE CODE: A131 Effective date: 20120207 |
|
A521 | Request for written amendment filed |
Free format text: JAPANESE INTERMEDIATE CODE: A523 Effective date: 20120507 |
|
A01 | Written decision to grant a patent or to grant a registration (utility model) |
Free format text: JAPANESE INTERMEDIATE CODE: A01 Effective date: 20121109 |
|
A045 | Written measure of dismissal of application [lapsed due to lack of payment] |
Free format text: JAPANESE INTERMEDIATE CODE: A045 Effective date: 20130322 |