JP4363590B2 - Speech synthesis
- Publication number
- JP4363590B2 (application JP2003564856A)
- Authority
- JP
- Japan
- Prior art keywords
- sound
- speech
- text
- prosody
- lessac
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Expired - Fee Related
Classifications
-
- G—PHYSICS
- G09—EDUCATION; CRYPTOGRAPHY; DISPLAY; ADVERTISING; SEALS
- G09B—EDUCATIONAL OR DEMONSTRATION APPLIANCES; APPLIANCES FOR TEACHING, OR COMMUNICATING WITH, THE BLIND, DEAF OR MUTE; MODELS; PLANETARIA; GLOBES; MAPS; DIAGRAMS
- G09B5/00—Electrically-operated educational appliances
- G09B5/04—Electrically-operated educational appliances with audible presentation of the material to be studied
-
- G—PHYSICS
- G09—EDUCATION; CRYPTOGRAPHY; DISPLAY; ADVERTISING; SEALS
- G09B—EDUCATIONAL OR DEMONSTRATION APPLIANCES; APPLIANCES FOR TEACHING, OR COMMUNICATING WITH, THE BLIND, DEAF OR MUTE; MODELS; PLANETARIA; GLOBES; MAPS; DIAGRAMS
- G09B19/00—Teaching not covered by other main groups of this subclass
- G09B19/04—Speaking
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L13/00—Speech synthesis; Text to speech systems
- G10L13/08—Text analysis or generation of parameters for speech synthesis out of text, e.g. grapheme to phoneme translation, prosody generation or stress or intonation determination
- G10L13/10—Prosody rules derived from text; Stress or intonation
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/06—Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
- G10L15/063—Training
- G10L2015/0638—Interactive procedures
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Business, Economics & Management (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Health & Medical Sciences (AREA)
- Computational Linguistics (AREA)
- Acoustics & Sound (AREA)
- Multimedia (AREA)
- General Physics & Mathematics (AREA)
- Educational Technology (AREA)
- Educational Administration (AREA)
- Human Computer Interaction (AREA)
- Theoretical Computer Science (AREA)
- Artificial Intelligence (AREA)
- General Health & Medical Sciences (AREA)
- Entrepreneurship & Innovation (AREA)
- Machine Translation (AREA)
- Electrically Operated Instructional Devices (AREA)
- Document Processing Apparatus (AREA)
Description
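Over the past decade or so, speech recognition technology has made remarkable progress in accuracy and ease of use. Text-to-speech conversion, by contrast, has not yet reached its goal: speech that is pleasant to listen to, natural sounding and easy to understand remains difficult to define, yet it is still eagerly sought.
A method of synthesizing speech using a computer having a memory is disclosed. Text is received into the memory of the computer. A set of lexical parsing rules is applied to divide the text into a plurality of components. Pronunciation and meaning information is associated with these components. A set of phrase parsing rules is used to generate marked-up text. The marked-up text is then phonetically parsed using phonetic parsing rules and Lessac expressive parsing rules. In addition, sounds are stored in the memory of the computer, and pronunciation information is associated with each sound. The sounds corresponding to the text are recalled and, after parsing with the phonetic and expressive parsing rules, a raw speech signal is generated from the marked-up text.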
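The sequence of steps summarized above can be pictured as a small pipeline. The following Python sketch is only an illustration under stated assumptions: the toy lexicon, the function names (`lexical_parse`, `phrase_parse`, `phonetic_parse`, `synthesize`), the markup format, the phrase-final lengthening rule and the sine-tone "sounds" are all hypothetical stand-ins, not the rules or the sound database of the patent.

```python
import math

SAMPLE_RATE = 8000  # Hz; arbitrary value chosen for this sketch

# Hypothetical lexicon: word -> (phoneme list, part of speech).
LEXICON = {
    "good": (["g", "uh", "d"], "adj"),
    "news": (["n", "uw", "z"], "noun"),
}

def lexical_parse(text):
    """Split text into components and attach pronunciation and meaning info."""
    components = []
    for word in text.lower().split():
        word = word.strip(".,!?")
        # Unknown words fall back to one "phoneme" per letter (an assumption).
        phonemes, pos = LEXICON.get(word, (list(word), "unknown"))
        components.append({"word": word, "phonemes": phonemes, "pos": pos})
    return components

def phrase_parse(components):
    """Generate marked-up text; here the markup only flags the phrase-final word."""
    marked = [dict(c, phrase_final=False) for c in components]
    if marked:
        marked[-1]["phrase_final"] = True
    return marked

def phonetic_parse(marked):
    """Flatten the markup into (phoneme, duration_scale) pairs.

    Lengthening phrase-final phonemes stands in for an expressive rule.
    """
    units = []
    for comp in marked:
        scale = 1.5 if comp["phrase_final"] else 1.0
        units.extend((p, scale) for p in comp["phonemes"])
    return units

def make_sound(freq, n_samples=400):
    """Toy stored sound: a short sine tone instead of recorded speech."""
    return [math.sin(2 * math.pi * freq * i / SAMPLE_RATE) for i in range(n_samples)]

# Stored sounds keyed by pronunciation information.
SOUND_DB = {p: make_sound(200 + 20 * i) for i, p in enumerate("abcdefghijklmnopqrstuvwxyz")}
SOUND_DB.update({"uh": make_sound(300), "uw": make_sound(350)})

def synthesize(text):
    """Recall the stored sounds for the parsed text and build a raw signal."""
    units = phonetic_parse(phrase_parse(lexical_parse(text)))
    signal = []
    for phoneme, scale in units:
        sound = SOUND_DB.get(phoneme, [])
        length = int(len(sound) * scale)
        # Crude duration change: repeat the stored sound, then truncate.
        signal.extend((sound * 2)[:length])
    return signal

raw = synthesize("Good news.")
print(len(raw))
```

Keeping each parsing stage as a separate function mirrors the order in which the summary lists them, so a more realistic rule set could replace any one stage without touching the others.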
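The functions, objects and advantages of the invention will become clear from the following description, which refers to drawings showing several embodiments of the invention.
In accordance with the invention, a speech synthesis method is described that aims to solve the problems of current systems. In particular, current systems based on pattern matching, phonemes, diphones and signal processing output "robotic" speech that lacks humanlike expressiveness. According to one embodiment of the invention, linguistics, "N-ary phones" and artificial intelligence rules based in large part on the work of Arthur Lessac are applied to improve the tonal energy, musicality, natural sound and structural energy of the speech generated by the inventive computer system. Applications of the invention include customer service response systems, telephone answering systems, information retrieval, computer reading of text for the visually impaired or for people whose hands are busy, education, and office assistance.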
far above (treated as a single word, i.e. farabove)
grab it
stop up
bad actor
breathe in
that’s enough
this is it
sob sister
keep this
stand back
take time
smooth surface
stack pack
can’t be
hill country/ask not why
understand patience
stab me
help me
good news
that seems good
red zone
did that
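In step 242, the randomized sounds voiced according to Lessac theory, each represented as sound identification information together with its corresponding quantitative prosody and other parameters, are modified by the output of the prosody record generated in step 222. The modified record is then subjected to optional prosody depth modulation in step 244.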
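The data flow of steps 242 and 244 can be sketched as follows. This is a minimal illustration under loud assumptions: the `Prosody` fields, the multiplicative form of the prosody record, the jitter range, and the reading of "prosody depth modulation" as scaling how far the modified prosody departs from the unmodified one are all hypothetical choices made for the example, not details given in the text.

```python
import random
from dataclasses import dataclass

@dataclass
class Prosody:
    pitch_hz: float
    duration_ms: float
    volume: float

def randomize(p, rng, spread=0.03):
    """Jitter each parameter by up to +/- spread (as a fraction of its value)."""
    def jitter(value):
        return value * (1.0 + rng.uniform(-spread, spread))
    return Prosody(jitter(p.pitch_hz), jitter(p.duration_ms), jitter(p.volume))

def apply_record(p, change):
    """Modify a sound's prosody by a record of multiplicative factors (cf. step 242)."""
    return Prosody(p.pitch_hz * change.pitch_hz,
                   p.duration_ms * change.duration_ms,
                   p.volume * change.volume)

def depth_modulate(original, modified, depth):
    """Assumed meaning of depth modulation (cf. step 244).

    depth=0 returns the original prosody, depth=1 the modified prosody,
    and depth>1 exaggerates the modification.
    """
    def blend(a, b):
        return a + depth * (b - a)
    return Prosody(blend(original.pitch_hz, modified.pitch_hz),
                   blend(original.duration_ms, modified.duration_ms),
                   blend(original.volume, modified.volume))

rng = random.Random(0)  # seeded so the sketch is repeatable
base = Prosody(pitch_hz=120.0, duration_ms=80.0, volume=0.8)
varied = randomize(base, rng)
changed = apply_record(varied, Prosody(1.1, 1.2, 1.0))
final = depth_modulate(varied, changed, depth=0.5)
print(final)
```

Treating depth as a single scalar keeps the optional step easy to switch off: a depth of 1 leaves the modified record untouched.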
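Although embodiments of the invention have been described together with several alternatives for various parts of the system, it will be obvious to those skilled in the art that various modifications are possible. Such modifications do not depart from the spirit and scope of the invention, which are limited and defined by the claims.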
Claims (26)
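- A method of synthesizing speech using a computing device having a memory, comprising:
(a) receiving text (112) into said memory of said computing device;
(b) applying a set of linguistic parsing rules (26) to parse said text into a plurality of components;
(c) associating pronunciation and meaning information with said components;
(d) applying a set of phrase parsing rules (18) to generate marked-up text;
(e) phonetically parsing said marked-up text (22) using phonetic parsing rules;
(f) storing a plurality of sounds in memory, each of said sounds being associated with said pronunciation information; and
(g) recalling the sounds associated with said text to generate a raw speech signal;
the method further comprising (h) parsing said marked-up text using expressive parsing rules (26), said rules optionally being Lessac parsing rules.
- The method of claim 1, wherein the expressive parsing rules are derived from a database and are based on the Lessac system of voice instruction, the rules optionally including identification of consonant drumbeats that are voiced or silent, the location of tonal energy in a word list, the structural vowel sounds within words, and links.
- The method of claim 1, comprising randomized contextual prosody modification.
- The method of claim 1, comprising prosody modification for any one or more Lessac links selected from the group consisting of direct link, play and link, and prepare and link.
- The method of claim 1, comprising applying artificial intelligence to recognize the meaning of the text, identifying the emotional state of the message to be conveyed, and modifying the prosody of the synthesized speech output according to the identified emotional state.
- The method of claim 1, 2, 3, 4 or 5, comprising:
(h) filtering said raw speech signal using parameters determined with the expressive parsing rules to generate an output speech signal.
- A method of synthesizing speech using a computing device having a memory, comprising:
(a) receiving text (112) consisting of a plurality of words into said memory of said computing device;
(b) extracting a plurality of phonemes (118) from said text;
(c) recalling sound information corresponding to said phonemes from said memory; and
(d) outputting said sound information to generate a speech signal;
the method being characterized by:
(c) associating with each of said phonemes a prosody record based on a database of prosody records corresponding to said words;
(d) applying a set of artificial intelligence rules to determine context information for said text; and
(e) for each of said phonemes:
(i) determining a context-influenced prosody change;
(ii) applying a second set of rules based on Lessac theory to determine a Lessac-theory-based prosody change;
(iii) modifying the prosody record in response to said context-influenced prosody change and said Lessac-theory-based prosody change;
(iv) recalling sound information corresponding to said phoneme from said memory; and
(v) modifying said sound information based on the prosody record as modified in response to said context-influenced prosody change and said Lessac-theory-based prosody change, to generate modified sound information.
- The speech synthesis method of claim 7, wherein the prosody of said speech signal is varied to improve the realism of said speech signal.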
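- The speech synthesis method of claim 7, wherein the prosody of said speech signal is varied randomly or pseudo-randomly to improve the realism of said speech signal.
- The speech synthesis method of claim 7, wherein said sound information is associated with different speakers, and a set of artificial intelligence rules is used to determine the identity of the speaker associated with the sound information to be output.
- The speech synthesis method of claim 7, wherein said modification of the prosody record in response to the context-influenced prosody change is based on the words in said text and their sequence.
- The speech synthesis method of claim 7, 8, 9, 10 or 11, wherein said modification of the prosody record in response to the context-influenced prosody change is based on the emotional context of the words in said text.
- The speech synthesis method of claim 12, wherein the prosody of said speech signal is varied to improve the realism of said speech signal.
- The speech synthesis method of claim 13, wherein the prosody of said speech signal is varied randomly or pseudo-randomly to improve the realism of said speech signal.
- The speech synthesis method of claim 14, wherein said sound information is associated with different speakers, and a set of artificial intelligence rules is used to determine the identity of the speaker associated with the sound information to be output.
- The speech synthesis method of claim 15, wherein said modification of the prosody record in response to the context-influenced prosody change is based on the words in said text and their sequence.
- The speech synthesis method of claim 16, further comprising filtering said speech signal to obtain a filtered modified sound information signal, and outputting said filtered modified sound information signal to generate a speech signal.
- The speech synthesis method of claim 17, wherein said filtering of said modified sound information includes introducing echo.
- The speech synthesis method of claim 18, wherein said filtering of said modified sound information includes passing said modified sound information to an analog or digital resonant circuit that imparts resonance characteristics matched to vowel information.
- The speech synthesis method of claim 17, wherein the filtering of said speech signal includes attenuating said modified sound information.
- The speech synthesis method of claim 16, further comprising filtering said modified sound information by introducing echo, passing said modified sound information to an analog or digital resonant circuit that imparts resonance characteristics matched to vowel information, and attenuating said modified sound information.
- The speech synthesis method of claim 12, further comprising filtering said modified sound information by introducing echo, passing said modified sound information to an analog or digital resonant circuit that imparts resonance characteristics matched to vowel information, and attenuating said modified sound information.
- The speech synthesis method of claim 12, further comprising adding background sound that is logically consistent with the context of said text, in response to artificial intelligence rules applied to said text and/or in response to human input.
- The speech synthesis method of claim 12, further comprising logically adding background sound consistent with the context of said text, in response to artificial intelligence rules applied to said text and/or human input.
- The speech synthesis method of claim 12, further comprising indicating the prosody to be applied using information on the content of the message, its intent, or its intended audience.
- The speech synthesis method of claim 12, further comprising modifying one or more prosody characteristics selected from rhythm, pitch change, rate of pitch change, duration of consonant and vowel sounds, harmonic composition, and resonance.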
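The three filtering operations named in the claims (introducing echo, passing the signal through a resonant circuit matched to vowel information, and attenuation) can be sketched in a few lines of Python. This is an illustration only: the function names, the single-tap echo, the standard two-pole digital resonator swapped in for the "digital resonant circuit", and the 700 Hz center frequency are assumptions made for the example, not parameters stated in the patent.

```python
import math

def add_echo(signal, delay, gain):
    """Single-tap echo: y[n] = x[n] + gain * x[n - delay]."""
    out = list(signal) + [0.0] * delay
    for i, x in enumerate(signal):
        out[i + delay] += gain * x
    return out

def resonate(signal, center_hz, sample_rate, r=0.95):
    """Two-pole resonator: y[n] = x[n] + 2r*cos(w)*y[n-1] - r^2*y[n-2].

    center_hz would be chosen per vowel (for example near a formant);
    r < 1 keeps the filter stable and sets how sharp the resonance is.
    """
    w = 2 * math.pi * center_hz / sample_rate
    a1, a2 = 2 * r * math.cos(w), -r * r
    y1 = y2 = 0.0
    out = []
    for x in signal:
        y = x + a1 * y1 + a2 * y2
        out.append(y)
        y2, y1 = y1, y
    return out

def attenuate(signal, gain):
    """Scale the whole signal by a constant gain (gain < 1 attenuates)."""
    return [gain * x for x in signal]

# Usage: chain the three filters on a short impulse.
impulse = [1.0] + [0.0] * 15
filtered = attenuate(resonate(add_echo(impulse, delay=4, gain=0.5), 700.0, 8000.0), 0.8)
print(len(filtered))
```

Writing each filter as a function from a list of samples to a list of samples lets them be applied in any order or omitted, which matches the way the dependent claims combine them.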
Applications Claiming Priority (3)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US10/061,078 US6847931B2 (en) | 2002-01-29 | 2002-01-29 | Expressive parsing in computerized conversion of text to speech |
US10/334,658 US6865533B2 (en) | 2000-04-21 | 2002-12-31 | Text to speech |
PCT/US2003/002561 WO2003065349A2 (en) | 2002-01-29 | 2003-01-28 | Text to speech |
Publications (2)
Publication Number | Publication Date |
---|---|
JP2005516262A (ja) | 2005-06-02 |
JP4363590B2 (ja) | 2009-11-11 |
Family
ID=27667761
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP2003564856A (JP4363590B2 (ja), Expired - Fee Related) | 2002-01-29 | 2003-01-28 | Speech synthesis |
Country Status (5)
Country | Link |
---|---|
US (1) | US6865533B2 (ja) |
EP (1) | EP1479068A4 (ja) |
JP (1) | JP4363590B2 (ja) |
CA (1) | CA2474483A1 (ja) |
WO (1) | WO2003065349A2 (ja) |
Families Citing this family (212)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US7292980B1 (en) * | 1999-04-30 | 2007-11-06 | Lucent Technologies Inc. | Graphical user interface and method for modifying pronunciations in text-to-speech and speech recognition systems |
US8645137B2 (en) | 2000-03-16 | 2014-02-04 | Apple Inc. | Fast, language-independent method for user authentication by voice |
JP2002282543A (ja) * | 2000-12-28 | 2002-10-02 | Sony Computer Entertainment Inc | オブジェクトの音声処理プログラム、オブジェクトの音声処理プログラムを記録したコンピュータ読み取り可能な記録媒体、プログラム実行装置、及びオブジェクトの音声処理方法 |
US20020133342A1 (en) * | 2001-03-16 | 2002-09-19 | Mckenna Jennifer | Speech to text method and system |
GB0113570D0 (en) * | 2001-06-04 | 2001-07-25 | Hewlett Packard Co | Audio-form presentation of text messages |
US20030093280A1 (en) * | 2001-07-13 | 2003-05-15 | Pierre-Yves Oudeyer | Method and apparatus for synthesising an emotion conveyed on a sound |
US7096183B2 (en) * | 2002-02-27 | 2006-08-22 | Matsushita Electric Industrial Co., Ltd. | Customizing the speaking style of a speech synthesizer based on semantic analysis |
US20030212761A1 (en) * | 2002-05-10 | 2003-11-13 | Microsoft Corporation | Process kernel |
KR100463655B1 (ko) * | 2002-11-15 | 2004-12-29 | 삼성전자주식회사 | 부가 정보 제공 기능이 있는 텍스트/음성 변환장치 및 방법 |
US7424430B2 (en) * | 2003-01-30 | 2008-09-09 | Yamaha Corporation | Tone generator of wave table type with voice synthesis capability |
JP4264030B2 (ja) * | 2003-06-04 | 2009-05-13 | 株式会社ケンウッド | 音声データ選択装置、音声データ選択方法及びプログラム |
US20040254793A1 (en) * | 2003-06-12 | 2004-12-16 | Cormac Herley | System and method for providing an audio challenge to distinguish a human from a computer |
JP2005031259A (ja) * | 2003-07-09 | 2005-02-03 | Canon Inc | 自然言語処理方法 |
US7359085B2 (en) * | 2003-07-14 | 2008-04-15 | Lexmark International, Inc. | Method and apparatus for recording sound information and playing sound information back using an all-in-one printer |
US8886538B2 (en) * | 2003-09-26 | 2014-11-11 | Nuance Communications, Inc. | Systems and methods for text-to-speech synthesis using spoken example |
US8200477B2 (en) * | 2003-10-22 | 2012-06-12 | International Business Machines Corporation | Method and system for extracting opinions from text documents |
US8103505B1 (en) * | 2003-11-19 | 2012-01-24 | Apple Inc. | Method and apparatus for speech synthesis using paralinguistic variation |
US20050125486A1 (en) * | 2003-11-20 | 2005-06-09 | Microsoft Corporation | Decentralized operating system |
JP4585759B2 (ja) * | 2003-12-02 | 2010-11-24 | キヤノン株式会社 | 音声合成装置、音声合成方法、プログラム、及び記録媒体 |
WO2005088606A1 (en) * | 2004-03-05 | 2005-09-22 | Lessac Technologies, Inc. | Prosodic speech text codes and their use in computerized speech systems |
US7570746B2 (en) * | 2004-03-18 | 2009-08-04 | Sony Corporation | Method and apparatus for voice interactive messaging |
US20070203703A1 (en) * | 2004-03-29 | 2007-08-30 | Ai, Inc. | Speech Synthesizing Apparatus |
US7788098B2 (en) | 2004-08-02 | 2010-08-31 | Nokia Corporation | Predicting tone pattern information for textual information used in telecommunication systems |
US7865365B2 (en) * | 2004-08-05 | 2011-01-04 | Nuance Communications, Inc. | Personalized voice playback for screen reader |
JP2006047866A (ja) * | 2004-08-06 | 2006-02-16 | Canon Inc | 電子辞書装置およびその制御方法 |
US7869999B2 (en) * | 2004-08-11 | 2011-01-11 | Nuance Communications, Inc. | Systems and methods for selecting from multiple phonectic transcriptions for text-to-speech synthesis |
JP4456537B2 (ja) * | 2004-09-14 | 2010-04-28 | 本田技研工業株式会社 | 情報伝達装置 |
US7675641B2 (en) * | 2004-10-28 | 2010-03-09 | Lexmark International, Inc. | Method and device for converting scanned text to audio data via connection lines and lookup tables |
JP4802489B2 (ja) * | 2004-12-07 | 2011-10-26 | 日本電気株式会社 | 音データ提供システムおよびその方法 |
TWI281145B (en) * | 2004-12-10 | 2007-05-11 | Delta Electronics Inc | System and method for transforming text to speech |
US7707131B2 (en) * | 2005-03-08 | 2010-04-27 | Microsoft Corporation | Thompson strategy based online reinforcement learning system for action selection |
US7734471B2 (en) | 2005-03-08 | 2010-06-08 | Microsoft Corporation | Online learning for dialog systems |
US7885817B2 (en) * | 2005-03-08 | 2011-02-08 | Microsoft Corporation | Easy generation and automatic training of spoken dialog systems using text-to-speech |
JP2008545995A (ja) * | 2005-03-28 | 2008-12-18 | レサック テクノロジーズ、インコーポレーテッド | ハイブリッド音声合成装置、方法および用途 |
US7415413B2 (en) * | 2005-03-29 | 2008-08-19 | International Business Machines Corporation | Methods for conveying synthetic speech style from a text-to-speech system |
US20090202226A1 (en) * | 2005-06-06 | 2009-08-13 | Texthelp Systems, Ltd. | System and method for converting electronic text to a digital multimedia electronic book |
KR100724868B1 (ko) * | 2005-09-07 | 2007-06-04 | 삼성전자주식회사 | 다수의 합성기를 제어하여 다양한 음성 합성 기능을제공하는 음성 합성 방법 및 그 시스템 |
US8677377B2 (en) | 2005-09-08 | 2014-03-18 | Apple Inc. | Method and apparatus for building an intelligent automated assistant |
US20070078655A1 (en) * | 2005-09-30 | 2007-04-05 | Rockwell Automation Technologies, Inc. | Report generation system with speech output |
US8224647B2 (en) | 2005-10-03 | 2012-07-17 | Nuance Communications, Inc. | Text-to-speech user's voice cooperative server for instant messaging clients |
US20070124142A1 (en) * | 2005-11-25 | 2007-05-31 | Mukherjee Santosh K | Voice enabled knowledge system |
US20070288898A1 (en) * | 2006-06-09 | 2007-12-13 | Sony Ericsson Mobile Communications Ab | Methods, electronic devices, and computer program products for setting a feature of an electronic device based on at least one user characteristic |
US8036902B1 (en) * | 2006-06-21 | 2011-10-11 | Tellme Networks, Inc. | Audio human verification |
US9318108B2 (en) | 2010-01-18 | 2016-04-19 | Apple Inc. | Intelligent automated assistant |
CN101606190B (zh) * | 2007-02-19 | 2012-01-18 | 松下电器产业株式会社 | 用力声音转换装置、声音转换装置、声音合成装置、声音转换方法、声音合成方法 |
US8977255B2 (en) | 2007-04-03 | 2015-03-10 | Apple Inc. | Method and system for operating a multi-function portable electronic device using voice-activation |
CN101295504B (zh) * | 2007-04-28 | 2013-03-27 | 诺基亚公司 | 用于仅文本的应用的娱乐音频 |
CN103200309A (zh) * | 2007-04-28 | 2013-07-10 | 诺基亚公司 | 用于仅文本的应用的娱乐音频 |
EP2188729A1 (en) * | 2007-08-08 | 2010-05-26 | Lessac Technologies, Inc. | System-effected text annotation for expressive prosody in speech synthesis and recognition |
JP4327241B2 (ja) * | 2007-10-01 | 2009-09-09 | パナソニック株式会社 | 音声強調装置および音声強調方法 |
SG152092A1 (en) * | 2007-10-26 | 2009-05-29 | Creative Tech Ltd | Wireless handheld device able to accept text input and methods for inputting text on a wireless handheld device |
JP5098613B2 (ja) * | 2007-12-10 | 2012-12-12 | 富士通株式会社 | 音声認識装置及びコンピュータプログラム |
US9330720B2 (en) * | 2008-01-03 | 2016-05-03 | Apple Inc. | Methods and apparatus for altering audio output signals |
US8996376B2 (en) | 2008-04-05 | 2015-03-31 | Apple Inc. | Intelligent text-to-speech conversion |
US10496753B2 (en) | 2010-01-18 | 2019-12-03 | Apple Inc. | Automatically adapting user interfaces for hands-free interaction |
US8489399B2 (en) | 2008-06-23 | 2013-07-16 | John Nicholas and Kristin Gross Trust | System and method for verifying origin of input through spoken language analysis |
US20090326948A1 (en) * | 2008-06-26 | 2009-12-31 | Piyush Agarwal | Automated Generation of Audiobook with Multiple Voices and Sounds from Text |
US9186579B2 (en) | 2008-06-27 | 2015-11-17 | John Nicholas and Kristin Gross Trust | Internet based pictorial game system and method |
US20100030549A1 (en) | 2008-07-31 | 2010-02-04 | Lee Michael M | Mobile device having human language translation capability with positional feedback |
US8352268B2 (en) * | 2008-09-29 | 2013-01-08 | Apple Inc. | Systems and methods for selective rate of speech and speech preferences for text to speech synthesis |
US8712776B2 (en) | 2008-09-29 | 2014-04-29 | Apple Inc. | Systems and methods for selective text to speech synthesis |
US8571849B2 (en) * | 2008-09-30 | 2013-10-29 | At&T Intellectual Property I, L.P. | System and method for enriching spoken language translation with prosodic information |
WO2010067118A1 (en) | 2008-12-11 | 2010-06-17 | Novauris Technologies Limited | Speech recognition involving a mobile device |
US8401849B2 (en) * | 2008-12-18 | 2013-03-19 | Lessac Technologies, Inc. | Methods employing phase state analysis for use in speech synthesis and recognition |
US8380507B2 (en) * | 2009-03-09 | 2013-02-19 | Apple Inc. | Systems and methods for determining the language to use for speech generated by a text to speech engine |
US8332225B2 (en) * | 2009-06-04 | 2012-12-11 | Microsoft Corporation | Techniques to create a custom voice font |
US10241644B2 (en) | 2011-06-03 | 2019-03-26 | Apple Inc. | Actionable reminder entries |
US10255566B2 (en) | 2011-06-03 | 2019-04-09 | Apple Inc. | Generating and processing task items that represent tasks to perform |
US9858925B2 (en) | 2009-06-05 | 2018-01-02 | Apple Inc. | Using context information to facilitate processing of commands in a virtual assistant |
US10241752B2 (en) | 2011-09-30 | 2019-03-26 | Apple Inc. | Interface for a virtual digital assistant |
US9431006B2 (en) | 2009-07-02 | 2016-08-30 | Apple Inc. | Methods and apparatuses for automatic speech recognition |
US20110029325A1 (en) * | 2009-07-28 | 2011-02-03 | General Electric Company, A New York Corporation | Methods and apparatus to enhance healthcare information analyses |
US20110029326A1 (en) * | 2009-07-28 | 2011-02-03 | General Electric Company, A New York Corporation | Interactive healthcare media devices and systems |
US10553209B2 (en) | 2010-01-18 | 2020-02-04 | Apple Inc. | Systems and methods for hands-free notification summaries |
US10705794B2 (en) | 2010-01-18 | 2020-07-07 | Apple Inc. | Automatically adapting user interfaces for hands-free interaction |
US10679605B2 (en) | 2010-01-18 | 2020-06-09 | Apple Inc. | Hands-free list-reading by intelligent automated assistant |
US10276170B2 (en) | 2010-01-18 | 2019-04-30 | Apple Inc. | Intelligent automated assistant |
WO2011089450A2 (en) | 2010-01-25 | 2011-07-28 | Andrew Peter Nelson Jerram | Apparatuses, methods and systems for a digital conversation management platform |
US8949128B2 (en) | 2010-02-12 | 2015-02-03 | Nuance Communications, Inc. | Method and apparatus for providing speech output for speech-enabled applications |
US8571870B2 (en) | 2010-02-12 | 2013-10-29 | Nuance Communications, Inc. | Method and apparatus for generating synthetic speech with contrastive stress |
US8447610B2 (en) | 2010-02-12 | 2013-05-21 | Nuance Communications, Inc. | Method and apparatus for generating synthetic speech with contrastive stress |
US8682667B2 (en) | 2010-02-25 | 2014-03-25 | Apple Inc. | User profiling for selecting user specific voice input processing information |
US9564120B2 (en) * | 2010-05-14 | 2017-02-07 | General Motors Llc | Speech adaptation in speech synthesis |
US8423365B2 (en) | 2010-05-28 | 2013-04-16 | Daniel Ben-Ezri | Contextual conversion platform |
US8965768B2 (en) | 2010-08-06 | 2015-02-24 | At&T Intellectual Property I, L.P. | System and method for automatic detection of abnormal stress patterns in unit selection synthesis |
US10762293B2 (en) | 2010-12-22 | 2020-09-01 | Apple Inc. | Using parts-of-speech tagging and named entity recognition for spelling correction |
US10019995B1 (en) | 2011-03-01 | 2018-07-10 | Alice J. Stiebel | Methods and systems for language learning based on a series of pitch patterns |
US11062615B1 (en) | 2011-03-01 | 2021-07-13 | Intelligibility Training LLC | Methods and systems for remote language learning in a pandemic-aware world |
JP2012198277A (ja) * | 2011-03-18 | 2012-10-18 | Toshiba Corp | 文書読み上げ支援装置、文書読み上げ支援方法および文書読み上げ支援プログラム |
US9262612B2 (en) | 2011-03-21 | 2016-02-16 | Apple Inc. | Device access using voice authentication |
US10672399B2 (en) | 2011-06-03 | 2020-06-02 | Apple Inc. | Switching between text data and audio data based on a mapping |
US10057736B2 (en) | 2011-06-03 | 2018-08-21 | Apple Inc. | Active transport based notifications |
US8994660B2 (en) | 2011-08-29 | 2015-03-31 | Apple Inc. | Text correction processing |
US9111457B2 (en) * | 2011-09-20 | 2015-08-18 | International Business Machines Corporation | Voice pronunciation for text communication |
US10453479B2 (en) | 2011-09-23 | 2019-10-22 | Lessac Technologies, Inc. | Methods for aligning expressive speech utterances with text and systems therefor |
US10134385B2 (en) | 2012-03-02 | 2018-11-20 | Apple Inc. | Systems and methods for name pronunciation |
US9483461B2 (en) | 2012-03-06 | 2016-11-01 | Apple Inc. | Handling speech synthesis of content for multiple languages |
US9280610B2 (en) | 2012-05-14 | 2016-03-08 | Apple Inc. | Crowd sourcing information to fulfill user requests |
US9721563B2 (en) | 2012-06-08 | 2017-08-01 | Apple Inc. | Name recognition system |
US8972265B1 (en) * | 2012-06-18 | 2015-03-03 | Audible, Inc. | Multiple voices in audio content |
US9495129B2 (en) | 2012-06-29 | 2016-11-15 | Apple Inc. | Device, method, and user interface for voice-activated navigation and browsing of a document |
US9576574B2 (en) | 2012-09-10 | 2017-02-21 | Apple Inc. | Context-sensitive handling of interruptions by intelligent digital assistant |
US9547647B2 (en) | 2012-09-19 | 2017-01-17 | Apple Inc. | Voice-based media searching |
US9472113B1 (en) | 2013-02-05 | 2016-10-18 | Audible, Inc. | Synchronizing playback of digital content with physical content |
DE112014000709B4 (de) | 2013-02-07 | 2021-12-30 | Apple Inc. | Verfahren und vorrichtung zum betrieb eines sprachtriggers für einen digitalen assistenten |
US9368114B2 (en) | 2013-03-14 | 2016-06-14 | Apple Inc. | Context-sensitive handling of interruptions |
US10652394B2 (en) | 2013-03-14 | 2020-05-12 | Apple Inc. | System and method for processing voicemail |
AU2014233517B2 (en) | 2013-03-15 | 2017-05-25 | Apple Inc. | Training an at least partial voice command system |
WO2014144579A1 (en) | 2013-03-15 | 2014-09-18 | Apple Inc. | System and method for updating an adaptive speech recognition model |
US9928754B2 (en) * | 2013-03-18 | 2018-03-27 | Educational Testing Service | Systems and methods for generating recitation items |
US9317486B1 (en) | 2013-06-07 | 2016-04-19 | Audible, Inc. | Synchronizing playback of digital content with captured physical content |
WO2014197334A2 (en) | 2013-06-07 | 2014-12-11 | Apple Inc. | System and method for user-specified pronunciation of words for speech synthesis and recognition |
US9582608B2 (en) | 2013-06-07 | 2017-02-28 | Apple Inc. | Unified ranking with entropy-weighted information for phrase-based semantic auto-completion |
WO2014197336A1 (en) | 2013-06-07 | 2014-12-11 | Apple Inc. | System and method for detecting errors in interactions with a voice-based digital assistant |
WO2014197335A1 (en) | 2013-06-08 | 2014-12-11 | Apple Inc. | Interpreting and acting upon commands that involve sharing information with remote devices |
EP3937002A1 (en) | 2013-06-09 | 2022-01-12 | Apple Inc. | Device, method, and graphical user interface for enabling conversation persistence across two or more instances of a digital assistant |
US10176167B2 (en) | 2013-06-09 | 2019-01-08 | Apple Inc. | System and method for inferring user intent from speech inputs |
AU2014278595B2 (en) | 2013-06-13 | 2017-04-06 | Apple Inc. | System and method for emergency calls initiated by voice command |
DE112014003653B4 (de) | 2013-08-06 | 2024-04-18 | Apple Inc. | Automatisch aktivierende intelligente Antworten auf der Grundlage von Aktivitäten von entfernt angeordneten Vorrichtungen |
EP3061086B1 (en) * | 2013-10-24 | 2019-10-23 | Bayerische Motoren Werke Aktiengesellschaft | Text-to-speech performance evaluation |
US9620105B2 (en) | 2014-05-15 | 2017-04-11 | Apple Inc. | Analyzing audio input for efficient speech and music recognition |
US10592095B2 (en) | 2014-05-23 | 2020-03-17 | Apple Inc. | Instantaneous speaking of content on touch devices |
US9502031B2 (en) | 2014-05-27 | 2016-11-22 | Apple Inc. | Method for supporting dynamic grammars in WFST-based ASR |
US9715875B2 (en) | 2014-05-30 | 2017-07-25 | Apple Inc. | Reducing the need for manual start/end-pointing and trigger phrases |
US9842101B2 (en) | 2014-05-30 | 2017-12-12 | Apple Inc. | Predictive conversion of language input |
US9760559B2 (en) | 2014-05-30 | 2017-09-12 | Apple Inc. | Predictive text input |
US10170123B2 (en) | 2014-05-30 | 2019-01-01 | Apple Inc. | Intelligent assistant for home automation |
TWI566107B (zh) | 2014-05-30 | 2017-01-11 | 蘋果公司 | Method, non-transitory computer-readable storage medium, and electronic device for processing multi-part voice commands |
US10078631B2 (en) | 2014-05-30 | 2018-09-18 | Apple Inc. | Entropy-guided text prediction using combined word and character n-gram language models |
US9633004B2 (en) | 2014-05-30 | 2017-04-25 | Apple Inc. | Better resolution when referencing to concepts |
US9785630B2 (en) | 2014-05-30 | 2017-10-10 | Apple Inc. | Text prediction using combined word N-gram and unigram language models |
US9734193B2 (en) | 2014-05-30 | 2017-08-15 | Apple Inc. | Determining domain salience ranking from ambiguous words in natural speech |
US10289433B2 (en) | 2014-05-30 | 2019-05-14 | Apple Inc. | Domain specific language for encoding assistant dialog |
US9430463B2 (en) | 2014-05-30 | 2016-08-30 | Apple Inc. | Exemplar-based natural language processing |
US10659851B2 (en) | 2014-06-30 | 2020-05-19 | Apple Inc. | Real-time digital assistant knowledge updates |
US9338493B2 (en) | 2014-06-30 | 2016-05-10 | Apple Inc. | Intelligent automated assistant for TV user interactions |
US10446141B2 (en) | 2014-08-28 | 2019-10-15 | Apple Inc. | Automatic speech recognition based on user feedback |
US9818400B2 (en) | 2014-09-11 | 2017-11-14 | Apple Inc. | Method and apparatus for discovering trending terms in speech requests |
US10789041B2 (en) | 2014-09-12 | 2020-09-29 | Apple Inc. | Dynamic thresholds for always listening speech trigger |
US9606986B2 (en) | 2014-09-29 | 2017-03-28 | Apple Inc. | Integrated word N-gram and class M-gram language models |
US9646609B2 (en) | 2014-09-30 | 2017-05-09 | Apple Inc. | Caching apparatus for serving phonetic pronunciations |
US10127911B2 (en) | 2014-09-30 | 2018-11-13 | Apple Inc. | Speaker identification and unsupervised speaker adaptation techniques |
US9886432B2 (en) | 2014-09-30 | 2018-02-06 | Apple Inc. | Parsimonious handling of word inflection via categorical stem + suffix N-gram language models |
US10074360B2 (en) | 2014-09-30 | 2018-09-11 | Apple Inc. | Providing an indication of the suitability of speech recognition |
US9668121B2 (en) | 2014-09-30 | 2017-05-30 | Apple Inc. | Social reminders |
JP6507579B2 (ja) * | 2014-11-10 | 2019-05-08 | ヤマハ株式会社 | Speech synthesis method |
US10552013B2 (en) | 2014-12-02 | 2020-02-04 | Apple Inc. | Data detection |
US9711141B2 (en) | 2014-12-09 | 2017-07-18 | Apple Inc. | Disambiguating heteronyms in speech synthesis |
US9865280B2 (en) | 2015-03-06 | 2018-01-09 | Apple Inc. | Structured dictation using intelligent automated assistants |
US9886953B2 (en) | 2015-03-08 | 2018-02-06 | Apple Inc. | Virtual assistant activation |
US10567477B2 (en) | 2015-03-08 | 2020-02-18 | Apple Inc. | Virtual assistant continuity |
US9721566B2 (en) | 2015-03-08 | 2017-08-01 | Apple Inc. | Competing devices responding to voice triggers |
US9899019B2 (en) | 2015-03-18 | 2018-02-20 | Apple Inc. | Systems and methods for structured stem and suffix language models |
US9842105B2 (en) | 2015-04-16 | 2017-12-12 | Apple Inc. | Parsimonious continuous-space phrase representations for natural language processing |
US10083688B2 (en) | 2015-05-27 | 2018-09-25 | Apple Inc. | Device voice control for selecting a displayed affordance |
US10127220B2 (en) | 2015-06-04 | 2018-11-13 | Apple Inc. | Language identification from short strings |
US10101822B2 (en) | 2015-06-05 | 2018-10-16 | Apple Inc. | Language input correction |
US9578173B2 (en) | 2015-06-05 | 2017-02-21 | Apple Inc. | Virtual assistant aided communication with 3rd party service in a communication session |
US11025565B2 (en) | 2015-06-07 | 2021-06-01 | Apple Inc. | Personalized prediction of responses for instant messaging |
US10255907B2 (en) | 2015-06-07 | 2019-04-09 | Apple Inc. | Automatic accent detection using acoustic models |
US10186254B2 (en) | 2015-06-07 | 2019-01-22 | Apple Inc. | Context-based endpoint detection |
US10747498B2 (en) | 2015-09-08 | 2020-08-18 | Apple Inc. | Zero latency digital assistant |
US10671428B2 (en) | 2015-09-08 | 2020-06-02 | Apple Inc. | Distributed personal assistant |
US9697820B2 (en) | 2015-09-24 | 2017-07-04 | Apple Inc. | Unit-selection text-to-speech synthesis using concatenation-sensitive neural networks |
US11010550B2 (en) | 2015-09-29 | 2021-05-18 | Apple Inc. | Unified language modeling framework for word prediction, auto-completion and auto-correction |
US10366158B2 (en) | 2015-09-29 | 2019-07-30 | Apple Inc. | Efficient word encoding for recurrent neural network language models |
US9721551B2 (en) | 2015-09-29 | 2017-08-01 | Amper Music, Inc. | Machines, systems, processes for automated music composition and generation employing linguistic and/or graphical icon based musical experience descriptions |
US10854180B2 (en) | 2015-09-29 | 2020-12-01 | Amper Music, Inc. | Method of and system for controlling the qualities of musical energy embodied in and expressed by digital music to be automatically composed and generated by an automated music composition and generation engine |
RU2632424C2 (ru) | 2015-09-29 | 2017-10-04 | Общество С Ограниченной Ответственностью "Яндекс" | Method and server for text-to-speech synthesis |
US11587559B2 (en) | 2015-09-30 | 2023-02-21 | Apple Inc. | Intelligent device identification |
US10691473B2 (en) | 2015-11-06 | 2020-06-23 | Apple Inc. | Intelligent automated assistant in a messaging environment |
US10049668B2 (en) | 2015-12-02 | 2018-08-14 | Apple Inc. | Applying neural network language models to weighted finite state transducers for automatic speech recognition |
US10223066B2 (en) | 2015-12-23 | 2019-03-05 | Apple Inc. | Proactive assistance based on dialog communication between devices |
JP6523998B2 (ja) * | 2016-03-14 | 2019-06-05 | 株式会社東芝 | Read-aloud information editing device, read-aloud information editing method, and program |
US10446143B2 (en) | 2016-03-14 | 2019-10-15 | Apple Inc. | Identification of voice inputs providing credentials |
US9934775B2 (en) | 2016-05-26 | 2018-04-03 | Apple Inc. | Unit-selection text-to-speech synthesis based on predicted concatenation parameters |
US9972304B2 (en) | 2016-06-03 | 2018-05-15 | Apple Inc. | Privacy preserving distributed evaluation framework for embedded personalized systems |
US10249300B2 (en) | 2016-06-06 | 2019-04-02 | Apple Inc. | Intelligent list reading |
US10049663B2 (en) | 2016-06-08 | 2018-08-14 | Apple, Inc. | Intelligent automated assistant for media exploration |
DK179588B1 (en) | 2016-06-09 | 2019-02-22 | Apple Inc. | INTELLIGENT AUTOMATED ASSISTANT IN A HOME ENVIRONMENT |
US10192552B2 (en) | 2016-06-10 | 2019-01-29 | Apple Inc. | Digital assistant providing whispered speech |
US10586535B2 (en) | 2016-06-10 | 2020-03-10 | Apple Inc. | Intelligent digital assistant in a multi-tasking environment |
US10509862B2 (en) | 2016-06-10 | 2019-12-17 | Apple Inc. | Dynamic phrase expansion of language input |
US10067938B2 (en) | 2016-06-10 | 2018-09-04 | Apple Inc. | Multilingual word prediction |
US10490187B2 (en) | 2016-06-10 | 2019-11-26 | Apple Inc. | Digital assistant providing automated status report |
DK179343B1 (en) | 2016-06-11 | 2018-05-14 | Apple Inc | Intelligent task discovery |
DK179415B1 (en) | 2016-06-11 | 2018-06-14 | Apple Inc | Intelligent device arbitration and control |
DK179049B1 (en) | 2016-06-11 | 2017-09-18 | Apple Inc | Data driven natural language event detection and classification |
DK201670540A1 (en) | 2016-06-11 | 2018-01-08 | Apple Inc | Application integration with a digital assistant |
US10043516B2 (en) | 2016-09-23 | 2018-08-07 | Apple Inc. | Intelligent automated assistant |
US10593346B2 (en) | 2016-12-22 | 2020-03-17 | Apple Inc. | Rank-reduced token representation for automatic speech recognition |
DK201770439A1 (en) | 2017-05-11 | 2018-12-13 | Apple Inc. | Offline personal assistant |
DK179496B1 (en) | 2017-05-12 | 2019-01-15 | Apple Inc. | USER-SPECIFIC Acoustic Models |
DK179745B1 (en) | 2017-05-12 | 2019-05-01 | Apple Inc. | SYNCHRONIZATION AND TASK DELEGATION OF A DIGITAL ASSISTANT |
DK201770432A1 (en) | 2017-05-15 | 2018-12-21 | Apple Inc. | Hierarchical belief states for digital assistants |
DK201770431A1 (en) | 2017-05-15 | 2018-12-20 | Apple Inc. | Optimizing dialogue policy decisions for digital assistants using implicit feedback |
DK179560B1 (en) | 2017-05-16 | 2019-02-18 | Apple Inc. | FAR-FIELD EXTENSION FOR DIGITAL ASSISTANT SERVICES |
US10225621B1 (en) | 2017-12-20 | 2019-03-05 | Dish Network L.L.C. | Eyes free entertainment |
CN108877765A (zh) * | 2018-05-31 | 2018-11-23 | 百度在线网络技术(北京)有限公司 | Processing method and apparatus for concatenative speech synthesis, computer device, and readable medium |
WO2019245916A1 (en) * | 2018-06-19 | 2019-12-26 | Georgetown University | Method and system for parametric speech synthesis |
EP3921770A4 (en) * | 2019-02-05 | 2022-11-09 | Igentify Ltd. | SYSTEM AND METHOD FOR MODULATION OF DYNAMIC GAPS IN SPEECH |
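CN110047474A (zh) * | 2019-05-06 | 2019-07-23 | 齐鲁工业大学 | Intelligent training system and training method for English phonetic symbol pronunciation |
KR20210155401A (ko) | 2019-05-15 | 2021-12-23 | 엘지전자 주식회사 | Speech synthesis device that evaluates the quality of synthesized speech using artificial intelligence, and operating method thereof |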
US11024275B2 (en) | 2019-10-15 | 2021-06-01 | Shutterstock, Inc. | Method of digitally performing a music composition using virtual musical instruments having performance logic executing within a virtual musical instrument (VMI) library management system |
US10964299B1 (en) | 2019-10-15 | 2021-03-30 | Shutterstock, Inc. | Method of and system for automatically generating digital performances of music compositions using notes selected from virtual musical instruments based on the music-theoretic states of the music compositions |
US11037538B2 (en) | 2019-10-15 | 2021-06-15 | Shutterstock, Inc. | Method of and system for automated musical arrangement and musical instrument performance style transformation supported within an automated music performance system |
US11302300B2 (en) * | 2019-11-19 | 2022-04-12 | Applications Technology (Apptek), Llc | Method and apparatus for forced duration in neural speech synthesis |
CN110933330A (zh) * | 2019-12-09 | 2020-03-27 | 广州酷狗计算机科技有限公司 | Video dubbing method and apparatus, computer device, and computer-readable storage medium |
TWI759003B (zh) * | 2020-12-10 | 2022-03-21 | 國立成功大學 | Training method for a speech recognition model |
WO2022144851A1 (en) * | 2021-01-01 | 2022-07-07 | Jio Platforms Limited | System and method of automated audio output |
CN112818118B (zh) * | 2021-01-22 | 2024-05-21 | 大连民族大学 | Method for constructing a Chinese humor classification model based on back-translation |
WO2024079605A1 (en) | 2022-10-10 | 2024-04-18 | Talk Sàrl | Assisting a speaker during training or actual performance of a speech |
Family Cites Families (34)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US4783803A (en) | 1985-11-12 | 1988-11-08 | Dragon Systems, Inc. | Speech recognition apparatus and method |
US4903305A (en) | 1986-05-12 | 1990-02-20 | Dragon Systems, Inc. | Method for representing word models for use in speech recognition |
US4866778A (en) | 1986-08-11 | 1989-09-12 | Dragon Systems, Inc. | Interactive speech recognition apparatus |
US5231670A (en) | 1987-06-01 | 1993-07-27 | Kurzweil Applied Intelligence, Inc. | Voice controlled system and method for generating text from a voice controlled input |
US5027406A (en) | 1988-12-06 | 1991-06-25 | Dragon Systems, Inc. | Method for interactive speech recognition and training |
US5010495A (en) | 1989-02-02 | 1991-04-23 | American Language Academy | Interactive language learning system |
US5745873A (en) | 1992-05-01 | 1998-04-28 | Massachusetts Institute Of Technology | Speech recognition using final decision based on tentative decisions |
US5393236A (en) | 1992-09-25 | 1995-02-28 | Northeastern University | Interactive speech pronunciation apparatus and method |
GB9223066D0 (en) | 1992-11-04 | 1992-12-16 | Secr Defence | Children's speech training aid |
US5850627A (en) | 1992-11-13 | 1998-12-15 | Dragon Systems, Inc. | Apparatuses and methods for training and operating speech recognition systems |
US5636325A (en) | 1992-11-13 | 1997-06-03 | International Business Machines Corporation | Speech synthesis and analysis of dialects |
US5796916A (en) | 1993-01-21 | 1998-08-18 | Apple Computer, Inc. | Method and apparatus for prosody for synthetic speech prosody determination |
US5487671A (en) | 1993-01-21 | 1996-01-30 | Dsp Solutions (International) | Computerized system for teaching speech |
JPH10511472A (ja) | 1994-12-08 | 1998-11-04 | ザ リージェンツ オブ ザ ユニバーシティ オブ カリフォルニア | Method and apparatus for improving speech-sound recognition among language-impaired individuals |
US5787231A (en) | 1995-02-02 | 1998-07-28 | International Business Machines Corporation | Method and system for improving pronunciation in a voice control system |
US5717828A (en) | 1995-03-15 | 1998-02-10 | Syracuse Language Systems | Speech recognition apparatus and method for learning |
US5890123A (en) * | 1995-06-05 | 1999-03-30 | Lucent Technologies, Inc. | System and method for voice controlled video screen display |
US5903864A (en) | 1995-08-30 | 1999-05-11 | Dragon Systems | Speech recognition |
US5799279A (en) | 1995-11-13 | 1998-08-25 | Dragon Systems, Inc. | Continuous speech recognition of text and commands |
JP2942190B2 (ja) | 1996-05-10 | 1999-08-30 | 本田技研工業株式会社 | Body frame structure of a buggy vehicle and method of manufacturing the same |
US5728960A (en) | 1996-07-10 | 1998-03-17 | Sitrick; David H. | Multi-dimensional transformation systems and display communication architecture for musical compositions |
US5766015A (en) | 1996-07-11 | 1998-06-16 | Digispeech (Israel) Ltd. | Apparatus for interactive language training |
WO1998014934A1 (en) | 1996-10-02 | 1998-04-09 | Sri International | Method and system for automatic text-independent grading of pronunciation for language instruction |
US5864805A (en) | 1996-12-20 | 1999-01-26 | International Business Machines Corporation | Method and apparatus for error correction in a continuous dictation system |
US5946654A (en) | 1997-02-21 | 1999-08-31 | Dragon Systems, Inc. | Speaker identification using unsupervised speech models |
GB2323693B (en) | 1997-03-27 | 2001-09-26 | Forum Technology Ltd | Speech to text conversion |
JP4267101B2 (ja) | 1997-11-17 | 2009-05-27 | インターナショナル・ビジネス・マシーンズ・コーポレーション | Speech identification device, pronunciation correction device, and methods therefor |
US6081780A (en) | 1998-04-28 | 2000-06-27 | International Business Machines Corporation | TTS and prosody based authoring system |
US6266637B1 (en) | 1998-09-11 | 2001-07-24 | International Business Machines Corporation | Phrase splicing and variable substitution using a trainable speech synthesizer |
US6188984B1 (en) | 1998-11-17 | 2001-02-13 | Fonix Corporation | Method and system for syllable parsing |
US6253182B1 (en) | 1998-11-24 | 2001-06-26 | Microsoft Corporation | Method and apparatus for speech synthesis with efficient spectral smoothing |
US6144939A (en) | 1998-11-25 | 2000-11-07 | Matsushita Electric Industrial Co., Ltd. | Formant-based speech synthesizer employing demi-syllable concatenation with independent cross fade in the filter parameter and source domains |
WO2001082291A1 (en) | 2000-04-21 | 2001-11-01 | Lessac Systems, Inc. | Speech recognition and training methods and systems |
US6505158B1 (en) * | 2000-07-05 | 2003-01-07 | At&T Corp. | Synthesis-based pre-selection of suitable units for concatenative speech |
2002
- 2002-12-31: US application US10/334,658, publication US6865533B2 (Expired - Lifetime)

2003
- 2003-01-28: WO application PCT/US2003/002561, publication WO2003065349A2 (Application Filing)
- 2003-01-28: EP application EP03705954A, publication EP1479068A4 (Withdrawn)
- 2003-01-28: CA application CA002474483A, publication CA2474483A1 (Abandoned)
- 2003-01-28: JP application JP2003564856A, publication JP4363590B2 (Expired - Fee Related)
Also Published As
Publication number | Publication date |
---|---|
WO2003065349A2 (en) | 2003-08-07 |
EP1479068A4 (en) | 2007-05-09 |
WO2003065349A3 (en) | 2004-01-08 |
WO2003065349B1 (en) | 2004-02-26 |
US20030163316A1 (en) | 2003-08-28 |
US6865533B2 (en) | 2005-03-08 |
EP1479068A2 (en) | 2004-11-24 |
CA2474483A1 (en) | 2003-08-07 |
JP2005516262A (ja) | 2005-06-02 |
Similar Documents
Publication | Title |
---|---|
JP4363590B2 (ja) | Speech synthesis |
US6847931B2 (en) | Expressive parsing in computerized conversion of text to speech |
US8219398B2 (en) | Computerized speech synthesizer for synthesizing speech from text |
Dutoit | An introduction to text-to-speech synthesis |
Feld et al. | Vocal anthropology: From the music of language to the language of song |
Halle | From memory to speech and back: Papers on phonetics and phonology 1954-2002 |
JP2007527555A (ja) | Prosodic speech text codes and their use in computerized speech systems |
WO2009021183A1 (en) | System-effected text annotation for expressive prosody in speech synthesis and recognition |
KR20150076128A (ko) | Pronunciation learning support system using three-dimensional multimedia, and pronunciation learning support method of the system |
JPH0335296A (ja) | Text-to-speech synthesis device |
Aaron et al. | Conversational computers |
Meyer et al. | A Flute, Musical Bows and Bamboo Clarinets that “Speak” in the Amazon Rainforest; Speech and Music in the Gavião Language of Rondônia |
Sečujski et al. | Learning prosodic stress from data in neural network based text-to-speech synthesis |
JP2004145015A (ja) | Text-to-speech synthesis system and method |
Trouvain et al. | Speech synthesis: text-to-speech conversion and artificial voices |
Iida | A study on corpus-based speech synthesis with emotion |
Chamorro | An Analysis of Jonathan Harvey’s Speakings for Orchestra and Electronics |
JP2908720B2 (ja) | Synthesis-based conversation training device and method |
Madaminjonov | Formation of a Speech Database in the Karakalpak Language for Speech Synthesis Systems |
IMRAN | ADMAS UNIVERSITY SCHOOL OF POST GRADUATE STUDIES DEPARTMENT OF COMPUTER SCIENCE |
O'Cinneide et al. | A brief introduction to speech synthesis and voice modification |
Handley | Evaluating text-to-speech (TTS) synthesis for use in computer-assisted language learning (CALL) |
Clark | Emphasizing the articulatory and timbral aspects of vocal production in vocal composition |
COHEN et al. | A study of pitch phenomena and applications in electrolarynx speech |
Newell et al. | Place, authenticity time: a framework for synthetic voice acting |
Legal Events
Code | Title | Description |
---|---|---|
A711 | Notification of change in applicant | Free format text: JAPANESE INTERMEDIATE CODE: A711; Effective date: 20050328 |
A621 | Written request for application examination | Free format text: JAPANESE INTERMEDIATE CODE: A621; Effective date: 20060112 |
A131 | Notification of reasons for refusal | Free format text: JAPANESE INTERMEDIATE CODE: A131; Effective date: 20090424 |
A521 | Request for written amendment filed | Free format text: JAPANESE INTERMEDIATE CODE: A523; Effective date: 20090623 |
TRDD | Decision of grant or rejection written | |
A01 | Written decision to grant a patent or to grant a registration (utility model) | Free format text: JAPANESE INTERMEDIATE CODE: A01; Effective date: 20090717 |
A01 | Written decision to grant a patent or to grant a registration (utility model) | Free format text: JAPANESE INTERMEDIATE CODE: A01 |
A61 | First payment of annual fees (during grant procedure) | Free format text: JAPANESE INTERMEDIATE CODE: A61; Effective date: 20090814 |
FPAY | Renewal fee payment (event date is renewal date of database) | Free format text: PAYMENT UNTIL: 20120828; Year of fee payment: 3 |
R150 | Certificate of patent or registration of utility model | Free format text: JAPANESE INTERMEDIATE CODE: R150 |
LAPS | Cancellation because of no payment of annual fees | |