JP2009037214A5 - - Google Patents

Publication number
JP2009037214A5
Authority
JP
Japan
Prior art keywords
voice
guidance
synthesis
recording
playback
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
JP2008134655A
Other languages
Japanese (ja)
Other versions
JP2009037214A (en)
JP5097007B2 (en)
Filing date
Publication date
Application filed
Priority to JP2008134655A (patent JP5097007B2)
Priority claimed from JP2008134655A (patent JP5097007B2)
Priority to US12/170,124 (patent US8027835B2)
Publication of JP2009037214A
Publication of JP2009037214A5
Application granted
Publication of JP5097007B2
Current legal status: Expired - Fee Related
Anticipated expiration

Claims (10)

1. A speech processing apparatus capable of playing back a sentence composed of a plurality of words or phrases using a recording-playback method or a rule-based synthesis method, the apparatus comprising:
specifying means for specifying whether each of the plurality of words or phrases constituting the sentence to be played back is a word or phrase played back by the recording-playback method or a word or phrase played back by the rule-based synthesis method;
selection means for selecting whether to play back the plurality of words or phrases in a first arrangement order or in an arrangement order different from the first arrangement order, based on the number of inversions, that is, the number of times playback switches between the recording-playback method and the rule-based synthesis method when each of the plurality of words or phrases is played back in the first arrangement order using the playback method specified by the specifying means; and
playback means for playing back each of the plurality of words or phrases in the arrangement order selected by the selection means, using the playback method specified by the specifying means.
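As an illustration of the quantity this claim relies on, the following is a minimal Python sketch of counting inversions for one arrangement order. The names `RECORDED`, `SYNTHESIZED`, `Segment`, and `count_inversions`, as well as the English example phrases, are hypothetical and do not come from the patent; the sketch assumes each word or phrase has already been tagged with its playback method.

```python
from typing import List, Tuple

# Hypothetical labels for the two playback methods named in the claim.
RECORDED = "recorded"        # played back from a pre-recorded clip
SYNTHESIZED = "synthesized"  # generated by rule-based synthesis

# A segment is (text, playback method); this layout is an assumption.
Segment = Tuple[str, str]


def count_inversions(segments: List[Segment]) -> int:
    """Count how often the playback method changes between neighbours."""
    methods = [method for _, method in segments]
    # Compare each segment's method with the next one in the order.
    return sum(1 for prev, cur in zip(methods, methods[1:]) if prev != cur)


# Illustrative sentence: recorded -> synthesized -> recorded.
sentence = [
    ("Sending to", RECORDED),
    ("Tanaka", SYNTHESIZED),
    ("now", RECORDED),
]
print(count_inversions(sentence))  # prints 2
```

Both directions of switching are counted, so a synthesized phrase placed between two recorded phrases contributes two inversions.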
2. The speech processing apparatus according to claim 1, wherein the number of inversions corresponds to the sum of the number of times playback switches from the recording-playback method to the rule-based synthesis method and the number of times playback switches from the rule-based synthesis method to the recording-playback method.
3. The speech processing apparatus according to claim 1 or 2, wherein the selection means selects playback in the first arrangement order when the number of inversions is less than a predetermined number, and selects playback in an arrangement order different from the first arrangement order when the number of inversions is equal to or greater than the predetermined number.
4. The speech processing apparatus according to claim 1 or 2, wherein the selection means selects playback in the first arrangement order when the number of inversions is less than a predetermined number, and, when the number of inversions is equal to or greater than the predetermined number, selects, based on a predetermined criterion, playback in any one of a plurality of arrangement orders different from the first arrangement order.
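The threshold rule described in these claims can be sketched as follows. This is only an illustration under stated assumptions: the function names are hypothetical, the "predetermined criterion" is assumed here to be "fewest inversions", and falling back to the first order when no alternative exists is also an assumption rather than something the claims state.

```python
from typing import List, Sequence, Tuple

Segment = Tuple[str, str]  # (text, playback method); hypothetical layout


def count_inversions(order: Sequence[Segment]) -> int:
    """Number of recorded/synthesized switches in one arrangement order."""
    methods = [method for _, method in order]
    return sum(1 for prev, cur in zip(methods, methods[1:]) if prev != cur)


def select_order(
    first_order: List[Segment],
    alternatives: List[List[Segment]],
    threshold: int,
) -> List[Segment]:
    """Keep the first order below the threshold, otherwise pick another."""
    # Below the predetermined number: play the first arrangement order.
    # Assumption: with no alternatives available, also keep the first order.
    if count_inversions(first_order) < threshold or not alternatives:
        return first_order
    # Assumed criterion: the alternative with the fewest inversions.
    return min(alternatives, key=count_inversions)


first = [("Sending to", "recorded"), ("Tanaka", "synthesized"), ("now", "recorded")]
reordered = [("Now sending to", "recorded"), ("Tanaka", "synthesized")]
chosen = select_order(first, [reordered], threshold=2)
print(chosen is reordered)  # prints True
```

With a threshold of 2, the first order (two inversions) is rejected and the reordered sentence (one inversion) is chosen; with a threshold of 3 the first order would be kept.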
5. The speech processing apparatus according to claim 4, wherein, when the number of inversions is equal to or greater than the predetermined number, the selection means selects, from among the plurality of arrangement orders different from the first arrangement order, playback in an arrangement order for which the number of times playback switches between the recording-playback method and the rule-based synthesis method is less than the predetermined number.
6. A speech processing apparatus that generates guidance speech in response to a user operation using speech synthesis means capable of performing speech synthesis while selectively switching between a recording-playback method and a rule-based synthesis method, the apparatus comprising:
guidance holding means for holding a first guidance, which consists of a fixed part representing a fixed message and a variable part located in the middle of the fixed part and indicating where a message corresponding to the user operation is inserted, and a second guidance, which has the same meaning as the first guidance but places the variable part at the end of the fixed part;
entry holding means for holding a set of entries, associated with user operations, in each of which a notation, a reading of the notation, and speech of the reading can be registered; and
acquisition means for acquiring, from the entry holding means, the entry corresponding to an operation performed by the user,
wherein the speech synthesis means:
when speech is registered in the entry acquired by the acquisition means, selects the first guidance, performs speech synthesis by the recording-playback method for the fixed part of the first guidance using pre-recorded speech corresponding to that fixed part, and also performs speech synthesis by the recording-playback method for the variable part using the speech registered in the entry; and
when no speech is registered in the entry acquired by the acquisition means, selects the second guidance, performs speech synthesis by the recording-playback method for the fixed part of the second guidance using pre-recorded speech corresponding to that fixed part, and performs speech synthesis by the rule-based synthesis method for the variable part.
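The guidance selection in this claim can be illustrated with a short Python sketch. Everything named here (`Entry`, `FIRST_GUIDANCE`, `SECOND_GUIDANCE`, `plan_guidance`, and the English guidance wording) is hypothetical; the sketch only produces a playback plan and assumes that actual audio playback and synthesis happen elsewhere.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Entry:
    """One registrable entry: notation, reading, and optional recorded speech."""
    notation: str
    reading: str
    audio: Optional[bytes] = None  # None means no speech is registered


# None marks the variable part. The wording is an invented example.
FIRST_GUIDANCE = ["Sending to", None, "now."]  # variable part in the middle
SECOND_GUIDANCE = ["Now sending to", None]     # variable part at the end


def plan_guidance(entry: Entry) -> List[Tuple[str, str]]:
    """Return (method, content) pairs for the guidance chosen for this entry."""
    has_audio = entry.audio is not None
    template = FIRST_GUIDANCE if has_audio else SECOND_GUIDANCE
    plan: List[Tuple[str, str]] = []
    for part in template:
        if part is not None:
            # Fixed parts always use pre-recorded speech.
            plan.append(("recorded", part))
        elif has_audio:
            # Variable part: play the speech registered in the entry.
            plan.append(("recorded", entry.notation))
        else:
            # Variable part: synthesize from the entry's reading.
            plan.append(("synthesized", entry.reading))
    return plan


print(plan_guidance(Entry("Tanaka", "tanaka")))
# prints [('recorded', 'Now sending to'), ('synthesized', 'tanaka')]
```

In this sketch, an entry without registered speech yields a plan with a single switch at the end of the sentence, while an entry with registered speech yields a plan that is recorded throughout.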
7. The speech processing apparatus according to claim 6, further comprising communication means for performing network communication, wherein the user operation includes an operation relating to the network communication, and the entry holding means constitutes an address book for the network communication.
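One way to picture an address book acting as the entry holding means is the sketch below. The class and method names, and the idea of keying entries by a destination address string, are assumptions made for illustration; the claim does not specify how entries are stored or looked up.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class AddressBookEntry:
    """Notation, reading, and optional recorded speech for one destination."""
    notation: str
    reading: str
    audio: Optional[bytes] = None


class AddressBook:
    """Hypothetical entry store keyed by destination address."""

    def __init__(self) -> None:
        self._entries: Dict[str, AddressBookEntry] = {}

    def register(self, address: str, entry: AddressBookEntry) -> None:
        # Adds a new entry or overwrites the one stored for this address.
        self._entries[address] = entry

    def lookup(self, address: str) -> Optional[AddressBookEntry]:
        # Returns None when the user selects an unknown destination.
        return self._entries.get(address)


book = AddressBook()
book.register("tanaka@example.com", AddressBookEntry("Tanaka", "tanaka"))
entry = book.lookup("tanaka@example.com")
print(entry.notation if entry else "unknown")  # prints Tanaka
```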
8. A speech processing method for generating guidance speech in response to a user operation by controlling a speech processing apparatus that comprises: guidance holding means for holding a first guidance, which consists of a fixed part representing a fixed message and a variable part located in the middle of the fixed part and indicating where a message corresponding to the user operation is inserted, and a second guidance, which has the same meaning as the first guidance but places the variable part at the end of the fixed part; entry holding means for holding a set of entries, associated with user operations, in each of which a notation, a reading of the notation, and speech of the reading can be registered; and speech synthesis means capable of performing speech synthesis while selectively switching between a recording-playback method and a rule-based synthesis method, the method comprising:
an acquisition step in which acquisition means acquires, from the entry holding means, the entry corresponding to an operation performed by the user;
a first speech synthesis step in which, when speech is registered in the entry acquired in the acquisition step, the speech synthesis means selects the first guidance, performs speech synthesis by the recording-playback method for the fixed part of the first guidance using pre-recorded speech corresponding to that fixed part, and also performs speech synthesis by the recording-playback method for the variable part using the speech registered in the entry; and
a second speech synthesis step in which, when no speech is registered in the entry acquired in the acquisition step, the speech synthesis means selects the second guidance, performs speech synthesis by the recording-playback method for the fixed part of the second guidance using pre-recorded speech corresponding to that fixed part, and performs speech synthesis by the rule-based synthesis method for the variable part.
9. A program for causing a computer to execute each step of the speech processing method according to claim 8.
10. A computer-readable storage medium storing the program according to claim 9.
JP2008134655A 2007-07-11 2008-05-22 Audio processing apparatus and method Expired - Fee Related JP5097007B2 (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
JP2008134655A JP5097007B2 (en) 2007-07-11 2008-05-22 Audio processing apparatus and method
US12/170,124 US8027835B2 (en) 2007-07-11 2008-07-09 Speech processing apparatus having a speech synthesis unit that performs speech synthesis while selectively changing recorded-speech-playback and text-to-speech and method

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
JP2007182555 2007-07-11
JP2007182555 2007-07-11
JP2008134655A JP5097007B2 (en) 2007-07-11 2008-05-22 Audio processing apparatus and method

Publications (3)

Publication Number Publication Date
JP2009037214A (en) 2009-02-19
JP2009037214A5 (en) 2011-05-26
JP5097007B2 (en) 2012-12-12

Family

ID=40439123

Family Applications (1)

Application Number Title Priority Date Filing Date
JP2008134655A Expired - Fee Related JP5097007B2 (en) 2007-07-11 2008-05-22 Audio processing apparatus and method

Country Status (1)

Country Link
JP (1) JP5097007B2 (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP5269668B2 (en) 2009-03-25 2013-08-21 株式会社東芝 Speech synthesis apparatus, program, and method

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH07210194A (en) * 1994-01-18 1995-08-11 Hitachi Ltd Device for outputting sound
JPH0934490A (en) * 1995-07-20 1997-02-07 Sony Corp Method and device for voice synthetization, navigation system, and recording medium

Similar Documents

Publication Publication Date Title
JP2009526260A5 (en)
JP2007295038A5 (en)
RU2008137617A (en) NAVIGATION DEVICE AND METHOD FOR RECEIVING AND PLAYING SOUND SAMPLES
JP2013025299A (en) Transcription support system and transcription support method
KR20190076934A (en) Audio metadata encoding and audio data playing apparatus for supporting dynamic format conversion, and method for performing by the appartus, and computer-readable medium recording the dynamic format conversions
KR100830689B1 (en) Method of reproducing multimedia for educating foreign language by chunking and Media recorded thereby
JP2009037214A5 (en)
KR100695209B1 (en) Method and mobile communication terminal for storing content of electronic book
JP2018146961A (en) Voice reproduction device and voice reproduction program
JP4581052B2 (en) Recording / reproducing apparatus, recording / reproducing method, and program
JP3978465B2 (en) Recording / playback device
KR20080113844A (en) Apparatus and method for voice file playing in electronic device
JP2005092191A5 (en)
JP6273456B2 (en) Audio playback device
KR20090076298A (en) Apparatus and method for generating multimedia data with various reproduction speed, apparatus and method for reproducing the multimedia data, and storage medium storing for thereof
JP2008010090A5 (en)
KR100912118B1 (en) Studying system linked with contents and method for studying using the same
JP2015022045A (en) Sound reproducing system
US20060200343A1 (en) Enhanced data storage
JP2015025842A (en) Sound reproduction apparatus
JP2016012098A (en) Electronic book reproduction device and electronic book reproduction program
JP2022062818A (en) Information processing device
JP2008191292A (en) Speech synthesis method and program, speech synthesizing device, and music and speech reproducing device
JPH0119160B2 (en)
JP2006166441A5 (en)