JP2015087740A5 - Speech synthesis apparatus and speech synthesis method - Google Patents
- Publication number
- JP2015087740A5
- Authority
- JP
- Japan
- Prior art keywords
- pitch
- section
- speech synthesizer
- changed
- relationship
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Description
Consider a dialogue system that combines the speech synthesis technology described above with a spoken dialogue system, so that in response to a user's spoken question the system retrieves data and outputs it by speech synthesis. A known problem with such systems is that the synthesized speech can sound unnatural to the user; specifically, it can at times sound unmistakably machine-like.
The present invention has been made in view of these circumstances, and one of its objects is to provide a speech synthesis apparatus and a speech synthesis method that give the user a natural impression, specifically, that can convey a favorable or unfavorable impression to the user.
Claims (8)
1. A speech synthesis apparatus comprising:
a voice input unit that receives a question as a voice signal;
a pitch analysis unit that analyzes the pitch of a specific first section of the question;
an acquisition unit that acquires an answer to the question; and
a speech synthesis unit that changes the pitch of a specific second section of the acquired answer so that it has a predetermined relationship with the pitch of the first section, and outputs the result,
wherein the first section is the ending of the question, and
the second section is the beginning or ending of the answer.
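The adjustment claimed above can be sketched as a small frequency computation: estimate the pitch at the question's ending, then place the answer's pitch at a fixed interval relative to it. This is an illustrative sketch only; the function names and the choice of a fixed semitone offset are assumptions, not taken from the patent.

```python
# Illustrative sketch of the claimed pitch adjustment (names and the
# concrete interval are assumptions, not from the patent text).

def shift_semitones(freq_hz: float, semitones: float) -> float:
    """Transpose a frequency by a number of equal-tempered semitones."""
    return freq_hz * 2.0 ** (semitones / 12.0)

def answer_pitch(question_end_hz: float, interval_semitones: int = -7) -> float:
    """Pitch for the answer's second section, held in a fixed
    relationship (here, seven semitones below) to the pitch
    analyzed at the question's ending."""
    return shift_semitones(question_end_hz, interval_semitones)
```

For a question ending near A4 (440 Hz), this yields an answer pitch near D4 (about 293.7 Hz).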
2. The speech synthesis apparatus according to claim 1, wherein the predetermined relationship is a consonant interval excluding perfect unison.
3. The speech synthesis apparatus according to claim 1, wherein the pitch of the second section is changed so as to lie within one octave above or below the pitch of the first section, excluding the identical pitch, and is output.
4. The speech synthesis apparatus according to claim 2 or 3, wherein the pitch of the second section is changed to a pitch in a consonant-interval relationship a fifth below the pitch of the first section, and is output.
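A pitch a fifth below a reference can be computed either as the just-intonation frequency ratio 2:3 or as seven equal-tempered semitones down; the two differ by about 0.1%. Both formulas are standard music theory, shown here as an illustration rather than as the patent's method.

```python
# Two standard ways to compute a perfect fifth below a reference pitch.

def fifth_below_just(freq_hz: float) -> float:
    """Just intonation: a fifth below is a 2:3 frequency ratio."""
    return freq_hz * 2.0 / 3.0

def fifth_below_equal(freq_hz: float) -> float:
    """Equal temperament: seven semitones down."""
    return freq_hz * 2.0 ** (-7.0 / 12.0)
```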
5. The speech synthesis apparatus according to claim 2 or 3, wherein, when the pitch of the second section is to be changed to a pitch having the predetermined relationship with the pitch of the first section,
if the intended pitch is lower than a predetermined threshold pitch, it is further shifted one octave up, or
if the intended pitch is higher than a predetermined threshold pitch, it is further shifted one octave down.
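The threshold correction in claim 5 amounts to folding an out-of-range target pitch back toward the usable register by one octave. The sketch below is an assumption-laden illustration; the threshold values and the use of separate low and high thresholds are illustrative choices, not specified values.

```python
# Sketch of the claimed range correction: a target pitch below the low
# threshold is shifted one octave up; one above the high threshold is
# shifted one octave down. Threshold values are illustrative.

def correct_octave(target_hz: float,
                   low_hz: float = 110.0,
                   high_hz: float = 440.0) -> float:
    if target_hz < low_hz:
        return target_hz * 2.0   # one octave up
    if target_hz > high_hz:
        return target_hz / 2.0   # one octave down
    return target_hz             # already in range
```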
6. The speech synthesis apparatus according to claim 2 or 3, wherein, when the pitch of the second section is to be changed to a pitch having the predetermined relationship with the pitch of the first section, if a predetermined attribute is set, the pitch having the predetermined relationship is further shifted one octave up or down.
7. The speech synthesis apparatus according to claim 1, having a first mode and a second mode as operation modes, wherein the speech synthesis unit:
if the operation mode is the first mode, changes the pitch of the second section to a pitch in a consonant-interval relationship, excluding perfect unison, with the pitch of the first section, and outputs it; and
if the operation mode is the second mode, changes the pitch of the second section to a pitch in a dissonant-interval relationship with the pitch of the first section, and outputs it.
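The two modes can be sketched as selecting a semitone offset from a consonant or a dissonant interval table. The tables below follow the ordinary music-theory classification of intervals within an octave and are my assumption; the patent does not enumerate specific intervals, and the concrete offsets returned are example choices.

```python
# Interval tables (semitone offsets within +/- one octave, unison
# excluded) per the common consonant/dissonant classification; these
# are assumed for illustration, not listed in the patent.
CONSONANT_SEMITONES = [-12, -9, -8, -7, -5, -4, -3, 3, 4, 5, 7, 8, 9, 12]
DISSONANT_SEMITONES = [-11, -10, -6, -2, -1, 1, 2, 6, 10, 11]

def interval_for_mode(mode: str) -> int:
    """Return a semitone offset for the answer's pitch: a perfect
    fifth below in the first mode, a tritone in the second mode,
    as example choices from each table."""
    if mode == "first":
        return -7   # perfect fifth below: consonant
    return 6        # tritone: dissonant
```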
8. A speech synthesis method in which a computer:
acquires an answer to a question input as a voice signal;
analyzes the pitch of a specific first section of the question; and
changes the pitch of a specific second section of the acquired answer so that it has a predetermined relationship with the pitch of the first section, and outputs the result,
wherein the first section is the ending of the question, and
the second section is the beginning or ending of the answer.
Priority Applications (8)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP2014048636A JP5954348B2 (en) | 2013-05-31 | 2014-03-12 | Speech synthesis apparatus and speech synthesis method |
US14/892,624 US9685152B2 (en) | 2013-05-31 | 2014-06-02 | Technology for responding to remarks using speech synthesis |
EP14803435.8A EP3007165B1 (en) | 2013-05-31 | 2014-06-02 | Technology for responding to remarks using speech synthesis |
PCT/JP2014/064631 WO2014192959A1 (en) | 2013-05-31 | 2014-06-02 | Technology for responding to remarks using speech synthesis |
CN201910272063.5A CN109887485A (en) | 2013-05-31 | 2014-06-02 | The technology responded to language is synthesized using speech |
CN201480031099.XA CN105247609B (en) | 2013-05-31 | 2014-06-02 | The method and device responded to language is synthesized using speech |
EP18178496.8A EP3399521B1 (en) | 2013-05-31 | 2014-06-02 | Technology for responding to remarks using speech synthesis |
US15/375,984 US10490181B2 (en) | 2013-05-31 | 2016-12-12 | Technology for responding to remarks using speech synthesis |
Applications Claiming Priority (5)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP2013115111 | 2013-05-31 | ||
JP2013198217 | 2013-09-25 | ||
JP2014048636A JP5954348B2 (en) | 2013-05-31 | 2014-03-12 | Speech synthesis apparatus and speech synthesis method |
Related Child Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
JP2016088798A Division JP6323491B2 (en) | 2013-05-31 | 2016-04-27 | Speech synthesis apparatus and speech synthesis method |
Publications (3)
Publication Number | Publication Date |
---|---|
JP2015087740A JP2015087740A (en) | 2015-05-07 |
JP2015087740A5 true JP2015087740A5 (en) | 2016-04-14 |
JP5954348B2 JP5954348B2 (en) | 2016-07-20 |
Family
ID=53050546
Family Applications (3)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
JP2014048636A Expired - Fee Related JP5954348B2 (en) | 2013-05-31 | 2014-03-12 | Speech synthesis apparatus and speech synthesis method |
JP2016088798A Expired - Fee Related JP6323491B2 (en) | 2013-05-31 | 2016-04-27 | Speech synthesis apparatus and speech synthesis method |
JP2018076522A Expired - Fee Related JP6566076B2 (en) | 2013-05-31 | 2018-04-12 | Speech synthesis method and program |
Family Applications After (2)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
JP2016088798A Expired - Fee Related JP6323491B2 (en) | 2013-05-31 | 2016-04-27 | Speech synthesis apparatus and speech synthesis method |
JP2018076522A Expired - Fee Related JP6566076B2 (en) | 2013-05-31 | 2018-04-12 | Speech synthesis method and program |
Country Status (1)
Country | Link |
---|---|
JP (3) | JP5954348B2 (en) |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2017021125A (en) * | 2015-07-09 | 2017-01-26 | ヤマハ株式会社 | Voice interactive apparatus |
Family Cites Families (14)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JPS57122497A (en) * | 1980-12-30 | 1982-07-30 | Tokuko Ikegami | Voice input/output apparatus |
JPS62115199A (en) * | 1985-11-14 | 1987-05-26 | 日本電気株式会社 | Voice responder |
JP3350293B2 (en) * | 1994-08-09 | 2002-11-25 | 株式会社東芝 | Dialogue processing device and dialogue processing method |
JP3437064B2 (en) * | 1997-08-25 | 2003-08-18 | シャープ株式会社 | Speech synthesizer |
JPH11175082A (en) * | 1997-12-10 | 1999-07-02 | Toshiba Corp | Voice interaction device and voice synthesizing method for voice interaction |
JP2001005487A (en) * | 1999-06-18 | 2001-01-12 | Mitsubishi Electric Corp | Voice recognition device |
JP3578961B2 (en) * | 2000-02-29 | 2004-10-20 | 日本電信電話株式会社 | Speech synthesis method and apparatus |
JP2003271194A (en) * | 2002-03-14 | 2003-09-25 | Canon Inc | Voice interaction device and controlling method thereof |
US8768701B2 (en) * | 2003-01-24 | 2014-07-01 | Nuance Communications, Inc. | Prosodic mimic method and apparatus |
US7280968B2 (en) * | 2003-03-25 | 2007-10-09 | International Business Machines Corporation | Synthetically generated speech responses including prosodic characteristics of speech inputs |
JP2005208162A (en) * | 2004-01-20 | 2005-08-04 | Canon Inc | Device and method for generating sound information, speech synthesis device and method therefor, and control program |
JP5750839B2 (en) * | 2010-06-14 | 2015-07-22 | 日産自動車株式会社 | Audio information presentation apparatus and audio information presentation method |
US20120234158A1 (en) * | 2011-03-15 | 2012-09-20 | Agency For Science, Technology And Research | Auto-synchronous vocal harmonizer |
WO2013187610A1 (en) * | 2012-06-15 | 2013-12-19 | Samsung Electronics Co., Ltd. | Terminal apparatus and control method thereof |
- 2014-03-12 JP JP2014048636A patent/JP5954348B2/en not_active Expired - Fee Related
- 2016-04-27 JP JP2016088798A patent/JP6323491B2/en not_active Expired - Fee Related
- 2018-04-12 JP JP2018076522A patent/JP6566076B2/en not_active Expired - Fee Related
Similar Documents
Publication | Publication Date | Title |
---|---|---|
MX2016013015A (en) | Methods and systems of handling a dialog with a robot. | |
WO2017085714A3 (en) | Virtual assistant for generating personal suggestions to a user based on intonation analysis of the user | |
JP2013102411A5 (en) | ||
WO2015017687A3 (en) | Systems and methods for producing predictive images | |
MX2018004828A (en) | Apparatus and method for generating a filtered audio signal realizing elevation rendering. | |
WO2017029488A3 (en) | Methods of generating personalized 3d head models or 3d body models | |
RU2016124468A (en) | CONTROL DEVICE, METHOD OF MANAGEMENT AND COMPUTER PROGRAM | |
JP2015092654A5 (en) | ||
WO2018081607A3 (en) | Methods of systems of generating virtual multi-dimensional models using image analysis | |
EP2703951A3 (en) | Sound to haptic effect conversion system using mapping | |
WO2017072754A3 (en) | A system and method for computer-assisted instruction of a music language | |
MX2019001216A (en) | Systems and methods for executing a supplemental function for a natural language query. | |
MX2020006034A (en) | Computer-implemented method and system for producing orthopedic care. | |
WO2015114216A3 (en) | Audio segments analysis to determine danceability of a music and for video and pictures synchronisaton to the music. | |
JP2018501575A5 (en) | ||
Thoret et al. | Seeing circles and drawing ellipses: when sound biases reproduction of visual motion | |
PH12018500750A1 (en) | Game apparatus and recording medium | |
EP3012831A3 (en) | Musical drumhead with tonal modification | |
WO2017106610A8 (en) | Method and system for providing automated localized feedback for an extracted component of an electronic document file | |
JP2015087740A5 (en) | Speech synthesis apparatus and speech synthesis method | |
JP2016136284A5 (en) | Speech synthesis apparatus and speech synthesis method | |
KR20130067839A (en) | Apparatus and method for generating motion effect data | |
Karlin et al. | The articulatory tone-bearing unit: Gestural coordination of lexical tone in Thai | |
WO2016029045A3 (en) | Lexical dialect analysis system | |
JP2020076844A5 (en) | Sound processing method, sound processing system and program |