JP2015087740A5 - Speech synthesis apparatus and speech synthesis method - Google Patents

Speech synthesis apparatus and speech synthesis method

Info

Publication number
JP2015087740A5
Authority
JP
Japan
Prior art keywords
pitch
section
speech synthesizer
changed
relationship
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
JP2014048636A
Other languages
Japanese (ja)
Other versions
JP2015087740A (en)
JP5954348B2 (en)
Filing date
Publication date
Application filed
Priority claimed from JP2014048636A (JP5954348B2)
Priority to JP2014048636A (JP5954348B2)
Priority to EP18178496.8A (EP3399521B1)
Priority to EP14803435.8A (EP3007165B1)
Priority to PCT/JP2014/064631 (WO2014192959A1)
Priority to CN201910272063.5A (CN109887485A)
Priority to CN201480031099.XA (CN105247609B)
Priority to US14/892,624 (US9685152B2)
Publication of JP2015087740A
Publication of JP2015087740A5
Publication of JP5954348B2
Application granted
Priority to US15/375,984 (US10490181B2)
Expired - Fee Related
Anticipated expiration

Description

Consider a dialogue system that combines the speech synthesis technology described above with a voice dialogue system, so that, in response to a question spoken by a user, the system retrieves data and outputs the result by speech synthesis. A problem has been pointed out with such a system: the synthesized speech sometimes sounds unnatural to the user, specifically, it sounds unmistakably like a machine talking.
The present invention has been made in view of these circumstances. One of its objects is to provide a speech synthesis apparatus and a speech synthesis method that give the user a natural impression, specifically, that can give the user a favorable impression, an unfavorable impression, or the like.

Claims (8)

A speech synthesis apparatus comprising:
a voice input unit that receives a question as a voice signal;
a pitch analysis unit that analyzes the pitch of a specific first section of the question;
an acquisition unit that acquires an answer to the question; and
a speech synthesis unit that changes the pitch of a specific second section of the acquired answer to a pitch having a predetermined relationship to the pitch of the first section, and outputs the answer,
wherein the first section is the ending of the question, and
the second section is the beginning or the ending of the answer.
The speech synthesis apparatus according to claim 1, wherein the predetermined relationship is a consonant interval other than a perfect unison.
The speech synthesis apparatus according to claim 1, wherein the speech synthesis unit changes the pitch of the second section to a pitch that lies within one octave above or below the pitch of the first section, excluding the same pitch, and outputs the answer.
The speech synthesis apparatus according to claim 2 or 3, wherein the speech synthesis unit changes the pitch of the second section to a pitch that is in a consonant-interval relationship a fifth below the pitch of the first section, and outputs the answer.
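As an illustration of the pitch relationships recited in the three claims above, the following minimal sketch computes a target pitch a perfect fifth below the pitch at the ending of the question. It assumes twelve-tone equal temperament (a fifth taken as 7 semitones) and pitch expressed as a frequency in Hz. The function names are hypothetical and do not come from the patent.

```python
import math

# Assumption: a perfect fifth is 7 semitones in twelve-tone equal temperament.
FIFTH_SEMITONES = 7


def shift_semitones(freq_hz: float, semitones: float) -> float:
    """Return freq_hz moved by the given number of semitones (negative = down)."""
    return freq_hz * 2.0 ** (semitones / 12.0)


def semitone_distance(from_hz: float, to_hz: float) -> float:
    """Signed interval from from_hz to to_hz, in semitones."""
    return 12.0 * math.log2(to_hz / from_hz)


def fifth_below(question_end_hz: float) -> float:
    """Target pitch for the answer section: a fifth below the question ending."""
    return shift_semitones(question_end_hz, -FIFTH_SEMITONES)


def within_one_octave_excluding_same(from_hz: float, to_hz: float) -> bool:
    """Check the range condition: at most one octave away, but not identical."""
    distance = abs(semitone_distance(from_hz, to_hz))
    return 1e-9 < distance <= 12.0 + 1e-9
```

Under these assumptions, a question ending near 392 Hz (about G4) yields a target of about 261.6 Hz (about C4) for the answer section.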
The speech synthesis apparatus according to claim 2 or 3, wherein, when changing the pitch of the second section to a pitch having the predetermined relationship to the pitch of the first section, the speech synthesis unit:
shifts the target pitch up by one further octave if the target pitch is lower than a predetermined threshold pitch; or
shifts the target pitch down by one further octave if the target pitch is higher than a predetermined threshold pitch.
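The threshold-based octave shift in the claim above can be sketched as follows. This is only an illustrative reading: pitch is assumed to be a frequency in Hz, one octave is taken as a factor of two, and the function name and both threshold parameters are hypothetical.

```python
def apply_octave_limits(target_hz: float,
                        low_threshold_hz: float,
                        high_threshold_hz: float) -> float:
    """Shift a target pitch by one octave when it crosses a threshold.

    Assumption: thresholds are supplied by the caller, for example to keep
    the synthesized answer inside a comfortable range for the chosen voice.
    """
    if target_hz < low_threshold_hz:
        # Too low: move the target one octave up (double the frequency).
        return target_hz * 2.0
    if target_hz > high_threshold_hz:
        # Too high: move the target one octave down (halve the frequency).
        return target_hz / 2.0
    # Inside the allowed range: leave the target unchanged.
    return target_hz
```

Because the shift is exactly one octave, the interval class between the question ending and the answer section is preserved.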
The speech synthesis apparatus according to claim 2 or 3, wherein, when changing the pitch of the second section to a pitch having the predetermined relationship to the pitch of the first section, the speech synthesis unit shifts the pitch having the predetermined relationship up or down by one further octave if a predetermined attribute is specified.
The speech synthesis apparatus according to claim 1, having a first mode and a second mode as operation modes, wherein the speech synthesis unit:
when the operation mode is the first mode, changes the pitch of the second section to a pitch that is in a consonant-interval relationship, other than a perfect unison, with the pitch of the first section, and outputs the answer; and
when the operation mode is the second mode, changes the pitch of the second section to a pitch that is in a dissonant-interval relationship with the pitch of the first section, and outputs the answer.
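The two operation modes in the claim above might be sketched as below. The classification of intervals as consonant or dissonant follows a common music-theory convention and is an assumption here, as are the default shifts (a fifth below for the first mode, a tritone below for the second) and all identifiers.

```python
from enum import Enum


class Mode(Enum):
    FIRST = "consonant"   # intended to give a favorable impression
    SECOND = "dissonant"  # intended to give an unfavorable impression


# Assumption: interval classes (semitones modulo 12) treated as consonant.
# Class 0 stands for whole octaves; the unison itself is excluded separately.
CONSONANT_CLASSES = {0, 3, 4, 5, 7, 8, 9}

# Assumption: one representative shift per mode, in semitones.
DEFAULT_SHIFT = {Mode.FIRST: -7, Mode.SECOND: -6}


def is_consonant_excluding_unison(semitones: int) -> bool:
    """True for consonant intervals other than a perfect unison."""
    size = abs(semitones)
    if size == 0:
        return False
    return size % 12 in CONSONANT_CLASSES


def target_pitch(question_end_hz: float, mode: Mode) -> float:
    """Target pitch of the answer section for the given operation mode."""
    return question_end_hz * 2.0 ** (DEFAULT_SHIFT[mode] / 12.0)
```

A real system could choose among several intervals per mode; the single default per mode only keeps the sketch short.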
A speech synthesis method in which a computer:
acquires an answer to a question input as a voice signal;
analyzes the pitch of a specific first section of the question; and
changes the pitch of a specific second section of the acquired answer to a pitch having a predetermined relationship to the pitch of the first section, and outputs the answer,
wherein the first section is the ending of the question, and
the second section is the beginning or the ending of the answer.
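The steps of the method claim can be sketched end to end on pitch contours. This sketch makes several assumptions that the claims do not fix: pitch is given as a list of per-frame fundamental-frequency values in Hz with 0 marking unvoiced frames, the "ending" is the mean of the last few voiced frames, the predetermined relationship is a fifth below, and the whole answer contour is scaled so that its ending lands on the target. All names are hypothetical.

```python
from statistics import mean


def ending_pitch(f0_frames: list[float], n_tail: int = 5) -> float:
    """Estimate the pitch of the ending as the mean of the last voiced frames."""
    voiced = [f for f in f0_frames if f > 0]
    if not voiced:
        raise ValueError("no voiced frames in contour")
    return mean(voiced[-n_tail:])


def shift_answer_contour(answer_f0: list[float],
                         question_f0: list[float],
                         semitones: float = -7.0,
                         n_tail: int = 5) -> list[float]:
    """Scale the answer contour so its ending sits at the target interval.

    Assumption: the default interval is a fifth below (-7 semitones).
    Unvoiced frames (0 Hz) remain 0 after scaling.
    """
    target = ending_pitch(question_f0, n_tail) * 2.0 ** (semitones / 12.0)
    ratio = target / ending_pitch(answer_f0, n_tail)
    return [f * ratio for f in answer_f0]
```

Scaling the whole contour keeps the answer's own intonation shape intact; changing only the second section, as the claim wording allows, would need an additional smoothing step at the section boundary.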
JP2014048636A 2013-05-31 2014-03-12 Speech synthesis apparatus and speech synthesis method Expired - Fee Related JP5954348B2 (en)

Priority Applications (8)

Application Number Priority Date Filing Date Title
JP2014048636A JP5954348B2 (en) 2013-05-31 2014-03-12 Speech synthesis apparatus and speech synthesis method
US14/892,624 US9685152B2 (en) 2013-05-31 2014-06-02 Technology for responding to remarks using speech synthesis
EP14803435.8A EP3007165B1 (en) 2013-05-31 2014-06-02 Technology for responding to remarks using speech synthesis
PCT/JP2014/064631 WO2014192959A1 (en) 2013-05-31 2014-06-02 Technology for responding to remarks using speech synthesis
CN201910272063.5A CN109887485A (en) 2013-05-31 2014-06-02 Technology for responding to remarks using speech synthesis
CN201480031099.XA CN105247609B (en) 2013-05-31 2014-06-02 Method and device for responding to remarks using speech synthesis
EP18178496.8A EP3399521B1 (en) 2013-05-31 2014-06-02 Technology for responding to remarks using speech synthesis
US15/375,984 US10490181B2 (en) 2013-05-31 2016-12-12 Technology for responding to remarks using speech synthesis

Applications Claiming Priority (5)

Application Number Priority Date Filing Date Title
JP2013115111 2013-05-31
JP2013198217 2013-09-25
JP2014048636A JP5954348B2 (en) 2013-05-31 2014-03-12 Speech synthesis apparatus and speech synthesis method

Related Child Applications (1)

Application Number Title Priority Date Filing Date
JP2016088798A Division JP6323491B2 (en) 2013-05-31 2016-04-27 Speech synthesis apparatus and speech synthesis method

Publications (3)

Publication Number Publication Date
JP2015087740A (en) 2015-05-07
JP2015087740A5 (en) 2016-04-14
JP5954348B2 (en) 2016-07-20

Family

ID=53050546

Family Applications (3)

Application Number Title Priority Date Filing Date
JP2014048636A Expired - Fee Related JP5954348B2 (en) 2013-05-31 2014-03-12 Speech synthesis apparatus and speech synthesis method
JP2016088798A Expired - Fee Related JP6323491B2 (en) 2013-05-31 2016-04-27 Speech synthesis apparatus and speech synthesis method
JP2018076522A Expired - Fee Related JP6566076B2 (en) 2013-05-31 2018-04-12 Speech synthesis method and program

Country Status (1)

Country Link
JP (3) JP5954348B2 (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2017021125A (en) * 2015-07-09 2017-01-26 ヤマハ株式会社 Voice interactive apparatus

Family Cites Families (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPS57122497A (en) * 1980-12-30 1982-07-30 Tokuko Ikegami Voice input/output apparatus
JPS62115199A (en) * 1985-11-14 1987-05-26 日本電気株式会社 Voice responder
JP3350293B2 (en) * 1994-08-09 2002-11-25 株式会社東芝 Dialogue processing device and dialogue processing method
JP3437064B2 (en) * 1997-08-25 2003-08-18 シャープ株式会社 Speech synthesizer
JPH11175082A (en) * 1997-12-10 1999-07-02 Toshiba Corp Voice interaction device and voice synthesizing method for voice interaction
JP2001005487A (en) * 1999-06-18 2001-01-12 Mitsubishi Electric Corp Voice recognition device
JP3578961B2 (en) * 2000-02-29 2004-10-20 日本電信電話株式会社 Speech synthesis method and apparatus
JP2003271194A (en) * 2002-03-14 2003-09-25 Canon Inc Voice interaction device and controlling method thereof
US8768701B2 (en) * 2003-01-24 2014-07-01 Nuance Communications, Inc. Prosodic mimic method and apparatus
US7280968B2 (en) * 2003-03-25 2007-10-09 International Business Machines Corporation Synthetically generated speech responses including prosodic characteristics of speech inputs
JP2005208162A (en) * 2004-01-20 2005-08-04 Canon Inc Device and method for generating sound information, speech synthesis device and method therefor, and control program
JP5750839B2 (en) * 2010-06-14 2015-07-22 日産自動車株式会社 Audio information presentation apparatus and audio information presentation method
US20120234158A1 (en) * 2011-03-15 2012-09-20 Agency For Science, Technology And Research Auto-synchronous vocal harmonizer
WO2013187610A1 (en) * 2012-06-15 2013-12-19 Samsung Electronics Co., Ltd. Terminal apparatus and control method thereof
