JP2007148172A5 - Google Patents
- Publication number
- JP2007148172A5 (application JP2005344737A)
- Authority
- JP
- Japan
- Prior art keywords
- speech
- unit
- priority
- units
- selection
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Claims (8)
A voice quality control device that synthesizes speech from a sequence of speech units and, according to user designation, changes speech units included in the synthesized speech to speech units of a different voice quality, the device comprising:
unit storage means for storing a plurality of speech units that are candidates for selection for an input including at least phoneme information and prosody information;
unit selection means for selecting, for each input, a speech unit sequence from the unit groups, each consisting of a plurality of candidate speech units, while maintaining a predetermined continuity between adjacent speech units;
synthesis means for synthesizing speech from the sequence of speech units selected by the unit selection means and presenting it to the user;
input means for receiving an input from the user designating, from within the unit group, a speech unit that should be selected in preference to a speech unit included in the synthesized sequence; and
priority determination means for determining, for the speech unit designated by the user, a priority higher than that of the speech unit selected by the unit selection means,
wherein the unit selection means reselects the speech unit designated by the user from the unit group based on the priority determined by the priority determination means, and further reselects the preceding and following speech units, permitting the same selection as before the reselection,
the voice quality control device further comprising:
connection distortion detection means for detecting, in a sequence of speech units including a plurality of speech units designated sequentially by the user, connection distortion between speech units by measuring the continuity between the speech units using a predetermined method; and
priority adjustment means for adjusting the priority of the speech unit determined by the priority determination means when, after the unit selection means has reselected speech units, the connection distortion detection means detects distortion at or above a predetermined threshold,
wherein the unit selection means further reselects speech units based on the adjusted priority.
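The selection and reselection loop described in the device claim above can be illustrated with a small Python sketch. It is not the patented implementation: the `Unit` fields, the `join_distortion` measure (an absolute feature jump at the joint), the `weight` parameter, and the uniform `decay` rule used to adjust priorities are all assumptions introduced here for illustration. The claim itself only requires a predetermined continuity measure, a threshold, and some adjustment of the designated priorities.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Unit:
    uid: str      # hypothetical identifier of a stored speech unit
    start: float  # assumed feature value (e.g. pitch) at the left edge
    end: float    # assumed feature value at the right edge


def join_distortion(a, b):
    """Assumed continuity measure: feature jump across the joint a -> b."""
    return abs(a.end - b.start)


def select_units(groups, priority=None, weight=10.0):
    """Viterbi-style search picking one unit per candidate group.

    Minimises total join distortion minus weight * priority, so a
    user-designated unit is preferred while its neighbours remain free
    to change or to stay as they were.
    """
    priority = priority or {}
    # best[uid] = (cost of the cheapest path ending in that unit, path)
    best = {u.uid: (-weight * priority.get(u.uid, 0.0), [u]) for u in groups[0]}
    for group in groups[1:]:
        nxt = {}
        for u in group:
            cost, path = min(
                ((c + join_distortion(p[-1], u), p) for c, p in best.values()),
                key=lambda t: t[0],
            )
            nxt[u.uid] = (cost - weight * priority.get(u.uid, 0.0), path + [u])
        best = nxt
    return min(best.values(), key=lambda t: t[0])[1]


def max_join_distortion(seq):
    """Largest join distortion between adjacent units of a sequence."""
    return max(join_distortion(a, b) for a, b in zip(seq, seq[1:]))


def reselect_with_adjustment(groups, priority, threshold, decay=0.5, max_rounds=8):
    """Reselect, and while distortion reaches the threshold, lower priorities.

    Uniform decay is an assumption; the claim leaves the adjustment rule open.
    """
    seq = select_units(groups, priority)
    for _ in range(max_rounds):
        if max_join_distortion(seq) < threshold:
            break
        priority = {uid: p * decay for uid, p in priority.items()}
        seq = select_units(groups, priority)
    return seq, priority


# Toy candidate groups for three consecutive positions (made-up values).
groups = [
    [Unit("a0", 0.0, 1.0), Unit("a1", 0.0, 3.0)],
    [Unit("b0", 1.0, 2.0), Unit("b1", 3.2, 5.0)],
    [Unit("c0", 2.0, 0.0), Unit("c1", 5.0, 0.0)],
]
default_seq = select_units(groups)
after_one = select_units(groups, {"b1": 1.0})
final_seq, final_priority = reselect_with_adjustment(
    groups, {"b1": 1.0, "c0": 1.0}, threshold=1.0
)
```

With this toy data the default selection is a0, b0, c0. Designating b1 alone also pulls in its neighbours a1 and c1. Designating both b1 and c0 first yields a1, b1, c0 with a join distortion of 3.0 at the last joint; two rounds of decay then bring the search back to a0, b0, c0, which keeps the c0 designation but drops b1.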
The voice quality control device according to claim 1, wherein the priority adjustment means adjusts the priority based on at least one of: the priority determined for each of the speech units designated in proximity to one another in the sequence of speech units; the temporal order in which those speech units were designated; and the number of designated speech units.
The voice quality control device according to claim 2, wherein the priority adjustment means determines, based on a function that expresses priority with time as its variable, which of the speech units has the higher priority according to the order of their designation times, and adjusts the priority so that, for the speech unit determined to have the lower priority, a speech unit having smaller connection distortion with the speech unit determined to have the higher priority is reselected from the unit group, together with the preceding and following speech units.
The voice quality control device according to claim 3, wherein the function takes positive values and is monotonically increasing or monotonically decreasing according to the sign of the first-order coefficient of the variable, the first-order coefficient being set to a positive value when importance is given to the user designation made later in time, and to a negative value when importance is given to the one made earlier in time.
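A time-dependent priority function of the kind described in the two claims above might look like the following Python sketch. The exponential form `exp(a * t + b)` is an assumption chosen only because it is always positive and is monotonic in the direction given by the sign of the first-order coefficient `a`; the claims do not name a specific function, and `time_priority` and `rank_designations` are hypothetical helper names.

```python
import math


def time_priority(t, a, b=0.0):
    """Assumed priority function of designation time t.

    Always positive; increasing in t when a > 0 (later designations win),
    decreasing in t when a < 0 (earlier designations win).
    """
    return math.exp(a * t + b)


def rank_designations(designation_times, a):
    """Order designated unit ids from highest to lowest priority."""
    return sorted(
        designation_times,
        key=lambda uid: time_priority(designation_times[uid], a),
        reverse=True,
    )


# Hypothetical designation times (seconds since the editing session began).
times = {"b1": 1.0, "c0": 2.0}
later_first = rank_designations(times, a=0.5)     # favours the later designation
earlier_first = rank_designations(times, a=-0.5)  # favours the earlier designation
```

The unit ranked lower would then be the one whose candidates are searched again for a smaller join distortion with the higher-ranked unit.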
The voice quality control device according to claim 2, wherein the unit storage means stores a plurality of speech units clustered based on similarity, and
when the priority adjusted by the priority adjustment means is smaller than a predetermined threshold, the priority of the cluster to which the adjusted speech unit belongs is set to the priority adjusted by the priority adjustment means.
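The cluster rule in the claim above can be sketched in a few lines of Python. The dictionaries `cluster_of` and `cluster_priority`, the function names, and the fallback in `effective_priority` (a unit without its own priority inherits its cluster's value) are illustrative assumptions; the claim only states that a sufficiently lowered priority is applied to the whole cluster of the adjusted unit.

```python
def propagate_to_cluster(unit_id, adjusted, threshold, cluster_of, cluster_priority):
    """Return cluster priorities after applying the assumed propagation rule.

    If the adjusted priority of unit_id is below the threshold, the cluster
    that contains the unit takes over that adjusted priority.
    """
    updated = dict(cluster_priority)
    if adjusted < threshold:
        updated[cluster_of[unit_id]] = adjusted
    return updated


def effective_priority(unit_id, unit_priority, cluster_of, cluster_priority, default=0.0):
    """Assumed lookup: own priority first, then the cluster's, then a default."""
    if unit_id in unit_priority:
        return unit_priority[unit_id]
    return cluster_priority.get(cluster_of[unit_id], default)


# Hypothetical clustering of similar-sounding units.
cluster_of = {"b1": "bright", "b2": "bright", "b0": "neutral"}
clusters = propagate_to_cluster("b1", adjusted=-0.4, threshold=0.0,
                                cluster_of=cluster_of, cluster_priority={})
b2_priority = effective_priority("b2", {}, cluster_of, clusters)
```

Here lowering b1 below the threshold also lowers b2, which sits in the same cluster, so the search does not simply swap in a near-identical unit.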
The voice quality control device according to claim 1, further comprising:
display means for displaying the sequence of speech units presented to the user by the speech synthesis, together with the unit groups that were candidates for selection of the speech units included in the sequence,
wherein the input means receives an input designating, from among the displayed unit groups, a speech unit to be preferentially selected.
A voice quality control method for synthesizing speech from a sequence of speech units and, according to user designation, changing speech units included in the synthesized speech to speech units of a different voice quality, the method comprising:
a unit storage step of storing a plurality of speech units that are candidates for selection for an input including at least phoneme information and prosody information;
a unit selection step of selecting, for each input, a speech unit sequence from the unit groups, each consisting of a plurality of candidate speech units, while maintaining a predetermined continuity between adjacent speech units;
a synthesis step of synthesizing speech from the sequence of speech units selected in the unit selection step and presenting it to the user;
an input step of receiving an input from the user designating, from within the unit group, a speech unit that should be selected in preference to a speech unit included in the synthesized sequence; and
a priority determination step of determining, for the speech unit designated by the user, a priority higher than that of the speech unit selected by the unit selection means,
wherein, in the unit selection step, the speech unit designated by the user is reselected from the unit group based on the priority determined in the priority determination step, and the preceding and following speech units are further reselected, permitting the same selection as before the reselection,
the voice quality control method further comprising:
a connection distortion detection step of detecting, in a sequence of speech units including a plurality of speech units designated sequentially by the user, connection distortion between speech units by measuring the continuity between the speech units using a predetermined method; and
a priority adjustment step of adjusting the priority of the speech unit determined in the priority determination step when, after speech units have been reselected in the unit selection step, distortion at or above a predetermined threshold is detected in the connection distortion detection step,
wherein, in the unit selection step, speech units are further reselected based on the adjusted priority.
A program for a voice quality control device that synthesizes speech from a sequence of speech units and, according to user designation, changes speech units included in the synthesized speech to speech units of a different voice quality, the program causing a computer to execute:
a unit storage step of storing a plurality of speech units that are candidates for selection for an input including at least phoneme information and prosody information;
a unit selection step of selecting, for each input, a speech unit sequence from the unit groups, each consisting of a plurality of candidate speech units, while maintaining a predetermined continuity between adjacent speech units;
a synthesis step of synthesizing speech from the sequence of speech units selected in the unit selection step and presenting it to the user;
an input step of receiving an input from the user designating, from within the unit group, a speech unit that should be selected in preference to a speech unit included in the synthesized sequence; and
a priority determination step of determining, for the speech unit designated by the user, a priority higher than that of the speech unit selected by the unit selection means,
wherein, in the unit selection step, the speech unit designated by the user is reselected from the unit group based on the priority determined in the priority determination step, and the preceding and following speech units are further reselected, permitting the same selection as before the reselection,
the program further causing the computer to execute:
a connection distortion detection step of detecting, in a sequence of speech units including a plurality of speech units designated sequentially by the user, connection distortion between speech units by measuring the continuity between the speech units using a predetermined method; and
a priority adjustment step of adjusting the priority of the speech unit determined in the priority determination step when, after speech units have been reselected in the unit selection step, distortion at or above a predetermined threshold is detected in the connection distortion detection step,
wherein, in the unit selection step, speech units are further reselected based on the adjusted priority.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP2005344737A JP4664194B2 (en) | 2005-11-29 | 2005-11-29 | Voice quality control device and method, and program storage medium |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP2005344737A JP4664194B2 (en) | 2005-11-29 | 2005-11-29 | Voice quality control device and method, and program storage medium |
Publications (3)
Publication Number | Publication Date |
---|---|
JP2007148172A (en) | 2007-06-14 |
JP2007148172A5 (en) | 2008-12-25 |
JP4664194B2 (en) | 2011-04-06 |
Family
ID=38209625
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP2005344737A Expired - Fee Related JP4664194B2 (en) | 2005-11-29 | 2005-11-29 | Voice quality control device and method, and program storage medium |
Country Status (1)
Country | Link |
---|---|
JP (1) | JP4664194B2 (en) |
Families Citing this family (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP5198200B2 (en) * | 2008-09-25 | 2013-05-15 | 株式会社東芝 | Speech synthesis apparatus and method |
JP2011180368A (en) * | 2010-03-01 | 2011-09-15 | Fujitsu Ltd | Synthesized voice correction device and synthesized voice correction method |
JP5123347B2 (en) * | 2010-03-31 | 2013-01-23 | 株式会社東芝 | Speech synthesizer |
JP5648347B2 (en) * | 2010-07-14 | 2015-01-07 | ヤマハ株式会社 | Speech synthesizer |
KR101201913B1 (en) * | 2010-11-08 | 2012-11-15 | 주식회사 보이스웨어 | Voice Synthesizing Method and System Based on User Directed Candidate-Unit Selection |
JP5712818B2 (en) * | 2011-06-30 | 2015-05-07 | 富士通株式会社 | Speech synthesis apparatus, sound quality correction method and program |
Family Cites Families (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP3281281B2 (en) * | 1996-03-12 | 2002-05-13 | 株式会社東芝 | Speech synthesis method and apparatus |
JP3423276B2 (en) * | 2000-08-10 | 2003-07-07 | 三洋電機株式会社 | Voice synthesis method |
JP2004145015A (en) * | 2002-10-24 | 2004-05-20 | Fujitsu Ltd | System and method for text speech synthesis |
JP4311710B2 (en) * | 2003-02-14 | 2009-08-12 | 株式会社アルカディア | Speech synthesis controller |
JP2005181998A (en) * | 2003-11-28 | 2005-07-07 | Matsushita Electric Ind Co Ltd | Speech synthesizer and speech synthesizing method |
2005
- 2005-11-29: JP application JP2005344737A, patent JP4664194B2 (en), not active (Expired - Fee Related)
Similar Documents
Publication | Title |
---|---|
JP2007148172A5 (en) | |
JP6264081B2 (en) | Program, method and system for selecting video segment, and program for selecting segment of media clip |
TWI337740B (en) | |
US9087501B2 (en) | Sound signal analysis apparatus, sound signal analysis method and sound signal analysis program |
US9747876B1 (en) | Adaptive layout of sheet music in coordination with detected audio |
CN104517605B (en) | A kind of sound bite splicing system and method for phonetic synthesis |
JP2001265326A (en) | Performance position detecting device and score display device |
JP2006295514A5 (en) | |
JP2016156996A (en) | Electronic device, method, and program |
JP5888356B2 (en) | Voice search device, voice search method and program |
JP4664194B2 (en) | Voice quality control device and method, and program storage medium |
JP2012108451A (en) | Audio processor, method and program |
US8728822B2 (en) | Method, apparatus, and recording medium for performance game |
CN104205212A (en) | Talker collision in auditory scene |
GB2506404A (en) | Computer implemented iterative method of cross-fading between two audio tracks |
JP4921343B2 (en) | Performance evaluation device, program and electronic musical instrument |
JPWO2018207936A1 (en) | Automatic musical score detection method and device |
CN103974142B (en) | A kind of video broadcasting method and system |
JP2006313176A (en) | Speech synthesizer |
JP4687297B2 (en) | Image processing apparatus and program |
JP4442239B2 (en) | Voice speed conversion device and voice speed conversion method |
JP7232654B2 (en) | karaoke equipment |
JP4846548B2 (en) | Audio information selection device and audio information selection method |
JP6569343B2 (en) | Voice search device, voice search method and program |
WO2017164216A1 (en) | Acoustic processing method and acoustic processing device |