JP2007148172A5

Info

Publication number: JP2007148172A5
Application number: JP2005344737A
Authority: JP
Filing date: 2005-11-29
Publication date: 2008-12-25
Anticipated expiration: 2025-11-29

Claims

A voice quality control device that synthesizes speech from a sequence of speech units and, according to user designation, changes speech units included in the synthesized speech to speech units of a different voice quality, the device comprising:
Unit storage means for storing a plurality of speech units that are candidates for selection with respect to an input including at least phoneme information and prosody information;
Unit selection means for selecting, for each input, a sequence of speech units from a unit group composed of the plurality of candidate speech units while maintaining a predetermined continuity between adjacent speech units;
Synthesizing means for synthesizing speech from the sequence of speech units selected by the unit selection means and presenting it to the user;
Input means for receiving, from the user, an input designating, from the unit group, a speech unit to be preferentially selected in place of a speech unit included in the synthesized sequence; and
Priority determining means for assigning, to the speech unit designated by the user, a priority higher than that of the speech unit selected by the unit selection means,
Wherein the unit selection means reselects the speech unit designated by the user from the unit group based on the priority determined by the priority determining means, and further reselects the speech units before and after it while permitting the same selection as before the reselection,
The voice quality control device further comprising:
Connection distortion detecting means for detecting connection distortion in a sequence of speech units that includes a plurality of speech units successively designated by the user, by measuring the continuity between the speech units by a predetermined method; and
Priority adjusting means for adjusting the priority of the speech unit determined by the priority determining means when, after the unit selection means has reselected the speech units, the connection distortion detecting means detects distortion exceeding a predetermined threshold,
Wherein the unit selection means further reselects speech units based on the adjusted priority.
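
The selection, reselection, and priority-adjustment loop recited in this claim can be illustrated with a minimal sketch. It is not the implementation disclosed in the patent. The `Unit` class, the pitch-gap `join_cost`, the `weight` and `decay` parameters, and the uniform halving of priorities are all assumptions introduced for illustration; the claim only requires "a predetermined method" for measuring continuity and does not say how the priority is adjusted.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Unit:
    uid: str      # hypothetical identifier for a stored speech unit
    pitch: float  # stand-in acoustic feature used by the join cost


def join_cost(a: Unit, b: Unit) -> float:
    """Connection distortion between adjacent units (assumed: pitch gap)."""
    return abs(a.pitch - b.pitch)


def select(candidates, priority, weight=100.0):
    """Viterbi-style search over one candidate list per input position.

    Minimises the summed join cost minus a bonus for prioritised units,
    so neighbours of a designated unit are free to change or stay the same.
    """
    best = [{u: (-weight * priority.get(u.uid, 0.0), None) for u in candidates[0]}]
    for pos in range(1, len(candidates)):
        layer = {}
        for u in candidates[pos]:
            prev, cost = min(
                ((p, c + join_cost(p, u)) for p, (c, _) in best[-1].items()),
                key=lambda pc: pc[1],
            )
            layer[u] = (cost - weight * priority.get(u.uid, 0.0), prev)
        best.append(layer)
    # Backtrack from the cheapest final unit.
    u = min(best[-1], key=lambda x: best[-1][x][0])
    path = [u]
    for pos in range(len(candidates) - 1, 0, -1):
        u = best[pos][u][1]
        path.append(u)
    return path[::-1]


def max_distortion(path):
    """Worst connection distortion along the selected sequence."""
    return max(join_cost(a, b) for a, b in zip(path, path[1:]))


def reselect_with_adjustment(candidates, priority, threshold, decay=0.5, max_rounds=10):
    """Reselect; while distortion exceeds the threshold, lower priorities and retry."""
    priority = dict(priority)
    path = select(candidates, priority)
    for _ in range(max_rounds):
        if max_distortion(path) <= threshold:
            break
        # Assumed adjustment rule: scale every user priority down uniformly.
        priority = {uid: p * decay for uid, p in priority.items()}
        path = select(candidates, priority)
    return path, priority
```

With three positions holding candidates `a1`/`a2`, `b1`/`b2`, and `c1`/`c2`, designating only `b2` makes the search also swap its neighbours so that the joins stay smooth. Designating two poorly matching units in succession (`b2` and `c1` in the sketch's terms) pushes the worst join over the threshold, which triggers the adjustment loop.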

The voice quality control device according to claim 1, wherein the priority adjusting means adjusts the priority based on at least one of: the priorities determined for the speech units designated in proximity to one another in the sequence of speech units, the temporal order in which the speech units were designated, and the number of designated speech units.

The voice quality control device according to claim 2, wherein the priority adjusting means determines the relative priority of the speech units according to the order of their designation times, based on a function that expresses priority with time as a variable, and adjusts the priority so that, for a speech unit determined to have a lower priority, a speech unit having smaller connection distortion with the speech unit determined to have a higher priority is reselected from the unit group, together with the preceding and following speech units.

The voice quality control device according to claim 3, wherein the function takes positive values and is monotonically increasing or monotonically decreasing depending on the sign of the first-order coefficient of the variable, the first-order coefficient being set to a positive value when importance is given to the speech unit designated by the user later in time, and to a negative value when importance is given to the speech unit designated earlier in time.
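
As a hedged illustration of such a function, the sketch below uses a linear form f(t) = slope * t + intercept. The linear form, the `floor` parameter, and the way the intercept is chosen to keep the function positive on the interval [0, t_max] are assumptions; the claim itself only requires positive values and monotonicity governed by the sign of the first-order coefficient.

```python
def time_priority(t, slope, t_max, floor=1.0):
    """Linear priority f(t) = slope * t + intercept for 0 <= t <= t_max.

    The intercept is picked so that the smallest value on the interval
    equals `floor` (> 0), which keeps the function positive everywhere.
    """
    if not 0.0 <= t <= t_max:
        raise ValueError("t must lie in [0, t_max]")
    intercept = floor if slope >= 0 else floor - slope * t_max
    return slope * t + intercept


def rank_designations(designations, slope):
    """Order (uid, designation_time) pairs from highest to lowest priority."""
    t_max = max(t for _, t in designations)
    scored = [(time_priority(t, slope, t_max), uid) for uid, t in designations]
    return [uid for _, uid in sorted(scored, key=lambda s: s[0], reverse=True)]
```

A positive slope ranks the most recent designation first, and a negative slope ranks the earliest designation first. The lower-ranked unit is then the one to be replaced by a candidate that joins more smoothly to the higher-ranked one.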

The voice quality control device according to claim 2, wherein the unit storage means stores the plurality of speech units clustered based on similarity, and
When the priority adjusted by the priority adjusting means is smaller than a predetermined threshold, the priority of the cluster to which the adjusted speech unit belongs is set to the priority adjusted by the priority adjusting means.
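
One possible reading of this claim is that a sufficiently lowered unit priority is copied onto the whole cluster, so that similar-sounding units are demoted together. The sketch below follows that reading only. The dictionary layout, the function names, and the rule that a unit without its own priority falls back to its cluster's priority are assumptions, not details taken from the claim.

```python
def propagate_to_cluster(unit_priority, cluster_of, cluster_priority, uid, threshold):
    """Return cluster priorities after applying the low-priority rule for `uid`.

    If the unit's adjusted priority is below `threshold`, the cluster the
    unit belongs to takes over that adjusted priority; otherwise nothing changes.
    """
    updated = dict(cluster_priority)
    adjusted = unit_priority[uid]
    if adjusted < threshold:
        updated[cluster_of[uid]] = adjusted
    return updated


def effective_priority(uid, unit_priority, cluster_of, cluster_priority):
    """Assumed lookup: a unit's own priority if set, else its cluster's."""
    if uid in unit_priority:
        return unit_priority[uid]
    return cluster_priority[cluster_of[uid]]
```

Under this reading, demoting one unit below the threshold also demotes every unit of the same cluster that has no priority of its own.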

The voice quality control device according to claim 1, further comprising:
Display means for displaying the sequence of speech units presented to the user through the speech synthesis and a unit group of candidates for selection of the speech units included in the sequence,
Wherein the input means receives an input designating, from the displayed unit group, a speech unit to be preferentially selected.

A voice quality control method for synthesizing speech from a sequence of speech units and, according to user designation, changing speech units included in the synthesized speech to speech units of a different voice quality, the method comprising:
A unit storage step of storing a plurality of speech units that are candidates for selection with respect to an input including at least phoneme information and prosody information;
A unit selection step of selecting, for each input, a sequence of speech units from a unit group composed of the plurality of candidate speech units while maintaining a predetermined continuity between adjacent speech units;
A synthesis step of synthesizing speech from the sequence of speech units selected in the unit selection step and presenting it to the user;
An input step of receiving, from the user, an input designating, from the unit group, a speech unit to be preferentially selected in place of a speech unit included in the synthesized sequence; and
A priority determination step of assigning, to the speech unit designated by the user, a priority higher than that of the speech unit selected in the unit selection step,
Wherein, in the unit selection step, the speech unit designated by the user is reselected from the unit group based on the priority determined in the priority determination step, and the speech units before and after it are further reselected while permitting the same selection as before the reselection,
The voice quality control method further comprising:
A connection distortion detection step of detecting connection distortion in a sequence of speech units that includes a plurality of speech units successively designated by the user, by measuring the continuity between the speech units by a predetermined method; and
A priority adjustment step of adjusting the priority of the speech unit determined in the priority determination step when, after the speech units have been reselected in the unit selection step, distortion greater than a predetermined threshold is detected in the connection distortion detection step,
Wherein, in the unit selection step, speech units are further reselected based on the adjusted priority.

A program for a voice quality control device that synthesizes speech from a sequence of speech units and, according to user designation, changes speech units included in the synthesized speech to speech units of a different voice quality, the program causing a computer to execute:
A step of storing a plurality of speech units that are candidates for selection with respect to an input including at least phoneme information and prosody information;
A unit selection step of selecting, for each input, a sequence of speech units from a unit group composed of the plurality of candidate speech units while maintaining a predetermined continuity between adjacent speech units;
A synthesis step of synthesizing speech from the sequence of speech units selected in the unit selection step and presenting it to the user;
An input step of receiving, from the user, an input designating, from the unit group, a speech unit to be preferentially selected in place of a speech unit included in the synthesized sequence;
A priority determination step of assigning, to the speech unit designated by the user, a priority higher than that of the speech unit selected in the unit selection step;
A step of reselecting the speech unit designated by the user from the unit group based on the priority determined in the priority determination step, and further reselecting the speech units before and after it while permitting the same selection as before the reselection;
A connection distortion detection step of detecting connection distortion in a sequence of speech units that includes a plurality of speech units successively designated by the user, by measuring the continuity between the speech units by a predetermined method; and
A priority adjustment step of adjusting the priority of the speech unit determined in the priority determination step when, after the speech units have been reselected in the unit selection step, distortion equal to or greater than a predetermined threshold is detected in the connection distortion detection step,
Wherein the unit selection step further executes reselection of speech units based on the adjusted priority.