JP2017090856A5

Info

Publication number: JP2017090856A5
Application number: JP2015225047A
Authority: JP
Filing date: 2015-11-17
Publication date: 2018-11-15
Anticipated expiration: 2035-11-17

Claims

A speech information creation device comprising a processing unit that executes: an acquisition process of acquiring, from input speech data, first position data indicating at least one of an accent position and a break position; a comparison process of comparing second position data, which indicates at least one of an accent position assigned to morpheme data including a plurality of morphemes generated from text data corresponding to the speech data and a break position between the plurality of morphemes, with the first position data acquired from the speech data; and, when the first and second position data do not match, a process of assigning the first position data to the morpheme data in place of the second position data.
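The compare-and-replace step recited above can be sketched as follows. This is a minimal illustration only, not the implementation described in the specification: the `Morpheme` and `Position` types, the `reconcile` function, and the representation of an accent position as an integer index are all assumptions introduced here.

```python
from dataclasses import dataclass, replace
from typing import List, Optional


@dataclass(frozen=True)
class Position:
    """Hypothetical position data: an accent index and a break flag."""
    accent: Optional[int]  # accent position within the morpheme, None if absent
    break_after: bool      # True if a break follows the morpheme


@dataclass(frozen=True)
class Morpheme:
    """Hypothetical morpheme record carrying the second (text-derived) position data."""
    surface: str
    accent: Optional[int]
    break_after: bool


def reconcile(morphemes: List[Morpheme], observed: List[Position]) -> List[Morpheme]:
    """Compare text-derived positions with speech-derived ones, one morpheme at a time.

    Where the two disagree, the speech-derived (first) position data replaces
    the text-derived (second) position data.
    """
    if len(morphemes) != len(observed):
        raise ValueError("one observed position is required per morpheme")
    result = []
    for morpheme, first in zip(morphemes, observed):
        second = Position(morpheme.accent, morpheme.break_after)
        if second != first:
            morpheme = replace(morpheme, accent=first.accent,
                               break_after=first.break_after)
        result.append(morpheme)
    return result
```

For example, `reconcile([Morpheme("hashi", 1, False)], [Position(2, False)])` returns a single morpheme whose accent is 2, because the position observed in the speech overrides the one predicted from the text.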

The speech information creation device according to claim 1, further comprising a morpheme analysis unit that generates the morpheme data including the plurality of morphemes by executing a morpheme analysis process on the text data.

The speech information creation device according to claim 1, wherein the processing unit executes, as the acquisition process, a process of acquiring the position of a silent section of the speech data as a break position of the first position data.
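One simple way to locate silent sections is an amplitude threshold with a minimum duration. The sketch below assumes exactly that; the claim does not specify how silence is detected, and the names `silent_sections` and `break_positions`, the threshold, and the choice of the section midpoint as the break position are all hypothetical.

```python
from typing import List, Sequence, Tuple


def silent_sections(samples: Sequence[float], threshold: float,
                    min_len: int) -> List[Tuple[int, int]]:
    """Return (start, end) sample ranges, end exclusive, where |sample| < threshold
    for at least min_len consecutive samples."""
    sections = []
    start = None
    for i, sample in enumerate(samples):
        if abs(sample) < threshold:
            if start is None:
                start = i  # a quiet run begins here
        else:
            if start is not None and i - start >= min_len:
                sections.append((start, i))
            start = None
    # A quiet run may extend to the end of the signal.
    if start is not None and len(samples) - start >= min_len:
        sections.append((start, len(samples)))
    return sections


def break_positions(sections: Sequence[Tuple[int, int]]) -> List[int]:
    """Assumption: take the midpoint of each silent section as the break position."""
    return [(start + end) // 2 for start, end in sections]
```

With `samples = [0.5, 0.4, 0.0, 0.0, 0.0, 0.6, 0.0, 0.5]`, `threshold=0.1`, and `min_len=2`, only the three-sample quiet run qualifies; the single quiet sample near the end is too short to count as a silent section.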

The speech information creation device according to claim 1, wherein, in the comparison process, when the break position indicated by the first position data coincides with the break position indicated by the second position data, the processing unit further assigns reading point information to the position in the morpheme data indicated by the second position data.
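The coincidence check above might look like the following sketch. It assumes that break positions are expressed as the index of the morpheme a break follows, and that the reading point is represented by the Japanese comma "、"; both are illustrative assumptions, as is the function name `mark_reading_points`.

```python
from typing import List, Set


def mark_reading_points(morphemes: List[str], predicted_breaks: Set[int],
                        observed_breaks: Set[int], mark: str = "、") -> List[str]:
    """Insert a reading point after each morpheme whose break is both
    predicted from the text and observed in the speech.

    A break index i means "a break follows morphemes[i]" (hypothetical convention).
    """
    confirmed = predicted_breaks & observed_breaks
    out = []
    for i, surface in enumerate(morphemes):
        out.append(surface)
        if i in confirmed:
            out.append(mark)
    return out
```

Breaks that appear in only one of the two sets receive no reading point here; under claim 1 such a mismatch would instead be resolved in favor of the speech-derived position.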

The speech information creation device according to claim 1, wherein the processing unit executes, as the acquisition process, a process of determining the fundamental frequency of the speech data within the section of the speech data corresponding to each of the plurality of morphemes, and a process of acquiring the position at which the fundamental frequency is highest within that section as an accent position of the first position data.
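Picking the highest fundamental frequency per morpheme section can be sketched as below. The sketch assumes an F0 contour has already been estimated frame by frame (with 0 marking unvoiced frames) and that each morpheme has been aligned to a frame range; neither the pitch estimator nor the alignment method is given in the claim, and `accent_frames` is a hypothetical name.

```python
from typing import List, Optional, Sequence, Tuple


def accent_frames(f0: Sequence[float],
                  sections: Sequence[Tuple[int, int]]) -> List[Optional[int]]:
    """For each (start, end) frame range, end exclusive, return the index of the
    voiced frame with the highest F0, or None if the section has no voiced frame."""
    result = []
    for start, end in sections:
        voiced = [i for i in range(start, end) if f0[i] > 0]
        # max() keeps the earliest frame when several share the highest F0.
        result.append(max(voiced, key=lambda i: f0[i]) if voiced else None)
    return result
```

For `f0 = [0, 120, 150, 130, 0, 0, 110, 100]` and sections `[(0, 4), (4, 6), (6, 8)]`, the result is `[2, None, 6]`: the middle section is entirely unvoiced, so no accent position is reported for it.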

The speech information creation device according to claim 1, wherein the processing unit executes, as the acquisition process, a process of determining the signal strength of the speech data within the section of the speech data corresponding to each of the plurality of morphemes, and a process of acquiring the position at which the signal strength is highest within that section as an accent position of the first position data.
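The signal-strength variant can be sketched in the same spirit. Here signal strength is taken to be the root-mean-square amplitude of non-overlapping frames, which is an assumption; the claim does not define the measure. The names `frame_rms` and `strongest_frames` are hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def frame_rms(samples: Sequence[float], frame_len: int) -> List[float]:
    """RMS amplitude of consecutive non-overlapping frames; a trailing
    partial frame is dropped."""
    return [
        math.sqrt(sum(x * x for x in samples[i:i + frame_len]) / frame_len)
        for i in range(0, len(samples) - frame_len + 1, frame_len)
    ]


def strongest_frames(rms: Sequence[float],
                     sections: Sequence[Tuple[int, int]]) -> List[int]:
    """For each non-empty (start, end) frame range, end exclusive, return the
    index of the frame with the highest RMS value."""
    return [max(range(start, end), key=lambda i: rms[i]) for start, end in sections]
```

Each section passed to `strongest_frames` must contain at least one frame; an empty range raises `ValueError` from `max`.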

A speech information creation method used in a speech information creation device including a processing unit, wherein the processing unit:
acquires first position data indicating at least one of an accent position and a break position from input speech data;
compares second position data, which indicates at least one of an accent position assigned to morpheme data including a plurality of morphemes generated from text data corresponding to the speech data and a break position between the plurality of morphemes, with the first position data acquired from the speech data; and
when the first and second position data do not match, assigns the first position data to the morpheme data in place of the second position data.

A program that causes a computer used as a speech information creation device to execute:
acquiring first position data indicating at least one of an accent position and a break position from input speech data;
comparing second position data, which indicates at least one of an accent position assigned to morpheme data including a plurality of morphemes generated from text data corresponding to the speech data and a break position between the plurality of morphemes, with the first position data acquired from the speech data; and
when the first and second position data do not match, assigning the first position data to the morpheme data in place of the second position data.

A speech database creation device comprising:
the speech information creation device according to any one of claims 1 to 6; and
a registration processing unit that executes a process of extracting phoneme data from the speech data and registering, in a speech database, the phoneme data, a phoneme label representing the phoneme data, and the accent information assigned by the speech information creation device to the morpheme corresponding to the phoneme data.
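The registration step can be illustrated with a small in-memory SQLite table. This is only a sketch under stated assumptions: the claim does not describe the database layout, so the `phoneme` table, its columns, and the functions `create_db` and `register` are hypothetical, and phoneme extraction itself is left out.

```python
import sqlite3
from typing import Iterable, Optional, Tuple

# One entry per phoneme: (phoneme label, raw phoneme data, accent information).
Entry = Tuple[str, bytes, Optional[int]]


def create_db() -> sqlite3.Connection:
    """Create an in-memory database with a hypothetical phoneme table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE phoneme ("
        "id INTEGER PRIMARY KEY, "
        "label TEXT NOT NULL, "
        "data BLOB NOT NULL, "
        "accent INTEGER)"
    )
    return conn


def register(conn: sqlite3.Connection, entries: Iterable[Entry]) -> None:
    """Store phoneme data with its label and accent information in one transaction."""
    with conn:  # commits on success, rolls back on error
        conn.executemany(
            "INSERT INTO phoneme (label, data, accent) VALUES (?, ?, ?)",
            entries,
        )
```

A caller would pass the accent values produced by the speech information creation device, for example `register(conn, [("a", b"\x00\x01", 1)])`, so that the stored accent reflects the speech rather than the text-only prediction.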