JP2001075776A

JP2001075776A - Device and method for recording voice

Info

Publication number: JP2001075776A
Application number: JP24920799A
Authority: JP
Inventors: Toshiaki Eguri; 俊明殖栗; Takashi Aso; 隆麻生
Original assignee: Canon Inc
Current assignee: Canon Inc
Priority date: 1999-09-02
Filing date: 1999-09-02
Publication date: 2001-03-23

Abstract

PROBLEM TO BE SOLVED: To obtain a voice recording device that allows the speed at which a speaker reads aloud a sentence to be recorded to be adjusted, by determining the phonation time of the sentence and notifying the speaker of the sentence together with information for reading it aloud in that phonation time. SOLUTION: A spoken and recorded sentence storage part 11 stores the sentences to be recorded. A spoken and recorded sentence extraction part 12 extracts the sentences from the storage part 11 one by one. A phonation time length calculation part 13 determines the phonation time length for reading aloud the one sentence taken out by the extraction part 12. A phonation speed representation display part 14 notifies the speaker, as an image or as sound, of information or the like that lets the speaker read the sentence aloud in the phonation time length calculated by the calculation part 13. The speaker's phonation speed can thus be controlled during voice recording, and as a result uniform data free of differences in data size and tone quality is generated from the recorded voice.

Description

DETAILED DESCRIPTION OF THE INVENTION

[0001]

[Technical field of the invention] The present invention relates to a voice recording device and a voice recording method, and in particular to a voice recording device and a voice recording method capable of adjusting the speed at which a speaker reads aloud a sentence to be recorded.

[0002]

[Prior art] Creating a speech unit dictionary or the like presupposes that speech uttered by humans has been recorded. In such voice recording, for example, a sentence to be recorded is presented to a speaker, and the speech uttered from that sentence is recorded.

[0003]

[Problem to be solved by the invention] However, the speed at which a sentence is read aloud differs from person to person, so when several speakers read the sentences to be recorded, each speaker utters them at his or her own utterance speed. As a result, the speech unit dictionaries for speech synthesis created after the respective recordings differ in data size and sound quality.

[0004] Moreover, even for a single speaker, the utterance speed is not constant but fluctuates, so the sound quality of the units within one speech unit dictionary also varies.

[0005] Accordingly, an object of the present invention is to provide a voice recording device and a voice recording method capable of adjusting the speed at which a speaker reads aloud a sentence to be recorded.

[0006]

[Means for solving the problem] According to the present invention, there is provided a voice recording device for recording the voice of a speaker reading a predetermined sentence aloud, the device comprising: means for determining an utterance time of the sentence; and means for notifying the speaker of the sentence and of information for reading the sentence aloud in time with the utterance time.

[0007] In the voice recording device of the present invention, the information may be the utterance time of the whole sentence and the utterance time of each predetermined unit in the sentence, displayed together with the sentence.

[0008] Alternatively, the information may be a mark, displayed alongside the sentence, that points to the predetermined units in the sentence in order on the basis of the utterance time.

[0009] According to the present invention, there is also provided a voice recording device for recording the voice of a speaker reading a predetermined sentence aloud, the device comprising: means for determining an utterance time of the sentence; and means for displaying or hiding the sentence in order, one predetermined unit at a time, in time with the utterance time.

[0010] According to the present invention, there is also provided a voice recording device for recording the voice of a speaker reading a predetermined sentence aloud, the device comprising: means for determining an utterance time of the sentence; and means for displaying the sentence and changing the color of the sentence, one predetermined unit at a time, in time with the utterance time.

[0011] In the voice recording device of the present invention, a syllable, a pause phrase, or an accent phrase constituting the sentence can be used as the unit.

[0012] According to the present invention, there is also provided a voice recording device for recording the voice of a speaker reading a predetermined sentence aloud, the device comprising: means for determining an utterance time of the sentence; and means for displaying the sentence while moving it continuously and displaying a fixed mark indicating the part of the sentence to be uttered.

[0013] According to the present invention, there is also provided a voice recording device for recording the voice of a speaker reading a predetermined sentence aloud, the device comprising: means for determining an utterance time of the sentence; and means for displaying the sentence and emitting a sound based on the utterance time.

[0014] According to the present invention, there is also provided a voice recording method for recording the voice of a speaker reading a predetermined sentence aloud, the method comprising: a step of determining an utterance time of the sentence; and a step of notifying the speaker of the sentence and of information for reading the sentence aloud in time with the utterance time.

[0015] According to the present invention, there is also provided a voice recording method for recording the voice of a speaker reading a predetermined sentence aloud, the method comprising: a step of determining an utterance time of the sentence; and a step of displaying or hiding the sentence in order, one predetermined unit at a time, in time with the utterance time.

[0016] According to the present invention, there is also provided a voice recording method for recording the voice of a speaker reading a predetermined sentence aloud, the method comprising: a step of determining an utterance time of the sentence; and a step of displaying the sentence and changing the color of the sentence, one predetermined unit at a time, in time with the utterance time.

[0017] According to the present invention, there is also provided a voice recording method for recording the voice of a speaker reading a predetermined sentence aloud, the method comprising: a step of determining an utterance time of the sentence; and a step of displaying the sentence while moving it continuously and displaying a fixed mark indicating the part of the sentence to be uttered.

[0018] According to the present invention, there is also provided a voice recording method for recording the voice of a speaker reading a predetermined sentence aloud, the method comprising: a step of determining an utterance time of the sentence; and a step of displaying the sentence and emitting a sound based on the utterance time.

[0019] According to the present invention, there is also provided a storage medium storing a program that, in order to record the voice of a speaker reading a predetermined sentence aloud, causes a computer to function as: means for determining an utterance time of the sentence; and means for notifying the speaker of the sentence and of information for reading the sentence aloud in time with the utterance time.

[0020] According to the present invention, there is also provided a storage medium storing a program that, in order to record the voice of a speaker reading a predetermined sentence aloud, causes a computer to function as: means for determining an utterance time of the sentence; and means for displaying or hiding the sentence in order, one predetermined unit at a time, in time with the utterance time.

[0021] According to the present invention, there is also provided a storage medium storing a program that, in order to record the voice of a speaker reading a predetermined sentence aloud, causes a computer to function as: means for determining an utterance time of the sentence; and means for displaying the sentence and changing the color of the sentence, one predetermined unit at a time, in time with the utterance time.

[0022] According to the present invention, there is also provided the storage medium according to any one of claims 18 to 21, wherein the unit is a syllable, a pause phrase, or an accent phrase constituting the sentence.

[0023] According to the present invention, there is also provided a storage medium storing a program that, in order to record the voice of a speaker reading a predetermined sentence aloud, causes a computer to function as: means for determining an utterance time of the sentence; and means for displaying the sentence while moving it continuously and displaying a fixed mark indicating the part of the sentence to be uttered.

[0024] According to the present invention, there is also provided a storage medium storing a program that, in order to record the voice of a speaker reading a predetermined sentence aloud, causes a computer to function as: means for determining an utterance time of the sentence; and means for displaying the sentence and emitting a sound based on the utterance time.

[0025]

[Embodiments of the invention] Preferred embodiments of the present invention are described in detail below with reference to the drawings.

[0026] FIG. 1 is a block diagram showing, of the configuration of a voice recording device according to one embodiment of the present invention, the configuration of its characteristic portion in particular.

[0027] In FIG. 1, reference numeral 11 denotes an utterance recorded sentence storage unit, which stores a plurality of sentences to be recorded. Reference numeral 12 denotes an utterance recorded sentence extraction unit, which extracts the sentences to be recorded one at a time from the utterance recorded sentence storage unit 11.

[0028] Reference numeral 13 denotes an utterance time length calculation unit, which determines the utterance time length for reading aloud the one sentence taken out by the utterance recorded sentence extraction unit 12.

[0029] The utterance time length can be determined by applying morphological analysis and pause generation processing to the sentence to be recorded, and then deriving a time length suitable for uttering it from the time length per mora and the time length per pause.

[0030] Here, a mora refers to a syllable that forms a word; long vowels, geminate consonants (sokuon), and the moraic nasal (hatsuon) are each counted as one mora. For example, 「コーヒー」 (coffee) is 4 morae and 「システムキッチン」 (system kitchen) is 8 morae. Morphological analysis determines the readings of the kanji in a sentence written in mixed kanji and kana and identifies the morphemes, which are groupings of characters. Pause generation processing inserts breathing points partway through a sentence so that it can be uttered naturally, because trying to utter a long sentence in one go without taking a breath results in rushed or monotonous speech.
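The mora counts in this paragraph can be reproduced with a small counter. This is a minimal sketch rather than the method of the specification: it assumes the reading is already written in kana, and the names `SMALL_KANA`, `is_kana`, and `count_morae` are hypothetical. Small kana such as ょ are assumed to merge with the preceding kana into one mora, while the long-vowel mark ー, the geminate marker ッ, and ン each count as one mora, as the text describes.

```python
# Small kana that merge with the preceding kana into a single mora
# (for example きょ or シャ). Assumption: the input is kana-only text.
SMALL_KANA = set("ゃゅょぁぃぅぇぉゎャュョァィゥェォヮ")


def is_kana(ch: str) -> bool:
    """True for hiragana, katakana, and the long-vowel mark."""
    return (
        "\u3041" <= ch <= "\u3096"      # hiragana block
        or "\u30a1" <= ch <= "\u30fa"   # katakana block
        or ch == "ー"                   # long-vowel mark counts as one mora
    )


def count_morae(reading: str) -> int:
    """Count morae in a kana reading; punctuation and pause marks are ignored."""
    return sum(1 for ch in reading if is_kana(ch) and ch not in SMALL_KANA)
```

Under these assumptions, `count_morae("コーヒー")` gives 4 and `count_morae("システムキッチン")` gives 8, matching the examples in the text.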

[0031] For example, when the sentence to be recorded is written in mixed kanji and kana, such as 「今日はいい天気です。」 ("The weather is nice today."), it is not known how many morae the kanji represent or how many morae the sentence contains in total, so an utterance time length suitable for uttering it cannot be determined. Morphological analysis and pause generation processing are therefore applied to the mixed kanji-kana sentence.

[0032] Morphological analysis and pause generation processing yield the reading and pauses (○) 「きょうは○いい○てんきです。」, and the sentence is judged to consist of 10 morae and 2 pauses. Assuming 200 msec per mora and 300 msec per pause, the overall utterance time length suitable for uttering the sentence is determined to be 2.6 sec (10 × 200 msec + 2 × 300 msec).
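The arithmetic in this paragraph can be written as a one-line function. The sketch below is illustrative only; the function name `utterance_time_ms` and its parameter names are hypothetical, and the default values are simply the example figures from the text (200 msec per mora, 300 msec per pause).

```python
def utterance_time_ms(n_morae: int, n_pauses: int,
                      ms_per_mora: int = 200, ms_per_pause: int = 300) -> int:
    """Total utterance time in milliseconds for one sentence."""
    if n_morae < 0 or n_pauses < 0:
        raise ValueError("counts must be non-negative")
    return n_morae * ms_per_mora + n_pauses * ms_per_pause


# 10 morae and 2 pauses, as in the example reading with two pause marks.
total_ms = utterance_time_ms(10, 2)
print(total_ms / 1000)  # 2.6
```

Working in integer milliseconds avoids floating-point rounding when the per-unit times are later accumulated into a display schedule.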

[0033] Reference numeral 14 denotes an utterance speed expression display unit, which notifies the speaker, by image or by sound, of information or the like that lets the speaker read the sentence aloud in time with the utterance time length calculated by the utterance time length calculation unit 13.

[0034] The voice recording device including the blocks shown in FIG. 1 can be realized, for example, by the system shown in FIG. 2. In FIG. 2, reference numeral 21 denotes a CPU, which operates according to a program implementing the procedure described later. Reference numeral 22 denotes a RAM, which provides the storage area needed for the program to run. Reference numeral 23 denotes a ROM, which holds the program implementing the procedure described later. Reference numeral 24 denotes a disk device, which stores the program; the program is loaded into memory when used. Reference numeral 25 denotes a display device, which presents the utterance speed produced by the program as an image or as sound. Reference numeral 26 denotes a bus connecting the above components.

[0035] Next, the operation of the blocks shown in FIG. 1 is described with reference to the flowchart shown in FIG. 3. This embodiment assumes the voice recording performed when creating a speech unit dictionary for speech synthesis.

[0036] In step S31, the utterance recorded sentence extraction unit 12 extracts one sentence from the utterance recorded sentence storage unit 11, and the process proceeds to step S32.

[0037] In step S32, the utterance time length calculation unit 13 determines an utterance time length suitable for uttering the one sentence extracted in step S31, and the process proceeds to step S33.

[0038] In step S33, the sentence to be recorded and its utterance speed are displayed to the speaker as an image. In step S34, it is determined whether any unrecorded sentences remain in the utterance recorded sentence storage unit 11; if so, the process returns to step S31 and the above processing is repeated, and if not, the process ends.
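The loop formed by steps S31 to S34 can be sketched as follows. This is an illustration under stated assumptions, not the program of the specification: `run_session`, `calc_duration_ms`, and `show` are hypothetical names, the duration calculation and the display are passed in as callables, and the actual audio capture is left out because the flowchart text does not describe it.

```python
from typing import Callable, Iterable, List, Tuple


def run_session(sentences: Iterable[str],
                calc_duration_ms: Callable[[str], int],
                show: Callable[[str, int], None]) -> List[Tuple[str, int]]:
    """Walk through the S31-S34 loop and return the (sentence, duration) pairs shown."""
    log: List[Tuple[str, int]] = []
    # S31 takes out one sentence; S34 (any sentences left?) is the loop condition.
    for sentence in sentences:
        duration_ms = calc_duration_ms(sentence)  # S32: decide the utterance time length
        show(sentence, duration_ms)               # S33: present sentence and pacing
        log.append((sentence, duration_ms))
    return log
```

A caller could pass `lambda s: 250 * len(s)` as a stand-in for the duration calculation and `print` wrapped in a two-argument function as the display.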

[0039] Next, the processing in steps S32 and S33 is described taking as an example the case where the sentence to be recorded is 「システムキッチン」 (system kitchen).

[0040] In step S32, the time needed to utter this sentence is determined. With 0.25 seconds allotted per mora, the time length suitable for the utterance of its 8 morae is determined to be 2.0 seconds.

[0041] The speaker is therefore to read the sentence 「システムキッチン」 aloud at 0.25 seconds per mora, taking 2 seconds for the whole sentence.

[0042] In step S33, this result is displayed, for example, as shown in FIG. 4. In the display example of FIG. 4, the sentence to be recorded, 「システムキッチン」, is shown together with information for reading it aloud in time with its utterance time length: the time for reading the whole sentence (2.0 seconds) and the time for reading each character (0.25 seconds). The speaker can read the sentence aloud using these times as a guide.

[0043] Other display examples for step S33 are listed below.

[0044] In the display example of FIG. 5, an arrow 51 points to the syllable currently to be read aloud while moving to the right according to the calculated time length.
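A moving pointer such as the arrow of FIG. 5 needs to know which unit is current at a given moment. The sketch below assumes every unit receives the same time slot and that time is measured in integer milliseconds from the start of the sentence; the name `current_unit_index` is hypothetical and pauses are not modeled.

```python
from typing import Optional


def current_unit_index(elapsed_ms: int, n_units: int, ms_per_unit: int) -> Optional[int]:
    """Index of the unit the pointer should indicate, or None once the sentence is over."""
    if elapsed_ms < 0 or n_units < 0 or ms_per_unit <= 0:
        raise ValueError("elapsed_ms and n_units must be >= 0, ms_per_unit must be > 0")
    index = elapsed_ms // ms_per_unit  # each unit occupies one equal time slot
    return index if index < n_units else None
```

For the 8 morae of 「システムキッチン」 at 250 msec each, the pointer would sit on the first unit at 0 msec, on the second at 300 msec, and would have left the sentence at 2000 msec.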

[0045] In the display example of FIG. 6, an indicator 61 points to the syllable currently to be read aloud while extending to the right according to the calculated time length.

[0046] In the display example of FIG. 7, a sine-wave-shaped wave 71 points to the syllable currently to be read aloud while extending to the right according to the calculated time length. In the display example of FIG. 8, the sentence to be recorded is displayed one syllable at a time, in order, according to the calculated time length; by reading the syllables in the order in which they appear, the speaker reads the sentence aloud in the calculated time length. Alternatively, the whole sentence may be displayed first and then hidden one syllable at a time in order.
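The progressive display of FIG. 8 can be sketched as a function from elapsed time to the text currently visible. This is an assumption-laden illustration: `revealed_text` is a hypothetical name, every unit is given the same time slot, a unit is assumed to appear at the start of its slot, and only the "reveal" variant is covered, not the variant that hides units from a fully displayed sentence.

```python
from typing import Sequence


def revealed_text(units: Sequence[str], elapsed_ms: int, ms_per_unit: int) -> str:
    """Text visible at elapsed_ms when units appear one per time slot, in order."""
    if elapsed_ms < 0 or ms_per_unit <= 0:
        raise ValueError("elapsed_ms must be >= 0 and ms_per_unit must be > 0")
    # The unit whose slot has just started is already visible, hence the +1.
    shown = min(len(units), elapsed_ms // ms_per_unit + 1)
    return "".join(units[:shown])
```

Passing a list of pause phrases or accent phrases instead of single characters would give phrase-by-phrase display with the same function.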

[0047] In the display example of FIG. 9, the sentence to be recorded is displayed in color, and the color is changed one syllable at a time according to the calculated time length; by reading the sentence in order in step with the color change, the speaker reads it aloud in the calculated time length.

[0048] In the display example of FIG. 10, the sentence to be recorded is moved, according to the calculated time length, so that it flows like the display on an electronic message board. In FIG. 10, the sentence 「きょうはいいてんきです。」 moves to the left as a whole as time advances from time t to t+1. By reading aloud the characters as they pass the fixed mark 81, the speaker reads the sentence aloud in the calculated time length.
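For a scrolling display like FIG. 10, the position of the text follows directly from the elapsed time. The sketch below is hypothetical throughout: it assumes constant scroll speed, characters of equal width, one time slot per character, and no pauses, and `text_left_x`, `mark_x`, and `char_width` are names introduced only for illustration.

```python
def text_left_x(mark_x: float, char_width: float,
                elapsed_ms: float, ms_per_char: float) -> float:
    """X coordinate of the left edge of the text so the current character sits at the mark."""
    if ms_per_char <= 0:
        raise ValueError("ms_per_char must be positive")
    # The text starts with its first character at the mark and shifts left
    # by one character width per time slot.
    return mark_x - char_width * (elapsed_ms / ms_per_char)
```

With the mark at x = 100, characters 20 units wide, and 250 msec per character, the left edge would be at 100 at the start and at 60 after 500 msec.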

[0049] In this embodiment the display is adjusted in units of syllables, but it may instead be adjusted in units of pause phrases or accent phrases. For example, in the display example of FIG. 8 the utterance target is specified in units of syllables, so characters appear or disappear one at a time; the utterance target can instead be a pause phrase or an accent phrase.

[0050] Here, a pause phrase is a group of characters lying between breaks in a sentence, and an accent phrase is a group of characters containing an accent.

[0051] For example, given the sentence 「にほんたろうさんはげんきです。」 ("Nihon Taro is well."), the pause phrases and accent phrases are marked as 「にほん／たろうさんは／○げんきです。」 (／ marks an accent phrase, ○ marks a pause phrase), and the sentence contains one pause phrase and two accent phrases. A sentence contains one or more pause phrases, and a pause phrase contains one or more accent phrases.

[0052] Also, in this embodiment the information for reading the sentence aloud in time with the utterance time length is displayed as an image, but the notification need not be visual only: the utterance time length may be conveyed to the speaker by sound, for example through headphones or a loudspeaker, so that the guidance is auditory.

[0053] As described above, according to this embodiment, an indicator of the speed at which the sentence should be read aloud is conveyed to the speaker by image or by sound, so the speaker's utterance speed can be controlled during voice recording. As a result, uniform data with no differences in data size or sound quality is created from the recorded speech.

[0054] The present invention may be applied to a system composed of a plurality of devices or to an apparatus consisting of a single device. Needless to say, the invention is also achieved by supplying a system or apparatus with a recording medium on which the program code of software implementing the functions of the embodiment described above is recorded, and having a computer (or a CPU or MPU) of that system or apparatus read out and execute the program code stored on the recording medium.

[0055] In this case, the program code read from the recording medium itself implements the functions of the embodiment described above, and the recording medium on which that program code is recorded constitutes the present invention.

[0056] As the recording medium for supplying the program code, for example, a floppy disk, hard disk, optical disk, CD-ROM, CD-R, magnetic tape, nonvolatile memory card, or ROM can be used.

[0057] Needless to say, the invention covers not only the case where the functions of the embodiment are implemented by the computer executing the program code it has read, but also the case where an OS or the like running on the computer performs part or all of the actual processing on the basis of the instructions of the program code and the functions of the embodiment are implemented by that processing.

[0058] Furthermore, needless to say, the invention also covers the case where the program code read from the recording medium is written into a memory provided on a function expansion board inserted into the computer or in a function expansion unit connected to the computer, after which a CPU or the like provided on the function expansion board or in the function expansion unit performs part or all of the actual processing on the basis of the instructions of the program code, and the functions of the embodiment are implemented by that processing.

[0059]

[Effects of the invention] As described above, according to the present invention, the speed at which a speaker reads aloud a sentence to be recorded can be adjusted.

[Brief description of the drawings]

FIG. 1 is a block diagram showing the configuration of the characteristic portion of a voice recording device according to one embodiment of the present invention.

FIG. 2 is a block diagram showing a configuration example for implementing the voice recording device according to one embodiment of the present invention.

FIG. 3 is a flowchart showing the processing procedure of voice recording.

FIG. 4 is a diagram showing a display example in the processing of step S33 in FIG. 3.

FIG. 5 is a diagram showing a display example in the processing of step S33 in FIG. 3.

FIG. 6 is a diagram showing a display example in the processing of step S33 in FIG. 3.

FIG. 7 is a diagram showing a display example in the processing of step S33 in FIG. 3.

FIG. 8 is a diagram showing a display example in the processing of step S33 in FIG. 3.

FIG. 9 is a diagram showing a display example in the processing of step S33 in FIG. 3.

FIG. 10 is a diagram showing a display example in the processing of step S33 in FIG. 3.

[Explanation of symbols]

11 utterance recorded sentence storage unit
12 utterance recorded sentence extraction unit
13 utterance time length calculation unit
14 utterance speed expression display unit
21 CPU
22 RAM
23 ROM
24 disk device
25 display device
26 bus

Claims

[Claims]

1. A voice recording device for recording the voice of a speaker reading a predetermined sentence aloud, comprising: means for determining an utterance time of the sentence; and means for notifying the speaker of the sentence and of information for reading the sentence aloud in time with the utterance time.

2. The voice recording device according to claim 1, wherein the information is the utterance time of the whole sentence and the utterance time of each predetermined unit in the sentence, displayed together with the sentence.

3. The voice recording device according to claim 1, wherein the information is a mark, displayed alongside the sentence, that points to the predetermined units in the sentence in order on the basis of the utterance time.

4. A voice recording device for recording the voice of a speaker reading a predetermined sentence aloud, comprising: means for determining an utterance time of the sentence; and means for displaying or hiding the sentence in order, one predetermined unit at a time, in time with the utterance time.

5. A voice recording device for recording the voice of a speaker reading a predetermined sentence aloud, comprising: means for determining an utterance time of the sentence; and means for displaying the sentence and changing the color of the sentence, one predetermined unit at a time, in time with the utterance time.

6. The voice recording device according to claim 2, wherein the unit is a syllable, a pause phrase, or an accent phrase constituting the sentence.

7. A voice recording device for recording the voice of a speaker reading a predetermined sentence aloud, comprising: means for determining an utterance time of the sentence; and means for displaying the sentence while moving it continuously and displaying a fixed mark indicating the part of the sentence to be uttered.

8. A voice recording device for recording the voice of a speaker reading a predetermined sentence aloud, comprising: means for determining an utterance time of the sentence; and means for displaying the sentence and emitting a sound based on the utterance time.

9. A voice recording method for recording the voice of a speaker reading a predetermined sentence aloud, comprising: a step of determining an utterance time of the sentence; and a step of notifying the speaker of the sentence and of information for reading the sentence aloud in time with the utterance time.

10. The voice recording method according to claim 9, wherein the information is the utterance time of the whole sentence and the utterance time of each predetermined unit in the sentence, displayed together with the sentence.

11. The voice recording method according to claim 9, wherein the information is a mark, displayed alongside the sentence, that points to the predetermined units in the sentence in order on the basis of the utterance time.

12. A voice recording method for recording the voice of a speaker reading a predetermined sentence aloud, comprising: a step of determining an utterance time of the sentence; and a step of displaying or hiding the sentence in order, one predetermined unit at a time, in time with the utterance time.

13. A voice recording method for recording the voice of a speaker reading a predetermined sentence aloud, comprising: a step of determining an utterance time of the sentence; and a step of displaying the sentence and changing the color of the sentence, one predetermined unit at a time, in time with the utterance time.

14. The voice recording method according to any one of claims 10 to 13, wherein the unit is a syllable, a pause phrase, or an accent phrase constituting the sentence.

15. A voice recording method for recording the voice of a speaker reading a predetermined sentence aloud, comprising: a step of determining an utterance time of the sentence; and a step of displaying the sentence while moving it continuously and displaying a fixed mark indicating the part of the sentence to be uttered.

16. A voice recording method for recording the voice of a speaker reading a predetermined sentence aloud, comprising: a step of determining an utterance time of the sentence; and a step of displaying the sentence and emitting a sound based on the utterance time.

17. A storage medium storing a program that, in order to record the voice of a speaker reading a predetermined sentence aloud, causes a computer to function as: means for determining an utterance time of the sentence; and means for notifying the speaker of the sentence and of information for reading the sentence aloud in time with the utterance time.

18. The storage medium according to claim 17, wherein the information is the utterance time of the whole sentence and the utterance time of each predetermined unit in the sentence, displayed together with the sentence.

19. The storage medium according to claim 17, wherein the information is a mark, displayed alongside the sentence, that points to the predetermined units in the sentence in order on the basis of the utterance time.

20. A storage medium storing a program that, in order to record the voice of a speaker reading a predetermined sentence aloud, causes a computer to function as: means for determining an utterance time of the sentence; and means for displaying or hiding the sentence in order, one predetermined unit at a time, in time with the utterance time.

21. A storage medium storing a program that, in order to record the voice of a speaker reading a predetermined sentence aloud, causes a computer to function as: means for determining an utterance time of the sentence; and means for displaying the sentence and changing the color of the sentence, one predetermined unit at a time, in time with the utterance time.

22. The storage medium according to any one of claims 18 to 21, wherein the unit is a syllable, a pause phrase, or an accent phrase constituting the sentence.

23. A storage medium storing a program that, in order to record the voice of a speaker reading a predetermined sentence aloud, causes a computer to function as: means for determining an utterance time of the sentence; and means for displaying the sentence while moving it continuously and displaying a fixed mark indicating the part of the sentence to be uttered.

24. A storage medium storing a program that, in order to record the voice of a speaker reading a predetermined sentence aloud, causes a computer to function as: means for determining an utterance time of the sentence; and means for displaying the sentence and emitting a sound based on the utterance time.