JPH11344996A

JPH11344996A - Pronunciation document creating device, pronunciation document creating method and recording medium readable by computer in which program to make computer execute the method is recorded

Info

Publication number: JPH11344996A
Application number: JP10154279A
Authority: JP
Inventors: Nobuhide Yamazaki; 信英山崎
Original assignee: JustSystems Corp
Current assignee: JustSystems Corp
Priority date: 1998-06-03
Filing date: 1998-06-03
Publication date: 1999-12-14

Abstract

PROBLEM TO BE SOLVED: To make it possible to create and edit a pronunciation document that includes pauses and prosodic patterns, based on the converted accent, accent information, and part-of-speech information, by converting character information into accent information and part-of-speech information. SOLUTION: A pronunciation document creating device is provided with an input part 301 for inputting a character string; a converting part 302 for converting the input character string into the accent corresponding to its reading, together with accent information and part-of-speech information; an accent phrase generating part 303 for generating accent phrases based on the converted accent, the accent information, and the part of speech; a pause setting part 304 for setting pause information, such as at which position between the generated accent phrases a silent section (pause) is inserted and how long it is; and a prosodic pattern generating part 305 for generating prosodic patterns, such as the pitch pattern and the duration of each syllable, for each sentence composed of the accent phrases for which the pause information has been set.

Description

DETAILED DESCRIPTION OF THE INVENTION

[0001]

[Technical Field of the Invention] The present invention relates to a pronunciation document creating device that converts the character strings making up a document into accents corresponding to the reading of each character string and creates a pronunciation document to which pauses and prosodic patterns are further added, to a pronunciation document creating method, and to a computer-readable recording medium on which a program for causing a computer to execute the method is recorded.

[0002]

[Prior Art] Using character information is known as one method of transmitting and storing information. In recent years, a common way of doing so is for a document creator to create a desired document with a document creation device such as a Japanese or English word processor, or with a personal computer that has word processor functions, and then to transfer the created document over a network or store it on a recording medium such as a magnetic disk or an optical disk. This owes much to document creation devices themselves becoming more capable and less expensive as computer-related technology has developed, and to changes in the working environment brought about by the promotion of paperless offices, the build-out of communication networks, and the spread of e-mail.

[0003] Other known methods used for transmitting and storing information include a method using audio information and a method using audio information together with video information. In the method using audio information, for example, information is transmitted by sending the audio information directly over a telephone or the like, and is stored by recording it on tape or the like with a recording device. In the method using audio and video information, information is transmitted by sending the audio and video information with a communication device that has a monitor and a speaker, and is stored on videotape, optical disk, or the like with a recording device such as a video recorder.

[0004] Of the methods of transmitting and storing information described above, the method using character information is the most versatile and the most widely used because, compared with the other methods, it involves a small amount of data, the information is easy to edit, and it can be used on a computer as digital information.

[0005]

[Problems to Be Solved by the Invention] However, in the conventional method using character information described above, the created document is limited to visual linguistic information (that is, written-language information), so expressions such as emotion, which are non-linguistic information, could not be added as information. In linguistic information conveyed by voice (that is, spoken-language information), by contrast, expressions such as emotion have been added as information by changing the manner of speaking, for example the accent, the loudness (volume), and the pitch of the voice.

[0006] Furthermore, the prior art has not provided any device or method for creating information that combines the two forms of expression, character information and audio information, in a consistent manner.

[0007] In addition, audio information is generally edited by ear (that is, by listening to the reproduced audio information). Because the desired position in the audio information has to be located by playing it back each time, the work is complicated and tedious.

[0008] It is also possible to synthesize speech from a text document (that is, from character information) by using text-to-speech synthesis, one of the conventional speech synthesis techniques. However, speech synthesis from text has the problem that proper nouns not in the dictionary are misread or pronounced with the wrong accent. Further problems are that emotion and other non-linguistic information cannot be expressed, and that speech cannot be synthesized accurately in the manner of speaking the document creator intends.

[0009] The present invention has been made in view of the above. Its object is to make it possible, by converting character information into accent information and part-of-speech information, to create and edit a pronunciation document that includes pauses and prosodic patterns based on the converted accent, accent information, and part-of-speech information.

[0010]

[Means for Solving the Problems] To solve the problems described above and achieve the object, the pronunciation document creating device according to claim 1 is characterized by comprising: input means for inputting a character string; conversion means for converting the character string input by the input means into an accent corresponding to its reading, together with accent information and part-of-speech information; accent phrase generation means for generating accent phrases based on the accent, the accent information, and the part of speech converted by the conversion means; pause setting means for setting pause information, such as at which position between the plurality of accent phrases generated by the accent phrase generation means a silent section (pause) is to be inserted and for how long; and prosodic pattern generation means for generating prosodic patterns, such as the pitch pattern and the duration of each syllable, for each sentence composed of the plurality of accent phrases for which the pause information has been set by the pause setting means.

[0011] According to the invention of claim 1, by inputting a character string and converting it at accent phrase boundaries, a pronunciation document for accurately synthesizing speech in the manner of speaking the document creator intends can be created without inputting natural speech, which improves the efficiency and convenience of creating pronunciation documents.

[0012] The pronunciation document creating device according to claim 2 is characterized in that, in the invention of claim 1, it comprises display means for displaying the accent phrases generated by the accent phrase generation means and/or the pause information set by the pause setting means and/or the prosodic patterns generated by the prosodic pattern generation means.

[0013] According to the invention of claim 2, the pronunciation document is displayed, so its contents can be checked easily, which improves the efficiency and convenience of creating pronunciation documents.

[0014] The pronunciation document creating device according to claim 3 is characterized in that, in the invention of claim 1 or 2, it further comprises: speech synthesis means for synthesizing speech using the accent phrases generated by the accent phrase generation means, the pause information set by the pause setting means, and the prosodic patterns generated by the prosodic pattern generation means; and speech output means for outputting the speech synthesized by the speech synthesis means.

[0015] According to the invention of claim 3, the pronunciation data of the pronunciation document is synthesized into speech and output, so the contents of the pronunciation document can be played back easily, which improves the efficiency and convenience of creating pronunciation documents.

[0016] The pronunciation document creating device according to claim 4 is characterized in that, in the invention of any one of claims 1 to 3, it further comprises editing means for editing the accent phrases generated by the accent phrase generation means and/or the pause information set by the pause setting means and/or the prosodic patterns generated by the prosodic pattern generation means.

[0017] According to the invention of claim 4, the pronunciation document can be edited, so a pronunciation document closer to the manner of speaking the creator intends can be created easily, which improves the efficiency and convenience of creating pronunciation documents.

[0018] The pronunciation document creating method according to claim 5 is characterized by including: a first step of inputting a character string; a second step of converting the character string input in the first step into an accent corresponding to its reading, together with accent information and part-of-speech information; a third step of generating accent phrases based on the accent, the accent information, and the part of speech converted in the second step; a fourth step of setting pause information, such as at which position between the plurality of accent phrases generated in the third step a silent section (pause) is to be inserted and for how long; and a fifth step of generating prosodic patterns, such as the pitch pattern and the duration of each syllable, for each sentence composed of the plurality of accent phrases for which the pause information has been set in the fourth step.

[0019] According to the invention of claim 5, by inputting a character string and converting it at accent phrase boundaries, a pronunciation document for accurately synthesizing speech in the manner of speaking the document creator intends can be created without inputting natural speech, which improves the efficiency and convenience of creating pronunciation documents.

[0020] The pronunciation document creating method according to claim 6 is characterized in that, in the invention of claim 5, it further includes a sixth step of displaying the accent phrases generated in the third step and/or the pause information set in the fourth step and/or the prosodic patterns generated in the fifth step.

[0021] According to the invention of claim 6, the pronunciation document is displayed, so its contents can be checked easily, which improves the efficiency and convenience of creating pronunciation documents.

[0022] The pronunciation document creating method according to claim 7 is characterized in that, in the invention of claim 5 or 6, it further includes: a seventh step of synthesizing speech using the accent phrases generated in the third step, the pause information set in the fourth step, and the prosodic patterns generated in the fifth step; and an eighth step of outputting the speech synthesized in the seventh step.

[0023] According to the invention of claim 7, the pronunciation document is displayed, so its contents can be checked easily, which improves the efficiency and convenience of creating pronunciation documents.

[0024] The pronunciation document creating device according to claim 8 is characterized in that, in the invention of any one of claims 5 to 7, it further includes a ninth step of editing the accent phrases generated in the third step and/or the pause information set in the fourth step and/or the prosodic patterns generated in the fifth step.

[0025] According to the invention of claim 8, the pronunciation document can be edited, so a pronunciation document closer to the manner of speaking the creator intends can be created easily, which improves the efficiency and convenience of creating pronunciation documents.

[0026] The storage medium according to the invention of claim 9 has recorded on it a program for causing a computer to execute the method according to any one of claims 5 to 8, so that the program is machine-readable; the operation of any one of claims 5 to 8 can thereby be realized by a computer.

[0027]

[Embodiments of the Invention] Preferred embodiments of the pronunciation document creating device, the pronunciation document creating method, and the computer-readable recording medium on which a program for causing a computer to execute the method is recorded, all according to the present invention, are described in detail below.

[0028] (Embodiment 1) FIG. 1 is a block diagram showing the hardware configuration of the pronunciation document creating device 100 of Embodiment 1. The pronunciation document creating device 100 comprises a control unit 101, an application storage unit 102, a conversion dictionary 103, a prosodic pattern model storage unit 104, a voice timbre storage unit 105, a key input device 106, a display device 107, a microphone 108, a speaker 109, a pronunciation document storage unit 110, an interface (I/F) 111, a floppy disk drive (FD drive) 112, a CD-ROM drive 113, and a communication unit 114.

[0029] The control unit 101 is a central processing unit that controls the above units, which are connected to the bus BS, and comprises a CPU 101a, a ROM 101b, a RAM 101c, and so on. The CPU 101a operates according to the OS (operating system) program stored in the ROM 101b and the application programs stored in the application storage unit 102. The ROM 101b is a memory that stores the OS program, and the RAM 101c is a memory used as a work area for the various programs.

[0030] The application storage unit 102 stores various application programs, such as a reading-to-accent conversion application that implements the function of converting readings into accents, a pause setting application that implements the function of setting pauses, and a prosodic pattern generation application that implements the function of generating prosodic patterns. The pronunciation document creating device 100 of Embodiment 1 also has a kana-kanji conversion function, and the kana-kanji conversion application that implements this function is likewise stored in the application storage unit 102.

[0031] The conversion dictionary 103 is a dictionary in which readings of character strings (words) are stored in association with the corresponding character strings containing kanji. It is likewise a database-style dictionary in which readings of character strings (words) are stored in association with accents, accent information, and part-of-speech information.
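As a rough illustration of the two associations described for the conversion dictionary 103, the following Python sketch keys each entry by its kana reading. The class name, field names, and kanji surface forms are assumptions made for illustration; only the accent types and parts of speech follow the FIG. 4 example given later in the description.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class DictionaryEntry:
    reading: str                # kana reading, used as the lookup key
    surface: str                # corresponding string containing kanji (assumed values)
    accent: str                 # pronunciation notation the text calls the "accent"
    accent_type: Optional[int]  # accent information; None when none is stored
    part_of_speech: str


# Hypothetical contents; the real dictionary layout is not given in the source.
CONVERSION_DICTIONARY = {
    "かぶしき": DictionaryEntry("かぶしき", "株式", "カブシキ", 2, "noun"),
    "がいしゃ": DictionaryEntry("がいしゃ", "会社", "ガイシャ", 0, "noun"),
    "の": DictionaryEntry("の", "の", "ノ", None, "particle"),
}


def look_up(reading: str) -> Optional[DictionaryEntry]:
    """Return the entry stored for a reading, or None if it is not registered."""
    return CONVERSION_DICTIONARY.get(reading)
```

A single lookup by reading then yields both the kanji string for kana-kanji conversion and the accent, accent information, and part of speech used in the later steps.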

[0032] The prosodic pattern model storage unit 104 is a memory in which prosodic pattern models are stored in advance in database form. The contents of the prosodic patterns stored in the prosodic pattern model storage unit 104 are described later.

[0033] The voice timbre storage unit 105 selectably stores, for each type of voice timbre, timbre data representing acoustic parameters for each segment unit, such as an accent. Timbre data and the like can be added to the timbre storage unit 105 via a communication line or a storage medium such as the FD 112a or the CD-ROM 113a, and can be deleted by key operations on the key input device 106.

[0034] The key input device 106 comprises input devices such as a keyboard and a mouse and is used for various operations such as inputting character strings, specifying playback of a pronunciation document, and creating and registering pronunciation documents. The key input device 106 also has a conversion key for converting an input character string into a character string containing kanji.

[0035] The display device 107 consists of a liquid crystal display or a CRT display and is used for displaying character strings, pronunciation documents, various messages, and so on.

[0036] The microphone 108 is used to sample the original natural voice that serves as the source speech waveform data used, for example, when creating prosodic pattern models.

[0037] The speaker 109 is used for reproducing and outputting the speech synthesized by the speech synthesis unit 105 and for reproducing various sounds.

[0038] The pronunciation document storage unit 110 is a memory that stores created pronunciation documents. As described in detail later, a pronunciation document is document data corresponding to an input character string and containing data on accent phrases, data on pauses, data on prosodic patterns, and so on.
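One way to picture such a pronunciation document is as a record that bundles the input text with its accent phrases, pauses, and prosodic patterns. The sketch below is only an assumption about a possible layout: the class, the field names, and the JSON serialization are hypothetical and are not taken from the source.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class PronunciationDocument:
    text: str                                            # input character string
    accent_phrases: list = field(default_factory=list)   # e.g. katakana strings
    pauses: list = field(default_factory=list)           # [{"after_phrase": 0, "ms": 300}]
    prosody: dict = field(default_factory=dict)          # e.g. pitch and duration lists

    def to_json(self) -> str:
        """Serialize the document, keeping kana readable in the output."""
        return json.dumps(asdict(self), ensure_ascii=False)

    @classmethod
    def from_json(cls, data: str) -> "PronunciationDocument":
        """Rebuild a document from the string produced by to_json()."""
        return cls(**json.loads(data))
```

A text serialization of this kind would let a stored document be written to and read back from a removable medium, which is the sort of use the storage unit and FD drive suggest.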

[0039] The I/F 111 is a unit that transfers data between the bus BS and the FD drive 112 and the CD-ROM drive 113. The FD drive 112 reads and writes data on a removable FD 112a (recording medium) loaded into it. The CD-ROM drive 113 reads data from a removable CD-ROM 113a (recording medium) loaded into it. A pronunciation document stored in the pronunciation document storage unit 110 can also be saved to the FD 112a via the I/F 111 and the FD drive 112.

[0040] The communication unit 114 is connected to a communication line and communicates with external devices over that line.

[0041] Embodiment 1 is described using the case where character strings are input via the key input device 106 as an example, but the invention is not limited to this. A handwriting input device may be connected so that handwritten characters are recognized (character recognition) and input as character strings, and character strings may also be input from a previously created word processor document or the like.

[0042] FIG. 2 is an external view of the pronunciation document creating device 100 of Embodiment 1. As shown, a personal computer equipped with the microphone 108 and the speaker 109 can be used as the hardware configuration.

[0043] Next, the configuration of the pronunciation document creating device 100 of Embodiment 1 is described in functional terms. FIG. 3 is a functional block diagram showing the functional configuration of the pronunciation document creating device 100 of Embodiment 1. In FIG. 3, the pronunciation document creating device 100 comprises an input unit 301, a conversion unit 302, an accent phrase generation unit 303, a pause setting unit 304, a prosodic pattern generation unit 305, a display unit 306, a speech synthesis unit 307, and a speech output unit 308.

[0044] The input unit 301 inputs a kana character string. Specifically, it is realized, for example, by inputting kana characters through the key input device 106 or the communication unit 114.

[0045] The conversion unit 302 converts the kana character string input by the input unit 301 into a character string containing the kanji corresponding to its reading and also into the accent corresponding to that reading, attaching accent information and part-of-speech information to the accent in the process. Specifically, the conversion is triggered, for example, by pressing a conversion key (not shown) provided on the key input device 106.

[0046] The conversion unit 302 converts the reading of the particle 『は』 or 『へ』 into the accent 『ワ』 or 『エ』 on the basis of the part-of-speech information. The conversion unit 302 also has a function for converting readings to long vowels. For example, when 『がっこう』 is input, it is converted into 『ガッコー』, an accent closer to a natural manner of speaking.
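The two conversions just described (particle 『は』/『へ』 to 『ワ』/『エ』, and long-vowel conversion such as 『がっこう』 to 『ガッコー』) can be sketched as follows. This is a minimal sketch under stated assumptions, not the device's actual procedure: the function names are hypothetical, and the long-vowel rule is reduced to a single case (an o-row kana followed by ウ), which is all the source example requires.

```python
# Particle readings that change pronunciation when the part of speech is "particle".
PARTICLE_ACCENT = {"は": "ワ", "へ": "エ"}

# Katakana in the o-row; a following ウ is written as a long-vowel mark (assumed rule).
O_ROW = set("オコソトノホモヨロゴゾドボポョ")


def to_katakana(text: str) -> str:
    """Map hiragana (U+3041..U+3096) to katakana by adding the 0x60 block offset."""
    return "".join(chr(ord(c) + 0x60) if "ぁ" <= c <= "ゖ" else c for c in text)


def lengthen(katakana: str) -> str:
    """Replace ウ after an o-row kana with ー (simplified long-vowel conversion)."""
    out = []
    for ch in katakana:
        if ch == "ウ" and out and out[-1] in O_ROW:
            out.append("ー")
        else:
            out.append(ch)
    return "".join(out)


def to_accent(reading: str, part_of_speech: str) -> str:
    """Convert a kana reading into its pronunciation notation."""
    if part_of_speech == "particle" and reading in PARTICLE_ACCENT:
        return PARTICLE_ACCENT[reading]
    return lengthen(to_katakana(reading))
```

The part-of-speech check matters because the same kana is pronounced differently elsewhere: は inside an ordinary word stays ハ. The long-vowel rule here ignores word-internal boundaries, so a real implementation would need dictionary information to avoid lengthening where the ウ is pronounced separately.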

[0047] The accent phrase generation unit 303 generates an accent phrase by joining a plurality of accents on the basis of the readings of the words in the character string input by the input unit 301 and the accent information and part of speech converted by the conversion unit 302. The accent information includes, for example, the accent type. Accent combination attributes and the like may also be included in the accent information.

[0048] FIG. 4 is an explanatory diagram showing the generation of an accent phrase in the pronunciation document creating device of Embodiment 1. In FIG. 4, for the input character string 『かぶしき』, the conversion dictionary stores the accent corresponding to the reading, "type 2 accent" as the accent information, and "noun" as the part-of-speech information. Upon an operation such as pressing the conversion key, the conversion unit 302 attaches the accent information (accent type) "type 2 accent" and the part-of-speech information "noun" to the input character string 『かぶしき』. Similarly, the conversion unit 302 attaches the accent information (accent type) "type 0 accent" and the part-of-speech information "noun" to the input character string 『がいしゃ』. Likewise, for the input character string 『の』, the conversion unit 302 attaches no accent information in particular and attaches only the part-of-speech information "particle".

[0049] The accent phrase generation unit 303 joins the converted accents on the basis of the accent information and the part-of-speech information, and thereby generates the accent phrase 『カブシキガ’イシャノ』 from the input character strings 『かぶしき』, 『がいしゃ』, and 『の』.
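The source does not state the rule by which the accents are joined, so the sketch below substitutes a simplified compound-noun rule as an explicit assumption: when a noun follows another noun, the pitch fall moves to just after the first mora of the later noun, and particles are appended without moving it. This rule was chosen only because it reproduces the FIG. 4 result; the class and function names are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional

NUCLEUS_MARK = "\u2019"          # the mark shown in the source example
SMALL_KANA = set("ャュョァィゥェォ")  # small kana belong to the preceding mora


@dataclass
class Word:
    accent: str                  # katakana pronunciation
    accent_type: Optional[int]   # None when no accent information is attached
    part_of_speech: str


def split_morae(kana: str) -> list:
    """Split katakana into morae, attaching small kana to the previous mora."""
    morae = []
    for ch in kana:
        if ch in SMALL_KANA and morae:
            morae[-1] += ch
        else:
            morae.append(ch)
    return morae


def build_accent_phrase(words: list) -> str:
    """Join word accents into one accent phrase (assumed compound rule)."""
    morae, nucleus, nouns_seen = [], 0, 0   # nucleus 0 means no pitch fall
    for word in words:
        if word.part_of_speech == "noun":
            nouns_seen += 1
            # First noun keeps its own type; later nouns pull the fall onto their first mora.
            nucleus = (word.accent_type or 0) if nouns_seen == 1 else len(morae) + 1
        morae.extend(split_morae(word.accent))
    if nucleus == 0:
        return "".join(morae)
    return "".join(morae[:nucleus]) + NUCLEUS_MARK + "".join(morae[nucleus:])
```

With the three FIG. 4 inputs (カブシキ as a type 2 noun, ガイシャ as a type 0 noun, ノ as a particle), the fall lands after the fifth mora, giving the phrase quoted in the text. Actual Japanese accent sandhi depends on the length and type of each element, which is presumably what the accent combination attributes mentioned earlier are for.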

[0050] The pause setting unit 304 sets pause information, such as at which position between the plurality of accent phrases generated by the accent phrase generation unit 303 a silent section (pause) is to be inserted and for how long. Pause information is set, that is, pauses are inserted, manually. Specifically, this is realized, for example, by inserting a pause of the desired length at the desired position in the accent phrases displayed on the display device 107. Besides manual insertion, standard pauses and the like can also be inserted automatically according to predetermined conditions. Inserted pauses are also displayed on the display device 107.
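The pause information described here amounts to a position between two accent phrases plus a length. The sketch below models that, with one method for manual insertion and one for automatic insertion. All names are hypothetical, the millisecond unit is an assumption, and the "predetermined condition" is left to the caller because the source does not specify it.

```python
from dataclasses import dataclass, field


@dataclass
class PauseSettings:
    phrases: list                               # accent phrases in sentence order
    pauses: dict = field(default_factory=dict)  # boundary index -> length in ms

    def insert_pause(self, boundary: int, duration_ms: int) -> None:
        """Manually set a pause; boundary i lies between phrases[i] and phrases[i + 1]."""
        if not 0 <= boundary < len(self.phrases) - 1:
            raise ValueError("a pause must lie between two accent phrases")
        if duration_ms <= 0:
            raise ValueError("pause length must be positive")
        self.pauses[boundary] = duration_ms

    def insert_automatic(self, condition, duration_ms: int) -> None:
        """Add a standard pause wherever condition(left, right) holds and none is set yet."""
        for i in range(len(self.phrases) - 1):
            if i not in self.pauses and condition(self.phrases[i], self.phrases[i + 1]):
                self.pauses[i] = duration_ms

    def render(self) -> str:
        """Return a display string showing the phrases with their pauses."""
        parts = []
        for i, phrase in enumerate(self.phrases):
            parts.append(phrase)
            if i in self.pauses:
                parts.append(f"<pause {self.pauses[i]}ms>")
        return " ".join(parts)
```

Automatic insertion skips boundaries that already have a pause, so a manually chosen length is not overwritten by the standard one.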

[0051] The prosodic pattern generation unit 305 generates prosodic patterns, such as the pitch pattern and the duration of each syllable, for each sentence composed of the plurality of accent phrases for which the pause information has been set by the pause setting unit 304. Specifically, it generates a prosodic pattern by determining the pitch pattern and the duration of each syllable sentence by sentence.

【００５２】[0052] One conceivable method of generating the prosody pattern uses a plurality of types of prosody pattern models stored in advance in the prosody pattern model storage unit 104. The prosody pattern can be generated by extracting the most suitable model from among these models according to conditions such as the accent information, the accent position, and the number of accent phrases. Alternatively, the operator may select a desired model, and the prosody pattern may be generated using that model. Furthermore, the prosody pattern may be generated by using the above conditions to select from prosody patterns held in a database.
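The model selection described above can be sketched as follows. The specification does not say how the "most suitable" model is determined, so the distance measure used here (the sum of the differences in accent type and in the number of accent phrases) is purely an assumption, as are the `ProsodyModel` record and its fields.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class ProsodyModel:
    name: str
    accent_type: int   # accent type the model was prepared for (hypothetical field)
    n_phrases: int     # number of accent phrases the model was prepared for (hypothetical field)


def select_model(
    models: List[ProsodyModel],
    accent_type: int,
    n_phrases: int,
    preferred: Optional[str] = None,
) -> ProsodyModel:
    """Return the operator's preferred model if named, else the closest stored model."""
    if preferred is not None:
        for model in models:
            if model.name == preferred:
                return model
        raise KeyError(preferred)

    def distance(model: ProsodyModel) -> int:
        # Assumed measure of suitability: smaller is better.
        return abs(model.accent_type - accent_type) + abs(model.n_phrases - n_phrases)

    return min(models, key=distance)
```

Passing `preferred` corresponds to the case in which the operator selects the desired model; otherwise the conditions decide.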

【００５３】[0053] The display unit 306 displays the pronunciation document, that is, the accent phrases generated by the accent phrase generation unit 303 and/or the pause information set by the pause setting unit 304 and/or the prosody pattern generated by the prosody pattern generation unit 305. Specifically, it is realized by, for example, the display device 107. The display unit 306 may also display the reading of each kanji in ruby on the kanji portions of the converted character string containing kanji.

【００５４】[0054] The speech synthesis unit 307 synthesizes speech using the pronunciation document, which consists of the accent phrases generated by the accent phrase generation unit 303, the pause information set by the pause setting unit 304, and the prosody pattern generated by the prosody pattern generation unit 305, together with the timbre data stored in the timbre storage unit 105. The voice output unit 308 outputs the speech synthesized by the speech synthesis unit 307. Specifically, it is realized by the speaker 109 or the like.

【００５５】[0055] The input unit 301, the conversion unit 302, the accent phrase generation unit 303, the pause setting unit 304, the prosody pattern generation unit 305, the display unit 306, the speech synthesis unit 307, and the voice output unit 308 each realize their functions when the CPU 101a or the like executes instruction processing in accordance with instructions written in a program recorded in the ROM 101b, the RAM 101c, or a recording medium such as the application storage unit 102, the floppy disk 112a, or the CD-ROM 113a.

【００５６】[0056] Next, the sequence of processing for creating a pronunciation document in the above configuration will be described. FIG. 5 is a flowchart showing the procedure of the pronunciation document creation process of the pronunciation document creation device of Embodiment 1. In the flowchart of FIG. 5, the device first waits for character input from the key input device 106 (step S501). Characters are input in the same manner as in an ordinary word processor.

【００５７】[0057] When a character has been input (Yes at step S501), it is next determined whether the conversion key has been pressed (step S502). The conversion key is normally pressed at a phrase (bunsetsu) boundary or at an accent phrase boundary. If the conversion key has not been pressed (No at step S502), it is determined that a phrase boundary or the like has not yet been reached and that further characters will be input, and the process returns to step S501 to wait for further character input.

【００５８】[0058] When the conversion key has been pressed (Yes at step S502), it is next determined whether an accent has been input from the key input device 106 or the like (step S503). If an accent has been input directly (Yes at step S503), the process proceeds to step S506 without doing anything.

【００５９】[0059] If no accent has been input at step S503 (No at step S503), the accent and the accent information and part-of-speech information corresponding to the input character string are read from the conversion dictionary 103; the character string is converted into the accent, and the accent information and part-of-speech information are added (step S504). If the conversion dictionary 103 contains a plurality of candidate accents and candidate accent information and part-of-speech information for the character string, the candidates can be displayed and one of them selected to confirm the conversion, in the same way as in conventional kana-kanji conversion (step S505). It is then determined whether the conversion has been confirmed (step S505); if not (No at step S505), the process returns to step S504 and the conversion process is repeated until the conversion is confirmed.

【００６０】[0060] When the conversion has been confirmed (Yes at step S505), an accent phrase is next generated on the basis of the reading of the input character string, the accent converted at step S504, and the accent information and part-of-speech information added to that accent (step S506). The method of generating the accent phrase is as described above. The process then proceeds to step S507.

【００６１】[0061] At step S507, the conversion dictionary 103 is used to convert the character string input at step S501 into the corresponding character string containing kanji. The conversion method is the same as conventional kana-kanji conversion. It is determined whether the conversion has been confirmed (step S508); if not (No at step S508), the process returns to step S507 and the conversion process (step S507) is repeated until the conversion is confirmed. When the conversion has been confirmed (Yes at step S508), the process proceeds to step S509.

【００６２】[0062] Although kana-kanji conversion is performed here after the conversion into accents and the addition of accent information and part-of-speech information, the order may be reversed. When there are conversion candidates, the candidates for the character string containing kanji and the candidates for the accent and the accent information and part-of-speech information may be displayed at the same time, and a desired combination may be selected from among them to confirm the conversion.

【００６３】[0063] Next, at step S509, it is determined whether a period marking the end of a sentence or the return key has been input. If not (No at step S509), the process returns to step S501, and steps S501 to S509 are repeated.

【００６４】[0064] If a period or the return key has been input at step S509 (Yes at step S509), pauses are then set (step S510). The method of setting pauses is as described above. Discrete prosodic information can be obtained in this processing step.

【００６５】[0065] Next, the prosody pattern is generated (step S511). The method of generating the prosody pattern is as described above. The pronunciation document for which the prosody pattern has been generated is then stored in the pronunciation document storage unit 110 (step S512).

【００６６】[0066] Finally, the pronunciation document is displayed on the display device 107 (step S513), and the entire process ends.
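The flow of FIG. 5 (steps S501 to S513) can be summarized in a short sketch that processes a sequence of key events. Everything about the event format is an assumption for illustration: `("char", c)` stands for a typed character, `("convert",)` for the conversion key, `("convert", accent)` for a conversion key press accompanied by a directly input accent, and `("end",)` for a period or the return key. The dictionary lookups `to_accent` and `to_kanji` are passed in as plain functions, and the candidate selection loops of steps S505 and S508, as well as pause setting, prosody generation, storage, and display, are reduced to placeholders.

```python
from typing import Callable, Dict, Iterable, List, Tuple


def create_pronunciation_document(
    events: Iterable[Tuple],
    to_accent: Callable[[str], str],
    to_kanji: Callable[[str], str],
) -> Dict[str, object]:
    """Build a pronunciation document for one sentence from key events."""
    buffer = ""
    accent_phrases: List[str] = []
    text_parts: List[str] = []
    for kind, *args in events:
        if kind == "char":                       # S501: a character was input
            buffer += args[0]
        elif kind == "convert" and buffer:       # S502: conversion key pressed
            direct = args[0] if args else None   # S503: accent input directly?
            # S504, S505: dictionary conversion (candidate selection omitted)
            accent = direct if direct is not None else to_accent(buffer)
            accent_phrases.append(accent)        # S506: accent phrase generated
            text_parts.append(to_kanji(buffer))  # S507, S508: kana-kanji conversion
            buffer = ""
        elif kind == "end":                      # S509: period or return key
            break
    return {
        "text": "".join(text_parts),
        "accent_phrases": accent_phrases,
        "pauses": {},     # S510: pause setting would fill this in
        "prosody": None,  # S511: prosody pattern generation would fill this in
    }
```

Each press of the conversion key closes one accent phrase, which is how the conversion timing serves as the accent phrase boundary.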

【００６７】[0067] Next, the voice output process for a pronunciation document will be described. FIG. 6 is a flowchart showing the procedure for voice output of a pronunciation document in the pronunciation document creation device according to Embodiment 1. In the flowchart of FIG. 6, it is determined whether a voice output instruction has been given (step S601); once the instruction is received (Yes at step S601), the corresponding pronunciation document is read from the pronunciation document storage unit 110 (step S602).

【００６８】[0068] Next, speech is synthesized using the pronunciation data of the read pronunciation document and the timbre data stored in the timbre storage unit 105 (step S603). The synthesized speech is then output through the speaker 109 (step S604).

【００６９】[0069] As described above, according to Embodiment 1, a pronunciation document can be created simply by inputting a character string and pressing the conversion key, just as in kana-kanji conversion. Using the timing of the conversion key presses as accent phrase boundary positions reduces errors in accent phrase segmentation. The created pronunciation document can also be displayed, and its content can be output as synthesized speech.

【００７０】[0070] Although Embodiment 1 describes only the creation of Japanese pronunciation documents, the invention is not limited to this and may also be applied to the creation of English pronunciation documents. In that case, if the spelling of each word is input and, in response to the space key pressed between words, the word is converted into the accent corresponding to its reading (pronunciation), English pronunciation documents can be created in the same way as Japanese ones.

【００７１】[0071] (Embodiment 2) In Embodiment 1 described above, a pronunciation document is created by inputting a character string and pressing the conversion key, just as in kana-kanji conversion. As in Embodiment 2 described below, a pronunciation document that has already been created may also be edited so that it is spoken in a more natural manner.

【００７２】[0072] The hardware configuration and external view of the pronunciation document creation device 700 according to Embodiment 2 of the present invention are the same as those of the pronunciation document creation device 100 of Embodiment 1 shown in FIGS. 1 and 2, so their description is omitted. FIG. 7 is a functional block diagram showing the functional configuration of the pronunciation document creation device 700. In FIG. 7, the units of the pronunciation document creation device 700 other than the editing unit 701 have the same configuration as the corresponding units of the pronunciation document creation device 100 of Embodiment 1 shown in FIG. 3; they are therefore given the same reference numerals and their description is omitted.

【００７３】[0073] In FIG. 7, the editing unit 701 edits the accent phrases generated by the accent phrase generation unit 303 and/or the pause information set by the pause setting unit 304 and/or the prosody pattern generated by the prosody pattern generation unit 305. Specifically, it corrects readings, corrects accent types, inserts and deletes accent phrase boundaries, changes pause information, changes prosody patterns, and so on.

【００７４】[0074] Because readings are also corrected as described above, the editing unit 701 also has part of the functions of the conversion unit 302, namely conversion into the kanji and the accent corresponding to a reading.

【００７５】[0075] Editing is performed by referring to the pronunciation document displayed on the display device 107 and entering variable data with the key input device 106, in the same way as editing a document created with a word processor.

【００７６】[0076] Next, the procedure of the editing process will be described. FIG. 8 is a flowchart showing the procedure of the editing process of the editing unit 701 of the document creation device of Embodiment 2. In the flowchart of FIG. 8, it is first determined whether an editing instruction has been given (step S801); once the instruction is received (Yes at step S801), the corresponding pronunciation document is read from the pronunciation document storage unit 110 (step S802).

【００７７】[0077] Next, the read pronunciation document is edited (step S803). It is then determined whether the editing process has finished (step S804); if not (No at step S804), the process returns to step S803 and the editing process is repeated.

【００７８】[0078] When the editing process has finished at step S804 (Yes at step S804), speech is synthesized using the pronunciation data of the edited pronunciation document and the timbre data stored in the timbre storage unit 105 in order to check the content of the edited pronunciation document (step S805). The synthesized speech is then output through the speaker 109 (step S806).

【００７９】[0079] If, after the synthesized speech has been checked, the editing is to be redone (Yes at step S807), the process returns to step S803 and steps S803 to S807 are repeated. If the editing is not to be redone, that is, if the edited content is to be confirmed (No at step S807), the edited pronunciation document is written to the pronunciation document storage unit 110 (step S808), and the entire process ends.
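The edit, listen, and redo loop of FIG. 8 (steps S802 to S808) can be sketched as follows. The document store is modeled as a plain dictionary, and the synthesizer, the playback function, and the operator's redo decision are passed in as functions; all of these names and the document layout are hypothetical stand-ins rather than the device's actual interfaces.

```python
import copy
from typing import Callable, Dict, Iterable, List


def edit_pronunciation_document(
    store: Dict[str, dict],
    doc_id: str,
    edit_rounds: Iterable[List[Callable[[dict], None]]],
    synthesize: Callable[[dict], bytes],
    play: Callable[[bytes], None],
    wants_redo: Callable[[dict], bool],
) -> dict:
    """Edit a stored pronunciation document until the operator accepts the result."""
    doc = copy.deepcopy(store[doc_id])  # S802: read the document
    for edits in edit_rounds:
        for edit in edits:              # S803, S804: edit until this round is finished
            edit(doc)
        play(synthesize(doc))           # S805, S806: synthesize and output for checking
        if not wants_redo(doc):         # S807: No, so the edits are confirmed
            break
    store[doc_id] = doc                 # S808: write the edited document back
    return doc
```

Working on a deep copy keeps the stored document unchanged until the edits are confirmed. In this sketch the document is also written back if the supplied edit rounds run out, which is a simplification.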

【００８０】[0080] As described above, according to Embodiment 2 of the present invention, a pronunciation document that has already been created is edited, so that a pronunciation document from which speech can be synthesized in a more natural manner of speaking is obtained.

【００８１】[0081]

[Effects of the Invention] As described above, the invention of claim 1 comprises: input means for inputting a character string; conversion means for converting the character string input by the input means into the accent corresponding to its reading and into accent information and part-of-speech information; accent phrase generation means for generating accent phrases on the basis of the accent, the accent information, and the part of speech converted by the conversion means; pause setting means for setting pause information, such as at which positions between the plurality of accent phrases generated by the accent phrase generation means a silent section (pause) is inserted and how long it is; and prosody pattern generation means for generating prosody patterns, such as a pitch pattern and the duration of each syllable, for each sentence composed of the plurality of accent phrases for which the pause setting means has set pause information. Consequently, by inputting a character string and converting it at accent phrase boundaries, a pronunciation document for accurately synthesizing speech in the manner of speaking intended by the document's author can be created without inputting natural speech, which yields a pronunciation document creation device that improves the efficiency and convenience of creating pronunciation documents.

【００８２】[0082] According to the invention of claim 2, the invention of claim 1 further comprises display means for displaying the accent phrases generated by the accent phrase generation means and/or the pause information set by the pause setting means and/or the prosody pattern generated by the prosody pattern generation means. Because the pronunciation document is displayed, its content can be checked easily, which yields a pronunciation document creation device that improves the efficiency and convenience of creating pronunciation documents.

【００８３】[0083] According to the invention of claim 3, the invention of claim 1 or 2 further comprises speech synthesis means for synthesizing speech using the accent phrases generated by the accent phrase generation means, the pause information set by the pause setting means, and the prosody pattern generated by the prosody pattern generation means, and voice output means for outputting the speech synthesized by the speech synthesis means. Because the pronunciation data of the pronunciation document is synthesized into speech and output, the content of the pronunciation document can be played back easily, which yields a pronunciation document creation device that improves the efficiency and convenience of creating pronunciation documents.

【００８４】[0084] According to the invention of claim 4, the invention of any one of claims 1 to 3 further comprises editing means for editing the accent phrases generated by the accent phrase generation means and/or the pause information set by the pause setting means and/or the prosody pattern generated by the prosody pattern generation means. Because the pronunciation document can be edited, a pronunciation document closer to the manner of speaking intended by its author can be created easily, which yields a pronunciation document creation device that improves the efficiency and convenience of creating pronunciation documents.

【００８５】[0085] The invention of claim 5 includes: a first step of inputting a character string; a second step of converting the character string input in the first step into the accent corresponding to its reading and into accent information and part-of-speech information; a third step of generating accent phrases on the basis of the accent, the accent information, and the part of speech converted in the second step; a fourth step of setting pause information, such as at which positions between the plurality of accent phrases generated in the third step a silent section (pause) is inserted and how long it is; and a fifth step of generating prosody patterns, such as a pitch pattern and the duration of each syllable, for each sentence composed of the plurality of accent phrases for which pause information has been set in the fourth step. Consequently, by inputting a character string and converting it at accent phrase boundaries, a pronunciation document for accurately synthesizing speech in the manner of speaking intended by the document's author can be created without inputting natural speech, which yields a pronunciation document creation method that improves the efficiency and convenience of creating pronunciation documents.

【００８６】[0086] According to the invention of claim 6, the invention of claim 5 further includes a sixth step of displaying the accent phrases generated in the third step and/or the pause information set in the fourth step and/or the prosody pattern generated in the fifth step. Because the pronunciation document is displayed, its content can be checked easily, which yields a pronunciation document creation method that improves the efficiency and convenience of creating pronunciation documents.

【００８７】[0087] According to the invention of claim 7, the invention of claim 5 or 6 further includes a seventh step of synthesizing speech using the accent phrases generated in the third step, the pause information set in the fourth step, and the prosody pattern generated in the fifth step, and an eighth step of outputting the speech synthesized in the seventh step. Because the pronunciation document is displayed, its content can be checked easily, which yields a pronunciation document creation method that improves the efficiency and convenience of creating pronunciation documents.

【００８８】[0088] The invention of claim 8 is the invention of any one of claims 5 to 7 further including a ninth step of editing the accent phrases generated in the third step and/or the pause information set in the fourth step and/or the prosody pattern generated in the fifth step. Because the pronunciation document can be edited, a pronunciation document closer to the manner of speaking intended by its author can be created easily, which yields a pronunciation document creation method that improves the efficiency and convenience of creating pronunciation documents.

【００８９】[0089] The recording medium according to the invention of claim 9 records a program that causes a computer to execute the method according to any one of claims 5 to 8, so that the program is machine-readable. This yields a recording medium with which the operation of any one of claims 5 to 8 can be realized by a computer.

[Brief description of the drawings]

【図１】FIG. 1 is a schematic block diagram of the pronunciation document creation device according to Embodiment 1 of the present invention.

【図２】FIG. 2 is an external view of the pronunciation document creation device of Embodiment 1.

【図３】FIG. 3 is a functional block diagram showing the functional configuration of the pronunciation document creation device of Embodiment 1.

【図４】FIG. 4 is an explanatory diagram showing the generation of an accent phrase in the pronunciation document creation device of Embodiment 1.

【図５】FIG. 5 is a flowchart showing the procedure of the pronunciation document creation process in the pronunciation document creation device of Embodiment 1.

【図６】FIG. 6 is a flowchart showing the procedure of the voice output process in the pronunciation document creation device of Embodiment 1.

【図７】FIG. 7 is a functional block diagram showing the functional configuration of the pronunciation document creation device according to Embodiment 2 of the present invention.

【図８】FIG. 8 is a flowchart showing the procedure of the pronunciation document editing process in the pronunciation document creation device of Embodiment 2.

[Explanation of symbols]

100, 700: pronunciation document creation device
101: control unit
101a: CPU
101b: ROM
101c: RAM
102: application storage unit
103: conversion dictionary
104: prosody pattern model storage unit
105: timbre storage unit
106: key input device
107: display device
108: microphone
109: speaker
110: pronunciation document storage unit
111: interface (I/F)
112: FD drive
113: CD-ROM drive
114: communication unit
301: input unit
302: conversion unit
303: accent phrase generation unit
304: pause setting unit
305: prosody pattern generation unit
306: display unit
307: speech synthesis unit
308: voice output unit
701: editing unit

─────────────────────────────────────────────────────

[Procedure amendment]

[Submission date] July 12, 1999

[Procedure amendment 1]

[Document to be amended] Specification

[Item to be amended] Full text

[Method of amendment] Change

[Content of amendment]

[Document name] Specification

[Title of the invention] Pronunciation document creation device, pronunciation document creation method, and computer-readable recording medium recording a program for causing a computer to execute the method

[Claims]

[Detailed description of the invention]

【０００１】[0001]

【０００２】[0002]

【０００５】[0005]

【００１０】[0010]

[Means for Solving the Problems] To solve the above problems and achieve the object, the pronunciation document creation device according to claim 1 comprises: input means for inputting a kana character string; conversion means for converting the kana character string input by the input means into a character string containing the kanji corresponding to its reading and also into the accent corresponding to the reading, adding accent information and part-of-speech information to that accent in doing so; accent phrase generation means for generating accent phrases on the basis of the accent, the accent information, and the part of speech converted by the conversion means; pause setting means for setting pause information, such as at which positions between the plurality of accent phrases generated by the accent phrase generation means a silent section (pause) is inserted and how long it is; and prosody pattern generation means for generating prosody patterns, such as a pitch pattern and the duration of each syllable, for each sentence composed of the plurality of accent phrases for which the pause setting means has set pause information.

[0018] The pronunciation document creation method according to claim 5 comprises: a first step of inputting a kana character string; a second step of converting the kana character string input in the first step into a character string containing kanji corresponding to its reading and into an accent corresponding to that reading, adding accent information and part-of-speech information to the accent in the process; a third step of generating accent phrases based on the accent, the accent information, and the part of speech produced in the second step; a fourth step of setting pause information, such as at which positions between the plurality of accent phrases generated in the third step a silent section (pause) is to be inserted and for how long; and a fifth step of generating prosodic patterns, such as a pitch pattern and the time length of each syllable, in units of sentences composed of the plurality of accent phrases for which the pause information has been set in the fourth step.

[0027]

[0046] The conversion unit 302 converts the reading of the particle 『は』 (ha) or 『へ』 (he) into the accent 『ワ』 (wa) or 『エ』 (e) based on the part-of-speech information. The conversion unit 302 also has a function for lengthening vowels in readings. For example, when 『がっこう』 (gakkou) is input, it is converted into 『ガッコー』, an accent closer to a natural way of speaking.
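The two conversions in this paragraph can be sketched as follows. This is a hedged illustration, not the conversion unit 302 itself. The function names and rule tables are hypothetical, and the lengthening rule (a 『ウ』 that follows an o-row kana becomes 『ー』) is an assumption generalized from the single 『がっこう』 example in the text. Such a rule is wrong across morpheme boundaries, so a practical system would presumably take the lengthened reading from its dictionary instead.

```python
# Particle readings that depend on part of speech (from the paragraph above).
PARTICLE_ACCENTS = {"は": "ワ", "へ": "エ"}

# Katakana whose vowel is "o"; a following ウ is treated as a long vowel.
# This table and the rule built on it are assumptions for illustration.
O_ROW = set("オコソトノホモヨロゴゾドボポョォ")

def hiragana_to_katakana(text):
    """Shift hiragana (U+3041..U+3096) to the matching katakana code points."""
    return "".join(
        chr(ord(ch) + 0x60) if "ぁ" <= ch <= "ゖ" else ch for ch in text
    )

def lengthen_vowels(katakana):
    """Replace ウ after an o-row kana with the long-vowel mark ー."""
    out = []
    for ch in katakana:
        if ch == "ウ" and out and out[-1] in O_ROW:
            out.append("ー")
        else:
            out.append(ch)
    return "".join(out)

def to_accent(reading, pos):
    """Convert a hiragana reading to an accent string using its part of speech."""
    if pos == "particle" and reading in PARTICLE_ACCENTS:
        return PARTICLE_ACCENTS[reading]
    return lengthen_vowels(hiragana_to_katakana(reading))

if __name__ == "__main__":
    print(to_accent("は", "particle"))
    print(to_accent("は", "noun"))
    print(to_accent("がっこう", "noun"))
```

Passing the part of speech explicitly is what lets the same kana 『は』 come out as 『ワ』 when it is a particle and as 『ハ』 otherwise.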

[0059] If no accent has been input in step S503 (No in step S503), the accent and the accent information / part-of-speech information corresponding to the input character string are read from the conversion dictionary 103, the character string is converted into the accent, and the accent information and part-of-speech information are added (step S504). If the conversion dictionary 103 contains multiple candidates for the accent or for the accent information / part-of-speech information of the character string, the candidates are displayed and one is selected, in the same manner as in conventional kana-kanji conversion, which confirms the conversion (step S505). Accordingly, it is determined whether the conversion has been confirmed (step S505); if it has not (No in step S505), the process returns to step S504 and the conversion process is repeated until the conversion is confirmed.
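The loop between steps S504 and S505 can be sketched as below, under stated assumptions. The function name, the candidate tuple layout, the toy dictionary entries, and the `choose` callback (standing in for the display-and-select interaction) are all hypothetical. Only the branch for "no accent was input" is modeled, because that is the only branch this paragraph describes.

```python
# Toy stand-in for conversion dictionary 103: kana reading -> candidates.
# Each candidate is (surface, accent, accent_info, part_of_speech); the layout
# and the values are illustrative assumptions.
TOY_DICTIONARY = {
    "はし": [
        ("箸", "ハシ", 1, "noun"),
        ("橋", "ハシ", 2, "noun"),
        ("端", "ハシ", 0, "noun"),
    ],
}

def convert_until_confirmed(reading, dictionary, choose):
    """Repeat lookup (S504) and confirmation check (S505) until confirmed.

    `choose` receives the candidate list and returns the index of the
    selected candidate, or None if the user has not confirmed yet.
    """
    while True:
        # Step S504: read candidates for the input string from the dictionary.
        candidates = dictionary[reading]
        # Step S505: present the candidates and check for confirmation.
        selected = choose(candidates)
        if selected is not None:
            return candidates[selected]
        # Not confirmed (No in S505): go back to S504 and try again.

if __name__ == "__main__":
    # Scripted "user": declines once, then picks the second candidate.
    answers = iter([None, 1])
    print(convert_until_confirmed("はし", TOY_DICTIONARY, lambda c: next(answers)))
```

Injecting the selection as a callback keeps the loop testable without a display device; an interactive front end would supply a callback that shows the candidate list and waits for a key press.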

[0081]

[Effects of the Invention] As described above, the invention of claim 1 comprises: input means for inputting a kana character string; conversion means for converting the kana character string input by the input means into a character string containing kanji corresponding to its reading and into an accent corresponding to that reading, adding accent information and part-of-speech information to the accent in the process; accent phrase generation means for generating accent phrases based on the accent, the accent information, and the part of speech produced by the conversion means; pause setting means for setting pause information, such as at which positions between the plurality of accent phrases generated by the accent phrase generation means a silent section (pause) is to be inserted and for how long; and prosodic pattern generation means for generating prosodic patterns, such as a pitch pattern and the time length of each syllable, in units of sentences composed of the plurality of accent phrases for which the pause information has been set by the pause setting means. Consequently, by inputting a character string and converting it at the boundaries of accent phrases, a pronunciation document for accurately synthesizing speech in the manner of speaking intended by the document's author can be created without inputting natural speech. This yields a pronunciation document creation device that improves the efficiency and convenience of creating pronunciation documents.

[0085] The invention of claim 5 comprises: a first step of inputting a kana character string; a second step of converting the kana character string input in the first step into a character string containing kanji corresponding to its reading and into an accent corresponding to that reading, adding accent information and part-of-speech information to the accent in the process; a third step of generating accent phrases based on the accent, the accent information, and the part of speech produced in the second step; a fourth step of setting pause information, such as at which positions between the plurality of accent phrases generated in the third step a silent section (pause) is to be inserted and for how long; and a fifth step of generating prosodic patterns, such as a pitch pattern and the time length of each syllable, in units of sentences composed of the plurality of accent phrases for which the pause information has been set in the fourth step. Consequently, by inputting a character string and converting it at the boundaries of accent phrases, a pronunciation document for accurately synthesizing speech in the manner of speaking intended by the document's author can be created without inputting natural speech. This yields a pronunciation document creation method that improves the efficiency and convenience of creating pronunciation documents.

[Brief Description of the Drawings]

[Description of Reference Numerals]
100, 700  Pronunciation document creation device
101  Control unit
101a  CPU
101b  ROM
101c  RAM
102  Application storage unit
103  Conversion dictionary
104  Prosodic pattern model storage unit
105  Voice tone storage unit
106  Key input device
107  Display device
108  Microphone
109  Speaker
110  Pronunciation document storage unit
111  Interface (I/F)
112  FD drive
113  CD-ROM drive
114  Communication unit
301  Input unit
302  Conversion unit
303  Accent phrase generation unit
304  Pause setting unit
305  Prosodic pattern generation unit
306  Display unit
307  Speech synthesis unit
308  Speech output unit
701  Editing unit

Claims

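1. A pronunciation document creation device comprising: input means for inputting a character string; conversion means for converting the character string input by the input means into an accent and accent information / part-of-speech information corresponding to its reading; accent phrase generation means for generating an accent phrase based on the accent and the accent information / part of speech produced by the conversion means; pause setting means for setting pause information, such as at which position between the plurality of accent phrases generated by the accent phrase generation means a silent section (pause) is to be inserted and for how long; and prosodic pattern generation means for generating a prosodic pattern, such as a pitch pattern and the time length of each syllable, in units of sentences composed of the plurality of accent phrases for which the pause information has been set by the pause setting means.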

2. The pronunciation document creation device according to claim 1, further comprising display means for displaying the accent phrase generated by the accent phrase generation means and/or the pause information set by the pause setting means and/or the prosodic pattern generated by the prosodic pattern generation means.

3. The pronunciation document creation device according to claim 1, further comprising speech synthesis means for synthesizing speech using the accent phrase generated by the accent phrase generation means, the pause information set by the pause setting means, and the prosodic pattern generated by the prosodic pattern generation means.

4. The pronunciation document creation device according to any one of the preceding claims, further comprising editing means for editing the accent phrase generated by the accent phrase generation means and/or the pause information set by the pause setting means and/or the prosodic pattern generated by the prosodic pattern generation means.

5. A pronunciation document creation method comprising: a first step of inputting a character string; a second step of converting the character string input in the first step into an accent and accent information / part-of-speech information corresponding to its reading; a third step of generating an accent phrase based on the accent, the accent information, and the part of speech produced in the second step; a fourth step of setting pause information, such as at which position between the plurality of accent phrases generated in the third step a silent section (pause) is to be inserted and for how long; and a fifth step of generating a prosodic pattern, such as a pitch pattern and the time length of each syllable, in units of sentences composed of the plurality of accent phrases for which the pause information has been set in the fourth step.

6. The pronunciation document creation method according to claim 5, further comprising a sixth step of displaying the accent phrase generated in the third step and/or the pause information set in the fourth step and/or the prosodic pattern generated in the fifth step.

7. The pronunciation document creation method according to claim 5, further comprising: a seventh step of synthesizing speech using the accent phrase generated in the third step, the pause information set in the fourth step, and the prosodic pattern generated in the fifth step; and an eighth step of outputting the speech synthesized in the seventh step.

8. The pronunciation document creation method according to any one of claims 5 to 7, further comprising a ninth step of editing the accent phrase generated in the third step and/or the pause information set in the fourth step and/or the prosodic pattern generated in the fifth step.

9. A computer-readable recording medium on which a program for causing a computer to execute the method according to claim 5 is recorded.