JPH0546195A

JPH0546195A - Speech synthesizing device

Info

Publication number: JPH0546195A
Application number: JP3204245A
Authority: JP
Inventors: Motoaki Koyama; 元昭児山
Original assignee: Toshiba Corp
Current assignee: Toshiba Corp
Priority date: 1991-08-14
Filing date: 1991-08-14
Publication date: 1993-02-26

Abstract

PURPOSE: To synthesize speech by accessing speech data according to the attribute data of the speech data. CONSTITUTION: A memory part 11 stores the speech data of plural phrases and the attribute data of the respective speech data. On a retrieval request, a retrieval part 12 reads the attribute data out of the memory part 11, compares the input attribute data for retrieval with the attribute data read out of the memory part 11, and outputs the retrieval result to an addressing part 13. The addressing part 13 generates address data for the speech data storage area corresponding to the retrieval output of the retrieval part 12 and sends this address data to the memory part 11; the speech data are then read out of the memory part 11 and sent to a speech synthesis part 14, which synthesizes the speech.

Description

Detailed Description of the Invention

[0001]

[Industrial Field of Application] The present invention relates to a speech synthesizer, and in particular to a speech synthesizer that makes it possible to select the speech to be synthesized on the basis of attributes of the speech phrases.

[0002]

[Prior Art] FIG. 5 is a block diagram of a conventional speech synthesizer capable of synthesizing a plurality of phrases. This speech synthesizer is provided with as many data storage devices 31 as there are types of phrases. To synthesize speech, one of the data storage devices 31 is selected, and the voice data stored in the selected data storage device 31 is sent to a voice synthesis unit 32, which synthesizes the speech.

[0003] FIG. 6 is a block diagram of a different conventional speech synthesizer. In this speech synthesizer, an index area and a voice data area are set up in a data storage device, for example an IC memory 33. The voice data area stores voice data for a plurality of phrases, and the index area stores the addresses (AD) of these voice data. To synthesize speech, the index area is consulted and the voice data area is addressed accordingly, which selects the voice data; the selected voice data is then sent to the voice synthesis unit 32, which synthesizes the speech.
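The index-based lookup of FIG. 6 can be sketched roughly as follows. This is only an illustrative model: the byte strings, the names `voice_area`, `index_area`, and `read_phrase`, and the stored length field are assumptions made for the example (the source states only that the index area holds addresses).

```python
# Hypothetical model of the FIG. 6 layout: a voice data area holding several
# phrases back to back, and an index area that records where each one starts.
voice_area = b"HELLO" + b"GOODBYE" + b"THANKS"

# Index area: one (address, length) pair per phrase, in storage order.
# The length field is an assumption; the source only mentions addresses (AD).
index_area = [(0, 5), (5, 7), (12, 6)]


def read_phrase(position: int) -> bytes:
    """Return the voice data of the phrase at the given ordinal position."""
    address, length = index_area[position]
    return voice_area[address:address + length]
```

Note that the only key available here is the ordinal position, which is the limitation the following paragraph describes.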

[0004] In both of the conventional speech synthesizers described above, the voice data corresponding to each of the voice phrases is accessed discretely. Random access that specifies an ordinal position, such as the first or second phrase from the beginning, is therefore possible, but voice data cannot be accessed by any data other than its storage location (address). In other words, voice data cannot be accessed on the basis of its attribute data, such as the time at which the voice data was stored or the data length of the voice data.

[0005]

[Problems to Be Solved by the Invention] As described above, the conventional speech synthesizers have the problem that voice data cannot be accessed on the basis of the attribute data of the voice data.

[0006] The present invention has been made in view of the above circumstances, and its object is to provide a speech synthesizer that can access voice data on the basis of the attribute data of the voice data and synthesize speech from it.

[0007]

[Means for Solving the Problems] The speech synthesizer according to the present invention comprises: a voice data storage unit that stores voice data of a plurality of phrases; an attribute data storage unit that stores attribute data corresponding to each voice data of the plurality of phrases; a search unit that searches the voice data stored in the voice data storage unit on the basis of the attribute data stored in the attribute data storage unit; a voice data selection unit that selects voice data in the voice data storage unit on the basis of the search result of the search unit; and a voice synthesis unit that synthesizes speech from the voice data selected by the voice data selection unit.

[0008]

[Operation] The voice data stored in the voice data storage unit is searched on the basis of the attribute data corresponding to the voice data, and voice data stored in the voice data storage unit is selected on the basis of the search result. As a result, voice data can be accessed by data other than the storage location of the voice data itself.

[0009]

[Embodiments] The present invention will now be described by way of embodiments with reference to the drawings.

[0010] FIG. 1 is a block diagram showing the overall configuration of a first embodiment of the speech synthesizer of the present invention. In the figure, reference numeral 11 denotes a memory unit that stores voice data for a plurality of phrases and the attribute data of each voice data. An IC memory, a flexible disk storage device, a hard disk storage device, a magnetic tape storage device, or the like is used as the memory unit 11. The attribute data referred to here is data for identifying voice data, and includes, for example, the creation date and time of the voice data, the data length of the voice data, the speaker's name, and the speaker's gender. As shown in the figure, each voice data and its attribute data are stored in the memory unit 11 as a pair.

[0011] Reference numeral 12 denotes a search unit. When a search request is made, the search unit 12 reads the attribute data stored in the memory unit 11, compares the input attribute data for the search with the attribute data read from the memory unit 11, and outputs a search result corresponding to the matching attribute data to an addressing unit 13. The addressing unit 13 generates address data for the voice data storage area corresponding to the search output of the search unit 12. When this address data is sent to the memory unit 11, the voice data is read from the memory unit 11 and sent to a voice synthesis unit 14, which synthesizes the speech. A speech synthesis circuit based on a waveform coding method or an analysis coding method, such as the PARCOR (Partial Auto-Correlation) method, the ADM (Adaptive Differential Modulation) method, the ADPCM (Adaptive Differential Pulse Code Modulation) method, or the MPC (Multi-Pulse Coding) method, can be used as the voice synthesis unit 14.
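The flow from search unit 12 through addressing unit 13 to memory unit 11 might be modelled as in the sketch below. It is a software analogy under stated assumptions, not the circuit of the embodiment: the `Entry` layout, the sample speaker names and dates, and the function names `search`, `address_of`, and `fetch` are all hypothetical, and the synthesis step itself is left out.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    """One attribute record paired with a block of voice data (assumed layout)."""
    created: str   # creation date of the voice data
    speaker: str   # speaker name
    gender: str    # speaker gender
    address: int   # start address of the voice data in the memory unit
    length: int    # data length of the voice data


# Hypothetical contents of memory unit 11: raw voice data plus attribute records.
MEMORY = b"\x01\x02\x03" + b"\x0a\x0b" + b"\x10\x11\x12\x13"
ENTRIES = [
    Entry("1991-08-01", "Sato", "F", 0, 3),
    Entry("1991-08-02", "Suzuki", "M", 3, 2),
    Entry("1991-08-03", "Sato", "F", 5, 4),
]


def search(query: dict) -> list[int]:
    """Search unit: indices of records whose attributes match every query field."""
    return [
        i for i, entry in enumerate(ENTRIES)
        if all(getattr(entry, key) == value for key, value in query.items())
    ]


def address_of(hit: int) -> tuple[int, int]:
    """Addressing unit: convert a search hit into (address, length)."""
    entry = ENTRIES[hit]
    return entry.address, entry.length


def fetch(query: dict) -> list[bytes]:
    """Read out the voice data of every record that matches the query."""
    blocks = []
    for hit in search(query):
        address, length = address_of(hit)
        blocks.append(MEMORY[address:address + length])
    return blocks
```

For example, `fetch({"speaker": "Sato"})` returns the two blocks recorded for that speaker, which would then be passed to the synthesis unit.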

[0012] In this way, the speech synthesizer of the above embodiment can access voice data on the basis of attribute data of the voice data, such as its creation date and time, its data length, the speaker's name, and the speaker's gender, and synthesize speech from it.

[0013] FIG. 2 is a block diagram showing the overall configuration of a second embodiment of the speech synthesizer of the present invention. This embodiment differs from the first embodiment in that, instead of the single memory unit 11, a memory unit 11A for voice data and a memory unit 11B for attribute data are provided, so that the voice data and the attribute data are stored in independent memory units.

[0014] FIG. 3 is a block diagram showing the overall configuration of a third embodiment of the speech synthesizer of the present invention. In this embodiment, the attribute data stored in the memory unit 11 includes, in addition to data such as the creation date and time of the voice data, the data length of the voice data, the speaker's name, and the speaker's gender, data for selecting the speech synthesis method.

[0015] In this embodiment, the search data input to the search unit 12 also includes data for selecting the speech synthesis method. This data is supplied to a switch unit 15, which receives the voice data read from the memory unit 11. On the basis of the synthesis-method selection data output from the search unit 12, the switch unit 15 selectively outputs the voice data read from the memory unit 11 to one of a plurality of voice synthesis units that use different synthesis methods (three voice synthesis units 14A, 14B, and 14C in this embodiment). Of the three voice synthesis units 14A, 14B, and 14C, the one supplied with the voice data synthesizes the speech according to its synthesis method. As mentioned above, the synthesis methods of the three voice synthesis units 14A, 14B, and 14C can be selected from among the PARCOR method, the ADM method, the ADPCM method, the MPC method, and the like.
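The routing performed by switch unit 15 can be pictured as a dispatch table keyed by the synthesis-method attribute. In the sketch below the three functions are stand-ins that merely label their input; they are not real PARCOR, ADM, or ADPCM decoders, and the assignment of methods to units 14A to 14C is an assumption made for illustration.

```python
from typing import Callable, Dict


# Placeholder synthesis units. Real decoders would turn the coded data into
# a waveform; these only report which unit received how much data.
def synth_parcor(data: bytes) -> str:
    return f"PARCOR({len(data)} bytes)"


def synth_adm(data: bytes) -> str:
    return f"ADM({len(data)} bytes)"


def synth_adpcm(data: bytes) -> str:
    return f"ADPCM({len(data)} bytes)"


# Assumed mapping of synthesis methods to units 14A, 14B, and 14C.
SYNTHESIZERS: Dict[str, Callable[[bytes], str]] = {
    "PARCOR": synth_parcor,
    "ADM": synth_adm,
    "ADPCM": synth_adpcm,
}


def switch(method: str, data: bytes) -> str:
    """Switch unit: route the voice data to the unit named by the method attribute."""
    try:
        synthesize = SYNTHESIZERS[method]
    except KeyError:
        raise ValueError(f"no synthesis unit for method {method!r}") from None
    return synthesize(data)
```

Only the selected unit ever sees the data, which mirrors the statement that the unit supplied with voice data is the one that synthesizes.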

[0016] Like the first embodiment shown in FIG. 1, this embodiment can access voice data on the basis of attribute data of the voice data, such as its creation date and time, its data length, the speaker's name, and the speaker's gender, and synthesize speech from it. In addition, because a common memory unit is shared by the plurality of voice synthesis units, the memory is used more efficiently. Furthermore, because voice synthesis units of several methods are provided, a synthesis method with a good data compression ratio (the PARCOR method) or one with good sound quality (the ADPCM method) can be selected to suit the purpose.

[0017] FIG. 4 is a block diagram showing the overall configuration of a fourth embodiment of the speech synthesizer of the present invention. This embodiment adds to the embodiment of FIG. 3 a plurality of voice encoding units that use different methods (two voice encoding units 16A and 16B in this embodiment), an attribute data creation unit 17, and a switch unit 18 for selectively supplying the input speech to the two voice encoding units 16A and 16B.

[0018] The attribute data creation unit 17 creates attribute data for each input speech, and the attribute data it creates is sent to the memory unit 11 and stored there. The two voice encoding units 16A and 16B encode the input speech by an encoding method selected from among the PARCOR method, the ADM method, the ADPCM method, the MPC method, and the like, and thereby create voice data. Because this embodiment includes voice encoding units, the voice phrases to be synthesized can be rewritten.
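The recording path of the fourth embodiment (switch unit 18, encoding units 16A and 16B, attribute data creation unit 17) could be sketched as below. Everything concrete here is assumed for illustration: the two encoders are trivial placeholders rather than real codecs, and the attribute fields, the `record` function, and its parameters are hypothetical.

```python
from datetime import datetime
from typing import Callable, Dict, List


# Placeholder encoders standing in for encoding units 16A and 16B.
def encode_a(samples: List[int]) -> bytes:
    return bytes(s & 0xFF for s in samples)


def encode_b(samples: List[int]) -> bytes:
    return bytes((s >> 1) & 0xFF for s in samples)


ENCODERS: Dict[str, Callable[[List[int]], bytes]] = {"A": encode_a, "B": encode_b}

# Hypothetical memory unit: voice data area plus a list of attribute records.
memory = bytearray()
attributes: List[dict] = []


def record(samples: List[int], speaker: str, method: str, now: datetime) -> dict:
    """Encode the input speech, create its attribute record, and store both."""
    data = ENCODERS[method](samples)       # switch unit 18 picks the encoder
    attrs = {                              # attribute data creation unit 17
        "created": now.isoformat(),
        "length": len(data),
        "speaker": speaker,
        "method": method,
        "address": len(memory),            # where this voice data will start
    }
    memory.extend(data)
    attributes.append(attrs)
    return attrs
```

Because each call appends a new attribute record alongside the encoded data, phrases stored this way remain searchable by the same attributes used at playback time.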

[0019]

[Effects of the Invention] As described above, the present invention provides a speech synthesizer that can access voice data on the basis of the attribute data of the voice data and synthesize speech from it. In addition, because voice data can be handled as an object of data file or database processing, voice data can be introduced smoothly into conventional data processing systems.

[Brief description of drawings]

FIG. 1 is a block diagram of an apparatus according to a first embodiment of the present invention.

FIG. 2 is a block diagram of an apparatus according to a second embodiment of the present invention.

FIG. 3 is a block diagram of an apparatus according to a third embodiment of the present invention.

FIG. 4 is a block diagram of an apparatus according to a fourth embodiment of the present invention.

FIG. 5 is a block diagram of a conventional apparatus.

FIG. 6 is a block diagram of a conventional apparatus.

[Explanation of symbols]

11, 11A, 11B: memory unit; 12: search unit; 13: addressing unit; 14, 14A, 14B, 14C: voice synthesis unit; 15, 18: switch unit; 16A, 16B: voice encoding unit; 17: attribute data creation unit.

Claims

[Claims]

1. A voice synthesizer comprising: a voice data storage unit for storing voice data of a plurality of phrases; an attribute data storage unit for storing attribute data corresponding to each voice data of the plurality of phrases; a search unit for searching the voice data stored in the voice data storage unit based on the attribute data stored in the attribute data storage unit; a voice data selection unit for selecting voice data in the voice data storage unit based on a search result of the search unit; and a voice synthesis unit for synthesizing a voice from the voice data selected by the voice data selection unit.

2. The voice synthesizer according to claim 1, further comprising a voice data forming unit which forms voice data from input voice.

3. The voice synthesizer according to claim 1, further comprising an attribute data forming unit that forms attribute data for the voice data stored in the voice data storage unit.

4. The voice synthesizer as described, wherein the voice data storage unit and the attribute data storage unit are constituted by a single memory device.