JP2000330591A

JP2000330591A - Voice synthesizer

Info

Publication number: JP2000330591A
Application number: JP11139703A
Authority: JP
Inventors: Akira Sawamura; 陽沢村
Original assignee: Rohm Co Ltd
Current assignee: Rohm Co Ltd
Priority date: 1999-05-20
Filing date: 1999-05-20
Publication date: 2000-11-30

Abstract

PROBLEM TO BE SOLVED: To reduce the circuit scale and current consumption of a voice synthesizer that uses vector quantization to compress the voice data stored in its storage means. SOLUTION: The voice synthesizer comprises a first memory 4 that stores compressed data obtained by vector-quantizing voice data while it remains time-axis data, and that outputs the compressed data stored at the addresses designated by a read sequencer 3; a control part 2 that controls which addresses in the first memory 4 the read sequencer 3 designates; and means (a second memory 5, a rate converting part 6, a D/A converter part 7, and a voice output part 8) for expanding the compressed data output from the first memory 4, converting it into voice, and outputting it.

Description

DETAILED DESCRIPTION OF THE INVENTION

[0001]

[Technical field of the invention] The present invention relates to a speech synthesizer that outputs speech in accordance with an input and is mounted in, for example, toys, electronic books, and alarm devices.

[0002]

[Prior art] A speech synthesizer that stores speech data in a storage means and, in response to an input, reads the speech data from the storage means and outputs it normally stores the speech data in compressed form in order to keep the storage capacity of the storage means small. The compressed speech data read from the storage means is expanded, and the speech data obtained by the expansion is converted into sound and output.

[0003] Vector quantization is one method of compressing data. With vector quantization, compression requires large-scale hardware and a long processing time. Expansion, by contrast, requires as hardware only a storage means that holds data (called a "codebook") indicating the correspondence between the codes the compressed data can take and bit patterns, and its processing time is short. Vector quantization can therefore be regarded as the best-suited technique for a speech synthesizer that only plays back speech, such as one in an electronic book.

[0004] Japanese Unexamined Patent Application Publication No. 5-73100 discloses a speech synthesizer that uses vector quantization as the method of compressing the speech data stored in the storage means. In that speech synthesizer, the speech data is separated into two parameters: power information, which represents intensity, and spectral information, which represents acoustic characteristics. Only the spectral information is compressed by vector quantization, and the resulting compressed data is stored in the storage means.

[0005] Compressing the speech data by vector-quantizing only its spectral information in this way makes the codebook, which is also needed for expansion, smaller than it would be if the power information and the spectral information were vector-quantized together.

[0006]

[Problem to be solved by the invention] In the above speech synthesizer, however, the compressed data stored in the storage means is obtained by compressing the speech data in the form of frequency-axis data. A circuit is therefore needed to convert the speech data obtained by expanding the compressed data output from the storage means into time-axis data. As a result, the speech synthesizer as a whole has a large circuit scale and a large current consumption.

[0007] Accordingly, an object of the present invention is to provide a speech synthesizer that uses vector quantization as the method of compressing the speech data stored in the storage means and that achieves a reduced circuit scale and reduced current consumption.

[0008]

[Means for solving the problem] To achieve the above object, the present invention provides a speech synthesizer that outputs speech in accordance with an input, comprising: a storage means storing compressed data obtained by compressing speech data, which is a digital electric signal obtained by converting speech, by vector quantization while it remains time-axis data; a reading means for reading the compressed data from the storage means in response to an input to the speech synthesizer; an expansion means for expanding the compressed data read by the reading means; and an output means for converting the data obtained through expansion by the expansion means into speech and outputting the speech.

[0009] With this configuration, the speech data obtained by expansion is already time-axis data, so no conversion from frequency-axis data to time-axis data is needed.

[0010]

[Embodiments of the invention] An embodiment of the present invention is described below with reference to the drawings. FIG. 1 is a block diagram of a speech synthesizer according to one embodiment of the present invention. In the figure, 1 is an input unit, 2 is a control unit, 3 is a read sequencer, 4 is a first memory, 5 is a second memory, 6 is a rate converter, 7 is a D/A converter, and 8 is an audio output unit.

[0011] The input unit 1 accepts input from outside. The control unit 2 controls the read sequencer 3 so that the read sequencer 3 designates, in a predetermined order, the predetermined addresses in the first memory 4 that correspond to the input accepted by the input unit 1. Under the control of the control unit 2, the read sequencer 3 designates addresses in the first memory 4.
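The division of labor between the control unit 2 and the read sequencer 3 might be sketched as follows. The source says only that the addresses and their order are "predetermined"; the lookup table `PHRASE_TABLE`, the contiguous ascending address order, and the function names here are assumptions made for illustration.

```python
from typing import Dict, Iterator, Tuple

# Hypothetical table: input value -> (start address, number of codes)
# in the first memory. The patent does not specify this layout.
PHRASE_TABLE: Dict[int, Tuple[int, int]] = {
    0: (0, 3),
    1: (3, 2),
}


def read_sequencer(start: int, length: int) -> Iterator[int]:
    """Designate first-memory addresses one at a time (assumed ascending)."""
    for offset in range(length):
        yield start + offset


def control(input_value: int) -> Iterator[int]:
    """Choose the address range for an input and drive the read sequencer."""
    start, length = PHRASE_TABLE[input_value]
    return read_sequencer(start, length)
```

Under these assumptions, `list(control(1))` yields the addresses 3 and 4, each of which would then be presented to the first memory 4.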

[0012] The data stored in the first memory 4 is as follows. The speech data (assumed here to be 8-bit), which is a digital electric signal obtained by converting the speech to be output from the speech synthesizer, is kept in time-axis form, and every 4 bytes of it are replaced with the code assigned to whichever of N previously prepared 32-bit bit patterns is most similar to those 4 bytes. The first memory 4 stores the compressed data obtained in this way. In other words, the speech data is compressed by vector quantization while it remains time-axis data, and the resulting compressed data is stored in the first memory 4. The first memory 4 outputs the compressed data stored at the address designated by the read sequencer 3.
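The replacement step that produces the contents of the first memory 4 could look roughly like the sketch below. The source does not say how similarity is measured; squared Euclidean distance over the four 8-bit samples is an assumption, as is using the pattern index k directly as the code C(k). The function names are hypothetical.

```python
from typing import List, Sequence


def distance(a: Sequence[int], b: Sequence[int]) -> int:
    """Squared Euclidean distance (assumed similarity measure)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def encode(samples: Sequence[int],
           codebook: Sequence[Sequence[int]]) -> List[int]:
    """Replace each 4-sample block with the index of the closest pattern.

    samples:  8-bit time-axis speech samples (0..255)
    codebook: N patterns, each four 8-bit values (32 bits in total)
    """
    if len(samples) % 4 != 0:
        raise ValueError("sample count must be a multiple of 4")
    codes = []
    for i in range(0, len(samples), 4):
        block = samples[i:i + 4]
        # Assumption: the code C(k) is simply the pattern index k.
        best = min(range(len(codebook)),
                   key=lambda k: distance(block, codebook[k]))
        codes.append(best)
    return codes
```

This exhaustive search is the costly side of vector quantization mentioned in paragraph [0003]; it runs once, offline, when the first memory's contents are prepared, not in the synthesizer itself.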

[0013] As illustrated conceptually in FIG. 2, the second memory 5 stores a codebook that indicates the correspondence between the N previously prepared 32-bit bit patterns b(0), b(1), b(2), ..., b(N-1) used for the vector quantization and the codes C(0), C(1), C(2), ..., C(N-1) with which the speech data was replaced during compression (that is, the codes the compressed data can take). The second memory 5 outputs, in parallel, the bit string of the bit pattern b(k) corresponding to the code C(k) (k = 0, 1, 2, ..., N-1) of the compressed data output from the first memory 4. The second memory 5 thus expands the compressed data output from the first memory 4 into speech data.
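Expansion by the second memory 5 amounts to a table lookup, which the following minimal sketch illustrates. It assumes that the code C(k) is used directly as the index k into the codebook and that each pattern b(k) is held as one 32-bit integer; the name `expand` is hypothetical.

```python
from typing import List, Sequence


def expand(codes: Sequence[int], codebook: Sequence[int]) -> List[int]:
    """Look up the 32-bit pattern b(k) for each code C(k).

    Assumption: the code value is the codebook index, so the compressed
    data can drive the second memory's address lines directly.
    """
    return [codebook[code] for code in codes]
```

No arithmetic is involved, which is consistent with the source's point that expansion needs only a memory holding the codebook.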

[0014] If, for example, 256 bit patterns are prepared in advance for the vector quantization, the codebook size, that is, the storage capacity required of the second memory 5, is 256 × 4 = 1024 bytes.
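The arithmetic above can be checked with a short calculation. The codebook size follows directly from the text; the code width and the resulting compression ratio are not stated in the source and assume that each code is stored with the minimum number of bits needed to distinguish N patterns.

```python
def codebook_bytes(n_patterns: int, pattern_bits: int = 32) -> int:
    """Storage needed for the codebook in the second memory."""
    return n_patterns * pattern_bits // 8


def code_bits(n_patterns: int) -> int:
    """Minimum bits per code (assumption: codes are stored at minimum width)."""
    return max(1, (n_patterns - 1).bit_length())


def compression_ratio(n_patterns: int, pattern_bits: int = 32) -> float:
    """Bits of speech data replaced per bit of stored code."""
    return pattern_bits / code_bits(n_patterns)
```

For N = 256 this gives a 1024-byte codebook and, under the stated assumption, 8-bit codes, so each 32-bit block of speech data would occupy one byte in the first memory.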

[0015] The rate converter 6 receives the 32-bit data output from the second memory 5, divides it into four 8-bit data items, and outputs the four items one after another in a predetermined order. That is, the speech data obtained when the second memory 5 expands the compressed data is 4-byte parallel data, and passing it through the rate converter 6 converts it into serial data.

[0016] The D/A converter 7 converts the speech data output from the rate converter 6 into an analog electric signal. The audio output unit 8 converts the analog electric signal obtained by the D/A converter 7 into sound and outputs it to the outside.

[0017] With the above configuration, the speech data obtained by expanding the compressed data is already time-axis data, so no conversion from frequency-axis data to time-axis data is needed. The circuit for performing that conversion therefore becomes unnecessary, which reduces both the circuit scale and the current consumption.

[0018] Moreover, if the sound source is restricted, for example to a single speaker, or if the kinds of speech to be compressed are limited, the quality of the output speech can be improved for the same codebook size.

[0019]

[Effect of the invention] As described above, the speech synthesizer of the present invention compresses the speech data by vector quantization while it remains time-axis data and stores the resulting compressed data in the storage means. Consequently, when the compressed data read from the storage means is expanded and then converted into speech, the conversion from frequency-axis data to time-axis data that was required in the conventional speech synthesizer using vector quantization is no longer needed. This makes it possible to reduce both the circuit scale and the current consumption.

[Brief description of the drawings]

FIG. 1 is a block diagram of a speech synthesizer according to an embodiment of the present invention.

FIG. 2 is a conceptual diagram showing that the second memory in the speech synthesizer of FIG. 1 stores a codebook.

[Explanation of symbols]

1: input unit; 2: control unit; 3: read sequencer; 4: first memory; 5: second memory; 6: rate converter; 7: D/A converter; 8: audio output unit

Claims

[Claims]

1. A speech synthesizer that outputs speech in accordance with an input, comprising: a first storage unit that stores compressed data obtained by compressing, by vector quantization and without parameter separation, speech data that is a digital electric signal obtained by converting speech; a reading unit that reads the compressed data from the first storage unit in response to an input to the speech synthesizer; a second storage unit that stores a codebook and expands the compressed data read by the reading unit; and an output unit that converts the data obtained through expansion by the second storage unit into speech and outputs the speech.