JPS61296396A

JPS61296396A - Voice code generation

Info

Publication number: JPS61296396A
Application number: JP60138517A
Authority: JP
Inventors: 国澤　寛治; 糸山　博
Original assignee: Matsushita Electric Works Ltd
Current assignee: Panasonic Electric Works Co Ltd
Priority date: 1985-06-25
Filing date: 1985-06-25
Publication date: 1986-12-27
Anticipated expiration: 2009-04-27
Also published as: JPH0632019B2

Abstract

(57) [Abstract] This publication contains application data from before electronic filing, so no abstract data is recorded.

Description

[Detailed Description of the Invention] [Technical Field] The present invention relates to a method of creating speech codes for rule-based synthesis.

[Background Art] Conventionally, in this type of speech synthesis system, a character sequence is input together with information on word accent and sentence intonation, and speech is synthesized from these inputs using phoneme data and rules stored in advance.

However, with this conventional method, the accent position of each word must be entered at the same time as the sentence is typed on the keyboard, which makes the operation extremely cumbersome.

[Object of the Invention] The present invention has been made in view of the above problem, and its object is to provide a method that makes it extremely easy to input accent information when creating a speech code for rule-based synthesis.

[Disclosure of the Invention] The speech code creation method according to the present invention takes speech as input, converts the speech waveform into a character sequence by speech recognition technology, extracts the pitch information of each phoneme, and encodes that pitch information together with the character sequence. Its distinguishing feature is that the character sequence and the accent information can be input easily, either by adding speech input to conventional character input from a keyboard or the like, or by speech input alone.

FIG. 1(a) shows one embodiment of the speech code creation method according to the present invention. In the figure, character input from a keyboard or a character reader is decomposed in step イ into phonological units such as phonemes or syllables and stored. Next, speech input from a microphone or the like is segmented into phonological units in step ロ, and at the same time the resulting phoneme string is compared with the phoneme string derived from the character sequence. If the two do not match, segmentation is performed again, so that the phoneme boundaries are detected accurately. Based on those boundaries, parameters such as the pitch, power, phoneme length, and formant information of each phoneme are extracted in step ハ, added to the character information from the character sequence, and encoded in step ニ.
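The compare-and-retry loop of FIG. 1(a) can be sketched roughly as follows. This is only an illustration under stated assumptions: the `Segment` record, the `align_with_text` function, and the idea that the segmenter takes an attempt number are all hypothetical, since the patent does not say how segmentation is redone or how many retries are allowed.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Sequence


@dataclass
class Segment:
    """One phonological unit found in the speech input (hypothetical record)."""
    phoneme: str     # label assigned by the recognizer
    pitch_hz: float  # mean pitch over the segment
    power: float     # mean power over the segment
    length_ms: int   # phoneme length


def align_with_text(
    text_phonemes: Sequence[str],
    segmenter: Callable[[int], List[Segment]],
    max_attempts: int = 5,
) -> Optional[List[Segment]]:
    """Redo segmentation until its phoneme string matches the one from the text.

    `segmenter(attempt)` is assumed to return a different candidate
    segmentation on each attempt (for example with other thresholds).
    Returns None if no attempt reproduces the text's phoneme string.
    """
    for attempt in range(max_attempts):
        segments = segmenter(attempt)
        if [s.phoneme for s in segments] == list(text_phonemes):
            return segments  # boundaries now agree with the known text
    return None


def toy_segmenter(attempt: int) -> List[Segment]:
    """Stand-in recognizer: the first attempt mislabels the second unit."""
    if attempt == 0:
        return [Segment("a", 140.0, 0.8, 120), Segment("e", 105.0, 0.6, 180)]
    return [Segment("a", 140.0, 0.8, 120), Segment("me", 105.0, 0.6, 180)]


aligned = align_with_text(["a", "me"], toy_segmenter)
```

Once the phoneme strings agree, the per-segment parameters can be attached to the character information and passed on to encoding.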

The code obtained in this way is stored in a memory or printed as a barcode. At synthesis time, as shown in FIG. 1(b), the code is read out in step ホ, decoded into the individual parameters in step ヘ, and synthesized in step ト using the phoneme data and rules stored in advance in the synthesis unit.
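To make the store-then-decode round trip concrete, here is a minimal sketch. The patent does not define the code format, so the 8-byte record layout below (a 3-byte ASCII phoneme label, pitch in Hz, power on a 0 to 255 scale, length in ms) and the names `encode` and `decode` are assumptions chosen purely for illustration.

```python
import struct
from typing import List, Tuple

# Hypothetical record layout, little-endian with no padding:
# 3-byte label, uint16 pitch (Hz), uint8 power (0-255), uint16 length (ms).
RECORD = struct.Struct("<3sHBH")

Param = Tuple[str, int, int, int]


def encode(params: List[Param]) -> bytes:
    """Pack per-phoneme parameters into a byte string (the "speech code")."""
    # Labels shorter than 3 bytes are padded with NUL bytes by struct;
    # longer labels would be truncated, so they are assumed to fit.
    return b"".join(
        RECORD.pack(label.encode("ascii"), pitch, power, length)
        for label, pitch, power, length in params
    )


def decode(code: bytes) -> List[Param]:
    """Recover the per-phoneme parameters from a stored code."""
    return [
        (label.rstrip(b"\0").decode("ascii"), pitch, power, length)
        for label, pitch, power, length in RECORD.iter_unpack(code)
    ]


params = [("a", 140, 200, 120), ("me", 105, 150, 180)]
code = encode(params)
restored = decode(code)
```

A fixed-size record keeps the code compact enough to print as a barcode, but any reversible encoding would serve the same purpose.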

Therefore, in the above embodiment, comparing the phonemes obtained by speech recognition with a known phoneme sequence allows phoneme segmentation to be performed easily and accurately, and information on accent and intonation can be obtained easily from the speech input.

The embodiment of FIG. 2 uses speech input alone: the speech waveform is decomposed into phonemes by segmentation and converted into a character sequence. By extracting information such as pitch and phoneme length at the same time as this segmentation, the separate keyboard input of accent information can be omitted, as in the case of FIG. 1. In this case the accuracy of the speech recognition circuit naturally becomes an issue, but by using, for example, the ambiguity processing scheme that the present inventors have proposed separately, a relatively inexpensive yet highly accurate speech recognition circuit can now be constructed.
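The patent does not say how pitch is measured within a segment. As one common possibility, the sketch below estimates pitch from the peak of a short-time autocorrelation; the function name `estimate_pitch_hz`, the 60 to 400 Hz search range, and the 8 kHz sample rate in the usage lines are all assumptions, not details from the source.

```python
import math
from typing import Optional, Sequence


def estimate_pitch_hz(
    frame: Sequence[float],
    sample_rate: int,
    fmin: float = 60.0,
    fmax: float = 400.0,
) -> Optional[float]:
    """Estimate pitch as sample_rate / lag at the autocorrelation peak.

    Returns None for a silent frame or when no lag in the search range
    gives a positive correlation.
    """
    if sum(x * x for x in frame) == 0.0:
        return None
    lag_min = max(1, int(sample_rate / fmax))
    lag_max = min(int(sample_rate / fmin), len(frame) - 1)
    best_lag, best_score = None, 0.0
    for lag in range(lag_min, lag_max + 1):
        # Unnormalized autocorrelation of the frame with itself at this lag.
        score = sum(frame[n] * frame[n + lag] for n in range(len(frame) - lag))
        if score > best_score:
            best_lag, best_score = lag, score
    if best_lag is None:
        return None
    return sample_rate / best_lag


# Synthetic 40 ms frame of a 125 Hz tone sampled at 8 kHz.
tone = [math.sin(2 * math.pi * 125 * n / 8000) for n in range(320)]
pitch = estimate_pitch_hz(tone, 8000)
```

Real speech would additionally need a voiced/unvoiced decision and smoothing across frames, which this sketch leaves out.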

[Effects of the Invention] As described above, the present invention creates a speech code for rule-based synthesis from character input together with speech input, or from speech input alone. When the speech waveform is converted into a character sequence by segmentation, pitch information and the like are extracted at the same time and used as accent information during rule-based synthesis. Consequently, of the keyboard input of characters and of accent positions required conventionally, at least the input of accent positions can be omitted, which has the advantage of remarkably simplifying the creation of speech codes.
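As an illustration of how per-phoneme pitch could stand in for a typed accent position, the sketch below marks the unit after which pitch falls most steeply. The patent only states that pitch information is used as accent information; the fall-detection rule, the `accent_position` name, the 10% threshold, and the pitch values are assumptions made for this example.

```python
from typing import Sequence


def accent_position(pitches_hz: Sequence[float], min_drop_ratio: float = 0.1) -> int:
    """Return the 1-based index of the unit followed by the steepest pitch fall.

    Returns 0 when no fall between neighbouring units exceeds
    `min_drop_ratio`, i.e. when the contour is treated as unaccented.
    """
    best_pos, best_drop = 0, min_drop_ratio
    for i in range(len(pitches_hz) - 1):
        # Relative fall from unit i to unit i + 1 (negative for a rise).
        drop = (pitches_hz[i] - pitches_hz[i + 1]) / pitches_hz[i]
        if drop > best_drop:
            best_pos, best_drop = i + 1, drop
    return best_pos


falling = accent_position([140.0, 105.0])      # clear fall after the first unit
flat = accent_position([110.0, 135.0, 132.0])  # rise, then a negligible fall
```

With a rule of this kind, the number a user previously typed for each word could be derived from the spoken input instead.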

[Brief explanation of drawings]

FIGS. 1(a) and 1(b) are flowcharts showing one embodiment of the method of the present invention, and FIG. 2 is a flowchart showing another embodiment.

Claims

[Claims]

(1) A speech code creation method which takes speech as input, converts the speech waveform into a character sequence using speech recognition technology, extracts pitch information of each phoneme, and encodes the pitch information together with the character sequence.

(2) The speech code creation method as described, characterized in that a character sequence having the same content as the speech input is input in advance, and pitch information is extracted while the phonemes obtained by segmentation are compared with the phonemes from the character input.

(3) The speech code creation method according to claim 1, characterized in that information on other prosodic features or articulatory combinations is extracted together with the pitch information.