JP2000132180A

JP2000132180A - Voice output device and voice converting method

Info

Publication number: JP2000132180A
Application number: JP10306684A
Authority: JP
Inventors: Takeshi Hamada; 剛濱田; Fuyuhiko Ogoshi; 冬彦大越
Original assignee: Mitsubishi Electric Corp
Current assignee: Mitsubishi Electric Corp
Priority date: 1998-10-28
Filing date: 1998-10-28
Publication date: 2000-05-12
Anticipated expiration: 2018-10-28
Also published as: JP3406230B2

Abstract

PROBLEM TO BE SOLVED: To provide an improved voice output device, and a corresponding method, capable of generating speech that is easier to listen to. SOLUTION: A voice editing unit 3 divides the telephone number received by an output data receiving unit 2 into pairs of digits and reads the corresponding voice data from a voice database 5. The telephone number is thereby converted into voice data in which, within each pair, the larger digit carries the larger accent, the first digit has a rising intonation, and the second digit has a falling intonation. Each hyphen contained in the telephone number data is converted into voice data for "no" (の), voice data for "desu" (です, "it is") is automatically appended after the telephone number, and the resulting speech is output from a voice output unit 4.

Description

DETAILED DESCRIPTION OF THE INVENTION

[0001]

[Technical Field of the Invention] The present invention relates to a voice output device, and more particularly to an improved voice output device and a voice conversion method that generate output speech that is easy to hear.

[0002]

[Prior Art] Voice output devices used in telephone-based automatic ordering systems, reservation systems, and the like output digit strings such as telephone numbers and reservation numbers as speech, in order to read back entered digits or to report processing results. Such a device basically converts the digits to be output into voice data registered in advance and outputs that data. However, speech produced by simply lining up digits differs from human utterance and is therefore not necessarily easy to hear. Devices that generate more easily heard speech have accordingly been proposed.

[0003] For example, Japanese Patent Laid-Open No. 6-59696 discloses a number playback method for a voice response device in which the odd-numbered digits of a consecutive digit string are output with a rising intonation, and the even-numbered digits and the final digit are output with a falling intonation.
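
The position-based rule attributed to the prior art can be sketched as follows. This is a minimal illustration based only on the one-sentence summary above; the function name and the "rising"/"falling" labels are assumptions made for the example and are not taken from the cited publication.

```python
def prior_art_intonation(digits: str) -> list[tuple[str, str]]:
    """Assign intonation by position only: odd positions rise,
    even positions and the final digit fall (positions are 1-based)."""
    result = []
    for position, digit in enumerate(digits, start=1):
        is_last = position == len(digits)
        if position % 2 == 0 or is_last:
            contour = "falling"
        else:
            contour = "rising"
        result.append((digit, contour))
    return result
```

Note that the digit values play no role in this rule, which is the point the present invention departs from.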

[0004]

[Problem to Be Solved by the Invention] However, merely alternating the intonation while outputting a digit string as speech, as in the conventional method, can still leave the result sounding unnatural.

[0005] The present invention has been made to solve the above problem, and its object is to provide an improved voice output device and voice conversion method capable of generating speech that is easier to hear.

[0006]

[Means for Solving the Problem] To achieve the above object, a voice output device according to a first invention comprises: a voice database in which voice data with different accents is registered in advance for each digit from 0 to 9; voice editing means that converts each item of voice output target data, which includes a digit string, into voice data by reading the corresponding voice data from the voice database; and voice output means that synthesizes and outputs the voice data converted by the voice editing means. The voice editing means divides the digit string contained in the voice output target data, in order from its beginning, into groups each consisting of two or more digits, and determines the accent with which each digit is output according to the relative magnitudes of the digits within each group.

[0007] Further, the voice editing means divides the digit string contained in the voice output target data into pairs of two consecutive digits, and reads the voice data of each digit so that, within each pair, the larger digit has the larger accent.

[0008] Further, voice data corresponding to each combination of accent strength and intonation is registered in the voice database for each digit, and the voice editing means determines the accent with which each digit is output according to the order and the relative magnitudes of the digits within each group.

[0009] Furthermore, for each resulting pair of two digits, the voice editing means reads the voice data of each digit so that the larger digit has the larger accent, the first digit has a rising intonation, and the second digit has a falling intonation.

[0010] Further, the device has selection parameter specifying means that specifies a selection parameter indicating a characteristic tendency of the voice data that the voice editing means reads from the voice database, and the voice editing means selects the voice data to be read from the voice database for a given digit according to the selection parameter specified by the selection parameter specifying means.

[0011] A voice conversion method according to the present invention is used in a voice output device in which voice data with different accents is registered in advance for each digit from 0 to 9 and which, for each input digit to be output as speech, reads one of the voice data items corresponding to that digit and outputs it as speech. In this method, the digit string is divided, in order from its beginning, into groups of two or more digits, and the voice data of each digit is read so that the larger digit in each group has the larger accent, thereby converting the digits to be output into voice data.

[0012] Further, in each pair consisting of two digits, the voice data of each digit is read so that the first digit has a rising intonation and the second digit has a falling intonation.

[0013]

[Embodiments of the Invention] Preferred embodiments of the present invention are described below with reference to the drawings.

[0014] Embodiment 1. FIG. 1 is a functional block diagram showing Embodiment 1 of the voice output device according to the present invention. This embodiment is described using, as an example, a voice output device 1 that converts a telephone number containing a digit string into voice data and outputs it as speech. The voice output device 1 has an output data receiving unit 2, a voice editing unit 3, and a voice output unit 4, and is further equipped with a voice database 5 in which voice data is registered. The output data receiving unit 2 accepts a telephone number sent as voice output target data from a host computer or the like. The voice editing unit 3 converts each digit and character of the telephone number into voice data by reading the corresponding voice data from the voice database 5. The voice output unit 4 synthesizes the voice data converted by the voice editing unit 3 and sends it to the telephone network; it also performs line control and the like for this purpose.

[0015] FIG. 2 is a conceptual diagram showing the data structure of the voice database 5 in this embodiment. For each digit from 0 to 9, the voice database 5 holds voice data corresponding to each combination of accent strength and intonation. That is, two accents (relatively large and relatively small) are combined with two intonations (rising and falling), so that four kinds of voice data are registered for each digit: large accent with rising intonation, large accent with falling intonation, small accent with rising intonation, and small accent with falling intonation. The voice database 5 also contains voice data for speaking the hyphen "-", which marks boundaries such as the area code in the telephone number data sent from the host computer or the like, and voice data for a phrase that is automatically appended after the telephone number to mark its end. In this embodiment the hyphen is spoken as "no" (の) and "desu" (です) is appended after the telephone number, so voice data for both is registered in the voice database 5 together with the digit voice data. Voice data other than digits is referred to here as auxiliary voice data. As with the digits, four kinds of voice data, one for each combination of accent and intonation described above, are prepared for each item of auxiliary voice data.
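
One way to picture the database layout described above is a lookup table keyed by symbol, accent, and intonation. The sketch below is illustrative only: the key names, the `"END"` marker, and the clip-name strings are assumptions made for the example, and real entries would be recorded audio rather than file names.

```python
from itertools import product

ACCENTS = ("large", "small")
INTONATIONS = ("rising", "falling")
DIGITS = "0123456789"
# Auxiliary voice data: the hyphen is spoken as "no", the closing word is "desu".
AUXILIARY = {"-": "no", "END": "desu"}


def build_voice_database() -> dict[tuple[str, str, str], str]:
    """Return one placeholder clip name per (symbol, accent, intonation)."""
    database = {}
    symbols = list(DIGITS) + list(AUXILIARY)
    for symbol, accent, intonation in product(symbols, ACCENTS, INTONATIONS):
        spoken = AUXILIARY.get(symbol, symbol)
        database[(symbol, accent, intonation)] = f"{spoken}_{accent}_{intonation}"
    return database


VOICE_DB = build_voice_database()
```

With ten digits and two auxiliary items, each in four variants, the table holds 48 entries.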

[0016] The characteristic feature of this embodiment is that the digits making up the telephone number are divided into pairs and converted into voice data in which, within each pair, the larger digit has the larger accent, the first digit has a rising intonation, and the second digit has a falling intonation. This makes it possible to generate easily heard speech that is closer to human utterance.

[0017] Next, the operation of this embodiment is described using the flowcharts shown in FIGS. 3 and 4. The telephone number "0132-4-5555", sent from the host computer for voice output, is used as an example. This number was constructed artificially, for convenience, to illustrate how this embodiment handles every pattern: a pair of consecutive digits in which the first digit is larger than the second (for example "32"), smaller (for example "01"), or equal ("55"); and, when characters are processed in pairs, a hyphen that appears in the first position (the "-" between "2" and "4") or in the second position ("4-"). If each of these patterns can be handled, it is clear that any arrangement of digits and hyphens, and any number of digits, can be handled as well.

[0018] When the output data receiving unit 2 receives the telephone number from the host computer, it temporarily stores the number in a predetermined storage area. The telephone number data may be in binary or text format; text format is used as the example here. If the telephone number data is stored in a buffer Str, then Str[] = "0132-4-5555" and the character data length (variable name: Strlen) is 11.

[0019] The voice editing unit 3 processes the telephone number received by the output data receiving unit 2 one character at a time from the beginning. In this embodiment, two consecutive digits are paired in order to determine the accent and other properties of each digit. Let m be the position of a character within a pair and n be the position of a character within the telephone number. First, m and n are initialized (step 101). The subsequent processing is then repeated in order for each character of the telephone number data (Str[n], n = 1 to 11) (step 102).
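
The pairing loop can be sketched on its own, before any accent is assigned. The following is a minimal illustration, not the patented flowchart itself: the function name and the use of a string `pending` in place of the buffer Number1 and the counter m are assumptions made for the example, and Python indexing starts at 0 whereas the text counts n from 1.

```python
def group_characters(number: str) -> list[str]:
    """Split a telephone-number string into digit pairs, lone digits,
    and separator characters, scanning from the beginning."""
    units = []
    pending = ""  # first digit of a pair waiting for its partner (m == 2)
    for ch in number:
        if ch.isdigit():
            if pending:
                units.append(pending + ch)  # pair completed
                pending = ""
            else:
                pending = ch  # m == 1: hold the digit
        else:
            if pending:
                units.append(pending)  # digit left without a partner
                pending = ""
            units.append(ch)
    if pending:
        units.append(pending)  # lone digit at the end of the number
    return units
```

Applied to the example number, this yields the pairs "01" and "32", a hyphen, the lone digit "4", a second hyphen, and two "55" pairs.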

[0020] First, the voice editing unit 3 takes the first character (n = 1), Str[1], and determines whether it is a digit (step 103). In this example Str[1] is 0 and is the first digit of a pair (m = 1), so the digit 0 is temporarily stored in the buffer Number1 (steps 104, 105). Then n is incremented to move to the next character, and m is set to 2 so that the next character is paired with the digit stored in Number1 (step 106).

[0021] The voice editing unit 3 takes the next character (n = 2), Str[2], and determines whether it is a digit (step 103). In this example Str[2] is 1 and is the second digit of the pair (m = 2), so the digit 1 is temporarily stored in the buffer Number2 (steps 104, 107). The voice data conversion processing shown in FIG. 4 is then performed (step 108).

[0022] In FIG. 4, the character being processed, Str[n], is a digit and a pair has been formed from two digits, so processing moves to step 203. In step 203, Number1 and Number2, which make up the pair, are compared. Here Number1 (= 0) is smaller than Number2 (= 1), so for the first digit 0 held in Number1, the voice data for digit 0 with a small accent and rising intonation is read from the voice database 5 and stored at the first position (Ann[(n-1) = 1]) of a predetermined storage area Ann that holds the voice data to be output (step 204). For the second digit 1 held in Number2, the voice editing unit 3 reads from the voice database 5 the voice data for digit 1 with a large accent and falling intonation and stores it at the next position of the storage area Ann (Ann[n = 2]) (step 205).

[0023] When the conversion of one pair of digits of the telephone number into voice data is finished in this way, n is incremented to move to the next character and m is initialized (step 109). It is also desirable at this point to initialize the contents of Number1 and Number2, which temporarily hold the digits being converted, in preparation for the subsequent processing.

[0024] The voice editing unit 3 takes the next character (n = 3), Str[3], and determines whether it is a digit (step 103). In this example Str[3] is 3 and is the first digit of a pair (m = 1), so, as above, the digit 3 is temporarily stored in the buffer Number1, n is incremented, and m is set to 2 (steps 104 to 106).

[0025] Next, the voice editing unit 3 takes the next character (n = 4), Str[4], and determines whether it is a digit (step 103). In this example Str[4] is 2 and is the second digit of the pair (m = 2), so the digit 2 is temporarily stored in the buffer Number2 and the voice data conversion processing is performed (steps 104, 107, 108).

[0026] In FIG. 4, processing proceeds in almost the same way as above, but Number1 (= 3) of this pair is larger than Number2 (= 2). Therefore, for the first digit 3 held in Number1, the voice data for digit 3 with a large accent and rising intonation is read from the voice database 5 and stored at the next position of the storage area Ann (Ann[(n-1) = 3]) (step 206). For the second digit 2 held in Number2, the voice editing unit 3 reads from the voice database 5 the voice data for digit 2 with a small accent and falling intonation and stores it at the next position (Ann[n = 4]) (step 207).

[0027] When the conversion of one pair of digits of the telephone number into voice data is finished in this way, n is incremented to move to the next character and m is initialized (step 109).

[0028] The voice editing unit 3 takes the next character (n = 5), Str[5], and determines whether it is a digit (step 103). In this example it is a hyphen, so the voice data conversion processing is performed (step 110).

[0029] In FIG. 4, the character being processed, Str[n], is not a digit, so processing moves to step 208. In step 208, m = 1 here, that is, no digit is being held in Number1 to form a pair, so for Str[5] the voice data for the hyphen with a small accent and falling intonation is read from the voice database 5 and stored at the next position of the storage area Ann (Ann[n = 5]) (step 210). In other words, because the hyphen inserted at boundaries such as the area code of a telephone number is normally spoken as "no" (の), the voice editing unit 3 in this embodiment converts the hyphen into the voice data "no". Then n is incremented to move to the next character and m is initialized (step 111). Since m should already be 1 here, the initialization of m may be omitted.

[0030] Next, the voice editing unit 3 takes the next character (n = 6), Str[6], and determines whether it is a digit (step 103). In this example Str[6] is 4 and is the first digit of a pair (m = 1), so the digit 4 is temporarily stored in the buffer Number1 (steps 104, 105). Then n is incremented to move to the next character, and m is set to 2 so that the next character is paired with the digit stored in Number1 (step 106).

[0031] The voice editing unit 3 takes the next character (n = 7), Str[7], and determines whether it is a digit (step 103). In this example it is a hyphen, so the voice data conversion processing is performed (step 110).

[0032] In FIG. 4, the character being processed, Str[n], is not a digit, so processing moves to step 208. In step 208, m = 2 here, that is, the digit 4 is being held in Number1 to form a pair. For the digit 4, the voice data for digit 4 with a large accent and rising intonation is therefore read from the voice database 5 and stored at the next position of the storage area Ann (Ann[(n-1) = 6]) (step 209). For the hyphen, the voice data for the hyphen with a small accent and falling intonation is read from the voice database 5 and stored at the following position (Ann[n = 7]) (step 210). In other words, although the digit held in Number1 could not form a pair, it would have been the first digit had a pair been formed, so it is converted into voice data with a large accent and rising intonation. The hyphen is processed as a single character: it never forms a pair and is always converted into voice data with a small accent and falling intonation. Then n is incremented to move to the next character and m is initialized (step 111).
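
The non-digit branch (steps 208 to 210) can be sketched as a small helper. This is a hedged illustration: the function name, the tuple layout `(symbol, accent, intonation)`, and the use of `None` for "no digit held in Number1" are assumptions made for the example.

```python
from typing import Optional


def convert_separator(pending_digit: Optional[str],
                      separator: str = "-") -> list[tuple[str, str, str]]:
    """Return the voice-data selections produced when a hyphen is reached.

    pending_digit is the digit held in Number1, or None if no digit is held.
    """
    entries = []
    if pending_digit is not None:
        # Step 209: an unpaired digit is treated as the first digit of a pair.
        entries.append((pending_digit, "large", "rising"))
    # Step 210: the hyphen always gets a small accent and falling intonation.
    entries.append((separator, "small", "falling"))
    return entries
```

For the example number, the hyphen at n = 5 corresponds to `convert_separator(None)` and the hyphen at n = 7 to `convert_separator("4")`.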

[0033] The voice editing unit 3 takes the next character (n = 8), Str[8], and determines whether it is a digit (step 103). In this example Str[8] is 5 and is the first digit of a pair (m = 1), so the digit 5 is temporarily stored in the buffer Number1 (steps 104, 105). Then n is incremented to move to the next character, and because the next character may form a pair with the digit in Number1, m is set to 2 (step 106).

[0034] Next, the voice editing unit 3 takes the next character (n = 9), Str[9], and determines whether it is a digit (step 103). In this example Str[9] is 5 and is the second digit of the pair (m = 2), so the digit 5 is temporarily stored in the buffer Number2 and the voice data conversion processing is performed (steps 104, 107, 108).

[0035] In step 203 of FIG. 4, Number1 (= 5) of this pair is equal to Number2 (= 5). Therefore, for the first digit 5 held in Number1, the voice data for digit 5 with a large accent and rising intonation is read from the voice database 5 and stored at the next position of the storage area Ann (Ann[(n-1) = 8]) (step 206). For the second digit 5 held in Number2, the voice editing unit 3 reads from the voice database 5 the voice data for digit 5 with a small accent and falling intonation and stores it at the following position (Ann[n = 9]) (step 207). In this embodiment, the case where the digits in Number1 and Number2 are equal is handled in the same way as the case where Number1 is larger than Number2, but it may instead be handled in the same way as the case where Number1 is smaller. Alternatively, a separate rule for conversion into voice data may be defined.
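
The pair rule of step 203, including the treatment of equal digits, can be sketched as follows. The function name and the `tie` parameter are assumptions introduced for the example; the default reproduces the behavior of this embodiment (equal digits handled like "first digit larger"), and the alternative value reflects the variation the text allows.

```python
def convert_pair(first: str, second: str,
                 tie: str = "first") -> list[tuple[str, str, str]]:
    """Choose accent and intonation for a pair of digits.

    The larger digit gets the large accent; the first digit always rises
    and the second always falls. `tie` decides which digit counts as the
    larger one when both are equal ("first" or "second").
    """
    a, b = int(first), int(second)
    if a == b:
        first_is_large = tie == "first"
    else:
        first_is_large = a > b
    return [
        (first, "large" if first_is_large else "small", "rising"),
        (second, "small" if first_is_large else "large", "falling"),
    ]
```

For example, `convert_pair("0", "1")` gives the small/rising and large/falling selections of steps 204 and 205, while `convert_pair("3", "2")` and `convert_pair("5", "5")` give those of steps 206 and 207.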

[0036] When the conversion of one pair of digits of the telephone number into voice data is finished in this way, n is incremented to move to the next character and m is initialized (step 109).

[0037] The voice editing unit 3 takes the next character (n = 10), Str[10], and determines whether it is a digit. Since Str[10] = Str[11] = 5, the processing is the same as that described above for Str[8] = Str[9] = 5, and its description is omitted.

[0038] The voice editing unit 3 then attempts to take the next character (n = 12), Str[12], but no such character exists, so it performs the voice data conversion processing for termination (step 112).

[0039] In FIG. 4, this is the processing performed at the end of the number (n > Strlen), so processing moves to step 211. In step 211, m = 1 here, that is, no digit is being held in Number1, so the voice data signifying the end, with a small accent and falling intonation, is read from the voice database 5 and stored at the following position (Ann[n = 12]) (step 212). In this embodiment, "desu" (です) is registered as the voice data for the end. If m = 2 in step 211, a digit is being held in Number1, so before the voice data signifying the end is stored, the voice data for that digit with a large accent and rising intonation is read from the voice database 5 and stored at the position immediately before the end (Ann[(n-1)]) (step 212).

[0040] In this way, the voice editing unit 3 converts the telephone number into the corresponding voice data and stores it in the storage area Ann. The voice output unit 4 then outputs the contents of Ann as speech over the telephone network. The announced speech is "0132 no 4 no 5555 desu" (「０１３２の４の５５５５です」). FIG. 5 shows the accent strength and the rising or falling intonation of the voice data that was read.
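
The whole walk-through for "0132-4-5555" can be reproduced with a compact sketch. It is a simplified reading of the flowcharts in FIGS. 3 and 4 rather than a faithful transcription: the function name, the tuple layout, and the romanized words "no" and "desu" used as symbols are assumptions for the example, and every non-digit character is treated as a hyphen, which the source does not specify.

```python
def convert_number(number: str) -> list[tuple[str, str, str]]:
    """Return the (symbol, accent, intonation) sequence stored in Ann."""
    ann = []          # corresponds to the storage area Ann
    number1 = None    # digit held while waiting for its partner
    for ch in number:
        if ch.isdigit():
            if number1 is None:
                number1 = ch
                continue
            # Equal digits are handled like "first digit larger".
            first_large = int(number1) >= int(ch)
            ann.append((number1, "large" if first_large else "small", "rising"))
            ann.append((ch, "small" if first_large else "large", "falling"))
            number1 = None
        else:
            if number1 is not None:
                ann.append((number1, "large", "rising"))
                number1 = None
            ann.append(("no", "small", "falling"))
    if number1 is not None:
        ann.append((number1, "large", "rising"))
    ann.append(("desu", "small", "falling"))
    return ann


spoken = " ".join(symbol for symbol, _, _ in convert_number("0132-4-5555"))
```

The result has twelve entries, matching the positions Ann[1] to Ann[12] used in the description, and `spoken` spells out the announcement word by word.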

[0041] In this embodiment, when two consecutive digits in the digit string of the telephone number to be output form a pair, the two digits are compared, and the voice conversion gives the larger accent to the digit that is equal or larger and the smaller accent to the digit that is smaller. This makes the output speech smoother and easier to hear. In addition, giving the voice data of the first digit in each pair a rising intonation and that of the second digit a falling intonation makes the output speech easier still to hear.

[0042] Embodiment 2. FIG. 6 is a functional block diagram showing Embodiment 2 of the voice output device according to the present invention. The voice output device 1 of this embodiment has a selection parameter specifying unit 6 in addition to the configuration shown in Embodiment 1. The voice editing unit 3 reads the applicable voice data for the character being processed from the voice database 5 according to the types of the neighboring characters and the relative magnitudes of the digits; the selection parameter specifying unit 6 gives the voice editing unit 3 a selection parameter as a further criterion for reading voice data from the voice database 5. This configuration provides the following effect.

[0043] In Japan, for example, there are regions where people speak with accents and intonation that differ somewhat from the standard, such as raising the pitch at the end of a sentence. In Embodiment 1, each character was converted to voice data uniformly according to the magnitude of the digits and the order of the characters, regardless of such regional intonation. This embodiment is therefore characterized in that the characters to be output can be voiced with non-standard accents and intonation, so that announcements can be made in a voice more familiar to each region. For example, if selection parameter values are assigned to characteristic intonation tendencies, such as 1 for the Hokkaido region and 2 for the Kansai region, then when the voice output device 1 of this embodiment is used only in a particular region, it can announce in a voice that is highly familiar and easy to hear in that region. In Embodiment 1, no auxiliary voice data other than that with a small accent and falling intonation is ever used. Nevertheless, so that cases such as this embodiment can also be handled, the voice database 5 used in Embodiment 1 is likewise provided with four types of auxiliary voice data, in the same way as for the digit voice data.

[0044] Next, the operation of this embodiment is described. It differs from Embodiment 1 only in part of the voice data conversion processing in the voice editing unit 3, so only that part is described. FIG. 7 shows the processing added to the voice data conversion processing shown in FIG. 4.

[0045] For example, when conversion of the telephone number to voice data is complete and "desu" ("is"), which marks the end, is appended to the voice data, the voice editing unit 3 reads from the voice database 5, depending on the value of the specified selection parameter, either voice data with a small accent and falling intonation (steps 213-1 and 213-3) or voice data with a large accent and rising intonation (step 213-2).
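The branch on the selection parameter described here can be sketched as a small lookup table. This is an illustration under stated assumptions: the correspondence between selection parameter value k and step 213-k is assumed rather than taken from the text, and the names DESU_VARIANT_BY_PARAM and desu_key, as well as the key layout, are hypothetical.

```python
from typing import Dict, Tuple

Variant = Tuple[str, str]  # (accent, intonation)

# Assumption: selection parameter value k leads to step 213-k of FIG. 7.
# Steps 213-1 and 213-3 read the same variant; step 213-2 reads the other.
DESU_VARIANT_BY_PARAM: Dict[int, Variant] = {
    1: ("small", "falling"),
    2: ("large", "rising"),
    3: ("small", "falling"),
}


def desu_key(selection_param: int) -> Tuple[str, str, str]:
    """Return a hypothetical database key for the closing word "desu"."""
    accent, intonation = DESU_VARIANT_BY_PARAM[selection_param]
    return ("desu", accent, intonation)
```

Keeping the branch in a table rather than in nested conditionals makes it easy to add further parameter values, and two values may map to the same recording.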

[0046] Because a selection parameter can thus be specified as a criterion for choosing among voice data with different accents and so on, the same telephone number can be voiced with a different accent depending on where the announcement is delivered. How the selection parameter is set can be decided by the system that uses the voice output device 1 and by how it is operated: for example, it may be fixed according to where the voice output device 1 is installed, or switched dynamically according to the area code of the telephone number of the connected line. FIG. 7 illustrates the case of three selection parameter values, but the number of branches is not limited to three. Moreover, since the feature is that voice data can be read in correspondence with the selection parameter value, it is entirely possible, as in the example of FIG. 7, for different parameter values to read the same voice data.
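Dynamic switching by area code, mentioned above as one option, might look like the following sketch. The table contents, the default value, and the function name select_param are hypothetical; the prefixes 011 and 06 are used only as illustrative entries for the Hokkaido and Kansai examples given earlier, and a real system would choose its own table.

```python
from typing import Dict

# Hypothetical default used when no prefix matches (assumed to mean "standard").
DEFAULT_PARAM = 3

# Hypothetical table from area-code prefix to selection parameter value.
PARAM_BY_AREA_CODE: Dict[str, int] = {
    "011": 1,  # illustrative entry for the Hokkaido example
    "06": 2,   # illustrative entry for the Kansai example
}


def select_param(connected_number: str) -> int:
    """Pick a selection parameter from the connected line's number."""
    # Try longer prefixes first so that a more specific entry wins.
    for prefix in sorted(PARAM_BY_AREA_CODE, key=len, reverse=True):
        if connected_number.startswith(prefix):
            return PARAM_BY_AREA_CODE[prefix]
    return DEFAULT_PARAM
```

A fixed installation would simply skip the lookup and store a single constant value.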

[0047] FIG. 7 takes the processing of step 213 as a representative example of branching on the selection parameter value. Because the selection parameter is one condition determining which voice data is read from the voice database 5, the same kind of branch processing as in step 214 is also added to steps 204 to 207, 209, 210, and 212 of FIG. 4, where such reads are performed.

[0048] In this embodiment, the selection parameter specifying unit 6, provided as the selection parameter specifying means, supplies the selection parameter to the voice editing unit 3. However, the selection parameter specifying means may simply be a means for storing the selection parameter. Also, although in this embodiment voice data is read from a single voice database 5, the same effect can be obtained by, for example, preparing a different voice database 5 for each region and switching between them, or by swapping out the installed voice database 5.

[0049] In each of the above embodiments, two accent levels, a relatively large accent and a relatively small accent, are registered in advance in the voice database 5, but three or more accent levels may be prepared to make the announced speech smoother. Also, in each of the above embodiments, digits are paired in order and the accent magnitude and so on of each digit is determined within the pair. For a string of about ten digits, such as a telephone number, pairs of two digits are rhythmically sufficient, but when announcing a number with many digits it may be better to consider accents over groups of three or more digits. Grouping three or more digits in this way is also within the scope of the present invention.
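The text does not say how accents would be distributed within a group of three or more digits. One possible reading, shown purely as an assumption, is to give the large accent to the largest digit of each group (the first such digit on a tie) and the small accent to the rest. The function name plan_groups and the handling of a short final group are likewise hypothetical.

```python
from typing import List, Tuple


def plan_groups(digits: str, group_size: int = 3) -> List[List[Tuple[str, str]]]:
    """Split digits into groups from the start and mark one large accent per group."""
    if group_size < 2:
        raise ValueError("group_size must be at least 2")
    if not (digits.isascii() and digits.isdigit()):
        raise ValueError("expected a non-empty string of digits 0-9")
    groups: List[List[Tuple[str, str]]] = []
    for start in range(0, len(digits), group_size):
        chunk = digits[start:start + group_size]
        # Assumption: the first occurrence of the largest digit in the group
        # takes the large accent; a short final group is handled the same way.
        peak = chunk.index(max(chunk))
        groups.append(
            [(d, "large" if i == peak else "small") for i, d in enumerate(chunk)]
        )
    return groups
```

With group_size set to 2, this reduces to the accent rule of the pair-based embodiments, apart from intonation, which this sketch leaves out.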

[0050] Although application to voice output of telephone numbers has been described as an example, the invention can of course also be applied to systems that announce by voice other data containing digit strings, such as card numbers and reservation numbers.

【００５１】[0051]

[Effects of the Invention] According to the present invention, the magnitude of the accent is determined by the relative magnitude of the digits within each group obtained by dividing the digit string. In particular, because voice data is read so that the larger digit in each group has the larger accent, natural-sounding speech that is easier to hear can be generated and output.

[0052] Furthermore, in addition to accent magnitude, the digits of each group are converted to voice data with rising or falling intonation according to their order, so speech that is still easier to hear can be generated and output.

[0053] In addition, because the voice data read from the voice database during conversion to voice data is selectable, speech adapted to non-standard accents and intonation, such as those peculiar to a region, can be generated. This allows speech that is more familiar in each region to be output.

[Brief description of the drawings]

FIG. 1 is a functional block diagram showing Embodiment 1 of the voice output device according to the present invention.

FIG. 2 is a conceptual diagram showing the data structure of the voice database 5 in Embodiment 1.

FIG. 3 is a flowchart showing the processing for converting to voice data in Embodiment 1.

FIG. 4 is a flowchart showing the voice data conversion processing in FIG. 3.

FIG. 5 is a diagram showing an example of a processing result in Embodiment 1.

FIG. 6 is a functional block diagram showing Embodiment 2 of the voice output device according to the present invention.

FIG. 7 is a flowchart showing only part of the voice data conversion processing in Embodiment 2.

[Explanation of symbols]

1 voice output device, 2 output data receiving unit, 3 voice editing unit, 4 voice output unit, 5 voice database, 6 selection parameter specifying unit.

Claims

[Claims]

1. A voice output device comprising: a voice database in which voice data with different accents is registered in advance for each of the digits 0 to 9; voice editing means for converting voice output target data containing a digit string into voice data by reading the corresponding voice data from the voice database; and voice output means for synthesizing and outputting the voice data converted by the voice editing means, wherein the voice editing means divides the digit string contained in the voice output target data, in order from the start, into groups each consisting of two or more digits, and determines the accent with which each digit is output as speech according to the relative magnitude of the digits in each group.

2. The voice output device according to claim 1, wherein the voice editing means divides the digit string contained in the voice output target data into groups of two consecutive digits and reads the voice data such that the larger digit in each group has the larger accent.

3. The voice output device according to claim 1, wherein voice data corresponding to each combination of accent and intonation is registered in the voice database for each digit, and the voice editing means determines the accent with which each digit is output as speech according to the order and the relative magnitude of the digits in each group.

4. The voice output device according to claim 3, wherein, in a group consisting of two digits, the voice editing means reads the voice data of each digit such that the larger digit has the larger accent, the intonation of the first digit rises, and the intonation of the second digit falls.

5. The voice output device according to claim 1, further comprising selection parameter specifying means for specifying a selection parameter indicating a characteristic tendency of the voice data that the voice editing means reads from the voice database, wherein the voice editing means selects the voice data to be read from the voice database for a given digit according to the specified selection parameter.

6. A voice conversion method for a voice output device in which voice data with different accents is registered in advance for each of the digits 0 to 9 and which, for an input number to be output as speech, reads one of the voice data items corresponding to each digit and outputs it as speech, the method comprising: dividing the digit string, in order from the start, into groups of two or more digits; and converting the number to be output into voice data by reading the voice data of each digit such that the larger digit in each group has the larger accent.

7. The voice conversion method according to claim 6, wherein, in each group of two digits, the voice data of each digit is read such that the intonation of the first digit rises and the intonation of the second digit falls.