JP2003060790A

JP2003060790A - Speech interactive translation service system

Info

Publication number: JP2003060790A
Application number: JP2001247802A
Authority: JP
Inventors: Masaki Matsudaira; 正樹松平
Original assignee: Oki Electric Industry Co Ltd
Current assignee: Oki Electric Industry Co Ltd
Priority date: 2001-08-17
Filing date: 2001-08-17
Publication date: 2003-02-28

Abstract

PROBLEM TO BE SOLVED: To provide a speech dialogue translation service system in which intermediate-language processing is described in XML or the like, and which supplies services through natural dialogue, in the language the user desires, to a user who has only a fixed-line telephone or a mobile telephone. SOLUTION: One or more voice response servers connected to a public line network, one or more translation servers, and one or more service providing servers are interconnected through a network. A translation server translates a service held in a service providing server, and a voice response server provides the translated result by means of spoken dialogue.

Description

Detailed Description of the Invention

[0001]

[Technical Field of the Invention] The present invention relates to a speech dialogue translation service system, and in particular to a system that provides telephone users with a speech dialogue translation service based on speech recognition, translation, and speech synthesis.

[0002]

[Prior Art] As a conventional translation service system, the "multinational service providing system" of Japanese Unexamined Patent Publication No. H9-81569 is known, for example. In this conventional system, a conversion rule management server, which manages rules for converting between information representation formats via an intermediate representation format, is connected to a communication network to which a service server and service clients are connected. The service server and the service clients each have means for converting the information representation format using the conversion rules: when transmitting, they convert words of a specific language into the intermediate representation, and when receiving, they convert the intermediate representation into words of a specific language. By equipping the client with speech synthesis means and speech recognition means, service by spoken dialogue was also possible.

[0003]

[Problems to Be Solved by the Invention] However, the above system presupposes, as the service server and the client, devices with key input and screen output, such as personal computers. It therefore has the problem that users or service providers who own only a fixed-line telephone or a mobile telephone cannot use it.

[0004] In addition, the system merely presents the intermediate representation as a spoken dialogue through the speech synthesis means and the speech recognition means, so the result is not necessarily a natural, easy-to-use dialogue.

[0005] Accordingly, an object of the present invention is to solve the above problems of the conventional system and to provide a speech dialogue translation service system that can be used even by users or service providers who own only a fixed-line telephone or a mobile telephone, that provides services through natural, easy-to-use dialogue in the language the user desires, and that can respond accordingly.

[0006]

[Means for Solving the Problems] To achieve the above object, the speech dialogue translation service system of the present invention is characterized in that one or more voice response servers connected to a public line network, one or more translation servers, and one or more service providing servers are connected by a network; a translation server translates a service held in a service providing server; and a voice response server provides the result by spoken dialogue.

[0007] The voice response server may be provided with dialogue sequence generation means that analyzes the intermediate language and generates a dialogue sequence in a specific language, for example XML or VoiceXML.

[0008] The dialogue sequence generation means may be configured to hold dialogue sequence generation rules for each language.

[0009] The voice response server may be provided with input/output means that outputs speech based on the dialogue sequence and extracts elements of the intermediate language from speech uttered by the user in a specific language or from the user's push-button input. It may also be provided with service management means that manages which service providing server holds the intermediate language corresponding to each service.

[0010] The service providing server may be provided with service holding means that holds the intermediate language corresponding to the services it provides. The intermediate language may consist of a part common to all languages and a language-dependent part.

[0011] The voice response server may be configured as follows. On receiving a telephone call from a user requesting a service, including designation of the service and the service language, it accesses the corresponding service providing server via the network and requests the intermediate language, specifying the service name and the service language. On receiving the intermediate language from the service providing server, it extracts the necessary information from the intermediate language, generates a dialogue sequence from that information using the dialogue sequence generation means, and converses with the user according to that dialogue sequence.

[0012] Further, the voice response server may be configured to use the service management means to determine which service providing server holds the intermediate language corresponding to the designated service, and to access that server. It may also be configured so that, when conversing with the user according to the dialogue sequence, it captures what the user says, assigns the information obtained through the dialogue as values of the language-dependent part of the intermediate language, and sends the intermediate language with the assigned values to the service providing server.

[0013] The service providing server may be configured as follows. When the voice response server requests an intermediate language, the service providing server sends it to the voice response server if it holds an intermediate language corresponding to the requested service name and service language. If the set language of the intermediate language corresponding to the requested service name differs from the requested service language, the service providing server uses the translation server to create, from the intermediate language corresponding to the requested service name, an intermediate language corresponding to the requested service language, and then sends it to the voice response server.

[0014] Further, the service providing server may be configured so that, when it creates the intermediate language for the requested service language from the intermediate language for the requested service name, the created intermediate language records both the value in the original set language and the translated value, distinguished from each other. It may also be configured so that, on receiving an intermediate language from the voice response server, it provides it to the service provider.

[0015] Further, the service providing server may be configured so that, when providing the user's information to the service provider, if the language-dependent part of the intermediate language consists of values in both the original set language and the translated set language, it selects the part corresponding to the original set language and provides that to the service provider.

[0016] Further, the service providing server may be configured so that, when sending the intermediate language to the voice response server, it also sends the dialogue sequence generation rules corresponding to the designated language; and the dialogue sequence generation means may be configured to receive the dialogue sequence generation rules from the service providing server and to use them when generating the dialogue sequence.

[0017] The voice response server may be configured to also have the functions of the service providing server.

[0018]

[Embodiments of the Invention] Embodiments of the speech dialogue translation service system according to the present invention are described below with reference to the drawings.

[0019] (First Embodiment) FIG. 1 is a configuration diagram showing a first embodiment of the speech dialogue translation service system according to the present invention. A plurality of voice response servers 1, a plurality of translation servers 2, and a plurality of service providing servers 6 are connected to a network 3. Each voice response server 1 is connected to a plurality of telephones 5 via a public line network 4, and each service providing server 6 is connected to a computer 7 via a network.

[0020] FIG. 2 shows the configuration of the voice response server 1, and FIG. 3 shows the configuration of the service providing server 6. The voice response server 1 has: line control means 11, which controls the line connection with the user's telephone 5 via the public line network 4; dialogue sequence generation means 12, which generates a dialogue sequence in a specific language from the intermediate language exchanged between voice response servers; input/output means 13, which outputs speech based on the dialogue sequence and extracts elements of the intermediate language from speech uttered by the user in a specific language or from the user's push-button input; service management means 14, which manages which service providing server holds the intermediate language corresponding to each service; and intermediate language analysis means 15, which analyzes the intermediate language and extracts information.

[0021] The dialogue sequence generation means 12 further has dialogue sequence generation rules 16 (not shown) for generating a dialogue sequence from the result of analyzing the intermediate language.

[0022] The translation server 2 has a conversion dictionary 17 (not shown) that describes the correspondence between words and phrases of one language and those of another.

[0023] The service providing server 6 has intermediate language analysis means 18, which analyzes the intermediate language and extracts information, and service holding means 19, which holds the intermediate language.

[0024] First, the user calls one of the voice response servers 1 from a telephone 5 via the public line network 4 and specifies the service language and the service by push-button input or voice input. For example, the user can select the hotel reservation service in Japanese by keying in "1#" (Japanese) and "4#" (hotel reservation), or by saying "Japanese" and "hotel". For calls originating within Japan, Japanese may be assumed as the service language when none is specified.
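The selection step above can be sketched in Python as follows. This is only an illustration under stated assumptions, not part of the disclosure: the codes "1#" and "4#" and the spoken words come from the text, while the table names, the service identifier `hotel_reservation`, and the function `select_service` are hypothetical.

```python
# Hypothetical menu tables. Only "1#" (Japanese), "4#" (hotel reservation),
# and the spoken words "日本語" / "ホテル" are taken from the text.
LANGUAGE_CODES = {"1#": "JPN", "日本語": "JPN"}
SERVICE_CODES = {"4#": "hotel_reservation", "ホテル": "hotel_reservation"}
DEFAULT_LANGUAGE = "JPN"  # assumed default for calls originating in Japan


def select_service(inputs, default_language=DEFAULT_LANGUAGE):
    """Return (language, service) from push inputs or recognized words."""
    language, service = None, None
    for item in inputs:
        if item in LANGUAGE_CODES:
            language = LANGUAGE_CODES[item]
        elif item in SERVICE_CODES:
            service = SERVICE_CODES[item]
    if language is None:
        # No language was specified, so fall back to the default.
        language = default_language
    return language, service
```

Push inputs and recognized words are handled by the same lookup here purely for brevity; a real input/output means would distinguish DTMF from speech.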

[0025] The voice response server 1 uses the service management means 14 to determine which service providing server holds the intermediate language corresponding to the designated service. It then accesses that service providing server 6 via the network 3 and requests the intermediate language, specifying the service name and the service language.
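One way to picture the service management means 14 is as a lookup table from service name to server. The sketch below assumes that model; the class `ServiceManager`, the function `build_request`, and the server identifier used in the usage note are hypothetical and do not appear in the source.

```python
class ServiceManager:
    """Maps a service name to the server that holds its intermediate language."""

    def __init__(self, registry):
        self._registry = dict(registry)

    def locate(self, service_name):
        """Return the server registered for the service, or raise LookupError."""
        try:
            return self._registry[service_name]
        except KeyError:
            raise LookupError(f"no server registered for {service_name!r}") from None


def build_request(manager, service_name, language):
    """Build the request a voice response server would send for this service."""
    return {
        "server": manager.locate(service_name),
        "service": service_name,
        "language": language,
    }
```

For example, with a registry of `{"hotel_reservation": "service-server-6a"}`, calling `build_request(manager, "hotel_reservation", "JPN")` yields a request addressed to `service-server-6a` that names both the service and the service language.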

[0026] If the service providing server 6 holds an intermediate language corresponding to the requested service name and service language, it returns it to the voice response server 1. If the set language of the intermediate language for the requested service name differs from the requested service language, the service providing server 6 uses the intermediate language analysis means 18 and the translation server 2 to create, from the intermediate language for the requested service name, an intermediate language for the requested service language, and then returns it to the voice response server 1.
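The decision described above (return the held intermediate language, or translate it first) might be sketched as below. The storage layout and the `translate` callback are assumptions made for illustration; the callback stands in for the combination of intermediate language analysis means 18 and translation server 2.

```python
def get_intermediate(store, service_name, requested_language, translate):
    """Return the intermediate language for a service in the requested language.

    `store` maps service name -> {set_language: document} (assumed layout).
    `translate(document, source, target)` is a hypothetical callback.
    """
    held = store[service_name]
    if requested_language in held:
        # A version in the requested language is already held: return it as is.
        return held[requested_language]
    # Otherwise translate from the language the service was authored in.
    source_language, document = next(iter(held.items()))
    return translate(document, source_language, requested_language)
```

Whether a translated result is cached for later requests is not stated in the source, so this sketch does not store it.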

[0027] Here, the intermediate language consists of tags common to all languages, language-dependent values, and other information, for example as shown in FIG. 4.

[0028] In FIG. 4, elements enclosed in < > are tags common to all languages, and the content enclosed by a tag pair is its value. Tags can also be nested. For example, <language>ENG</language> indicates that the set language is ENG (English). The lines from <city> to </city> indicate that the value of the city name is not yet set and that the choices are described as a vocabulary list in a file (city.lst), shown for example in FIG. 5. The three lines from <date name="a_date"> to </date> indicate that the value of the variable a_date, which is expressed as "date of arrival", is not yet set. The lines from <attribute> to </attribute> indicate that the value of the variable room_type, which is expressed as "type of room", is not yet set and that the choices are single and twin. In this example the intermediate language is written in XML, but it is not limited to this and may use a dedicated description format.
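Since FIG. 4 is not reproduced in this text, the following Python sketch uses a hypothetical reconstruction of such an intermediate language. The tag names `language`, `city`, `date`, and `attribute`, the `name="a_date"` attribute, the file `city.lst`, and the values come from the description; the root element `service`, the `list` attribute, and the `label` and `option` child elements are assumptions introduced only to make the example parseable.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for FIG. 4 (the real layout may differ).
FIG4_SKETCH = """
<service>
  <language>ENG</language>
  <city list="city.lst"></city>
  <date name="a_date">
    <label>date of arrival</label>
  </date>
  <attribute name="room_type">
    <label>type of room</label>
    <option>single</option>
    <option>twin</option>
  </attribute>
</service>
"""


def describe(xml_text):
    """Extract the set language and the unfilled slots from the document."""
    root = ET.fromstring(xml_text.strip())
    info = {"language": root.findtext("language"), "slots": []}
    for element in root:
        if element.tag == "language":
            continue
        info["slots"].append({
            "tag": element.tag,
            "name": element.get("name"),
            "label": element.findtext("label"),
            "list": element.get("list"),
            "options": [option.text for option in element.findall("option")],
        })
    return info
```

Calling `describe(FIG4_SKETCH)` gives the set language plus one entry per slot, which is roughly the information an intermediate language analysis means would hand to later steps.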

[0029] The intermediate language is translated as follows. First, the service providing server 6 that holds the intermediate language makes a copy of it. Using the intermediate language analysis means 18, the service providing server 6 analyzes the copy from the beginning and extracts the tags, the tag values, and other information. If an extracted tag is the language setting tag, its value is reset to the target language and the original language setting is added as a separate tag. For any other tag, the extracted value is sent via the network 3 to the translation server 2, together with flags for the set language of the intermediate language and the target language, with a request for translation. The translated value is then received and added to the tag's value so that it is distinguished from the original value. If a tag specifies a file as a vocabulary list, an extension for the target language is appended to the file name.
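The translation procedure above can be sketched as a single pass over a copied XML tree. This is a minimal illustration under assumptions: the `list` attribute for the vocabulary file and the `label` element in the usage below are hypothetical, the `translated{original}` notation follows the { } convention the description gives for FIGS. 6 and 7, and `translate` is a stand-in callback for the request to the translation server 2.

```python
import copy
import xml.etree.ElementTree as ET


def translate_intermediate(root, target, translate):
    """Return a translated copy of `root`; the original tree is left untouched."""
    result = copy.deepcopy(root)

    # Language setting tag: reset to the target, keep the original as a new tag.
    language = result.find("language")
    source = language.text
    language.text = target
    original = ET.Element("org_language")
    original.text = source
    result.insert(list(result).index(language) + 1, original)

    for element in result.iter():
        if element.tag in ("language", "org_language"):
            continue
        # Vocabulary list file: append the target-language extension.
        if element.get("list"):
            element.set("list", f"{element.get('list')}.{target}")
        # Any other value: store the translation next to the original value.
        text = (element.text or "").strip()
        if text:
            element.text = f"{translate(text, source, target)}{{{text}}}"
    return result
```

With a document whose set language is ENG and a target of JPN, the copy ends up with `<language>JPN</language>`, an added `<org_language>ENG</org_language>`, a vocabulary file name of `city.lst.JPN`, and labels such as `到着日{date of arrival}`.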

[0030] The translation server 2 translates the received value into the target language using the conversion dictionary 17 (not shown) that corresponds to the received flags for the set language of the intermediate language and the target language, and returns the translated value to the service providing server 6 paired with the original value.
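A dictionary-based translation server of this kind might look like the sketch below. The class name `TranslationServer` and the keying of dictionaries by a (source, target) pair are assumptions; the source only says that a conversion dictionary matching the two language flags is used and that the translated value is returned together with the original.

```python
class TranslationServer:
    """Looks up values in a conversion dictionary chosen by language pair."""

    def __init__(self, dictionaries):
        # Assumed layout: {(source_language, target_language): {word: word}}
        self._dictionaries = dictionaries

    def translate(self, value, source, target):
        """Return (translated value, original value) as a pair."""
        dictionary = self._dictionaries[(source, target)]
        # Unknown words raise KeyError here; the source does not say how
        # untranslatable values are handled.
        return dictionary[value], value
```

Returning the pair rather than the translation alone lets the caller record both values side by side, as the translated intermediate language requires.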

[0031] FIG. 6 shows an example of the translated intermediate language, and FIG. 7 shows the translated vocabulary list file. In FIGS. 6 and 7, the text inside { } is the value in the original language (ENG; English).

[0032] On receiving the intermediate language from the service providing server 6, the voice response server 1 uses the intermediate language analysis means 15 to analyze it from the beginning and extract the tags, the tag values, and other information. In FIG. 6, for example, it obtains the information that the value of the <language> tag is JPN (Japanese), that the value of the <org_language> tag is ENG (English), and that the value of the <city> tag is not yet set and its choices are described in the file city.lst.JPN. For the <date> tag, it obtains the variable a_date, the Japanese label "到着日" (arrival date), the English label "date of arrival", and the fact that the value is not yet set. It then uses the dialogue sequence generation means 12 to generate a dialogue sequence from the extracted tags, tag values, and other information.

[0033] The dialogue sequence generation means 12 generates the dialogue sequence using the dialogue sequence generation rules 16 (not shown), which describe a generation rule for each tag. FIG. 8 shows an example of the dialogue sequence generation rules 16 for Japanese, and FIG. 9 shows an example of a generated dialogue sequence.

[0034] In FIG. 8, setgram sets the recognition vocabulary, prompt outputs speech, getvar performs speech recognition or DTMF input, and submit sends the recognition result; $var and $exp denote variables. In the example of FIG. 9 the dialogue sequence is written in VoiceXML, but it is not limited to this and may use a dedicated description format. For the conversion from XML to VoiceXML, the method described in Information Processing Society of Japan Research Report 2000-SLP-34, "Development of an XML-VoiceXML conversion tool", can be used.
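Because FIG. 8 is not reproduced in this text, the rule set below is a hypothetical reconstruction meant only to show how per-tag rules with variables could be expanded into a dialogue sequence. The command names setgram, prompt, getvar, and submit and the variables $var and $exp come from the description; the `$list` variable, the rule layout, and the output as plain (command, argument) tuples rather than VoiceXML are assumptions.

```python
from string import Template

# Hypothetical Japanese rule set for the <city> tag (stand-in for FIG. 8).
RULES_JPN = {
    "city": [
        "setgram $list",                  # set the recognition vocabulary
        "prompt ${exp}を指定して下さい",  # speech output
        "getvar $var",                    # speech recognition or DTMF input
        "submit $var",                    # send the recognition result
    ],
}


def generate_sequence(rules, slots):
    """Expand the rule for each slot into (command, argument) steps."""
    steps = []
    for slot in slots:
        for line in rules[slot["tag"]]:
            command, _, argument = line.partition(" ")
            steps.append((command, Template(argument).substitute(slot["values"])))
    return steps
```

For a city slot with `var` set to `city`, `exp` set to `都市名`, and `list` set to `city.lst.JPN`, the prompt step expands to "都市名を指定して下さい", matching the prompt quoted in the description of FIG. 9.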

[0035] After creating the dialogue sequence, the voice response server 1 uses the line control means 11 and the input/output means 13 to converse with the user according to the dialogue sequence, captures what the user says, and assigns it as the value of the corresponding tag in the intermediate language. In the dialogue sequence of FIG. 9, for example, the voice response server 1 outputs the speech "都市名を指定して下さい" ("Please specify the city name"), recognizes the user's utterance as one of the city names listed in city.lst.JPN, and assigns the result to the value of the <city> tag of the intermediate language. At that time, the original-language value written inside { } in the vocabulary list file is assigned along with it. For example, if the recognition result is "アトランタ" (Atlanta), the assigned value is "アトランタ{Atlanta}". Similarly, the arrival date obtained through the dialogue is assigned, in (yyyy)mmdd format, to the <date> tag of variable a_date; the departure date, in (yyyy)mmdd format, to the <date> tag of variable a_date; and the room type to the value of the <attribute> tag of variable room_type. The intermediate language after the dialogue is as shown in FIG. 10.
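The assignment of a recognized word together with its original-language value can be sketched as follows. The entry "アトランタ{Atlanta}" is taken from the description; the second entry in the usage note, the function names, and the use of a plain dictionary in place of the XML intermediate language are assumptions made for brevity.

```python
def load_vocabulary(lines):
    """Map each recognizable word to its full 'translated{original}' entry."""
    vocabulary = {}
    for line in lines:
        entry = line.strip()
        if entry:
            # The part before "{" is the word the recognizer listens for.
            vocabulary[entry.split("{", 1)[0]] = entry
    return vocabulary


def fill_slot(document, tag, recognized, vocabulary):
    """Store the recognized word, with its original-language value, under `tag`."""
    document[tag] = vocabulary[recognized]
    return document
```

Given vocabulary lines such as "アトランタ{Atlanta}" and a hypothetical "ボストン{Boston}", recognizing "アトランタ" stores "アトランタ{Atlanta}" as the city value, so the original-language value travels with the recognized one.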

[0036] The voice response server 1 then sends the intermediate language with the assigned values to the service providing server 6. On receiving it, the service providing server 6 again uses the intermediate language analysis means 18 to extract the tags, the tag values, and other information, selects only the values and information corresponding to the original set language, and provides them to the computer 7.

[0037] As described above, with the operation and configuration of the first embodiment of the present invention, services can be provided through natural dialogue in the language the user desires, even to users who own only a fixed-line telephone or a mobile telephone.

[0038] (Second Embodiment) FIG. 11 is a configuration diagram showing a second embodiment of the speech dialogue translation service system according to the present invention. A plurality of voice response servers 1, a plurality of translation servers 2, and a plurality of service providing servers 6 are connected by the network 3. Each voice response server 1 is connected to a plurality of telephones 5 via the public line network 4.

[0039] FIG. 12 shows the configuration of the voice response server 1 in the second embodiment. The voice response server 1 has: line control means 11, which controls the line connection with the user's telephone 5 via the public line network 4; dialogue sequence generation means 12, which generates a dialogue sequence in a specific language from the intermediate language exchanged between voice response servers; input/output means 13, which outputs speech based on the dialogue sequence and extracts elements of the intermediate language from speech uttered by the user in a specific language or from the user's push-button input; service management means 14, which manages which service providing server holds the intermediate language corresponding to each service; intermediate language analysis means 15, which analyzes the intermediate language and extracts information; and service holding means 19.

[0040] Here, the service holding means 19 is the same as the service holding means in the service providing server 6 of the first embodiment.

[0041] When the user calls one voice response server 1 (called "1A" to distinguish it from the other voice response servers) from a telephone 5 via the public line network 4 and specifies the language and the service, the voice response server 1A accesses the voice response server 1 that holds the desired service (called "1B") and requests the intermediate language, specifying the service name and the service language. The voice response server 1A may itself hold the desired service; that is, the voice response server 1B may be the same server as the voice response server 1A.

[0042] Thereafter, the voice response server 1B operates in the same way as the service providing server 6 of the first embodiment, and the voice response server 1A operates in the same way as the voice response server 1 of the first embodiment. Operation is the same as in the first embodiment up to the point where the voice response server 1B receives the intermediate language with the assigned values.

[0043] The voice response server 1B then uses the intermediate language analysis means 15 to analyze the intermediate language from the beginning and extract the tags, the tag values, and other information. It then uses the dialogue sequence generation means 12 to generate a dialogue sequence from the extracted tags, tag values, and other information.

[0044] After creating the dialogue sequence, the voice response server 1B uses the line control means 11 and the input/output means 13 to converse with the service provider according to the dialogue sequence and convey the user's information.

[0045] As described above, according to the second embodiment of the present invention, the voice response server 1 is provided with the service holding means 19 and thus also has the functions of the service providing server 6 of the first embodiment. Services can therefore be provided through natural dialogue in the language the user desires, even to users who own only a fixed-line telephone or a mobile telephone, and can be responded to accordingly.

[0046] Preferred embodiments of the speech dialogue translation service system of the present invention have been described above with reference to the accompanying drawings, but the present invention is not limited to these examples. It is clear that a person skilled in the art could conceive of various changes or modifications within the scope of the technical idea set forth in the claims, and these are naturally understood to fall within the technical scope of the present invention.

[0047]

[Effects of the Invention] According to the speech dialogue translation service system of the present invention, the use of speech recognition and speech synthesis realizes a spoken dialogue system with natural dialogue, and because the intermediate-language processing is described in XML or the like, variations of fixed-form sentences can easily be added.

[Brief description of drawings]

FIG. 1 is a configuration diagram showing a first embodiment of the system according to the present invention.

FIG. 2 is a configuration diagram of the voice response server of the system according to the present invention.

FIG. 3 is a configuration diagram of the service providing server of the system according to the present invention.

FIG. 4 is an example of the intermediate language of the system according to the present invention.

FIG. 5 is an example of a vocabulary list file of the system according to the present invention.

FIG. 6 is an example of the translated intermediate language of the system according to the present invention.

FIG. 7 is an example of a translated vocabulary list file of the system according to the present invention.

FIG. 8 is an example of the dialogue sequence generation rules for Japanese in the system according to the present invention.

FIG. 9 is an example of a dialogue sequence generated by the system according to the present invention.

FIG. 10 is an example of the intermediate language after the dialogue in the system according to the present invention.

FIG. 11 is a configuration diagram showing a second embodiment of the system according to the present invention.

FIG. 12 is a diagram showing the configuration of the voice response server in the second embodiment of the system according to the present invention.

[Explanation of symbols]

1 Voice response server
2 Translation server
3 Network
4 Public line network
5 Telephone
6 Service providing server
7 Computer
11 Line control means
12 Dialogue sequence generation means
13 Input/output means
14 Service management means
15 Intermediate language analysis means
16 Dialogue sequence generation rules
17 Conversion dictionary
18 Intermediate language analysis means
19 Service holding means

Continuation of front page
(51) Int.Cl.⁷, identification code, FI, theme code (reference): H04M 3/523 G10L 3/00 551C 5K101 11/10 571V R
F-terms (reference): 5B091 AA01 BA13 CB12 CB32 CD03 EA09 EA21; 5D015 KK02 LL12; 5D045 AB03 AB04 AB24; 5K015 AA06 AA07 GA07; 5K024 BB01 BB02 DD01 DD02; 5K101 KK15 MM07 NN07 NN16

Claims

[Claims]

1. A speech dialogue translation service system in which one or more voice response servers connected to a public line network, one or more translation servers, and one or more service providing servers are connected by a network, wherein the translation server translates the service held in the service providing server, and the voice response server provides the translated result by voice dialogue.

2. The speech dialogue translation service system according to claim 1, wherein the voice response server has dialogue sequence generation means for analyzing the intermediate language to generate a dialogue sequence in a specific language.

3. The speech dialogue translation service system according to claim 2, wherein the dialogue sequence generation means holds a dialogue sequence generation rule for each language.

4. The speech dialogue translation service system according to any one of claims 1 to 3, wherein the voice response server has input/output means that outputs speech based on the dialogue sequence and extracts intermediate language elements from speech in a specific language uttered by the user or from the user's push-button input.

5. The speech dialogue translation service system according to any one of claims 1 to 4, wherein the voice response server has service management means for managing which service providing server holds the intermediate language corresponding to each service.

6. The speech dialogue translation service system according to claim 1, wherein the service providing server has service holding means that holds the intermediate language corresponding to the service to be provided.

7. The speech dialogue translation service system according to claim 1, wherein the intermediate language is composed of a language-common portion and a language-dependent portion.

8. The speech dialogue translation service system according to any one of claims 1 to 7, wherein the voice response server, upon receiving from a user a telephone call requesting service provision that designates a service and a service providing language, accesses the corresponding service providing server via the network and requests the intermediate language by designating the service name and the service providing language, and, upon receiving the intermediate language from the service providing server, extracts the necessary information from the intermediate language, generates a dialogue sequence based on that information using the dialogue sequence generation means, and interacts with the user according to the dialogue sequence.

9. The speech dialogue translation service system according to any one of claims 1 to 8, wherein the voice response server uses the service management means to determine the service providing server that holds the intermediate language corresponding to the designated service, and accesses that service providing server.

10. The speech dialogue translation service system according to any one of claims 1 to 9, wherein the voice response server, when interacting with the user according to the dialogue sequence, obtains the content of the user's utterances, substitutes the information obtained through the dialogue as values of the language-dependent portion of the intermediate language, and sends the intermediate language with the substituted values to the service providing server.

11. The speech dialogue translation service system according to claim 1, wherein the service providing server, upon receiving an intermediate language request from the voice response server, sends the intermediate language to the voice response server if it holds an intermediate language corresponding to the requested service name and service providing language, and, if the set language of the intermediate language corresponding to the requested service name differs from the requested service providing language, requests the translation server to translate the intermediate language corresponding to the requested service name, creates an intermediate language corresponding to the service providing language, and then sends it to the voice response server.

12. The speech dialogue translation service system according to claim 1, wherein, when the service providing server creates an intermediate language corresponding to the requested service providing language from the intermediate language corresponding to the requested service name, both the values in the original set language and the translated values are described separately in the created intermediate language.

13. The speech dialogue translation service system according to any one of claims 1 to 12, wherein the service providing server, upon receiving the intermediate language from the voice response server, provides the intermediate language to the service provider.

14. The speech dialogue translation service system according to any one of claims 1 to 13, wherein the service providing server, when providing the user's information to the service provider, selects the part corresponding to the original set language and provides it to the service provider if the language-dependent portion of the intermediate language is composed of values in both the original set language and the translated set language.

15. The speech dialogue translation service system according to claim 1, wherein the service providing server, when sending the intermediate language to the voice response server, also sends the dialogue sequence generation rule corresponding to the designated language, and the dialogue sequence generation means receives the dialogue sequence generation rule from the service providing server and uses the received rule when generating the dialogue sequence.

16. The speech dialogue translation service system according to any one of claims 1 to 15, wherein the voice response server also has the function of the service providing server.
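
The request handling recited for the service providing server (return the held intermediate language if it matches the requested language, otherwise have it translated, keep the original values alongside the translated ones, and hand the original-language values to the service provider) can be sketched roughly as follows in Python. This is only an illustrative sketch under stated assumptions: the IntermediateLanguage class, the handle_request and values_for_provider functions, the dictionary-based store, and the translate callback are all hypothetical names introduced here, and the translation server is stood in for by a plain function.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Optional, Tuple


@dataclass
class IntermediateLanguage:
    """Hypothetical in-memory form of an intermediate language document."""
    service: str
    lang: str                             # language of `dependent`
    common: Dict[str, str]                # language-common portion
    dependent: Dict[str, str]             # language-dependent portion
    original_lang: Optional[str] = None   # set only on translated copies
    original: Dict[str, str] = field(default_factory=dict)


Store = Dict[Tuple[str, str], IntermediateLanguage]
Translate = Callable[[str, str, str], str]  # (text, source_lang, target_lang)


def handle_request(store: Store, translate: Translate,
                   service: str, lang: str) -> IntermediateLanguage:
    """Return the intermediate language for (service, lang), translating if needed."""
    held = store.get((service, lang))
    if held is not None:
        return held  # already held in the requested language
    # Otherwise take any held version of the service as the translation source.
    source = next((il for (name, _), il in store.items() if name == service), None)
    if source is None:
        raise KeyError(f"no intermediate language held for service {service!r}")
    return IntermediateLanguage(
        service=service,
        lang=lang,
        common=dict(source.common),  # language-common portion is copied as-is
        dependent={k: translate(v, source.lang, lang)
                   for k, v in source.dependent.items()},
        original_lang=source.lang,
        original=dict(source.dependent),  # original values kept separately
    )


def values_for_provider(il: IntermediateLanguage) -> Dict[str, str]:
    """Select the original-language values when a translated copy carries both."""
    return il.original if il.original_lang is not None else il.dependent
```

Keeping the original values next to the translated ones is what allows values_for_provider to return text in the provider's own language without a second translation step.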