JP2002007414A

JP2002007414A - Voice browser system

Info

Publication number: JP2002007414A
Application number: JP2000191280A
Authority: JP
Inventors: Yoshio Nakajima; 芳夫中島; Shuhei Takimoto; 周平滝本
Original assignee: Sumitomo Electric Industries Ltd
Current assignee: Sumitomo Electric Industries Ltd
Priority date: 2000-06-26
Filing date: 2000-06-26
Publication date: 2002-01-11

Abstract

PROBLEM TO BE SOLVED: To realize a voice browser system in which voice information can be acquired independently of image information, by constructing the voice information in a file separate from the image information. SOLUTION: The voice browser system includes a personal computer 13 connected to input/output devices such as a microphone 14, a speaker 15, a display device 16 and a keyboard 17. The extension .anx, which indicates a voice information file, is added to each voice information file. When a file with the .anx extension is read, a voice-driven browser reads aloud the text of the voice reading tags in the file and jumps to a link destination according to voice input from the user.

Description

DETAILED DESCRIPTION OF THE INVENTION

[0001]

[Technical Field of the Invention] The present invention relates to a voice browser system in which file contents are read aloud in a synthesized voice at a client terminal and in which file requests can be issued to the client terminal by voice.

[0002]

[Prior Art] Traffic information providing systems are currently being studied that deliver parking information, congestion information, road information, sightseeing information and the like over the Internet to a car navigation device or to a personal computer brought into a vehicle (hereinafter collectively referred to as the "in-vehicle device"). The driver can call up a screen carrying this information by using a browser (browsing software) installed on the in-vehicle device.

[0003] However, obtaining information requires the driver to perform prescribed operations on the in-vehicle device, so information cannot be obtained while driving. A voice browser system that can be operated by voice, and that therefore allows information to be obtained safely even while driving, is desired. A voice browser system operable by voice has already been proposed (JP-A-11-249867). That system extracts the portions that can be read aloud from an ordinary HTML (hypertext markup language) file describing text and image information, converts them to speech data, and reads them to the user; when the user speaks to designate a link, it recognizes the speech and designates the corresponding URL (uniform resource locator).

[0004]

[Problems to Be Solved by the Invention] Because the above voice browser system is based on HTML files that are essentially built around image information, it may read aloud portions that are not actually suited to being read aloud, or use link names that are difficult to designate by voice. In addition, because only the HTML file is used and speech is mapped onto it, video and audio cannot be controlled separately. For example, a screen that prompts for voice input several times while the same screen is displayed cannot be constructed.

[0005] Accordingly, an object of the present invention is to realize a voice browser system in which voice information can be acquired independently of image information, by constructing the voice information in a file separate from the image information.

[0006]

[Means for Solving the Problems] The voice browser system of the present invention has voice input/output means, voice recognition/synthesis means, and voice information expansion processing means. A symbol indicating that a file is a voice information file is added to each voice information file. When a file bearing that symbol is called, the voice information expansion processing means reads aloud the text of the voice reading tags in the file and jumps to a link destination according to voice input from the user (claim 1).

[0007] With this configuration, voice information can be described freely by using voice reading tags, without regard to images. The voice browser system of the present invention may also have image input/output means, voice input/output means, voice recognition/synthesis means, voice information expansion processing means, and image information expansion processing means. A symbol indicating a voice information file is added to each voice information file, and a symbol indicating an image information file is added to each image information file. When a file bearing the voice information symbol is called, the voice information expansion processing means reads aloud the text of the voice reading tags in the file and jumps to a link destination according to voice input from the user. When a file bearing the image information symbol is called, the image information expansion processing means displays the display tags in the file and jumps to a link destination according to screen input from the user (claim 2).

[0008] With this configuration, images and voice are held in separate files, so voice information can be described freely without regard to images, and image information can be described freely without regard to voice. In the voice browser system of claim 2, a file for information associating voice information with image information may also be defined. When a file bearing the symbol for such an association file is called, then for each voice information file listed in it, the text of the voice reading tags in that voice information file is read aloud and a jump to a link destination is made according to voice input from the user; for each image information file listed in it, the display tags in that image information file are displayed and a jump to a link destination is made according to screen input from the user (claim 3).

[0009] This configuration permits a file system in which, for example, all of the image information is built first, voice information matching the images is built afterwards, and the associations are defined last. The purpose of each file may be indicated by the extension of its URL (claim 4), in which case file types are distinguished by their extensions. The voice information file can be described by, for example, voice tags, voice commands, and link destination descriptions (claim 5).

[0010] The image information file can be described in the commonly used HTML language (claim 6).

[0011]

[Embodiments of the Invention] Embodiments of the present invention are described below in detail with reference to the accompanying drawings. FIG. 1 is a block diagram showing the overall system configuration in which the present invention is implemented. The overall system includes an in-vehicle device 1, a server 2, and a dedicated network 3 and the Internet 4 that connect them. The in-vehicle device 1 has a navigation device 12 that detects a position based on signals from various sensors 11, such as a GPS receiver, a vehicle speed sensor and a gyro, and determines the position of the vehicle on a road based on this position and map data stored in a map database.

[0012] The in-vehicle device 1 further includes a personal computer 13 that takes in vehicle position information from the navigation device 12 and is connected to input/output devices such as a microphone 14, a speaker 15, a display device 16 and a keyboard 17. Other input devices, such as a mouse, may also be provided. The personal computer 13 is connected through a portable transceiver 18 to the dedicated network 3 of an Internet connection service. Examples of Internet connection services include i-mode (the Internet connection service of NTT DoCoMo, Inc.), EZweb (the Internet connection service of the DDI Cellular and Tu-Ka groups) and J-Sky Web (the Internet connection service of the J-Phone group), but the term is not limited to these and covers any service capable of providing information to the driver through the Internet. Instead of the portable transceiver 18, any transceiver may be used, such as a car phone or a two-way beacon transceiver.

[0013] A server 2 for providing information to each vehicle (information useful to drivers, such as parking information, congestion information, road information and sightseeing information) is set up on the Internet 4, which is connected to the dedicated network 3. The server 2 distinguishes voice information (content) from image information (content) and stores them in separate files 21 and 22, respectively. The image information is described in HTML, whereas the voice information is described in VINX (Voice Internet Navigation mark-up language based on XML), a language based on XML (eXtensible Markup Language).

[0014] A file 23 describing information that associates images with voice is also prepared. The extensions .vnx, .anx and .fnx are used to distinguish these kinds of information: .vnx denotes image information, .anx denotes voice information, and .fnx denotes association information. FIG. 2 is a functional block diagram of the voice-driven browser (Voice Activated Browser) inside the personal computer 13 for the case in which the in-vehicle device 1 receives information from the server 2. The voice-driven browser interprets XML and HTML tags and executes functions according to the definition of each tag. Display content is described in HTML and voice content is described in VINX. The content of the voice tags is read aloud in synthesized speech by the speech synthesis module 135.

[0015] In more detail, when a file is received from the receiving section of the portable transceiver 18, its extension (.vnx, .anx or .fnx) is determined (131). If the extension is .vnx, the file is expanded as image information (132); if it is .anx, the file is expanded as voice information (133); if it is .fnx, the associated image information file or voice information file is expanded (134). Depending on its type, the expanded information is either displayed on the screen of the display device 16 or synthesized into speech and output from the speaker 15, and is thereby conveyed to the driver.
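
The extension check performed in block 131 can be illustrated with a short sketch. This is only an illustration under stated assumptions, not the patent's implementation: the function name `classify_url` and the returned labels are hypothetical, and only the three extensions and their meanings come from the description above.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse

# Extension meanings as stated in paragraphs [0014] and [0015].
# The label strings themselves are illustrative assumptions.
KIND_BY_EXTENSION = {
    ".vnx": "image",        # expanded as image information (132)
    ".anx": "voice",        # expanded as voice information (133)
    ".fnx": "association",  # expanded via the association file (134)
}


def classify_url(url):
    """Return 'image', 'voice', 'association' or 'unknown' for a URL."""
    # Look only at the path component so that a query string or fragment
    # does not hide the extension.
    suffix = PurePosixPath(urlparse(url).path).suffix.lower()
    return KIND_BY_EXTENSION.get(suffix, "unknown")
```

A caller would use the returned label to select the image, voice or association expansion routine; files with any other extension fall through as "unknown".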

[0016] FIG. 3 is a functional block diagram of the voice-driven browser for the case in which the in-vehicle device 1 issues a file request to the server 2. The voice-driven browser identifies a URL (137) based either on content specified with the keyboard 17 while viewing the screen of the display device 16, or on content spoken into the microphone 14 and recognized by the speech recognition module 136, and issues a file request (138). The request is transmitted from the transmitting section of the portable transceiver 18.

[0017] FIGS. 4 to 7 are flowcharts showing how processing is executed according to the extension attached to a URL when the driver specifies the URL. First, as shown in FIG. 4, when the driver specifies a URL (step S1), processing specific to the extension is performed if the extension attached to the URL is .vnx, .anx or .fnx (steps S2 to S4). If the extension is .vnx, the personal computer 13 performs image information expansion processing A, as shown in FIG. 5: the specified file is read over the Internet (step S21) and processing is performed according to its tags (steps S22 to S24). As usual, this processing includes character display, coloring, image file display, table creation, link display and the like; specific examples are given later.

[0018] If the extension is .anx, the personal computer 13 performs voice information expansion processing B, as shown in FIG. 6: the specified file is read over the Internet (step S31) and processing is performed according to its tags (steps S32 to S34). This processing includes reading aloud the text of voice reading tags and extracting, from voice link tags, the spoken words (commands) used to jump to link destinations. If the extension is .fnx, the personal computer 13 performs association information expansion processing, as shown in FIG. 7: the specified file is read over the Internet (step S41) and a tag is taken out (step S42); if it refers to an image information file, image information expansion processing A of FIG. 5 is performed (steps S43 to S44), and if it refers to a voice information file, voice information expansion processing B of FIG. 6 is performed (steps S45 to S46).
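
The association information expansion of FIG. 7 (steps S41 to S46) might be sketched as below. The text does not reproduce the syntax of an .fnx file, so the `<association>` and `<file href="...">` elements are purely hypothetical stand-ins, as is the function name `expand_association`; only the branching on image versus voice information files follows the description.

```python
import xml.etree.ElementTree as ET

# Hypothetical association file. The real .fnx syntax is not given in the
# text; these element and attribute names are assumptions for illustration.
SAMPLE_FNX = """
<association>
  <file href="main.vnx"/>
  <file href="main.anx"/>
</association>
"""


def expand_association(fnx_text):
    """Return a list of (process, href) pairs for an association file.

    'A' stands for image information expansion processing (FIG. 5) and
    'B' for voice information expansion processing (FIG. 6).
    """
    plan = []
    root = ET.fromstring(fnx_text.strip())
    for entry in root.iter("file"):          # step S42: take out each tag
        href = entry.get("href", "")
        if href.endswith(".vnx"):            # steps S43 to S44
            plan.append(("A", href))
        elif href.endswith(".anx"):          # steps S45 to S46
            plan.append(("B", href))
    return plan
```

Returning a plan rather than performing the expansion directly keeps the sketch independent of any display or speech synthesis code.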

[0019] FIG. 8 is an XML format diagram showing a specific example of an image information file. The title displayed is "Driver Information", and the contents are "This Week's Recommendation", "Drive Route Information" and "Useful Information". When the driver clicks one of these items, the display switches to the detail screen for that item. For example, if the "This Week's Recommendation" line is clicked, the association information file recommend.fnx is called. FIG. 9 is an XML format diagram showing a specific example of a voice information file. The text read aloud is: "We provide information useful to drivers. Please choose the information you like. You can choose Recommended, Drive Route, or Information." If the driver says "Recommended", the voice information file "recommend.anx" is called. If the driver says "Drive Route", the voice information file "driveroute.anx" is called. If the driver says any of "Information", "Useful" or "Useful Information", the voice information file "useful.anx" is called.
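
To make the relation between read-aloud text, voice commands and link destinations concrete, the following sketch parses a voice information file shaped like the FIG. 9 example. The actual VINX tag names are not given in the text, so `<vinx>`, `<speak>`, `<link>` and `<command>` are hypothetical, as are the function names; the commands and target files are the ones listed in the paragraph above.

```python
import xml.etree.ElementTree as ET

# Hypothetical voice information file. Tag names are assumptions; the
# prompt, commands and link targets follow the FIG. 9 description.
SAMPLE_ANX = """
<vinx>
  <speak>We provide information useful to drivers. Please choose the information you like.</speak>
  <link href="recommend.anx"><command>recommended</command></link>
  <link href="driveroute.anx"><command>drive route</command></link>
  <link href="useful.anx">
    <command>information</command>
    <command>useful</command>
    <command>useful information</command>
  </link>
</vinx>
"""


def parse_voice_file(anx_text):
    """Return (text_to_read, {command: link_target}) for a voice file."""
    root = ET.fromstring(anx_text.strip())
    prompt = " ".join((s.text or "").strip() for s in root.iter("speak"))
    commands = {}
    for link in root.iter("link"):
        for cmd in link.iter("command"):
            # Several spoken commands may map to the same link target.
            commands[(cmd.text or "").strip().lower()] = link.get("href")
    return prompt, commands


def resolve(utterance, commands):
    """Map a recognized utterance to a link target, or None if unknown."""
    return commands.get(utterance.strip().lower())
```

In this sketch `prompt` would be handed to the speech synthesizer, and the output of the speech recognizer would be passed to `resolve` to obtain the next file to request.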

[0020] FIG. 10 shows an example of the text read aloud for the voice information file "recommend.anx". FIG. 11 shows an example for the voice information file "driveroute.anx". FIG. 12 shows an example for the voice information file "useful.anx". FIG. 13 shows a specific example of association information expansion processing: if the specified file name is the image information file main.vnx, image information expansion processing is performed, and if it is the voice information file main.anx, voice information expansion processing is performed.

[0021] FIG. 14 shows concretely, following the processing illustrated in FIGS. 8 to 12, the screens displayed on the display device, the words spoken into the microphone, and the speech heard from the speaker. FIG. 14(a) shows the screen of the image information file of FIG. 8 and an example of the voice information file of FIG. 9 being read aloud. From this state, if the driver clicks "Recommended" or says "Recommended", the display switches to the screen of FIG. 14(b) and synthesized speech such as "This week's recommendation is the ○× Hotel in Beppu Onsen. ..." is produced. If the driver clicks "Drive Route" or says "Drive Route", the display switches to the screen of FIG. 14(c) and synthesized speech such as "This is drive route information. How about the Wangan Expressway in Osaka? ..." is produced.

[0022] If the driver clicks "Useful Information" or says any of "Information", "Useful" or "Useful Information", the display switches to the screen of FIG. 14(d) and synthesized speech such as "Here is this week's useful driving information. ..." is produced. Next, the modes of transition between image information files and voice information files in the embodiment of the present invention are described.

[0023] FIG. 15 shows an example in which the image and the voice transition in complete synchronization. FIG. 15(a) shows a state in which the association information file is aaa.fnx and the associated image information file aaa.vnx and voice information file aaa.anx are open. If a link to bbb.fnx is designated from this state, the image information file transitions to bbb.vnx and the voice information file transitions to bbb.anx. If a link to ccc.fnx is then designated, the image information file transitions to ccc.vnx and the voice information file transitions to ccc.anx.

[0024] FIG. 16 shows an example in which the voice does not change and only the image transitions. FIG. 16(a) shows a state in which the association information file is aaa.fnx and the associated image information file aaa.vnx and voice information file aaa.anx are open. If a link to the image information file bbb.vnx is designated from this state, the image information file transitions to bbb.vnx but the voice information file remains aaa.anx. If a link to ccc.fnx is then designated, the image information file transitions to ccc.vnx and the voice information file transitions to ccc.anx.

[0025] FIG. 17 illustrates a specific example in which only the image transitions. In FIG. 17(a), a map image of a given scale is displayed. If the command "enlarge" is input, the image information file corresponding to the enlarged map is read and the screen switches to the enlarged map, as shown in FIG. 17(b), while the voice information file remains unchanged. FIG. 18 shows an example in which the image does not change and only the voice transitions. FIG. 18(a) shows a state in which the association information file is aaa.fnx and the associated image information file aaa.vnx and voice information file aaa.anx are open. If a link to the voice information file bbb.anx is designated from this state, the voice information file transitions to bbb.anx but the image information file remains aaa.vnx. If a link to ccc.fnx is then designated, the image information file transitions to ccc.vnx and the voice information file transitions to ccc.anx.
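
The three transition modes of FIGS. 15, 16 and 18 can be summarized as a small state model. This is a sketch under a simplifying assumption: an association file `X.fnx` is taken to pair `X.vnx` with `X.anx`, which matches the aaa/bbb/ccc examples in the figures but is not stated as a general rule (in practice the pairing would be read from the association file itself). The names `BrowserState` and `follow_link` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BrowserState:
    image: str  # currently open image information file (.vnx)
    voice: str  # currently open voice information file (.anx)


def follow_link(state, target):
    """Return the state after a link to `target` is designated."""
    stem, _, ext = target.rpartition(".")
    if ext == "fnx":
        # Synchronized transition (FIG. 15). Assumption: X.fnx pairs
        # X.vnx with X.anx, as in the figures' examples.
        return BrowserState(image=f"{stem}.vnx", voice=f"{stem}.anx")
    if ext == "vnx":
        # Image-only transition (FIG. 16): the voice file is kept.
        return BrowserState(image=target, voice=state.voice)
    if ext == "anx":
        # Voice-only transition (FIG. 18): the image file is kept.
        return BrowserState(image=state.image, voice=target)
    return state  # unknown extension: leave both files as they are
```

Starting from `BrowserState("aaa.vnx", "aaa.anx")`, a link to `bbb.vnx` changes only the image side, and a later link to `ccc.fnx` brings both sides back into step.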

[0026] FIG. 19 shows a specific example in which only the voice transitions. In FIG. 19(a), a map image of a given scale is displayed. If the command "scroll" is input, then because scrolling is not permitted while the vehicle is moving, a scroll-prohibited voice information file is called and the message "Scrolling is not available while driving. ..." is read aloud. The screen remains unchanged, as shown in FIG. 19(b). Embodiments of the present invention have been described above, but implementation of the invention is not limited to these embodiments. For example, although the embodiments use the extension attached to the URL to distinguish the kinds of information, the kind can also be determined from a header written at the beginning of the file.
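
As a rough illustration of the header-based alternative mentioned above, the sketch below reads the kind of information from the first line of a file. The text does not define any header format, so the `kind: <value>` convention, the accepted values and the function name `kind_from_header` are all assumptions made only for this example.

```python
def kind_from_header(file_text):
    """Return 'image', 'voice' or 'association' from a header line.

    Assumes a hypothetical first line of the form 'kind: <value>'.
    Returns None when no such header is present.
    """
    lines = file_text.strip().splitlines()
    first_line = lines[0] if lines else ""
    key, sep, value = first_line.partition(":")
    if sep and key.strip().lower() == "kind":
        value = value.strip().lower()
        if value in ("image", "voice", "association"):
            return value
    return None
```

A browser could try such a header first and fall back to the URL extension when the function returns None.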

[0027]

[Effects of the Invention] As described above, the voice browser system of the present invention allows voice information to be constructed in a file separate from image information, so the voice information alone can be controlled independently. Consequently, when the system is used, for example, as an information terminal mounted in an automobile, voice information can be acquired freely while driving safety is maintained, which makes for an easy-to-use system.

[Brief description of the drawings]

FIG. 1 is a block diagram showing the configuration of a communication system for implementing the voice browser system of the present invention.

FIG. 2 is a functional block diagram of the voice-driven browser when the in-vehicle device 1 receives information from the server 2.

FIG. 3 is a functional block diagram of the voice-driven browser when the in-vehicle device 1 issues a file request to the server.

FIG. 4 is a flowchart showing how each process is executed according to the extension attached to a URL when the driver specifies the URL.

FIG. 5 is a flowchart showing the flow of image information processing.

FIG. 6 is a flowchart showing the flow of voice information processing.

FIG. 7 is a flowchart showing the flow of association information processing.

FIG. 8 is an XML format diagram showing a specific example of an image information file.

FIG. 9 is an XML format diagram showing a specific example of a voice information file.

FIG. 10 is a diagram showing an example of the text read aloud for the voice information file "recommend.anx".

FIG. 11 is a diagram showing an example of the text read aloud for the voice information file "driveroute.anx".

FIG. 12 is a diagram showing an example of the text read aloud for the voice information file "useful.anx".

FIG. 13 is a diagram showing a specific example of association information expansion processing.

FIG. 14 is a drawing giving specific examples of the screens displayed on the display device, the words spoken into the microphone, and the speech heard from the speaker, following the processing illustrated in FIGS. 8 to 12.

FIG. 15 is a diagram for explaining a mode of transition between image information files and voice information files, showing an example in which the image and the voice transition in complete synchronization.

FIG. 16 is a diagram for explaining a mode of transition between image information files and voice information files, showing an example in which the voice does not change and only the image transitions.

FIG. 17 is a screen diagram for explaining a specific example in which only the image transitions.

FIG. 18 is a diagram for explaining a mode of transition between image information files and voice information files, showing an example in which the image does not change and only the voice transitions.

FIG. 19 is a screen diagram showing a specific example in which only the voice transitions.

[Explanation of symbols]

1 In-vehicle device
2 Server
3 Dedicated network
4 Internet
11 Sensors
12 Navigation device
13 Personal computer
14 Microphone
15 Speaker
16 Display device
17 Keyboard
18 Portable transceiver
21, 22, 23 Files

Continued from the front page

(51) Int.Cl.7 / Identification symbol / FI / Theme code (reference): G10L 13/00 G01C 21/00 H 15/00 G08G 1/0969 15/28 G10L 3/00 Q // G01C 21/00 R G08G 1/0969 551A 551Q 551P

F-terms (reference): 2F029 AA02 AB01 AB07 AC02 AC14 AC18 5B075 ND06 ND14 ND36 NK48 PP07 PQ04 5D015 AA04 BB01 KK01 KK04 LL06 LL08 5D045 AB17 AB21 5H180 AA01 FF04 FF05 FF22 FF25 FF27 FF32

Claims

[Claims]

1. A voice browser system comprising voice input/output means, voice recognition/synthesis means, and voice information expansion processing means, wherein a symbol indicating that a file is a voice information file is added to each voice information file, and when a file bearing the symbol indicating a voice information file is called, the voice information expansion processing means reads aloud the text of the voice reading tags in the file and jumps to a link destination according to voice input from the user.

2. A voice browser system comprising image input/output means, voice input/output means, voice recognition/synthesis means, voice information expansion processing means, and image information expansion processing means, wherein a symbol indicating that a file is a voice information file is added to each voice information file and a symbol indicating that a file is an image information file is added to each image information file; when a file bearing the symbol indicating a voice information file is called, the voice information expansion processing means reads aloud the text of the voice reading tags in the file and jumps to a link destination according to voice input from the user; and when a file bearing the symbol indicating an image information file is called, the image information expansion processing means displays the display tags in the file and jumps to a link destination according to screen input from the user.

3. The voice browser system according to claim 2, wherein a file for information associating voice information with image information is defined, and when a file bearing a symbol indicating such an association file is called, then for each voice information file listed in that file, the text of the voice reading tags in the voice information file is read aloud and a jump to a link destination is made according to voice input from the user, and for each image information file listed in that file, the display tags in the image information file are displayed and a jump to a link destination is made according to screen input from the user.

4. The voice browser system according to claim 1, wherein the use of the file is described by an extension of a URL.

5. The voice browser system according to claim 1, wherein the voice information file is described by a voice tag, a voice command, and a description of a link destination.

6. The voice browser system according to claim 2, wherein the image information file is described in an HTML language.