JPH10124291A

JPH10124291A - Speech recognition communication system for mobile terminal

Info

Publication number: JPH10124291A
Application number: JP8274452A
Authority: JP
Inventors: Toru Yamakita; 徹山北
Original assignee: Casio Computer Co Ltd
Current assignee: Casio Computer Co Ltd
Priority date: 1996-10-17
Filing date: 1996-10-17
Publication date: 1998-05-15

Abstract

PROBLEM TO BE SOLVED: To realize a speech recognition function as a user interface of a mobile terminal, at practical accuracy and cost, in a communication environment using the mobile terminal. SOLUTION: In the mobile terminal 101, a speech signal input from an input unit 109 during a call or in an offline state is transmitted by a control unit 110 and a communication unit 111 to a PHS network 103, from which it is transmitted to a voice control host device 108 via a mobile terminal control host device 104 and the Internet 105. In the voice control host device 108, the speech signal is received by a mobile terminal communication control unit 116 via a packet transmitting/receiving unit 115 and recognized by a sentence voice recognition unit 117. The resulting recognized voice text data is returned to the mobile terminal 101 either in real time or as a batch at a desired timing, received by the control unit 110 via the communication unit 111, and displayed on an output unit 112.

Description

DETAILED DESCRIPTION OF THE INVENTION

[0001]

TECHNICAL FIELD OF THE INVENTION

The present invention relates to a technique for recognizing voice, such as call voice, input at a mobile (portable) terminal device.

[0002]

PRIOR ART AND PROBLEMS TO BE SOLVED BY THE INVENTION

Voice recognition technology that recognizes voice signals, converts them into character data for storage, and makes the recognition results available to various services has long been in demand in many industrial fields.

[0003] In recent years, advances in voice recognition algorithms have led to the development of voice recognition systems that run on mainframe or workstation computers.

[0004] These systems are being adopted in various industrial fields, for example in bank balance inquiry systems and seat reservation systems that take telephone voice as input, and in parcel sorting systems that recognize workers' voices and sort packages for delivery automatically.

[0005] However, such voice recognition systems have only just reached practical recognition accuracy, and only in large-scale computer environments such as those described above. In small computer environments such as so-called personal computers, an inexpensive voice recognition system with practical recognition accuracy has not yet been realized.

[0006] Meanwhile, in parallel with the information processing technology described above, mobile terminals such as car phones, mobile phones, and PHS (Personal Handy-phone System) terminals have been spreading rapidly in recent years.

[0007] In particular, PHS terminals are small, have lower call charges than car phones or mobile phones, and allow high-quality calls "anytime, anywhere, with anyone," and they are spreading explosively. Furthermore, because PHS is a public network with ISDN (Integrated Services Digital Network) as its backbone, it supports high-speed digital communication at a transmission rate of 32 kilobits per second, and expectations for its application to multimedia communication are rising.

[0008] Furthermore, to make the most of the convenience of mobile terminals, they are expected to serve not only as mobile telephones but also as multimedia information management and communication terminals that can be used as portable information managers. Specifically, such a mobile terminal is expected to provide call and fax functions as a matter of course, web page access and e-mail functions for accessing the Internet and corporate networks, and information management functions such as address management, schedule management, and database search.

[0009] Such a mobile terminal needs a user interface that is as friendly and natural as possible so that people can use it casually. User interfaces in practical use today include finger input with a keyboard or mouse and handwriting input with an electronic pen, but support for voice input would make for an ideal user interface. For example, if the voice signal carrying the content of a call could be processed as data while the basic call function is in use, the convenience of the mobile terminal would increase dramatically. This is where the value of applying a voice recognition function to a mobile terminal as a user interface lies.

[0010] However, a mobile terminal is small and its information processing capability is limited, whereas, as noted above, current voice recognition processing can hardly achieve practical recognition accuracy outside a mainframe-class or workstation-class environment. At present, therefore, it is very difficult to realize a voice recognition function as the user interface of a mobile terminal.

[0011] An object of the present invention is to realize a voice recognition function as a user interface, at practical accuracy and cost, in a communication environment using a mobile terminal.

[0012]

MEANS FOR SOLVING THE PROBLEMS

The present invention first includes a mobile terminal having the following configuration. Host connection means (control unit 110, communication unit 111) connects to a voice control host device (voice control host device 108), which is a host device, either indirectly via a relay network (PHS network 103 and Internet 105) composed of a wireless network, a wired network, or both, or directly without passing through the relay network. This means includes, for example, a function of selectively transmitting to the voice control host device a real-time reply request command or a non-real-time reply request command, which causes recognized voice data to be returned in real time or in non-real time, and a function of transmitting a batch transfer request command, which causes the recognized voice data to be transferred as a batch.
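As an illustration only, the three commands named for the host connection means could be modeled as follows. The patent names the commands but does not define their encoding, so the `Command` values and both helper functions are hypothetical.

```python
from enum import Enum


class Command(Enum):
    """Hypothetical command codes; the patent names the commands only."""
    REALTIME_REPLY = "REALTIME_REPLY"
    NON_REALTIME_REPLY = "NON_REALTIME_REPLY"
    BATCH_TRANSFER = "BATCH_TRANSFER"


def select_reply_command(want_realtime: bool) -> Command:
    """Pick which reply-request command the terminal sends (selective transmission)."""
    if want_realtime:
        return Command.REALTIME_REPLY
    return Command.NON_REALTIME_REPLY


def commands_for_session(want_realtime: bool) -> list[Command]:
    """Commands a terminal would send over one session, in order.

    Assumption: a non-real-time session ends with a batch transfer request
    so that the host releases the recognition results it has been holding.
    """
    commands = [select_reply_command(want_realtime)]
    if not want_realtime:
        commands.append(Command.BATCH_TRANSFER)
    return commands
```

Under these assumptions, a real-time session needs only the reply request, while a non-real-time session pairs the reply request with a later batch transfer request.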

[0013] Voice input means (input unit 109) inputs voice. Voice data transmission means (control unit 110, communication unit 111) transmits the voice data input from the voice input means to the voice control host device after the connection operation by the host connection means.

[0014] Recognized voice data receiving means (control unit 110, communication unit 111) receives the recognized voice data returned from the voice control host device. Recognized voice data display/editing means (control unit 110, output unit 112) displays or edits the received recognized voice data.

[0015] Next, the present invention includes a voice control host device 108 having the following configuration. Mobile terminal connection means (packet transmitting/receiving unit 115, mobile terminal communication control unit 116) responds to the connection operation by the host connection means in a mobile terminal by identifying and connecting that mobile terminal.

[0016] Voice data receiving means (packet transmitting/receiving unit 115, mobile terminal communication control unit 116) receives voice data for each currently connected mobile terminal. Voice recognition means (mobile terminal communication control unit 116, sentence voice recognition unit 117) executes voice recognition processing, for each currently connected mobile terminal, on the voice data received by the voice data receiving means.

[0017] Recognized voice data reply means (mobile terminal communication control unit 116, packet transmitting/receiving unit 115) returns, for each currently connected mobile terminal, the recognized voice data obtained by the voice recognition processing of the voice recognition means to the corresponding mobile terminal. For example, for each currently connected mobile terminal, if the mobile terminal connection means has received a real-time reply request command from the mobile terminal, this means immediately returns the recognized voice data obtained by the corresponding voice recognition processing to that mobile terminal. If the mobile terminal connection means has received a non-real-time reply request command, this means holds the recognized voice data obtained by the corresponding voice recognition processing and, when the mobile terminal connection means receives a batch transfer request command from the mobile terminal, returns the held recognized voice data to that mobile terminal as a batch.
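The reply behavior described for the recognized voice data reply means can be sketched roughly as below. This is a minimal illustration, not the patented implementation: the class name, method names, and the `send` callback are all hypothetical, and the sketch assumes that a terminal with no recorded reply request is treated as non-real-time.

```python
from collections import defaultdict


class RecognizedDataReplier:
    """Immediate reply or hold-then-batch, tracked per connected terminal."""

    def __init__(self, send):
        self._send = send                  # callable(terminal_id, list_of_texts)
        self._realtime = {}                # terminal_id -> True if real-time reply requested
        self._held = defaultdict(list)     # terminal_id -> results held for batch transfer

    def on_reply_request(self, terminal_id, realtime):
        """Record which reply-request command the terminal sent."""
        self._realtime[terminal_id] = realtime

    def on_recognized(self, terminal_id, text):
        """Handle one recognition result for a terminal."""
        if self._realtime.get(terminal_id, False):
            self._send(terminal_id, [text])        # real-time: reply at once
        else:
            self._held[terminal_id].append(text)   # non-real-time: hold

    def on_batch_transfer_request(self, terminal_id):
        """Return everything held for the terminal in one reply."""
        held = self._held.pop(terminal_id, [])
        if held:
            self._send(terminal_id, held)
```

Keeping the held results in a per-terminal list mirrors the "for each currently connected mobile terminal" wording: one terminal's batch request never releases another terminal's data.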

[0018] With the mobile terminal voice recognition communication system according to the present invention, which includes the mobile terminal and the voice control host device described above, the mobile terminal can obtain a voice recognition function of practical accuracy at low cost without having to be equipped with an advanced voice recognition environment.

[0019] The configuration of the invention described above can include the following limitations. First, the mobile terminal has a Personal Handy-phone System communication function (communication unit 111).

[0020] Next, the relay network includes a Personal Handy-phone System communication network (PHS network 103) and the Internet (Internet 105). The voice control host device is connected to the Internet.

[0021] The host connection means in the mobile terminal calls and connects, via the Personal Handy-phone System communication network, to a mobile terminal control host device (mobile terminal control host device 104) that has a gateway function between the Internet and the public network containing the Personal Handy-phone System communication network. It then uses a communication protocol of the Internet to connect from the mobile terminal control host device to the voice control host device via the Internet.

[0022] With this limited configuration, by going through the Personal Handy-phone System communication network and the Internet, which are now spreading nationwide and worldwide, a voice recognition function of practical accuracy can be obtained more cheaply and easily. At the same time, the functions provided by the present invention can be seamlessly combined with the Personal Handy-phone System call function and Internet access functions.

[0023] Furthermore, the communication protocol described above can include the following limitations. The communication protocol is a layered protocol that includes an Internet Protocol (IP) layer and a Transmission Control Protocol (TCP) layer.

[0024] Next, the header (IP header) field of an Internet Protocol datagram (IP datagram), which is the packet data of the Internet Protocol layer transmitted over the Internet, stores a source Internet Protocol address and a destination Internet Protocol address that specify the addresses of the mobile terminal and the voice control host device on the Internet. The data field of the IP datagram stores a Transmission Control Protocol segment, which is the packet data of the Transmission Control Protocol layer.

[0025] The header (TCP header) field of the Transmission Control Protocol segment (TCP segment) stores a source port number and a destination port number that identify the communication protocol used for voice recognition processing. The data field of the TCP segment stores a terminal identification code for identifying the mobile terminal, a real-time reply request command, a non-real-time reply request command, a batch transfer request command, voice data, or recognized voice data.

[0026] With this limited configuration, the mobile terminal and the voice control host device can easily be identified worldwide, and the voice recognition service can easily coexist with other information processing services.

[0027] In the configurations of the invention described so far, the voice control host device can be implemented as a plurality of host computers that are interconnected by a network and that implement, in a distributed manner, the functions corresponding to the mobile terminal connection means, the voice data receiving means, the voice recognition means, and the recognized voice data reply means.

[0028] With this limited configuration, load distribution on the host device side can be realized easily. The mobile terminal and the voice control host device described above are each, on their own, also within the scope of the present invention.

[0029]

EMBODIMENTS OF THE INVENTION

Embodiments of the present invention are described in detail below with reference to the drawings. The major feature of this embodiment in relation to the present invention is as follows: in a mobile terminal with a built-in PHS function, a voice signal input from the microphone during a call or in an offline state is sent from the PHS network via the Internet to a voice control host device connected to the LAN of a particular voice service provider; the voice signal is recognized there; and the result is returned to the mobile terminal in real time or as a batch at a desired timing, and used for information management processing on the mobile terminal. With such a system, the mobile terminal can obtain a voice recognition function of practical accuracy at low cost without having to be equipped with an advanced voice recognition environment.

<System Configuration>

FIG. 1 is an overall system configuration diagram of the embodiment of the present invention.

[0030] The mobile terminal 101 has a PHS terminal function and is connected to the PHS network 103 by wireless communication via the wireless base station 102. The wireless base station 102 is a public wireless base station installed at a street telephone booth, on a utility pole, on a building rooftop, in an underground passage, or the like, or a parent-child (cordless) telephone device in a subscriber's home. When connected to a parent-child telephone device, the terminal is connected directly to the public telephone network without going through the PHS network. Instead of using the wireless base station 102, the terminal may be configured to connect to the PHS network 103 or the public telephone network by wired communication via a wired connection device.

[0031] The PHS network 103 is interconnected with the public telephone network or the ISDN network. Connected to these networks is the mobile terminal control host device 104, which is itself connected to the Internet 105 by a high-speed digital leased line or the like.

[0032] The mobile terminal 101 can connect to the Internet 105 by automatically placing a dial-up call, via the wireless base station 102 and the PHS network 103, to the mobile terminal control host device 104 connected to the public telephone network or ISDN network.

[0033] A router device 106, which is connected to the LAN 107 of a given voice service provider, is connected to the Internet 105 via a high-speed digital leased line or the like. The LAN 107 is a local area network based on Ethernet, ATM (Asynchronous Transfer Mode), or FDDI. The voice control host device 108 is also connected to the LAN 107.

[0034] After automatically placing a dial-up call to the mobile terminal control host device 104, the mobile terminal 101 can communicate with the voice control host device 108 via the Internet 105, the router device 106, and the LAN 107.

[0035] Suppose now that the user instructs communication with the voice control host device 108 from the touch panel of the input unit 109 in the mobile terminal 101. The control unit 110 then requests the communication unit 111 to start communication with the voice control host device 108.

[0036] When the control unit 110 requests the start of communication, the communication unit 111, if it is not currently connected to the mobile terminal control host device 104, places a wireless (or wired) call to the wireless base station (or wired connection device) 102, connects to the PHS network 103, and then places a dial-up call specifying the access telephone number of the mobile terminal control host device 104.

[0037] When the mobile terminal control host device 104 answers the call, the communication unit 111 in the mobile terminal 101 first communicates with the connection establishment unit 113 in the mobile terminal control host device 104 to negotiate the establishment of a connection using TCP/IP, the standard communication protocol of the Internet 105, and PPP. As a result, the mobile terminal control host device 104 assigns the communication unit 111 in the mobile terminal 101 an IP address, which is its identifying address on the Internet 105, and the mobile terminal 101 becomes able to access the Internet 105.
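The address assignment that results from the negotiation can be pictured with a simple address pool on the mobile terminal control host device. This is only a sketch under stated assumptions: the patent does not say how the address is chosen, so the `AddressPool` class, its methods, and the use of a caller identifier as the lease key are hypothetical, and the PPP negotiation itself is not modeled.

```python
import ipaddress


class AddressPool:
    """Hands out one IP address per dial-up connection (illustrative only)."""

    def __init__(self, network: str):
        # Usable host addresses of the given network, in ascending order.
        self._free = list(ipaddress.ip_network(network).hosts())
        self._leased = {}                  # caller_id -> assigned address

    def assign(self, caller_id: str):
        """Return the caller's address, leasing a new one if needed."""
        if caller_id in self._leased:
            return self._leased[caller_id]
        if not self._free:
            raise RuntimeError("no free addresses")
        address = self._free.pop(0)
        self._leased[caller_id] = address
        return address

    def release(self, caller_id: str):
        """Return the caller's address to the pool when the call ends."""
        address = self._leased.pop(caller_id, None)
        if address is not None:
            self._free.append(address)
```

The address returned by `assign` corresponds to the "source IP address" that the terminal later places in its packets.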

[0038] If the communication unit 111 in the mobile terminal 101 is already connected to the mobile terminal control host device 104, the dial-up call described above is omitted. The communication unit 111 then sends to the Internet 105 a TCP/IP packet that stores the following: a "destination IP address," which is the preset IP address of the voice control host device 108; a "source IP address," which is the IP address assigned by the mobile terminal control host device 104; a "terminal identification code" (for example, the PHS telephone number) for identifying the mobile terminal 101; and, as specified by the user, either a real-time start request command or a non-real-time start request command for sentence voice recognition processing.
[0039] Based on the "destination IP address" stored in it, this TCP/IP packet is forwarded by the routing unit 114 in the mobile terminal control host device 104 and by relay host devices (not shown) in the Internet 105 to the router device 106 at the voice service provider, and then via the LAN 107 to the packet transmitting/receiving unit 115 in the voice control host device 108.

[0040] The packet transmitting/receiving unit 115 extracts from the received TCP/IP packet the "source IP address," the "terminal identification code," and the real-time or non-real-time start request command for sentence voice recognition processing, and passes them to the mobile terminal communication control unit 116 in the voice control host device 108.

[0041] The mobile terminal communication control unit 116 registers the "source IP address," the "terminal identification code," and information on the real-time or non-real-time start request command for sentence voice recognition processing in a processing terminal registration table (FIG. 13), described later, and then requests the packet transmitting/receiving unit 115 to return to the mobile terminal 101 a TCP/IP packet storing transmission permission data.
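A rough picture of the registration step is given below. The actual layout of the processing terminal registration table is shown in FIG. 13, which is not reproduced here, so the three fields are inferred only from the items named in the text; the class names, the choice of the source IP address as the key, and the `"SEND_PERMITTED"` marker are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class TerminalEntry:
    source_ip: str      # IP address assigned to the terminal at dial-up
    terminal_id: str    # terminal identification code, e.g. a PHS number
    realtime: bool      # True if a real-time start request was received


class ProcessingTerminalTable:
    """Per-terminal registrations kept by the communication control unit."""

    def __init__(self):
        self._by_ip = {}

    def register(self, source_ip, terminal_id, realtime):
        """Record the terminal and return the reply to send back to it."""
        self._by_ip[source_ip] = TerminalEntry(source_ip, terminal_id, realtime)
        # Assumed reply contents: transmission permission addressed to the terminal.
        return {"destination_ip": source_ip, "data": "SEND_PERMITTED"}

    def lookup(self, source_ip):
        """Find the entry for an incoming packet, or None if unregistered."""
        return self._by_ip.get(source_ip)
```

Looking up the entry by source address lets the host decide, for each later voice data packet, whether the recognition result should be returned at once or held.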

[0042] The packet transmitting/receiving unit 115 sends the corresponding TCP/IP packet to the IP address of the mobile terminal 101. The voice control host device 108 is now able to execute sentence voice recognition processing on the voice data transferred from the mobile terminal 101.

[0043] When the communication unit 111 in the mobile terminal 101 receives the TCP/IP packet storing the transmission permission data from the voice control host device 108, it passes the transmission permission data stored in the packet to the control unit 110.

[0044] After receiving the transmission permission data, the control unit 110 in the mobile terminal 101 requests the communication unit 111 to send to the voice control host device 108 the voice data input from the microphone through a call operation or through a voice input operation in the offline state.

[0045] The communication unit 111 sends a TCP/IP packet storing the voice data to the IP address of the voice control host device 108. Based on the "destination IP address" stored in it, this TCP/IP packet is forwarded to the packet transmitting/receiving unit 115 in the voice control host device 108 via the routing unit 114 in the mobile terminal control host device 104, relay host devices (not shown) in the Internet 105, the router device 106 at the voice service provider, and the LAN 107.

[0046] The packet transmitting/receiving unit 115 extracts the voice data stored in the received TCP/IP packet and passes it to the mobile terminal communication control unit 116 in the voice control host device 108.

[0047] The mobile terminal communication control unit 116 passes the voice data to the sentence voice recognition unit 117. The sentence voice recognition unit 117 executes sentence voice recognition processing on the voice data and passes the recognition result, recognized voice text data, back to the mobile terminal communication control unit 116.

[0048] If the mobile terminal 101 has specified the real-time start request command for sentence voice recognition processing, the mobile terminal communication control unit 116 immediately requests the packet transmitting/receiving unit 115 to return to the mobile terminal 101 a TCP/IP packet storing the recognized voice text data.

[0049] The packet transmitting/receiving unit 115 sends the corresponding TCP/IP packet to the IP address of the mobile terminal 101. When the communication unit 111 in the mobile terminal 101 receives the TCP/IP packet storing the recognized voice text data from the voice control host device 108, it passes the recognized voice text data stored in the packet to the control unit 110.

[0050] The control unit 110 in the mobile terminal 101 outputs the recognized voice text data to the output unit 112. The output unit 112 displays the text corresponding to the recognized voice text data on the LCD display unit. The user can store or edit this text data as desired.

[0051] If the mobile terminal 101 has specified the non-real-time start request command for sentence voice recognition processing, the mobile terminal communication control unit 116 does not return the recognized voice text data to the mobile terminal 101 immediately but accumulates it in sequence. Later, when it receives a batch transfer request command for the sentence voice recognition results from the mobile terminal 101, the mobile terminal communication control unit 116 requests the packet transmitting/receiving unit 115 to return to the mobile terminal 101, as a batch, TCP/IP packets storing the accumulated recognized voice text data.

[0052] The packet transmission/reception unit 115 transmits the corresponding TCP/IP packets to the IP address corresponding to the mobile terminal 101. The mobile terminal 101 then displays the batch-transferred recognized speech text data on the LCD display unit, and the user can freely store or edit this text data.
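The difference between the real-time reply and the accumulate-then-batch reply can be illustrated with a minimal Python sketch. The class name `RecognitionResultBuffer`, its method names, and the `send` callback are hypothetical stand-ins for the mobile terminal communication control unit 116 and the packet transmission/reception unit 115; the document does not define any such programming interface.

```python
from typing import Callable, List


class RecognitionResultBuffer:
    """Hypothetical model of how unit 116 handles recognized text."""

    def __init__(self, send: Callable[[List[str]], None], real_time: bool):
        self._send = send            # stands in for packet unit 115
        self._real_time = real_time  # selected by the start request command
        self._pending: List[str] = []

    def on_recognized(self, text: str) -> None:
        if self._real_time:
            self._send([text])           # reply immediately
        else:
            self._pending.append(text)   # accumulate in arrival order

    def on_batch_transfer_request(self) -> None:
        if self._pending:
            self._send(self._pending)    # one batched reply
            self._pending = []


# Usage: non-real-time mode sends nothing until the batch request arrives.
sent: List[List[str]] = []
buf = RecognitionResultBuffer(sent.append, real_time=False)
buf.on_recognized("hello")
buf.on_recognized("world")
buf.on_batch_transfer_request()
```

In this sketch the batch is delivered as a single list; how the actual device splits accumulated text across TCP/IP packets is not stated in the text.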

[0053] In addition to communicating with the voice control host device 108, the mobile terminal 101 can freely access desired resources on the Internet 105 by dialing up the mobile terminal control host device 104 with the web page browsing tool or e-mail function it is equipped with.

<External Configuration of Mobile Terminal 101>

FIG. 2 is an external view of the mobile terminal 101 of FIG. 1.

[0054] The mobile terminal 101 has the appearance of a compact portable information management device. It has a microphone 201 that doubles as the telephone transmitter for voice input; a camera 202 for image input (not particularly relevant to the present invention); an LCD display unit 203 that displays various information and has a touch panel function accepting touch or pen input; and a speaker 204 that doubles as the telephone receiver for voice output. It also has a wireless antenna 205 for calling the wireless base station 102 of FIG. 1 and a socket 206 for connecting to a wired connection device in place of the wireless base station 102.

[0055] The mobile terminal 101 further has an IC card slot 207 for inserting various IC cards, and an optical transceiver 208 for infrared optical communication with another mobile terminal 101, a personal computer, or the like.

[0056] The switch 209 is a power switch.

<Functional Block Configuration of Mobile Terminal 101>

FIG. 3 is a functional block diagram of the mobile terminal 101.

[0057] As also shown in FIG. 1, the mobile terminal 101 comprises the input unit 109, the control unit 110, the communication unit 111, and the output unit 112, which are interconnected by a bus 326.

[0058] First, the input unit 109 comprises a voice input part, an image input part (not particularly relevant to the present invention), and a touch panel mechanism part, which is described later together with the operation of the output unit 112.

[0059] The voice input part comprises a microphone 301, an A/D conversion unit 302, and a microphone control unit 303. The microphone 301 (corresponding to 201 in FIG. 2) doubles as the transmitter of the PHS telephone and picks up the voice uttered by the user.

[0060] The A/D conversion unit 302 converts the analog voice signal input from the microphone 301 into digital voice data, and then encodes that digital voice data using ADPCM (Adaptive Differential Pulse Code Modulation), the standard PHS voice coding scheme. This part is already in practical use as an LSI integrated circuit for PHS terminals.

[0061] During a call, the microphone control unit 303 transfers the encoded voice data to the communication control unit 321 in the communication unit 111 so that it is placed on the call channel. During sentence speech recognition processing, it additionally transfers the data to the RAM 317 in the control unit 110.

[0062] The image input part comprises a CCD (Charge Coupled Device) camera 304, an A/D conversion unit 305, a memory 306, and a camera control unit 307.

[0063] The CCD camera 304 captures an arbitrary image in response to a user operation. The A/D conversion unit 305 converts the analog video signal captured by the CCD camera 304 into digital image data.

[0064] The memory 306 stores the digital image data frame by frame. The camera control unit 307 controls the operation of the CCD camera 304, the A/D conversion unit 305, and the memory 306.

[0065] Next, the output unit 112 comprises a voice output part and an image output part. The voice output part comprises a speaker 308, a D/A conversion unit 309, and a speaker control unit 310.

[0066] The speaker control unit 310 transfers to the D/A conversion unit 309 either PHS call voice data received from the communication control unit 321 in the communication unit 111 or synthesized voice data received from the RAM 317 in the control unit 110.

[0067] The D/A conversion unit 309 decodes the received voice data, converts it into an analog voice signal, and has the speaker 308 (corresponding to 204 in FIG. 2) emit it as sound.

[0068] The image output part comprises the LCD display unit 203, an LCD driver 312, a memory 313, and an LCD control unit 314. The LCD control unit 314 stores in the memory 313, frame by frame, the various image data received from the RAM 317 in the control unit 110, such as character data, image data, and command button data, and then activates the LCD driver 312.

[0069] The LCD driver 312 displays the image data read frame by frame from the memory 313 on the LCD display unit 311 (corresponding to 203 in FIG. 2). A transparent touch panel is provided on the surface of the LCD display unit 311 (203 in FIG. 2), and the user can enter commands by touching the touch panel with a finger or pen in accordance with the command button data and other items displayed on the LCD display unit 311. The resulting input signal is transferred by a touch panel control unit 315 to the RAM 317 in the control unit 110.

[0070] Next, the control unit 110 comprises a CPU 316, the RAM 317, a ROM 318, an IC card interface unit 319, and an IC card 320 that is inserted into the IC card slot 207 (FIG. 2) as needed.

[0071] The CPU 316 controls the overall operation of the mobile terminal 101 in accordance with the control program stored in the ROM 318, using the RAM 317 as a work area.

[0072] The IC card interface unit 319 controls data input to and output from the IC card 320. Finally, the communication unit 111 comprises the communication control unit 321, a wireless driver 322, a wireless antenna 323, a wired driver 324, and a socket 325.

[0073] The communication control unit 321 executes PHS call processing and TCP/IP communication processing with the Internet 105 (described later), and controls the wireless driver 322 or the wired driver 324.

[0074] During wireless communication, the wireless driver 322 converts between communication data and the PHS radio signal transmitted and received via the wireless antenna 323 (corresponding to 205 in FIG. 2). The PHS radio signal is based on a radio frequency of 1.9 GHz, a carrier frequency spacing of 300 kHz, a TDMA-TDD radio access scheme with 4 channels per carrier, π/4-shift QPSK modulation, and a radio transmission rate of 384 kbit/s.
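As a rough consistency check on these figures, the per-channel rate can be derived from the 384 kbit/s carrier rate and the 4-channel TDMA-TDD structure. In the Python sketch below, the 5 ms frame length and the 160 information bits per slot are assumptions drawn from the general PHS air-interface design, not from this document.

```python
CARRIER_RATE_BPS = 384_000   # from the text
CHANNELS_PER_CARRIER = 4     # from the text (4 channels per carrier)
TDD_DIRECTIONS = 2           # TDD: uplink and downlink share one carrier in time
FRAME_MS = 5                 # assumption: 5 ms TDMA-TDD frame
INFO_BITS_PER_SLOT = 160     # assumption: payload bits carried in each slot

slots_per_frame = CHANNELS_PER_CARRIER * TDD_DIRECTIONS
bits_per_frame = CARRIER_RATE_BPS * FRAME_MS // 1000
bits_per_slot = bits_per_frame // slots_per_frame
gross_rate_per_slot_bps = CARRIER_RATE_BPS // slots_per_frame
payload_rate_bps = INFO_BITS_PER_SLOT * 1000 // FRAME_MS

print(slots_per_frame, bits_per_slot, gross_rate_per_slot_bps, payload_rate_bps)
```

Under these assumptions each one-way slot carries 240 bits per frame (48 kbit/s gross), of which 160 bits are payload, giving 32 kbit/s. That agrees with the 32 kbit/s digital communication channel rate the document cites for PHS.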

[0075] During wired communication, the wired driver 324 converts between communication data and the wired signal transmitted and received via the socket 325 (corresponding to 206 in FIG. 2). The wired signal is an ordinary telephone-band modem-modulated signal. The operation of the embodiment of the present invention configured as above is described in detail below.

<Processing of Mobile Terminal 101>

First, the processing of the mobile terminal 101 is described.

[0076] FIG. 4 is an overall operation flowchart showing the control operation realized when, after power-on, the CPU 316 in the control unit 110 of FIG. 3 executes the control program stored in the ROM 318 in the control unit 110.

[0077] The control program that realizes the functions shown in the operation flowcharts of FIGS. 4, 5, and 8, together with the data it requires, may be stored as program code readable by the CPU 316 on, for example, the IC card 320 that is removably inserted into the IC card slot 207 shown in FIG. 2. That program code may be executed directly by the CPU 316, or it may be loaded as needed into the RAM 317 or a writable ROM 318 and then executed by the CPU 316. Alternatively, the control program and its data may be received from another device over a wireless or wired communication line, or through the optical transceiver 208 (FIG. 2), via the communication unit 111, loaded into the RAM 317 or the writable ROM 318, and executed by the CPU 316.

[0078] First, the repeating loop of steps 401 → 402 → 403 → 404 → 401 executes the following: a determination of whether the touch panel control unit 315 of FIG. 3 has reported detection of a touch panel input (401); a determination of whether recognized speech text data has been received from the voice control host device 108 (FIG. 1) (402); other reception/display processing (403); and transmission processing of the necessary data (404).

[0079] When the touch panel control unit 315 reports detection of a touch panel input and the determination in step 401 is YES, steps 405 and 406 determine whether the touch panel input is an input instruction for the CCD camera 304 of FIG. 3 (202 in FIG. 2) or an input instruction for the microphone 301 of FIG. 3 (201 in FIG. 2).

[0080] If the touch panel input is an input instruction for the CCD camera 304 of FIG. 3 (202 in FIG. 2) and the determination in step 405 is YES, step 407 instructs the camera control unit 307 in the input unit 109 of FIG. 3 to start input processing of, for example, a handwritten character image. The process then proceeds to the transmission processing of step 404. The image input processing is not particularly relevant to the present invention, so a detailed description is omitted.

[0081] If the touch panel input is an input instruction for the microphone 301 of FIG. 3 (201 in FIG. 2) and the determination in step 406 is YES, step 408 instructs the microphone control unit 303 in the input unit 109 of FIG. 3 to start voice input processing. This start instruction is, for example, an instruction to start PHS call processing, or an instruction to start voice input processing in the offline state in order to execute sentence speech recognition processing.

[0082] In response to this instruction from the CPU 316, the microphone control unit 303 instructs the microphone 301 (201 in FIG. 2) and the A/D conversion unit 302 to start voice input. As a result, the A/D conversion unit 302 outputs the voice data input from the microphone 301 (201 in FIG. 2).

[0083] Thereafter, if the start instruction for voice input processing is an instruction to start a PHS call, the voice data is placed on a predetermined call channel and sent to the other party by transmission processing (not specifically shown) in the communication control unit 321.

[0084] If the start instruction for voice input processing includes an instruction to start voice input for sentence speech recognition processing, the voice data subsequently input from the microphone 301 (201 in FIG. 2) and output from the microphone control unit 303 is transmitted to the voice control host device 108 in the transmission processing of step 404, described later.

[0085] If the touch panel input is neither an input instruction for the CCD camera 304 of FIG. 3 (202 in FIG. 2) nor an input instruction for the microphone 301 of FIG. 3 (201 in FIG. 2), the determinations in steps 405 and 406 are both NO, and other key input processing is executed in step 409. The process then proceeds to the transmission processing of step 404.

[0086] Meanwhile, when recognized speech text data from the voice control host device 108 (FIG. 1) is received into the RAM 317 in the control unit 110 via the communication unit 111, and the determination in step 402 of the repeating loop 401 → 402 → 403 → 404 → 401 is YES, step 410 transfers the recognized speech text data from the RAM 317 to the memory 313 in the output unit 112 and instructs the LCD control unit 314 to display it.

[0087] As a result, under the control of the LCD control unit 314, the received recognized speech text data is displayed on the LCD display unit 311 (203 in FIG. 2) from the memory 313 via the LCD driver 312.
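One pass of the FIG. 4 loop (steps 401 to 410) can be sketched in Python as follows. The function name `loop_iteration`, the string values `"camera"` and `"microphone"`, and the returned step labels are hypothetical. The text does not say explicitly where control goes after steps 408 and 410, so the continuation to steps 403 and 404 is an assumption marked in the comments.

```python
from typing import List, Optional


def loop_iteration(touch: Optional[str], received_text: Optional[str]) -> List[str]:
    """Return the steps taken in one pass of the FIG. 4 loop (hypothetical model)."""
    steps: List[str] = []
    if touch is not None:                       # step 401: YES
        if touch == "camera":                   # step 405: YES
            steps.append("407 start image input")
        elif touch == "microphone":             # step 406: YES
            steps.append("408 start voice input")
        else:                                   # steps 405 and 406: NO
            steps.append("409 other key input")
        # Assumption: step 408, like steps 407 and 409, continues to step 404.
        steps.append("404 transmission processing")
        return steps
    if received_text is not None:               # step 402: YES
        steps.append(f"410 display: {received_text}")
    # Assumption: after step 410 the loop continues with steps 403 and 404.
    steps.append("403 other reception/display")
    steps.append("404 transmission processing")
    return steps


print(loop_iteration("microphone", None))
print(loop_iteration(None, "recognized text"))
```

Returning the list of visited steps keeps the sketch free of hardware dependencies; in the actual terminal these steps drive the microphone control unit 303, the camera control unit 307, and the LCD control unit 314.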

[0088] Next, the transmission processing of step 404 is described. FIG. 5 is an operation flowchart showing the details of the transmission processing. First, step 501 determines whether the key input from the touch panel handled by the other key input processing of step 409 in FIG. 4 is accompanied by a transmission instruction.

[0089] If this determination is NO, the process proceeds to step 505. If the determination in step 501 is YES, step 502 determines whether the mobile terminal 101 is currently connected to the mobile terminal control host device 104 of FIG. 1.

[0090] If the mobile terminal 101 is currently connected to the mobile terminal control host device 104 of FIG. 1 and the determination in step 502 is YES, then in step 504 the CPU 316 in the control unit 110 of FIG. 3 requests the communication control unit 321 in the communication unit 111 of FIG. 3 to transmit the “terminal identification code” of the mobile terminal 101 and the command corresponding to the key input processing. The communication control unit 321 then generates a TCP/IP packet storing the “terminal identification code” and the command, and transmits it to a predetermined host connected to the Internet 105 (for example, the voice control host device 108 of FIG. 1).

[0091] If the mobile terminal 101 is not currently connected to the mobile terminal control host device 104 of FIG. 1 and the determination in step 502 is NO, then in step 503 the CPU 316 in the control unit 110 of FIG. 3 requests the communication control unit 321 in the communication unit 111 of FIG. 3 to perform call origination processing, and then executes step 504.

[0092] As described in detail later, the following transmission instructions are issued in step 504: the instruction to transmit the real-time start request command or the non-real-time start request command for sentence speech recognition processing, as selected by the user; the instruction to transmit the end request command for sentence speech recognition processing; and the instruction to transmit the batch transfer request command for sentence speech recognition processing.

[0093] If the determination in step 501 is NO as described above, or after the processing of step 504, step 505 determines whether step 408 of FIG. 4 has issued a start instruction for voice input for sentence speech recognition processing, so that transmission of voice data to the voice control host device 108 (FIG. 1) has been instructed.

[0094] If this determination is NO, the process proceeds to step 510. If the determination in step 505 is YES, step 506 determines whether the voice control host device 108 has already returned transmission permission data, which is its response to the real-time start request command or the non-real-time start request command for sentence speech recognition processing.

[0095] If this determination is NO, the voice control host device 108 has not yet completed its preparation for the real-time start request command or the non-real-time start request command for sentence speech recognition processing from the mobile terminal 101, so the process proceeds to step 510.

[0096] If the voice control host device 108 has already returned the transmission permission data in response to the real-time start request command or the non-real-time start request command for sentence speech recognition processing, and the determination in step 506 is YES, step 507 further determines whether the mobile terminal 101 is currently connected to the mobile terminal control host device 104 of FIG. 1.

[0097] If the mobile terminal 101 is currently connected to the mobile terminal control host device 104 of FIG. 1 and the determination in step 507 is YES, then in step 509 the CPU 316 in the control unit 110 of FIG. 3 requests the communication control unit 321 in the communication unit 111 to transmit the voice data that has been transferred from the microphone control unit 303 in the input unit 109 of FIG. 3 to the RAM 317 in the control unit 110. The communication control unit 321 then generates a TCP/IP packet storing the voice data and transmits it to the voice control host device 108 of FIG. 1, which is connected to the Internet 105.

[0098] If the mobile terminal 101 is not currently connected to the mobile terminal control host device 104 of FIG. 1 and the determination in step 507 is NO, then in step 508 the CPU 316 in the control unit 110 of FIG. 3 requests the communication control unit 321 in the communication unit 111 of FIG. 3 to perform call origination processing, and then executes step 509.

[0099] As described in detail later, the instruction to transmit voice data for sentence speech recognition processing is issued in step 509. If the determination in step 505 or 506 is NO as described above, or after the processing of step 509, step 510 determines whether step 407 of FIG. 4 has issued a start instruction for image input processing, so that transmission of image data to an image control host device (not specifically shown) connected to the Internet 105 of FIG. 1 has been instructed.

[0100] If this determination is NO, the transmission processing of step 404 in FIG. 4 ends. If the determination in step 510 is YES, step 511 determines whether the mobile terminal 101 is currently connected to the mobile terminal control host device 104 of FIG. 1.

[0101] If the mobile terminal 101 is currently connected to the mobile terminal control host device 104 of FIG. 1 and the determination in step 511 is YES, then in step 513 the CPU 316 in the control unit 110 of FIG. 3 requests the communication control unit 321 in the communication unit 111 to transmit the image data held in the memory 306 in the input unit 109 of FIG. 3. The communication control unit 321 then generates a TCP/IP packet storing the image data and transmits it to the image control host device 108 (not specifically shown) connected to the Internet 105.

[0102] If the mobile terminal 101 is not currently connected to the mobile terminal control host device 104 of FIG. 1 and the determination in step 511 is NO, then in step 512 the CPU 316 in the control unit 110 of FIG. 3 requests the communication control unit 321 in the communication unit 111 of FIG. 3 to perform call origination processing, and then executes step 513.
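The branch structure of the FIG. 5 transmission processing (steps 501 to 513) can be summarized with a short Python sketch. The `TxState` fields, the `transmit` function, and the action labels are hypothetical names introduced for illustration, and the sketch assumes that call origination always succeeds, which the text does not address.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class TxState:
    key_send: bool = False      # step 501: key input carries a send instruction
    voice_send: bool = False    # step 505: voice data transmission instructed
    permitted: bool = False     # step 506: transmission permission data received
    image_send: bool = False    # step 510: image data transmission instructed
    connected: bool = False     # steps 502, 507, 511: connected to host 104


def transmit(state: TxState) -> List[str]:
    """Return the actions taken in one run of the step 404 transmission processing."""
    actions: List[str] = []

    def ensure_connected(dial_step: int) -> None:
        if not state.connected:
            actions.append(f"{dial_step} originate call")
            state.connected = True   # assumption: the call always succeeds

    if state.key_send:                            # step 501
        ensure_connected(503)                     # steps 502 and 503
        actions.append("504 send terminal ID code and command")
    if state.voice_send and state.permitted:      # steps 505 and 506
        ensure_connected(508)                     # steps 507 and 508
        actions.append("509 send voice data")
    if state.image_send:                          # step 510
        ensure_connected(512)                     # steps 511 and 512
        actions.append("513 send image data")
    return actions


print(transmit(TxState(key_send=True, voice_send=True, permitted=False)))
```

The three blocks run in sequence within one pass, which mirrors the flowchart: voice data is held back until the host has returned its transmission permission, while the command that requests that permission is sent through step 504.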

[0103] The instruction to transmit image data in step 513 is not particularly relevant to the present invention, so a detailed description is omitted. If the determination in step 510 is NO as described above, or after the processing of step 513, the transmission processing of step 404 in FIG. 4 ends.

<Format of Communication Data>

FIG. 6 is a format diagram of the communication data exchanged between the mobile terminal 101 and the mobile terminal control host device 104 and the Internet 105 (voice control host device 108).

[0104] Between the mobile terminal 101 and the mobile terminal control host device 104, communication data is carried over the PHS-standard digital communication channel, which has a transmission rate of 32 kbit/s, in accordance with the communication protocol called PPP (Point-to-Point Protocol), using the PPP frame shown in FIG. 6(a) (transferred from left to right in the figure).

[0105] The “flag”, “address”, and “control” fields of the PPP frame are each set to the fixed bit string shown in FIG. 6(a). The 2-octet FCS, called the frame check sequence, is data for error detection/correction of the PPP frame data. After a PPP link has been established between the mobile terminal 101 and the mobile terminal control host device 104, the “information” field (of variable length) of each transferred PPP frame stores an IP datagram, the basic unit of data transmission on the Internet 105 (FIG. 1). In that case, the 2-octet “protocol” field stores the hexadecimal value “0021”, which indicates that the “information” field contains an IP datagram.
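The PPP frame layout described here can be sketched in Python. The protocol value 0x0021 comes from the text. The flag (0x7E), address (0xFF), and control (0x03) values and the FCS-16 algorithm are taken from the standard PPP HDLC-like framing and are assumed to match the fixed bit strings of FIG. 6(a), which is not reproduced in this text. The function names are hypothetical, and octet stuffing of the frame body is deliberately omitted.

```python
import struct

PPP_FLAG = 0x7E        # frame delimiter (assumed to match FIG. 6(a))
PPP_ADDRESS = 0xFF     # all-stations address (assumed)
PPP_CONTROL = 0x03     # unnumbered information (assumed)
PPP_PROTO_IP = 0x0021  # from the text: "information" holds an IP datagram


def fcs16(data: bytes) -> int:
    """FCS-16 of HDLC-like framing (bit-reflected polynomial 0x8408)."""
    fcs = 0xFFFF
    for byte in data:
        fcs ^= byte
        for _ in range(8):
            fcs = (fcs >> 1) ^ 0x8408 if fcs & 1 else fcs >> 1
    return fcs ^ 0xFFFF


def build_ppp_frame(ip_datagram: bytes) -> bytes:
    """Wrap an IP datagram in flag, address, control, protocol, and FCS fields."""
    body = struct.pack("!BBH", PPP_ADDRESS, PPP_CONTROL, PPP_PROTO_IP) + ip_datagram
    fcs = struct.pack("<H", fcs16(body))   # FCS is sent least significant octet first
    return bytes([PPP_FLAG]) + body + fcs + bytes([PPP_FLAG])


frame = build_ppp_frame(b"\x45\x00")
print(frame.hex())
```

A real link would also escape any 0x7E or 0x7D octets inside the body before transmission; that step is left out to keep the field layout visible.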

[0106] As noted above, the “information” field of the PPP frame stores an IP datagram, the basic unit of data transmission on the Internet 105. The IP datagram is defined by the Internet Protocol (IP) and provides the functions needed to deliver the data stored in its “data” field unambiguously to the destination host device on the Internet 105. These include a function for specifying addresses on the Internet 105, a function for forwarding the IP datagram along a route on the Internet 105 to the host specified by the “destination IP address”, and a function for fragmenting (dividing) and reassembling the IP datagram.

[0107] As shown in FIG. 6(b), the IP datagram consists of an IP header field and a data field. The IP header field contains all the information needed to deliver the IP datagram that contains it. FIG. 7(a) is a format diagram of the IP header.

[0108] With 32 bits counted as one word, the IP header has a length of five to six words. This length is stored in the “header length” field of the first word, and the length of the entire IP datagram is stored in the “total length of IP datagram” field of the first word.

[0109] The “version” field of the first word is set to the version of the Internet Protocol (IP) that defines how the IP datagram is transferred; the current version is 4.

[0110] The “type of service” field of the first word stores information such as delivery priority, but it is not particularly relevant to the present invention. The fields of the second word define the control information used when the IP datagram is fragmented (divided) because of transfer constraints on the Internet 105. First, the “identification number” field is set to a unique integer that identifies the original, undivided IP datagram to which this fragment belongs. Next, the “fragment offset” field is set to offset information indicating which part of the original IP datagram this fragment corresponds to. Finally, the “flags” field indicates whether this fragment is followed by further fragments of the same original IP datagram. With this information, the receiving side can accurately restore the original IP datagram even if it was fragmented at a relay host on the Internet 105.
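How the three second-word fields allow the receiver to restore a fragmented datagram can be shown with a simplified Python sketch. The `Fragment` type and the `reassemble` function are hypothetical. The sketch uses the standard IPv4 convention that the fragment offset counts 8-octet units, and it matches fragments on the identification number alone, whereas a real receiver also considers the source address, destination address, and protocol.

```python
from typing import List, NamedTuple, Optional


class Fragment(NamedTuple):
    ident: int         # "identification number" field
    offset_units: int  # "fragment offset" field, in 8-octet units
    more: bool         # "more fragments" bit of the flags field
    payload: bytes


def reassemble(fragments: List[Fragment]) -> Optional[bytes]:
    """Return the original payload, or None if the set is incomplete."""
    if not fragments or len({f.ident for f in fragments}) != 1:
        return None
    ordered = sorted(fragments, key=lambda f: f.offset_units)
    if ordered[-1].more:                          # last fragment not yet received
        return None
    data = b""
    for frag in ordered:
        if frag.offset_units * 8 != len(data):    # gap or overlap detected
            return None
        data += frag.payload
    return data


# Fragments may arrive out of order; the offsets restore the sequence.
parts = [Fragment(7, 1, False, b"tail"), Fragment(7, 0, True, b"12345678")]
print(reassemble(parts))
```

The sketch rejects incomplete sets instead of waiting for late fragments; timeout handling is outside the scope of the text.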

[0111] The “time to live” (TTL) field of the third word is set to time information, in seconds, indicating how long the IP datagram is allowed to exist on the Internet 105. Each relay host on the Internet 105 decrements this field value every time it processes the IP datagram, and an IP datagram whose value has reached 0 or less is discarded from the Internet 105. This suppresses excessive traffic on the Internet 105. Retransmission control for a discarded IP datagram is realized in the control processing for the TCP segment stored in that IP datagram.

[0112] The “protocol” field of the third word holds an integer value that specifies the format of the data stored in the “data” field of the IP datagram. In the present embodiment, as shown in FIG. 6(c), TCP segment data is stored in the “data” field of the IP datagram, so the integer value 6, which specifies that format, is set.

[0113] The “header checksum” field of the third word holds checksum data for detecting errors in the IP header. The fourth word holds the 32-bit “source IP address”. For example, when an IP datagram is transferred from the mobile terminal 101 to the voice control host device 108, the “source IP address” is the IP address assigned to the mobile terminal 101 by the mobile terminal control host device 104 in the outgoing call processing described later. By storing this “source IP address”, the voice control host device 108 of FIG. 1 can return formatted text data and the like to the mobile terminal 101 via the Internet 105.
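The patent does not spell out how the header checksum is computed. The sketch below assumes the standard IPv4 algorithm (the 16-bit ones' complement of the ones' complement sum of all header words, with the checksum field zeroed beforehand). The function name and the sample header bytes are illustrative.

```python
def ip_header_checksum(header: bytes) -> int:
    """16-bit ones' complement of the ones' complement sum of the header words."""
    if len(header) % 2:
        raise ValueError("IP header length must be a multiple of 2 bytes")
    total = 0
    for i in range(0, len(header), 2):
        total += (header[i] << 8) | header[i + 1]
    while total >> 16:                       # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF

# Illustrative 5-word header (protocol 6 = TCP) with the checksum field zeroed.
sample = bytes.fromhex("450000730000400040060000c0a80001c0a800c7")
checksum = ip_header_checksum(sample)
```

A receiver can run the same function over the header with the checksum filled in; a result of 0 means no error was detected.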

[0114] The fifth word holds the 32-bit “destination IP address”. For example, when an IP datagram is transferred from the mobile terminal 101 to the voice control host device 108, the “destination IP address” is the IP address fixedly assigned to the voice control host device 108. The routing unit 114 in the mobile terminal control host device 104, each relay host device on the Internet 105, and the router device 106 in the voice service provider each identify the “destination IP address” stored in a received IP datagram and determine the delivery route of that IP datagram according to the routing table information that each device holds in advance, so that the IP datagram is finally transferred to the voice control host device 108 in the voice service provider.
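The route decision based on the “destination IP address” can be illustrated with a small lookup table. The patent only says that each device consults its own routing table; the longest-prefix-match rule used here is the conventional technique and is an assumption, as are the table contents, addresses, and next-hop labels.

```python
import ipaddress

# Hypothetical routing table: (destination network, next-hop label).
ROUTES = [
    (ipaddress.ip_network("0.0.0.0/0"), "internet-105"),
    (ipaddress.ip_network("192.0.2.0/24"), "router-106"),
    (ipaddress.ip_network("192.0.2.8/32"), "voice-host-108"),
]

def next_hop(destination: str) -> str:
    """Pick the most specific route whose network contains the address."""
    addr = ipaddress.ip_address(destination)
    matches = [(net.prefixlen, hop) for net, hop in ROUTES if addr in net]
    return max(matches)[1]  # the default route guarantees at least one match
```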

[0115] The “IP option” field of the sixth word is optional. It holds information for testing or debugging the networks that make up the Internet 105, control information for controlling or monitoring delivery routes on the Internet 105, and the like, but it is not particularly relevant to the present invention.

[0116] The “padding” field of the sixth word holds padding data for adjusting the data length. Next, TCP segment data is stored in the “data” field of the IP datagram. This TCP segment is defined according to the Transmission Control Protocol (TCP) and provides the function of delivering the data stored in its “data” field to the destination host device on the Internet 105 accurately and in the proper order. Whereas the IP datagram provides only the function of uniquely transferring data on the Internet 105 and does not provide functions for ensuring data reliability (such as retransmission control), the TCP segment provides the functions that ensure data reliability.

[0117] The communication data has this layered structure of (the PPP frame,) the IP datagram, and the TCP segment in order to meet two different requirements efficiently: on the Internet 105, data must be delivered efficiently under as small a processing load as possible, while end to end, data delivery must be as reliable as possible. As a result, a relay host device on the Internet 105 refers only to the IP header of an IP datagram and can thereby deliver the information stored in its “data” field (the TCP segment) to the destination host device as quickly and efficiently as possible, while the end points (the source host device and the destination host device) refer to the TCP header of the TCP segment and can thereby achieve highly reliable data communication, including retransmission control.

[0118] As shown in FIG. 6(b), the TCP segment consists of a TCP header field and a data field. FIG. 7(b) shows the format of the TCP header.

[0119] Like the IP header, the TCP header has a data length of 5 to 6 words, where one word is 32 bits. This data length is stored in the “header length” field of the fourth word, and the data length of the entire IP datagram is stored in the “total length of IP datagram” field of the first word of the IP header.

[0120] The “source port number” and “destination port number” fields of the first word each hold a 16-bit integer value that identifies the communication protocol for the sentence speech recognition processing.

[0121] The packet transmitting/receiving unit 115 (FIG. 1) in the voice control host device 108 transmits and receives not only TCP segments carrying voice data for the sentence speech recognition processing but also various TCP segments carrying various other data, such as e-mail data. By recognizing the value of the “destination port number” field set in the TCP header of a received TCP segment, it can therefore determine to which application running on the voice control host device 108 the data stored in the “data” field of that TCP segment should be handed over.

[0122] When the value of the “destination port number” field set in the TCP header of a received TCP segment corresponds to the communication protocol for the sentence speech recognition processing, the packet transmitting/receiving unit 115 can hand over the voice data stored in the “data” field of that TCP segment to the mobile terminal communication control unit 116.
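The port-based hand-over described above can be sketched as a small dispatcher. The patent does not state the port value used for the sentence speech recognition protocol, so `SPEECH_PORT` is a made-up number; the `dispatch` function and the handler table are likewise hypothetical. The header layout assumed is the standard TCP one (ports in the first word, header length in the upper 4 bits of byte 12).

```python
import struct

SPEECH_PORT = 6001  # hypothetical; the source does not give the actual port value

def dispatch(tcp_segment: bytes, handlers: dict):
    """Hand the segment's data to the handler registered for its destination port."""
    src_port, dst_port = struct.unpack("!HH", tcp_segment[:4])
    header_len = (tcp_segment[12] >> 4) * 4   # "header length" is given in 32-bit words
    handler = handlers.get(dst_port)
    if handler is None:
        return None                            # no application registered for this port
    return handler(src_port, tcp_segment[header_len:])
```

Passing the source port to the handler mirrors the later remark that the receiver can also identify the sending application from the “source port number”.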

[0123] Similarly, the communication control unit 321 (FIG. 3) in the communication unit 111 of the mobile terminal 101 transmits and receives not only TCP segments carrying recognized speech sentence data but also various TCP segments carrying various other data, such as home page data and e-mail data. By recognizing the value of the “destination port number” field set in the TCP header of a received TCP segment, it can therefore determine to which application running on the mobile terminal 101 the data stored in the “data” field of that TCP segment should be handed over.

[0124] When the value of the “destination port number” field set in the TCP header of a received TCP segment corresponds to the communication protocol for the sentence speech recognition processing, the communication control unit 321 can notify the control unit 110 (FIGS. 1 and 3) that data for the sentence speech recognition processing has been received and hand over the recognized speech sentence data and the like stored in the “data” field of that TCP segment.

[0125] Furthermore, the packet transmitting/receiving unit 115 in the voice control host device 108 and the communication control unit 321 in the communication unit 111 of the mobile terminal 101 can identify the source application by checking the “source port number” set in the TCP header of a received TCP segment.

[0126] Next, the “sequence number” field of the second word of the TCP header shown in FIG. 7 is used by the transmitting side to notify the receiving side of the position, within the entire byte stream transmitted from the transmitting side to the receiving side over the current TCP connection, of the first byte of the data stored in the “data” field of this TCP segment. Conversely, the “acknowledgment number” field of the third word is used by the receiving side to notify the transmitting side of how many bytes of that byte stream it has received so far without error. This makes it possible, for example, to transfer voice data from the mobile terminal 101 to the voice control host device 108 in the correct order and with high reliability.
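The interplay of sequence and acknowledgment numbers can be sketched with a toy receiver that reorders segments and reports the next byte it expects. This is a simplification for illustration only: the `Receiver` class is hypothetical, sequence-number wraparound and partially overlapping segments are ignored, and no retransmission timer is modeled.

```python
class Receiver:
    """Reorders segments by sequence number and reports the next expected byte."""

    def __init__(self, initial_seq: int = 0):
        self.next_seq = initial_seq   # value returned as the acknowledgment number
        self.pending = {}             # out-of-order segments keyed by sequence number
        self.stream = bytearray()     # bytes delivered in order so far

    def receive(self, seq: int, data: bytes) -> int:
        if seq >= self.next_seq:      # segments already delivered are ignored
            self.pending.setdefault(seq, data)
        # Deliver every buffered segment that is now contiguous with the stream.
        while self.next_seq in self.pending:
            chunk = self.pending.pop(self.next_seq)
            self.stream += chunk
            self.next_seq += len(chunk)
        return self.next_seq
```

A sender that keeps seeing the same acknowledgment number knows that the segment starting at that byte is missing and should be retransmitted.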

[0127] The “flag string” field of the fourth word holds a value indicating the type of the TCP segment. In TCP communication, various control data for acknowledgment are exchanged, for example at the start or end of a connection, and the type of such control data is set in the “flag string” field.

[0128] The “window” field of the fourth word is used by the receiving side to notify the transmitting side of window data indicating how many bytes of data the receiving side can currently receive in succession. This enables the receiving side to control the flow of data from the transmitting side, allowing fine-grained control such as having the mobile terminal 101 hold back the transmission of voice data when the load on the voice control host device 108 is high.
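On the sending side, the advertised window translates into a simple limit on how much may be transmitted at once. The one-line sketch below is an illustration under the usual TCP interpretation of the window; the function and parameter names are hypothetical.

```python
def sendable_bytes(window: int, bytes_in_flight: int, queued: int) -> int:
    """How many queued bytes the sender may transmit right now.

    window          -- value last advertised in the receiver's "window" field
    bytes_in_flight -- bytes sent but not yet acknowledged
    queued          -- bytes waiting to be sent (e.g. buffered voice data)
    """
    return max(0, min(queued, window - bytes_in_flight))
```

A heavily loaded host can advertise a window of 0, which makes this function return 0 and pauses the terminal's transmission of voice data.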

[0129] The “reserved” field of the fourth word is reserved for future use. The “checksum” field of the fifth word stores checksum data for detecting errors in the TCP header and in the data stored in the “data” field. This allows, for example, the voice control host device 108 to receive voice data from the mobile terminal 101 accurately. The “urgent pointer” of the fifth word is control data for communicating urgent data (such as interrupt data or abort data), but it is not particularly relevant to the present invention.

[0130] The “option” field of the sixth word is used, for example, to specify the maximum segment length that the transmitting and receiving devices can exchange, but it is not particularly relevant to the present invention.

[0131] The “padding” field of the sixth word holds padding data for adjusting the data length. The communication (termination) processing function for TCP segments having the above configuration is implemented in the mobile terminal 101 by the communication control unit 321 (FIG. 3) in the communication unit 111, and in the voice control host device 108 by the packet transmitting/receiving unit 115 (FIG. 1). In the mobile terminal 101, the control program executed by the CPU 316 may instead be configured to implement this processing function.

<Outgoing Call Processing>

As described above, in the transmission processing shown in FIG. 5, which corresponds to step 404 in FIG. 4, when the mobile terminal 101 is not currently connected to the mobile terminal control host device 104 of FIG. 1 and the determination in step 502, 507, or 511 is NO, the CPU 316 (FIG. 3) in the control unit 110 of the mobile terminal 101 requests the communication control unit 321 in the communication unit 111 of FIG. 3 to perform outgoing call processing in step 503, 508, or 512. The outgoing call processing that the communication control unit 321 executes in response to this request is shown in the operation flowchart of FIG. 8.

[0132] First, in step 801, the link establishment phase is executed. In this phase, the access telephone number of the mobile terminal control host device 104 is dialed automatically, and after the mobile terminal control host device 104 answers the call, a protocol called the Link Control Protocol (LCP) is used to negotiate with the connection establishment unit 113 (FIG. 1) in the mobile terminal control host device 104. The negotiation covers matters such as the maximum data length of the PPP frames (FIG. 6(a)) used for communication, the non-transparent characters to be escaped, whether to compress the “protocol” field of the PPP frame (FIG. 6(a)) from 2 octets to 1 octet, and whether to omit (compress) the “address” field of the PPP frame (FIG. 6(a)), which has the fixed value “11111111”. The communication between the communication control unit 321 in the communication unit 111 of the mobile terminal 101 and the connection establishment unit 113 in the mobile terminal control host device 104 is carried out using PPP frames having the format shown in FIG. 6(a), with the hexadecimal value “c021”, which identifies LCP, set in the “protocol” field and the necessary control data set in the “information” field.

[0133] Next, in step 802, the authentication phase is executed. In this phase, an authentication protocol called PAP (Password Authentication Protocol) or CHAP (Challenge Handshake Authentication Protocol) is used, and the connection establishment unit 113 (FIG. 1) in the mobile terminal control host device 104 authenticates the user of the mobile terminal 101. This allows the Internet provider operating the mobile terminal control host device 104 to determine whether the user of the mobile terminal 101 is a subscribed user. The communication between the communication control unit 321 in the communication unit 111 of the mobile terminal 101 and the connection establishment unit 113 in the mobile terminal control host device 104 is carried out using PPP frames having the format shown in FIG. 6(a), with the hexadecimal value “c023”, which identifies PAP, or “c223”, which identifies CHAP, set in the “protocol” field and the necessary authentication data set in the “information” field.

[0134] Finally, in step 803, the network layer protocol phase is executed. In the present embodiment, a protocol called the IP Control Protocol (IPCP) is used in this phase to decide whether to compress the TCP header (see FIG. 7(b)) and to assign to the mobile terminal 101 one of the free (unused) IP addresses that the mobile terminal control host device 104 can allocate. In addition, the necessary route information is set in the communication control unit 321 (FIG. 3) in the communication unit 111 of the mobile terminal 101 and in the routing unit 114 (FIG. 1) in the mobile terminal control host device 104. From then on, by using that IP address, the mobile terminal 101 can access the voice control host device 108 connected to the Internet 105 and any resource on the Internet 105 that the user desires. The communication between the communication control unit 321 in the communication unit 111 of the mobile terminal 101 and the connection establishment unit 113 in the mobile terminal control host device 104 is carried out using PPP frames having the format shown in FIG. 6(a), with the hexadecimal value “8021”, which identifies IPCP, set in the “protocol” field and the data needed to negotiate the IP address and the like set in the “information” field.

[0135] Through the above series of operations, the mobile terminal 101 becomes able to exchange PPP frames carrying TCP/IP packets for communication with the routing unit 114 in the mobile terminal control host device 104, and can freely access resources on the Internet 105.
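The three phases of FIG. 8 and the PPP “protocol” field values named for each can be summarized as a small state table. This is a deliberately simplified sketch: the `Phase` enum, the `OPEN` state label, and the helper functions are hypothetical, and real PPP implementations also accept LCP frames in later phases and handle negotiation failures, which is not modeled here.

```python
from enum import Enum

class Phase(Enum):
    LINK_ESTABLISHMENT = 801   # step numbers from FIG. 8
    AUTHENTICATION = 802
    NETWORK_LAYER = 803
    OPEN = 0                   # hypothetical label: link ready to carry IP datagrams

# PPP "protocol" field values named in the text, per phase.
ALLOWED_PROTOCOLS = {
    Phase.LINK_ESTABLISHMENT: {0xC021},       # LCP
    Phase.AUTHENTICATION: {0xC023, 0xC223},   # PAP or CHAP
    Phase.NETWORK_LAYER: {0x8021},            # IPCP
    Phase.OPEN: {0x0021},                     # IP datagram
}

NEXT_PHASE = {
    Phase.LINK_ESTABLISHMENT: Phase.AUTHENTICATION,
    Phase.AUTHENTICATION: Phase.NETWORK_LAYER,
    Phase.NETWORK_LAYER: Phase.OPEN,
}

def frame_allowed(phase: Phase, protocol: int) -> bool:
    """Check whether a frame with this protocol value belongs to the phase."""
    return protocol in ALLOWED_PROTOCOLS[phase]

def advance(phase: Phase) -> Phase:
    """Move to the next phase once the current negotiation has succeeded."""
    return NEXT_PHASE.get(phase, phase)
```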

[0136] To allow access to the voice control host device 108 and the like even during a PHS call, the mobile terminal 101 can be configured to have, for example, a two-channel simultaneous communication function.

[0137] The communication control unit 321 (FIG. 3) in the communication unit 111 of the mobile terminal 101 can also be configured to automatically disconnect the PPP link with the mobile terminal control host device 104 when it has detected no transmitted or received data for a certain period (for example, 10 minutes).

<Detailed Operation of Transmission/Reception Processing of the Mobile Terminal 101 for Sentence Speech Recognition Processing>

The following describes the detailed operation of the transmission/reception processing that the mobile terminal 101 executes when, and after, the user operates the touch panel of the mobile terminal 101 to instruct the start of real-time or non-real-time sentence speech recognition processing.

[0138] The touch panel operation described above is detected by the touch panel control unit 315 of FIG. 3 and is then detected by the CPU 316 (FIG. 3) in the control unit 110 in the control operation it executes, which corresponds to the operation flowchart of FIG. 4 described above: the determination in step 401 is YES, the determinations in steps 405 and 406 are NO, and the other key input processing of step 409 is executed. Further, in the transmission processing of step 404, the determination in step 501 of FIG. 5 described above is YES, the outgoing call processing is executed in step 503 if necessary, and then, in step 504, the communication control unit 321 in the communication unit 111 of FIG. 3 is requested to transmit the “terminal identification code” of the mobile terminal 101 and the command corresponding to the key input processing that indicates the instruction to start the sentence speech recognition processing.

[0139] As a result, the communication control unit 321 first generates a TCP segment having the format shown in FIG. 6(c). In the TCP header, whose format is shown in FIG. 6(c) and FIG. 7(b), the “source port number” field and the “destination port number” field are each set to a 16-bit integer value that identifies the communication protocol for the sentence speech recognition processing. The “data” field of the TCP segment stores the “terminal identification code” that identifies the mobile terminal 101 (for example, its PHS telephone number) and, according to the user's designation, either a real-time start request command or a non-real-time start request command for the sentence speech recognition processing.

[0140] Next, the communication control unit 321 generates an IP datagram having the format shown in FIG. 6(b), with the above TCP segment stored in its “data” field. In the IP header, whose format is shown in FIG. 6(b) and FIG. 7(a), the “protocol” field is set to the integer value 6, which specifies the format of the TCP segment data stored in the “data” field. The “source IP address” field is set to the IP address assigned to the communication control unit 321 in the communication unit 111 of the mobile terminal 101 by the connection establishment unit 113 in the mobile terminal control host device 104 during the outgoing call processing already executed (see the description of step 803 in FIG. 8). Further, the “destination IP address” field is set to the IP address assigned to the voice control host device 108.

[0141] The communication control unit 321 then generates a PPP frame having the format shown in FIG. 6(a), in which the above IP datagram is stored in the “information” field and the hexadecimal value “0021”, which indicates that an IP datagram is stored in the “information” field, is stored in the “protocol” field. It transmits this PPP frame to the mobile terminal control host device 104 according to the route information set in the communication control unit 321 (see the description of step 803 in FIG. 8). Hereinafter, when a data unit made up of the above TCP segment, IP datagram, and PPP frame is transferred within the Internet 105, that data unit is simply called a TCP/IP packet.
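The three-layer nesting just described (TCP segment inside an IP datagram inside a PPP frame) can be sketched as follows. This is an illustration under several assumptions, not the patent's implementation: the function names, port value 6001, IP addresses, and payload encoding are made up; both checksums are left at 0; and the PPP flag bytes, character escaping, and frame check sequence are omitted. The control byte 0x03 is the standard PPP value and is not stated in the text.

```python
import socket
import struct

def build_tcp_segment(src_port: int, dst_port: int, seq: int, data: bytes) -> bytes:
    # 5-word header; ack number, flags, checksum and urgent pointer are 0 in this sketch.
    header = struct.pack("!HHIIBBHHH", src_port, dst_port, seq, 0, 5 << 4, 0, 4096, 0, 0)
    return header + data

def build_ip_datagram(src_ip: str, dst_ip: str, segment: bytes) -> bytes:
    # Version 4, 5-word header, TTL 64, protocol 6 (TCP); header checksum left at 0.
    header = struct.pack("!BBHHHBBH4s4s", 0x45, 0, 20 + len(segment), 0, 0, 64, 6, 0,
                         socket.inet_aton(src_ip), socket.inet_aton(dst_ip))
    return header + segment

def build_ppp_frame(datagram: bytes) -> bytes:
    # Address 0xFF ("11111111"), control 0x03, protocol 0x0021 (IP datagram).
    return b"\xff\x03" + struct.pack("!H", 0x0021) + datagram

# Hypothetical start request: terminal identification code plus a command name.
payload = b"0701234567|RT_START"
frame = build_ppp_frame(
    build_ip_datagram("10.0.0.2", "192.0.2.8", build_tcp_segment(6001, 6001, 0, payload)))
```

Each layer only prepends its own header, which is why a relay host can route on the IP header alone without inspecting the TCP segment.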

[0142] Based on the “destination IP address” stored in the IP header of the IP datagram that makes up this TCP/IP packet, the packet is transferred by the routing unit 114 in the mobile terminal control host device 104 and by relay host devices (not specifically shown) within the Internet 105 to the router device 106 in the voice service provider, and is then transferred via the LAN 107 to the packet transmitting/receiving unit 115 in the voice control host device 108.

[0143] The packet transmitting/receiving unit 115 receives the transferred TCP/IP packet upon identifying that the “destination IP address” field of the IP header of the IP datagram making up the packet is set to the IP address of its own device, the voice control host device 108.

[0144] The packet transmitting/receiving unit 115 then confirms that the “destination port number” field and the “source port number” field of the TCP segment making up the received TCP/IP packet are set to the 16-bit integer value that identifies the communication protocol for the sentence speech recognition processing, and issues a reception notification to the mobile terminal communication control unit 116 (FIG. 1).

[0145] Together with this notification, the packet transmitting/receiving unit 115 extracts the “source IP address” from the IP header of the IP datagram making up the received TCP/IP packet, extracts the “terminal identification code” and the real-time or non-real-time start request command for the sentence speech recognition processing from the “data” field of the TCP segment making up the packet, and hands these data over to the mobile terminal communication control unit 116.
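The host-side extraction of the source address, terminal identification code, and start command can be sketched as below. The patent does not define how the code and command are encoded in the “data” field, so the `"<terminal id>|<command>"` ASCII layout, the port value, and the function name are all assumptions made for illustration.

```python
import socket
import struct

SPEECH_PORT = 6001  # hypothetical port value for the sentence speech recognition protocol

def extract_start_request(datagram: bytes):
    """Return (source IP, terminal ID, command), or None for unrelated traffic."""
    ip_header_len = (datagram[0] & 0x0F) * 4
    if datagram[9] != 6:                       # "protocol" field: 6 means TCP
        return None
    source_ip = socket.inet_ntoa(datagram[12:16])   # "source IP address" field
    segment = datagram[ip_header_len:]
    src_port, dst_port = struct.unpack("!HH", segment[:4])
    if src_port != SPEECH_PORT or dst_port != SPEECH_PORT:
        return None
    tcp_header_len = (segment[12] >> 4) * 4
    # Hypothetical payload encoding: "<terminal id>|<command>" in ASCII.
    terminal_id, command = segment[tcp_header_len:].decode("ascii").split("|", 1)
    return source_ip, terminal_id, command
```

Keeping the source IP address together with the terminal identification code is what later lets the host address its reply to the terminal's dynamically assigned address.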

[0146] As a result, as described later, a TCP/IP packet storing transmission permission data is returned from the voice control host device 108 to the mobile terminal 101. Based on the “destination IP address” stored in the IP header of the IP datagram making up this TCP/IP packet, the packet is transferred by the router device 106 in the voice service provider and by relay host devices (not specifically shown) within the Internet 105 to the routing unit 114 in the mobile terminal control host device 104, and is then transferred via the PHS network 103 (FIG. 1) to the communication control unit 321 (FIG. 3) in the communication unit 111 of the mobile terminal 101.

[0147] The communication control unit 321 in the communication unit 111 of the mobile terminal 101 receives the transferred TCP/IP packet upon identifying that the “destination IP address” field of the IP header of the IP datagram making up the packet is set to the IP address (temporarily or dynamically) assigned to its own device, the mobile terminal 101.

[0148] The communication control unit 321 then confirms that the “destination port number” field and the “source port number” field of the TCP segment making up the received TCP/IP packet are set to the 16-bit integer value that identifies the communication protocol for the sentence speech recognition processing, and issues a reception notification to the CPU 316 in the control unit 110 of the mobile terminal 101.

[0149] Together with this notification, the communication control unit 321 extracts the transmission permission data from the “data” field of the TCP segment making up the received TCP/IP packet and hands it over to the CPU 316.

【０１５０】ＣＰＵ３１６は、上述の受信通知と送信許
可データを、前述した図４のステップ４０３で処理し、
その送信許可データをＲＡＭ３１７に記憶する。移動端
末１０１では、ユーザがタッチパネルを操作してリアル
タイム又は非リアルタイムによる文音声認識処理の開始
を指示することによって、ＣＰＵ３１６が、前述した図
４のステップ４０８で、図３の入力部１０９内のマイク
制御部３０３に対して、ＰＨＳ通話処理の開始指示、又
は文音声認識処理を実行するためのオフライン状態での
音声入力処理の開始を指示する。これにより、ユーザ
は、通話動作又はオフライン状態での音声入力動作によ
ってマイク３０１（図２の２０１）からの音声の入力を
開始している。The CPU 316 processes the above reception notification and transmission permission data in step 403 of FIG. 4 and stores the transmission permission data in the RAM 317. In the mobile terminal 101, when the user operates the touch panel to instruct the start of real-time or non-real-time sentence speech recognition processing, the CPU 316, in step 408 of FIG. 4, instructs the microphone control unit 303 in the input unit 109 of FIG. 3 to start PHS call processing or to start voice input processing in an offline state for sentence speech recognition processing. The user has thereby begun inputting voice from the microphone 301 (201 in FIG. 2) through a call operation or an offline voice input operation.

【０１５１】これ以後、ＣＰＵ３１６により前述した図
４のステップ４０１→４０２→４０３→４０４→４０１
の繰返しループの１処理として実行されるステップ４０
４の送信処理において、図５のステップ５０５、５０６
の判定がＹＥＳとなり、必要に応じてステップ５０８で
再度の発信処理が実行された後、ステップ５０９で、図
３に示される入力部１０９内のマイク制御部３０３から
制御部１１０内のＲＡＭ３１７に転送されてきている音
声データの送信指示が、通信部１１１内の通信制御部３
２１に対して依頼される。Thereafter, in the transmission processing of step 404, which the CPU 316 executes as one stage of the repeating loop of steps 401 → 402 → 403 → 404 → 401 in FIG. 4, the determinations in steps 505 and 506 of FIG. 5 become YES. After the calling process is executed again in step 508 if necessary, step 509 requests the communication control unit 321 in the communication unit 111 to transmit the voice data that has been transferred from the microphone control unit 303 in the input unit 109 (FIG. 3) to the RAM 317 in the control unit 110.

【０１５２】この結果、通信制御部３２１は、まず、図
６(c) に示されるフォーマットを有するＴＣＰセグメン
トを生成する。この場合に、図６(c) 及び図７(b) に示
されるフォーマットを有するＴＣＰヘッダにおいて、
“送信元ポート番号”フィールド及び“宛先ポート番
号”フィールドには、文音声認識処理のための通信プロ
トコルを特定する１６ビットの整数値が設定される。そ
して、ＴＣＰセグメントの“データ”フィールドには、
図３に示される入力部１０９内のマイク制御部３０３か
ら制御部１１０内のＲＡＭ３１７に転送されてきている
音声データが格納される。As a result, the communication control unit 321 first generates a TCP segment having the format shown in FIG. 6(c). In the TCP header, whose format is shown in FIGS. 6(c) and 7(b), the “source port number” and “destination port number” fields are set to the 16-bit integer value that identifies the communication protocol for sentence speech recognition processing. The “data” field of the TCP segment stores the voice data that has been transferred from the microphone control unit 303 in the input unit 109 (FIG. 3) to the RAM 317 in the control unit 110.
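
As a rough illustration of the TCP segment described in paragraph 0152, the sketch below packs a 20-byte TCP header whose two port fields carry the same protocol identifier and appends the voice data as the “data” field. The port value 5000, the flag bits, the window size, and the function name are assumptions made for this example; the checksum is left at zero because computing it requires the IP pseudo-header.

```python
import struct

# Hypothetical 16-bit identifier; the source does not give the actual value.
RECOGNITION_PORT = 5000


def build_tcp_segment(voice_data: bytes, seq: int = 0, ack: int = 0) -> bytes:
    """Pack a minimal TCP header (no options) followed by the voice data."""
    data_offset = 5 << 4   # header length: 5 x 32-bit words, upper nibble
    flags = 0x18           # PSH + ACK, a typical choice for a data segment
    window = 8192          # illustrative receive window
    header = struct.pack(
        "!HHIIBBHHH",
        RECOGNITION_PORT,  # "source port number"
        RECOGNITION_PORT,  # "destination port number"
        seq, ack,
        data_offset, flags, window,
        0,                 # checksum placeholder (not computed in this sketch)
        0,                 # urgent pointer
    )
    return header + voice_data
```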

【０１５３】次に、通信制御部３２１は、上述のＴＣＰ
セグメントが“データ”フィールドに格納された図６
(b) に示されるフォーマットを有するＩＰデータグラム
を生成する。この場合に、図６(b) 及び図７(a) に示さ
れるフォーマットを有するＩＰヘッダにおいて、“プロ
トコル”フォーマットには、その“データ”フィールド
に格納されるＴＣＰセグメントデータのフォーマットを
規定する整数値６が設定される。また、“送信元ＩＰア
ドレス”フィールドには、既に実行されている発信処理
（図８のステップ８０３の説明を参照）によって移動端
末制御ホスト装置１０４内の接続確立部１１３から移動
端末１０１の通信部１１１内の通信制御部３２１に対し
て付与されたＩＰアドレスが設定される。更に、“宛先
ＩＰアドレス”フィールドには、音声制御ホスト装置１
０８に割り当てられているＩＰアドレスが設定される。Next, the communication control unit 321 generates an IP datagram having the format shown in FIG. 6(b), with the above TCP segment stored in its “data” field. In the IP header, whose format is shown in FIGS. 6(b) and 7(a), the “protocol” field is set to the integer value 6, which specifies the format of the TCP segment data stored in the “data” field. The “source IP address” field is set to the IP address that the connection establishment unit 113 in the mobile terminal control host device 104 assigned to the communication control unit 321 in the communication unit 111 of the mobile terminal 101 during the calling process already executed (see the description of step 803 in FIG. 8). The “destination IP address” field is set to the IP address assigned to the voice control host device 108.
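
The IP datagram of paragraph 0153 can be sketched with the standard IPv4 header layout. The example below is an assumption-laden illustration: the function names, the TTL of 64, and the zeroed identification and fragment fields are choices made here, not values taken from the patent. Only the protocol value 6 and the use of the source and destination address fields come from the text.

```python
import socket
import struct


def ipv4_checksum(header: bytes) -> int:
    """One's-complement sum of the header's 16-bit words (IPv4 header checksum)."""
    total = sum(struct.unpack("!%dH" % (len(header) // 2), header))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def build_ipv4_datagram(src_ip: str, dst_ip: str, tcp_segment: bytes) -> bytes:
    """Wrap a TCP segment in a 20-byte IPv4 header with protocol = 6."""
    version_ihl = (4 << 4) | 5           # IPv4, header of 5 x 32-bit words
    total_length = 20 + len(tcp_segment)
    header = struct.pack(
        "!BBHHHBBH4s4s",
        version_ihl, 0, total_length,
        0, 0,                            # identification, flags/fragment offset
        64, 6, 0,                        # TTL (illustrative), protocol 6 = TCP, checksum placeholder
        socket.inet_aton(src_ip),        # "source IP address": assigned to the terminal
        socket.inet_aton(dst_ip),        # "destination IP address": voice control host
    )
    checksum = ipv4_checksum(header)
    header = header[:10] + struct.pack("!H", checksum) + header[12:]
    return header + tcp_segment
```

Recomputing `ipv4_checksum` over a finished header yields zero, which is how a receiver would validate it.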

【０１５４】そして、通信制御部３２１は、上述のＩＰ
データグラムが“インフォメーション”フィールドに格
納され、その”インフォメーション”フィールドにＩＰ
データグラムが格納されていることを示す１６進値“00
21”が“プロトコル”フィールドに格納された図６(a)
に示されるフォーマットを有するＰＰＰフレームを生成
し、通信制御部３２１内に設定されている経路情報（図
８のステップ８０３の説明を参照）に従って、上記ＰＰ
Ｐフレームを移動端末制御ホスト装置１０４に送信す
る。Then, the communication control unit 321 generates a PPP frame having the format shown in FIG. 6(a), in which the above IP datagram is stored in the “information” field and the hexadecimal value “0021”, indicating that the “information” field holds an IP datagram, is stored in the “protocol” field. It transmits this PPP frame to the mobile terminal control host device 104 according to the route information set in the communication control unit 321 (see the description of step 803 in FIG. 8).
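
The PPP encapsulation of paragraph 0154 reduces, in its simplest form, to prefixing the 16-bit protocol value 0x0021 to the IP datagram. The sketch below shows only that step; the HDLC flag, address and control bytes, and the frame check sequence that a real PPP link carries are deliberately omitted, and the function names are hypothetical.

```python
import struct

PPP_PROTOCOL_IP = 0x0021  # "protocol" field value for an IP datagram


def ppp_encapsulate(ip_datagram: bytes) -> bytes:
    """Place the datagram in the "information" field after the protocol field."""
    return struct.pack("!H", PPP_PROTOCOL_IP) + ip_datagram


def ppp_decapsulate(frame: bytes) -> bytes:
    """Return the IP datagram, rejecting frames that carry another protocol."""
    (protocol,) = struct.unpack("!H", frame[:2])
    if protocol != PPP_PROTOCOL_IP:
        raise ValueError("not an IP datagram: protocol 0x%04x" % protocol)
    return frame[2:]
```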

【０１５５】このＴＣＰ／ＩＰパケットは、それを構成
するＩＰデータグラムのＩＰヘッダに格納されている
“宛先ＩＰアドレス”に基づいて、移動端末制御ホスト
装置１０４内のルーティング部１１４とインターネット
１０５内の特には図示しない中継ホスト装置によって、
音声サービスプロバイダ内のルータ装置１０６まで転送
された後、更に、ＬＡＮ１０７を介して音声制御ホスト
装置１０８内のパケット送受信部１１５まで転送され
る。Based on the “destination IP address” stored in the IP header of its constituent IP datagram, this TCP/IP packet is transferred by the routing unit 114 in the mobile terminal control host device 104 and a relay host device (not shown) in the Internet 105 to the router device 106 in the voice service provider, and then via the LAN 107 to the packet transmitting/receiving unit 115 in the voice control host device 108.

【０１５６】パケット送受信部１１５は、転送されてき
たＴＣＰ／ＩＰパケットを構成するＩＰデータグラムの
ＩＰヘッダの“宛先ＩＰアドレス”フィールドに自分で
ある音声制御ホスト装置１０８のＩＰアドレスが設定さ
れていることを識別することによって、そのＴＣＰ／Ｉ
Ｐパケットを受信する。The packet transmitting/receiving unit 115 receives the transferred TCP/IP packet by identifying that the “destination IP address” field in the IP header of its constituent IP datagram holds the IP address of the voice control host device 108 itself.

【０１５７】そして、パケット送受信部１１５は、受信
したＴＣＰ／ＩＰパケットを構成するＴＣＰセグメント
の“宛先ポート番号”フィールド及び“送信元ポート番
号”フィールド文音声認識処理のための通信プロトコル
を特定する１６ビットの整数値が設定されていることを
確認することにより、移動端末通信制御部１１６（図
１）に対して受信通知を通知する。Then, the packet transmitting/receiving unit 115 confirms that the “destination port number” and “source port number” fields of the TCP segment constituting the received TCP/IP packet hold the 16-bit integer value that identifies the communication protocol for sentence speech recognition processing, and thereupon issues a reception notification to the mobile terminal communication control unit 116 (FIG. 1).

【０１５８】この通知と共に、パケット送受信部１１５
は、受信したＴＣＰ／ＩＰパケットを構成するＩＰデー
タグラムのＩＰヘッダから“送信元ＩＰアドレス”を取
り出し、上記ＴＣＰ／ＩＰパケットを構成するＴＣＰセ
グメントの“データ”フィールドから音声データを取り
出して、それらのデータを移動端末通信制御部１１６に
引き渡す。Along with this notification, the packet transmitting/receiving unit 115 extracts the “source IP address” from the IP header of the IP datagram constituting the received TCP/IP packet, extracts the voice data from the “data” field of the TCP segment constituting that packet, and passes both to the mobile terminal communication control unit 116.

【０１５９】この結果、移動端末通信制御部１１６は、
後述するようにして文音声認識処理の制御を実行し、文
音声認識部１１７に対して受信した音声データの認識処
理を実行させる。そして、上述の音声データについて移
動端末１０１から文音声認識処理のリアルタイム開始要
求コマンドが指定されている場合には、移動端末通信制
御部１１６は、後述するようにして、文音声認識部１１
７からその結果として得た認識音声文章データが格納さ
れたＴＣＰ／ＩＰパケットを、移動端末１０１に対して
返信する。As a result, the mobile terminal communication control unit 116 controls the sentence speech recognition processing as described later and causes the sentence speech recognition unit 117 to perform recognition on the received voice data. If the mobile terminal 101 has specified the real-time start request command of the sentence speech recognition processing for this voice data, the mobile terminal communication control unit 116 returns to the mobile terminal 101, as described later, a TCP/IP packet storing the recognized voice sentence data obtained as the result from the sentence speech recognition unit 117.

【０１６０】このＴＣＰ／ＩＰパケットは、それを構成
するＩＰデータグラムのＩＰヘッダに格納されている
“宛先ＩＰアドレス”に基づいて、音声サービスプロバ
イダ内のルータ装置１０６と、インターネット１０５内
の特には図示しない中継ホスト装置によって、移動端末
制御ホスト装置１０４内のルーティング部１１４まで転
送された後、更に、ＰＨＳ網１０３（図１）を介して移
動端末１０１の通信部１１１内の通信制御部３２１（図
３）まで転送される。Based on the “destination IP address” stored in the IP header of its constituent IP datagram, this TCP/IP packet is transferred by the router device 106 in the voice service provider and a relay host device (not shown) in the Internet 105 to the routing unit 114 in the mobile terminal control host device 104, and then via the PHS network 103 (FIG. 1) to the communication control unit 321 (FIG. 3) in the communication unit 111 of the mobile terminal 101.

【０１６１】移動端末１０１の通信部１１１内の通信制
御部３２１は、転送されてきたＴＣＰ／ＩＰパケットを
構成するＩＰデータグラムのＩＰヘッダの“宛先ＩＰア
ドレス”フィールドに自分である移動端末１０１（に一
時的又は動的）に割当てられているのＩＰアドレスが設
定されていることを識別することによって、そのＴＣＰ
／ＩＰパケットを受信する。The communication control unit 321 in the communication unit 111 of the mobile terminal 101 receives the transferred TCP/IP packet by identifying that the “destination IP address” field in the IP header of its constituent IP datagram holds the IP address assigned (temporarily or dynamically) to the mobile terminal 101 itself.

【０１６２】そして、通信制御部３２１は、受信したＴ
ＣＰ／ＩＰパケットを構成するＴＣＰセグメントの“宛
先ポート番号”フィールド及び“送信元ポート番号”フ
ィールドに文音声認識処理のための通信プロトコルを特
定する１６ビットの整数値が設定されていることを確認
することにより、移動端末１０１の制御部１１０内のＣ
ＰＵ３１６に対して受信通知を通知する。Then, the communication control unit 321 confirms that the “destination port number” and “source port number” fields of the TCP segment constituting the received TCP/IP packet hold the 16-bit integer value that identifies the communication protocol for sentence speech recognition processing, and thereupon issues a reception notification to the CPU 316 in the control unit 110 of the mobile terminal 101.

【０１６３】この通知と共に、通信制御部３２１は、受
信したＴＣＰ／ＩＰパケットを構成するＴＣＰセグメン
トの“データ”フィールドから認識音声文章データを取
り出し、それをＣＰＵ３１６に引き渡す。At the same time as this notification, the communication control unit 321 extracts the recognized voice text data from the “data” field of the TCP segment constituting the received TCP / IP packet, and delivers it to the CPU 316.

【０１６４】ＣＰＵ３１６は、上述の受信通知と認識音
声文章データを、前述した図４のステップ４０２で処理
し、その認識音声文章データをＬＣＤ表示部３１１（図
２の２０３）に表示する。The CPU 316 processes the above-described reception notification and the recognized voice text data in the above-described step 402 in FIG. 4, and displays the recognized voice text data on the LCD display unit 311 (203 in FIG. 2).

【０１６５】ユーザは、移動端末１０１のタッチパネル
を操作することにより、音声制御ホスト装置１０８に対
して文音声認識処理の終了を示すための、文音声認識処
理の終了要求コマンドを指示することができる。By operating the touch panel of the mobile terminal 101, the user can issue a sentence speech recognition end request command, which tells the voice control host device 108 to end the sentence speech recognition processing.

【０１６６】この場合に、上述のタッチパネルの操作
は、図３のタッチパネル制御部３１５において検出され
た後、制御部１１０内のＣＰＵ３１６（図３）によっ
て、それが実行される前述した図４の動作フローチャー
トに対応する制御動作において、ステップ４０１の判定
がＹＥＳ、ステップ４０５及び４０６の判定がＮＯとな
って、ステップ４０９の他キー入力処理が実行されるこ
とにより、検出される。更に、ステップ４０４の送信処
理において、前述した図５のステップ５０１の判定がＹ
ＥＳとなり、必要に応じてステップ５０３で発信処理が
実行された後、ステップ５０４において、移動端末１０
１の“端末識別コード”と上述の文音声認識処理の終了
要求コマンドの送信指示が、図３の通信部１１１内の通
信制御部３２１に対して依頼される。In this case, the above touch-panel operation is detected by the touch panel control unit 315 of FIG. 3 and is then detected by the CPU 316 (FIG. 3) in the control unit 110 when, in the control operation corresponding to the operation flowchart of FIG. 4 that it executes, the determination in step 401 is YES, the determinations in steps 405 and 406 are NO, and the other-key input processing of step 409 is executed. Further, in the transmission processing of step 404, the determination in step 501 of FIG. 5 becomes YES. After the calling process is executed in step 503 if necessary, step 504 requests the communication control unit 321 in the communication unit 111 of FIG. 3 to transmit the “terminal identification code” of the mobile terminal 101 and the above sentence speech recognition end request command.

【０１６７】この結果、通信制御部３２１は、まず、
“データ”フィールドに移動端末１０１を特定する“端
末識別コード”と文音声認識処理の終了要求コマンドと
が格納された図６(c) に示されるフォーマットを有する
ＴＣＰセグメントを生成し、次に、そのＴＣＰセグメン
トが“データ”フィールドに格納された図６(b) に示さ
れるフォーマットを有するＩＰデータグラムを生成し、
更に、そのＩＰデータグラムが“インフォメーション”
フィールドに格納された図６(a) に示されるフォーマッ
トを有するＰＰＰフレームを生成し、それらからなるＴ
ＣＰ／ＩＰパケットを送信する。この場合に、ＴＣＰヘ
ッダ（図６(c) 、図７(b) ）、ＩＰヘッダ（図６(b) 、
図７(a) ）、及び“プロトコル”フィールド（図６(a)
）に設定される各情報は、前述の文音声認識処理のリ
アルタイム開始要求コマンド又は文音声認識処理の非リ
アルタイム開始要求コマンドが送信される場合に設定さ
れる各情報と同一である。As a result, the communication control unit 321 first generates a TCP segment having the format shown in FIG. 6(c), whose “data” field stores the “terminal identification code” identifying the mobile terminal 101 and the sentence speech recognition end request command. It next generates an IP datagram having the format shown in FIG. 6(b), with that TCP segment stored in its “data” field, then generates a PPP frame having the format shown in FIG. 6(a), with that IP datagram stored in its “information” field, and transmits the resulting TCP/IP packet. The information set in the TCP header (FIGS. 6(c) and 7(b)), the IP header (FIGS. 6(b) and 7(a)), and the “protocol” field (FIG. 6(a)) is the same as that set when the real-time start request command or the non-real-time start request command of the sentence speech recognition processing is transmitted.

【０１６８】この結果、上述のＴＣＰ／ＩＰパケット
は、前述の文音声認識処理のリアルタイム開始要求コマ
ンド等が格納されたＴＣＰ／ＩＰパケットの場合と全く
同様にして、インターネット１０５を介して音声制御ホ
スト装置１０８内のパケット送受信部１１５まで転送さ
れる。As a result, the TCP / IP packet is transmitted via the Internet 105 in the same manner as the TCP / IP packet storing the real-time start request command for the sentence speech recognition processing. The packet is transferred to the packet transmitting / receiving unit 115 in the device 108.

【０１６９】パケット送受信部１１５は、前述の文音声
認識処理のリアルタイム開始要求コマンド等が格納され
たＴＣＰ／ＩＰパケットが転送されてきた場合と全く同
様にして、転送されてきたＴＣＰ／ＩＰパケットを受信
し、移動端末通信制御部１１６（図１）に対して受信通
知を通知する。The packet transmitting / receiving unit 115 transmits the transferred TCP / IP packet in exactly the same manner as when the TCP / IP packet storing the above-described real-time start request command for the sentence / voice recognition processing is transferred. It receives and notifies the mobile terminal communication control unit 116 (FIG. 1) of a reception notification.

【０１７０】この通知と共に、パケット送受信部１１５
は、受信したＴＣＰ／ＩＰパケットを構成するＴＣＰセ
グメントの“データ”フィールドから“端末識別コー
ド”と文音声認識処理の終了要求コマンドとを取り出し
て、それらのデータを移動端末通信制御部１１６に引き
渡す。Along with this notification, the packet transmitting/receiving unit 115 extracts the “terminal identification code” and the sentence speech recognition end request command from the “data” field of the TCP segment constituting the received TCP/IP packet and passes them to the mobile terminal communication control unit 116.

【０１７１】この結果、移動端末通信制御部１１６は、
後述するようにしてその移動端末１０１に対する文音声
認識処理を終了する。ユーザは、文音声認識処理の非リ
アルタイム開始要求コマンドを指定することにより、音
声制御ホスト装置１０８に対して、認識音声文章データ
を即座に移動端末１０１に返信することはさせずに、そ
れを順次蓄積させることができる。この場合、ユーザ
は、後に、移動端末１０１のタッチパネルを操作するこ
とにより、音声制御ホスト装置１０８に蓄積されている
認識音声文章データを一括して移動端末１０１に返信さ
せるための、文音声認識結果の一括転送要求コマンドを
指示することができる。As a result, the mobile terminal communication control unit 116 ends the sentence speech recognition processing for that mobile terminal 101, as described later. By specifying the non-real-time start request command of the sentence speech recognition processing, the user can have the voice control host device 108 accumulate the recognized voice sentence data sequentially instead of returning it to the mobile terminal 101 immediately. In this case, the user can later operate the touch panel of the mobile terminal 101 to issue a batch transfer request command for the sentence speech recognition results, which causes the recognized voice sentence data accumulated in the voice control host device 108 to be returned to the mobile terminal 101 in a batch.

【０１７２】この場合に、上述のタッチパネルの操作
は、図３のタッチパネル制御部３１５において検出され
た後、制御部１１０内のＣＰＵ３１６（図３）によっ
て、それが実行される前述した図４の動作フローチャー
トに対応する制御動作において、ステップ４０１の判定
がＹＥＳ、ステップ４０５及び４０６の判定がＮＯとな
って、ステップ４０９の他キー入力処理が実行されるこ
とにより、検出される。更に、ステップ４０４の送信処
理において、前述した図５のステップ５０１の判定がＹ
ＥＳとなり、必要に応じてステップ５０３で発信処理が
実行された後、ステップ５０４において、移動端末１０
１の“端末識別コード”と上述の文音声認識結果の一括
転送要求コマンドの送信指示が、図３の通信部１１１内
の通信制御部３２１に対して依頼される。In this case, the above touch-panel operation is detected by the touch panel control unit 315 of FIG. 3 and is then detected by the CPU 316 (FIG. 3) in the control unit 110 when, in the control operation corresponding to the operation flowchart of FIG. 4 that it executes, the determination in step 401 is YES, the determinations in steps 405 and 406 are NO, and the other-key input processing of step 409 is executed. Further, in the transmission processing of step 404, the determination in step 501 of FIG. 5 becomes YES. After the calling process is executed in step 503 if necessary, step 504 requests the communication control unit 321 in the communication unit 111 of FIG. 3 to transmit the “terminal identification code” of the mobile terminal 101 and the above batch transfer request command for the sentence speech recognition results.

【０１７３】この結果、通信制御部３２１はまず、“デ
ータ”フィールドに移動端末１０１を特定する“端末識
別コード”と文音声認識結果の一括転送要求コマンドと
が格納された図６(c) に示されるフォーマットを有する
ＴＣＰセグメントを生成し、次に、そのＴＣＰセグメン
トが“データ”フィールドに格納された図６(b) に示さ
れるフォーマットを有するＩＰデータグラムを生成し、
更に、そのＩＰデータグラムが“インフォメーション”
フィールドに格納された図６(a) に示されるフォーマッ
トを有するＰＰＰフレームを生成し、それらからなるＴ
ＣＰ／ＩＰパケットを送信する。この場合に、ＴＣＰヘ
ッダ（図６(c) 、図７(b) ）、ＩＰヘッダ（図６(b) 、
図７(a) ）、及び“プロトコル”フィールド（図６(a)
）に設定される各情報は、前述の文音声認識処理のリ
アルタイム開始要求コマンド又は文音声認識処理の非リ
アルタイム開始要求コマンドが送信される場合に設定さ
れる各情報と同一である。As a result, the communication control unit 321 first generates a TCP segment having the format shown in FIG. 6(c), whose “data” field stores the “terminal identification code” identifying the mobile terminal 101 and the batch transfer request command for the sentence speech recognition results. It next generates an IP datagram having the format shown in FIG. 6(b), with that TCP segment stored in its “data” field, then generates a PPP frame having the format shown in FIG. 6(a), with that IP datagram stored in its “information” field, and transmits the resulting TCP/IP packet. The information set in the TCP header (FIGS. 6(c) and 7(b)), the IP header (FIGS. 6(b) and 7(a)), and the “protocol” field (FIG. 6(a)) is the same as that set when the real-time start request command or the non-real-time start request command of the sentence speech recognition processing is transmitted.

【０１７４】この結果、上述のＴＣＰ／ＩＰパケット
は、前述の文音声認識処理のリアルタイム開始要求コマ
ンド等が格納されたＴＣＰ／ＩＰパケットの場合と全く
同様にして、インターネット１０５を介して音声制御ホ
スト装置１０８内のパケット送受信部１１５まで転送さ
れる。As a result, the TCP / IP packet is transmitted via the Internet 105 in the same manner as the TCP / IP packet in which the real-time start request command for the sentence speech recognition processing is stored. The packet is transferred to the packet transmitting / receiving unit 115 in the device 108.

【０１７５】パケット送受信部１１５は、前述の文音声
認識処理のリアルタイム開始要求コマンド等が格納され
たＴＣＰ／ＩＰパケットが転送されてきた場合と全く同
様にして、転送されてきたＴＣＰ／ＩＰパケットを受信
し、移動端末通信制御部１１６（図１）に対して受信通
知を通知する。The packet transmitting / receiving unit 115 transmits the transferred TCP / IP packet in exactly the same manner as when the TCP / IP packet storing the above-described real-time start request command for the sentence / voice recognition processing is transferred. It receives and notifies the mobile terminal communication control unit 116 (FIG. 1) of a reception notification.

【０１７６】この通知と共に、パケット送受信部１１５
は、受信したＴＣＰ／ＩＰパケットを構成するＩＰデー
タグラムのＩＰヘッダから“送信元ＩＰアドレス”を取
り出し、上記ＴＣＰ／ＩＰパケットを構成するＴＣＰセ
グメントの“データ”フィールドから“端末識別コー
ド”と文音声認識結果の一括転送要求コマンドとを取り
出して、それらのデータを移動端末通信制御部１１６に
引き渡す。Along with this notification, the packet transmitting/receiving unit 115 extracts the “source IP address” from the IP header of the IP datagram constituting the received TCP/IP packet, extracts the “terminal identification code” and the batch transfer request command for the sentence speech recognition results from the “data” field of the TCP segment constituting that packet, and passes them to the mobile terminal communication control unit 116.

【０１７７】この結果、移動端末通信制御部１１６は、
後述するようにして、認識音声文章データが格納された
ＴＣＰ／ＩＰパケットを、移動端末１０１に対して一括
して転送する。As a result, the mobile terminal communication control unit 116 transfers the TCP/IP packets storing the recognized voice sentence data to the mobile terminal 101 in a batch, as described later.

【０１７８】このＴＣＰ／ＩＰパケットは、前述したよ
うにして通信制御部３２１（図３）まで転送された後、
通信制御部３２１からＣＰＵ３１６に、受信通知と認識
音声文章データが引き渡される。After this TCP/IP packet has been transferred to the communication control unit 321 (FIG. 3) as described above, the communication control unit 321 passes the reception notification and the recognized voice sentence data to the CPU 316.

【０１７９】ＣＰＵ３１６は、上述の受信通知と認識音
声文章データを、前述した図４のステップ４０２で処理
し、その認識音声文章データをＬＣＤ表示部３１１（図
２の２０３）に表示する。＜移動端末通信制御部１１６と文音声認識部１１７の概
略動作＞次に、音声制御ホスト装置１０８内の移動端末
通信制御部１１６と文音声認識部１１７の概略動作につ
いて説明する。The CPU 316 processes the above reception notification and recognized voice sentence data in step 402 of FIG. 4 and displays the recognized voice sentence data on the LCD display unit 311 (203 in FIG. 2).

<Outline of Operation of Mobile Terminal Communication Control Unit 116 and Sentence Speech Recognition Unit 117>

Next, the outline of the operation of the mobile terminal communication control unit 116 and the sentence speech recognition unit 117 in the voice control host device 108 will be described.

【０１８０】移動端末通信制御部１１６は、文音声認識
処理のリアルタイム開始要求コマンド又は文音声認識処
理の非リアルタイム開始要求コマンドを送信した移動端
末１０１に割当てられている“端末識別コード”（上記
コマンドを転送してきたＴＣＰセグメントに格納されて
いる）毎に、図１３に示されるデータ構造を有する処理
端末登録テーブルにエントリを登録すると共に、音声デ
ータの受信用のバッファファイル（音声バッファファイ
ル）と認識音声文章データの送信用のバッファファイル
（文章バッファファイル）とを音声制御ホスト装置１０
８が管理するファイルシステム上に作成する。また、移
動端末通信制御部１１６は、上記エントリとファイルの
登録に成功すると、上記コマンドを転送してきたＩＰデ
ータグラムに格納されていた“送信元ＩＰアドレス”の
移動端末１０１に向けて、送信許可データを返信する。For each “terminal identification code” assigned to a mobile terminal 101 that has transmitted the real-time start request command or the non-real-time start request command of the sentence speech recognition processing (the code is stored in the TCP segment that carried the command), the mobile terminal communication control unit 116 registers an entry in the processing terminal registration table having the data structure shown in FIG. 13. It also creates, on the file system managed by the voice control host device 108, a buffer file for receiving voice data (voice buffer file) and a buffer file for transmitting recognized voice sentence data (sentence buffer file). When registration of the entry and the files succeeds, the mobile terminal communication control unit 116 returns transmission permission data to the mobile terminal 101 at the “source IP address” stored in the IP datagram that carried the command.

【０１８１】移動端末通信制御部１１６は、それ以後移
動端末１０１から受信した音声データを、その“送信元
ＩＰアドレス”（それを転送してきたＩＰデータグラム
に格納されている）に対応する処理端末登録テーブルの
エントリから特定される音声バッファファイルに追加書
き込みする。Thereafter, the mobile terminal communication control unit 116 appends voice data received from the mobile terminal 101 to the voice buffer file identified from the processing terminal registration table entry that corresponds to its “source IP address” (stored in the IP datagram that carried the data).
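
The registration and buffering behavior of paragraphs 0180 and 0181 might look roughly like the sketch below. The entry fields follow those the description lists for the table of FIG. 13 (terminal identification code, source IP address, real-time request flag, last access time, and the two buffer file names). The class and method names, the file naming scheme, and the decision to overwrite an existing entry on re-registration are assumptions made for this illustration.

```python
import os
import tempfile
import time
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Entry:
    terminal_id: str      # "terminal identification code"
    source_ip: str        # "source IP address"
    realtime: bool        # whether the real-time request is set
    last_access: float    # last access time
    voice_file: str       # voice buffer file name
    sentence_file: str    # sentence buffer file name


class RegistrationTable:
    """Hypothetical stand-in for the processing terminal registration table."""

    def __init__(self, directory: Optional[str] = None) -> None:
        self.directory = directory or tempfile.mkdtemp()
        self.entries: Dict[str, Entry] = {}

    def register(self, terminal_id: str, source_ip: str, realtime: bool) -> bool:
        """Create both buffer files and the entry; True means permission can be sent."""
        voice = os.path.join(self.directory, terminal_id + ".voice")
        sentence = os.path.join(self.directory, terminal_id + ".sentence")
        try:
            for path in (voice, sentence):
                with open(path, "ab"):
                    pass  # create the file if it does not exist yet
        except OSError:
            return False
        self.entries[terminal_id] = Entry(
            terminal_id, source_ip, realtime, time.time(), voice, sentence
        )
        return True

    def append_voice(self, source_ip: str, data: bytes) -> bool:
        """Append voice data to the buffer of the entry matching source_ip."""
        for entry in self.entries.values():
            if entry.source_ip == source_ip:
                with open(entry.voice_file, "ab") as f:
                    f.write(data)
                return True
        return False
```

Note that voice data is matched to an entry by source IP address rather than by terminal identification code, because the voice data packets described above carry only the voice data in their “data” field.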

【０１８２】文音声認識部１１７は、図１３に示される
処理端末登録テーブルのエントリ毎に、各エントリから
特定される音声バッファファイルに音声データが受信さ
れていればそれに対して文音声認識処理を実行し、その
結果得られる認識音声文章データを上記各エントリに対
応する文章バッファファイルに追加書き込みする。For each entry in the processing terminal registration table shown in FIG. 13, the sentence speech recognition unit 117 performs sentence speech recognition processing on any voice data received in the voice buffer file identified from that entry, and appends the resulting recognized voice sentence data to the sentence buffer file corresponding to that entry.

【０１８３】移動端末通信制御部１１６は、リアルタイ
ム要求が設定されている処理端末登録テーブルのエント
リ毎に、各エントリから特定される文章バッファファイ
ルに認識音声文章データが得られていれば、それを各エ
ントリに登録されている“送信元ＩＰアドレス”の移動
端末１０１に向けて返信する。For each entry in the processing terminal registration table for which the real-time request is set, the mobile terminal communication control unit 116 checks whether recognized voice sentence data has been obtained in the sentence buffer file identified from that entry and, if so, returns it to the mobile terminal 101 at the “source IP address” registered in that entry.

【０１８４】移動端末通信制御部１１６は、文音声認識
処理の終了要求コマンドを受信しかつリアルタイム要求
が設定されている処理端末登録テーブルのエントリ、又
はリアルタイム要求が設定されておりかつ最終アクセス
時刻が現在時刻から一定時間前の時刻よりも更に前の時
刻である処理端末登録テーブルのエントリについて、そ
のエントリの内容を削除し、それから特定される音声／
文章バッファファイルを削除する。The mobile terminal communication control unit 116 deletes the contents of any processing terminal registration table entry for which the sentence speech recognition end request command has been received and the real-time request is set, or for which the real-time request is set and the last access time is earlier than a fixed interval before the current time. It also deletes the voice and sentence buffer files identified from that entry.
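
The deletion rule of paragraph 0184 is a two-part condition that applies only to entries with the real-time request set. A minimal sketch follows; the `end_requested` flag, the `Entry` class, and the function name are hypothetical, since the text does not say how receipt of the end request command is recorded.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class Entry:
    terminal_id: str
    realtime: bool               # real-time request set?
    last_access: float           # last access time, in seconds
    end_requested: bool = False  # hypothetical marker: end request command received


def entries_to_delete(entries: Iterable[Entry], now: float, timeout: float) -> List[str]:
    """Return the terminal ids whose entry and buffer files should be deleted."""
    doomed = []
    for e in entries:
        if not e.realtime:
            continue  # paragraph 0184 names only entries with the real-time request set
        if e.end_requested or e.last_access < now - timeout:
            doomed.append(e.terminal_id)
    return doomed
```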

【０１８５】移動端末通信制御部１１６は、リアルタイ
ム要求が設定されていない（即ち、非リアルタイム要求
が設定されている）処理端末登録テーブルのエントリに
ついて、そのエントリの内容とそれから特定される音声
／文章バッファファイルは、対応する移動端末１０１と
の通信終了後も保存し、後に、その移動端末１０１から
文音声認識結果の一括転送要求コマンドを受信した時点
で、その移動端末１０１の“端末識別コード”に対応す
る処理端末登録テーブルのエントリから特定される文章
バッファファイルに保存されていた認識音声文章データ
を、上記コマンドを転送してきたＩＰデータグラムに格
納されていた“送信元ＩＰアドレス”の移動端末１０１
に向けて送信する。＜移動端末通信制御部１１６の詳細動作＞図９〜図１２
は、上記機能を実現するために、移動端末通信制御部１
１６が実行する制御動作を示す動作フローチャートであ
る。この動作フローチャートは、移動端末通信制御部１
１６を制御する特には図示しないプロセッサが、特には
図示しない制御プログラムを実行する動作として実現さ
れる。For an entry in the processing terminal registration table for which the real-time request is not set (that is, the non-real-time request is set), the mobile terminal communication control unit 116 keeps the contents of the entry and the voice and sentence buffer files identified from it even after communication with the corresponding mobile terminal 101 ends. Later, on receiving a batch transfer request command for the sentence speech recognition results from that mobile terminal 101, it transmits the recognized voice sentence data saved in the sentence buffer file, identified from the entry corresponding to that terminal's “terminal identification code”, to the mobile terminal 101 at the “source IP address” stored in the IP datagram that carried the command.

<Detailed Operation of Mobile Terminal Communication Control Unit 116>

FIGS. 9 to 12 are operation flowcharts showing the control operation that the mobile terminal communication control unit 116 executes to realize the above functions. These flowcharts are realized as the operation of a processor (not shown) that controls the mobile terminal communication control unit 116 executing a control program (not shown).

【０１８６】まず、ステップ９０１で、音声制御ホスト
装置１０８内のパケット送受信部１１５（図１）から受
信通知が通知されたか否かが判定される。前述したよう
に、パケット送受信部１１５は、インターネット１０５
から転送されてきたＴＣＰ／ＩＰパケットを構成するＩ
ＰデータグラムのＩＰヘッダの“宛先ＩＰアドレス”フ
ィールドに自分である音声制御ホスト装置１０８のＩＰ
アドレスが設定されていることを識別することにより、
そのＴＣＰ／ＩＰパケットを受信し、かつ、それを構成
するＴＣＰセグメントの“宛先ポート番号”フィールド
及び“送信元ポート番号”フィールドに文音声認識処理
のための通信プロトコルを特定する１６ビットの整数値
が設定されていることを確認することによって、移動端
末通信制御部１１６に対して受信通知を通知する。この
受信通知は、文音声認識処理のリアルタイム開始要求コ
マンド、文音声認識処理の非リアルタイム開始要求コマ
ンド、文音声認識処理の対象である音声データ、文音声
認識処理の終了要求コマンド、又は文音声認識結果の一
括転送要求コマンドの何れかに関する受信通知である。
パケット送受信部１１５から受信通知が通知されステッ
プ９０１の判定がＹＥＳとなると、ステップ９０２で、
パケット送受信部１１５から受信通知と共に引き渡され
たデータが取り込まれる。この場合に、受信通知が、文
音声認識処理のリアルタイム開始要求コマンド、文音声
認識処理の非リアルタイム開始要求コマンド、又は文音
声認識結果の一括転送要求コマンドの何れかの受信通知
である場合には、“送信元ＩＰアドレス”と“端末識別
コード”と上記コマンドとが取り込まれる。また、受信
通知が、音声データの受信通知である場合には、“送信
元ＩＰアドレス”と音声データとが取り込まれる。更
に、受信通知が、文音声認識処理の終了要求コマンドの
受信通知である場合には、“端末識別コード”とそのコ
マンドとが取り込まれる。First, step 901 determines whether a reception notification has been issued from the packet transmitting/receiving unit 115 (FIG. 1) in the voice control host device 108. As described above, the packet transmitting/receiving unit 115 receives a TCP/IP packet transferred from the Internet 105 by identifying that the “destination IP address” field in the IP header of its constituent IP datagram holds the IP address of the voice control host device 108 itself. On confirming that the “destination port number” and “source port number” fields of the constituent TCP segment hold the 16-bit integer value that identifies the communication protocol for sentence speech recognition processing, it issues a reception notification to the mobile terminal communication control unit 116. This reception notification concerns one of the following: a real-time start request command of the sentence speech recognition processing, a non-real-time start request command of the sentence speech recognition processing, voice data to be subjected to sentence speech recognition processing, a sentence speech recognition end request command, or a batch transfer request command for the sentence speech recognition results. When a reception notification is issued from the packet transmitting/receiving unit 115 and the determination in step 901 is YES, step 902 takes in the data passed together with the reception notification. If the notification concerns a real-time start request command, a non-real-time start request command, or a batch transfer request command, the “source IP address”, the “terminal identification code”, and the command are taken in. If it concerns voice data, the “source IP address” and the voice data are taken in. If it concerns an end request command, the “terminal identification code” and the command are taken in.

【０１８７】ステップ９０２の処理の後に、図９のステ
ップ９０３、図１０のステップ９０９、図１０のステッ
プ９１１、又は図１１のステップ９１５の判定が順に検
査され、何れかの判定結果がＹＥＳとなる。即ち、ステ
ップ９０２でパケット送受信部１１５から引き渡された
データが、文音声認識処理のリアルタイム開始要求コマ
ンド又は文音声認識処理の非リアルタイム開始要求コマ
ンドに関するものである場合はステップ９０３の判定が
ＹＥＳとなってステップ９０４〜９０８が実行され、音
声データに関するものである場合は図１０のステップ９
０９の判定がＹＥＳとなってステップ９１０が実行さ
れ、文音声認識処理の終了要求コマンドに関するもので
ある場合には図１０のステップ９１１の判定がＹＥＳと
なってステップ９１２〜９１４が実行され、文音声認識
結果の一括転送要求コマンドに関するものである場合に
は図１１のステップ９１５の判定がＹＥＳとなってステ
ップ９１６〜９１８が実行される。After the processing of step 902, the determinations in step 903 of FIG. 9, step 909 of FIG. 10, step 911 of FIG. 10, and step 915 of FIG. 11 are checked in order, and one of them becomes YES. That is, if the data passed from the packet transmitting/receiving unit 115 in step 902 concerns a real-time start request command or a non-real-time start request command of the sentence speech recognition processing, the determination in step 903 becomes YES and steps 904 to 908 are executed. If it concerns voice data, the determination in step 909 of FIG. 10 becomes YES and step 910 is executed. If it concerns a sentence speech recognition end request command, the determination in step 911 of FIG. 10 becomes YES and steps 912 to 914 are executed. If it concerns a batch transfer request command for the sentence speech recognition results, the determination in step 915 of FIG. 11 becomes YES and steps 916 to 918 are executed.
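
The four-way branch of paragraph 0187 amounts to a dispatch on the kind of data received. The sketch below maps each kind to the range of flowchart steps that the text says handles it; the `Kind` enumeration and the `dispatch` function are illustrative names, and the step ranges are returned as strings only to keep the example self-checking.

```python
from enum import Enum, auto


class Kind(Enum):
    REALTIME_START = auto()
    NON_REALTIME_START = auto()
    VOICE_DATA = auto()
    END_REQUEST = auto()
    BATCH_TRANSFER = auto()


def dispatch(kind: Kind) -> str:
    """Return the flowchart steps that handle the given kind of received data."""
    if kind in (Kind.REALTIME_START, Kind.NON_REALTIME_START):  # step 903 is YES
        return "904-908"
    if kind is Kind.VOICE_DATA:                                  # step 909 is YES
        return "910"
    if kind is Kind.END_REQUEST:                                 # step 911 is YES
        return "912-914"
    if kind is Kind.BATCH_TRANSFER:                              # step 915 is YES
        return "916-918"
    raise ValueError(kind)
```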

【０１８８】パケット送受信部１１５から受信通知が通
知されておらずステップ９０１の判定がＮＯの場合、又
は上述の各コマンド又は音声データの受信に対応する処
理の後には、図１２のステップ９１９と９２０で認識音
声文章データの送信処理が実行され、それに続くステッ
プ９２１及び９２２で最終アクセス時刻が一定時間以上
前である移動端末１０１との通信を終了させるための処
理が行われた後、再び図９のステップ９０１の判定処理
に戻る。If no reception notification has been issued from the packet transmitting/receiving unit 115 and the determination in step 901 is NO, or after the processing corresponding to receipt of one of the above commands or of voice data, steps 919 and 920 of FIG. 12 execute the transmission processing for recognized voice sentence data. The following steps 921 and 922 then perform processing to end communication with any mobile terminal 101 whose last access time is a fixed interval or more in the past, after which control returns to the determination of step 901 in FIG. 9.

【０１８９】ステップ９０１の判定がＹＥＳであり、ス
テップ９０２でパケット送受信部１１５から引き渡され
たデータが文音声認識処理のリアルタイム開始要求コマ
ンド又は文音声認識処理の非リアルタイム開始要求コマ
ンドに関するものである場合において、ステップ９０３
の判定がＹＥＳとなって実行されるステップ９０４〜９
０８の処理について説明する。The following describes the processing of steps 904 to 908, which is executed when the determination in step 901 is YES, the data passed from the packet transmitting/receiving unit 115 in step 902 relates to a real-time start request command or a non-real-time start request command for sentence speech recognition processing, and the determination in step 903 is therefore YES.

【０１９０】まず、ステップ９０４では、ステップ９０
２でパケット送受信部１１５から引き渡された“端末識
別コード”に対応するエントリが、処理端末登録テーブ
ルに登録されているか否かが判定される。First, in step 904, it is determined whether an entry corresponding to the “terminal identification code” passed from the packet transmitting/receiving unit 115 in step 902 is registered in the processing terminal registration table.

【０１９１】移動端末１０１から初めて文音声認識処理
のリアルタイム開始要求コマンド又は文音声認識処理の
非リアルタイム開始要求コマンドが指定された場合に
は、この判定はＮＯとなる。When a real-time start request command for sentence speech recognition processing or a non-real-time start request command for sentence speech recognition processing is specified for the first time from the mobile terminal 101, this determination is NO.

【０１９２】その結果、ステップ９０５では、音声デー
タの受信用のバッファファイルである音声バッファファ
イルと、認識音声文章データの送信用のバッファファイ
ルである文章バッファファイルとが、音声制御ホスト装
置１０８が管理するファイルシステム上に作成される。As a result, in step 905, a voice buffer file, which is a buffer file for receiving voice data, and a text buffer file, which is a buffer file for transmitting recognized voice text data, are created on the file system managed by the voice control host device 108.

【０１９３】次に、ステップ９０６では、移動端末通信
制御部１１６内の特には図示しないメモリに記憶される
図１３に示されるデータ構造を有する処理端末登録テー
ブルに、１つのエントリ（横１行のデータ組）が確保さ
れる。そして、そのエントリに、“端末識別コード”
と、“送信元ＩＰアドレス”と、リアルタイム要求の有
無と、最終アクセス時刻と、音声バッファファイル名
と、文章バッファファイル名とが、登録される。“端末
識別コード”は、ステップ９０２においてパケット送受
信部１１５から引き渡されたデータであり、移動端末１
０１から転送されてきたＴＣＰ／ＩＰパケットを構成す
るＴＣＰセグメントの“データ”フィールドに格納され
ていたものである（図６(c) 参照）。“送信元ＩＰアド
レス”は、やはりステップ９０２においてパケット送受
信部１１５から引き渡されたデータであり、移動端末１
０１から転送されてきたＴＣＰ／ＩＰパケットを構成す
るＩＰデータグラムのＩＰヘッダに格納されていたもの
である（図６(b) 、図７(a) 参照）。リアルタイム要求
の有無は、ステップ９０２においてパケット送受信部１
１５から引き渡されたコマンドが、文音声認識処理のリ
アルタイム開始要求コマンドである場合には“有り”
（値“１”）に設定され、文音声認識処理の非リアルタ
イム開始要求コマンドである場合には“無し”（値
“０”）に設定される。最終アクセス時刻には、現在時
刻が設定される。音声バッファファイル名と文章バッフ
ァファイル名は、ステップ９０５で作成された各ファイ
ルを示すファイル名である。Next, in step 906, one entry (one horizontal row of data) is allocated in the processing terminal registration table, which has the data structure shown in FIG. 13 and is stored in a memory (not shown) in the mobile terminal communication control unit 116. The “terminal identification code”, the “source IP address”, the presence or absence of a real-time request, the last access time, the voice buffer file name, and the text buffer file name are then registered in that entry. The “terminal identification code” is data passed from the packet transmitting/receiving unit 115 in step 902 and was stored in the “data” field of the TCP segment constituting the TCP/IP packet transferred from the mobile terminal 101 (see FIG. 6(c)). The “source IP address” is likewise data passed from the packet transmitting/receiving unit 115 in step 902 and was stored in the IP header of the IP datagram constituting the TCP/IP packet transferred from the mobile terminal 101 (see FIGS. 6(b) and 7(a)). The presence or absence of a real-time request is set to “present” (value “1”) if the command passed from the packet transmitting/receiving unit 115 in step 902 is a real-time start request command for sentence speech recognition processing, and to “absent” (value “0”) if it is a non-real-time start request command for sentence speech recognition processing. The last access time is set to the current time. The voice buffer file name and the text buffer file name are the names of the files created in step 905.

【０１９４】次に、前述したように、移動端末１０１か
ら初めて文音声認識処理のリアルタイム開始要求コマン
ド又は文音声認識処理の非リアルタイム開始要求コマン
ドが指定された場合には、処理端末登録テーブルに対応
するエントリは存在しないため、前述したステップ９０
４の判定はＹＥＳとなる。また、前述したように、移動
端末１０１から文音声認識処理のリアルタイム開始要求
コマンドが指定されて文音声認識処理の実行が開始され
た後に、その移動端末１０１から文音声認識処理の終了
要求コマンドを受信した場合又は最終アクセス時刻が現
在時刻から一定時間前の時刻よりも更に前の時刻である
（即ち一定時間アクセスがない）場合には、処理端末登
録テーブル上の対応するエントリは削除される。As described above, when a real-time start request command or a non-real-time start request command for sentence speech recognition processing is specified from the mobile terminal 101 for the first time, no corresponding entry exists in the processing terminal registration table, so the determination in step 904 described above is NO. Also as described above, after a real-time start request command for sentence speech recognition processing has been specified from the mobile terminal 101 and execution of the sentence speech recognition processing has started, the corresponding entry in the processing terminal registration table is deleted when an end request command for sentence speech recognition processing is received from that mobile terminal 101, or when the last access time is earlier than the time a predetermined period before the current time (that is, there has been no access for the predetermined period).

【０１９５】しかし、移動端末１０１から文音声認識処
理のリアルタイム開始要求コマンドが指定されて文音声
認識処理の実行が開始された後に、処理端末登録テーブ
ル上の対応するエントリが削除されないうちに、再び同
じ移動端末１０１から文音声認識処理のリアルタイム開
始要求コマンド又は文音声認識処理の非リアルタイム開
始要求コマンドが指定された場合は、ステップ９０４の
判定はＹＥＳとなる。また、前回、移動端末１０１から
文音声認識処理の非リアルタイム開始要求コマンドが指
定された文音声認識処理が実行された後、まだ、その移
動端末１０１から文音声認識結果の一括転送要求コマン
ドが指定されていないうちに、再び同じ移動端末１０１
から文音声認識処理のリアルタイム開始要求コマンド又
は文音声認識処理の非リアルタイム開始要求コマンドが
指定された場合にも、ステップ９０４の判定はＹＥＳと
なる（後述する図１１のステップ９１７参照）。However, if, after a real-time start request command for sentence speech recognition processing has been specified from the mobile terminal 101 and execution of the sentence speech recognition processing has started, a real-time start request command or a non-real-time start request command is specified again from the same mobile terminal 101 before the corresponding entry in the processing terminal registration table has been deleted, the determination in step 904 is YES. Likewise, if, after sentence speech recognition processing specified by a non-real-time start request command from the mobile terminal 101 has been executed, a real-time start request command or a non-real-time start request command is specified again from the same mobile terminal 101 before a batch transfer request command for the sentence speech recognition result has been specified from that mobile terminal 101, the determination in step 904 is also YES (see step 917 in FIG. 11, described later).

【０１９６】このような場合には、処理端末登録テーブ
ル上の削除されていない前回と同じエントリが使用され
る。そして、ステップ９０７において、処理端末登録テ
ーブル上の上記エントリに記憶されている“送信元ＩＰ
アドレス”とリアルタイム要求の有無のみが、図９のス
テップ９０２においてパケット送受信部１１５から引き
渡された新しいデータに更新され、また、最終アクセス
時刻が現在時刻に更新される。In such a case, the same entry as before, which has not been deleted from the processing terminal registration table, is used. In step 907, only the “source IP address” and the presence or absence of a real-time request stored in that entry are updated to the new data passed from the packet transmitting/receiving unit 115 in step 902 in FIG. 9, and the last access time is updated to the current time.
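
The branch taken in steps 904 to 907 can be sketched as follows. This is a minimal illustration, not the patented implementation: the `Entry` class, the `handle_start_request` function, the use of a Python dict keyed by terminal identification code, and the buffer file naming scheme are all assumptions made for the example, and the actual creation of the buffer files in step 905 is omitted.

```python
import time
from dataclasses import dataclass


@dataclass
class Entry:
    """One row of the processing terminal registration table (cf. FIG. 13)."""
    terminal_id: str
    source_ip: str
    realtime: bool        # True = "present" (1), False = "absent" (0)
    last_access: float
    voice_buffer: str     # voice buffer file name (hypothetical naming)
    text_buffer: str      # text buffer file name (hypothetical naming)


def handle_start_request(table, terminal_id, source_ip, realtime, now=None):
    """Register a new entry or refresh an existing one (steps 904 to 907)."""
    now = time.time() if now is None else now
    entry = table.get(terminal_id)             # step 904: entry registered?
    if entry is None:
        # Steps 905 and 906: name the two buffer files and register the entry.
        entry = Entry(terminal_id, source_ip, realtime, now,
                      f"{terminal_id}.voice", f"{terminal_id}.text")
        table[terminal_id] = entry
    else:
        # Step 907: only the address, the real-time flag, and the time change.
        entry.source_ip = source_ip
        entry.realtime = realtime
        entry.last_access = now
    return entry
```

Note that a repeated start request reuses the existing buffer file names, which matches the description that only the address, the real-time flag, and the last access time are updated in step 907.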

【０１９７】前述したステップ９０６又は上記ステップ
９０７の処理の後、ステップ９０８では、ステップ９０
２でパケット送受信部１１５から引き渡され処理端末登
録テーブルの上記エントリに登録された“送信元ＩＰア
ドレス”に向けて、送信許可データが返信される。After the processing of step 906 or step 907 described above, in step 908, transmission permission data is returned to the “source IP address” that was passed from the packet transmitting/receiving unit 115 in step 902 and registered in the above entry of the processing terminal registration table.

【０１９８】具体的には、移動端末通信制御部１１６
は、“送信元ＩＰアドレス”への送信許可データの返信
を、パケット送受信部１１５（図１）に対して依頼す
る。この結果、パケット送受信部１１５は、まず、図６
(c) に示されるフォーマットを有するＴＣＰセグメント
を生成する。この場合、図６(c) 及び図７(b) に示され
るフォーマットを有するＴＣＰヘッダにおいて、“送信
元ポート番号”フィールド及び“宛先ポート番号”フィ
ールドには、文音声認識処理のための通信プロトコルを
特定する１６ビットの整数値が設定される。そして、Ｔ
ＣＰセグメントの“データ”フィールドには、送信許可
データが格納される。Specifically, the mobile terminal communication control unit 116 requests the packet transmitting/receiving unit 115 (FIG. 1) to return the transmission permission data to the “source IP address”. In response, the packet transmitting/receiving unit 115 first generates a TCP segment having the format shown in FIG. 6(c). In the TCP header, which has the format shown in FIGS. 6(c) and 7(b), the “source port number” field and the “destination port number” field are each set to a 16-bit integer value identifying the communication protocol for sentence speech recognition processing. The transmission permission data is stored in the “data” field of the TCP segment.

【０１９９】次に、パケット送受信部１１５は、上述の
ＴＣＰセグメントが“データ”フィールドに格納された
図６(b) に示されるフォーマットを有するＩＰデータグ
ラムを生成する。この場合に、図６(b) 及び図７(a) に
示されるフォーマットを有するＩＰヘッダにおいて、
“プロトコル”フォーマットには、その“データ”フィ
ールドに格納されるＴＣＰセグメントデータのフォーマ
ットを規定する整数値６が設定される。また、“送信元
ＩＰアドレス”フィールドには、音声制御ホスト装置１
０８に割当てられているＩＰアドレスが設定される。更
に、“宛先ＩＰアドレス”フィールドには、図９のステ
ップ９０２でパケット送受信部１１５から引き渡された
“送信元ＩＰアドレス”が設定される。Next, the packet transmitting/receiving unit 115 generates an IP datagram having the format shown in FIG. 6(b), with the above TCP segment stored in its “data” field. In the IP header, which has the format shown in FIGS. 6(b) and 7(a), the “protocol” field is set to the integer value 6, which identifies the data stored in the “data” field as a TCP segment. The “source IP address” field is set to the IP address assigned to the voice control host device 108, and the “destination IP address” field is set to the “source IP address” passed from the packet transmitting/receiving unit 115 in step 902 in FIG. 9.

【０２００】そして、パケット送受信部１１５は、上述
のＩＰデータグラムが格納されたＬＡＮ１０７上のプロ
トコルに従ったフレームを生成し、それをＬＡＮ１０７
に送出する。例えば、ＬＡＮ１０７がイーサネット方式
によるローカルエリアネットワークであれば、上記フレ
ームは、イーサネットフレームである。The packet transmitting/receiving unit 115 then generates a frame that conforms to the protocol used on the LAN 107 and contains the above IP datagram, and sends it out onto the LAN 107. For example, if the LAN 107 is an Ethernet local area network, the frame is an Ethernet frame.
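
The nesting of the TCP segment inside the IP datagram can be illustrated with a short sketch. The header layouts follow the standard 20-byte IPv4 and TCP headers without options. The port value `RECOG_PORT`, the payload `b"SEND_OK"`, and the function names are hypothetical, since the description does not give concrete values, and the TCP checksum is left at zero because it would require a pseudo-header that is outside the scope of this example.

```python
import socket
import struct

RECOG_PORT = 5000  # hypothetical 16-bit value identifying the recognition protocol


def checksum(data: bytes) -> int:
    """16-bit one's-complement checksum as used for the IPv4 header."""
    if len(data) % 2:
        data += b"\x00"
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def build_tcp_segment(payload: bytes) -> bytes:
    """TCP header with both port fields set to RECOG_PORT, followed by the data."""
    header = struct.pack("!HHIIBBHHH",
                         RECOG_PORT, RECOG_PORT,  # source and destination port
                         0, 0,                    # sequence and acknowledgment numbers
                         5 << 4, 0,               # data offset (5 words), flags
                         65535, 0, 0)             # window, checksum (unset), urgent
    return header + payload


def build_ipv4_header(src_ip: str, dst_ip: str, payload_len: int) -> bytes:
    """IPv4 header whose protocol field is 6 (TCP)."""
    fields = [(4 << 4) | 5, 0, 20 + payload_len,  # version/IHL, TOS, total length
              0, 0, 64, 6, 0,                     # id, fragment, TTL, protocol, checksum
              socket.inet_aton(src_ip), socket.inet_aton(dst_ip)]
    fields[7] = checksum(struct.pack("!BBHHHBBH4s4s", *fields))
    return struct.pack("!BBHHHBBH4s4s", *fields)


segment = build_tcp_segment(b"SEND_OK")
datagram = build_ipv4_header("192.0.2.10", "198.51.100.7", len(segment)) + segment
```

The resulting `datagram` would in turn be placed in the payload of a LAN frame such as an Ethernet frame. In practice the operating system's TCP/IP stack builds these headers, so the sketch only shows which fields the description refers to.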

【０２０１】上記フレームとＩＰデータグラムとＴＣＰ
セグメントとから構成されるＴＣＰ／ＩＰパケットは、
それを構成するＩＰデータグラムのＩＰヘッダに格納さ
れている“宛先ＩＰアドレス”に基づいて、ルータ装置
１０６及びインターネット１０５を介して移動端末制御
ホスト装置１０４まで転送された後、更に、ＰＨＳ網１
０３及び無線基地（又は有線接続装置）１０２を介し
て、移動端末１０１の通信部１１１内の通信制御部３２
１（図３）まで転送される。The TCP/IP packet made up of the above frame, IP datagram, and TCP segment is transferred, on the basis of the “destination IP address” stored in the IP header of its IP datagram, to the mobile terminal control host device 104 via the router device 106 and the Internet 105, and then via the PHS network 103 and the radio base station (or wired connection device) 102 to the communication control unit 321 (FIG. 3) in the communication unit 111 of the mobile terminal 101.

【０２０２】これ以降、移動端末１０１から音声制御ホ
スト装置１０８へは、前述したようにして、音声データ
が転送されてくる。ステップ９０８の処理の後は、図１
２のステップ９１９と９２０で認識音声文章データの送
信処理が実行され、それに続くステップ９２１及び９２
２で最終アクセス時刻が一定時間以上前である移動端末
１０１との通信を終了させるための処理が行われた後、
再び図９のステップ９０１の判定処理に戻る。Thereafter, voice data is transferred from the mobile terminal 101 to the voice control host device 108 as described above. After the processing of step 908, transmission processing of the recognized voice text data is executed in steps 919 and 920 in FIG. 12, processing for terminating communication with any mobile terminal 101 whose last access time is a predetermined time or more in the past is performed in the subsequent steps 921 and 922, and the process then returns to the determination processing of step 901 in FIG. 9.

【０２０３】次に、図９のステップ９０１の判定がＹＥ
Ｓであり、ステップ９０２でパケット送受信部１１５か
ら引き渡されたデータが音声データである場合におい
て、図１０のステップ９０９の判定がＹＥＳとなって実
行されるステップ９１０の処理について説明する。Next, the processing of step 910 is described. This step is executed when the determination in step 901 in FIG. 9 is YES, the data passed from the packet transmitting/receiving unit 115 in step 902 is voice data, and the determination in step 909 in FIG. 10 is therefore YES.

【０２０４】即ち、ステップ９１０では、図９のステッ
プ９０２でパケット送受信部１１５から引き渡されたの
と同じ“送信元ＩＰアドレス”が記憶されている処理端
末登録テーブル（図１３）のエントリが検索され、該当
するエントリに記憶されている音声バッファファイル名
に対応する音声バッファファイル（図９のステップ９０
６参照）に、図９のステップ９０２でパケット送受信部
１１５から引き渡された音声データが追加書き込みされ
る。なお、追加書込み時の音声バッファファイルのサイ
ズは、音声制御ホスト装置１０８が管理するファイルシ
ステムによって自動的に調整される。In step 910, the processing terminal registration table (FIG. 13) is searched for the entry that stores the same “source IP address” as the one passed from the packet transmitting/receiving unit 115 in step 902 in FIG. 9, and the voice data passed from the packet transmitting/receiving unit 115 in step 902 in FIG. 9 is appended to the voice buffer file (see step 906 in FIG. 9) corresponding to the voice buffer file name stored in that entry. The size of the voice buffer file during appending is adjusted automatically by the file system managed by the voice control host device 108.

【０２０５】また、ステップ９１０では、上記該当する
エントリに記憶されている最終アクセス時刻が、現在時
刻に更新される。このようにして、移動端末１０１毎
（“端末識別コード”毎）の音声バッファファイルを介
して、移動端末通信制御部１１６から文音声認識部１１
７（図１）に音声データが引き渡される。即ち、文音声
認識部１１７は、後述するように、図１３に示される処
理端末登録テーブルのエントリ毎に、各エントリから特
定される音声バッファファイルに音声データが受信され
ていればそれに対して文音声認識処理を実行し、その結
果得られる認識音声文章データを上記各エントリに対応
する文章バッファファイルに追加書き込みすることにな
る。Also in step 910, the last access time stored in that entry is updated to the current time. In this way, voice data is passed from the mobile terminal communication control unit 116 to the sentence speech recognition unit 117 (FIG. 1) via the voice buffer file for each mobile terminal 101 (each “terminal identification code”). That is, as described later, for each entry in the processing terminal registration table shown in FIG. 13, the sentence speech recognition unit 117 executes sentence speech recognition processing on any voice data received in the voice buffer file identified by that entry, and appends the resulting recognized voice text data to the text buffer file corresponding to that entry.
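
Step 910 amounts to a lookup by source address followed by an append. The sketch below assumes that the table is a dict of plain dicts and that the function is called `handle_voice_data`; both are illustrative choices rather than details taken from the description.

```python
import os
import tempfile
import time


def handle_voice_data(table, source_ip, voice_data, now=None):
    """Append voice data to the sender's voice buffer file (step 910)."""
    now = time.time() if now is None else now
    for entry in table.values():
        if entry["source_ip"] == source_ip:     # entry is looked up by address
            with open(entry["voice_buffer"], "ab") as f:
                f.write(voice_data)             # append mode grows the file
            entry["last_access"] = now
            return True
    return False                                # no registered terminal matched


# Usage with a throwaway directory standing in for the host's file system.
workdir = tempfile.mkdtemp()
table = {"T1": {"source_ip": "10.0.0.1", "last_access": 0.0,
                "voice_buffer": os.path.join(workdir, "T1.voice")}}
handle_voice_data(table, "10.0.0.1", b"\x01\x02", now=5.0)
handle_voice_data(table, "10.0.0.1", b"\x03", now=6.0)
```

Opening the file in append mode leaves the resizing to the file system, which corresponds to the remark that the buffer file size is adjusted automatically.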

【０２０６】ステップ９１０の処理の後は、図１２のス
テップ９１９と９２０で認識音声文章データの送信処理
が実行され、それに続くステップ９２１及び９２２で最
終アクセス時刻が一定時間以上前である移動端末１０１
との通信を終了させるための処理が行われた後、再び図
９のステップ９０１の判定処理に戻る。After the processing of step 910, transmission processing of the recognized voice text data is executed in steps 919 and 920 in FIG. 12, processing for terminating communication with any mobile terminal 101 whose last access time is a predetermined time or more in the past is performed in the subsequent steps 921 and 922, and the process then returns to the determination processing of step 901 in FIG. 9.

【０２０７】次に、図９のステップ９０１の判定がＹＥ
Ｓであり、ステップ９０２でパケット送受信部１１５か
ら引き渡されたデータが文音声認識処理の終了要求コマ
ンドに関するものである場合において、図１０のステッ
プ９１１の判定がＹＥＳとなって実行されるステップ９
１２〜９１４の処理について説明する。Next, the processing of steps 912 to 914 is described. These steps are executed when the determination in step 901 in FIG. 9 is YES, the data passed from the packet transmitting/receiving unit 115 in step 902 relates to an end request command for sentence speech recognition processing, and the determination in step 911 in FIG. 10 is therefore YES.

【０２０８】まず、ステップ９１２では、図９のステッ
プ９０２でパケット送受信部１１５から引き渡されたの
と同じ“端末識別コード”が記憶されている処理端末登
録テーブル（図１３）のエントリに、リアルタイム要求
として“有り”（値“１”）が記憶されているか否かが
判定される。First, in step 912, it is determined whether “present” (value “1”) is stored as the real-time request in the entry of the processing terminal registration table (FIG. 13) that stores the same “terminal identification code” as the one passed from the packet transmitting/receiving unit 115 in step 902 in FIG. 9.

【０２０９】上記エントリにリアルタイム要求として
“有り”（値“１”）が記憶されていない、即ちリアル
タイム要求として“無し”（値“０”が記憶されてお
り、移動端末１０１から最初に文音声認識処理の非リア
ルタイム開始要求コマンドが指示されていた場合には、
文音声認識処理の終了要求コマンドの受信後も処理端末
登録テーブルのエントリ、文章バッファファイル、及び
音声バッファファイル（この内容は通常文音声認識処理
が終了した時点で空となる）の各内容は保持されるた
め、ステップ９１２の判定がＮＯとなってステップ９１
３及び９１４の処理は実行されず、図１２のステップ９
１９と９２０で認識音声文章データの送信処理が実行さ
れ、それに続くステップ９２１及び９２２で最終アクセ
ス時刻が一定時間以上前である移動端末１０１との通信
を終了させるための処理が行われた後、再び図９のステ
ップ９０１の判定処理に戻る。If “present” (value “1”) is not stored as the real-time request in that entry, that is, if “absent” (value “0”) is stored because a non-real-time start request command for sentence speech recognition processing was initially specified from the mobile terminal 101, the entry in the processing terminal registration table, the text buffer file, and the voice buffer file (whose contents normally become empty once the sentence speech recognition processing has finished) are all retained even after the end request command for sentence speech recognition processing is received. The determination in step 912 is therefore NO, and steps 913 and 914 are not executed. Transmission processing of the recognized voice text data is executed in steps 919 and 920 in FIG. 12, processing for terminating communication with any mobile terminal 101 whose last access time is a predetermined time or more in the past is performed in the subsequent steps 921 and 922, and the process then returns to the determination processing of step 901 in FIG. 9.

【０２１０】一方、前記エントリにリアルタイム要求と
して“有り”（値“１”）が記憶されている、即ち移動
端末１０１から最初に文音声認識処理のリアルタイム開
始要求コマンドが指示されていた場合は、ステップ９１
２の判定がＹＥＳとなる。On the other hand, if “present” (value “1”) is stored as the real-time request in that entry, that is, if a real-time start request command for sentence speech recognition processing was initially specified from the mobile terminal 101, the determination in step 912 is YES.

【０２１１】この結果、まず、ステップ９１３で、図９
のステップ９０２でパケット送受信部１１５から引き渡
されたのと同じ“端末識別コード”が記憶されている処
理端末登録テーブル（図１３）のエントリの内容が全て
削除される。As a result, first, in step 913, the entire contents of the entry of the processing terminal registration table (FIG. 13) that stores the same “terminal identification code” as the one passed from the packet transmitting/receiving unit 115 in step 902 in FIG. 9 are deleted.

【０２１２】また、ステップ９１４で、上記エントリに
記憶されていた音声バッファファイル名に対応する音声
バッファファイル及び文章バッファファイル名に対応す
る文章バッファファイルが、音声制御ホスト装置１０８
が管理するファイルシステム上から削除される。Then, in step 914, the voice buffer file corresponding to the voice buffer file name and the text buffer file corresponding to the text buffer file name that were stored in that entry are deleted from the file system managed by the voice control host device 108.
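
The asymmetry between real-time and non-real-time sessions in steps 912 to 914 can be made concrete with a small sketch. The dict-based table and the name `handle_end_request` are assumptions for illustration only.

```python
import os
import tempfile


def handle_end_request(table, terminal_id):
    """Return True if the entry and its buffer files were removed."""
    entry = table.get(terminal_id)
    if entry is None or not entry["realtime"]:
        # Step 912 is NO: a non-real-time session keeps its entry and buffers
        # so that the result can still be fetched by a batch transfer request.
        return False
    del table[terminal_id]                                    # step 913
    for name in (entry["voice_buffer"], entry["text_buffer"]):
        if os.path.exists(name):
            os.remove(name)                                   # step 914
    return True


# Usage: one real-time and one non-real-time session.
workdir = tempfile.mkdtemp()
table = {}
for tid, realtime in (("RT", True), ("BATCH", False)):
    paths = [os.path.join(workdir, f"{tid}.{ext}") for ext in ("voice", "text")]
    for p in paths:
        open(p, "wb").close()
    table[tid] = {"realtime": realtime,
                  "voice_buffer": paths[0], "text_buffer": paths[1]}
removed_rt = handle_end_request(table, "RT")
removed_batch = handle_end_request(table, "BATCH")
```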

【０２１３】ステップ９１４の処理の後は、図１２のス
テップ９１９と９２０で認識音声文章データの送信処理
が実行され、それに続くステップ９２１及び９２２で最
終アクセス時刻が一定時間以上前である移動端末１０１
との通信を終了させるための処理が行われた後、再び図
９のステップ９０１の判定処理に戻る。After the processing of step 914, transmission processing of the recognized voice text data is executed in steps 919 and 920 in FIG. 12, processing for terminating communication with any mobile terminal 101 whose last access time is a predetermined time or more in the past is performed in the subsequent steps 921 and 922, and the process then returns to the determination processing of step 901 in FIG. 9.

【０２１４】続いて、図９のステップ９０１の判定がＹ
ＥＳであり、ステップ９０２でパケット送受信部１１５
から引き渡されたデータが文音声認識結果の一括転送要
求コマンドに関するものである場合において、図１１の
ステップ９１５の判定がＹＥＳとなって実行されるステ
ップ９１６〜９１８の処理について説明する。Next, the processing of steps 916 to 918 is described. These steps are executed when the determination in step 901 in FIG. 9 is YES, the data passed from the packet transmitting/receiving unit 115 in step 902 relates to a batch transfer request command for the sentence speech recognition result, and the determination in step 915 in FIG. 11 is therefore YES.

【０２１５】まず、ステップ９１６では、図９のステッ
プ９０２でパケット送受信部１１５から引き渡されたの
と同じ“端末識別コード”が記憶されている処理端末登
録テーブル（図１３）のエントリから得られる文章バッ
ファファイル名に対応する文章バッファファイルから、
認識音声文章データが一括して読み出され、それが１つ
以上のＴＣＰ／ＩＰパケットに格納され、図９のステッ
プ９０２でパケット送受信部１１５から引き渡された
“送信元ＩＰアドレス”に向けて送信される。First, in step 916, the recognized voice text data is read out in a batch from the text buffer file corresponding to the text buffer file name obtained from the entry of the processing terminal registration table (FIG. 13) that stores the same “terminal identification code” as the one passed from the packet transmitting/receiving unit 115 in step 902 in FIG. 9. The data is stored in one or more TCP/IP packets and transmitted to the “source IP address” passed from the packet transmitting/receiving unit 115 in step 902 in FIG. 9.
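
Storing the batch in "one or more" packets implies splitting the buffered text into payload-sized pieces. A minimal sketch follows; the function name and the 512-byte default limit are assumptions, since the description does not state a payload size.

```python
def split_into_payloads(text_data: bytes, max_payload: int = 512):
    """Split buffered recognized text into consecutive packet payloads."""
    if max_payload <= 0:
        raise ValueError("max_payload must be positive")
    return [text_data[i:i + max_payload]
            for i in range(0, len(text_data), max_payload)]


parts = split_into_payloads(b"abcdefghij", max_payload=4)
```

Because TCP delivers a byte stream, the receiving terminal can simply concatenate the payloads in order, even if a multi-byte character happens to straddle two packets.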

【０２１６】具体的には、移動端末通信制御部１１６
は、“送信元ＩＰアドレス”への認識音声文章データの
送信を、パケット送受信部１１５に対し依頼する。この
結果、パケット送受信部１１５は、まず、図６(c) に示
されるフォーマットを有するＴＣＰセグメントを生成す
る。この場合、図６(c) 及び図７(b) に示されるフォー
マットを有するＴＣＰヘッダにおいて、“送信元ポート
番号”フィールド及び“宛先ポート番号”フィールドに
は、文音声認識処理のための通信プロトコルを特定する
１６ビットの整数値が設定される。そして、ＴＣＰセグ
メントの“データ”フィールドには、認識音声文章デー
タが格納される。Specifically, the mobile terminal communication control unit 116 requests the packet transmitting/receiving unit 115 to transmit the recognized voice text data to the “source IP address”. In response, the packet transmitting/receiving unit 115 first generates a TCP segment having the format shown in FIG. 6(c). In the TCP header, which has the format shown in FIGS. 6(c) and 7(b), the “source port number” field and the “destination port number” field are each set to a 16-bit integer value identifying the communication protocol for sentence speech recognition processing. The recognized voice text data is stored in the “data” field of the TCP segment.

【０２１７】次に、パケット送受信部１１５は、上述の
ＴＣＰセグメントが“データ”フィールドに格納された
図６(b) に示されるフォーマットを有するＩＰデータグ
ラムを生成する。この場合に、図６(b) 及び図７(a) に
示されるフォーマットを有するＩＰヘッダにおいて、
“プロトコル”フォーマットには、その“データ”フィ
ールドに格納されるＴＣＰセグメントデータのフォーマ
ットを規定する整数値６が設定される。また、“送信元
ＩＰアドレス”フィールドには、音声制御ホスト装置１
０８に割当てられているＩＰアドレスが設定される。更
に、“宛先ＩＰアドレス”フィールドには、図９のステ
ップ９０２でパケット送受信部１１５から引き渡された
“送信元ＩＰアドレス”が設定される。Next, the packet transmitting/receiving unit 115 generates an IP datagram having the format shown in FIG. 6(b), with the above TCP segment stored in its “data” field. In the IP header, which has the format shown in FIGS. 6(b) and 7(a), the “protocol” field is set to the integer value 6, which identifies the data stored in the “data” field as a TCP segment. The “source IP address” field is set to the IP address assigned to the voice control host device 108, and the “destination IP address” field is set to the “source IP address” passed from the packet transmitting/receiving unit 115 in step 902 in FIG. 9.

【０２１８】そして、パケット送受信部１１５は、上述
のＩＰデータグラムが格納されたＬＡＮ１０７上のプロ
トコルに従ったフレームを生成し、それをＬＡＮ１０７
に送出する。The packet transmitting/receiving unit 115 then generates a frame that conforms to the protocol used on the LAN 107 and contains the above IP datagram, and sends it out onto the LAN 107.

【０２１９】上記フレームとＩＰデータグラムとＴＣＰ
セグメントとから構成されるＴＣＰ／ＩＰパケットは、
それを構成するＩＰデータグラムのＩＰヘッダに格納さ
れている“宛先ＩＰアドレス”に基づいて、ルータ装置
１０６及びインターネット１０５を介して移動端末制御
ホスト装置１０４まで転送された後、更に、ＰＨＳ網１
０３及び無線基地（又は有線接続装置）１０２を介し
て、移動端末１０１の通信部１１１内の通信制御部３２
１（図３）まで転送され、前述したように、移動端末１
０１のＬＣＤ表示部３１１（図２の２０３）に表示され
る。The TCP/IP packet made up of the above frame, IP datagram, and TCP segment is transferred, on the basis of the “destination IP address” stored in the IP header of its IP datagram, to the mobile terminal control host device 104 via the router device 106 and the Internet 105, and then via the PHS network 103 and the radio base station (or wired connection device) 102 to the communication control unit 321 (FIG. 3) in the communication unit 111 of the mobile terminal 101. As described above, it is then displayed on the LCD display unit 311 (203 in FIG. 2) of the mobile terminal 101.

【０２２０】次に、ステップ９１７では、図９のステッ
プ９０２でパケット送受信部１１５から引き渡されたの
と同じ“端末識別コード”が記憶されている処理端末登
録テーブル（図１３）のエントリの内容が全て削除され
る。Next, in step 917, the entire contents of the entry of the processing terminal registration table (FIG. 13) that stores the same “terminal identification code” as the one passed from the packet transmitting/receiving unit 115 in step 902 in FIG. 9 are deleted.

【０２２１】また、ステップ９１８で、上記エントリに
記憶されていた音声バッファファイル名に対応する音声
バッファファイル及び文章バッファファイル名に対応す
る文章バッファファイルが、音声制御ホスト装置１０８
が管理するファイルシステム上から削除される。Then, in step 918, the voice buffer file corresponding to the voice buffer file name and the text buffer file corresponding to the text buffer file name that were stored in that entry are deleted from the file system managed by the voice control host device 108.

【０２２２】ステップ９１８の処理の後は、図１２のス
テップ９１９と９２０で認識音声文章データの送信処理
が実行され、それに続くステップ９２１及び９２２で最
終アクセス時刻が一定時間以上前である移動端末１０１
との通信を終了させるための処理が行われた後、再び図
９のステップ９０１の判定処理に戻る。After the processing of step 918, transmission processing of the recognized voice text data is executed in steps 919 and 920 in FIG. 12, processing for terminating communication with any mobile terminal 101 whose last access time is a predetermined time or more in the past is performed in the subsequent steps 921 and 922, and the process then returns to the determination processing of step 901 in FIG. 9.

【０２２３】パケット送受信部１１５から受信通知が通
知されておらず図９のステップ９０１の判定がＮＯの場
合、又は上述の各コマンド又は音声データの受信に対応
する処理の後に実行される、図１２のステップ９１９と
９２０の処理、及びそれに続くステップ９２１と９２２
の処理について説明する。The following describes the processing of steps 919 and 920 in FIG. 12 and the subsequent steps 921 and 922, which is executed when no reception notification has been issued by the packet transmitting/receiving unit 115 and the determination in step 901 in FIG. 9 is NO, or after the processing corresponding to reception of any of the commands or voice data described above.

【０２２４】これらの処理において、文音声認識部１１
７から得られている認識音声文章データの送信処理が実
行される。まず、ステップ９１９では、処理端末登録テ
ーブル（図１３）において、リアルタイム要求として
“有り”（値“１”）が記憶されており、かつ文章バッ
ファファイル名に対応する文章バッファファイルに認識
音声文章データが存在するエントリがあるか否かが判定
される。In these steps, transmission processing of the recognized voice text data obtained from the sentence speech recognition unit 117 is executed. First, in step 919, it is determined whether the processing terminal registration table (FIG. 13) contains an entry in which “present” (value “1”) is stored as the real-time request and for which recognized voice text data exists in the text buffer file corresponding to the stored text buffer file name.

【０２２５】そのようなエントリが無くステップ９１９
の判定がＮＯの場合には、ステップ９２０での認識音声
文章データの送信処理は実行されずに、ステップ９２１
及び９２２の処理に進む。If there is no such entry and the determination in step 919 is NO, the transmission processing of the recognized voice text data in step 920 is skipped and the process proceeds to steps 921 and 922.

【０２２６】上述のようなエントリが１つ以上存在しス
テップ９１９の判定がＹＥＳの場合には、ステップ９２
０で、該当するエントリ毎に、そのエントリに記憶され
ている“送信元ＩＰアドレス”に向けて、そのエントリ
に記憶されている文章バッファファイル名に対応する文
章バッファファイル内の認識音声文章データが送信さ
れ、その送信された認識音声文章データが上記文章バッ
ファファイルから削除される。なお、削除時の文章バッ
ファファイルのサイズは、音声制御ホスト装置１０８が
管理するファイルシステムによって自動的に調整され
る。If one or more such entries exist and the determination in step 919 is YES, then in step 920, for each such entry, the recognized voice text data in the text buffer file corresponding to the text buffer file name stored in the entry is transmitted to the “source IP address” stored in the entry, and the transmitted recognized voice text data is deleted from that text buffer file. The size of the text buffer file after deletion is adjusted automatically by the file system managed by the voice control host device 108.

【０２２７】上述のステップ９２０の処理の後又はステ
ップ９１９の判定がＮＯである場合に、ステップ９２１
が実行される。ここでは、処理端末登録テーブル（図１
３）のエントリのうち、リアルタイム要求として“有
り”（値“１”）が設定されており、かつ最終アクセス
時刻が現在時刻から一定時間前の時刻よりも更に前の時
刻であるエントリが検出され、そのエントリの内容が全
て削除される。Step 921 is executed after the processing of step 920 described above or when the determination in step 919 is NO. Here, among the entries in the processing terminal registration table (FIG. 13), any entry in which “present” (value “1”) is set as the real-time request and whose last access time is earlier than the time a predetermined period before the current time is detected, and the entire contents of that entry are deleted.

【０２２８】また、ステップ９２２で、上記エントリに
記憶されていた音声バッファファイル名に対応する音声
バッファファイル及び文章バッファファイル名に対応す
る文章バッファファイルが、音声制御ホスト装置１０８
が管理するファイルシステム上から削除される。Then, in step 922, the voice buffer file corresponding to the voice buffer file name and the text buffer file corresponding to the text buffer file name that were stored in that entry are deleted from the file system managed by the voice control host device 108.
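
Steps 919 to 922 form a housekeeping pass that runs on every iteration of the main loop. The sketch below models the text buffer as an in-memory `bytearray` instead of a file and takes the transmit operation as a `send` callback; these simplifications, together with the names `periodic_pass` and `timeout`, are assumptions made for the example.

```python
def periodic_pass(table, send, now, timeout):
    """Flush real-time text, then purge idle real-time entries."""
    # Steps 919 and 920: send pending text for every real-time entry.
    for entry in table.values():
        if entry["realtime"] and entry["text"]:
            send(entry["source_ip"], bytes(entry["text"]))
            entry["text"].clear()          # sent text is removed from the buffer
    # Steps 921 and 922: drop real-time entries that have been idle too long.
    stale = [tid for tid, e in table.items()
             if e["realtime"] and e["last_access"] < now - timeout]
    for tid in stale:
        del table[tid]                     # the in-memory buffers go with the entry
    return stale


# Usage: "A" is active, "B" has timed out, "C" is a non-real-time session.
sent = []
table = {
    "A": {"realtime": True, "source_ip": "10.0.0.1",
          "last_access": 95.0, "text": bytearray(b"hello")},
    "B": {"realtime": True, "source_ip": "10.0.0.2",
          "last_access": 10.0, "text": bytearray()},
    "C": {"realtime": False, "source_ip": "10.0.0.3",
          "last_access": 10.0, "text": bytearray(b"kept")},
}
stale = periodic_pass(table, lambda ip, data: sent.append((ip, data)),
                      now=100.0, timeout=30.0)
```

Non-real-time entries are skipped by both halves of the pass, so their recognized text stays buffered until a batch transfer request arrives.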

【０２２９】ステップ９２２の処理の後、再び図９のス
テップ９０１の判定処理に戻る。＜文音声認識部１１７の詳細動作＞図１４は、文音声認
識部１１７の機能ブロック図である。After the processing of step 922, the process returns again to the determination processing of step 901 in FIG. 9.

<Detailed Operation of Sentence Speech Recognition Unit 117>

FIG. 14 is a functional block diagram of the sentence speech recognition unit 117.

【０２３０】この文音声認識部１１７は、前述したよう
に、図１３に示される処理端末登録テーブルのエントリ
毎に、各エントリから特定される音声バッファファイル
に音声データが受信されていればそれに対して文音声認
識処理を実行し、その結果得られる認識音声文章データ
を上記各エントリに対応する文章バッファファイルに追
加書き込みする。As described above, for each entry in the processing terminal registration table shown in FIG. 13, the sentence speech recognition unit 117 executes sentence speech recognition processing on any voice data received in the voice buffer file identified by that entry, and appends the resulting recognized voice text data to the text buffer file corresponding to that entry.

【０２３１】上述のエントリ毎の音声バッファファイル
からの音声データの読出しと文章バッファファイルへの
認識音声文章データの書込みは、図１４の入出力制御部
１４０９が制御する。まず、この入出力制御部１４０９
の制御動作につき説明する。図１５は、入出力制御部１
４０９が実行する制御動作を示す動作フローチャートで
ある。この動作フローチャートは、入出力制御部１４０
９を制御する特には図示しないプロセッサが、特には図
示しない制御プログラムを実行する動作として実現され
る。The reading of voice data from the voice buffer file and the writing of recognized voice text data to the text buffer file for each entry described above are controlled by the input/output control unit 1409 in FIG. 14. The control operation of the input/output control unit 1409 is described first. FIG. 15 is an operation flowchart showing the control operation executed by the input/output control unit 1409. This flowchart is realized as an operation in which a processor (not shown) that controls the input/output control unit 1409 executes a control program (not shown).

【０２３２】まず、ステップ１５０１では、処理端末登
録テーブル（図１３）において、音声バッファファイル
名に対応する音声バッファファイルに音声データが記憶
されているエントリが存在するか否かが判定される。First, in step 1501, it is determined whether or not an entry in which audio data is stored in the audio buffer file corresponding to the audio buffer file name exists in the processing terminal registration table (FIG. 13).

【０２３３】そのようなエントリが存在しステップ１５
０１の判定がＹＥＳならば、ステップ１５０２で、該当
するエントリ毎に、そのエントリに記憶されている“端
末識別コード”と、そのエントリに記憶されている音声
バッファファイル名に対応する音声バッファファイル上
の音声データとが、図１４の入力バッファキュー１４０
１に書き込まれ、その音声データが音声バッファファイ
ルから削除される。If such entries exist and the determination in step 1501 is YES, then in step 1502, for each such entry, the “terminal identification code” stored in the entry and the voice data in the voice buffer file corresponding to the voice buffer file name stored in the entry are written to the input buffer queue 1401 in FIG. 14, and that voice data is deleted from the voice buffer file.

【０２３４】入力バッファキュー１４０１は、それがキ
ューイングしている音声データを、音声区間検出部１４
０２に順次流し込む機能を有する。音声区間検出部１４
０２以降に接続されている音声分析部１４０３、音素認
識部１４０４、単語認識部１４０６、及び文章認識部１
４０７は、データ処理パイプラインを形成しており、相
互に独立して、入力データを処理する機能を有する。ま
た、１４０２〜１４０７の各部分は、現在処理している
音声データに対応する“端末識別コード”（入力バッフ
ァキュー１４０１から入力される）を認識することがで
きる。従って、最終的に文章認識部１４０７から出力バ
ッファキュー１４０８へは、“端末識別コード”と認識
音声文章データとの組が出力されることになる。The input buffer queue 1401 has the function of feeding the voice data it is queuing to the voice section detection unit 1402 in sequence. The units connected downstream of the voice section detection unit 1402, namely the voice analysis unit 1403, the phoneme recognition unit 1404, the word recognition unit 1406, and the sentence recognition unit 1407, form a data processing pipeline, and each processes its input data independently of the others. Each of the units 1402 to 1407 can also identify the “terminal identification code” (supplied from the input buffer queue 1401) corresponding to the voice data it is currently processing. Consequently, pairs of a “terminal identification code” and recognized voice text data are ultimately output from the sentence recognition unit 1407 to the output buffer queue 1408.

【０２３５】ステップ１５０２の処理の後又はステップ
１５０１の判定がＮＯの場合には、ステップ１５０３
で、図１４の出力バッファキュー１４０８に、“端末識
別コード”と認識音声文章データの組が得られているか
否かが判定される。After the processing of step 1502, or if the determination in step 1501 is NO, it is determined in step 1503 whether any pair of a “terminal identification code” and recognized voice text data is available in the output buffer queue 1408 in FIG. 14.

【０２３６】そのような組が得られておりステップ１５
０３の判定がＹＥＳならば、ステップ１５０４で、出力
バッファキュー１４０８内の組毎に、その組の“端末識
別コード”に対応する処理端末登録テーブルのエントリ
について、そのエントリに記憶されている文章バッファ
ファイル名に対応する文章バッファファイルに、出力バ
ッファキュー１４０８内の組の認識音声文章データが追
加書き込みされる。If such pairs are available and the determination in step 1503 is YES, then in step 1504, for each pair in the output buffer queue 1408, the recognized voice text data of the pair is appended to the text buffer file corresponding to the text buffer file name stored in the processing terminal registration table entry that corresponds to the pair's “terminal identification code”.

【０２３７】ステップ１５０４の処理の後又はステップ
１５０３の判定がＮＯの場合には、再びステップ１５０
１の判定処理が実行される。以上のようにして文音声認
識部１１７は、流れ作業的に効率良く、複数の移動端末
１０１から要求された音声データに対する文音声認識処
理を実行することができる。After the processing of step 1504, or when the determination in step 1503 is NO, the determination of step 1501 is executed again. In this way, the sentence speech recognition unit 117 can efficiently carry out, in assembly-line fashion, the sentence speech recognition processing for the voice data submitted by a plurality of mobile terminals 101.

【０２３８】次に、文音声認識処理を実現するための１
４０２〜１４０７の各部分の機能につき、以下に説明す
る。なお、以下に説明する各方式は、例えば、文献「電
子・情報工学入門シリーズ２音響・音声工学」（古井
著、近代科学社）第１４章」を参照することにより、実
現することができる。Next, the functions of the units 1402 to 1407 that implement the sentence speech recognition processing are described below. Each of the methods described below can be implemented by referring to, for example, Chapter 14 of “Electronics and Information Engineering Introductory Series 2: Acoustics and Speech Engineering” (by Furui, Kindai Kagaku-sha).

【０２３９】音声区間検出部１４０２は、入力バッファ
キュー１４０１から入力される音声データのサンプル時
系列について、音声が存在する区間を検出する。より具
体的には、音声区間検出部１４０２は、所定サンプル
（例えば８ｋＨｚサンプリングデータについて３２乃至
２５６サンプル）ずつの平均パワー（電力）を計算し、
その平均パワーが所定の閾値を超えた状態が所定回数以
上連続して続く区間を、音声区間として検出する。これ
により、音声が存在しない区間で文音声が誤認識されて
しまうのを防ぐことができる。The voice section detection unit 1402 detects the sections in which speech is present in the sample time series of the voice data input from the input buffer queue 1401. More specifically, the voice section detection unit 1402 computes the average power of each block of a predetermined number of samples (for example, 32 to 256 samples for data sampled at 8 kHz), and detects as a voice section any section in which the average power stays above a predetermined threshold for at least a predetermined number of consecutive blocks. This prevents sentence speech from being falsely recognized in sections where no speech is present.
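The power-threshold detection described above can be sketched as follows. The default values for `block_size`, `threshold`, and `min_blocks` are illustrative assumptions (the text gives only the 32 to 256 sample range for the block size), and the function name is hypothetical.

```python
def detect_voice_sections(samples, block_size=64, threshold=1000.0, min_blocks=3):
    """Return (start, end) sample-index pairs of detected voice sections.

    A section is reported when the average power of consecutive blocks of
    block_size samples exceeds threshold for at least min_blocks blocks.
    The end index is exclusive.
    """
    sections = []
    run_start = None  # block index at which the current active run began
    n_blocks = len(samples) // block_size
    for b in range(n_blocks + 1):
        if b < n_blocks:
            block = samples[b * block_size:(b + 1) * block_size]
            power = sum(x * x for x in block) / block_size
            active = power > threshold
        else:
            active = False  # sentinel block closes a run still open at the end
        if active and run_start is None:
            run_start = b
        elif not active and run_start is not None:
            if b - run_start >= min_blocks:
                sections.append((run_start * block_size, b * block_size))
            run_start = None
    return sections
```

Requiring several consecutive blocks above the threshold keeps short clicks or noise bursts from being passed on to the later recognition stages.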

【０２４０】音声分析部１４０３は、音声区間検出部１
４０２から出力される音声データについて、その特徴分
析を行うことによって、特徴量パラメータベクトルを検
出する。音声分析方式としては、以下の周知の分析方式
の何れかを採用することができる。（１）音声データ時系列を入力とする帯域フィルタバン
クの各出力を平滑化し、それらの平滑化された各出力を
特徴量パラメータベクトルの要素とする方式。（２）連続する所定サンプルずつの音声データ時系列を
入力とする高速フーリエ変換（ＦＦＴ）によって計算し
た各短時間スペクトル成分を平滑化し、それらの平滑化
された各成分値を特徴量パラメータベクトルの要素とす
る方式。（３）連続する所定サンプルずつの音声データ時系列を
入力とするケプストラム分析によってケプストラム係数
群を計算し、それらを特徴量パラメータベクトルの要素
とする方式。（４）上記（３）のケプストラム係数群に加えて、それ
らに対するΔ（デルタ）ケプストラム（ケプストラムの
微係数）群を計算し、それらを特徴量パラメータベクト
ルの要素に加える方式。（５）連続する所定サンプルずつの音声データ時系列を
入力とする線形予測分析（ＬＰＣ分析、更に具体的には
線スペクトル対分析：ＬＳＰ分析）によって、ＬＰＣ
（ＬＳＰ）係数群を計算し、それらを特徴量パラメータ
ベクトルの要素とする方式。（６）連続する所定サンプルずつの音声データ時系列を
入力とする自己相関分析によって自己相関関数を計算
し、それらに基づいて検出される音声のピッチ基本周波
数パターンを特徴量パラメータベクトルの１つの要素に
加える方式。次に、音素認識部１４０４は、所定フレーム周期（所定
サンプル）毎に音声分析部１４０３から出力される特徴
量パラメータベクトルと、音素標準パターン辞書１４０
５に蓄積されている各音素の特徴量パラメータベクトル
の標準パターンとの類似度（距離）を計算し、その結果
所定フレーム周期毎に得られる類似度の高い音素の組を
その類似度と共に音素ラティスデータとして出力する。
音素認識部１４０４は、音素の認識誤りの発生を回避す
るために、所定フレーム周期毎に最終的な音素を決定す
ることはせずに、音素候補を表にした音素ラティスデー
タの形式で結果データを出力する。The voice analysis unit 1403 performs feature analysis on the voice data output from the voice section detection unit 1402 and thereby extracts a feature parameter vector. Any of the following well-known analysis methods can be adopted as the voice analysis method.
(1) A method in which each output of a band-pass filter bank fed with the voice data time series is smoothed, and the smoothed outputs are used as the elements of the feature parameter vector.
(2) A method in which the short-time spectrum components computed by a fast Fourier transform (FFT) of each block of a predetermined number of consecutive voice data samples are smoothed, and the smoothed component values are used as the elements of the feature parameter vector.
(3) A method in which cepstrum coefficients are computed by cepstrum analysis of each block of a predetermined number of consecutive voice data samples, and the coefficients are used as the elements of the feature parameter vector.
(4) A method in which, in addition to the cepstrum coefficients of (3) above, Δ (delta) cepstrum coefficients (the time derivatives of the cepstrum) are computed and added to the elements of the feature parameter vector.
(5) A method in which LPC (LSP) coefficients are computed by linear predictive analysis (LPC analysis, or more specifically line spectrum pair analysis: LSP analysis) of each block of a predetermined number of consecutive voice data samples, and the coefficients are used as the elements of the feature parameter vector.
(6) A method in which an autocorrelation function is computed by autocorrelation analysis of each block of a predetermined number of consecutive voice data samples, and the pitch (fundamental frequency) pattern of the speech detected from it is added as one element of the feature parameter vector.
Next, in each predetermined frame period (predetermined number of samples), the phoneme recognition unit 1404 computes the similarity (distance) between the feature parameter vector output from the voice analysis unit 1403 and the standard pattern of the feature parameter vector of each phoneme stored in the phoneme standard pattern dictionary 1405, and outputs the set of high-similarity phonemes obtained in each frame period, together with their similarities, as phoneme lattice data. To avoid phoneme recognition errors, the phoneme recognition unit 1404 does not commit to a final phoneme in each frame period; instead, it outputs its result as phoneme lattice data, that is, as a table of phoneme candidates.
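The phoneme lattice described above can be illustrated with a short sketch. It assumes Euclidean distance as the similarity measure and a fixed number of candidates per frame (`top_k`); neither choice is stated in the text, and the dictionary contents are made up for the example.

```python
import math


def phoneme_lattice(frames, standard_patterns, top_k=3):
    """Build a phoneme lattice: the top_k closest phonemes for every frame.

    frames            -- sequence of feature parameter vectors, one per frame
    standard_patterns -- dict mapping phoneme label to its standard vector
    Returns one list per frame of (phoneme, distance) pairs, closest first.
    """
    lattice = []
    for vec in frames:
        scored = sorted(
            (math.dist(vec, pattern), phoneme)
            for phoneme, pattern in standard_patterns.items()
        )
        # Keep several candidates instead of committing to the single best.
        lattice.append([(phoneme, d) for d, phoneme in scored[:top_k]])
    return lattice


# Hypothetical two-dimensional standard patterns for three phonemes.
patterns = {"a": (0.0, 0.0), "i": (1.0, 0.0), "u": (0.0, 2.0)}
lattice = phoneme_lattice([(0.1, 0.0), (0.9, 0.1)], patterns, top_k=2)
```

Keeping the runner-up candidates lets the word recognition stage recover from a frame whose nearest phoneme was wrong.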

【０２４１】単語認識部１４０６は、所定フレーム周期
毎に音素認識部１４０４から出力される音素ラティスデ
ータを入力として、所定フレーム周期毎に単語候補を表
にして単語ラティスデータを出力する。単語認識方式と
しては、以下の周知の分析方式の何れかを採用すること
ができる。（１）単語認識部１４０６は、音素認識部１４０４から
出力される複数のフレーム周期にまたがる音素ラティス
データの時系列と、単語辞書に蓄積されている全音素標
準パターン系列とで、時間正規化（ＤＰマッチング or
ＤＴＷ：DynamicTime Warping）を実行し、単語ラティ
スデータを出力する。この場合も、単語認識部１４０６
は、単語の認識誤りの発声を回避するために、所定フレ
ーム周期毎に最終的な単語を決定することはせずに、単
語候補を表にした単語ラティスデータの形式で結果デー
タを出力する。（２）単語認識部１４０６は、ＨＭＭ（Hidden Markov
Model ）によって、全単語をモデル化し、音素認識部１
４０４から出力される複数のフレーム周期にまたがる音
素ラティスデータの時系列をＨＭＭ分析部に入力し、生
起確率の大きいものから複数個のモデルに対応する各単
語を、単語候補である単語ラティスデータとして出力す
る。最後に、文章認識部１４０７は、その第１段処理と
して、単語認識部１４０６から出力される単語ラティス
データを順次入力し、日本語（英語でもよい）の文節構
造に関する文節内文法（語順規則）に従って、種々の文
節の可能性を文節ラティスデータとして算出する。そし
て、文章認識部１４０７は、その第２段処理として、文
節間文法に従って文節間の意味的な係り受けを解析し、
認識音声文章データを決定し、それを、入力バッファキ
ュー１４０１から順次伝達されてきた“端末識別コー
ド”と対について、出力バッファキュー１４０８に書き
込む。＜他の実施の形態＞以上説明した実施の形態では、移動
端末１０１は、ＰＨＳ端末であって、移動端末１０１と
音声制御ホスト装置１０８とは、ＰＨＳ網１０３とイン
ターネット１０５を介して接続されている。しかし、本
発明は、これに限られるものではなく、無線又は有線に
よって間接的又は直接的に音声制御ホスト装置１０８に
接続される形態であれば、どのような形態であっても本
発明をそれに適用することができる。The word recognition unit 1406 receives the phoneme lattice data output from the phoneme recognition unit 1404 in each predetermined frame period, and outputs word lattice data, that is, a table of word candidates, in each predetermined frame period. Either of the following well-known methods can be adopted as the word recognition method.
(1) The word recognition unit 1406 performs time normalization (DP matching, or DTW: Dynamic Time Warping) between the time series of phoneme lattice data spanning a plurality of frame periods output from the phoneme recognition unit 1404 and all of the phoneme standard pattern sequences stored in a word dictionary, and outputs word lattice data. Here too, to avoid word recognition errors, the word recognition unit 1406 does not commit to a final word in each frame period; instead, it outputs its result as word lattice data, that is, as a table of word candidates.
(2) The word recognition unit 1406 models every word with an HMM (Hidden Markov Model), inputs the time series of phoneme lattice data spanning a plurality of frame periods output from the phoneme recognition unit 1404 to an HMM analysis unit, and outputs the words corresponding to the several models with the highest occurrence probabilities as word lattice data, that is, as word candidates.
Finally, as its first-stage processing, the text recognition unit 1407 sequentially receives the word lattice data output from the word recognition unit 1406 and, following an intra-phrase grammar (word-order rules) for the phrase structure of Japanese (or English), computes the possible phrases as phrase lattice data. Then, as its second-stage processing, the text recognition unit 1407 analyzes the semantic dependencies between phrases according to an inter-phrase grammar, determines the recognized voice text data, and writes it to the output buffer queue 1408 paired with the “terminal identification code” passed along from the input buffer queue 1401.

<Other Embodiments>

In the embodiment described above, the mobile terminal 101 is a PHS terminal, and the mobile terminal 101 and the voice control host device 108 are connected via the PHS network 103 and the Internet 105. However, the present invention is not limited to this; it can be applied to any configuration in which the terminal is connected to the voice control host device 108, indirectly or directly, by wireless or wired means.
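For reference, the DP matching (DTW) named in word recognition method (1) above can be sketched as follows. This is a simplified illustration, not the disclosed procedure: it aligns a single observed phoneme sequence, rather than a full phoneme lattice, against each dictionary entry, uses a 0/1 mismatch cost, and the dictionary contents and function names are made up for the example.

```python
def dtw_distance(seq_a, seq_b, cost):
    """Dynamic time warping distance between two sequences.

    cost(x, y) is the local cost of aligning element x with element y.
    Repeated elements in one sequence may be matched to a single element
    in the other, which absorbs differences in speaking rate.
    """
    inf = float("inf")
    n, m = len(seq_a), len(seq_b)
    d = [[inf] * (m + 1) for _ in range(n + 1)]
    d[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            c = cost(seq_a[i - 1], seq_b[j - 1])
            d[i][j] = c + min(d[i - 1][j], d[i][j - 1], d[i - 1][j - 1])
    return d[n][m]


def rank_words(observed, word_dictionary, top_k=2):
    """Return the top_k (word, distance) candidates, closest first."""
    def mismatch(x, y):
        return 0.0 if x == y else 1.0

    scored = sorted(
        (dtw_distance(observed, reference, mismatch), word)
        for word, reference in word_dictionary.items()
    )
    return [(word, dist) for dist, word in scored[:top_k]]


# Hypothetical dictionary: each word maps to its reference phoneme sequence.
words = {"kaki": list("kaki"), "kiku": list("kiku")}
candidates = rank_words(list("kaaki"), words)
```

Returning several ranked candidates rather than one word mirrors the word lattice output described for the word recognition unit 1406.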

【０２４２】[0242]

【発明の効果】本発明によれば、移動端末は、高度な音
声認識環境を設備する必要がなく実用的な精度を有する
音声認識機能の提供を低コストで受けることが可能とな
る。[Effects of the Invention] According to the present invention, a mobile terminal can be provided with a speech recognition function of practical accuracy at low cost, without itself having to be equipped with an advanced speech recognition environment.

【０２４３】また、本発明によれば、現在全国的及び全
世界的に普及しつつあるパーソナルハンディホンシステ
ム通信網及びインターネットを経由することによって、
実用的な精度を有する音声認識機能の提供をより低コス
ト及び手軽に受けることができると同時に、本発明が提
供する機能とパーソナルハンディホンシステム通話機能
及びインターネットアクセス機能とを、シームレスに結
合することが可能となる。In addition, according to the present invention, by going through the personal handyphone system communication network and the Internet, both of which are now spreading nationwide and worldwide, a speech recognition function of practical accuracy can be obtained more cheaply and conveniently, and at the same time the functions provided by the present invention can be seamlessly combined with the personal handyphone system call function and the Internet access function.

【０２４４】更に、本発明のよれば、移動端末と音声制
御ホスト装置とを全世界的に容易に特定できると共に、
音声認識処理サービスと他の情報処理サービスとの共存
を容易に実現することが可能となる。Further, according to the present invention, the mobile terminal and the voice control host device can each be easily identified anywhere in the world, and the coexistence of the speech recognition processing service with other information processing services can be easily realized.

【０２４５】加えて、本発明によれば、ホスト装置側の
負荷分散を容易に実現することが可能となる。In addition, according to the present invention, it is possible to easily realize load distribution on the host device side.

[Brief description of the drawings]

【図１】全システム構成図である。FIG. 1 is an overall system configuration diagram.

【図２】移動端末の外観図である。FIG. 2 is an external view of a mobile terminal.

【図３】移動端末の機能ブロック図である。FIG. 3 is a functional block diagram of a mobile terminal.

【図４】移動端末の処理の全体動作フローチャートであ
る。FIG. 4 is an overall operation flowchart of processing of a mobile terminal.

【図５】送信処理の動作フローチャートである。FIG. 5 is an operation flowchart of a transmission process.

【図６】通信データのフォーマット図である。FIG. 6 is a format diagram of communication data.

【図７】ＩＰヘッダとＴＣＰヘッダのフォーマット図で
ある。FIG. 7 is a format diagram of an IP header and a TCP header.

【図８】ＰＰＰを用いた発信処理の動作フローチャート
である。FIG. 8 is an operation flowchart of a calling process using PPP.

【図９】移動端末通信制御部の動作フローチャート（そ
の１）である。FIG. 9 is an operation flowchart (part 1) of a mobile terminal communication control unit.

【図１０】移動端末通信制御部の動作フローチャート
（その２）である。FIG. 10 is an operation flowchart (part 2) of the mobile terminal communication control unit.

【図１１】移動端末通信制御部の動作フローチャート
（その３）である。FIG. 11 is an operation flowchart (part 3) of the mobile terminal communication control unit.

【図１２】移動端末通信制御部の動作フローチャート
（その４）である。FIG. 12 is an operation flowchart (part 4) of the mobile terminal communication control unit.

【図１３】処理端末登録テーブルのデータ構成図であ
る。FIG. 13 is a data configuration diagram of a processing terminal registration table.

【図１４】文音声認識部の構成図である。FIG. 14 is a configuration diagram of a sentence speech recognition unit.

【図１５】入出力制御部の動作フローチャートである。FIG. 15 is an operation flowchart of the input/output control unit.

[Explanation of symbols]

１０１移動端末１０２無線基地（有線接続装置）１０３ＰＨＳ網（公衆電話網、ＩＳＤＮ網）１０４移動端末制御ホスト装置１０５インターネット１０６ルータ装置１０７ＬＡＮ（ローカルエリアネットワーク）１０８音声制御ホスト装置１０９入力部１１０制御部１１１通信部１１２出力部１１３接続確立部１１４ルーティング部１１５パケット送受信部１１６移動端末通信制御部１１７文音声認識部２０１、３０１マイク２０２、３０４カメラ（ＣＣＤカメラ）２０３、３１１ＬＣＤ表示部２０４、３０８スピーカ２０５、３２３無線アンテナ２０６、３２５ソケット（通信用）２０７ＩＣカードスロット２０８光送受信機（光通信用）３０２、３０５Ａ／Ｄ変換部３０３マイク制御部３０６、３１３メモリ３０７カメラ制御部３０９Ｄ／Ａ変換部３１０スピーカ制御部３１２ＬＣＤドライバ３１４ＬＣＤ制御部３１５タッチパネル制御部３１６ＣＰＵ３１７ＲＡＭ３１８ＲＯＭ３１９ＩＣカードインタフェース部３２０ＩＣカード３２１通信制御部３２２無線ドライバ３２４有線ドライバ１４０１入力バッファキュー１４０２音声区間検出部１４０３音声分析部１４０４音素認識部１４０５音素標準パターン辞書１４０６単語認識部１４０７文章認識部１４０８出力バッファキュー１４０９入出力制御部
101 Mobile terminal
102 Wireless base station (wired connection device)
103 PHS network (public telephone network, ISDN network)
104 Mobile terminal control host device
105 Internet
106 Router device
107 LAN (local area network)
108 Voice control host device
109 Input unit
110 Control unit
111 Communication unit
112 Output unit
113 Connection establishment unit
114 Routing unit
115 Packet transmission/reception unit
116 Mobile terminal communication control unit
117 Sentence speech recognition unit
201, 301 Microphone
202, 304 Camera (CCD camera)
203, 311 LCD display unit
204, 308 Speaker
205, 323 Wireless antenna
206, 325 Socket (for communication)
207 IC card slot
208 Optical transceiver (for optical communication)
302, 305 A/D conversion unit
303 Microphone control unit
306, 313 Memory
307 Camera control unit
309 D/A conversion unit
310 Speaker control unit
312 LCD driver
314 LCD control unit
315 Touch panel control unit
316 CPU
317 RAM
318 ROM
319 IC card interface unit
320 IC card
321 Communication control unit
322 Wireless driver
324 Wired driver
1401 Input buffer queue
1402 Voice section detection unit
1403 Voice analysis unit
1404 Phoneme recognition unit
1405 Phoneme standard pattern dictionary
1406 Word recognition unit
1407 Text recognition unit
1408 Output buffer queue
1409 Input/output control unit

Claims

[Claims]

1. A mobile terminal voice recognition communication system in which a mobile terminal communicates with a host device, wherein the mobile terminal comprises: host connection means for connecting to a voice control host device, which is the host device, either indirectly via a relay network including one or both of a wireless network and a wired network, or directly without going through the relay network; voice input means for inputting voice; voice data transmission means for transmitting, after the connection operation by the host connection means, voice data input from the voice input means to the voice control host device; recognized voice data reception means for receiving recognized voice data returned from the voice control host device; and recognized voice data display/edit means for displaying or editing the received recognized voice data; and wherein the voice control host device comprises: mobile terminal connection means for identifying and connecting to the mobile terminal in response to the connection operation by the host connection means in the mobile terminal; voice data reception means for receiving the voice data for each currently connected mobile terminal; voice recognition means for performing, for each currently connected mobile terminal, voice recognition processing on the voice data received by the voice data reception means; and recognized voice data return means for returning, for each currently connected mobile terminal, the recognized voice data obtained by the voice recognition processing of the voice recognition means to the corresponding mobile terminal.

2. A mobile terminal used in a communication system in which the mobile terminal communicates with a host device, comprising: host connection means for connecting to a voice control host device, which is the host device, either indirectly via a relay network including one or both of a wireless network and a wired network, or directly without going through the relay network; voice input means for inputting voice; voice data transmission means for transmitting, after the connection operation by the host connection means, voice data input from the voice input means to the voice control host device; recognized voice data reception means for receiving recognized voice data returned from the voice control host device; and recognized voice data display/edit means for displaying or editing the received recognized voice data.

3. A voice control host device, being a host device used in a communication system in which a mobile terminal communicates with the host device, comprising: mobile terminal connection means for identifying and connecting to the mobile terminal in response to a connection operation that the mobile terminal performs either indirectly via a relay network including one or both of a wireless network and a wired network, or directly without going through the relay network; voice data reception means for receiving voice data for each currently connected mobile terminal; voice recognition means for performing, for each currently connected mobile terminal, voice recognition processing on the voice data received by the voice data reception means; and recognized voice data return means for returning, for each currently connected mobile terminal, the recognized voice data obtained by the voice recognition processing of the voice recognition means to the corresponding mobile terminal.

4. The mobile terminal voice recognition communication system, mobile terminal, or voice control host device according to any one of claims 1 to 3, wherein the host connection means in the mobile terminal has a function of selectively transmitting a real-time reply request command or a non-real-time reply request command, which causes the voice control host device to return recognized voice data in real time or in non-real time, respectively, and a function of transmitting a batch transfer request command, which causes the recognized voice data to be transferred in a batch; and wherein the recognized voice data return means, for each currently connected mobile terminal: when the mobile terminal connection means has received the real-time reply request command from that mobile terminal, immediately returns the recognized voice data obtained by the voice recognition processing of the voice recognition means to the corresponding mobile terminal; and when the mobile terminal connection means has received the non-real-time reply request command from that mobile terminal, holds the recognized voice data obtained by the voice recognition processing of the voice recognition means and, when the mobile terminal connection means subsequently receives the batch transfer request command from that mobile terminal, returns the held recognized voice data to the corresponding mobile terminal in a batch.
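For illustration only, the reply behavior recited in claim 4 can be sketched as a small per-terminal state object on the host side. The class name, method names, and command strings are hypothetical, and starting in real-time mode is an assumption; the claim does not state a default.

```python
class ReplySession:
    """Reply state kept by the host for one connected mobile terminal."""

    def __init__(self):
        self.realtime = True  # assumed default, not stated in the claim
        self.held = []        # recognized data held for batch transfer

    def on_command(self, command):
        """Handle a command from the terminal; return data to send now."""
        if command == "REALTIME":
            self.realtime = True
        elif command == "NON_REALTIME":
            self.realtime = False
        elif command == "BATCH_TRANSFER":
            batch, self.held = self.held, []
            return batch
        return []

    def on_recognized(self, text):
        """Handle newly recognized data; return data to send now."""
        if self.realtime:
            return [text]      # real-time mode: reply immediately
        self.held.append(text)  # non-real-time mode: hold until requested
        return []
```

In non-real-time mode nothing is sent until the terminal issues the batch transfer request, at which point all held data is returned together.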

5. The mobile terminal voice recognition communication system, mobile terminal, or voice control host device according to any one of the preceding claims, wherein: the mobile terminal has a personal handyphone system communication function; the relay network includes a personal handyphone system communication network and the Internet; the voice control host device is connected to the Internet; and the host connection means in the mobile terminal places a call, via the personal handyphone system communication network, to a mobile terminal control host device having a gateway function between the Internet and a public network including the personal handyphone system communication network, connects to that device, and then connects from the mobile terminal control host device to the voice control host device via the Internet by using a communication protocol on the Internet.

6. The mobile terminal voice recognition communication system, mobile terminal, or voice control host device, wherein: the communication protocol is a layered protocol including an Internet Protocol layer and a Transmission Control Protocol layer; a header field of an Internet Protocol datagram, which is the packet data of the Internet Protocol layer transmitted over the Internet, stores a source Internet Protocol address and a destination Internet Protocol address that designate the addresses of the mobile terminal and the voice control host device on the Internet; a data field of the Internet Protocol datagram stores a Transmission Control Protocol segment, which is the packet data of the Transmission Control Protocol layer; a header field of the Transmission Control Protocol segment stores a source port number and a destination port number that designate the communication protocol for the voice recognition processing; and a data field of the Transmission Control Protocol segment stores a terminal identification code identifying the mobile terminal, the real-time reply request command, the non-real-time reply request command, the batch transfer request command, the voice data, or the recognized voice data.
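As a rough illustration of the contents that claim 6 places in the data field of the Transmission Control Protocol segment, the following sketch packs a terminal identification code, a command, and a body into bytes. The byte layout (8-byte identifier, 1-byte command code, 2-byte body length), the command codes, and the function names are all assumptions made for this example; the claim does not define a layout. The IP and TCP headers themselves would normally be produced by the operating system's network stack.

```python
import struct

# Hypothetical layout: terminal id (8 bytes), command code (1), body length (2)
HEADER = struct.Struct("!8sBH")

# Hypothetical command codes
COMMANDS = {
    "REALTIME": 1,
    "NON_REALTIME": 2,
    "BATCH_TRANSFER": 3,
    "VOICE_DATA": 4,
    "RECOGNIZED_DATA": 5,
}
NAMES = {code: name for name, code in COMMANDS.items()}


def encode_payload(terminal_id, command, body=b""):
    """Build the bytes carried in the TCP data field."""
    tid = terminal_id.encode("ascii")
    if len(tid) > 8:
        raise ValueError("terminal id longer than 8 bytes")
    return HEADER.pack(tid, COMMANDS[command], len(body)) + body


def decode_payload(data):
    """Split received bytes into (terminal_id, command, body)."""
    tid, code, length = HEADER.unpack_from(data)
    body = data[HEADER.size:HEADER.size + length]
    return tid.rstrip(b"\x00").decode("ascii"), NAMES[code], body
```

Carrying the terminal identification code in every message is what lets the host attribute voice data and commands to the correct terminal.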

7. The mobile terminal voice recognition communication system, mobile terminal, or voice control host device according to any one of claims 1 to 6, wherein the voice control host device comprises a plurality of host computers that are interconnected by a network and that implement, in a distributed manner, the functions corresponding to the mobile terminal connection means, the voice data reception means, the voice recognition means, and the recognized voice data return means.