JP2010152237A

JP2010152237A - Cellular phone system

Info

Publication number: JP2010152237A
Application number: JP2008332496A
Authority: JP
Inventors: Takao Hayashi; 孝郎林
Original assignee: Individual
Current assignee: Individual
Priority date: 2008-12-26
Filing date: 2008-12-26
Publication date: 2010-07-08

Abstract

PROBLEM TO BE SOLVED: To provide a cellular phone system that is small, has high added value, and offers a highly functional user interface, thereby solving the problems that arise when a voice interactive system is incorporated into a cellular phone: the phone becomes large and inconvenient to carry, and the damage is severe when it is dropped or submerged in water. SOLUTION: Of the microphone 17, the speaker 21, the voice recognition board 55, and the interaction processing part 71 that constitute the voice interactive system, the microphone 17 and the speaker 21 are provided in an interactee (the cellular phone 11 or a movable unit 15A), while the voice recognition board 55 and the interaction processing part 71 are provided in a server 13. The interactee and the server 13 are connected by cable or by radio. COPYRIGHT: (C)2010,JPO&INPIT

Description

The present invention relates to a mobile phone.

In recent years, industry, government, and academia have jointly pursued the development and commercialization of voice dialogue devices. The inventor is developing a mobile phone system in which a voice dialogue device is incorporated into a mobile phone.

However, although conventional voice dialogue devices have become smaller, incorporating one into a mobile phone makes the phone very large, which is inconvenient in practice.

In addition, a conventional mobile phone with a built-in voice dialogue device is always at risk of failure from being dropped or submerged in water while it is carried around. Such voice dialogue devices are also very expensive, so a failure caused by such an accident entails a large repair cost.

Moreover, simply combining a mobile phone with a voice dialogue device yields nothing more than a mobile phone that can hold a voice dialogue; it cannot provide a mobile phone system with higher added value and greater functionality. Nor can a conventional mobile phone achieve a more advanced user interface merely by being combined with a voice dialogue device.

In view of the above, a first object of the present invention is to provide a mobile phone system that can be made compact.

A second object of the present invention is to provide a mobile phone system that suffers less damage when an accident occurs.

A third object of the present invention is to provide a highly functional mobile phone system with high added value.

A fourth object of the present invention is to provide a voice-dialogue-capable mobile phone or mobile phone system that can realize an advanced user interface.

To achieve the above objects, the invention according to claim 1 comprises:
an interactee provided with voice conversion means for converting a human voice into a voice signal and sound generation means for converting a predetermined sound signal into vibration to produce sound; and
a server computer provided separately from the interactee and connected to the interactee by wire or wirelessly,
wherein the server computer comprises voice recognition means for processing the voice signal converted by the voice conversion means to recognize the human voice, and dialogue control means for determining a voice response corresponding to the voice recognized by the voice recognition means and outputting the predetermined sound signal.

With this configuration, the voice recognition means and the dialogue control means reside in the server computer, so these expensive components do not fail even if the interactee is dropped or submerged in a puddle. Furthermore, when the interactee and the server computer are connected wirelessly, the interactee can be moved without being constrained by the length of a cable, as it would be with a wired connection.
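The division of roles in claim 1 can be illustrated with a minimal sketch. Everything named here (`Server`, `Handset`, the reply table, and the use of UTF-8 text as a stand-in for a voice signal) is a hypothetical simplification for illustration and does not come from the source; real speech recognition and dialogue control are not modelled.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Server:
    """Hypothetical server computer holding the costly components."""
    # Toy reply table standing in for the dialogue control means.
    replies: Dict[str, str] = field(
        default_factory=lambda: {"hello": "Hello. How can I help?"}
    )

    def recognize(self, voice_signal: bytes) -> str:
        # Stand-in for the voice recognition means: the "signal" is UTF-8 text.
        return voice_signal.decode("utf-8").strip().lower()

    def decide_reply(self, recognized: str) -> str:
        # Stand-in for the dialogue control means: look up a response.
        return self.replies.get(recognized, "Sorry, I did not understand.")

    def handle(self, voice_signal: bytes) -> str:
        return self.decide_reply(self.recognize(voice_signal))


@dataclass
class Handset:
    """Hypothetical interactee: only the microphone and speaker live here."""
    send_to_server: Callable[[bytes], str]  # wired or wireless link (assumed)
    spoken: List[str] = field(default_factory=list)

    def microphone(self, utterance: str) -> bytes:
        # Voice conversion means: voice to voice signal.
        return utterance.encode("utf-8")

    def speaker(self, sound_signal: str) -> None:
        # Sound generation means: record what would be played.
        self.spoken.append(sound_signal)

    def talk(self, utterance: str) -> str:
        reply = self.send_to_server(self.microphone(utterance))
        self.speaker(reply)
        return reply


server = Server()
handset = Handset(send_to_server=server.handle)
print(handset.talk("Hello"))
```

In this sketch, losing or replacing the `Handset` object leaves the `Server` and its reply table untouched, which mirrors the damage-limiting effect described above.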

The invention according to claim 2 comprises:
an interactee provided with sound generation means for converting a predetermined sound signal into vibration to produce sound;
a server computer provided separately from the interactee and connected to the interactee by wire or wirelessly; and
voice conversion means that is provided separately from the interactee and the server computer, is connected to either the interactee or the server computer by wire or wirelessly, and converts a human voice into a voice signal,
wherein the server computer comprises voice recognition means for processing the voice signal converted by the voice conversion means to recognize the human voice, and dialogue control means for determining a voice response corresponding to the voice recognized by the voice recognition means and outputting the predetermined sound signal.

With this configuration, the voice conversion means is separate from both the interactee and the server computer, so a person can input speech to the voice conversion means without approaching the interactee. Also, because the voice conversion means is not carried around with the interactee, it does not fail even if the interactee is dropped or submerged in a puddle.

The invention according to claim 3 comprises:
an interactee provided with voice conversion means for converting a human voice into a voice signal and sound generation means for converting a predetermined sound signal into vibration to produce sound; and
a server computer provided separately from the interactee and connected to the interactee by wire or wirelessly,
wherein, of the voice recognition means, which processes the voice signal converted by the voice conversion means to recognize a person's words, and the dialogue control means, which determines words corresponding to the words recognized by the voice recognition means and outputs the predetermined sound signal, one is provided in the interactee and the other is provided in the server computer.

With this configuration, when the voice recognition means is provided in the interactee and the dialogue control means is provided in the server computer, the expensive dialogue control means does not fail if the interactee is dropped or submerged in a puddle. Conversely, when the dialogue control means is provided in the interactee and the voice recognition means is provided in the server computer, the expensive voice recognition means does not fail in those cases. Furthermore, when the interactee and the server computer are connected wirelessly, the interactee can be moved without being constrained by the length of a cable, as it would be with a wired connection.

The invention according to claim 4 comprises:
an interactee provided with sound generation means for converting a predetermined sound signal into vibration to produce sound;
a server computer provided separately from the interactee and connected to the interactee by wire or wirelessly; and
voice conversion means that is provided separately from the interactee and the server computer, is connected to either the interactee or the server computer by wire or wirelessly, and converts a human voice into a voice signal,
wherein, of the voice recognition means, which processes the voice signal converted by the voice conversion means to recognize the human voice, and the dialogue control means, which determines a voice response corresponding to the voice recognized by the voice recognition means and outputs the predetermined sound signal, one is provided in the interactee and the other is provided in the server computer.

With this configuration, the voice conversion means is separate from both the interactee and the server computer, so a person can input speech to the voice conversion means without approaching the interactee. Also, because the voice conversion means is not carried around with the interactee, it does not fail even if the interactee is dropped or submerged in a puddle.

According to any one of claims 1 to 4, the interactee can be made smaller and lighter, and therefore easier to carry, than when the voice conversion means, the sound generation means, the voice recognition means, and the dialogue control means are all mounted on the interactee.

In the invention according to claim 5, in the mobile phone system according to any one of claims 1 to 4, a sound information storage unit capable of storing predetermined sound information is further mounted on either the interactee or the server computer,
the predetermined sound information is stored in the sound information storage unit, and
the predetermined sound information is read from the sound information storage unit and output from the sound generation means in any of the following cases: when a person requests the predetermined sound information via the voice conversion means; when a person permits the predetermined sound information via the voice conversion means; or when the interactee produces sound on its own initiative using the predetermined sound information.

With this configuration, the system goes beyond a simple voice dialogue between a person and the interactee. It provides a highly functional mobile phone system in which the predetermined sound information can be obtained when a person requests it via the voice conversion means, when a person permits it via the voice conversion means, or when the interactee produces sound on its own initiative using that information. It also provides an advanced user interface that reads out the predetermined sound information and outputs it from the sound generation means when a person requests it or when the interactee speaks on its own initiative. Further, when the sound information storage unit is mounted on the server computer, the stored sound information is not damaged even if the interactee is dropped or submerged in a puddle.
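The three trigger conditions of claim 5 can be sketched as follows. The names `Trigger`, `SoundInfoStore`, and `play_sound_info` are hypothetical, and stored sound information is represented as plain text purely for illustration; the source does not specify any data format.

```python
from enum import Enum, auto
from typing import Dict, List, Optional


class Trigger(Enum):
    USER_REQUEST = auto()     # the person asks for the information by voice
    USER_PERMISSION = auto()  # the person approves an offer by voice
    SELF_INITIATED = auto()   # the interactee speaks on its own initiative


class SoundInfoStore:
    """Hypothetical sound information storage unit (handset or server side)."""

    def __init__(self) -> None:
        self._items: Dict[str, str] = {}

    def save(self, key: str, text: str) -> None:
        self._items[key] = text

    def load(self, key: str) -> Optional[str]:
        return self._items.get(key)


def play_sound_info(store: SoundInfoStore, key: str,
                    trigger: object, speaker_log: List[str]) -> bool:
    """Read stored information and 'play' it only for a recognised trigger."""
    if not isinstance(trigger, Trigger):
        return False          # no valid trigger: stay silent
    text = store.load(key)
    if text is None:
        return False          # nothing stored under this key
    speaker_log.append(text)  # stand-in for the sound generation means
    return True


store = SoundInfoStore()
store.save("weather", "Today will be sunny.")
log: List[str] = []
print(play_sound_info(store, "weather", Trigger.USER_REQUEST, log), log)
```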

In the invention according to claim 6, in claim 5, the sound information storage unit is configured to be connectable to the Internet, and
the sound information can be downloaded from a predetermined storage location on the Internet.

This provides a highly functional mobile phone system that can download the predetermined sound information from the Internet. Moreover, because the information can be downloaded, it can be restored immediately even if the copy stored in the sound information storage unit is damaged.
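The restore-on-damage behaviour can be sketched as below. This is an assumption-laden illustration: the source does not say how damage is detected or how the download is performed, so a damaged entry is modelled as `None` and the download is a caller-supplied function (`fake_download` here is a hypothetical stand-in that performs no network access).

```python
from typing import Callable, Dict, Optional


def read_sound_info(storage: Dict[str, Optional[bytes]], key: str,
                    download: Callable[[str], bytes]) -> bytes:
    """Return the stored data, re-downloading it if missing or damaged."""
    data = storage.get(key)
    if data is None:          # missing, or marked as damaged (assumed model)
        data = download(key)  # fetch a fresh copy from the remote location
        storage[key] = data   # restore the local copy
    return data


def fake_download(key: str) -> bytes:
    # Hypothetical stand-in for a download from a predetermined location.
    return f"audio-for-{key}".encode("utf-8")


storage: Dict[str, Optional[bytes]] = {"greeting": None}
print(read_sound_info(storage, "greeting", fake_download))
```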

As in the invention according to claim 7, in any one of claims 1 to 6, the interactee may comprise:
one or more movable parts;
motors that respectively move the one or more movable parts;
drive units that respectively drive the motors; and
a controller that outputs, to the drive units, command signals directing the operation of the movable parts.

With this configuration, the controller outputs a command signal directing the operation of a movable part to the drive unit, and the motor is driven based on that command signal, so that the movable part moves. As described above, the mobile phone system may have the movable parts, motors, drive units, and controller all provided in the interactee.
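The controller, drive unit, and motor chain described above can be sketched as three small classes. The class names, the dictionary-shaped command signal, and the use of an angle to represent the movable part's position are all hypothetical choices for illustration; the source does not define a command format.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class Motor:
    """Moves one movable part; the angle stands in for the part's position."""
    angle_deg: float = 0.0

    def rotate(self, delta_deg: float) -> None:
        self.angle_deg += delta_deg


@dataclass
class DriveUnit:
    """Drives one motor according to a received command signal."""
    motor: Motor

    def apply(self, command: Dict[str, float]) -> None:
        self.motor.rotate(command["delta_deg"])


@dataclass
class Controller:
    """Issues command signals to the drive unit of a named movable part."""
    drive_units: Dict[str, DriveUnit]

    def command(self, part: str, delta_deg: float) -> None:
        self.drive_units[part].apply({"delta_deg": delta_deg})


arm = Motor()
head = Motor()
controller = Controller({"arm": DriveUnit(arm), "head": DriveUnit(head)})
controller.command("arm", 30.0)
controller.command("arm", -10.0)
controller.command("head", 45.0)
print(arm.angle_deg, head.angle_deg)
```

Because the controller only holds references to drive units, the same sketch covers the later variants in which the controller or drive units sit on the server computer instead of the interactee; only the transport of the command signal would differ.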

In the invention according to claim 8, in any one of claims 1 to 6, the interactee comprises:
one or more movable parts;
motors that respectively move the one or more movable parts; and
drive units that respectively drive the motors,
and the server computer comprises a controller that outputs operation command signals to the drive units.

With this configuration, the controller provided in the server computer outputs a command signal directing the operation of a movable part to the drive unit provided in the interactee, and the motor is driven based on that command signal, so that the movable part moves.

Because the controller is provided in the server computer, the expensive controller does not fail even if the interactee is dropped or submerged in a puddle.

In the invention according to claim 9, in any one of claims 1 to 6, the interactee comprises:
one or more movable parts; and
motors that respectively move the one or more movable parts,
and the server computer comprises drive units that respectively drive the motors and a controller that outputs operation command signals to the drive units.

With this configuration, the controller provided in the server computer outputs a command signal directing the operation of a movable part to the drive unit, and the motor provided in the interactee is driven based on that command signal, so that the movable part moves.

Because the drive units and the controller are provided in the server computer, these expensive components do not fail even if the interactee is dropped or submerged in a puddle.

In the invention according to claim 10, in any one of claims 1 to 6, the system comprises a movable unit that is provided separately from the interactee and the server computer, is connected to at least one of the interactee and the server computer by wire or wirelessly, and is capable of movement,
wherein the movable unit comprises:
one or more movable parts;
motors that respectively move the one or more movable parts;
drive units that respectively drive the motors; and
a controller that outputs, to the drive units, command signals directing the operation of the movable parts.

With this configuration, the movable unit is provided separately from the interactee and the server computer, and can move while connected to the interactee by wire or wirelessly.

Because the movable parts, motors, drive units, and controller are provided separately from the interactee, these expensive components do not fail even if the interactee is dropped or submerged in a puddle.

In the invention according to claim 11, in any one of claims 1 to 6, the system comprises a movable unit that is provided separately from the interactee and the server computer and is connected to at least one of the interactee and the server computer by wire or wirelessly,
wherein the movable unit comprises:
one or more movable parts;
motors that respectively drive the one or more movable parts; and
drive units that respectively drive the motors,
and either the interactee or the server computer comprises a controller that outputs operation command signals to the drive units.

With this configuration, the controller provided in either the interactee or the server computer outputs a command signal directing the operation of a movable part to the drive unit, and the motor provided in the movable unit is driven based on that command signal, so that the movable part moves.

Because the movable parts, motors, and drive units are provided separately from the interactee, these expensive components do not fail even if the interactee is dropped or submerged in a puddle.

In the invention according to claim 12, in any one of claims 1 to 6, the system comprises a movable unit that is provided separately from the interactee and is connected to at least one of the interactee and the server computer by wire or wirelessly,
wherein the movable unit comprises:
one or more movable parts; and
motors that respectively move the one or more movable parts,
drive units that respectively drive the motors are provided in either the interactee or the server computer, and
a controller that outputs operation command signals to the drive units is provided in either the interactee or the server computer.

With this configuration, at least the movable parts and the motors are provided in the movable unit, so these expensive components do not fail even if the interactee is dropped or submerged in a puddle.

According to any one of claims 8 to 12, the interactee can be made smaller and lighter, and therefore easier to carry, than when the movable parts, motors, drive units, and controller are all provided in the interactee.
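The component placements of claims 7 to 12 can be summarised in a small lookup table, with a helper that lists the components guaranteed to stay off the interactee under each claim. The location labels (`"interactee"`, `"server"`, `"movable_unit"`) and the function name are hypothetical shorthand introduced here; where a claim allows a component to sit on either the interactee or the server computer, both locations are listed.

```python
from typing import Dict, Set

# Possible locations of each drive-chain component under claims 7 to 12.
PLACEMENT: Dict[int, Dict[str, Set[str]]] = {
    7: {"movable_part": {"interactee"}, "motor": {"interactee"},
        "drive_unit": {"interactee"}, "controller": {"interactee"}},
    8: {"movable_part": {"interactee"}, "motor": {"interactee"},
        "drive_unit": {"interactee"}, "controller": {"server"}},
    9: {"movable_part": {"interactee"}, "motor": {"interactee"},
        "drive_unit": {"server"}, "controller": {"server"}},
    10: {"movable_part": {"movable_unit"}, "motor": {"movable_unit"},
         "drive_unit": {"movable_unit"}, "controller": {"movable_unit"}},
    11: {"movable_part": {"movable_unit"}, "motor": {"movable_unit"},
         "drive_unit": {"movable_unit"},
         "controller": {"interactee", "server"}},
    12: {"movable_part": {"movable_unit"}, "motor": {"movable_unit"},
         "drive_unit": {"interactee", "server"},
         "controller": {"interactee", "server"}},
}


def protected_components(claim: int) -> Set[str]:
    """Components that can never be on the interactee under the given claim."""
    return {name for name, places in PLACEMENT[claim].items()
            if "interactee" not in places}


for claim in sorted(PLACEMENT):
    print(claim, sorted(protected_components(claim)))
```

Under these assumptions, the set returned for each claim matches the components that the corresponding passage describes as safe when the interactee is dropped or submerged.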

As in the invention according to claim 13, in any one of claims 10 to 12, the interactee and the movable unit may be configured to be attachable to and detachable from each other.

Because the movable unit can be attached to the interactee, the system can be used in either of two configurations: with the interactee separate from the movable unit, or with the two combined into one body.

In the invention according to claim 14, in the mobile phone system according to any one of claims 1 to 9, image display means for displaying a predetermined image is further provided either integrally with or separately from the interactee,
an image information storage unit in which predetermined image information is stored in advance is mounted on either the interactee or the server computer, and
the predetermined image information is read from the image information storage unit and displayed on the image display means in any of the following cases: when a person requests the predetermined image information via the voice conversion means; when a person permits the predetermined image information via the voice conversion means; or when the interactee displays the predetermined image on its own initiative using the predetermined image information.

With this configuration, when the image display means is provided separately from the interactee, the image display means is not damaged even if the interactee is dropped or submerged in a puddle. Likewise, when the image information storage unit is mounted on the server computer, the image information in the image information storage unit is not damaged in those cases. Note that the image information storage unit may instead be mounted on the interactee, and the image display means may be provided integrally with the interactee.

In the invention according to claim 15, in the mobile phone system according to any one of claims 10 to 13, image display means for displaying a predetermined image is further provided in either the interactee or the movable unit and is connected to at least one of the interactee, the server computer, and the movable unit by wire or wirelessly,
an image information storage unit in which predetermined image information is stored in advance is mounted on any of the interactee, the server computer, and the movable unit, and
the predetermined image information is read from the image information storage unit and displayed on the image display means in any of the following cases: when a person requests the predetermined image information via the voice conversion means; when a person permits the predetermined image information via the voice conversion means; or when the interactee displays the predetermined image on its own initiative using the predetermined image information.

With this configuration, when the image display means is mounted on either the server computer or the movable unit, the expensive image display means is not damaged even if the interactee is dropped or submerged in a puddle. Likewise, when the image information storage unit is mounted on either the server computer or the movable unit, the image information in the image information storage unit is not damaged in those cases. Note that the image display means may instead be provided in the interactee, and the image information storage unit may be provided in the interactee.

In the invention according to claim 16, in the mobile phone system according to any one of claims 10 to 13, image display means for displaying a predetermined image is further provided separately from both the interactee and the movable unit and is connected to at least one of the interactee, the server computer, and the movable unit by wire or wirelessly,
an image information storage unit in which predetermined image information is stored in advance is mounted on any of the interactee, the server computer, and the movable unit, and
the predetermined image information is read from the image information storage unit and displayed on the image display means in any of the following cases: when a person requests the predetermined image information via the voice conversion means; when a person permits the predetermined image information via the voice conversion means; or when the interactee displays the predetermined image on its own initiative using the predetermined image information.

With this configuration, the image display means for displaying the predetermined image is provided separately from both the interactee and the movable unit, so the image information in the image information storage unit is not damaged even if the interactee is dropped or submerged in a puddle.

According to any one of claims 14, 15, and 16, a highly functional mobile phone system can be provided in which the predetermined image information can be obtained when a person requests it via the voice conversion means, when a person permits it via the voice conversion means, or when the interactee displays the predetermined image on its own initiative using that information. In each of these cases, an advanced user interface can also be realized that obtains the predetermined image information from the image information storage unit and displays it on the image display means.

In the invention according to claim 17, in any one of claims 14 to 16, the image information storage unit is configured to be connectable to the Internet, and
the image information can be downloaded from a predetermined storage location on the Internet.

This provides a highly functional mobile phone system that can download the predetermined image information from the Internet. Moreover, because the information can be downloaded, it can be restored immediately even if the copy stored in the image information storage unit is damaged.

As in the invention according to claim 18, in any one of claims 1 to 9 and claim 14, imaging means capable of imaging a predetermined object, including a person, may be provided either integrally with or separately from the interactee, and
image recognition means for recognizing the predetermined object from the image data captured by the imaging means may be mounted on either the interactee or the server computer.

With this configuration, the predetermined object can be recognized from the image data captured by the imaging means. As described above, the imaging means may be provided either integrally with or separately from the interactee. A highly functional mobile phone system equipped with imaging means can thus be provided.

請求項１９に記載の発明では、請求項１０乃至請求項１３、請求項１５、請求項１６のいずれか１つにおいて、人を含む所定の対象物を撮像自在な撮像手段が被対話体および可動ユニットのいずれかに設けられて、被対話体、サーバ用コンピュータ、可動ユニットの少なくとも１つに有線及び無線のいずれかで接続されており、
撮像手段により撮像された撮像データから所定の対象物を認識する画像認識手段が被対話体、サーバ用コンピュータ、可動ユニットの少なくとも１つに搭載されていることを特徴とする。 According to a nineteenth aspect of the present invention, in any one of the tenth to thirteenth, fifteenth, and sixteenth aspects, imaging means capable of imaging a predetermined object including a person is provided on either the interactee or the movable unit and is connected to at least one of the interactee, the server computer, and the movable unit by either wire or radio.
Image recognition means for recognizing a predetermined object from image data captured by the imaging means is mounted on at least one of the interactee, the server computer, and the movable unit.

これによれば、撮像手段により撮像された撮像データから人を含む所定の対象物を認識することができる。撮像手段が可動ユニットに設けられている場合には、被対話体を落下させた場合、あるいは水たまりに水没させた場合でも、高価な撮像手段を損傷させることがない。また、画像認識手段がサーバ用コンピュータ、可動ユニットに搭載されている場合には、高価な画像認識手段を損傷させることがない。なお、撮像手段が被対話体に設けられていてもよく、画像認識手段が被対話体に搭載されていてもよい。 According to this, it is possible to recognize a predetermined object including a person from the imaging data captured by the imaging unit. When the imaging unit is provided in the movable unit, the expensive imaging unit is not damaged even when the interactee is dropped or submerged in a puddle. Further, when the image recognition means is mounted on the server computer or the movable unit, the expensive image recognition means is not damaged. Note that the imaging means may be provided on the interactee, and the image recognition means may be mounted on the interactee.

請求項２０に記載の発明では、請求項１０乃至請求項１３、請求項１５、請求項１６のいずれか１つにおいて、人を含む所定の対象物を撮像自在な撮像手段が被対話体および可動ユニットのいずれかとも別体に設けられて、被対話体、サーバ用コンピュータ、可動ユニットの少なくとも１つに有線及び無線のいずれかで接続されており、
撮像手段により撮像された撮像データから所定の対象物を認識する画像認識手段が被対話体、サーバ用コンピュータ、可動ユニットの少なくとも１つに搭載されていることを特徴とする。 According to a twentieth aspect of the present invention, in any one of the tenth to thirteenth, fifteenth, and sixteenth aspects, imaging means capable of imaging a predetermined object including a person is provided separately from both the interactee and the movable unit and is connected to at least one of the interactee, the server computer, and the movable unit by either wire or radio.
Image recognition means for recognizing a predetermined object from image data captured by the imaging means is mounted on at least one of the interactee, the server computer, and the movable unit.

これによれば、撮像手段により撮像された撮像データから人を含む所定の対象物を認識することができる。上記のように、撮像手段が被対話体と一体および別体のいずれかに構成されていてもよい。また、撮像手段を備えた高機能な携帯電話システムを提供できる。 According to this, it is possible to recognize a predetermined object including a person from the imaging data captured by the imaging unit. As described above, the imaging means may be configured either as an integral body or a separate body. In addition, it is possible to provide a highly functional mobile phone system including an imaging unit.

請求項２１に記載の発明では、請求項７乃至請求項２０のいずれか１つにおいて、人と対話を行う場合、所定の説明を行う場合の少なくとも１つにおいて、可動部が所定のコミュニケーション動作をするように、コントローラが駆動部に指令信号を出力することを特徴とする。 According to a twenty-first aspect of the present invention, in any one of the seventh to twentieth aspects, the controller outputs a command signal to the drive unit so that the movable part performs a predetermined communication operation in at least one of the case of conversing with a person and the case of giving a predetermined explanation.

これによれば、可動部が設けられていない携帯電話システムに比べて、コミュニケーション動作をして、臨場感を持って発音する高度な携帯電話システムを提供できる。また、可動部が設けられていない携帯電話システムに比べて、コミュニケーション動作をして、臨場感を持って発音する高度なユーザインターフェースを実現することができる。 According to this, compared with a mobile phone system that has no movable part, it is possible to provide an advanced mobile phone system that performs communication operations and produces sound with a sense of presence. It is likewise possible to realize an advanced user interface that performs communication operations and produces sound with a sense of presence.

請求項２２に記載の発明では、請求項１０乃至請求項１３、請求項１５のいずれか１つにおいて、可動部が所定の装置を操作する位置に配置されており、
人の音声が所定の装置を操作する命令である場合、人の音声が所定の装置を操作する許可である場合、所定の操作入力手段により所定の装置を操作する場合、所定の装置を操作する自動実行プログラムが実行される場合に、所定の装置を操作するように、コントローラが駆動部に指令信号を出力することを特徴とする。 According to a twenty-second aspect of the present invention, in any one of the tenth to thirteenth aspects and the fifteenth aspect, the movable part is disposed at a position for operating a predetermined device.
The controller outputs a command signal to the drive unit so as to operate the predetermined device when a person's voice is an instruction to operate the predetermined device, when a person's voice is permission to operate the predetermined device, when the predetermined device is operated through predetermined operation input means, or when an automatic execution program for operating the predetermined device is executed.

これによれば、コントローラが駆動部に指令信号を出力して、可動部が所定の装置を操作する高度な携帯電話システムを提供できる。 According to this, it is possible to provide an advanced mobile phone system in which the controller outputs a command signal to the drive unit and the movable unit operates a predetermined device.

請求項２３に記載の発明では、請求項１８乃至請求項２０のいずれか１つにおいて、撮像手段が、人を含む所定の対象物を撮像し、画像認識手段が所定の対象物を認識した結果に基づいて、人と所定のコミュニケーション動作をするように、コントローラが駆動部に指令信号を出力することを特徴とする。 According to a twenty-third aspect of the present invention, in any one of the eighteenth to twentieth aspects, the imaging means images a predetermined object including a person, and the controller outputs a command signal to the drive unit so as to perform a predetermined communication operation with the person based on the result of the image recognition means recognizing the predetermined object.

これによれば、人を含む所定の対象物を撮像し、画像認識手段が所定の対象物を認識した結果に基づいてコミュニケーション動作をし、臨場感を持って発音する高度な携帯電話システムを提供できる。また、人を含む所定の対象物を撮像し、画像認識手段が所定の対象物を認識した結果に基づいてコミュニケーション動作をし、臨場感を持って発音する高度なユーザインターフェースを実現することができる。 According to this, it is possible to provide an advanced mobile phone system that images a predetermined object including a person, performs a communication operation based on the result of the image recognition means recognizing the object, and produces sound with a sense of presence. It is likewise possible to realize an advanced user interface that does the same.

請求項２４に記載の発明では、請求項１８乃至請求項２０のいずれか１つにおいて、撮像手段が、人を含む所定の対象物を撮像し、画像認識手段が所定の対象物を認識した結果に基づいて、複数の発音データから少なくとも１つを選択し、発音手段を介して人に対して発音することを特徴とする。 According to a twenty-fourth aspect of the present invention, in any one of the eighteenth to twentieth aspects, the imaging means images a predetermined object including a person, at least one of a plurality of pronunciation data is selected based on the result of the image recognition means recognizing the predetermined object, and the selected data is sounded to the person via the sounding means.

これによれば、撮像手段、画像認識手段により人を含む所定の対象物を認識して、人と音声対話をする高度な携帯電話システムを提供できる。また、画像認識手段により人を含む所定の対象物を認識して、人と音声対話をする高度なユーザインターフェースを実現することができる。 According to this, it is possible to provide an advanced mobile phone system for recognizing a predetermined object including a person by the image pickup means and the image recognition means and having a voice conversation with the person. In addition, it is possible to realize a high-level user interface for recognizing a predetermined object including a person by the image recognizing means and having a voice conversation with the person.

請求項２５に記載の発明では、請求項１８乃至請求項２０のいずれか１つにおいて、撮像手段が、所定の装置の操作手段を撮像し、
人の音声が所定の装置を操作する命令である場合、人の音声が所定の装置を操作する許可である場合、所定の操作入力手段により所定の装置を操作する場合、所定の装置を操作する自動実行プログラムが実行される場合に、画像認識手段が操作手段の位置を認識した結果に基づいて、可動部及び被対話体が、手段の操作位置に可動し、所定の装置を操作するように、コントローラが駆動部に指令信号を出力することを特徴とする。 According to a twenty-fifth aspect of the present invention, in any one of the eighteenth to twentieth aspects, the imaging unit images an operation unit of a predetermined device,
When a human voice is an instruction to operate a predetermined device, when a human voice is permission to operate a predetermined device, when operating a predetermined device by a predetermined operation input means, operate the predetermined device When the automatic execution program is executed, based on the result of the image recognizing means recognizing the position of the operating means, the movable part and the object to be interacted move to the operating position of the means and operate a predetermined device. The controller outputs a command signal to the drive unit.

これによれば、可動部及び被対話体が、手段の操作位置に可動し、所定の装置を操作する高度な携帯電話システムを提供できる。また、人の音声が所定の装置を操作する命令である場合、人の音声が所定の装置を操作する許可である場合、所定の操作入力手段により所定の装置を操作する場合に、可動部及び被対話体が、手段の操作位置に可動し、所定の装置を操作する高度なユーザインターフェースを実現することができる。 According to this, it is possible to provide an advanced mobile phone system in which the movable part and the object to be interacted move to the operation position of the means and operate a predetermined device. Further, when the human voice is an instruction to operate the predetermined device, the human voice is permitted to operate the predetermined device, or the predetermined device is operated by the predetermined operation input means, the movable portion and It is possible to realize an advanced user interface in which the interactee moves to the operation position of the means and operates a predetermined device.

請求項２６に記載の発明では、請求項１８乃至請求項２０のいずれか１つにおいて、撮像手段が、テーブルゲームの進行状況を撮像し、画像認識手段が、テーブルゲームの進行状況を画像認識するように構成されており、
画像認識手段により認識された進行状況から可動部の次の動作を決定する動作決定手段を備えており、
可動部が、動作決定手段により決定された次の動作を実行するように、コントローラが駆動部に指令信号を出力することを特徴とする。 According to a twenty-sixth aspect of the present invention, in any one of the eighteenth to twentieth aspects, the imaging means is configured to image the progress of a table game, and the image recognition means is configured to recognize the progress of the table game from the image.
Operation determining means is provided for determining the next operation of the movable part from the progress recognized by the image recognition means.
The controller outputs a command signal to the drive unit so that the movable unit executes the next operation determined by the operation determination unit.

これによれば、撮像手段、画像認識手段によりテーブルゲームの進行状況を撮像、画像認識し、動作決定手段により可動部の次の動作を決定し、可動部が、動作決定手段により決定された次の動作を実行する高度な携帯電話システムを提供できる。 According to this, it is possible to provide an advanced mobile phone system in which the imaging means and the image recognition means image and recognize the progress of a table game, the operation determining means determines the next operation of the movable part, and the movable part executes that operation.
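The recognize, decide, and act loop of the twenty-sixth aspect can be illustrated with a small sketch. The patent does not name a specific table game or decision rule, so the tic-tac-toe board, the function names, and the "win, else block, else first free cell" policy below are all assumptions chosen only to make the operation determining step concrete.

```python
from typing import List, Optional

Board = List[str]  # 9 cells in row-major order, each "X", "O", or " "

# All eight winning lines on a 3x3 board (hypothetical game, not from the patent).
LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),
         (0, 3, 6), (1, 4, 7), (2, 5, 8),
         (0, 4, 8), (2, 4, 6)]


def winning_cell(board: Board, mark: str) -> Optional[int]:
    """Return the empty cell that would complete a line for `mark`, if any."""
    for line in LINES:
        marks = [board[i] for i in line]
        if marks.count(mark) == 2 and marks.count(" ") == 1:
            return line[marks.index(" ")]
    return None


def decide_next_move(board: Board, me: str = "O", opponent: str = "X") -> Optional[int]:
    """Stand-in for the operation determining means: win, else block, else first free cell."""
    for mark in (me, opponent):
        cell = winning_cell(board, mark)
        if cell is not None:
            return cell
    free = [i for i, c in enumerate(board) if c == " "]
    return free[0] if free else None


# The recognized board would come from the image recognition means; here it is typed in.
recognized = ["X", "X", " ",
              " ", "O", " ",
              " ", " ", " "]
print(decide_next_move(recognized))
```

In a full system the returned cell index would be translated into a command signal that moves the hand to the corresponding position on the table.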

請求項２７に記載の発明では、請求項１８乃至請求項２０のいずれか１つにおいて、コントローラから駆動部に指令信号を出力して可動部を可動させ、人を含む所定の対象物を探し出すことを特徴とする。 According to a twenty-seventh aspect of the present invention, in any one of the eighteenth to twentieth aspects, the controller outputs a command signal to the drive unit to move the movable part and search for a predetermined object including a person.

これによれば、所定の対象物を探し出す高度な携帯電話システムを提供できる。 According to this, an advanced mobile phone system for searching for a predetermined object can be provided.

請求項２８に記載の発明では、請求項１８乃至請求項２０のいずれか１つにおいて、画像認識手段により認識された所定の対象物を撮像手段が追跡する追跡プログラムが被対話体およびサーバ用コンピュータのいずれかに搭載されており、
撮像手段が人を含む所定の対象物を追跡するように、コントローラから駆動部に指令信号を出力し、可動部を可動させることを特徴とする。 According to a twenty-eighth aspect of the present invention, in any one of the eighteenth to twentieth aspects, a tracking program that causes the imaging means to track a predetermined object recognized by the image recognition means is installed on either the interactee or the server computer.
A command signal is output from the controller to the drive unit so that the imaging unit tracks a predetermined object including a person, and the movable unit is moved.

これによれば、撮像手段が人を含む所定の対象物を追跡するように可動部を可動できるので、人を含む所定の対象物が移動しても、人を含む所定の対象物を追跡して認識をする高度な携帯電話システムを提供できる。 According to this, since the movable part can be moved so that the imaging means tracks a predetermined object including a person, it is possible to provide an advanced mobile phone system that keeps tracking and recognizing the object even when it moves.
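As an illustration of how a tracking program might turn a recognition result into a command for the turning portion, the following sketch uses a simple proportional controller. The patent does not specify any control law; the image width, gain, dead band, and step limit are hypothetical values, and the sign convention (positive means turn right) is an assumption.

```python
from dataclasses import dataclass


@dataclass
class PanTracker:
    """Proportional controller that keeps a recognized target near the image center."""
    image_width: int = 640   # assumed sensor width in pixels
    gain: float = 0.05       # assumed degrees of pan per pixel of horizontal error
    dead_band: int = 10      # errors of this many pixels or fewer are ignored
    max_step: float = 15.0   # assumed limit on a single pan command, in degrees

    def pan_command(self, target_x: float) -> float:
        """Return a signed pan step in degrees (positive = turn right)."""
        error = target_x - self.image_width / 2
        if abs(error) <= self.dead_band:
            return 0.0
        step = self.gain * error
        # Clamp so one command never exceeds the step limit.
        return max(-self.max_step, min(self.max_step, step))


tracker = PanTracker()
for x in (320, 400, 620, 20):
    print(x, tracker.pan_command(x))
```

Each returned step would be sent as a command signal to the drive unit, which drives the turning motor so that the camera follows the target.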

請求項２９に記載の発明では、請求項１乃至請求項２８のいずれか１つの携帯電話システムにおいて、さらに作動信号によって作動する作動手段を具えた作動体の作動手段に、作動信号を出力する作動信号出力手段が被対話体およびサーバ用コンピュータの少なくとも１つに搭載されており、
作動手段と作動信号出力手段との間が無線および有線のいずれか１つにより接続されていることを特徴とする。 According to a twenty-ninth aspect of the present invention, in the mobile phone system according to any one of the first to twenty-eighth aspects, actuating signal output means is mounted on at least one of the interactee and the server computer, the actuating signal output means outputting an actuating signal to the actuating means of an actuated body that is operated by the actuating signal.
The actuating means and the actuating signal output means are connected by one of wireless and wired.

これによれば、可動ユニットを用いずに、作動信号出力手段から出力された作動信号により、直接、作動体の作動手段を作動させる高度な携帯電話システムを提供できる。 According to this, it is possible to provide an advanced mobile phone system that directly operates the operating means of the operating body by the operating signal output from the operating signal output means without using the movable unit.

請求項３０に記載の発明では、請求項１乃至請求項２９のいずれか１つにおいて、被対話体が人形、ぬいぐるみ、玩具のいずれか１つで構成されていることを特徴とする。 According to a thirtieth aspect of the present invention, in any one of the first to twenty-ninth aspects, the interactee is formed of any one of a doll, a stuffed toy, and a toy.

これによれば、人と、人形、ぬいぐるみ、玩具のいずれか１つとが音声対話を行う高度な携帯電話システムを提供できる。また、上述した請求項１乃至請求項３０のいずれか１つの手段の後に説明した作用、効果の「被対話体」を「人形」、「ぬいぐるみ」、「玩具」のいずれかに置き換えた効果を得ることができる。また、被対話体が人形、ぬいぐるみ、玩具のいずれか１つで構成されているので、親しみがわきやすい。 According to this, it is possible to provide an advanced mobile phone system in which a person and any one of a doll, a stuffed toy, and a toy hold a voice conversation. It is also possible to obtain the actions and effects described after each of the means of claims 1 to 30 above, with "interactee" replaced by "doll", "stuffed toy", or "toy". In addition, since the interactee is formed of any one of a doll, a stuffed toy, and a toy, the user readily feels an affinity for it.

（第１実施形態） (First embodiment)
最初に、以下の説明で用いる用語について説明する。人の音声とは、人が発する音である。発音とは、携帯電話システムから人に発する音である。 First, the terms used in the following description are defined. A person's voice is a sound uttered by a person. Pronunciation is a sound emitted from the mobile phone system to a person.

以下具体的に説明する。図１は携帯電話システム１００の外観図を、図２は可動携帯電話体３００の外観図を、図３は携帯電話システム１００のブロック図を、図４は、可動ユニット１５Ｂの正面断面図を示す。図１に示すように、第１実施形態における携帯電話システム１００は、携帯電話１１、サーバ１３、可動ユニット１５Ａ、可動ユニット１５Ｂを備えている。サーバ１３は、本発明のサーバ用コンピュータを構成する。 A specific description follows. FIG. 1 is an external view of the mobile phone system 100, FIG. 2 is an external view of the movable mobile phone body 300, FIG. 3 is a block diagram of the mobile phone system 100, and FIG. 4 is a front sectional view of the movable unit 15B. As shown in FIG. 1, the mobile phone system 100 according to the first embodiment includes a mobile phone 11, a server 13, a movable unit 15A, and a movable unit 15B. The server 13 constitutes the server computer of the present invention.

携帯電話１１は、図２に示すように、可動ユニット１５Ａに取り付け自在、取り外し自在に構成されており、図３に示すように、マイク１７、音声出力ボード１９、スピーカ２１、音声信号変調送信手段２３、発音信号受信復調手段２５を備えている。なお、第１実施形態では、携帯電話１１および可動ユニット１５Ａからなるものを可動携帯電話体３００と称するものとする。携帯電話１１、可動ユニット１５Ａは、後述するサーバ１３と有線または無線により接続されており、後述するように、指令信号、音声信号、発音信号等を送受信することができるように構成されている。上記無線は、インターネット回線、電話回線を用いたものであってもよい。上記可動携帯電話体３００は、本発明の被対話体を構成する。 As shown in FIG. 2, the mobile phone 11 is configured to be attachable to and detachable from the movable unit 15A. As shown in FIG. 3, it includes a microphone 17, an audio output board 19, a speaker 21, audio signal modulation / transmission means 23, and sound signal reception / demodulation means 25. In the first embodiment, the combination of the mobile phone 11 and the movable unit 15A is referred to as a movable mobile phone body 300. The mobile phone 11 and the movable unit 15A are connected to a server 13 (described later) by wire or wirelessly and are configured to transmit and receive command signals, audio signals, sound signals, and the like, as described later. The wireless connection may use an Internet line or a telephone line. The movable mobile phone body 300 constitutes the interactee of the present invention.

マイク１７は、人の音声を音声信号に変換して出力する。上記マイク１７は、本発明の音声変換手段を構成する。 The microphone 17 converts a human voice into a voice signal and outputs the voice signal. The microphone 17 constitutes a sound conversion means of the present invention.

音声出力ボード１９は、発音信号受信復調手段２５で受信、復調された発音信号を所定の電圧に変換して出力する。 The sound output board 19 converts the sound signal received and demodulated by the sound signal reception demodulation means 25 into a predetermined voltage and outputs it.

スピーカ２１は、音声出力ボード１９から出力された電圧を音に変換して発音する。上記スピーカ２１は、本発明の発音手段を構成する。 The speaker 21 converts the voltage output from the audio output board 19 into sound and generates a sound. The speaker 21 constitutes the sounding means of the present invention.

音声信号変調送信手段２３は、マイク１７により変換された音声信号を電波、光波、超音波のいずれかに変調してサーバ１３に搭載された音声信号受信復調手段６３に送信する。 The audio signal modulation / transmission means 23 modulates the audio signal converted by the microphone 17 into any one of radio waves, light waves, and ultrasonic waves, and transmits it to the audio signal reception / demodulation means 63 mounted on the server 13.

発音信号受信復調手段２５は、サーバ１３に搭載された発音信号変調送信手段６５から送信された電波、光波、超音波のいずれかを受信し、所定の発音信号に復調する。 The sound signal receiving / demodulating means 25 receives any one of radio waves, light waves, and ultrasonic waves transmitted from the sound signal modulating / transmitting means 65 mounted on the server 13 and demodulates it into a predetermined sound signal.

また、可動ユニット１５Ａは、図３に示すように、駆動部２７、上腕用モータ２９、下腕用モータ３１、ハンド用モータ３３、走行用モータ３５、旋回用モータ３７、上腕部３９、下腕部４１、ハンド４３、走行部４５、旋回部４７、ＣＣＤカメラ４９、指令信号受信復調手段５１、撮像信号変調送信手段５３、図示しない制御装置、図示しない電源を備えている。 As shown in FIG. 3, the movable unit 15A includes a drive unit 27, an upper arm motor 29, a lower arm motor 31, a hand motor 33, a traveling motor 35, a turning motor 37, an upper arm portion 39, a lower arm portion 41, a hand 43, a traveling portion 45, a turning portion 47, a CCD camera 49, command signal reception / demodulation means 51, imaging signal modulation / transmission means 53, a control device (not shown), and a power source (not shown).

駆動部２７は、後述するコントローラ５９の指令信号に基づいて、コントローラ５９の指令信号通りに、上腕用モータ２９、下腕用モータ３１、ハンド用モータ３３、走行用モータ３５、旋回用モータ３７を駆動する。 Based on a command signal from the controller 59 (described later), the drive unit 27 drives the upper arm motor 29, the lower arm motor 31, the hand motor 33, the traveling motor 35, and the turning motor 37 exactly as the command signal specifies.

上腕用モータ２９、下腕用モータ３１、ハンド用モータ３３、走行用モータ３５、旋回用モータ３７は、それぞれ、上腕部３９、下腕部４１、ハンド４３、走行部４５、旋回部４７を可動する。 The upper arm motor 29, the lower arm motor 31, the hand motor 33, the traveling motor 35, and the turning motor 37 move the upper arm portion 39, the lower arm portion 41, the hand 43, the traveling portion 45, and the turning portion 47, respectively.
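The one-to-one pairing of motors and movable parts described above can be written down as a lookup table. The sketch below models the drive unit as an object that applies a command signal to the matching part. The reference numerals come from the text, but the command format (a list of motor numeral and displacement pairs) and the class and attribute names are assumptions, since the patent does not define the signal's contents.

```python
from typing import Dict, List, Tuple

# Motor numeral -> (name, numeral) of the movable part it drives, as listed in the text.
MOTOR_TO_PART: Dict[int, Tuple[str, int]] = {
    29: ("upper arm portion", 39),
    31: ("lower arm portion", 41),
    33: ("hand", 43),
    35: ("traveling portion", 45),
    37: ("turning portion", 47),
}


class DriveUnit:
    """Stands in for drive unit 27: applies command signals to the motors."""

    def __init__(self) -> None:
        # Current displacement of each movable part, keyed by part numeral.
        self.positions: Dict[int, float] = {
            part: 0.0 for _, part in MOTOR_TO_PART.values()
        }

    def apply(self, command: List[Tuple[int, float]]) -> None:
        """Apply a hypothetical command signal: (motor numeral, displacement) pairs."""
        for motor, delta in command:
            if motor not in MOTOR_TO_PART:
                raise ValueError(f"unknown motor {motor}")
            _, part = MOTOR_TO_PART[motor]
            self.positions[part] += delta


drive = DriveUnit()
drive.apply([(29, 30.0), (37, -10.0)])  # raise the upper arm, turn the camera
print(drive.positions)
```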

上腕部３９、下腕部４１、ハンド４３、走行部４５、旋回部４７は、それぞれ、上腕用モータ２９、下腕用モータ３１、ハンド用モータ３３、走行用モータ３５、旋回用モータ３７の図示しない駆動軸に取り付けられており、上記駆動軸を駆動することで可動することができる。上記上腕部３９、下腕部４１、ハンド４３、走行部４５、旋回部４７は、人と対話を行う場合、所定の説明を行う場合、所定のコミュニケーション動作をすることができる。上記コミュニケーション動作は、コントローラ５９が駆動部２７に、所定のコミュニケーション動作をする指令信号を出力することで行われる。所定のコミュニケーション動作は、後述する動作決定部７３で決定される。上記上腕部３９、下腕部４１、ハンド４３、走行部４５、旋回部４７は、本発明の可動部を構成する。 The upper arm portion 39, the lower arm portion 41, the hand 43, the traveling portion 45, and the turning portion 47 are attached to the drive shafts (not shown) of the upper arm motor 29, the lower arm motor 31, the hand motor 33, the traveling motor 35, and the turning motor 37, respectively, and move when those drive shafts are driven. These parts can perform a predetermined communication operation when conversing with a person or when giving a predetermined explanation. The communication operation is performed when the controller 59 outputs to the drive unit 27 a command signal for that operation. The predetermined communication operation is determined by an operation determination unit 73 described later. The upper arm portion 39, the lower arm portion 41, the hand 43, the traveling portion 45, and the turning portion 47 constitute the movable part of the present invention.

また、上腕部３９、下腕部４１、ハンド４３、走行部４５、旋回部４７は、協調して、図１に示す所定の装置２００を操作することができる。上記所定の装置２００の操作は、最初に、人がマイク１７、図示しないキーボード、マウス、ペン式入力装置等を介して所定の装置２００を操作するように指示、または命令、許可する。次に、ＣＣＤカメラ４９で、所定の装置２００の操作手段２００ａを撮像し、画像認識手段で操作手段２００ａを画像認識する。次に、走行用モータ３５を駆動して走行部４５を可動させ、携帯電話１１が取り付けられた可動ユニット１５Ａを、操作手段２００ａを操作する位置に移動させる。次に、所定の装置２００の操作をするプログラムに基づいてコントローラ５９が駆動部２７に、所定の装置２００の操作をする指令信号を出力する。上記指令信号が出力されると、上腕部３９、下腕部４１、ハンド４３、走行部４５、旋回部４７は、協調して、所定の装置２００を操作する。なお、ＣＣＤカメラ４９、画像認識手段を用いずとも操作が可能である場合には、上記ＣＣＤカメラ４９、画像認識手段を用いずともよい。可動ユニット１５Ａは、上記構成に限るものではなく、例えば、走行部４５に替えて、歩行手段であってもよい。また、走行部４５を備えていなくともよく、さらに複雑な可動機構を設けてもよい。 The upper arm portion 39, the lower arm portion 41, the hand 43, the traveling portion 45, and the turning portion 47 can also cooperate to operate the predetermined device 200 shown in FIG. 1. The operation proceeds as follows. First, a person instructs, commands, or permits the operation of the predetermined device 200 via the microphone 17 or via a keyboard, mouse, pen-type input device, or the like (not shown). Next, the CCD camera 49 images the operation means 200a of the predetermined device 200, and the image recognition means recognizes the operation means 200a in the image. Next, the traveling motor 35 is driven to move the traveling portion 45, and the movable unit 15A, with the mobile phone 11 attached, is moved to a position from which the operation means 200a can be operated. Next, based on a program for operating the predetermined device 200, the controller 59 outputs to the drive unit 27 a command signal for operating the device. When the command signal is output, the upper arm portion 39, the lower arm portion 41, the hand 43, the traveling portion 45, and the turning portion 47 cooperate to operate the predetermined device 200. If the operation is possible without the CCD camera 49 and the image recognition means, they need not be used.
The movable unit 15A is not limited to the above configuration; for example, walking means may be provided in place of the traveling portion 45. The traveling portion 45 may also be omitted, or a more complex movable mechanism may be provided.
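The operating sequence described above can be summarized as a short procedure. The sketch below is illustrative only: the function name, the `locate` callback standing in for the CCD camera plus image recognition, and the log strings are assumptions, and the real system would issue command signals rather than return text.

```python
from typing import Callable, List, Optional, Tuple

Position = Tuple[float, float]


def operate_device(
    authorized: bool,
    locate: Optional[Callable[[], Optional[Position]]],
    known_position: Optional[Position] = None,
) -> List[str]:
    """Return the ordered steps taken to operate the device, as log strings."""
    if not authorized:
        return ["idle: no instruction, command, or permission received"]
    steps = ["instruction accepted"]
    if locate is not None:
        # Camera capture followed by image recognition of the operation means.
        position = locate()
        if position is not None:
            steps.append("operation means imaged and recognized")
    else:
        # The camera may be skipped when the position is already known.
        position = known_position
    if position is None:
        steps.append("abort: operation means not found")
        return steps
    steps.append(f"travel to {position}")
    steps.append("controller sends command signal to drive unit")
    steps.append("movable parts operate the device")
    return steps


for line in operate_device(True, lambda: (1.0, 2.0)):
    print(line)
```

Passing `locate=None` together with a `known_position` corresponds to the case in the text where the camera and image recognition are not needed.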

ＣＣＤカメラ４９は、携帯電話１１の周りを撮像するもので、ＣＣＤイメージセンサ４９ａ、信号処理部４９ｂから構成されている。上記ＣＣＤイメージセンサ４９ａ、信号処理部４９ｂは、旋回部４７に搭載されている。そして、上記旋回用モータ３７が駆動することで、旋回部４７が旋回を行い、携帯電話１１の周りを撮像する。なお、第１実施形態では、信号処理部４９ｂは携帯電話１１に搭載されているが、サーバ１３に搭載されていてもよい。上記ＣＣＤイメージセンサ４９ａは本発明の撮像手段を構成する。 The CCD camera 49 images the surroundings of the mobile phone 11, and is composed of a CCD image sensor 49a and a signal processing unit 49b. The CCD image sensor 49 a and the signal processing unit 49 b are mounted on the turning unit 47. Then, when the turning motor 37 is driven, the turning unit 47 makes a turn and images around the mobile phone 11. In the first embodiment, the signal processing unit 49 b is mounted on the mobile phone 11, but may be mounted on the server 13. The CCD image sensor 49a constitutes the imaging means of the present invention.

ＣＣＤイメージセンサ４９ａは、人を含む所定の対象物から発した光をレンズなどの光学系によって撮像素子の受光平面に結合させ、その像の光による明暗を電荷の量に光電変換し、それを順次読み出して電気信号に変換するものであって、可動ユニット１５Ａの周囲を撮像し、電気信号に変換している。 The CCD image sensor 49a uses an optical system such as a lens to form, on the light receiving plane of the image pickup element, an image of the light coming from a predetermined object including a person, photoelectrically converts the brightness of that image into amounts of electric charge, and sequentially reads out the charges as an electric signal. In this way it images the surroundings of the movable unit 15A and converts them into an electric signal.

また、信号処理部４９ｂは、ＣＣＤイメージセンサ４９ａによって変換された電気信号を所定の撮像信号に処理する。上記信号処理部４９ｂで認識された認識信号は、撮像信号変調送信手段５３により、電波、光波、超音波のいずれかに変調され、サーバ１３に設けられた撮像信号受信復調手段６７により、所定の認識信号に復調される。そして、サーバ１３の画像認識手段で所定の処理がなされる。 The signal processing unit 49b processes the electrical signal converted by the CCD image sensor 49a into a predetermined imaging signal. The recognition signal recognized by the signal processing unit 49b is modulated into one of radio waves, light waves, and ultrasonic waves by the imaging signal modulation / transmission means 53, and a predetermined signal is received by the imaging signal reception / demodulation means 67 provided in the server 13. Demodulated into a recognition signal. Then, predetermined processing is performed by the image recognition means of the server 13.

上記画像認識手段では、可動ユニット１５Ａの周囲を撮像した撮像信号から人を含む所定の対象物の特徴点を抽出し、認識を行っている。ＣＰＵボード５７のＣＰＵは上記画像認識手段で認識された結果に基づいて、対話処理部７１、動作決定部７３を制御する。なお、ＣＣＤイメージセンサ４９ａで撮像され、信号処理部４９ｂで処理された画像は、後述する画像モニタ７９ａにより表示することができる。 In the image recognition means, feature points of a predetermined object including a person are extracted from an imaging signal obtained by imaging the periphery of the movable unit 15A, and recognition is performed. The CPU of the CPU board 57 controls the dialogue processing unit 71 and the operation determining unit 73 based on the result recognized by the image recognition means. An image captured by the CCD image sensor 49a and processed by the signal processing unit 49b can be displayed on an image monitor 79a described later.

なお、信号処理部４９ｂが、サーバ１３側に設けられた場合は、ＣＣＤイメージセンサ４９ａに撮像された撮像データを撮像信号変調送信手段５３により、電波、光波、超音波のいずれかに変調し、サーバ１３に設けられた撮像信号受信復調手段６７により、所定の撮像データに復調して信号処理部４９ｂに送信するようにしてもよい。 When the signal processing unit 49b is provided on the server 13 side, the image data captured by the CCD image sensor 49a is modulated to any one of radio waves, light waves, and ultrasonic waves by the imaging signal modulation transmission unit 53. The imaging signal reception demodulating unit 67 provided in the server 13 may demodulate to predetermined imaging data and transmit it to the signal processing unit 49b.

指令信号受信復調手段５１は、サーバ１３に搭載された指令信号変調送信手段６１から送信された電波、光波、超音波のいずれかを受信し、所定の指令信号に復調する。 The command signal receiving / demodulating means 51 receives any one of radio waves, light waves and ultrasonic waves transmitted from the command signal modulation / transmitting means 61 mounted on the server 13 and demodulates them into a predetermined command signal.

音声信号変調送信手段２３は、マイク１７から出力された音声信号を電波、光波、超音波のいずれかに変調し、音声信号受信復調手段６３に送信する。 The audio signal modulation / transmission means 23 modulates the audio signal output from the microphone 17 into one of a radio wave, a light wave, and an ultrasonic wave, and transmits it to the audio signal reception / demodulation means 63.

発音信号受信復調手段２５は、発音信号変調送信手段６５から送信された電波、光波、超音波のいずれかを受信し、所定の発音信号に復調する。 The sound signal receiving / demodulating means 25 receives any one of radio waves, light waves and ultrasonic waves transmitted from the sound signal modulating / transmitting means 65 and demodulates them into a predetermined sound signal.

撮像信号変調送信手段５３は、ＣＣＤカメラ４９の信号処理部４９ｂから出力された撮像信号を電波、光波、超音波のいずれかに変調し、撮像信号受信復調手段６７に送信する。 The imaging signal modulation transmission unit 53 modulates the imaging signal output from the signal processing unit 49 b of the CCD camera 49 into one of radio wave, light wave, and ultrasonic wave, and transmits the modulated signal to the imaging signal reception demodulation unit 67.

次に、サーバ１３について説明する。上記サーバ１３は、音声認識ボード５５、ＣＰＵボード５７、コントローラ５９、指令信号変調送信手段６１、音声信号受信復調手段６３、発音信号変調送信手段６５、撮像信号受信復調手段６７、画像信号変調送信手段６９が搭載されており、図示しない電源から電気が供給されている。 Next, the server 13 will be described. The server 13 includes a voice recognition board 55, a CPU board 57, a controller 59, a command signal modulation / transmission means 61, a voice signal reception / demodulation means 63, a sound signal modulation / transmission means 65, an imaging signal reception / demodulation means 67, and an image signal modulation / transmission means. 69 is mounted and electricity is supplied from a power source (not shown).

音声認識ボード５５は、図３に示すように、音響分析部を備えており、マイク１７から入力された相手の音声を分析し、音響的特徴を抽出している。そして、音声認識エンジンで上記音響分析部で抽出された音響的特徴と、音素を単位とした音声特徴量パターンの分布の統計モデルである音響モデルとの比較照合を行うことで音声を認識し、その結果をＣＰＵボード５７に出力している。なお、第１実施形態では、音響モデルに加えて、単語間の接続関係を規定する言語モデルを備えており、連続した単語や、接頭語、接続詞を含めた文章を認識することができる。上記音声認識ボード５５は、本発明の音声認識手段を構成する。 As shown in FIG. 3, the voice recognition board 55 includes an acoustic analysis unit, which analyzes the other party's voice input from the microphone 17 and extracts acoustic features. The speech recognition engine then recognizes the speech by comparing the extracted acoustic features against an acoustic model, which is a statistical model of the distribution of speech feature patterns in units of phonemes, and outputs the result to the CPU board 57. In the first embodiment, a language model that defines the connection relationships between words is provided in addition to the acoustic model, so that sentences containing consecutive words, prefixes, and conjunctions can be recognized. The voice recognition board 55 constitutes the voice recognition means of the present invention.
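To show why a language model helps alongside an acoustic model, the sketch below rescores candidate word sequences by adding a bigram language score to an acoustic score. This is a generic textbook-style combination, not the patent's stated method: the bigram table, the probability values, the acoustic scores, and the weighting parameter are all invented for illustration.

```python
import math
from typing import Dict, List, Tuple

# Hypothetical bigram language model: log P(word | previous word).
BIGRAM_LOGP: Dict[Tuple[str, str], float] = {
    ("<s>", "turn"): math.log(0.5),
    ("turn", "on"): math.log(0.6),
    ("turn", "own"): math.log(0.01),
    ("on", "light"): math.log(0.4),
    ("own", "light"): math.log(0.05),
}
UNSEEN_LOGP = math.log(1e-4)  # assumed floor for word pairs missing from the table


def language_score(words: List[str]) -> float:
    """Sum the bigram log-probabilities over the word sequence."""
    prev, total = "<s>", 0.0
    for w in words:
        total += BIGRAM_LOGP.get((prev, w), UNSEEN_LOGP)
        prev = w
    return total


def recognize(candidates: List[Tuple[List[str], float]], lm_weight: float = 1.0) -> List[str]:
    """Pick the candidate with the best acoustic score plus weighted language score."""
    best = max(candidates, key=lambda c: c[1] + lm_weight * language_score(c[0]))
    return best[0]


# Each candidate pairs a word sequence with an assumed acoustic log-likelihood.
candidates = [
    (["turn", "own", "light"], -10.0),  # slightly better acoustic match
    (["turn", "on", "light"], -10.5),   # far more plausible word sequence
]
print(recognize(candidates))
```

With the language model switched off (`lm_weight=0.0`) the acoustically closer but implausible sequence would be chosen, which is the failure a word-connection model is meant to prevent.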

ＣＰＵボード５７には、ＣＰＵの他にＲＡＭおよびＲＯＭからなるメモリが搭載されており、上記メモリに対話処理プログラム、動作決定プログラム、発音情報、画像情報が記憶されている。なお、以下の説明では、対話処理プログラムおよび上記対話処理プログラムが記憶される所定のメモリ領域を対話処理部７１、動作決定プログラムおよび上記動作決定プログラムが記憶される所定のメモリ領域を動作決定部７３、発音情報および発音情報が記憶される所定のメモリ領域を発音情報記憶部７５、画像情報および画像情報が記憶される所定のメモリ領域を画像情報記憶部７７と称するものとする。 In addition to the CPU, the CPU board 57 carries a memory consisting of RAM and ROM, in which a dialogue processing program, an operation determination program, pronunciation information, and image information are stored. In the following description, the dialogue processing program together with the memory area that stores it is referred to as the dialogue processing unit 71, the operation determination program together with the memory area that stores it as the operation determination unit 73, the pronunciation information together with the memory area that stores it as the pronunciation information storage unit 75, and the image information together with the memory area that stores it as the image information storage unit 77.

コントローラ５９は、上述した上腕部３９、下腕部４１、ハンド４３、走行部４５、旋回部４７が動作決定部７３によって決定された動作となるように、駆動部２７に動作の指令信号を出す。 The controller 59 issues an operation command signal to the drive unit 27 so that the above-described upper arm 39, lower arm 41, hand 43, travel unit 45, and turning unit 47 perform the operations determined by the operation determining unit 73. .

指令信号変調送信手段６１は、コントローラ５９から送信された動作信号を、電波、光波、超音波のいずれかに変調し、指令信号受信復調手段５１に送信をする。 The command signal modulation / transmission means 61 modulates the operation signal transmitted from the controller 59 into one of radio waves, light waves, and ultrasonic waves, and transmits it to the command signal reception / demodulation means 51.

音声信号受信復調手段６３は、音声信号変調送信手段２３によって電波、光波、超音波のいずれかに変調された音声信号を受信し、所定の音声信号に復調する。 The audio signal receiving / demodulating unit 63 receives the audio signal modulated by the audio signal modulation / transmitting unit 23 into one of radio waves, light waves, and ultrasonic waves, and demodulates it into a predetermined audio signal.

発音信号変調送信手段６５は、発音信号を電波、光波、超音波のいずれかに変調し、発音信号受信復調手段２５に送信する。 The sound generation signal modulation / transmission means 65 modulates the sound generation signal into one of radio waves, light waves, and ultrasonic waves, and transmits it to the sound generation signal reception / demodulation means 25.

撮像信号受信復調手段６７は、撮像信号変調送信手段５３から出力された電波、光波、超音波のいずれかを受信し、所定の撮像信号に復調する。 The imaging signal receiving / demodulating means 67 receives any one of the radio wave, light wave and ultrasonic wave output from the imaging signal modulation / transmitting means 53 and demodulates it into a predetermined imaging signal.

画像信号変調送信手段６９は、画像信号を電波、光波、超音波のいずれかに変調し、画像情報受信復調手段８１に送信する。 The image signal modulation / transmission means 69 modulates the image signal into one of radio waves, light waves, and ultrasonic waves, and transmits it to the image information reception / demodulation means 81.
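The specification names the carriers (radio waves, light waves, or ultrasonic waves) but does not state which modulation scheme the modulation / transmission and reception / demodulation means use. Purely as an illustration of one modulate-then-demodulate round trip, the sketch below assumes binary on-off keying of a sine carrier; the scheme, the constants, and the function names `modulate` and `demodulate` are all assumptions and are not taken from the specification.

```python
import math

SAMPLES_PER_BIT = 16         # assumed number of samples used for each bit
CARRIER_CYCLES_PER_BIT = 4   # assumed number of carrier cycles inside each bit


def modulate(bits):
    """On-off keying: emit the carrier for a 1 bit and silence for a 0 bit."""
    samples = []
    for bit in bits:
        for n in range(SAMPLES_PER_BIT):
            phase = 2 * math.pi * CARRIER_CYCLES_PER_BIT * n / SAMPLES_PER_BIT
            samples.append(bit * math.sin(phase))
    return samples


def demodulate(samples):
    """Recover bits by comparing the energy of each bit period with a threshold."""
    threshold = SAMPLES_PER_BIT / 4  # half the energy of a full carrier burst
    bits = []
    for start in range(0, len(samples), SAMPLES_PER_BIT):
        chunk = samples[start:start + SAMPLES_PER_BIT]
        energy = sum(s * s for s in chunk)
        bits.append(1 if energy > threshold else 0)
    return bits


command = [1, 0, 1, 1, 0, 0, 1]
received = demodulate(modulate(command))
```

The sketch ignores noise, synchronisation, and the physical medium, so it only shows the pairing of a transmitting means with its matching receiving means, not a usable link.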

対話処理部７１は、音声認識ボード５５により認識された音声に基づいて、相手に対して応答する音声を決定する。上記対話処理部７１で決定された発音は、発音信号変調送信手段６５、発音信号受信復調手段２５を経由し、音声出力ボード１９で所定の電圧に変換され、スピーカ２１で発音される。なお、上記対話処理部７１は携帯電話システム１００自らが発音する機能も有している。上記対話処理部７１は、本発明の対話制御手段を構成する。 The dialogue processing unit 71 determines the voice with which to respond to the other party, based on the speech recognized by the voice recognition board 55. The sound determined by the dialogue processing unit 71 passes through the sound generation signal modulation / transmission means 65 and the sound generation signal reception / demodulation means 25, is converted into a predetermined voltage by the sound output board 19, and is output from the speaker 21. The dialogue processing unit 71 also has a function that lets the mobile phone system 100 produce sound on its own initiative. The dialogue processing unit 71 constitutes the dialogue control means of the present invention.

動作決定部７３は、上記ＣＣＤカメラ４９で人を含む所定の対象物を認識した際、対話の際、あるいは可動ユニット１５Ａが自ら発音する際、可動ユニット１５Ａがコミュニケーションを行う際、所定の装置を操作する際の動作を決定する。 The operation determining unit 73 determines the operation to be performed when the CCD camera 49 recognizes a predetermined object including a person, during a dialogue, when the movable unit 15A produces sound on its own, when the movable unit 15A communicates, and when a predetermined device is operated.

発音情報記憶部７５は、所定の音声情報を記憶する。人がマイク１７、図示しないキーボード、マウス、ペン式入力装置等を介して所定の発音情報を要求した場合、人がマイク１７、図示しないキーボード、マウス、ペン式入力装置等を介して所定の発音情報を許可した場合、所定の発音情報を用いて可動ユニット１５Ａが自ら発音する場合に、発音情報記憶部７５から所定の発音情報を読み出して、スピーカ２１から発音する。 The pronunciation information storage unit 75 stores predetermined audio information. When a person requests predetermined pronunciation information via the microphone 17 or via a keyboard, mouse, pen-type input device, or the like (not shown), when a person permits predetermined pronunciation information via those same inputs, or when the movable unit 15A produces sound on its own using predetermined pronunciation information, the predetermined pronunciation information is read from the pronunciation information storage unit 75 and output from the speaker 21.

画像情報記憶部７７は、所定の画像情報を記憶する。人がマイク１７、図示しないキーボード、マウス、ペン式入力装置等を介して所定の画像情報を要求した場合、人がマイク１７、図示しないキーボード、マウス、ペン式入力装置等を介して所定の画像情報を許可した場合、所定の画像情報を用いて可動ユニット１５Ａが自ら発音する場合に、画像情報記憶部７７から所定の画像情報を読み出して、画像モニタ７９ａに表示する。 The image information storage unit 77 stores predetermined image information. When a person requests predetermined image information via the microphone 17 or via a keyboard, mouse, pen-type input device, or the like (not shown), when a person permits predetermined image information via those same inputs, or when the movable unit 15A produces sound on its own using predetermined image information, the predetermined image information is read from the image information storage unit 77 and displayed on the image monitor 79a.

また、上記携帯電話１１および上記サーバ１３とは別体に、画像表示装置７９が設けられている。画像表示装置７９は、画像を表示する画像モニタ７９ａと、画像信号変調送信手段６９から送信された電波、光波、超音波のいずれかを受信して所定の画像情報に復調する画像情報受信復調手段８１とが設けられている。上記画像モニタ７９ａは、本発明の画像表示手段を構成する。 Further, an image display device 79 is provided separately from the mobile phone 11 and the server 13. The image display device 79 includes an image monitor 79a for displaying an image, and an image information receiving / demodulating unit that receives any one of radio waves, light waves, and ultrasonic waves transmitted from the image signal modulation / transmitting unit 69 and demodulates them into predetermined image information. 81 is provided. The image monitor 79a constitutes the image display means of the present invention.

上記発音情報とは、発音により人に伝達する情報であって、言葉、音楽、所定の音を含む。また、画像情報とは、人に対して表示する情報であって、静止画像、動画像、文字、所定の光を含む。 The pronunciation information is information transmitted to a person by pronunciation, and includes words, music, and predetermined sounds. The image information is information displayed to a person and includes a still image, a moving image, characters, and predetermined light.

なお、発音情報記憶部７５、画像情報記憶部７７は、ＣＰＵボード５７の外側に配置してもよく、携帯電話１１に配置してもよい。 Note that the pronunciation information storage unit 75 and the image information storage unit 77 may be arranged outside the CPU board 57 or may be arranged in the mobile phone 11.

また、画像モニタ７９ａには、人の眉毛、目、口を真似て表情を表示するようにしてもよい。上記表情とは、例えば、普通の表情、笑った表情、泣いた表情、怒った表情等などで、対話処理部７１で決定された対話内容に基づいて、図示しない表情決定部により表情を決定する。 The image monitor 79a may also display facial expressions that imitate human eyebrows, eyes, and mouth. The facial expressions are, for example, a neutral expression, a laughing expression, a crying expression, and an angry expression, and the expression is determined by a facial expression determination unit (not shown) based on the dialogue content determined by the dialogue processing unit 71.

なお、上述したサーバ１３は、図示しないインターネットに接続自在に構成されており、上述した発音情報、画像情報を、インターネット上の所定の記憶場所からダウンロードできるように構成されている。 The server 13 described above is configured to be freely connected to the Internet (not shown), and is configured to be able to download the above-described pronunciation information and image information from a predetermined storage location on the Internet.

次に、可動ユニット１５Ｂについて説明する。可動ユニット１５Ｂは、図１に示すように、無線により、所定の装置２０１の操作手段２０１ａをオン／オフするもので、第１実施形態では、駆動部８３、ソレノイド８５、プッシャ８７、指令信号受信復調手段８９を備えており、図示しない電源から電気が供給されている。上記駆動部８３は、図３に示すように、指令信号受信復調手段８９で受信復調された動作の指令信号を受信すると、ソレノイド８５に通電し、プッシャ８７を可動する。 Next, the movable unit 15B will be described. As shown in FIG. 1, the movable unit 15B wirelessly turns on / off the operation means 201a of a predetermined device 201. In the first embodiment, the movable unit 15B receives a drive unit 83, a solenoid 85, a pusher 87, and a command signal. Demodulating means 89 is provided, and electricity is supplied from a power source (not shown). As shown in FIG. 3, the drive unit 83 energizes the solenoid 85 and moves the pusher 87 when it receives the command signal of the operation received and demodulated by the command signal receiving and demodulating means 89.

なお、可動ユニット１５Ｂは、上記構成に限るものではなく、種々の形態が考えられる。例えば、可動ユニット１５Ａと同様に、複数の可動部と、複数の可動部をそれぞれ可動するモータが搭載されていてもよく、上記複数の可動部、上記モータに加え、モータを駆動する駆動部が搭載されていてもよい。 The movable unit 15B is not limited to the above configuration, and various forms are conceivable. For example, similarly to the movable unit 15A, a plurality of movable units and a motor that can move the plurality of movable units may be mounted. In addition to the plurality of movable units and the motor, a drive unit that drives the motor may be provided. It may be installed.

ここで、携帯電話システム１００の対話動作について説明する。人が携帯電話１１に発声すると、周囲音とともに、その音声が携帯電話１１に搭載されたマイク１７で音声信号に変換される。そして、変換された音声信号が、音声信号変調送信手段２３、音声信号受信復調手段６３を経由して音声認識ボード５５に送信される。上記音声認識ボード５５では、マイク１７から入力された相手の音声を分析し、音響的特徴を抽出、音声認識エンジンで上記音響分析部で抽出された音響的特徴と、音素を単位とした音声特徴量パターンの分布の統計モデルである音響モデルとの比較照合を行うことで音声を認識し、その結果をＣＰＵボード５７に出力する。 The dialogue operation of the mobile phone system 100 will now be described. When a person speaks to the mobile phone 11, the voice, together with the ambient sound, is converted into an audio signal by the microphone 17 mounted on the mobile phone 11. The converted audio signal is transmitted to the voice recognition board 55 via the audio signal modulation / transmission means 23 and the audio signal reception / demodulation means 63. The voice recognition board 55 analyzes the voice of the other party input from the microphone 17 and extracts acoustic features; the speech recognition engine then recognizes the speech by matching the acoustic features extracted by the acoustic analysis unit against the acoustic model, which is a statistical model of the distribution of speech feature patterns in units of phonemes, and outputs the result to the CPU board 57.

その際、あるいは、その前後、ＣＣＤカメラ４９は、旋回して人を捜すことができる。可動ユニット１５Ａに備えられた旋回用モータ３７および上記旋回用モータ３７に搭載されたＣＣＤカメラ４９が旋回して人を捜すように、コントローラ５９が駆動部２７に動作の指令信号を出力する。そして、ＣＣＤカメラ４９が携帯電話１１の周囲を撮像し、ＣＣＤイメージセンサ４９ａによって変換された電気信号から人を含む所定の対象物の特徴点を抽出して認識を行う。そして、上記人が移動すると、人を追跡するように旋回用モータ３７および上記旋回用モータ３７に搭載されたＣＣＤカメラ４９が旋回する。 At that time, or before and after that, the CCD camera 49 can turn to search for a person. The controller 59 outputs an operation command signal to the drive unit 27 so that the turning motor 37 provided in the movable unit 15A and the CCD camera 49 mounted on the turning motor 37 turn to search for a person. Then, the CCD camera 49 captures an image of the periphery of the mobile phone 11 and extracts and recognizes feature points of a predetermined object including a person from the electrical signal converted by the CCD image sensor 49a. When the person moves, the turning motor 37 and the CCD camera 49 mounted on the turning motor 37 turn so as to track the person.

次に、対話処理部７１は、音声認識ボード５５により認識された音声に基づいて、相手に対して応答する音声を決定する。上記対話処理部７１で決定された音声は、発音信号変調送信手段６５、発音信号受信復調手段２５を経由し、音声出力ボード１９で所定の電圧に変換され、スピーカ２１で発音される。その際、上記対話処理部７１で決定された音声の内容に応じて、動作決定部７３で動作を決定し、人に対してコミュニケーション動作をするように、コントローラ５９が駆動部２７に動作の指令信号を出力する。 Next, the dialogue processing unit 71 determines a voice to respond to the other party based on the voice recognized by the voice recognition board 55. The sound determined by the dialogue processing unit 71 is converted into a predetermined voltage by the sound output board 19 via the sound signal modulation / transmission means 65 and the sound signal reception / demodulation means 25, and is sounded by the speaker 21. At that time, in accordance with the content of the voice determined by the dialog processing unit 71, the operation determining unit 73 determines the operation, and the controller 59 instructs the driving unit 27 to perform the operation of communication. Output a signal.
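The step in which the dialogue processing unit 71 chooses a reply and the operation determining unit 73 chooses a matching communication motion can be sketched as follows. This is only an illustrative sketch: the keyword rules, the reply texts, the motion names, and the names `Turn`, `RULES`, `DEFAULT_TURN`, and `dialogue_step` are hypothetical, and the specification does not disclose how either unit reaches its decision.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Turn:
    reply: str   # text to be voiced through the speaker 21
    motion: str  # communication motion to be commanded via the controller 59


# Hypothetical keyword rules standing in for the dialogue processing unit 71
# (reply) and the operation determining unit 73 (motion).
RULES = [
    ("hello", Turn("Hello.", "wave_hand")),
    ("goodbye", Turn("Goodbye.", "bow")),
]
DEFAULT_TURN = Turn("Could you say that again?", "tilt_head")


def dialogue_step(recognized_text):
    """Pick a reply and a matching motion for one recognized utterance."""
    text = recognized_text.lower()
    for keyword, turn in RULES:
        if keyword in text:
            return turn
    return DEFAULT_TURN


turn = dialogue_step("Hello there")
```

Returning the reply and the motion together reflects the description that the motion is chosen according to the content of the voice determined by the dialogue processing unit 71.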

また、人が携帯電話１１に発声する内容が、人がマイク１７、図示しないキーボード、マウス、ペン式入力装置等を介して所定の発音情報を要求した場合、人がマイク１７、図示しないキーボード、マウス、ペン式入力装置等を介して所定の発音情報を許可した場合、所定の発音情報を用いて携帯電話１１が自ら発音する場合のいずれかには、発音情報記憶部７５から所定の発音情報を読み出して、発音手段から発音する。 In addition, when what the person conveys to the mobile phone 11 is a request for predetermined pronunciation information made via the microphone 17 or via a keyboard, mouse, pen-type input device, or the like (not shown), when it is permission for predetermined pronunciation information given via those same inputs, or when the mobile phone 11 produces sound on its own using predetermined pronunciation information, the predetermined pronunciation information is read from the pronunciation information storage unit 75 and output from the sound generation means.

また、人が携帯電話１１に発声する内容が、人がマイク１７、図示しないキーボード、マウス、ペン式入力装置等を介して所定の画像情報を要求した場合、人がマイク１７、図示しないキーボード、マウス、ペン式入力装置等を介して所定の画像情報を許可した場合、所定の画像情報を用いて画像モニタ７９ａに画像情報を表示する場合には、画像情報記憶部７７から所定の画像情報を読み出して、画像モニタ７９ａに表示する。 In addition, when what the person conveys to the mobile phone 11 is a request for predetermined image information made via the microphone 17 or via a keyboard, mouse, pen-type input device, or the like (not shown), when it is permission for predetermined image information given via those same inputs, or when image information is to be displayed on the image monitor 79a using predetermined image information, the predetermined image information is read from the image information storage unit 77 and displayed on the image monitor 79a.

また、人が携帯電話１１に発声する内容が、人の音声が所定の装置２００を操作する命令である場合、人の音声が所定の装置２００を操作する許可である場合、所定の操作入力手段により所定の装置２００を操作する場合、所定の装置２００を操作する自動実行プログラムが実行される場合、ＣＣＤカメラ４９が、所定の装置２００の操作手段２００ａを撮像し、画像認識手段が操作手段２００ａの位置を認識する。そして、走行部４５が操作手段２００ａを操作する位置に可動し、上腕部３９、下腕部４１、ハンド４３、走行部４５が所定の装置２００を操作するように、コントローラ５９が駆動部２７に指令信号を出力する。 In addition, when what the person says to the mobile phone 11 is a command to operate the predetermined device 200, when it is permission to operate the predetermined device 200, when the predetermined device 200 is operated through predetermined operation input means, or when an automatic execution program that operates the predetermined device 200 is executed, the CCD camera 49 images the operation means 200a of the predetermined device 200, and the image recognition means recognizes the position of the operation means 200a. The controller 59 then outputs command signals to the drive unit 27 so that the travel unit 45 moves to a position from which the operation means 200a can be operated and the upper arm 39, lower arm 41, hand 43, and travel unit 45 operate the predetermined device 200.

また、人が携帯電話１１に発声する内容が、人の音声が所定の装置２０１を操作する命令である場合、人の音声が所定の装置２０１を操作する許可である場合、所定の操作入力手段により所定の装置２０１を操作する場合、所定の装置２０１を操作する自動実行プログラムが実行される場合、所定の装置２０１を操作するように、コントローラ５９が指令信号を出力し、指令信号変調送信手段６１、指令信号受信復調手段８９、駆動部８３を経由し、ソレノイド８５を作動させる。 In addition, when what the person says to the mobile phone 11 is a command to operate the predetermined device 201, when it is permission to operate the predetermined device 201, when the predetermined device 201 is operated through predetermined operation input means, or when an automatic execution program that operates the predetermined device 201 is executed, the controller 59 outputs a command signal so as to operate the predetermined device 201, and the solenoid 85 is actuated via the command signal modulation / transmission means 61, the command signal reception / demodulation means 89, and the drive unit 83.
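The two operating paths described above (device 200 through the movable unit 15A, device 201 through the movable unit 15B) and their four common triggers can be summarised in a small routing sketch. The step labels reuse the element names and reference numerals of the description, while the `Trigger` enumeration and the `signal_path` function are hypothetical names introduced only for this illustration.

```python
from enum import Enum


class Trigger(Enum):
    """The four situations in which a predetermined device is operated."""
    VOICE_COMMAND = "voice is a command to operate the device"
    VOICE_PERMISSION = "voice is permission to operate the device"
    OPERATION_INPUT = "predetermined operation input means is used"
    AUTO_PROGRAM = "automatic execution program is executed"


def signal_path(device, trigger):
    """Return the ordered elements involved in operating device 200 or 201."""
    if not isinstance(trigger, Trigger):
        raise TypeError("trigger must be a Trigger")
    if device == 200:
        # Movable unit 15A: locate the operation means, then move and operate.
        return [
            "CCD camera 49 images operation means 200a",
            "image recognition means locates operation means 200a",
            "controller 59 commands drive unit 27",
            "travel unit 45 moves to the operating position",
            "upper arm 39, lower arm 41, hand 43 operate device 200",
        ]
    if device == 201:
        # Movable unit 15B: the command travels over the wireless link.
        return [
            "controller 59 outputs command signal",
            "command signal modulation / transmission means 61",
            "command signal reception / demodulation means 89",
            "drive unit 83",
            "solenoid 85 actuated",
        ]
    raise ValueError(f"unknown device: {device}")


path_201 = signal_path(201, Trigger.VOICE_COMMAND)
```

In this sketch the path depends only on the device, mirroring the description, in which all four triggers lead to the same sequence for a given device.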

また、画像モニタ７９ａに、人の眉毛、目、口を真似て表情を表示するよう設定されている場合には、対話処理部７１で決定された対話内容に基づいて、図示しない表情決定部で表情を決定し、画像モニタ７９ａ用に、普通の表情、笑った表情、泣いた表情、怒った表情等などを表示する。 Further, when the image monitor 79a is set to display facial expressions while imitating human eyebrows, eyes, and mouth, the facial expression determination unit (not shown) is based on the conversation content determined by the dialogue processing unit 71. A facial expression is determined, and an ordinary facial expression, a laughing facial expression, a crying facial expression, an angry facial expression, etc. are displayed on the image monitor 79a.

上記構成によれば、マイク１７、スピーカ２１を備えた可動ユニット１５Ａと、サーバ１３との間が有線及び無線のいずれかで接続されて、人が可動ユニット１５Ａ（または可動携帯電話体３００）と音声対話を行うことができる。また、各可動部３９、４１、４３、４５、４７の動作を司令する指令信号を、サーバ１３に備えられたコントローラ５９から、可動ユニット１５Ａに備えられた駆動部２７に出力し、この指令信号に基づいて各モータ２９、３１、３３、３５、３７を駆動することで、各可動部３９、４１、４３、４５、４７を可動することができる。なお、携帯電話１１が、可動ユニット１５Ａに取り付けられている場合、見かけ上、携帯電話１１と音声対話をしているように構成される。 According to the above configuration, the movable unit 15A, which includes the microphone 17 and the speaker 21, is connected to the server 13 either by wire or wirelessly, so that a person can hold a voice dialogue with the movable unit 15A (or the movable mobile phone body 300). In addition, command signals that direct the operation of the movable portions 39, 41, 43, 45, and 47 are output from the controller 59 provided in the server 13 to the drive unit 27 provided in the movable unit 15A, and the motors 29, 31, 33, 35, and 37 are driven based on these command signals, so that the movable portions 39, 41, 43, 45, and 47 can be moved. When the mobile phone 11 is attached to the movable unit 15A, the person appears to be holding a voice dialogue with the mobile phone 11.

また、上記構成によれば、携帯電話１１が可動ユニット１５Ａから取り外し可能に構成されているので、携帯電話１１単独で持ち歩くことができる。また、携帯電話１１が可動ユニット１５Ａと一体に構成されている場合に比較して、携帯電話１１を小さく、軽くすることができ、携帯電話１１の持ち運びを容易にすることができる。 Further, according to the above configuration, since the mobile phone 11 is configured to be removable from the movable unit 15A, the mobile phone 11 can be carried alone. In addition, the mobile phone 11 can be made smaller and lighter than when the mobile phone 11 is configured integrally with the movable unit 15A, and the mobile phone 11 can be easily carried.

また、上記構成によれば、音声認識ボード５５、対話処理部７１がサーバ１３に備えられるので、マイク１７、スピーカ２１、音声認識ボード５５、対話処理部７１が携帯電話１１（または可動ユニット１５Ａ、または可動携帯電話体３００）に搭載される場合に比べると、携帯電話１１を小さく、軽くすることができ、携帯電話１１（または可動ユニット１５Ａ、または可動携帯電話体３００）の持ち運びを容易にすることができる。 Further, according to the above configuration, the voice recognition board 55 and the dialogue processing unit 71 are provided in the server 13. Compared with the case where the microphone 17, the speaker 21, the voice recognition board 55, and the dialogue processing unit 71 are all mounted on the mobile phone 11 (or the movable unit 15A, or the movable mobile phone body 300), the mobile phone 11 can therefore be made smaller and lighter, and the mobile phone 11 (or the movable unit 15A, or the movable mobile phone body 300) can be carried more easily.

また、音声認識ボード５５、対話処理部７１がサーバ１３に備えられるので、携帯電話１１（または可動ユニット１５Ａ、または可動携帯電話体３００）を落下させた場合、あるいは水たまりに水没させた場合でも、高価な音声認識ボード５５、対話処理部７１が故障することがない。 Further, since the voice recognition board 55 and the dialogue processing unit 71 are provided in the server 13, even when the cellular phone 11 (or the movable unit 15A or the movable cellular phone body 300) is dropped or submerged in a puddle, The expensive voice recognition board 55 and dialogue processing unit 71 do not break down.

また、上記構成によれば、コントローラ５９がサーバ１３に備えられるので、コントローラ５９が携帯電話１１（または可動ユニット１５Ａ、または可動携帯電話体３００）に備えられる場合に比べると、携帯電話１１（または可動ユニット１５Ａ、または可動携帯電話体３００）を小さく、軽くすることができる。 Further, according to the above configuration, the controller 59 is provided in the server 13. Compared with the case where the controller 59 is provided in the mobile phone 11 (or the movable unit 15A, or the movable mobile phone body 300), the mobile phone 11 (or the movable unit 15A, or the movable mobile phone body 300) can therefore be made smaller and lighter.

また、コントローラ５９がサーバ１３に備えられるので、携帯電話１１（または可動ユニット１５Ａ、または可動携帯電話体３００）を落下させた場合、あるいは水たまりに水没させた場合でも、高価なコントローラ５９が故障することがない。 Further, since the controller 59 is provided in the server 13, even when the cellular phone 11 (or the movable unit 15A or the movable cellular phone body 300) is dropped or submerged in a puddle, the expensive controller 59 breaks down. There is nothing.

また、上記構成によれば、発音情報記憶部７５、画像情報記憶部７７がサーバ１３に備えられるので、発音情報記憶部７５、画像情報記憶部７７が携帯電話１１（または可動携帯電話体３００）に備えられる場合に比べると、携帯電話１１（または可動ユニット１５Ａまたは可動携帯電話体３００）を小さく、軽くすることができ、携帯電話１１の持ち運びを容易にすることができる。 Further, according to the above configuration, since the pronunciation information storage unit 75 and the image information storage unit 77 are provided in the server 13, the pronunciation information storage unit 75 and the image information storage unit 77 are included in the mobile phone 11 (or the movable mobile phone body 300). Compared with the case where the mobile phone 11 is provided, the mobile phone 11 (or the movable unit 15A or the mobile mobile phone body 300) can be made smaller and lighter, and the mobile phone 11 can be easily carried.

また、発音情報記憶部７５、画像情報記憶部７７がサーバ１３に備えられるので、携帯電話１１（または可動ユニット１５Ａまたは可動携帯電話体３００）を落下させた場合、あるいは水たまりに水没させた場合でも、高価な発音情報記憶部７５、画像情報記憶部７７が故障することがない。 Moreover, since the pronunciation information storage unit 75 and the image information storage unit 77 are provided in the server 13, even when the mobile phone 11 (or the movable unit 15A or the movable mobile phone body 300) is dropped or submerged in a puddle. The expensive pronunciation information storage unit 75 and image information storage unit 77 do not break down.

また、上記構成によれば、画像表示装置７９が、携帯電話１１（または可動ユニット１５Ａ、または可動携帯電話体３００）およびサーバ１３のいずれとも別体で構成されているので、画像表示装置７９が携帯電話１１（または可動ユニット１５Ａ、または可動携帯電話体３００）に備えられる場合に比べると、携帯電話１１（または可動ユニット１５Ａ、または可動携帯電話体３００）を小さく、軽くすることができ、携帯電話１１（または可動携帯電話体３００）の持ち運びを容易にすることができる。 Further, according to the above configuration, since the image display device 79 is configured separately from both the mobile phone 11 (or the movable unit 15A or the movable mobile phone body 300) and the server 13, the image display device 79 is Compared with the case where the mobile phone 11 (or the movable unit 15A or the movable mobile phone body 300) is provided, the mobile phone 11 (or the movable unit 15A or the movable mobile phone body 300) can be made smaller and lighter. The telephone 11 (or the movable mobile phone 300) can be easily carried.

また、画像表示装置７９が、携帯電話１１（または可動ユニット１５Ａ、または可動携帯電話体３００）、サーバ１３のいずれとも別体で構成されているので、携帯電話１１（または可動携帯電話体３００）を落下させた場合、あるいは水たまりに水没させた場合でも、高価な画像表示装置７９が故障することがない。 Since the image display device 79 is configured separately from the mobile phone 11 (or the movable unit 15A or the movable mobile phone body 300) and the server 13, the mobile phone 11 (or the movable mobile phone body 300). Even if the camera is dropped or submerged in a puddle, the expensive image display device 79 does not break down.

また、上記構成によれば、携帯電話１１、サーバ１３、可動ユニット１５Ａとが無線で接続されているので、有線の長さに制約されることなく、携帯電話１１（または可動ユニット１５Ａ、または可動携帯電話体３００）を移動することができる。 Further, according to the above configuration, since the mobile phone 11, the server 13, and the movable unit 15A are wirelessly connected, the mobile phone 11 (or the movable unit 15A or the movable unit 15A or the movable unit 15A is not limited by the length of the wire). The mobile phone body 300) can be moved.

また、上記構成によれば、人が所定の発音情報を要求した場合、人がマイク１７、図示しないキーボード、マウス、ペン式入力装置等を介して所定の発音情報を許可した場合、所定の発音情報を用いて可動ユニット１５Ａ（または可動携帯電話体３００）が自ら発音する場合のいずれかに、所定の発音情報を得ることができる高機能な携帯電話システムを提供することができる。また、人が所定の発音情報を要求した場合、人がマイク１７、図示しないキーボード、マウス、ペン式入力装置等を介して所定の発音情報を許可した場合、所定の発音情報を用いて可動ユニット１５Ａ（または可動携帯電話体３００）が自ら発音する場合のいずれかに、所定の発音情報を読み出して、スピーカ２１から発音する高度なユーザインターフェースを提供できる。さらに、発音情報記憶部がサーバ１３に搭載されているので、携帯電話１１（または、可動ユニット１５Ａ、または可動携帯電話体３００）を落下させた場合、あるいは水たまりに水没させた場合でも、発音情報記憶部７５に記憶された発音情報を損傷させることがない。 Further, according to the above configuration, when a person requests predetermined pronunciation information, when a person permits the predetermined pronunciation information via the microphone 17, a keyboard (not shown), a mouse, a pen-type input device, or the like, It is possible to provide a highly functional mobile phone system capable of obtaining predetermined pronunciation information in any case where the movable unit 15A (or the movable mobile phone body 300) pronounces itself using the information. In addition, when a person requests predetermined pronunciation information, if the person permits the predetermined pronunciation information via the microphone 17, a keyboard, a mouse, a pen-type input device, etc. (not shown), the movable unit is used using the predetermined pronunciation information. It is possible to provide an advanced user interface that reads predetermined sounding information and sounds from the speaker 21 in any case where 15A (or the movable mobile phone body 300) sounds itself. Further, since the pronunciation information storage unit is mounted on the server 13, even when the mobile phone 11 (or the movable unit 15A or the movable mobile phone body 300) is dropped or submerged in a puddle, the pronunciation information is stored. The pronunciation information stored in the storage unit 75 is not damaged.

また、上記構成によれば、人が所定の画像情報を要求した場合、人がマイク１７、図示しないキーボード、マウス、ペン式入力装置等を介して所定の画像情報を許可した場合、所定の画像情報を用いて可動ユニット１５Ａ（または可動携帯電話体３００）が自ら所定の画像を表示する場合のいずれかに、所定の画像情報を得ることができる高機能な携帯電話システムを提供することができる。また、人が所定の画像情報を要求した場合、人がマイク１７、図示しないキーボード、マウス、ペン式入力装置等を介して所定の画像情報を許可した場合、所定の画像情報を用いて可動ユニット１５Ａ（または可動携帯電話体３００）が自ら所定の画像を表示する場合のいずれかに、所定の画像情報を得ることができる高機能ユーザインターフェースを実現することができる。 According to the above configuration, when a person requests predetermined image information, when a person permits predetermined image information via the microphone 17, a keyboard (not shown), a mouse, a pen-type input device, or the like, a predetermined image information is obtained. It is possible to provide a high-performance mobile phone system capable of obtaining predetermined image information in any case where the movable unit 15A (or the movable mobile phone body 300) displays a predetermined image by itself using information. . Further, when a person requests predetermined image information, if the person permits predetermined image information via the microphone 17, a keyboard, a mouse, a pen-type input device, etc. (not shown), the movable unit is used using the predetermined image information. It is possible to realize a high-functional user interface capable of obtaining predetermined image information in any case where 15A (or movable mobile phone body 300) displays a predetermined image by itself.

また、上記構成によれば、ＣＣＤカメラ４９ａが人を含む所定の対象物を撮像し、画像認識手段が所定の対象物を認識した結果に基づいてコミュニケーション動作をし、臨場感を持って発音する高度な携帯電話システムを提供できる。また、人を含む所定の対象物を撮像し、画像認識手段が所定の対象物を認識した結果に基づいてコミュニケーション動作をし、臨場感を持って発音する高度なユーザインターフェースを実現することができる。 Further, according to the above configuration, the CCD camera 49a captures a predetermined object including a person, and the image recognition means performs a communication operation based on the result of recognizing the predetermined object, and pronounces it with a sense of presence. An advanced mobile phone system can be provided. Further, it is possible to realize an advanced user interface that captures an image of a predetermined object including a person, performs a communication operation based on a result of the image recognition unit recognizing the predetermined object, and pronounces with a sense of presence. .

また、上記構成によれば、ＣＣＤイメージセンサ４９ａ、画像認識手段により人を含む所定の対象物を認識して、人と音声対話をする高度な携帯電話システムを提供できる。また、ＣＣＤイメージセンサ４９ａ、画像認識手段により人を含む所定の対象物を認識して、人と音声対話をする高度なユーザインターフェースを実現することができる。 Further, according to the above configuration, it is possible to provide an advanced mobile phone system for recognizing a predetermined object including a person by the CCD image sensor 49a and the image recognizing means and having a voice conversation with the person. In addition, it is possible to realize an advanced user interface for recognizing a predetermined object including a person by the CCD image sensor 49a and the image recognizing means and having a voice conversation with the person.

また、上記構成によれば、人の音声が所定の装置２００を操作する指示、命令である場合、人の音声が所定の装置を操作する許可である場合、図示しないキーボード、マウス、ペン式入力装置等により所定の装置２００を操作する場合に、可動ユニット１５Ａ（または可動携帯電話体３００）が、所定の装置２００の操作位置に可動し、所定の装置２００を操作する高度な携帯電話システムを提供できる。また、人の音声が所定の装置２００を操作する指示、命令である場合、人の音声が所定の装置２００を操作する許可である場合、所定の操作入力手段により所定の装置２００を操作する場合に、各可動部３９、４１、４３、４５、４７および携帯電話１１および可動ユニット１５Ａが、所定の装置２００の操作位置に可動し、所定の装置２００を操作する高度なユーザインターフェースを実現することができる。 Further, according to the above configuration, when a human voice is an instruction or command for operating the predetermined device 200, or when a human voice is permission to operate the predetermined device, a keyboard, a mouse, and a pen-type input (not shown) When the predetermined device 200 is operated by a device or the like, an advanced mobile phone system in which the movable unit 15A (or the movable mobile phone body 300) moves to the operation position of the predetermined device 200 and operates the predetermined device 200 is provided. Can be provided. Also, when the human voice is an instruction or command for operating the predetermined device 200, when the human voice is permission to operate the predetermined device 200, when operating the predetermined device 200 by a predetermined operation input means Further, each of the movable parts 39, 41, 43, 45, 47, the mobile phone 11 and the movable unit 15A can be moved to the operation position of the predetermined device 200 to realize an advanced user interface for operating the predetermined device 200. Can do.

また、上記構成によれば、可動ユニット１５Ｂが無線により所定の装置２０１を操作する高度な携帯電話システムを提供できる。 Further, according to the above configuration, it is possible to provide an advanced mobile phone system in which the movable unit 15B operates the predetermined device 201 wirelessly.

また、所定の発音情報、画像情報がインターネット上の所定の記憶場所からダウンロード自在であるので、所定の発音情報、画像情報をインターネット上からダウンロードできる高機能な携帯電話システムを提供できる。また、所定の発音情報、画像情報をインターネット上からダウンロードできるので、発音情報記憶部７５に記憶された所定の発音情報、画像情報記憶部７７に記憶された所定の画像情報が損傷しても、直ぐに、所定の発音情報、所定の画像情報を復旧することができる。 Further, since the predetermined pronunciation information and image information can be downloaded from a predetermined storage location on the Internet, it is possible to provide a highly functional mobile phone system that can download the predetermined pronunciation information and image information from the Internet. Further, since the predetermined pronunciation information and image information can be downloaded from the Internet, even if the predetermined pronunciation information stored in the pronunciation information storage unit 75 and the predetermined image information stored in the image information storage unit 77 are damaged, Immediately, the predetermined pronunciation information and the predetermined image information can be restored.

また、上記構成によれば、携帯電話１１、可動ユニット１５Ａが、サーバ１３とインターネット、電話回線によっても接続可能であるので、例えば家庭内の限定された領域で使用するだけでなく、家庭を遠く離れた領域に、携帯電話１１のみ移動させて使用することができる。 Further, according to the above configuration, since the mobile phone 11 and the movable unit 15A can be connected to the server 13 via the Internet and a telephone line, for example, the mobile phone 11 and the movable unit 15A can be used not only in a limited area in the home but also far away from the home. Only the mobile phone 11 can be moved to a remote area for use.

（第２実施形態） (Second Embodiment)
第２実施形態および第２実施形態以降の説明では、第１実施形態の説明で用いた図１乃至図４を元に、図１乃至図４で用いた番号を用いて説明する。上記第１実施形態では、可動携帯電話体３００が本発明の被対話体を構成していたが、携帯電話１１が本発明の被対話体を構成してもよい。また、図５に示すように、携帯電話１１および可動ユニット１５Ａが一体に構成されたものが本発明の被対話体を構成してもよい。また、携帯電話１１に直接、可動部分（各可動部３９、４１、４３、４５、４７、各モータ２９、３１、３３、３５、３７）が設けられてもよい。なお、上記可動部分の構成は、これに限るものではない。 The second and subsequent embodiments are described with reference to FIGS. 1 to 4 used in the description of the first embodiment, using the reference numerals of those figures. In the first embodiment, the movable mobile phone body 300 constitutes the interactee of the present invention, but the mobile phone 11 may instead constitute the interactee of the present invention. Alternatively, as shown in FIG. 5, a unit in which the mobile phone 11 and the movable unit 15A are integrated may constitute the interactee of the present invention. The movable parts (movable portions 39, 41, 43, 45, 47 and motors 29, 31, 33, 35, 37) may also be provided directly on the mobile phone 11. The configuration of the movable parts is not limited to this.

（第３実施形態） (Third Embodiment)
第１実施形態では、マイク１７、スピーカ２１が携帯電話１１に備えられたが、可動ユニット１５Ａに備えられてもよい。また、マイク１７、スピーカ２１は、携帯電話１１の通話に使用するマイク、スピーカを用いてもよい。その場合、携帯電話１１の通信回線を用いて、直接サーバ１３と信号の送受信を行うようにしてもよい。 In the first embodiment, the microphone 17 and the speaker 21 are provided in the mobile phone 11, but they may instead be provided in the movable unit 15A. The microphone and speaker that the mobile phone 11 uses for calls may also serve as the microphone 17 and the speaker 21. In that case, signals may be exchanged directly with the server 13 over the communication line of the mobile phone 11.

（第４実施形態） (Fourth Embodiment)
第１実施形態では、音声認識ボード５５、対話処理部７１がサーバ１３に備えられたが、音声認識ボード５５、対話処理部７１のどちらか一方が可動ユニット１５Ａに設けられ、他方がサーバ１３に設けられてもよい。また、音声認識ボード５５、対話処理部７１のどちらか一方が携帯電話１１に設けられ、他方がサーバ１３に設けられてもよい。 In the first embodiment, the voice recognition board 55 and the dialogue processing unit 71 are provided in the server 13, but one of the voice recognition board 55 and the dialogue processing unit 71 may be provided in the movable unit 15A and the other in the server 13. Alternatively, one of the voice recognition board 55 and the dialogue processing unit 71 may be provided in the mobile phone 11 and the other in the server 13.

音声認識ボード５５が可動ユニット１５Ａに備えられ、対話処理部７１がサーバ１３に備えられている場合、対話処理部７１が可動ユニット１５Ａに備えられ、音声認識ボード５５がサーバ１３に備えられている場合には、携帯電話１１を落下させた場合に、あるいは水たまりに水没させた場合に、高価な音声認識ボード５５、対話処理部７１が故障することがない。また、音声認識ボード５５、対話処理部７１の両方が携帯電話１１に備えられている場合に比べて、携帯電話１１を小さくすることができ、持ち運びを容易にする。 When the voice recognition board 55 is provided in the movable unit 15A and the dialogue processing unit 71 in the server 13, or when the dialogue processing unit 71 is provided in the movable unit 15A and the voice recognition board 55 in the server 13, the expensive voice recognition board 55 and dialogue processing unit 71 do not break down even if the mobile phone 11 is dropped or submerged in a puddle. In addition, compared with the case where both the voice recognition board 55 and the dialogue processing unit 71 are provided in the mobile phone 11, the mobile phone 11 can be made smaller and is easier to carry.

また、音声認識ボード５５が携帯電話１１に備えられ、対話処理部７１がサーバ１３に備えられている場合には、携帯電話１１を落下させた場合に、高価な対話処理部７１が故障することがない。また、対話処理部７１が携帯電話１１に備えられ、音声認識ボード５５がサーバ１３に備えられている場合には、携帯電話１１を落下させた場合に、高価な音声認識ボード５５が故障することがない。また、音声認識ボード５５、対話処理部７１の両方が携帯電話１１に備えられている場合に比べて、携帯電話１１を小さくすることができ、持ち運びを容易にする。 Further, when the voice recognition board 55 is provided in the mobile phone 11 and the dialogue processing unit 71 is provided in the server 13, the expensive dialogue processing unit 71 does not fail even if the mobile phone 11 is dropped. Likewise, when the dialogue processing unit 71 is provided in the mobile phone 11 and the voice recognition board 55 is provided in the server 13, the expensive voice recognition board 55 does not fail even if the mobile phone 11 is dropped. In addition, the mobile phone 11 can be made smaller and easier to carry than when both the voice recognition board 55 and the dialogue processing unit 71 are provided in the mobile phone 11.

なお、音声認識ボード５５、対話処理部７１の両方が、携帯電話１１に備えられてもよく、音声認識ボード５５、対話処理部７１の両方が、可動ユニット１５Ａに備えられてもよい。対話処理部７１の両方が、可動ユニット１５Ａに備えられた場合には、携帯電話１１を落下させた場合、あるいは水たまりに水没させた場合でも、音声認識ボード５５、対話処理部７１の両方を損傷させることがない。 Note that both the voice recognition board 55 and the dialogue processing unit 71 may be provided in the mobile phone 11, or both may be provided in the movable unit 15A. When both are provided in the movable unit 15A, neither the voice recognition board 55 nor the dialogue processing unit 71 is damaged even if the mobile phone 11 is dropped or submerged in a puddle.
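The placement variants of the fourth embodiment can be compared with a small sketch. It is only an illustration under stated assumptions: the location labels, the `components_at_risk` function, and the two example placements are hypothetical names introduced here, not part of the specification. The function lists which components sit in the body that is dropped or submerged.

```python
from typing import Dict, List

# Hypothetical location labels, named after the reference numerals in the text.
PHONE = "mobile phone 11"
UNIT = "movable unit 15A"
SERVER = "server 13"


def components_at_risk(placement: Dict[str, str], dropped: str = PHONE) -> List[str]:
    """Return the components housed in the body that was dropped or submerged."""
    return sorted(name for name, where in placement.items() if where == dropped)


# One fourth-embodiment variant: recognition in the phone, dialogue on the server.
split = {
    "voice recognition board 55": PHONE,
    "dialogue processing unit 71": SERVER,
}

# First-embodiment placement: both components on the server.
remote = {
    "voice recognition board 55": SERVER,
    "dialogue processing unit 71": SERVER,
}

print(components_at_risk(split))
print(components_at_risk(remote))
```

With the split placement only the component housed in the phone is exposed to a drop, which is the trade-off the paragraphs above describe.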

（第５実施形態）
第１実施形態、第２実施形態では、マイク１７が、携帯電話１１に設けられたが、マイク１７を携帯電話１１と別体に構成し、図示しないヘッドマイクに搭載するようにしてもよい。上記ヘッドマイクは、マイクを人の口元に配置する装置である。 (Fifth embodiment)
In the first embodiment and the second embodiment, the microphone 17 is provided in the mobile phone 11. However, the microphone 17 may be configured separately from the mobile phone 11 and mounted on a head microphone (not shown). The head microphone is a device that arranges a microphone at the mouth of a person.

上記構成によれば、人が携帯電話１１に近づかなくとも、音声をマイク１７に入力することができ、これにより、音声の認識率を向上させることができる。一般に、音声認識ボード５５で人の音声を認識する場合、周囲音、雑音等により、人の音声の認識率が低下することが知られている。このためマイク１７を複数個配置する、あるいは音響部分析部の手前にノイズ除去フィルタを配置する、などして音声の認識率を向上させる方法が考えられている。本実施形態は、上記の他に、音声の認識率を向上させるようにしたものである。 According to the above configuration, a person can input voice to the microphone 17 without approaching the mobile phone 11, which improves the voice recognition rate. In general, when the voice recognition board 55 recognizes a human voice, ambient sound, noise, and the like are known to lower the recognition rate. Methods of improving the recognition rate have therefore been considered, such as arranging a plurality of microphones 17 or placing a noise removal filter ahead of the acoustic analysis unit. This embodiment improves the voice recognition rate by a means other than those.

また、上記構成によれば、マイク１７を、携帯電話１１に設けなくともよいので、携帯電話１１にマイク１７を設けた場合に比べて、携帯電話１１（または可動ユニット１５Ａ、または可動携帯電話体３００）を小さく、軽くすることができ、携帯電話１１（または可動ユニット１５Ａ、または可動携帯電話体３００）の持ち運びを容易にすることができる。また、ヘッドマイクを使用することにより、音声認識ボード５５に音声信号が入力される際の雑音を小さくすることができる。 Further, according to the above configuration, the microphone 17 need not be provided in the mobile phone 11, so the mobile phone 11 (or the movable unit 15A, or the movable mobile phone body 300) can be made smaller and lighter than when the microphone 17 is provided in the mobile phone 11, and is easier to carry. Using the head microphone also reduces the noise present when a voice signal is input to the voice recognition board 55.

（第６実施形態）
第１実施形態では、発音情報記憶部７５がサーバ１３に備えられたが、発音情報記憶部７５が携帯電話１１、可動ユニット１５Ａのいずれかに備えられていてもよい。発音情報記憶部７５が可動ユニット１５Ａに備えられている場合には、携帯電話１１を落下させた場合に、あるいは水たまりに水没させた場合に、発音情報記憶部７５が故障することがない。また、上述のように、発音情報記憶部７５が携帯電話１１に備えられていてもよい。 (Sixth embodiment)
In the first embodiment, the pronunciation information storage unit 75 is provided in the server 13, but the pronunciation information storage unit 75 may be provided in either the mobile phone 11 or the movable unit 15A. When the pronunciation information storage unit 75 is provided in the movable unit 15A, the pronunciation information storage unit 75 does not fail when the mobile phone 11 is dropped or submerged in a puddle. Further, as described above, the pronunciation information storage unit 75 may be provided in the mobile phone 11.

さらに、発音情報記憶部７５が可動ユニット１５Ａに搭載されている場合には、携帯電話１１（または可動ユニット１５Ａ）を落下させた場合、あるいは水たまりに水没させた場合でも、発音情報記憶部７５に記憶された発音情報を損傷させることがない。 Further, when the pronunciation information storage unit 75 is mounted on the movable unit 15A, the pronunciation information stored in the pronunciation information storage unit 75 is not damaged even if the mobile phone 11 (or the movable unit 15A) is dropped or submerged in a puddle.

（第７実施形態）
第１実施形態では、可動ユニット１５Ａに、駆動部２７、各モータ２９、３１、３３、３５、３７、各可動部３９、４１、４３、４５、４７が備えられたが、携帯電話１１に、駆動部２７、各モータ２９、３１、３３、３５、３７、各可動部３９、４１、４３、４５、４７が備えられてもよい。その場合、コントローラ５９が、携帯電話１１に備えられてもよく、サーバ１３に備えられていてもよい。 (Seventh embodiment)
In the first embodiment, the movable unit 15A is provided with the drive unit 27, the motors 29, 31, 33, 35, 37, and the movable parts 39, 41, 43, 45, 47, but the drive unit 27, the motors 29, 31, 33, 35, 37, and the movable parts 39, 41, 43, 45, 47 may instead be provided in the mobile phone 11. In that case, the controller 59 may be provided in the mobile phone 11 or in the server 13.

コントローラ５９が、サーバ１３に備えられている場合、各可動部３９、４１、４３、４５、４７、各モータ２９、３１、３３、３５、３７、駆動部２７、コントローラ５９のすべてが携帯電話１１に備えられる場合に比べて、携帯電話１１を小さく、軽くすることができ、携帯電話１１の持ち運びを容易にすることができる。 When the controller 59 is provided in the server 13, the mobile phone 11 can be made smaller and lighter, and is easier to carry, than when the movable parts 39, 41, 43, 45, 47, the motors 29, 31, 33, 35, 37, the drive unit 27, and the controller 59 are all provided in the mobile phone 11.

（第８実施形態）
第１実施形態では、可動ユニット１５Ａに、駆動部２７、各モータ２９、３１、３３、３５、３７、各可動部３９、４１、４３、４５、４７が備えられ、サーバ１３にコントローラ５９が備えられたが、可動ユニット１５Ａに替えて、携帯電話１１に、駆動部２７、各モータ２９、３１、３３、３５、３７、各可動部３９、４１、４３、４５、４７が備えられ、サーバ１３にコントローラ５９が備えられてもよい。 (Eighth embodiment)
In the first embodiment, the movable unit 15A is provided with the drive unit 27, the motors 29, 31, 33, 35, 37, and the movable parts 39, 41, 43, 45, 47, and the server 13 is provided with the controller 59. Instead of the movable unit 15A, however, the mobile phone 11 may be provided with the drive unit 27, the motors 29, 31, 33, 35, 37, and the movable parts 39, 41, 43, 45, 47, with the controller 59 provided in the server 13.

上記構成によれば、コントローラ５９がサーバ１３に備えられているので、携帯電話１１を落下させた場合、あるいは水たまりに水没させた場合でも、高価なコントローラ５９が故障することがない。 According to the above configuration, since the controller 59 is provided in the server 13, even when the mobile phone 11 is dropped or submerged in a puddle, the expensive controller 59 does not break down.

（第９実施形態）
可動ユニット１５Ａに、各モータ２９、３１、３３、３５、３７、各可動部３９、４１、４３、４５、４７が備えられ、サーバ１３に駆動部２７、コントローラ５９が備えられてもよい。また、携帯電話１１に、各モータ２９、３１、３３、３５、３７、各可動部３９、４１、４３、４５、４７が備えられ、サーバ１３に駆動部２７、コントローラ５９が備えられてもよい。 (Ninth embodiment)
The movable unit 15A may include the motors 29, 31, 33, 35, 37 and the movable parts 39, 41, 43, 45, 47, and the server 13 may include the drive unit 27 and the controller 59. Alternatively, the mobile phone 11 may include the motors 29, 31, 33, 35, 37 and the movable parts 39, 41, 43, 45, 47, and the server 13 may include the drive unit 27 and the controller 59.

上記構成によれば、駆動部２７、コントローラ５９がサーバ１３に備えられるので、可動ユニット１５Ａ（または携帯電話１１）を落下させた場合、あるいは水たまりに水没させた場合でも、高価な駆動部２７、コントローラ５９が故障することがない。 According to the above configuration, since the drive unit 27 and the controller 59 are provided in the server 13, the expensive drive unit 27 and controller 59 do not fail even if the movable unit 15A (or the mobile phone 11) is dropped or submerged in a puddle.

（第１０実施形態）
可動ユニット１５Ａが、携帯電話１１、サーバ１３と別体に設けられて、駆動部２７、各モータ２９、３１、３３、３５、３７、各可動部３９、４１、４３、４５、４７、コントローラ５９を備えていてもよい。 (10th Embodiment)
The movable unit 15A may be provided separately from the mobile phone 11 and the server 13, and may include the drive unit 27, the motors 29, 31, 33, 35, 37, the movable parts 39, 41, 43, 45, 47, and the controller 59.

上記構成によれば、可動ユニット１５Ａが、携帯電話１１、サーバ１３と別体に設けられているので、携帯電話１１を落下させた場合、あるいは水たまりに水没させた場合でも、駆動部２７、各モータ２９、３１、３３、３５、３７、各可動部３９、４１、４３、４５、４７、コントローラ５９を備えた可動ユニット１５Ａが故障することがない。 According to the above configuration, since the movable unit 15A is provided separately from the mobile phone 11 and the server 13, even when the mobile phone 11 is dropped or submerged in a puddle, the drive unit 27, The movable unit 15A including the motors 29, 31, 33, 35, and 37, the movable portions 39, 41, 43, 45, and 47, and the controller 59 does not break down.

（第１１実施形態）
可動ユニット１５Ａが、携帯電話１１、サーバ１３と別体に設けられて、駆動部２７、各モータ２９、３１、３３、３５、３７、各可動部３９、４１、４３、４５、４７を備えており、携帯電話１１、サーバ１３のいずれかが、コントローラ５９を備えていてもよい。 (Eleventh embodiment)
The movable unit 15A may be provided separately from the mobile phone 11 and the server 13 and include the drive unit 27, the motors 29, 31, 33, 35, 37, and the movable parts 39, 41, 43, 45, 47, with either the mobile phone 11 or the server 13 including the controller 59.

上記構成によれば、可動ユニット１５Ａが、携帯電話１１、サーバ１３と別体に設けられて、駆動部２７、各モータ２９、３１、３３、３５、３７、各可動部３９、４１、４３、４５、４７を備えており、携帯電話１１が、コントローラ５９を備えている場合は、携帯電話１１を落下させた場合、あるいは水たまりに水没させた場合でも、駆動部２７、各モータ２９、３１、３３、３５、３７、各可動部３９、４１、４３、４５、４７が故障することがない。また、可動ユニット１５Ａが、携帯電話１１、サーバ１３と別体に設けられて、駆動部２７、各モータ２９、３１、３３、３５、３７、各可動部３９、４１、４３、４５、４７を備えており、サーバ１３が、コントローラ５９を備えている場合は、携帯電話１１を落下させた場合、あるいは水たまりに水没させた場合でも、駆動部２７、各モータ２９、３１、３３、３５、３７、各可動部３９、４１、４３、４５、４７、コントローラ５９が故障することがない。 According to the above configuration, when the movable unit 15A is provided separately from the mobile phone 11 and the server 13 and includes the drive unit 27, the motors 29, 31, 33, 35, 37, and the movable parts 39, 41, 43, 45, 47, and the mobile phone 11 includes the controller 59, the drive unit 27, the motors 29, 31, 33, 35, 37, and the movable parts 39, 41, 43, 45, 47 do not fail even if the mobile phone 11 is dropped or submerged in a puddle. Likewise, when the movable unit 15A is configured in the same way and the server 13 includes the controller 59, the drive unit 27, the motors 29, 31, 33, 35, 37, the movable parts 39, 41, 43, 45, 47, and the controller 59 do not fail even if the mobile phone 11 is dropped or submerged in a puddle.

（第１２実施形態）
可動ユニット１５Ａが、携帯電話１１、サーバ１３と別体に設けられて、各モータ２９、３１、３３、３５、３７、各可動部３９、４１、４３、４５、４７を備えており、駆動部２７が携帯電話１１およびサーバ１３のいずれかに備えられており、コントローラ５９が携帯電話１１およびサーバ１３のいずれかに備えられていてもよい。 (Twelfth embodiment)
The movable unit 15A may be provided separately from the mobile phone 11 and the server 13 and include the motors 29, 31, 33, 35, 37 and the movable parts 39, 41, 43, 45, 47, with the drive unit 27 provided in either the mobile phone 11 or the server 13 and the controller 59 provided in either the mobile phone 11 or the server 13.

上記構成によれば、各モータ２９、３１、３３、３５、３７、各可動部３９、４１、４３、４５、４７が可動ユニット１５Ａに備えられているので、携帯電話１１を落下させた場合、あるいは水たまりに水没させた場合でも、高価な各可動部３９、４１、４３、４５、４７、各モータ２９、３１、３３、３５、３７が故障することがない。また、駆動部２７、コントローラ５９のいずれかがサーバ１３に備えられている場合には、携帯電話１１を落下させた場合、あるいは水たまりに水没させた場合でも、サーバ１３に備えられた駆動部２７、コントローラ５９のいずれかが故障することがない。 According to the above configuration, since the motors 29, 31, 33, 35, 37 and the movable parts 39, 41, 43, 45, 47 are provided in the movable unit 15A, the expensive movable parts 39, 41, 43, 45, 47 and motors 29, 31, 33, 35, 37 do not fail even if the mobile phone 11 is dropped or submerged in a puddle. Further, when either the drive unit 27 or the controller 59 is provided in the server 13, whichever of them is provided in the server 13 does not fail even if the mobile phone 11 is dropped or submerged in a puddle.

なお、第８実施形態乃至第１２実施形態によれば、駆動部２７、各モータ２９、３１、３３、３５、３７、各可動部３９、４１、４３、４５、４７、コントローラ５９のすべてが携帯電話１１に備えられる場合に比べると、携帯電話１１を小さく、軽くすることができ、携帯電話１１の持ち運びを容易にすることができる。 Note that, according to the eighth to twelfth embodiments, the mobile phone 11 can be made smaller and lighter, and is easier to carry, than when the drive unit 27, the motors 29, 31, 33, 35, 37, the movable parts 39, 41, 43, 45, 47, and the controller 59 are all provided in the mobile phone 11.

（第１３実施形態）
第１実施携帯では、画像モニタ７９ａが携帯電話１１、サーバ１３、可動ユニット１５Ａのいずれとも別体に構成されたが、画像モニタ７９ａが携帯電話１１、サーバ１３、可動ユニット１５Ａのいずれかに備えられていてもよい。また、画像モニタ７９ａは、上記画像モニタ７９ａに替えて、携帯電話に備えられている画像モニタを用いてもよい。その場合、携帯電話の通信回線を用いて、直接サーバ１３と信号の送受信を行うようにしてもよい。第１実施形態と同様に、人がマイク１７、図示しないキーボード、マウス、ペン式入力装置等を介して所定の画像情報を要求した場合、人がマイク１７、図示しないキーボード、マウス、ペン式入力装置等を介して所定の画像情報を許可した場合、所定の画像情報を用いて携帯電話１１、サーバ１３、可動ユニット１５Ａのいずれかに所定の画像を表示する。 (13th Embodiment)
In the first embodiment, the image monitor 79a is configured separately from the mobile phone 11, the server 13, and the movable unit 15A, but the image monitor 79a may be provided in any of the mobile phone 11, the server 13, and the movable unit 15A. An image monitor already provided in the mobile phone may also be used in place of the image monitor 79a. In that case, signals may be exchanged directly with the server 13 over the communication line of the mobile phone. As in the first embodiment, when a person requests predetermined image information via the microphone 17 or a keyboard, mouse, pen-type input device, or the like (not shown), or when a person permits predetermined image information via the same means, a predetermined image is displayed on any of the mobile phone 11, the server 13, and the movable unit 15A using the predetermined image information.

上記構成によれば、画像モニタ７９ａがサーバ１３、可動ユニット１５Ａのいずれかに備えられている場合には、携帯電話１１を落下させた場合、あるいは水たまりに水没させた場合でも、画像モニタ７９ａを損傷させることがない。なお、上述のように、画像モニタ７９ａが携帯電話１１に搭載されていてもよく、画像モニタ７９ａは、携帯電話１１に備えられている画像モニタを用いてもよい。 According to the above configuration, when the image monitor 79a is provided in either the server 13 or the movable unit 15A, the image monitor 79a can be used even when the mobile phone 11 is dropped or submerged in a puddle. There is no damage. As described above, the image monitor 79a may be mounted on the mobile phone 11, and the image monitor 79a may be an image monitor provided on the mobile phone 11.

（第１４実施形態）
第１実施形態では画像情報記憶部７７がサーバ１３に搭載されたが、画像情報記憶部７７が携帯電話１１、可動ユニット１５Ａのいずれかに搭載されていてもよい。第１実施形態と同様に、人がマイク１７、図示しないキーボード、マウス、ペン式入力装置等を介して所定の画像情報を要求した場合、人がマイク１７、図示しないキーボード、マウス、ペン式入力装置等を介して所定の画像情報を許可した場合、所定の画像情報を用いて携帯電話１１が自ら所定の画像を表示する場合に、画像情報記憶部７７から所定の画像情報を読み出して、画像モニタ７９ａに表示する。また、上記画像情報は、第１実施形態と同様に、インターネット上の所定の記憶場所からダウンロード自在に構成されてもよい。 (14th Embodiment)
In the first embodiment, the image information storage unit 77 is mounted on the server 13, but the image information storage unit 77 may be mounted on either the mobile phone 11 or the movable unit 15A. As in the first embodiment, when a person requests predetermined image information via the microphone 17 or a keyboard, mouse, pen-type input device, or the like (not shown), when a person permits predetermined image information via the same means, or when the mobile phone 11 displays a predetermined image on its own using the predetermined image information, the predetermined image information is read from the image information storage unit 77 and displayed on the image monitor 79a. As in the first embodiment, the image information may also be made downloadable from a predetermined storage location on the Internet.

上記構成によれば、画像情報記憶部７７が携帯電話１１に搭載されている場合には、携帯電話１１を落下させた場合、あるいは水たまりに水没させた場合でも、画像情報記憶部７７を損傷させることがない。 According to the above configuration, when the image information storage unit 77 is mounted on the mobile phone 11, the image information storage unit 77 is not damaged even if the mobile phone 11 is dropped or submerged in a puddle.

また、人がマイク１７を介して所定の画像情報を要求した場合、人がマイク１７、図示しないキーボード、マウス、ペン式入力装置等を介して所定の画像情報を許可した場合、所定の画像情報を用いて携帯電話１１、可動ユニット１５Ａのいずれかが自ら所定の画像を表示する場合のいずれかに、所定の画像情報を得ることができる高機能な携帯電話システムを提供できる。また、人がマイク１７、図示しないキーボード、マウス、ペン式入力装置等を介して所定の画像情報を要求した場合、人がマイク１７を介して所定の画像情報を許可した場合、所定の画像情報を用いて携帯電話１１、可動ユニット１５Ａのいずれかが自ら所定の画像を表示する場合のいずれかに、所定の画像情報を画像情報記憶部７７から得て、画像モニタ７９ａに表示する高度なユーザインターフェースを実現することができる。 Further, it is possible to provide a highly functional mobile phone system that can obtain predetermined image information in any of the following cases: when a person requests the predetermined image information via the microphone 17; when a person permits the predetermined image information via the microphone 17 or a keyboard, mouse, pen-type input device, or the like (not shown); or when either the mobile phone 11 or the movable unit 15A displays a predetermined image on its own using the predetermined image information. It is also possible to realize an advanced user interface that, when a person requests the predetermined image information via the microphone 17 or a keyboard, mouse, pen-type input device, or the like (not shown), when a person permits it via the microphone 17, or when either the mobile phone 11 or the movable unit 15A displays a predetermined image on its own, obtains the predetermined image information from the image information storage unit 77 and displays it on the image monitor 79a.

また、所定の画像情報がインターネット上の所定の記憶場所からダウンロード自在であるので、所定の画像情報をインターネット上からダウンロードできる高機能な携帯電話システムを提供できる。また、所定の画像情報をインターネット上からダウンロードできるので、画像情報記憶部７７に記憶された所定の画像情報が損傷しても、直ぐに所定の画像情報を復旧することができる。 Further, since the predetermined image information can be downloaded from a predetermined storage location on the Internet, it is possible to provide a highly functional mobile phone system that can download the predetermined image information from the Internet. Further, since the predetermined image information can be downloaded from the Internet, even if the predetermined image information stored in the image information storage unit 77 is damaged, the predetermined image information can be restored immediately.
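The recovery behaviour described above, where damaged image information is restored by downloading it again, might be sketched as follows. This is a minimal illustration under stated assumptions: the `load_image` function, the dictionary standing in for the image information storage unit 77, and the `download` callable are all hypothetical, and the specification does not define how damage is detected or how the download is performed.

```python
from typing import Callable, Dict, Optional


def load_image(
    name: str,
    store: Dict[str, Optional[bytes]],
    download: Callable[[str], bytes],
) -> bytes:
    """Return image data from local storage, re-downloading it if missing or damaged.

    `store` stands in for the image information storage unit 77; an entry of
    None marks image information that is absent or was found to be damaged.
    `download` stands in for retrieval from the storage location on the Internet.
    """
    data = store.get(name)
    if data is None:
        data = download(name)  # restore from the network copy
        store[name] = data     # keep the restored copy for later requests
    return data
```

A caller would pass whatever retrieval function its network layer provides as `download`. Once the copy is restored, later requests are served from local storage without another download.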

（第１５実施形態）
第１実施形態では、画像認識手段がサーバ１３に搭載されていたが、画像認識手段が携帯電話１１、可動ユニット１５Ａのいずれかに搭載されてもよい。 (Fifteenth embodiment)
In the first embodiment, the image recognition unit is mounted on the server 13, but the image recognition unit may be mounted on either the mobile phone 11 or the movable unit 15A.

（第１６実施形態）
第１実施形態では、ＣＣＤイメージセンサ４９ａが、可動ユニット１５Ａに設けられたが、携帯電話１１に設けられてもよい。また、ＣＣＤイメージセンサ４９ａが、携帯電話１１、可動ユニット１５Ａと別体に構成されていてもよい。また、ＣＣＤイメージセンサ４９ａに替えて、携帯電話１１に予め設けられた撮像手段を用いてもよい。 (Sixteenth embodiment)
In the first embodiment, the CCD image sensor 49a is provided in the movable unit 15A, but may be provided in the mobile phone 11. Further, the CCD image sensor 49a may be configured separately from the mobile phone 11 and the movable unit 15A. Further, in place of the CCD image sensor 49a, an imaging unit provided in advance in the mobile phone 11 may be used.

ＣＣＤイメージセンサ４９ａが、携帯電話１１、可動ユニット１５Ａと別体に構成されている場合、携帯電話１１、可動ユニット１５Ａの配置場所に制約されることなく、所定の対象物を撮像することができる。 When the CCD image sensor 49a is configured separately from the mobile phone 11 and the movable unit 15A, a predetermined object can be imaged without being restricted by the location of the mobile phone 11 and the movable unit 15A. .

なお、ＣＣＤイメージセンサ４９ａが、可動ユニット１５Ｂに設けられていてもよい。上記構成によれば、可動ユニット１５Ｂの周囲を撮像することができる。 The CCD image sensor 49a may be provided in the movable unit 15B. According to the said structure, the periphery of the movable unit 15B can be imaged.

（第１７実施形態）
第１実施携帯では、可動ユニット１５Ａに設けられた各可動部３９、４１、４３、４５、４７が所定のコミュニケーション動作をするように、コントローラ５９が駆動部２７に指令信号を出力したが、各可動部３９、４１、４３、４５、４７が携帯電話１１に設けられて所定のコミュニケーション動作をするように、コントローラ５９が駆動部２７に指令信号を出力するようにしてもよい。 (17th Embodiment)
In the first embodiment, the controller 59 outputs a command signal to the drive unit 27 so that the movable parts 39, 41, 43, 45, 47 provided in the movable unit 15A perform a predetermined communication operation. Alternatively, the movable parts 39, 41, 43, 45, 47 may be provided in the mobile phone 11, and the controller 59 may output a command signal to the drive unit 27 so that they perform a predetermined communication operation.

（第１８実施形態）
第１実施形態では、携帯電話システム１００が、携帯電話１１、サーバ１３、可動ユニット１５Ａ、可動ユニット１５Ｂを備えていたが、携帯電話システム１００が、可動ユニット１５Ａ、可動ユニット１５Ｂを備えていなくともよい。その場合、コントローラ５９、指令信号変調送信手段６１は、サーバ１３に備えられなくてもよい。 (Eighteenth embodiment)
In the first embodiment, the mobile phone system 100 includes the mobile phone 11, the server 13, the movable unit 15A, and the movable unit 15B, but the mobile phone system 100 need not include the movable unit 15A and the movable unit 15B. In that case, the controller 59 and the command signal modulation transmission means 61 need not be provided in the server 13.

上記構成によれば、携帯電話１１が、音声認識ボード５５、対話処理部７１と別体に構成されているので、携帯電話１１を飛躍的に小型化することができる。また、携帯電話１１を落下させた場合、あるいは水たまりに水没させた場合でも、高価な音声認識ボード５５、対話処理部７１が故障することがない。 According to the above configuration, since the mobile phone 11 is configured separately from the voice recognition board 55 and the dialogue processing unit 71, the mobile phone 11 can be dramatically downsized. Even when the mobile phone 11 is dropped or submerged in a puddle, the expensive speech recognition board 55 and the dialogue processing unit 71 do not break down.

（第１９実施形態）
第１実施形態では、駆動部２７が可動ユニット１５Ａに備えられていたが、駆動部２７がサーバ１３に備えられていてもよい。 (Nineteenth embodiment)
In the first embodiment, the drive unit 27 is provided in the movable unit 15A, but the drive unit 27 may be provided in the server 13.

上記構成によれば、駆動部２７、コントローラ５９がサーバ１３に備えられるので、携帯電話１１（または可動ユニット１５Ａ、または可動携帯電話体３００）を落下させた場合、あるいは水たまりに水没させた場合でも、高価な駆動部２７、コントローラ５９が故障することがない。 According to the above configuration, since the drive unit 27 and the controller 59 are provided in the server 13, even when the cellular phone 11 (or the movable unit 15A or the movable cellular phone body 300) is dropped or submerged in a puddle. Therefore, the expensive drive unit 27 and controller 59 do not break down.

また、各可動部３９、４１、４３、４５、４７、各モータ２９、３１、３３、３５、３７、駆動部２７、コントローラ５９のすべてが携帯電話１１（または可動ユニット１５Ａ、または可動携帯電話体３００）に備えられる場合に比べると、携帯電話１１（または可動ユニット１５Ａ、または可動携帯電話体３００）を小さく、軽くすることができ、携帯電話１１（または可動ユニット１５Ａ、または可動携帯電話体３００）の持ち運びを容易にすることができる。 In addition, the mobile phone 11 (or the movable unit 15A, or the movable mobile phone body 300) can be made smaller and lighter, and is easier to carry, than when the movable parts 39, 41, 43, 45, 47, the motors 29, 31, 33, 35, 37, the drive unit 27, and the controller 59 are all provided in the mobile phone 11 (or the movable unit 15A, or the movable mobile phone body 300).

（第２０実施形態）
第１実施形態では、可動ユニット１５Ａに駆動部２７、各モータ２９、３１、３３、３５、３７、各可動部３９、４１、４３、４５、４７が備えられ、サーバ１３にコントローラ５９が備えられたが、これに替わり、可動ユニット１５Ｂに駆動部２７、各モータ２９、３１、３３、３５、３７、各可動部３９、４１、４３、４５、４７が備えられ、サーバ１３にコントローラ５９が備えられていてもよい。また、可動ユニット１５Ｂに各モータ２９、３１、３３、３５、３７、各可動部３９、４１、４３、４５、４７が備えられ、サーバ１３に駆動部２７、コントローラ５９が備えられていてもよい。 (20th embodiment)
In the first embodiment, the movable unit 15A is provided with the drive unit 27, the motors 29, 31, 33, 35, 37, and the movable parts 39, 41, 43, 45, 47, and the server 13 is provided with the controller 59. Instead, the movable unit 15B may be provided with the drive unit 27, the motors 29, 31, 33, 35, 37, and the movable parts 39, 41, 43, 45, 47, with the controller 59 provided in the server 13. Alternatively, the movable unit 15B may be provided with the motors 29, 31, 33, 35, 37 and the movable parts 39, 41, 43, 45, 47, with the drive unit 27 and the controller 59 provided in the server 13.

（第２１実施形態）
上記実施形態で説明した可動ユニット１５Ａ、可動ユニット１５Ｂ、携帯電話１１に備えられた各可動部３９、４１、４３、４５、４７が可動して、テーブルゲームを行うようにしてもよい。 (21st Embodiment)
The movable parts 39, 41, 43, 45, 47 provided in the movable unit 15A, the movable unit 15B, or the mobile phone 11 described in the above embodiments may be moved so as to play a table game.

テーブルゲームを行う場合、次のように動作する。最初に、ＣＣＤカメラ４９が、テーブルゲームの進行状況を撮像し、画像モニタ７９ａが、テーブルゲームの進行状況を画像認識する。次に、動作決定部７３で、画像認識手段により認識された進行状況から各可動部３９、４１、４３、４５、４７の次の動作を決定する。次に、各可動部３９、４１、４３、４５、４７が、動作決定部７３により決定された次の動作を実行するように、コントローラ５９が駆動部２７に指令信号を出力する。 When a table game is played, the system operates as follows. First, the CCD camera 49 captures the progress of the table game, and the image monitor 79a performs image recognition on that progress. Next, the operation determination unit 73 determines the next operation of each of the movable parts 39, 41, 43, 45, 47 from the progress recognized by the image recognition means. Then the controller 59 outputs a command signal to the drive unit 27 so that the movable parts 39, 41, 43, 45, 47 execute the next operation determined by the operation determination unit 73.
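The capture, recognize, decide, and command sequence above can be sketched as a single turn of a control loop. This is only an illustrative sketch: the `Command` type, the `play_one_turn` function, and the four callables are hypothetical names introduced here, and the specification does not describe how recognition or move selection is implemented.

```python
from dataclasses import dataclass
from typing import Any, Callable, List

# Reference numerals of the movable parts named in the text.
MOVABLE_PARTS = (39, 41, 43, 45, 47)


@dataclass
class Command:
    part: int    # reference numeral of the movable part to drive
    action: str  # hypothetical label for the motion to perform


def play_one_turn(
    capture: Callable[[], Any],                 # stands in for the CCD camera 49
    recognize: Callable[[Any], Any],            # stands in for the image recognition means
    decide: Callable[[Any], List[Command]],     # stands in for the operation determination unit 73
    send_to_drive: Callable[[Command], None],   # stands in for the controller 59 commanding the drive unit 27
) -> List[Command]:
    """Run one capture -> recognize -> decide -> command cycle."""
    image = capture()
    progress = recognize(image)
    commands = decide(progress)
    for command in commands:
        if command.part not in MOVABLE_PARTS:
            raise ValueError(f"unknown movable part: {command.part}")
        send_to_drive(command)
    return commands
```

Passing the four stages in as callables keeps the sketch independent of where each stage runs, which matches the text's point that these functions may be split between the interacted body and the server 13.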

上記構成によれば、ＣＣＤカメラ４９、画像モニタ７９ａによりテーブルゲームの進行状況を撮像、画像認識して、ゲームを進行する高度な携帯電話システムを提供できる。 According to the above configuration, it is possible to provide an advanced mobile phone system in which the CCD camera 49 and the image monitor 79a capture the progress of the table game, recognize the image, and proceed with the game.

（第２２実施形態）
各可動部３９、４１、４３、４５、４７のいずれかにＣＣＤイメージセンサ４９ａを搭載し、コントローラ５９から駆動部２７に指令信号を出力して各可動部３９、４１、４３、４５、４７を可動させ、人を含む所定の対象物を探し出してもよい。 (Twenty-second embodiment)
The CCD image sensor 49a may be mounted on any one of the movable parts 39, 41, 43, 45, 47, and a command signal may be output from the controller 59 to the drive unit 27 to move the movable parts 39, 41, 43, 45, 47 and search for a predetermined object, including a person.

上記構成によれば、人を含む所定の対象物を探し出す高度な携帯電話システムを提供できる。 According to the above configuration, it is possible to provide an advanced mobile phone system that searches for a predetermined object including a person.

（第２３実施形態）
第１実施形態では、所定の装置２００に操作手段２００ａが、所定の装置２０１に操作手段２０１ａが設けられており、可動ユニット１５Ａ、可動ユニット１５Ｂを用いて所定の装置２００、所定の装置２０１を作動させたが、所定の装置２００、所定の装置２０１が、所定の作動信号によって作動するように構成されており、携帯電話１１、サーバ１３、可動ユニット１５Ａ、可動ユニット１５Ｂのいずれかから上記作動信号を送信して、所定の装置２００、所定の装置２０１を作動させてもよい。上記所定の作動信号は、無線および有線のいずれかで送信される。 (23rd Embodiment)
In the first embodiment, the predetermined device 200 is provided with the operation means 200a and the predetermined device 201 is provided with the operation means 201a, and the predetermined devices 200 and 201 are operated using the movable unit 15A and the movable unit 15B. However, the predetermined devices 200 and 201 may instead be configured to operate in response to a predetermined operation signal, and that operation signal may be transmitted from any of the mobile phone 11, the server 13, the movable unit 15A, and the movable unit 15B to operate them. The predetermined operation signal is transmitted either wirelessly or by wire.

上記構成によれば、可動ユニット１５Ａ、可動ユニット１５Ｂを用いずに、所定の作動信号により、直接、所定の装置２００、所定の装置２０１を作動させる高度な携帯電話システムを提供できる。 According to the above configuration, it is possible to provide an advanced mobile phone system that directly operates the predetermined device 200 and the predetermined device 201 by a predetermined operation signal without using the movable unit 15A and the movable unit 15B.

（第２４実施形態）
携帯電話１１が人形、ぬいぐるみ、玩具のいずれか１つで構成されていてもよい。 (24th Embodiment)
The mobile phone 11 may be composed of any one of a doll, a stuffed animal, and a toy.

上記構成によれば、人と、人形、ぬいぐるみ、玩具のいずれか１つとが音声対話を行う高度な携帯電話システムを提供できる。また、携帯電話１１が人形、ぬいぐるみ、玩具のいずれか１つで構成されているので、親しみがわきやすい。 According to the above configuration, it is possible to provide an advanced mobile phone system in which a person and any one of a doll, a stuffed animal, and a toy have a voice conversation. Further, since the mobile phone 11 is composed of any one of a doll, a stuffed toy, and a toy, it is easy to get familiar with it.

（その他の実施形態）
上述した可動ユニット１５Ａ、可動ユニット１５Ｂの構成は、上述したものに限らない。例えば、所定の装置２００の操作手段２００ａの操作に必要な構成であってもよく、所定のゲーム機の操作に必要な構成であってもよい。また、可動部が、顔部、目部、口部、頭部、腕部、脚部、尻部のいずれかで構成されていてもよい。また、上述した各可動部３９、４１、４３、４５、４７は、その一部、例えば、上腕部３９、下腕部４１のみでもよい。 (Other embodiments)
The configurations of the movable unit 15A and the movable unit 15B described above are not limited to those described above. For example, a configuration necessary for operating the operation means 200a of the predetermined device 200 may be used, or a configuration required for operating a predetermined game machine may be used. Moreover, the movable part may be configured by any one of a face part, an eye part, a mouth part, a head part, an arm part, a leg part, and a hip part. Moreover, each movable part 39, 41, 43, 45, 47 mentioned above may be only a part, for example, the upper arm part 39 and the lower arm part 41.

また、音声認識ボード５５に替えて音声対話用プログラムを用いて音声対話の処理をしてもよい。 Further, instead of the voice recognition board 55, a voice dialogue process may be performed using a voice dialogue program.

また、サーバ１３がインターネット回線の他に、電話回線、家庭用ＬＡＮを含むローカルネットワーク回線に接続されていてもよい。また、携帯電話１１がインターネット回線、電話回線、家庭用ＬＡＮを含むローカルネットワーク回線に接続されていてもよい。また、上記インターネット回線、電話回線、家庭用ＬＡＮを含むローカルネットワーク回線に、携帯電話１１と、サーバ１３とを中継するアクセスポイント、中継自在なコンピュータ、電話のいずれかが接続されており、上記携帯電話１１が上記アクセスポイント、中継自在なコンピュータ、電話のいずれかを中継点として上記サーバ１３に接続されてもよい。 In addition to the Internet line, the server 13 may be connected to a local network line including a telephone line and a home LAN. The mobile phone 11 may be connected to a local network line including an Internet line, a telephone line, and a home LAN. In addition, any one of an access point, a relayable computer, and a telephone that relays the mobile phone 11 and the server 13 is connected to the local network line including the Internet line, the telephone line, and the home LAN. The telephone 11 may be connected to the server 13 using any one of the access point, the relayable computer, and the telephone as a relay point.

また、携帯電話１１およびサーバ１３のいずれかに、携帯電話１１が発音する際の感情パラメータを記憶する感情パラメータ記憶部が備えられており、スピーカ２１から発音する際にパラメータを参照し、顔の表情および口形状のうち、パラメータに応じた顔の表情および口形状を選択し、画像表示部に表示するようにしてもよい。上記構成によれば、人と対話を行う場合、所定の説明を行う場合、顔部、目部、口部、頭部、腕部、脚部、尻部のいずれかを可動させて、臨場感を持って発音する高度な携帯電話システムを提供できる。また、顔部、目部、口部、頭部、腕部、脚部、尻部のいずれかを可動させて、臨場感を持って発音する高度なユーザインターフェースを実現することができる。 Further, either the mobile phone 11 or the server 13 may be provided with an emotion parameter storage unit that stores emotion parameters for when the mobile phone 11 produces speech. When speech is output from the speaker 21, the parameter is referenced, the facial expression and mouth shape corresponding to the parameter are selected from among the available facial expressions and mouth shapes, and these are displayed on the image display unit. According to the above configuration, when conversing with a person or giving a predetermined explanation, it is possible to provide an advanced mobile phone system that speaks with a sense of presence by moving any of the face, eyes, mouth, head, arms, legs, and hips. It is also possible to realize an advanced user interface that speaks with a sense of presence by moving any of the face, eyes, mouth, head, arms, legs, and hips.
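The parameter lookup described above could take the form of a simple table. The sketch below assumes a string-valued emotion parameter and a fixed table of expressions; the parameter values, the `DISPLAY_TABLE` contents, and the `select_display` function are hypothetical, since the specification does not say how emotion parameters are encoded.

```python
from typing import Dict, Tuple

# Hypothetical emotion parameters mapped to (facial expression, mouth shape).
DISPLAY_TABLE: Dict[str, Tuple[str, str]] = {
    "joy": ("smile", "wide"),
    "sadness": ("frown", "narrow"),
    "surprise": ("raised brows", "round"),
}

# Fallback used when the stored parameter has no matching entry.
NEUTRAL: Tuple[str, str] = ("neutral", "closed")


def select_display(emotion_parameter: str) -> Tuple[str, str]:
    """Pick the facial expression and mouth shape to show while speaking."""
    return DISPLAY_TABLE.get(emotion_parameter, NEUTRAL)


expression, mouth = select_display("joy")
print(expression, mouth)
```

The neutral fallback is a design choice of this sketch, so that an unrecognised parameter still yields something to display.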

また、発音情報記憶部７５、画像情報記憶部７７に記憶される発音情報、画像情報は、光学ディスクドライブ、ブルーレイディスクドライブ、ＵＳＢメモリ等を介し、所定の記憶媒体から供給されたものであってもよい。 The pronunciation information and image information stored in the pronunciation information storage unit 75 and the image information storage unit 77 may be supplied from a predetermined storage medium via an optical disc drive, a Blu-ray disc drive, a USB memory, or the like.

なお、音声認識を行う音声認識部（第１実施形態では、音声認識ボード５５）、制御部（第１実施形態では、ＣＰＵボード５７）のハード構成、記憶部（第１実施形態では、ＣＰＵボード５７のＲＡＭおよびＲＯＭからなるメモリ）、対話処理部（第１実施形態では、対話処理部７１）等のハード構成は、上記各実施形態で説明した機能を満足するものであれば、上記のものに限らない。例えば、ＣＰＵボード５７と別体にハードディスクを設け、上記ハードディスクに発音情報、画像情報を記憶するようにしてもよい。 Note that the hardware configurations of the voice recognition unit that performs voice recognition (the voice recognition board 55 in the first embodiment), the control unit (the CPU board 57 in the first embodiment), the storage unit (the memory consisting of the RAM and ROM of the CPU board 57 in the first embodiment), the dialogue processing unit (the dialogue processing unit 71 in the first embodiment), and the like are not limited to those described above, as long as they provide the functions described in the above embodiments. For example, a hard disk may be provided separately from the CPU board 57, and the pronunciation information and image information may be stored on that hard disk.

External view of the mobile phone system according to the first embodiment of the present invention.
External view of the movable mobile phone body according to the first embodiment of the present invention.
Block diagram of the mobile phone system according to the first embodiment of the present invention.
Front sectional view of the movable unit (15B) according to the first embodiment of the present invention.
Block diagram of the mobile phone system according to the second embodiment of the present invention.

Explanation of symbols

100 ... Mobile phone system
200 ... Predetermined device
200a ... Operation means
201 ... Predetermined device
201a ... Operation means
300 ... Movable mobile phone body (interacted body)
11 ... Mobile phone (interacted body)
13 ... Server (server computer)
15A ... Movable unit (interacted body)
15B ... Movable unit
17 ... Microphone (voice conversion means)
19 ... Audio output board
21 ... Speaker (sound generation means)
23 ... Voice signal modulation/transmission means
25 ... Pronunciation signal reception/demodulation means
27 ... Drive unit
29 ... Upper arm motor
31 ... Lower arm motor
33 ... Hand motor
35 ... Travel motor
37 ... Turning motor
39 ... Upper arm (movable part)
41 ... Lower arm (movable part)
43 ... Hand (movable part)
45 ... Traveling part (movable part)
47 ... Turning part (movable part)
49 ... CCD camera
49a ... CCD image sensor (imaging means)
49b ... Signal processing unit
51 ... Command signal reception/demodulation means
53 ... Imaging signal modulation/transmission means
55 ... Voice recognition board (voice recognition means)
57 ... CPU board
59 ... Controller
61 ... Command signal modulation/transmission means
63 ... Voice signal reception/demodulation means
65 ... Pronunciation signal modulation/transmission means
67 ... Imaging signal reception/demodulation means
69 ... Image signal modulation/transmission means
71 ... Dialogue processing unit (dialogue control means)
73 ... Operation determination unit
75 ... Pronunciation information storage unit
77 ... Image information storage unit
79 ... Image display device
79a ... Image monitor (image display means)
81 ... Image information reception/demodulation means
83 ... Drive unit
85 ... Solenoid
87 ... Pusher
89 ... Command signal reception/demodulation means

Claims

1. A mobile phone system comprising:
an interactee provided with voice conversion means for converting a person's speech into a voice signal and sound generation means for converting a predetermined pronunciation signal into vibration to produce sound; and
a server computer provided separately from the interactee and connected to the interactee by wire or wirelessly,
wherein the server computer comprises voice recognition means for processing the voice signal converted by the voice conversion means to recognize the person's words, and dialogue control means for determining words that respond to the words recognized by the voice recognition means and outputting the predetermined pronunciation signal.

2. A mobile phone system comprising:
an interactee provided with sound generation means for converting a predetermined pronunciation signal into vibration to produce sound;
a server computer provided separately from the interactee and connected to the interactee by wire or wirelessly; and
voice conversion means that is provided separately from the interactee and the server computer, is connected to either the interactee or the server computer by wire or wirelessly, and converts a person's speech into a voice signal,
wherein the server computer comprises voice recognition means for processing the voice signal converted by the voice conversion means to recognize the person's words, and dialogue control means for determining words that respond to the words recognized by the voice recognition means and outputting the predetermined pronunciation signal.

3. A mobile phone system comprising:
an interactee provided with voice conversion means for converting a person's speech into a voice signal and sound generation means for converting a predetermined pronunciation signal into vibration to produce sound; and
a server computer provided separately from the interactee and connected to the interactee by wire or wirelessly,
wherein, of voice recognition means for processing the voice signal converted by the voice conversion means to recognize the person's words and dialogue control means for determining words that respond to the words recognized by the voice recognition means and outputting the predetermined pronunciation signal, one is provided in the interactee and the other is provided in the server computer.

4. A mobile phone system comprising:
an interactee provided with sound generation means for converting a predetermined pronunciation signal into vibration to produce sound;
a server computer provided separately from the interactee and connected to the interactee by wire or wirelessly; and
voice conversion means that is provided separately from the interactee and the server computer, is connected to either the interactee or the server computer by wire or wirelessly, and converts a person's speech into a voice signal,
wherein, of voice recognition means for processing the voice signal converted by the voice conversion means to recognize the person's words and dialogue control means for determining words that respond to the words recognized by the voice recognition means and outputting the predetermined pronunciation signal, one is provided in the interactee and the other is provided in the server computer.

5. The mobile phone system according to claim 1, wherein a pronunciation information storage unit capable of storing predetermined pronunciation information is mounted on either the interactee or the server computer,
the predetermined pronunciation information is stored in the pronunciation information storage unit, and
the predetermined pronunciation information is read from the pronunciation information storage unit and output from the sound generation means in any of the following cases: when the person requests the predetermined pronunciation information via the voice conversion means; when the person permits the predetermined pronunciation information via the voice conversion means; and when the interactee itself speaks using the predetermined pronunciation information.

6. The mobile phone system according to claim 5, wherein the pronunciation information storage unit is configured to be connectable to the Internet, and
the pronunciation information can be downloaded from a predetermined storage location on the Internet.

7. The mobile phone system according to any one of claims 1 to 6, wherein the interactee comprises:
one or more movable parts;
a motor that moves each of the one or more movable parts;
a drive unit that drives each motor; and
a controller that outputs, to the drive unit, a command signal commanding the operation of the movable parts.

8. The mobile phone system according to claim 1, wherein the interactee comprises:
one or more movable parts;
a motor that moves each of the one or more movable parts; and
a drive unit that drives each motor,
and wherein the server computer comprises a controller that outputs an operation command signal to the drive unit.

9. The mobile phone system according to claim 1, wherein the interactee comprises:
one or more movable parts; and
a motor that moves each of the one or more movable parts,
and wherein the server computer comprises a drive unit that drives each motor and a controller that outputs an operation command signal to the drive unit.

10. The mobile phone system according to any one of claims 1 to 6, further comprising a movable unit that is provided separately from the interactee and the server computer, is connected to at least one of the interactee and the server computer by wire or wirelessly, and is movable, wherein the movable unit comprises:
one or more movable parts;
a motor that moves each of the one or more movable parts;
a drive unit that drives each motor; and
a controller that outputs, to the drive unit, a command signal commanding the operation of the movable parts.

11. The mobile phone system according to claim 1, further comprising a movable unit that is provided separately from the interactee and the server computer and is connected to at least one of the interactee and the server computer by wire or wirelessly, wherein the movable unit comprises:
one or more movable parts;
a motor that drives each of the one or more movable parts; and
a drive unit that drives each motor,
and wherein either the interactee or the server computer comprises a controller that outputs an operation command signal to the drive unit.

12. The mobile phone system according to claim 1, further comprising a movable unit that is provided separately from the interactee and is connected to at least one of the interactee and the server computer by wire or wirelessly, wherein the movable unit comprises:
one or more movable parts; and
a motor that moves each of the one or more movable parts,
a drive unit that drives each motor is provided in either the interactee or the server computer, and
a controller that outputs an operation command signal to the drive unit is provided in either the interactee or the server computer.

13. The mobile phone system according to any one of claims 10 to 12, wherein the interactee and the movable unit are configured to be freely attachable to and detachable from each other.

14. The mobile phone system according to any one of claims 1 to 9, wherein image display means for displaying a predetermined image is provided either integrally with or separately from the interactee,
an image information storage unit in which predetermined image information is stored in advance is mounted on either the interactee or the server computer, and
the predetermined image information is read from the image information storage unit and displayed on the image display means in any of the following cases: when the person requests the predetermined image information via the voice conversion means; when the person permits the predetermined image information via the voice conversion means; and when the interactee itself displays the predetermined image using the predetermined image information.

15. The mobile phone system according to any one of claims 10 to 13, wherein image display means for displaying a predetermined image is provided in either the interactee or the movable unit and is connected to at least one of the interactee, the server computer, and the movable unit by wire or wirelessly,
an image information storage unit in which predetermined image information is stored in advance is mounted on any of the interactee, the server computer, and the movable unit, and
the predetermined image information is read from the image information storage unit and displayed on the image display means in any of the following cases: when the person requests the predetermined image information via the voice conversion means; when the person permits the predetermined image information via the voice conversion means; and when the interactee itself displays the predetermined image using the predetermined image information.

16. The mobile phone system according to claim 10, wherein image display means for displaying a predetermined image is provided separately from both the interactee and the movable unit and is connected to at least one of the interactee, the server computer, and the movable unit by wire or wirelessly,
an image information storage unit in which predetermined image information is stored in advance is mounted on any of the interactee, the server computer, and the movable unit, and
the predetermined image information is read from the image information storage unit and displayed on the image display means in any of the following cases: when the person requests the predetermined image information via the voice conversion means; when the person permits the predetermined image information via the voice conversion means; and when the interactee itself displays the predetermined image using the predetermined image information.

17. The mobile phone system according to any one of claims 14 to 16, wherein the image information storage unit is configured to be connectable to the Internet, and
the image information can be downloaded from a predetermined storage location on the Internet.

18. The mobile phone system according to any one of claims 1 to 9 and 14, wherein imaging means capable of imaging a predetermined object including the person is configured integrally with or separately from the interactee, and
image recognition means for recognizing the predetermined object from image data captured by the imaging means is mounted on either the interactee or the server computer.

19. The mobile phone system according to any one of claims 10 to 13, 15 and 16, wherein imaging means capable of imaging a predetermined object including the person is provided in either the interactee or the movable unit and is connected to at least one of the interactee, the server computer, and the movable unit by wire or wirelessly, and
image recognition means for recognizing the predetermined object from image data captured by the imaging means is mounted on at least one of the interactee, the server computer, and the movable unit.

20. The mobile phone system according to any one of claims 10 to 13, 15 and 16, wherein imaging means capable of imaging a predetermined object including the person is provided separately from both the interactee and the movable unit and is connected to at least one of the interactee, the server computer, and the movable unit by wire or wirelessly, and
image recognition means for recognizing the predetermined object from image data captured by the imaging means is mounted on at least one of the interactee, the server computer, and the movable unit.

21. The mobile phone system according to any one of claims 7 to 20, wherein the controller outputs the command signal to the drive unit so that the movable parts perform a predetermined communication operation in at least one of the case of conversing with the person and the case of giving a predetermined explanation.

22. The mobile phone system according to claim 13 or claim 15, wherein the movable part is arranged at a position for operating a predetermined device, and
the controller outputs the command signal to the drive unit so as to operate the predetermined device in any of the following cases: when the person's speech is an instruction to operate the predetermined device; when the person's speech is permission to operate the predetermined device; when the predetermined device is operated through predetermined operation input means; and when an automatic execution program that operates the predetermined device is executed.

23. The mobile phone system according to any one of claims 18 to 20, wherein the imaging means images a predetermined object including the person, and the controller outputs the command signal to the drive unit so that a predetermined communication operation is performed toward the person based on the result of the image recognition means recognizing the predetermined object.

24. The mobile phone system according to any one of claims 18 to 20, wherein the imaging means images a predetermined object including the person, at least one item of pronunciation data is selected based on the result of the image recognition means recognizing the predetermined object, and the selected item is spoken to the person via the sound generation means.

25. The mobile phone system according to any one of claims 18 to 20, wherein the imaging means images the operation means of the predetermined device, and
in any of the following cases: when the person's speech is an instruction to operate the predetermined device; when the person's speech is permission to operate the predetermined device; when the predetermined device is operated through predetermined operation input means; and when an automatic execution program that operates the predetermined device is executed, the controller outputs the command signal to the drive unit so that, based on the result of the image recognition means recognizing the position of the operation means, the movable part and the interactee move to the operating position of the operation means and operate the predetermined device.

26. The mobile phone system according to claim 18, wherein the imaging means is configured to image the progress of a table game, and the image recognition means is configured to recognize the progress of the table game from the image,
the system further comprises operation determining means for determining the next operation of the movable part from the progress recognized by the image recognition means, and
the controller outputs the command signal to the drive unit so that the movable part executes the next operation determined by the operation determining means.

27. The mobile phone system according to claim 18, wherein the controller outputs the command signal to the drive unit to move the movable part so as to search for a predetermined object including the person.

28. The mobile phone system according to claim 18, wherein a tracking program that causes the imaging means to track the predetermined object recognized by the image recognition means is installed in either the interactee or the server computer, and
the controller outputs the command signal to the drive unit to move the movable part so that the imaging means tracks the predetermined object including the person.

29. The mobile phone system according to any one of claims 1 to 28, further comprising an operating body provided with operating means that operates in response to an operation signal, wherein operation signal output means for outputting the operation signal to the operating means is mounted on at least one of the interactee and the server computer, and
the operating means and the operation signal output means are connected either wirelessly or by wire.

30. The mobile phone system according to any one of claims 1 to 29, wherein the interactee is one of a doll, a stuffed toy, and a toy.