JP2005110180A - Radio communication terminal

Info

Publication number: JP2005110180A
Application number: JP2003344425A
Authority: JP
Inventors: Kayoko Sanae; 賀世子早苗
Original assignee: Matsushita Electric Industrial Co Ltd
Current assignee: Panasonic Holdings Corp
Priority date: 2003-10-02
Filing date: 2003-10-02
Publication date: 2005-04-21

Abstract

PROBLEM TO BE SOLVED: To provide a radio communication terminal with a voice recognition function that can activate the function automatically when needed, without operating a dedicated voice recognition key, and that can minimize responses to utterances by people other than the user and to ambient noise.

SOLUTION: When the external light changes, voice recognition processing is started. When a word input thereafter, namely a word spoken by the user, matches a previously registered word, the terminal responds by vibrating a vibrator 111, lighting an LED unit 112, or outputting a musical tone from a speaker 113. Voice recognition processing thus starts on a change in external light even if no dedicated key for starting voice recognition is operated. Consequently, even when the radio communication terminal 100 is inside, for example, a bag, it responds to the user's call once the external light has changed, so the terminal can be found easily.

COPYRIGHT: (C)2005,JPO&NCIPI

Description

本発明は、携帯電話、ＰＨＳ（Personal Handy-phone System）、ＰＤＡ（Personal Digital Assistant）などの無線通信端末に関し、特に音声認識機能を有する無線通信端末に関する。 The present invention relates to a wireless communication terminal such as a mobile phone, a PHS (Personal Handy-phone System), a PDA (Personal Digital Assistant), and more particularly to a wireless communication terminal having a voice recognition function.

従来、上述した無線通信端末においては、音声認識機能を有し、この音声認識機能を用いて音声で無線通信端末の機能の一部を実現できるようにしたものが提案されている（例えば、特許文献１参照）。 Conventionally, wireless communication terminals of the kind described above have been proposed that have a voice recognition function and use it to carry out some of the terminal's functions by voice (see, for example, Patent Document 1).

図３は、上記特許文献１で開示された音声認識機能を有する携帯電話の概略構成を示すブロック図である。 FIG. 3 is a block diagram showing a schematic configuration of a mobile phone having a voice recognition function disclosed in Patent Document 1.

図３に示す携帯電話は、無線信号の送受信を行う無線部３１と、携帯電話における各種操作を行う操作部３２と、文字列等を表示する表示部３３と、音声認識の固定の認識語や応答音声などのデータを格納するデータＲＯＭ（Read Only Memory）３４と、携帯電話の各部の制御を行うメインＣＰＵ（Central Processing Unit）とＤＳＰ（Digital Signal Processor）からなる制御部３５と、音声認識処理を行う音声認識部３６と、Ａ／Ｄ（Analog-to-Digital）コンバータ３７と、Ａ／Ｄコンバータ３７の接続先を切り替えるスイッチ３８と、音声が入力されるマイク３９と、音声を出力するスピーカ４０とから構成されている。操作部３２は、名前や電話番号をメモリ登録するための登録キーや音声認識専用キーなどを備えている。音声認識部３６は通常動作時には動作せず、操作部３２の音声認識専用キー（図示略）が押下されて、その状態が制御部３５で検知された場合にのみ音声認識処理を行う。 The mobile phone shown in FIG. 3 includes a radio unit 31 that transmits and receives radio signals, an operation unit 32 for performing various operations on the mobile phone, a display unit 33 that displays character strings and the like, a data ROM (Read Only Memory) 34 that stores data such as fixed recognition words for voice recognition and response voices, a control unit 35 comprising a main CPU (Central Processing Unit) and a DSP (Digital Signal Processor) that controls each part of the mobile phone, a voice recognition unit 36 that performs voice recognition processing, an A/D (Analog-to-Digital) converter 37, a switch 38 that switches the connection destination of the A/D converter 37, a microphone 39 to which voice is input, and a speaker 40 that outputs voice. The operation unit 32 includes a registration key for registering names and telephone numbers in memory, a dedicated voice recognition key, and the like. The voice recognition unit 36 does not operate during normal operation; it performs voice recognition processing only when the dedicated voice recognition key (not shown) of the operation unit 32 is pressed and the control unit 35 detects that state.

次に、上記構成の携帯電話の動作を説明する。ユーザーが操作部３２の図示せぬ音声認識専用キーを押下すると、制御部３５はその状態を検知し、音声認識部３６を起動させて音声認識処理を開始させる。ユーザーが音声認識専用キーを押下した後、マイク３９に向かって発声すると、マイク３９からの音声信号がＡ／Ｄコンバータ３７に入力されてデジタル変換される。そして、デジタル変換された音声信号（音声データ）が音声認識部３６に入力される。なお、このとき、スイッチ３８は音声認識部３６側に切り替わっている。 Next, the operation of the mobile phone having the above configuration will be described. When the user presses the dedicated voice recognition key (not shown) of the operation unit 32, the control unit 35 detects that state and activates the voice recognition unit 36 to start voice recognition processing. When the user speaks into the microphone 39 after pressing the dedicated voice recognition key, the voice signal from the microphone 39 is input to the A/D converter 37 and converted to digital form. The digitized voice signal (voice data) is then input to the voice recognition unit 36. At this time, the switch 38 is set to the voice recognition unit 36 side.

音声認識部３６は、入力された音声データに対して音声認識処理を行い、認識結果が確定するとその結果を制御部３５に入力する。制御部３５は、音声認識部３６より入力された認識結果に対応するデータを表示部３３に表示するとともに、認識結果に対応する応答音声をスピーカ４０から出力する。このとき、認識語のデータとして、携帯電話の機能名については事前にその固定値がデータＲＯＭ３４に格納されていて、それを使用するようにしている。また、メモリダイヤルの名前については、事前に複数の単音データがデータＲＯＭ３４に格納されていて、夫々をつなぎ合わせて使用するようにしている。 The voice recognition unit 36 performs voice recognition processing on the input voice data, and inputs the result to the control unit 35 when the recognition result is confirmed. The control unit 35 displays data corresponding to the recognition result input from the voice recognition unit 36 on the display unit 33, and outputs a response voice corresponding to the recognition result from the speaker 40. At this time, as the recognition word data, the fixed value of the function name of the mobile phone is stored in advance in the data ROM 34 and is used. As for the name of the memory dial, a plurality of single-tone data are stored in advance in the data ROM 34, and are used by connecting them.

特開平１１−１１２６３３号公報（第３頁、第４頁、図２、図３） Japanese Patent Application Laid-Open No. 11-112633 (pages 3 and 4, FIGS. 2 and 3)

しかしながら、従来の音声認識機能を有する携帯電話においては、次のような問題がある。即ち、音声認識機能があるものの、その機能が音声認識専用キーの操作によってのみ可能となるため、この音声認識専用キーの操作をしていなければ携帯電話に向けて呼びかけても応答が得られない。このため、携帯電話を例えば鞄にしまっておいた場合に探すのに時間がかかってしまう。 However, conventional mobile phones having a voice recognition function have the following problems. Although the phone has a voice recognition function, that function becomes available only when the dedicated voice recognition key is operated, so unless the key has been operated, the phone gives no response even when the user calls out to it. As a result, when the mobile phone has been put away in, for example, a bag, it takes time to find.

また、音声認識に使用する認識語は、携帯電話がユーザーに渡る前（例えば工場出荷時）にデータＲＯＭ３４に保存しておく固定データであるため、そのユーザー以外の人の発声や周囲の雑音でも応答してしまうことになる。 In addition, because the recognition words used for voice recognition are fixed data stored in the data ROM 34 before the mobile phone reaches the user (for example, at factory shipment), the phone also responds to utterances by people other than the user and to ambient noise.

本発明は、係る点に鑑みてなされたものであり、音声認識専用キーの操作を行わなくても、必要なときに自動的に音声認識機能を働かせることができ、またユーザー以外の人の発声や周囲の雑音への応答を最小限に抑えることができる無線通信端末を提供することを目的とする。 The present invention has been made in view of the above points, and an object thereof is to provide a wireless communication terminal that can activate the voice recognition function automatically when needed, without operating a dedicated voice recognition key, and that can minimize responses to utterances by people other than the user and to ambient noise.

上記課題を解決するために、本発明の無線通信端末は、キー入力された単語及び応答動作を登録する登録手段と、音声を入力する音声入力手段と、前記音声入力手段にて音声入力された単語を認識する音声認識手段と、装置外の光を検知する光検知手段と、前記光検知手段の出力に変化があった場合に前記音声認識手段に対して音声認識処理を開始させる指示を与える応答開始指示出力手段と、前記登録手段に登録された単語とこの単語が登録された後であって前記音声認識処理開始後に音声入力されて認識された単語とを照合する照合手段と、前記照合手段による照合結果から前記登録手段に登録された単語が前記音声認識処理開始後に音声入力されて認識された単語と一致した場合、前記登録手段に登録された応答動作に従って振動、光及び楽音のうち少なくとも１つを発生する応答手段と、を備えている。 In order to solve the above problems, a wireless communication terminal according to the present invention includes: registration means for registering a key-input word and a response operation; voice input means for inputting voice; voice recognition means for recognizing a word input by voice through the voice input means; light detection means for detecting light outside the apparatus; response start instruction output means for instructing the voice recognition means to start voice recognition processing when the output of the light detection means changes; collation means for collating the word registered in the registration means with a word that is input by voice and recognized after that word has been registered and after the voice recognition processing has started; and response means for generating at least one of vibration, light, and musical sound in accordance with the response operation registered in the registration means when the collation result of the collation means shows that the word registered in the registration means matches the word input by voice and recognized after the start of the voice recognition processing.

この構成によれば、外光に変化があった場合に音声認識処理を開始し、その後に入力された音声（ユーザーの呼びかけ）の単語が登録済みの単語と一致した場合に、振動、光又は楽音を発生して応答する。したがって、音声認識開始用の専用キーを操作しなくても外光の変化によって音声認識処理を開始するので、無線通信端末を鞄内に入れたままでも光に変化があれば呼びかけることで応答することから、無線通信端末を容易に探し出すことができる。 According to this configuration, voice recognition processing is started when the external light changes, and when a word in voice input thereafter (the user's call) matches the registered word, the terminal responds by generating vibration, light, or musical sound. Because voice recognition processing is started by a change in external light without operating a dedicated key for starting voice recognition, the terminal responds when called as long as the light has changed, even while it remains inside a bag, so the wireless communication terminal can be found easily.

また、本発明の無線通信端末は、音声を入力する音声入力手段と、前記音声入力手段にて音声入力された単語を認識する音声認識手段と、前記音声認識手段にて認識された単語及び応答動作を登録する登録手段と、装置外の光を検知する光検知手段と、前記光検知手段の出力に変化があった場合に前記音声認識手段に対して音声認識処理を開始させる指示を与える応答開始指示出力手段と、前記登録手段に登録された単語とこの単語が登録された後であって前記音声認識処理開始後に音声入力されて認識された単語とを照合する照合手段と、前記照合手段による照合結果から前記登録手段に登録された単語と前記音声認識処理開始後に音声入力されて認識された単語と一致した場合、前記登録手段に登録された応答動作に従って振動、光及び楽音のうち少なくとも１つを発生する応答手段と、を備えている。 A wireless communication terminal according to the present invention also includes: voice input means for inputting voice; voice recognition means for recognizing a word input by voice through the voice input means; registration means for registering the word recognized by the voice recognition means and a response operation; light detection means for detecting light outside the apparatus; response start instruction output means for instructing the voice recognition means to start voice recognition processing when the output of the light detection means changes; collation means for collating the word registered in the registration means with a word that is input by voice and recognized after that word has been registered and after the voice recognition processing has started; and response means for generating at least one of vibration, light, and musical sound in accordance with the response operation registered in the registration means when the collation result of the collation means shows that the word registered in the registration means matches the word input by voice and recognized after the start of the voice recognition processing.

この構成によれば、前述した本発明の無線通信端末と同様の作用及び効果が得られるとともに、登録する単語を音声によって入力できるので、単語の登録を容易に行うことができる。 According to this configuration, the same operation and effect as the above-described wireless communication terminal of the present invention can be obtained, and the word to be registered can be input by voice, so that the word can be easily registered.

また、本発明の無線通信端末は、音声を入力する音声入力手段と、前記音声入力手段にて音声入力された単語を認識する音声認識手段と、前記音声入力手段にて入力された音声の特徴量を検出する音声特徴量検出手段と、前記音声認識手段にて認識された単語及び応答動作と前記音声特徴量検出手段にて検出された音声の特徴量とを関連付けて登録する登録手段と、装置外の光を検知する光検知手段と、前記光検知手段の出力に変化があった場合に前記音声認識手段に対して音声認識処理を開始させる指示を与える応答開始指示出力手段と、前記登録手段に登録された単語とこの単語が登録された後であって前記音声認識処理開始後に音声入力されて認識された単語とを照合するとともに、前記音声認識処理開始後に入力された音声の特徴量が前記登録手段に登録された音声の特徴量の判定基準内かを照合する照合手段と、前記照合手段による照合結果から前記音声認識処理開始後に入力された音声の単語が登録された単語と一致し、且つ当該音声の特徴量が前記判定基準内の場合にのみ、前記登録手段に登録された応答動作に従って振動、光及び楽音のうち少なくとも１つを発生する応答手段と、を備えている。 A wireless communication terminal according to the present invention also includes: voice input means for inputting voice; voice recognition means for recognizing a word input by voice through the voice input means; voice feature amount detection means for detecting a feature amount of the voice input through the voice input means; registration means for registering the word recognized by the voice recognition means and a response operation in association with the voice feature amount detected by the voice feature amount detection means; light detection means for detecting light outside the apparatus; response start instruction output means for instructing the voice recognition means to start voice recognition processing when the output of the light detection means changes; collation means for collating the word registered in the registration means with a word that is input by voice and recognized after that word has been registered and after the voice recognition processing has started, and for checking whether the feature amount of the voice input after the start of the voice recognition processing is within the determination criterion of the voice feature amount registered in the registration means; and response means for generating at least one of vibration, light, and musical sound in accordance with the response operation registered in the registration means only when the collation result of the collation means shows that the word in the voice input after the start of the voice recognition processing matches the registered word and the feature amount of that voice is within the determination criterion.

この構成によれば、前述した本発明の無線通信端末と同様の作用及び効果が得られるとともに、音声による単語登録時にユーザーの音声の特徴量を記録し、この特徴量を音声認識の判定基準としていることから、ユーザー以外の人の発声や周囲の雑音への応答を最小限に抑えることができる。即ち、単語のみで照合する場合と比べて誤り率の少ない音声認識機能が得られる。 According to this configuration, the same operation and effects as those of the wireless communication terminal of the present invention described above are obtained. In addition, because the feature amount of the user's voice is recorded at the time of word registration by voice and is used as a criterion for voice recognition, responses to utterances by people other than the user and to ambient noise can be minimized. That is, a voice recognition function with a lower error rate is obtained than when collation is based on the word alone.

また、本発明の無線通信端末は、前述した無線通信端末のいずれかにおいて、前記応答手段は、バイブレータを駆動することで振動を発生する。 In the wireless communication terminal of the present invention, in any of the wireless communication terminals described above, the response means generates vibration by driving a vibrator.

この構成によれば、ユーザーの呼びかけに対する応答を、元々無線通信端末に設けられているバイブレータを動作させることで行うので、専用のバイブレータを設ける必要がない分、コストアップを最小限に抑えることができるとともに省スペース化が図れる。 According to this configuration, the response to the user's call is made by operating the vibrator already provided in the wireless communication terminal, so no dedicated vibrator is needed, which keeps the cost increase to a minimum and saves space.

また、本発明の無線通信端末は、前述した無線通信端末のいずれかにおいて、前記応答手段は、発光素子を駆動することで光を発生する。 In the wireless communication terminal of the present invention, in any of the wireless communication terminals described above, the response means generates light by driving a light emitting element.

この構成によれば、ユーザーの呼びかけに対する応答を、元々無線通信端末に設けられている着呼報知用の発光素子を点灯させることで行うので、専用の発光素子を設ける必要がない分、コストアップを最小限に抑えることができるとともに省スペース化が図れる。 According to this configuration, the response to the user's call is made by lighting the light emitting element for incoming call notification already provided in the wireless communication terminal, so no dedicated light emitting element is needed, which keeps the cost increase to a minimum and saves space.

また、本発明の無線通信端末は、前述した無線通信端末のいずれかにおいて、前記応答手段は、スピーカを駆動することで楽音を発生する。 In the wireless communication terminal of the present invention, in any of the wireless communication terminals described above, the response means generates a musical sound by driving a speaker.

この構成によれば、ユーザーの呼びかけに対する応答を、元々無線通信端末に設けられているスピーカを動作させることで行うので、専用のスピーカを設ける必要がない分、コストアップを最小限に抑えることができるとともに省スペース化が図れる。 According to this configuration, the response to the user's call is made by operating the speaker already provided in the wireless communication terminal, so no dedicated speaker is needed, which keeps the cost increase to a minimum and saves space.

本発明によれば、音声認識専用キーの操作を行わなくても、外光に変化があった場合に音声認識処理を開始し、その後に入力された音声（ユーザーの呼びかけ）の単語が登録済みの単語と一致した場合に、振動、光又は楽音を発生して応答するので、無線通信端末を例えば鞄内に入れたままでも容易に探し出すことができる。 According to the present invention, the voice recognition process is started when there is a change in the external light without performing the operation of the voice recognition dedicated key, and the words of the voice (user call) inputted after that are already registered. When the word matches, it responds by generating vibration, light, or musical sound, so that the wireless communication terminal can be easily found even if it is placed in a bag, for example.

また、本発明によれば、音声による単語登録時にユーザーの音声の特徴量を記録し、この特徴量を音声認識の判定基準とするので、ユーザー以外の人の発声や周囲の雑音への応答を最小限に抑えることができる。 Further, according to the present invention, the feature amount of the user's voice is recorded at the time of word registration by voice and is used as a criterion for voice recognition, so responses to utterances by people other than the user and to ambient noise can be minimized.

以下、本発明を実施するための最良の形態について、図面を参照して詳細に説明する。 Hereinafter, the best mode for carrying out the present invention will be described in detail with reference to the drawings.

（実施の形態１） (Embodiment 1)
図１は、本発明の実施の形態１の無線通信端末の概略構成を示すブロック図である。 FIG. 1 is a block diagram showing a schematic configuration of a wireless communication terminal according to Embodiment 1 of the present invention.

図１において、本実施の形態の無線通信端末１００は、光を検知する光検知部１０１と、装置各部を制御する制御部１０２と、音声認識開始時から時間を計測するタイマ１０３と、応答のための単語を記録する記録部１０４と、音声入力するためのマイク１０５と、キー入力手段を含む操作部１０６と、音声から単語を認識する音声認識部１０７と、入力された音声の単語と登録済みの単語とを照合する照合部１０８と、照合部１０８の照合結果から、ユーザーからの呼びかけである場合に応答処理を行う応答処理部１０９とを備えている。応答処理部１０９は、機械的振動を発生することで着信を報知するバイブレータ１１１を駆動するバイブ駆動部１０９ａと、発光することで着信を報知するＬＥＤ（発光ダイオード）部１１２を駆動するＬＥＤ駆動部１０９ｂと、楽音を出力することで着信を報知するスピーカ１１３を駆動するスピーカ駆動部１０９ｃとを備えている。バイブレータ１１１、ＬＥＤ部１１２及びスピーカ１１３は元々無線通信端末に備えられているものである。 In FIG. 1, a wireless communication terminal 100 according to the present embodiment includes a light detection unit 101 that detects light, a control unit 102 that controls each unit of the apparatus, a timer 103 that measures time from the start of voice recognition, a recording unit 104 that records words for response, a microphone 105 for voice input, an operation unit 106 including key input means, a voice recognition unit 107 that recognizes words from voice, a collation unit 108 that collates a word in the input voice with a registered word, and a response processing unit 109 that performs response processing when the collation result of the collation unit 108 indicates a call from the user. The response processing unit 109 includes a vibrator driving unit 109a that drives a vibrator 111, which notifies an incoming call by generating mechanical vibration, an LED driving unit 109b that drives an LED (light emitting diode) unit 112, which notifies an incoming call by emitting light, and a speaker driving unit 109c that drives a speaker 113, which notifies an incoming call by outputting a musical sound. The vibrator 111, the LED unit 112, and the speaker 113 are components originally provided in the wireless communication terminal.

なお、上記記録部１０４は登録手段に対応する。また、上記マイク１０５は音声入力手段に対応し、音声認識部１０７は音声認識手段に対応し、光検知部１０１は光検知手段に対応する。また、制御部１０２は応答開始指示出力手段に対応し、照合部１０８は照合手段に対応する。また、応答処理部１０９は応答手段に対応する。 The recording unit 104 corresponds to a registration unit. The microphone 105 corresponds to a voice input unit, the voice recognition unit 107 corresponds to a voice recognition unit, and the light detection unit 101 corresponds to a light detection unit. The control unit 102 corresponds to a response start instruction output unit, and the verification unit 108 corresponds to a verification unit. The response processing unit 109 corresponds to response means.

次に、上記構成の無線通信端末１００の音声認識処理について説明する。まず、ユーザーが操作部１０６を使用して、自己の呼びかけに対して無線通信端末１００が応答する単語と応答動作（振動で応答、光で応答又は楽音で応答）をキー入力する。入力した単語と応答動作は記録部１０４に記録される。応答用の単語と応答動作が記録された後、無線通信端末１００を例えば鞄内などの外光が遮られるところに置くものとする。光検知部１０１は、無線通信端末１００に電源が投入された時点から外光の強度を検出し、その結果を制御部１０２に入力するので、鞄の開け閉めにより外光に変化が生じると光検知部１０１の出力値が変化することになり、制御部１０２が光検知部１０１の出力値の変化を検知すると、音声認識部１０７に対して音声認識処理を開始させる指示を与える。このときタイマ１０３にて設定された時間の間、応答処理部１０９が応答しなかった場合には音声認識処理を停止させる。 Next, the voice recognition processing of the wireless communication terminal 100 having the above configuration will be described. First, the user uses the operation unit 106 to key in the word to which the wireless communication terminal 100 should respond when called and the response operation (response by vibration, by light, or by musical sound). The input word and response operation are recorded in the recording unit 104. After the response word and response operation have been recorded, assume that the wireless communication terminal 100 is placed somewhere external light is blocked, for example inside a bag. The light detection unit 101 detects the intensity of external light from the time the wireless communication terminal 100 is powered on and inputs the result to the control unit 102, so when the external light changes because the bag is opened or closed, the output value of the light detection unit 101 changes. When the control unit 102 detects this change, it instructs the voice recognition unit 107 to start voice recognition processing. If the response processing unit 109 does not respond within the time set by the timer 103, the voice recognition processing is stopped.
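The light-triggered start and timer-based stop described above can be sketched as follows. This is only an illustrative sketch: the class name `RecognitionWindow`, the `threshold` and `timeout_s` parameters, and their default values are assumptions, since the specification does not state how large a change in light counts as a change or how long the timer 103 runs. Restarting the timer on a further light change is likewise an assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RecognitionWindow:
    """Tracks whether voice recognition should currently be running.

    threshold and timeout_s are hypothetical values; the specification
    gives neither the size of a qualifying light change nor the timer length.
    """
    threshold: float = 10.0   # minimum change in light level treated as "a change"
    timeout_s: float = 30.0   # time allowed for a response before recognition stops
    last_level: Optional[float] = None
    opened_at: Optional[float] = None

    def on_light_sample(self, level: float, now: float) -> None:
        """Feed one reading from the light detection unit (101)."""
        if self.last_level is not None and abs(level - self.last_level) >= self.threshold:
            # Role of the control unit (102): start recognition and the timer (103).
            # Assumption: a further change while already open restarts the timer.
            self.opened_at = now
        self.last_level = level

    def is_active(self, now: float) -> bool:
        """Return True while recognition should keep running."""
        if self.opened_at is None:
            return False
        if now - self.opened_at >= self.timeout_s:
            self.opened_at = None  # no response within the set time: stop recognition
            return False
        return True
```

A caller would feed periodic light readings to `on_light_sample` and run the recognizer only while `is_active` returns True.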

音声認識部１０７が音声認識処理を開始した後、外部より音声がマイク１０５に入力されると即ちユーザーが呼びかける発声をすると、音声認識部１０７がその発声に対して音声認識を行い単語を検出する。そして、検出した単語を照合部１０８に入力する。照合部１０８は、音声認識部１０７から単語が入力されると、記録部１０４に登録されている単語を読み出してこれと照合し、一致しない場合は制御部１０２に不一致の結果を入力し、一致した場合は応答処理部１０９にオン信号を入力する。応答処理部１０９は、照合部１０８からオン信号が入力されると、記録部１０４に登録されている応答動作に基づいてバイブ駆動部１０９ａ、ＬＥＤ駆動部１０９ｂ及びスピーカ駆動部１０９ｃのうち少なくとも１つを動作させて、バイブレータ１１１の振動、ＬＥＤ部１１２の点灯又はスピーカ１１３からの楽音の再生を行ってユーザーに無線通信端末１００の所在を知らせる。 After the voice recognition unit 107 starts voice recognition processing, when voice is input to the microphone 105 from outside, that is, when the user calls out, the voice recognition unit 107 performs voice recognition on the utterance and detects a word. The detected word is then input to the collation unit 108. When a word is input from the voice recognition unit 107, the collation unit 108 reads the word registered in the recording unit 104 and collates the two; if they do not match, it inputs a mismatch result to the control unit 102, and if they match, it inputs an ON signal to the response processing unit 109. When the ON signal is input from the collation unit 108, the response processing unit 109 operates at least one of the vibrator driving unit 109a, the LED driving unit 109b, and the speaker driving unit 109c in accordance with the response operation registered in the recording unit 104, and thereby vibrates the vibrator 111, lights the LED unit 112, or plays a musical sound from the speaker 113 to notify the user of the location of the wireless communication terminal 100.
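The collation and response steps above might be sketched as below. All names (`Response`, `Registration`, `collate`, `respond`) are hypothetical, and the case-insensitive, whitespace-trimmed comparison is an assumption; the specification says only that the recognized word is collated with the registered word. The drivers are passed in as plain callables standing in for the driving units 109a to 109c.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, Dict, FrozenSet, List


class Response(Enum):
    VIBRATION = "vibration"  # vibrator 111, driven by 109a
    LIGHT = "light"          # LED unit 112, driven by 109b
    TONE = "tone"            # speaker 113, driven by 109c


@dataclass(frozen=True)
class Registration:
    """What the recording unit (104) holds: a word and its response operation."""
    word: str
    responses: FrozenSet[Response]


def collate(recognized: str, registration: Registration) -> bool:
    """Role of the collation unit (108). The normalization is an assumption."""
    return recognized.strip().casefold() == registration.word.strip().casefold()


def respond(recognized: str, registration: Registration,
            drivers: Dict[Response, Callable[[], None]]) -> List[Response]:
    """Role of the response processing unit (109): fire the registered drivers on a match."""
    if not collate(recognized, registration):
        return []  # mismatch: nothing is driven
    fired = []
    # Sorted only so that the order of the returned list is deterministic.
    for response in sorted(registration.responses, key=lambda r: r.value):
        drivers[response]()
        fired.append(response)
    return fired
```

Keeping the drivers as injected callables means the same logic can be exercised without hardware, for example by substituting functions that append to a list.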

このように、本実施の形態の無線通信端末１００によれば、外光に変化があった場合に音声認識処理を開始し、その後に入力された音声（ユーザーの呼びかけ）の単語が登録済みの単語と一致した場合、振動、光又は楽音を発生して応答する。したがって、音声認識開始用の専用キーを操作しなくても外光の変化によって音声認識処理を開始するので、無線通信端末１００を鞄内に入れたままでも呼びかけることで応答することから容易に探し出すことができる。 As described above, according to the wireless communication terminal 100 of the present embodiment, the speech recognition process is started when there is a change in the external light, and the input speech (user call) word is registered. If it matches the word, it responds by generating vibration, light or music. Therefore, since the voice recognition process is started by the change of the external light without operating the dedicated key for starting voice recognition, it is easily searched for by responding by calling even when the wireless communication terminal 100 is put in the bag. be able to.

なお、上記実施の形態では、登録する単語及び応答動作を操作部１０６にてキー入力するようにしたが、マイク１０５より音声入力して登録するようにしても良い。このとき、入力した音声は音声認識部１０７にて認識されて、その結果が記録部１０４に登録されることになる。 In the above embodiment, the word to be registered and the response operation are key-inputted by the operation unit 106, but may be registered by voice input from the microphone 105. At this time, the input voice is recognized by the voice recognition unit 107, and the result is registered in the recording unit 104.

(実施の形態２) (Embodiment 2)
図２は、本発明の実施の形態２の無線通信端末の概略構成を示すブロック図である。なお、図２において前述した実施の形態１の無線通信端末と機能の共通する部分には同じ符号を付けて説明を省略する。 FIG. 2 is a block diagram showing a schematic configuration of a wireless communication terminal according to Embodiment 2 of the present invention. In FIG. 2, parts having the same functions as those of the wireless communication terminal of Embodiment 1 described above are given the same reference numerals, and their description is omitted.

図２に示すように、本実施の形態の無線通信端末２００は、マイク１０５にて入力された音声の特徴量を検出する測定部２０１と、音声認識部１０７にて認識された単語及び応答動作と測定部２０１にて検出された入力音声の特徴量とを登録する記録部２０２と、記録部２０２に登録された単語とこの単語が登録された後であって制御部１０２から音声認識部１０７に音声認識処理開始指示が出力された後にマイク１０５より音声入力されて認識された単語とを照合するとともに、音声認識処理開始指示が出力された後にマイク１０５より入力された音声の特徴量が記録部２０２に登録された音声の特徴量の判定基準内かを照合する照合部２０３とを備えている。 As illustrated in FIG. 2, the wireless communication terminal 200 according to the present embodiment includes a measurement unit 201 that detects a feature amount of speech input by a microphone 105, a word recognized by the speech recognition unit 107, and a response operation. And a recording unit 202 for registering the input voice feature value detected by the measuring unit 201, a word registered in the recording unit 202, and after the word is registered, the control unit 102 to the speech recognition unit 107. Are compared with words recognized by voice input from the microphone 105 after the voice recognition process start instruction is output, and the feature amount of the voice input from the microphone 105 after the voice recognition process start instruction is output is recorded. And a collation unit 203 for collating whether or not the voice feature value registered in the unit 202 is within the determination criterion.

なお、上記測定部２０１は音声特徴量検出手段に対応し、また記録部２０２は登録手段に対応する。 The measurement unit 201 corresponds to a voice feature amount detection unit, and the recording unit 202 corresponds to a registration unit.

応答処理部１０９は、照合部２０３の照合結果から、音声認識処理開始指示の出力後にマイク１０５より入力された音声の単語が登録された単語と一致し、且つ当該音声の特徴量が判定基準内の場合にのみ記録部２０２に登録された応答動作に従ってバイブ駆動部１０９ａ、ＬＥＤ駆動部１０９ｂ及びスピーカ駆動部１０９ｃのうち少なくとも１つを動作させて、バイブレータ１１１の振動、ＬＥＤ部１１２の点灯又はスピーカ１１３からの楽音の再生を行ってユーザーに無線通信端末２００の所在を知らせる。 Based on the collation result of the collation unit 203, the response processing unit 109 operates at least one of the vibrator driving unit 109a, the LED driving unit 109b, and the speaker driving unit 109c in accordance with the response operation registered in the recording unit 202 only when the word in the voice input from the microphone 105 after the voice recognition processing start instruction was output matches the registered word and the feature amount of that voice is within the determination criterion. It thereby vibrates the vibrator 111, lights the LED unit 112, or plays a musical sound from the speaker 113 to notify the user of the location of the wireless communication terminal 200.

次に、上記構成の無線通信端末２００の音声認識処理について説明する。まず、ユーザーがマイク１０５から無線通信端末２００が応答する単語を音声入力し、さらに応答動作を操作部１０６によるキー入力又はマイク１０５による音声入力する。音声入力された単語と音声入力又はキー入力された応答動作が記録部２０２に記録される。このときマイク１０５より入力された音声は測定部２０１によってその特徴量が測定されて、その特徴量が基準値として上記単語及び応答動作と関連付けて記録部２０２に記録される。応答用の単語と応答動作を記録させた後、無線通信端末２００を例えば鞄内などの外光が遮られるところに置くものとする。 Next, the voice recognition processing of the wireless communication terminal 200 having the above configuration will be described. First, the user speaks into the microphone 105 the word to which the wireless communication terminal 200 should respond, and then enters the response operation either by key input through the operation unit 106 or by voice input through the microphone 105. The voice-input word and the voice-input or key-input response operation are recorded in the recording unit 202. At this time, the measurement unit 201 measures the feature amount of the voice input from the microphone 105, and that feature amount is recorded in the recording unit 202 as a reference value in association with the word and the response operation. After the response word and response operation have been recorded, assume that the wireless communication terminal 200 is placed somewhere external light is blocked, for example inside a bag.

光検知部１０１は、無線通信端末２００に電源が投入された時点から外光の強度を検出し、その結果を制御部１０２に入力するので、鞄の開け閉めにより外光に変化が生じると光検知部１０１の出力値が変化することになる。制御部１０２は、光検知部１０１の出力値の変化を検知すると、音声認識部１０７に対して音声認識処理を開始させる指示を与える。このときタイマ１０３にて設定された時間の間、応答処理部１０９が応答しなかった場合には音声認識処理を停止させる。 The light detection unit 101 detects the intensity of external light from the time the wireless communication terminal 200 is powered on and inputs the result to the control unit 102, so when the external light changes because the bag is opened or closed, the output value of the light detection unit 101 changes. When the control unit 102 detects a change in the output value of the light detection unit 101, it instructs the voice recognition unit 107 to start voice recognition processing. If the response processing unit 109 does not respond within the time set by the timer 103, the voice recognition processing is stopped.

音声認識部１０７が音声認識処理を開始した後、外部より音声がマイク１０５に入力されると即ちユーザーが呼びかける発声をすると、音声認識部１０７がマイク１０５からの音声入力に対して音声認識を行い入力された音声の単語を検出する。また同時に測定部２０１が入力された音声の特徴量を測定する。そして、検出された単語と測定された音声の特徴量が照合部２０３に入力される。照合部２０３は、入力された音声の単語及び特徴量が、記録部２０２に記録されている単語と一致し、且つ記録されている特徴量の判定基準内かを照合し、一致しない又は判定基準値外の場合は制御部１０２に不一致の結果を入力し、一致し、且つ判定基準値内の場合は応答処理部１０９にオン信号を入力する。応答処理部１０９は、照合部２０３からオン信号が入力されると、記録部２０２に登録されている応答動作に基づいてバイブ駆動部１０９ａ、ＬＥＤ駆動部１０９ｂ及びスピーカ駆動部１０９ｃのうち少なくとも１つを動作させて、バイブレータ１１１の振動、ＬＥＤ部１１２の点灯又はスピーカ１１３からの楽音の再生を行ってユーザーに無線通信端末２００の所在を知らせる。 After the voice recognition unit 107 starts voice recognition processing, when voice is input to the microphone 105 from outside, that is, when the user calls out, the voice recognition unit 107 performs voice recognition on the input from the microphone 105 and detects the word in the input voice. At the same time, the measurement unit 201 measures the feature amount of the input voice. The detected word and the measured feature amount are then input to the collation unit 203. The collation unit 203 checks whether the word in the input voice matches the word recorded in the recording unit 202 and whether its feature amount is within the determination criterion of the recorded feature amount. If the word does not match or the feature amount is outside the criterion, it inputs a mismatch result to the control unit 102; if the word matches and the feature amount is within the criterion, it inputs an ON signal to the response processing unit 109. When the ON signal is input from the collation unit 203, the response processing unit 109 operates at least one of the vibrator driving unit 109a, the LED driving unit 109b, and the speaker driving unit 109c in accordance with the response operation registered in the recording unit 202, and thereby vibrates the vibrator 111, lights the LED unit 112, or plays a musical sound from the speaker 113 to notify the user of the location of the wireless communication terminal 200.
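The two-condition check performed by the collation unit 203 can be illustrated with a short sketch. The specification does not say what the voice feature amount is or how the determination criterion is defined, so the sketch assumes, purely for illustration, a single scalar feature (for example an average pitch in Hz) and a symmetric tolerance around the registered reference value. `VoiceRegistration`, `tolerance`, and `should_respond` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VoiceRegistration:
    """Entry in the recording unit (202): word plus the speaker's reference feature."""
    word: str
    feature: float    # reference value measured by the measurement unit (201) at registration
    tolerance: float  # hypothetical half-width of the determination criterion


def should_respond(recognized_word: str, measured_feature: float,
                   reg: VoiceRegistration) -> bool:
    """Role of the collation unit (203): both conditions must hold for an ON signal."""
    word_ok = recognized_word.strip().casefold() == reg.word.strip().casefold()
    feature_ok = abs(measured_feature - reg.feature) <= reg.tolerance
    return word_ok and feature_ok
```

With a registration of `VoiceRegistration("where are you", 120.0, 15.0)`, the registered word spoken with a measured feature of 128.0 would pass, while the same word with a feature of 200.0, or a different word with a feature of 120.0, would be rejected. This is the sense in which adding the feature check reduces responses to other speakers compared with word-only collation.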

このように、本実施の形態の無線通信端末２００によれば、前述した実施の形態１の無線通信端末１００と同じ作用及び効果が得られるとともに、音声による単語登録時にユーザーの音声の特徴量を測定して記録し、この特徴量を音声認識の判定基準とするので、ユーザー以外の人の発声や周囲の雑音への応答を最小限に抑えることができる。即ち、単語のみで照合する場合と比べて誤り率の少ない音声認識機能が得られる。 As described above, according to the wireless communication terminal 200 of the present embodiment, the same operation and effects as those of the wireless communication terminal 100 of Embodiment 1 are obtained. In addition, because the feature amount of the user's voice is measured and recorded at the time of word registration by voice and is used as a criterion for voice recognition, responses to utterances by people other than the user and to ambient noise can be minimized. That is, a voice recognition function with a lower error rate is obtained than when collation is based on the word alone.

The present invention has the effect that voice recognition processing can be started when the external light changes, without operating a dedicated voice recognition key, and is applicable to wireless communication terminals such as mobile phones, PHS terminals, and PDAs.
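The light-triggered start of voice recognition can be sketched as below. This is a minimal illustration under stated assumptions: the class name `RecognitionTrigger`, the numeric light level, and the fixed `threshold` used to decide that the external light "changed" are hypothetical, because the description does not state how a change in the output of the light detection unit 101 is judged.

```python
class RecognitionTrigger:
    """Hypothetical sketch: start recognition when the light level changes."""

    def __init__(self, threshold: float) -> None:
        self.threshold = threshold  # assumed minimum change that counts
        self.last_level = None      # previous light sample, if any
        self.recognizing = False    # True once recognition has been started

    def on_light_sample(self, level: float) -> bool:
        """Feed one light sample; return whether recognition is active."""
        if (self.last_level is not None
                and abs(level - self.last_level) >= self.threshold):
            # A sufficiently large change stands in for "a change in
            # external light", e.g. a bag being opened.
            self.recognizing = True
        self.last_level = level
        return self.recognizing


trigger = RecognitionTrigger(threshold=50.0)
for sample in (5.0, 8.0, 300.0):  # made-up light levels
    print(sample, trigger.on_light_sample(sample))
```

No key press is involved: the only input is the stream of light samples, which matches the stated effect of starting recognition without a dedicated key.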

FIG. 1 is a block diagram showing a schematic configuration of a wireless communication terminal according to Embodiment 1 of the present invention.
FIG. 2 is a block diagram showing a schematic configuration of a wireless communication terminal according to Embodiment 2 of the present invention.
FIG. 3 is a block diagram showing a schematic configuration of a conventional mobile phone.

Explanation of symbols

101 Light detection unit
102 Control unit
103 Timer
104, 202 Recording unit
105 Microphone
106 Operation unit
107 Voice recognition unit
108, 203 Collation unit
109 Response processing unit
109a Vibrator drive unit
109b LED drive unit
109c Speaker drive unit
111 Vibrator
112 LED unit
113 Speaker
201 Measurement unit

Claims

A wireless communication terminal comprising:
Registration means for registering a key-input word and a response operation;
Voice input means for inputting voice;
Voice recognition means for recognizing a word input by the voice input means;
Light detection means for detecting light outside the device;
Response start instruction output means for instructing the voice recognition means to start voice recognition processing when there is a change in the output of the light detection means;
Collating means for collating the word registered in the registration means with a word that is input by voice and recognized after the word has been registered and after the voice recognition processing has started; and
Response means for generating at least one of vibration, light, and a musical tone according to the response operation registered in the registration means when the collation result from the collating means shows that the word registered in the registration means matches the word input by voice and recognized after the voice recognition processing has started.

A wireless communication terminal comprising:
Voice input means for inputting voice;
Voice recognition means for recognizing a word input by the voice input means;
Registration means for registering the word recognized by the voice recognition means and a response operation;
Light detection means for detecting light outside the device;
Response start instruction output means for instructing the voice recognition means to start voice recognition processing when there is a change in the output of the light detection means;
Collating means for collating the word registered in the registration means with a word that is input by voice and recognized after the word has been registered and after the voice recognition processing has started; and
Response means for generating at least one of vibration, light, and a musical tone according to the response operation registered in the registration means when the collation result from the collating means shows that the word registered in the registration means matches the word input by voice and recognized after the voice recognition processing has started.

A wireless communication terminal comprising:
Voice input means for inputting voice;
Voice recognition means for recognizing a word input by the voice input means;
Voice feature quantity detection means for detecting a feature quantity of the voice input by the voice input means;
Registration means for registering the word recognized by the voice recognition means and a response operation in association with the voice feature quantity detected by the voice feature quantity detection means;
Light detection means for detecting light outside the device;
Response start instruction output means for instructing the voice recognition means to start voice recognition processing when there is a change in the output of the light detection means;
Collating means for collating the word registered in the registration means with a word that is input by voice after the word has been registered and after the voice recognition processing has started, and for checking whether the feature quantity of the voice input after the voice recognition processing has started is within a judgment criterion for the voice feature quantity registered in the registration means; and
Response means for generating at least one of vibration, light, and a musical tone according to the response operation registered in the registration means only when the collation result from the collating means shows that the word input by voice after the voice recognition processing has started matches the registered word and that the feature quantity of the voice is within the judgment criterion.

The wireless communication terminal according to claim 1, wherein the response means generates vibration by driving a vibrator.

The wireless communication terminal according to any one of claims 1 to 3, wherein the response means generates light by driving a light emitting element.

The wireless communication terminal according to claim 1, wherein the response means generates a musical tone by driving a speaker.