JP2000125006A

JP2000125006A - Speech recognition device, speech recognition method, and automatic telephone answering device

Info

Publication number: JP2000125006A
Application number: JP10296297A
Authority: JP
Inventors: Katsufumi Fukunishi; 克文福西; Masatoshi Morishima; 昌俊森島
Original assignee: NTT Data Corp
Current assignee: NTT Data Group Corp
Priority date: 1998-10-19
Filing date: 1998-10-19
Publication date: 2000-04-28

Abstract

PROBLEM TO BE SOLVED: To avoid limiting the speech to be recognized without lowering speech recognition performance. SOLUTION: On being notified of incoming call detection information and caller telephone number information by a telephone line control unit 3, a main control unit 5 passes the telephone number information to a dictionary selection unit 11 and sends a line connection request to the control unit 3. When the selection unit 11 returns information on the type of dictionary to be used, the main control unit 5 reports it to a speech recognition unit 7. The main control unit 5 then sends a response sentence output request to a response guidance output unit 9, and when the output unit 9 returns a guidance output end notification indicating that the requested response sentence has been output to a public telephone network 1, the main control unit 5 requests the recognition unit 7 to perform speech recognition. After the recognition unit 7 reports the recognition result, the main control unit 5 decides whether to end the series of processing operations. When it receives the telephone number information, the selection unit 11 determines whether the number belongs to a mobile phone, a PHS, or an ordinary telephone. Based on that determination, it selects one of a mobile phone dictionary 13, a general public line dictionary 15, and a PHS line dictionary 17 and reports the selection to the main control unit 5.

Description

DETAILED DESCRIPTION OF THE INVENTION

[0001]

[Technical field of the invention] The present invention relates to improvements in a speech recognition device, a speech recognition method, and an automatic telephone answering device. An example in which the speech recognition device is applied to an automatic telephone answering device is described below.

[0002]

[Prior art] In general, speech recognition processing converts an input speech signal from analog to digital, computes feature values from the digital signal, and obtains a recognition result by comparing those feature values with a recognition dictionary. The recognition dictionary consists of the recognition vocabulary, that is, the words to be recognized or the words used to compose the sentences to be recognized, and is built on a basic dictionary for speech recognition such as a phoneme model or a grammar model. The basic dictionary is created in advance from a large amount of speech data. A recognition dictionary is used in speech recognition processing because, at present, it is difficult to recognize every possible word with the basic dictionary alone.
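The three stages named above (analog-to-digital conversion, feature calculation, comparison against a recognition dictionary) can be illustrated with a deliberately small sketch. Everything in it is an assumption made for illustration: the function names, the use of sine tones in place of real utterances, the two toy features, and the nearest-neighbour comparison are not taken from the patent, which does not specify how features or dictionary matching are implemented.

```python
import math


def tone(freq, amplitude=1.0, n=200):
    """Synthetic stand-in for an analog utterance: a sine wave with n samples."""
    return [amplitude * math.sin(2 * math.pi * freq * i / n) for i in range(n)]


def digitize(analog, levels=256):
    """Quantize samples in [-1.0, 1.0] to signed integers (toy A/D conversion)."""
    half = levels // 2
    return [max(-half, min(half - 1, round(x * half))) for x in analog]


def features(samples, levels=256):
    """Two toy features: normalized mean amplitude and zero-crossing rate."""
    half = levels // 2
    n = len(samples)
    energy = sum(abs(s) for s in samples) / (n * half)
    crossings = sum(1 for a, b in zip(samples, samples[1:]) if (a < 0) != (b < 0))
    return (energy, crossings / (n - 1))


def recognize(analog, dictionary):
    """Return the vocabulary entry whose reference features are nearest."""
    observed = features(digitize(analog))
    return min(dictionary, key=lambda word: math.dist(observed, dictionary[word]))


# Hypothetical recognition dictionary: vocabulary word -> reference features.
recognition_dictionary = {
    "yes": features(digitize(tone(2))),
    "no": features(digitize(tone(20))),
}
```

Calling `recognize(tone(3, amplitude=0.9), recognition_dictionary)` compares a slightly different, quieter input against both entries and picks the closer one. A real recognizer would use acoustic features and statistical models rather than this distance rule.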

[0003]

[Problems to be solved by the invention] It is known that speech recognition performance is high when the environment of the speech to be recognized is the same as the environment in which the basic dictionary was created, and low otherwise. For example, if the speech data used to create a basic dictionary comes from mobile phones, a recognition dictionary built from that basic dictionary gives high recognition performance for speech uttered through a mobile phone. The same recognition dictionary, however, gives only low recognition performance for speech from a telephone in a different environment, for example an ordinary telephone connected to the public telephone network. Consequently, when speech from various kinds of telephones is to be recognized, as in this example, the recognition target must be limited to speech from telephones whose environment matches the environment in which each basic dictionary was created, and recognition must be performed with the corresponding recognition dictionary. If, on the other hand, the recognition target is deliberately left unrestricted, then to absorb the differences between environments the system must use a recognition dictionary derived from a basic dictionary created from speech data for every environment, one per telephone type. In that case recognition performance is unavoidably lower than when a recognition dictionary suited to each individual environment is used.

[0004] As described above, conventional systems have adopted one of two methods: limiting the environment of the speech to be recognized and performing recognition with a recognition dictionary for that environment, or performing recognition with a single recognition dictionary created to cover all environments. The former limits the recognition target, and the latter lowers recognition performance.

[0005] Accordingly, an object of the present invention is to ensure that the speech to be recognized is not limited and that speech recognition performance is not lowered.

[0006]

[Means for solving the problems] A speech recognition device according to a first aspect of the present invention comprises a plurality of speech recognition dictionaries set for each environment of the speech to be recognized, means for identifying the environment of input speech, and means for selecting, from among the dictionaries, the dictionary suited to the identified environment, and is configured to perform recognition processing of the input speech using the selected dictionary.

[0007] In the above configuration, the input speech is recognized using the speech recognition dictionary chosen by the selection means, so the speech to be recognized is not limited and speech recognition performance is not lowered.

[0008] In a preferred embodiment according to the first aspect of the present invention, the device further comprises a speech recognition dictionary derived from a basic dictionary created from the speech data of every environment of the speech to be recognized, and when the identification means cannot identify the environment of the input speech, recognition is performed using that dictionary. The speech to be recognized originates from a mobile phone, a PHS, or an ordinary telephone connected to the public telephone network. The plurality of speech recognition dictionaries are set to correspond to speech from a mobile phone, speech from a PHS, and speech from an ordinary telephone, respectively.

[0009] The identification means identifies the environment of the speech to be recognized based on the notified caller telephone number. This identification is performed, for example, by comparing the notified caller telephone number with a telephone number/dictionary table in which a plurality of telephone numbers are registered, classified by mobile phone, PHS, and ordinary telephone. The selection means selects the corresponding dictionary from among the speech recognition dictionaries based on the identification result.

[0010] A speech recognition method according to a second aspect of the present invention comprises a first step of setting a speech recognition dictionary for each environment of the speech to be recognized and identifying the environment of input speech, a second step of selecting, from among the dictionaries, the dictionary suited to the identified environment, and a third step of performing recognition processing of the input speech using the selected dictionary.
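The three steps of the method can be sketched as one function that chains them. This is only a structural illustration under stated assumptions: `recognize_call`, the callable parameters, and the stub dictionaries in the usage lines are hypothetical names, and the actual identification and recognition logic is supplied from outside rather than taken from the patent.

```python
from typing import Callable, Mapping


def recognize_call(
    audio: str,
    caller_number: str,
    identify: Callable[[str], str],
    dictionaries: Mapping[str, dict],
    recognize: Callable[[str, dict], str],
) -> str:
    """Chain the three steps: identify environment, select dictionary, recognize."""
    environment = identify(caller_number)   # first step: identify the environment
    dictionary = dictionaries[environment]  # second step: select the matching dictionary
    return recognize(audio, dictionary)     # third step: recognize with that dictionary


# Usage with stand-in components (all hypothetical).
dictionaries = {"mobile": {"tuned_for": "mobile"}, "public": {"tuned_for": "public"}}
identify = lambda number: "mobile" if number.startswith("010") else "public"
recognize = lambda audio, dictionary: f"{audio} (via {dictionary['tuned_for']} dictionary)"
result = recognize_call("hello", "010-1234-5678", identify, dictionaries, recognize)
```

Passing the identifier and recognizer as parameters keeps the sketch independent of how either is realized.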

[0011] An automatic telephone answering device according to a third aspect of the present invention automatically responds to a calling telephone based on received speech, and comprises a plurality of speech recognition dictionaries set for each environment of the speech to be recognized, means for selecting from among the dictionaries the dictionary suited to the environment of the received speech and performing speech recognition processing, and means for transmitting to the calling telephone a response message corresponding to the result of the recognition processing.

[0012] In a preferred embodiment according to the third aspect of the present invention, the calling telephone is a mobile phone, a PHS, or an ordinary telephone connected to the public telephone network. The plurality of speech recognition dictionaries are set to correspond to speech from a mobile phone, speech from a PHS, and speech from an ordinary telephone, respectively.

[0013] In a modification of the above embodiment, the device further comprises a speech recognition dictionary derived from a basic dictionary created from the speech data of every environment of the speech to be recognized, and when the speech recognition processing means cannot identify the environment of the input speech, it performs recognition using that dictionary. The speech recognition processing means identifies the environment of the speech by comparing the notified caller telephone number with a telephone number/dictionary table in which a plurality of telephone numbers are registered, classified by mobile phone, PHS, and ordinary telephone.

[0014] A program medium according to a fourth aspect of the present invention carries, in computer-readable form, a computer program for causing a computer to operate as each of the means of a speech recognition device that comprises a plurality of speech recognition dictionaries set for each environment of the speech to be recognized, means for identifying the environment of input speech, and means for selecting from among the dictionaries the dictionary suited to the identified environment, and that performs recognition processing of the input speech using the selected dictionary.

[0015]

[Embodiments of the invention] Embodiments of the present invention are described in detail below with reference to the drawings.

[0016] FIG. 1 is a block diagram showing the overall configuration of an automatic telephone answering device to which an embodiment of the present invention is applied.

[0017] As shown in FIG. 1, the device is connected to a public telephone network (line) 1 and comprises a telephone line control unit (control unit) 3, a main control unit 5, a speech recognition unit 7, and a response guidance output unit (output unit) 9. In addition to these units, the device also comprises a dictionary selection unit (selection unit) 11, a mobile phone dictionary (portable dictionary) 13, a general public line dictionary (public dictionary) 15, a PHS line dictionary (PHS dictionary) 17, and a telephone number/dictionary correspondence table (table) 19.

[0018] The control unit 3 detects an incoming call from the line 1 and, on detecting the call, receives the caller ID notification transmitted over the line 1 and detects the caller's telephone number from it. The control unit 3 notifies the main control unit 5 of the detected call and the caller's telephone number as incoming call detection information and caller telephone number information. The control unit 3 also accepts a line connection request from the main control unit 5, connects the line 1 in response, and notifies the main control unit 5 that the connection to the line 1 is complete. In addition, when a response sentence is sent from the output unit 9, the control unit 3 accepts it and outputs it to the line 1.

[0019] On receiving the incoming call detection information and caller telephone number information from the control unit 3, the main control unit 5 passes the caller telephone number information to the selection unit 11 and sends a line connection request to the control unit 3. When the selection unit 11 sends information on the dictionary type to be used (dictionary type information), the main control unit 5 accepts it and passes it to the speech recognition unit 7.

[0020] The main control unit 5 also sends a response sentence output request to the output unit 9. When the output unit 9 sends a guidance output end notification, indicating that the response sentence for that request has been output to the line 1 through the control unit 3, the main control unit 5 accepts it and requests the speech recognition unit 7 to execute speech recognition. Further, when the speech recognition unit 7 reports a speech recognition result, the main control unit 5 accepts it and determines whether the series of speech recognition processing operations should be ended. If it determines that the processing should not be ended, it decides on a response guidance sentence based on the speech recognition result and again sends a response sentence output request to the output unit 9.

[0021] When the main control unit 5 sends dictionary type information, the speech recognition unit 7 accepts it and initializes itself with that information (the dictionary type). When the main control unit 5 sends a request to execute speech recognition, the speech recognition unit 7 accepts it, acquires the speech information to be recognized from the line 1 via the control unit 3, and executes speech recognition processing. It then reports the resulting speech recognition result to the main control unit 5.

[0022] When the main control unit 5 sends a response sentence output request, the output unit 9 accepts it, outputs the response sentence corresponding to the request to the line 1 through the control unit 3, and then sends a guidance output end notification to the main control unit 5.

[0023] When the main control unit 5 sends caller telephone number information, the selection unit 11 accepts it and, by referring to the table 19, identifies whether the number belongs to a mobile phone, a PHS, or an ordinary telephone. Based on the identification result, it selects the dictionary to be used from among the portable dictionary 13, the public dictionary 15, and the PHS dictionary 17, and notifies the main control unit 5 of the selected dictionary type.

[0024] The portable dictionary 13, the public dictionary 15, and the PHS dictionary 17 are all recognition dictionaries of the kind described above. In the portable dictionary 13, predetermined number information indicating a mobile phone number is registered in association with the corresponding telephone number information. In the public dictionary 15, predetermined number information indicating an ordinary telephone number is registered in association with the corresponding telephone number information. Likewise, in the PHS dictionary 17, as in the dictionaries 13 and 15, predetermined number information indicating a PHS number is registered in association with the corresponding telephone number information.

[0025] As is clear from the foregoing, the table 19 stores information for identifying which of the portable dictionary 13, the public dictionary 15, and the PHS dictionary 17 the caller telephone number information belongs to, that information having been obtained by the control unit 3 and sent from the main control unit 5 to the selection unit 11.

[0026] FIG. 2 is an explanatory diagram showing an example of the table 19 described above.

[0027] As shown in FIG. 2, the table 19 comprises a dictionary type information storage area 19a, divided into parts assigned to the portable dictionary 13, the public dictionary 15, and the PHS dictionary 17 respectively, and a telephone number information storage area 19b assigned to each of those divided parts of the storage area 19a.

[0028] In the part of the storage area 19b that corresponds to the portable dictionary 13, the number information 010, 020, 030, 040, and 080, which indicates a mobile phone number, is registered together with the telephone number information associated with each item of number information. In the part of the storage area 19b that corresponds to the public dictionary 15, the number information 011 and 0166 to 099, which indicates an ordinary telephone number (that is, area codes, from which the region can be determined), is registered together with the telephone number information associated with each item of number information. In the part of the storage area 19b that corresponds to the PHS dictionary 17, the number information 050, which indicates a PHS number, is registered together with the telephone number information associated with it.
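The table of FIG. 2 can be pictured as a mapping from dictionary type (area 19a) to number information (area 19b). The sketch below is a hedged illustration: the names `TABLE_19` and `dictionary_type_for` and the longest-prefix matching rule are assumptions, and for ordinary telephones only the area codes explicitly named in the text (011, 0166, 099) are listed, because the codes in between are not enumerated here.

```python
from typing import Mapping, Optional, Tuple

# Dictionary type (area 19a) -> number information (area 19b), as given in the text.
TABLE_19: Mapping[str, Tuple[str, ...]] = {
    "mobile": ("010", "020", "030", "040", "080"),
    "phs": ("050",),
    # The text gives "011, 0166 to 099"; the intermediate area codes are not
    # enumerated in the text, so only the named values appear here.
    "public": ("011", "0166", "099"),
}


def dictionary_type_for(number: str, table: Mapping[str, Tuple[str, ...]] = TABLE_19) -> Optional[str]:
    """Return the dictionary type whose number information prefixes the caller number.

    The longest matching prefix wins (an assumed rule). None means the
    environment could not be identified from the table.
    """
    digits = "".join(ch for ch in number if ch.isdigit())
    best_type: Optional[str] = None
    best_length = 0
    for dictionary_type, prefixes in table.items():
        for prefix in prefixes:
            if digits.startswith(prefix) and len(prefix) > best_length:
                best_type, best_length = dictionary_type, len(prefix)
    return best_type
```

Returning `None` for an unmatched number leaves room for the variant in which a general-purpose dictionary is used when the environment cannot be identified.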

[0029] FIG. 3 is a timing chart showing the processing operations of the units of the device shown in FIG. 1.

[0030] In FIG. 3, the control unit 3 first detects an incoming call from the line 1 (step S21), then detects the caller's telephone number from the received caller ID notification (step S22), and notifies the main control unit 5 of the incoming call detection and the caller telephone number (step S23). The caller telephone number is accepted by the main control unit 5 and then passed to the selection unit 11 (step S24). In parallel with this notification, the main control unit 5 sends a line connection request to the control unit 3 (step S25). On receiving the notification, the selection unit 11 refers to the table 19 to identify the type of telephone to which the telephone number belongs, selects the type of dictionary to be used, and notifies the main control unit 5 of the selected dictionary type (step S26).

[0031] Meanwhile, the control unit 3 connects the line 1 in response to the line connection request from the main control unit 5 and notifies the main control unit 5 that the connection to the line 1 is complete (step S27). Next, when the selection unit 11 sends the dictionary type information, the main control unit 5 accepts it and passes it to the speech recognition unit 7 (step S28). On receiving the information, the speech recognition unit 7 uses it to initialize itself (step S29).

[0032] Next, the main control unit 5 sends a response sentence output request to the output unit 9 (step S30). On receiving the output request, the output unit 9 outputs the response sentence corresponding to the request to the line 1 through the control unit 3 (step S31) and sends a guidance output end notification to the main control unit 5 (step S32). On receiving this notification, the main control unit 5 requests the speech recognition unit 7 to execute speech recognition (step S33). The speech recognition unit 7 acquires the speech information to be recognized from the line 1 via the control unit 3 and executes speech recognition processing (step S34). It then reports the resulting speech recognition result to the main control unit 5 (step S35).

[0033] On receiving the speech recognition result, the main control unit 5 determines from it whether the series of speech recognition processing operations should be ended (step S36). If it determines that the processing may be ended, the series of processing operations ends and the control unit 3 enters a state of waiting for an incoming call (step S37). If it determines that the processing should not be ended, it decides on the next response guidance sentence based on the speech recognition result and again sends a response sentence output request to the output unit 9 (step S30). In this way, the output of response sentences and the speech recognition processing are repeated until the series of processing operations is complete.
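The sequence of steps S21 to S37 can be traced with a short simulation. This is a sketch under assumptions, not the device's implementation: `run_call`, its parameters, the default prompt text, and the stub components in the usage lines are all hypothetical, and the units of FIG. 1 are collapsed into plain callables.

```python
from typing import Callable, Iterator, List, Optional, Tuple


def run_call(
    caller_number: str,
    audio_turns: Iterator[str],
    select_dictionary: Callable[[str], str],
    recognize: Callable[[str, str], str],
    next_prompt: Callable[[str], Optional[str]],
    first_prompt: str = "How can I help you?",
) -> List[Tuple[str, str]]:
    """Return a trace of (step label, event) pairs for one call."""
    trace = [("S21-S23", f"incoming call from {caller_number}")]
    dictionary_type = select_dictionary(caller_number)
    trace.append(("S24,S26", f"dictionary type selected: {dictionary_type}"))
    trace.append(("S25,S27", "line connected"))
    trace.append(("S28-S29", f"recognizer initialized with {dictionary_type}"))
    prompt: Optional[str] = first_prompt
    while prompt is not None:
        trace.append(("S30-S32", f"response output: {prompt}"))
        result = recognize(next(audio_turns), dictionary_type)
        trace.append(("S33-S35", f"recognized: {result}"))
        prompt = next_prompt(result)  # S36: None means the dialogue should end
    trace.append(("S37", "waiting for the next incoming call"))
    return trace


# Usage with stand-in components (all hypothetical).
trace = run_call(
    "010-1234-5678",
    iter(["balance", "bye"]),
    select_dictionary=lambda number: "mobile",
    recognize=lambda audio, dictionary_type: audio.upper(),
    next_prompt=lambda result: None if result == "BYE" else "Anything else?",
)
```

The loop mirrors the return from step S36 to step S30: each pass outputs one response sentence and performs one recognition, and the call ends when no further prompt is needed.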

[0034] As described above, according to this embodiment of the present invention, the caller's environment can be identified, including the telephone type (whether the caller is using a mobile phone, a PHS, or an ordinary telephone) and the calling area, so the speech recognition dictionary prepared for that environment can be used when recognizing the received speech. Recognition accuracy can therefore be improved, which in turn improves the usability of the entire system that includes the automatic telephone answering device.

[0035] The above description concerns only one embodiment of the present invention and of course does not mean that the invention is limited to it. For example, in addition to the dictionaries 13 to 17, the automatic telephone answering device may further include a speech recognition dictionary derived from a basic dictionary created from the speech data of every environment of the speech to be recognized, and speech recognition may be performed using that dictionary when the selection unit 11 cannot identify the environment of the input speech.
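The fallback in this variant amounts to a default choice when identification fails. A minimal sketch, assuming the environment is reported as `None` when it cannot be identified (the name `choose_dictionary` and the string placeholders for dictionaries are hypothetical):

```python
from typing import Mapping, Optional, TypeVar

D = TypeVar("D")


def choose_dictionary(
    environment: Optional[str],
    per_environment: Mapping[str, D],
    general: D,
) -> D:
    """Use the environment-specific dictionary when known, else the general one."""
    if environment is not None and environment in per_environment:
        return per_environment[environment]
    return general


# Usage with placeholder dictionaries (hypothetical values).
per_environment = {"mobile": "dictionary 13", "public": "dictionary 15", "phs": "dictionary 17"}
chosen = choose_dictionary(None, per_environment, general="all-environment dictionary")
```

The general dictionary is expected to recognize less accurately than a matched one, so it serves only as a last resort.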

[0036]

[Effects of the invention] As described above, according to the present invention, the speech to be recognized is not limited and speech recognition performance is not lowered.

[Brief description of the drawings]

[FIG. 1] A block diagram showing the overall configuration of an automatic telephone answering device to which an embodiment of the present invention is applied.

[FIG. 2] An explanatory diagram showing an example of the telephone number/dictionary table provided in the device of FIG. 1.

[FIG. 3] A timing chart showing the processing operations of the units of the device of FIG. 1.

[Explanation of symbols]

1: public telephone network (line); 3: telephone line control unit (control unit); 5: main control unit; 7: speech recognition unit; 9: response guidance output unit (output unit); 11: dictionary selection unit (selection unit); 13: mobile phone dictionary (portable dictionary); 15: general public line dictionary (public dictionary); 17: PHS line dictionary (PHS dictionary); 19: telephone number/dictionary correspondence table (table)

Continuation of front page. F-terms (reference): 5D015 GG01 HH13 KK02 KK04; 5K015 AA06 AA07 AF07 GA07; 5K024 AA75 AA76 BB01 BB03 BB04 CC01 CC11 DD01 DD02 EE01 EE09 FF06 GG01 GG10; 5K067 AA23 BB04 DD54 EE02 FF32 GG13

Claims

[Claims]

1. A speech recognition device comprising: a plurality of speech recognition dictionaries set for each environment of speech to be recognized; means for identifying the environment of input speech; and means for selecting, from among the dictionaries, a dictionary suited to the identified environment, wherein recognition processing of the input speech is performed using the selected dictionary.

2. The speech recognition device according to claim 1, further comprising a speech recognition dictionary obtained from a basic dictionary created from speech data for each environment of the speech to be recognized, wherein, when the identification means cannot identify the environment of the input speech, the speech recognition processing is performed using said speech recognition dictionary.

3. The speech recognition device according to claim 1, wherein the speech to be recognized originates from any one of a mobile phone, a PHS, and an ordinary telephone connected to a public telephone network.

4. The speech recognition device according to claim 1, wherein the plurality of speech recognition dictionaries are speech recognition dictionaries set to correspond to speech from a mobile phone, speech from a PHS, and speech from an ordinary telephone, respectively.

5. The speech recognition device according to claim 1, wherein the identification means identifies the environment of the speech to be recognized based on a notified caller telephone number.

6. The speech recognition device according to claim 5, wherein the identification means identifies the environment of the speech by comparing the notified caller telephone number with a telephone number/dictionary table in which a plurality of telephone numbers are registered, classified by mobile phone, PHS, and ordinary telephone.

7. The speech recognition apparatus according to claim 6, wherein said selecting means selects a corresponding dictionary from said speech recognition dictionaries based on said identification result.

8. A speech recognition method comprising: a first step of setting a speech recognition dictionary for each environment of speech to be recognized and identifying the environment of input speech; a second step of selecting, from among the dictionaries, a dictionary suited to the identified environment; and a third step of performing recognition processing of the input speech using the selected dictionary.

9. An automatic telephone answering device that automatically responds to a calling telephone based on received speech, comprising: a plurality of speech recognition dictionaries set for each environment of speech to be recognized; means for selecting, from among the dictionaries, a dictionary suited to the environment of the received speech and performing speech recognition processing; and means for transmitting to the calling telephone a response message corresponding to the result of the recognition processing.

10. The automatic telephone answering device according to claim 9, wherein the calling telephone is any one of a mobile phone, a PHS, and an ordinary telephone connected to a public telephone network.

11. The automatic telephone answering device according to claim 9, wherein the plurality of speech recognition dictionaries are speech recognition dictionaries set to correspond to speech from a mobile phone, speech from a PHS, and speech from an ordinary telephone, respectively.

12. The automatic telephone answering device according to claim 9, further comprising a speech recognition dictionary obtained from a basic dictionary created from speech data for each environment of the speech to be recognized, wherein, when the speech recognition processing means cannot identify the environment of the input speech, the speech recognition processing is performed using said speech recognition dictionary.

13. The automatic telephone answering device according to claim 9, wherein the speech recognition processing means identifies the environment of the speech by comparing a notified caller telephone number with a telephone number/dictionary table in which a plurality of telephone numbers are registered, classified by mobile phone, PHS, and ordinary telephone.

14. A computer-readable program medium carrying a computer program for causing a computer to operate as each of the means of a speech recognition device that comprises a plurality of speech recognition dictionaries set for each environment of speech to be recognized, means for identifying the environment of input speech, and means for selecting, from among the dictionaries, a dictionary suited to the identified environment, and that performs recognition processing of the input speech using the selected dictionary.