JPH08248987A

JPH08248987A - Voice recognition method

Info

Publication number: JPH08248987A
Application number: JP7054819A
Authority: JP
Inventors: Tetsuya Muroi; 哲也室井; Masako Hirose; 雅子広瀬
Original assignee: Ricoh Co Ltd
Current assignee: Ricoh Co Ltd
Priority date: 1995-03-14
Filing date: 1995-03-14
Publication date: 1996-09-27
Anticipated expiration: 2019-05-10
Also published as: JP3526101B2

Abstract

PURPOSE: To provide a voice recognition device that provides information to a caller, or supports the caller's operations, in the natural setting of the caller simply talking with another person, without stalling the conversation. CONSTITUTION: The device has a voice recognition means 3 that extracts only the voice signal of speaker A from a communication path 2 over which speakers A and B are conversing and performs voice recognition on it, and a recognition result processing means 4 that uses the recognition result to decide the information to be supplied to speaker A or the action that supports speaker A's operations. The means 4 has an information retrieval section; it picks out words that serve as keywords from speaker A's speech and, for example, shows the information related to those keywords on a display.

Description

Detailed Description of the Invention

[0001]

[Industrial Field of Application] The present invention relates to a voice recognition method for conversation conducted over a communication path. More specifically, it relates to a voice recognition method that recognizes speech during the conversation, retrieves information using the recognized speech as a keyword, and provides that information to a speaker.

[0002]

[Prior Art] Conventionally, when talking with a customer by telephone, for example to take an order for a product or to answer an inquiry about it, and the person serving the customer cannot reasonably memorize all of the product information, something like a ledger is kept beside the telephone while the conversation proceeds. The drawback is that converting the customer's order into an order code, or finding the desired information, takes time.

[0003] Instead of a ledger, a computer running order-entry or information-retrieval software is also often used, with input by keyboard or mouse. The drawback is that when the operator is not used to the operation, or when an order is of a kind that rarely occurs, the operator becomes confused by the operation and neglects the conversation with the customer.

[0004] A voice recognition device, on the other hand, allows input for information retrieval or code conversion to be made smoothly. One example is so-called voice dialing, in which saying "Mr. Tanaka of XX Co., Ltd." causes the push tones for that person's telephone number, 033-311-xxxx, to be output automatically (Japanese Published Translation of PCT Application No. H2-502149, Japanese Unexamined Patent Publication No. S62-140345, and others). However, because people are not in the habit of talking to machines, unnaturalness and a sense of discomfort remained a problem.

[0005]

[Problem to Be Solved by the Invention] The present invention seeks to solve the drawbacks and problems described above. It provides a voice recognition method that, in the more natural situation of one person talking with another, provides information to a party to the call, or supports that party's operations, without stalling the conversation.

[0006]

[Means for Solving the Problem] To achieve the above object, the present invention is characterized by the following.
(1) A voice recognition means that extracts a voice signal from a communication path over which two speakers are conversing and performs voice recognition, and a recognition result processing means that uses the recognition result to decide the information to be provided to a speaker or the action that supports the speaker's operation, are provided, and information is provided, or operation support is given, to only one of the speakers.
(2) In (1), only the voice of the one speaker who is provided with the information or given the operation support is recognized.
(3) In (2), a voice input means that extracts only the voice of the one speaker is provided, and only the voice obtained from that voice input means is recognized.
(4) In (2), a timing instruction means that designates the timing at which voice should be recognized is provided, and voice recognition is performed only when the timing instruction means gives an instruction to extract the voice signal.
(5) In (2), a vocabulary instruction means that designates a group of vocabulary to be recognized is provided, and voice recognition is performed with only the vocabulary designated by the vocabulary instruction means as the recognition target.
(6) In (2), a vocabulary instruction means that designates a group of vocabulary to be recognized and a timing instruction means that designates the timing at which voice is to be recognized are provided, and when the timing instruction means gives an instruction to perform voice recognition, voice recognition is performed with only the vocabulary designated by the vocabulary instruction means as the recognition target.
(7) In (2), an information retrieval unit that searches using the recognized result as a search key is provided.
(8) In (7), an information storage unit is provided in which a search key and the information described in association with it form one unit, and the units of information are stored in relation to one another.
(9) In (7), an information retrieval unit is provided that, when the score of the recognition result is smaller than a specific value, searches using as search keys the recognition result and words that share a substring of the reading or notation of the recognition result.
(10) In (7), an information retrieval unit is provided that, when the score of the recognition result is smaller than a specific value, searches using as search keys the recognition result and words that partially contain the notation or reading of the recognition result.
(11) In (7), an information retrieval unit is provided that, when the score of the recognition result is smaller than a specific value and the recognition result is a numerical value, searches using as search keys the recognition result and words that share the notation or reading of the temporally earlier substring of the recognition result.
(12) In (7), an information retrieval unit that generates a search expression from the recognition result and performs the search is provided.

[0007]

[Operation] Voice recognition is performed on the voice signal of a speaker in conversation, extracted from the communication path, and the result is used to provide information, or to give operation support, to only one of the speakers. Only the voice of the speaker who is provided with the information is recognized. Further, the voice signal is extracted only at a designated timing, and the recognition target is limited to a designated vocabulary. Further, a search key or a search expression is obtained from the recognition result, the information is retrieved with it, and the result is provided to the speaker.

[0008]

[Embodiments]

[Invention of Claim 1] FIG. 1 is an overall schematic configuration diagram for explaining one embodiment of the present invention. Two speakers (or two groups of speakers) converse over a communication path 2 such as a public line, each through a voice input/output unit 1 such as a telephone handset. Here, part of the voice signal is extracted from, for example, the handset of one of the speakers on the communication path 2, and voice recognition is performed by the voice recognition means 3. Voice recognition technology is widely known. The voice recognition means 3 usable here consists of, for example, a feature extraction unit 3a and a matching unit 3b, and the feature extraction unit 3a converts the voice signal into a time series of feature vectors. Various feature quantities suited to voice recognition are known. In this embodiment, the speech waveform sampled at 16 kHz is subjected to linear prediction analysis with a window length of 256, a shift width of 160, and a prediction order of 20, and is then converted into a 10th-order LPC mel cepstrum (so that a 10th-order feature vector is obtained every 10 ms).
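The framing arithmetic in this paragraph can be checked with a short sketch. It assumes the window length and shift width are counted in samples (which is what makes the stated 10 ms interval come out at 16 kHz); the function name `frame_signal` is hypothetical, and the LPC analysis and mel-cepstrum conversion themselves are deliberately left out.

```python
SAMPLE_RATE_HZ = 16000   # sampling rate stated in the embodiment
WINDOW = 256             # window length, assumed to be in samples
SHIFT = 160              # shift width, assumed to be in samples


def frame_signal(samples, window=WINDOW, shift=SHIFT):
    """Split a sample sequence into overlapping frames.

    Each frame would then go through 20th-order linear prediction
    analysis and conversion to a 10th-order LPC mel cepstrum; that
    step is not implemented in this sketch.
    """
    frames = []
    start = 0
    while start + window <= len(samples):
        frames.append(samples[start:start + window])
        start += shift
    return frames


# One feature vector per shift: 160 samples at 16 kHz is 10 ms.
frame_period_ms = 1000 * SHIFT / SAMPLE_RATE_HZ
# Each analysis window covers 256 samples, i.e. 16 ms of speech.
window_ms = 1000 * WINDOW / SAMPLE_RATE_HZ

one_second = [0.0] * SAMPLE_RATE_HZ
frames = frame_signal(one_second)
```

With these values, consecutive windows overlap by 96 samples, so each 16 ms window shares 6 ms of speech with the next one.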

[0009] The feature vectors obtained here are matched in the matching unit 3b to obtain the recognition result. Various matching techniques are also widely known, such as methods using DP matching and methods using HMMs. This embodiment adopts, for example, the method disclosed in the Proceedings of the Acoustical Society of Japan, 1-4-1 (March 1993).

[0010] Next, the recognition result processing means 4 uses the recognized result to decide the information or control signal to be provided. In accordance with that decision, information is provided to one of the speakers by an information providing means such as the display of a personal computer, or the result is used as input to a machine operated by the speaker.

[0011] [Invention of Claims 2 and 3] FIG. 2 is a diagram for explaining another embodiment of the present invention. Speaker A and speaker B converse over a communication path 2 such as a public line. In this embodiment, only the voice of speaker A is extracted from the voice input means 1₁, such as the handset or headset that speaker A is using, and voice recognition is performed by the voice recognition means 3.

[0012] Various voice recognition methods are known; for example, the method disclosed in the embodiment of FIG. 1 may be used. Next, the recognition result processing means 4 uses the recognized result to decide the information to be provided, and the information is provided to the one speaker, A, by an information providing means such as the display of a personal computer.

[0013] More concretely, when speaker A is a telephone switchboard operator, for example, a conversation conventionally went as follows. Speaker B: "Mr. Takahashi of Sales Section 1, please." Speaker A: "Certainly, Takahashi of Sales Section 1." While speaking, speaker A had to leaf through a ledger to look up the extension number of the person in question.

[0014] With the present invention, by contrast, the voice of speaker A is recognized within a natural conversation like the one above, and the extension number of the person in question can be displayed to speaker A from the keywords "Sales Section 1" and "Takahashi". As for the information provided, naturalness is further increased if the extension number is supplied to the machine operated by speaker A (the switchboard in this example), while the display that the speaker looks at shows not only the extension number but also the natural-language terms "Sales Section 1" and "Takahashi".

[0015] [Invention of Claim 4] FIG. 3 is a diagram for explaining another embodiment of the present invention. Speaker A and speaker B converse over a communication path 2 such as a public line. Here, the voice signal from the voice input means 1₁ of speaker A is supplied simultaneously to the communication path 2 leading to speaker B and to the voice recognition means 3 described below. Various voice recognition methods are known; for example, the method disclosed in the embodiment of FIG. 1 may be used. Speaker A uses the timing instruction means 5 to indicate whether or not the speech about to be uttered is to be recognized.

[0016] A mouse or keyboard is suitable as the timing instruction means 5. However, considering that the voice recognition means 3 may misrecognize, it is preferable, so that voice input can be combined with keyboard input, for voice recognition to operate when the cursor is placed on an input field with the mouse or keyboard. That is, the cursor is set in advance at the place where the voice recognition result will be displayed, so that if the result from the voice recognition means 3 is wrong, or is rejected so that there is no recognition result, the words that should have been recognized can be entered immediately from the keyboard to repair it.
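The cursor-gated recognition with keyboard fallback described here can be sketched roughly as follows. This is only an illustration under stated assumptions: `InputField`, `fill_field`, and the toy recognizer are hypothetical names, and the recognizer is a stand-in that returns `None` to model a rejected utterance.

```python
class InputField:
    """An input field on the operator's screen (hypothetical model)."""

    def __init__(self, name):
        self.name = name
        self.value = None


def fill_field(field, cursor_on_field, recognize, utterance, typed_text=None):
    """Fill `field` by voice only while the cursor is on it.

    `recognize` is any callable returning a string, or None when the
    utterance is rejected. Text typed by the operator always wins, which
    models the immediate keyboard repair of a wrong or missing result.
    Returns a label describing which path was taken.
    """
    if typed_text is not None:
        field.value = typed_text
        return "keyboard"
    if not cursor_on_field:
        return "ignored"          # no timing instruction: do not recognize
    result = recognize(utterance)
    if result is None:
        return "rejected"         # field stays empty, ready for typing
    field.value = result
    return "voice"


def toy_recognizer(utterance):
    # Stand-in recognizer: spots a single name, rejects everything else.
    return "Takahashi" if "takahashi" in utterance.lower() else None


name_field = InputField("name")
status = fill_field(name_field, True, toy_recognizer,
                    "Certainly, Takahashi of Sales Section 1.")
```

Because the cursor already sits on the field being filled, a rejected or wrong result costs the operator only the keystrokes needed to type the word.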

[0017] [Invention of Claims 5 and 6] FIG. 4 is a diagram for explaining another embodiment of the present invention. Speaker A and speaker B converse over a communication path 2 such as a public line. Here, the voice signal from a voice input means such as the handset 12 of speaker A is supplied simultaneously to the communication path 2 leading to speaker B and to the voice recognition means 3 described below. Various voice recognition methods are known; for example, the method disclosed in the embodiment shown in FIG. 1 may be used. The vocabulary instruction means 6 may, for example, set a different vocabulary from the recognition target vocabulary table 10 when the cursor is placed on a different location on the display.

[0018] For example, when the voice recognition method of the present invention is applied to telephone switchboard work, the display screen 7 is configured as shown in FIG. 5. When the cursor 8 is in region 9₁ in the figure, the recognition target vocabulary is set to departments and names. When the cursor 8 is in region 9₂, the recognition target vocabulary is set to product names; a remark such as "Your question is about the printer cable, is that right?" is recognized by the voice recognition means 3, the recognition result is processed by the recognition result processing means 4, and the extension number of the department in charge is output. The "vocabulary" described here may be simply a group of words, a grammar that prescribes word order such as "department + name", or a grammar that prescribes a whole sentence such as "department + name + is that right?".
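A minimal sketch of switching the recognition vocabulary by cursor region follows. The region names, the table contents, and the substring-based spotting are all assumptions made for illustration; English glosses stand in for the Japanese words, and a real recognizer would match acoustically rather than on text.

```python
# Hypothetical recognition target vocabulary table (table 10 in the text),
# keyed by the screen region the cursor is in.
VOCABULARY_TABLE = {
    "region_9_1": ["sales section 1", "takahashi"],  # departments and names
    "region_9_2": ["printer", "cable"],              # product names
}


def recognize(utterance, cursor_region):
    """Return the active-vocabulary words found in the utterance.

    Only the vocabulary group selected by the cursor position is
    considered; words from other groups are never candidates.
    """
    active = VOCABULARY_TABLE.get(cursor_region, [])
    text = utterance.lower()
    return [word for word in active if word in text]


remark = "Your question is about the printer cable, is that right?"
hits = recognize(remark, "region_9_2")
```

Restricting the candidates this way is what keeps a product-name remark from being matched against departments and names when the cursor is in the other region.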

[0019] Considering that voice recognition does not always achieve a 100% recognition rate, when the present invention is applied to a system that also uses key input, overlapping the key input field with the region indicated by the vocabulary instruction means 6 allows misrecognitions to be corrected, and rejected items to be corrected or added, immediately with the keys. Even when voice recognition fails, the time for which the conversation is stalled is therefore reduced.

[0020] The above has described providing the recognition result processing means 4 and using it to provide information, or give operation support, to one of the speakers. An information retrieval unit can also be used as the recognition result processing means 4 to give support as follows.

[0021] The information retrieval unit searches using the result recognized by the voice recognition means 3 as a search key and displays the search result; it has an information storage unit that stores the information to be searched. The speaker continues the dialogue on the basis of the displayed search result.

[0022] [Invention of Claim 7] FIG. 6 is an example of the keyword table used for search keys. It is a list of words that serve both as recognition candidates and as search keys, and each entry consists of a notation and a reading. In an actual search, the storage location of the information corresponding to each keyword can also be recorded in association with that keyword. FIG. 7(a) is an example of the information storage unit, which consists of search keys and the information corresponding to them. Using this example, the case of taking a product order is described more concretely, for example where the product inquired about by the orderer (speaker B in FIG. 2) is [copy]. When an order such as "I'd like to buy a copy machine" is received by telephone, the order taker (speaker A in FIG. 2) confirms the order by repeating "A copy machine, is that right?". This utterance by the order taker is recognized by the voice recognition means: it checks whether any word in the keyword table occurs in the utterance, extracts the keyword [copy], and takes it as the recognition result.

[0023] Using this recognition result [copy] as the search key, the information storage unit is searched, and detailed information about the product [copy] is presented, for example shown on a display or printed on paper (FIG. 7(a)). The order taker continues the dialogue with the orderer on the basis of the displayed data about [copy] and carries out the order-taking work.
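The keyword spotting and lookup described for FIGS. 6 and 7(a) can be sketched as below. The table entries and the stored text are placeholders, not the contents of the actual figures, and text substring matching stands in for acoustic keyword spotting.

```python
# Hypothetical keyword table (FIG. 6): words that are both recognition
# candidates and search keys. English glosses stand in for the Japanese
# notation and reading.
KEYWORD_TABLE = ["copy", "fax", "printer"]

# Hypothetical information storage unit (FIG. 7(a)): search key -> info.
INFO_STORE = {
    "copy": "placeholder: detailed product information for [copy]",
    "fax": "placeholder: detailed product information for [FAX]",
}


def spot_keywords(utterance):
    """Return the table keywords that occur in the utterance."""
    text = utterance.lower()
    return [kw for kw in KEYWORD_TABLE if kw in text]


def retrieve(utterance):
    """Use each spotted keyword as a search key into the store."""
    return {kw: INFO_STORE[kw] for kw in spot_keywords(utterance)
            if kw in INFO_STORE}


shown = retrieve("A copy machine, is that right?")
```

Only the order taker's confirmation is processed, so the result appears on the operator's screen as a side effect of an utterance the operator would make anyway.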

[0024] According to this invention, therefore, the information needed for a business dialogue can be obtained without halting the dialogue and without doing other work during it, so the work can be carried out efficiently.

[0025] [Invention of Claim 8] FIG. 7(b) is another example of the information storage unit. It consists of search keys and the information corresponding to them; each piece of information itself serves as a search key and in turn has conceptually subordinate information, giving a hierarchical structure. In FIG. 7(b′), "type" and "price" correspond to the search key "copy"; this "price" then becomes a search key, as shown in FIG. 7(b″), and corresponds to "1 million yen", "2 million yen", and "5 million yen".

[0026] Using this example, the case of taking a product order is described more concretely, for example where the product inquired about by the orderer (speaker B in FIG. 2) is [copy]. When an order such as "I'd like to buy a copy machine" is received by telephone, the order taker (speaker A in FIG. 2) confirms the order by repeating "A copy machine, is that right?". This utterance by the order taker is recognized by the voice recognition means, which checks whether any word in the keyword table occurs in the utterance, extracts the keyword [copy], and takes it as the recognition result.

[0027] Using this recognition result [copy] as the search key, the information storage unit is searched and the result is presented ((b′) in FIG. 7(b)). Next, to narrow down the information, the order taker asks about a word in the search result. For example, when the order taker asks the orderer "What price range do you have in mind?", the word "price", which appears in the search result, is extracted from this utterance. A search is then made with "price" as the search key ((b″) in FIG. 7(b)). If the reply is, for example, "about 5 million yen", the order taker utters the confirmation "5 million yen, is that right?". The utterance is checked for words in the search result, "5 million yen" is recognized, and products priced at 5 million yen are retrieved (FIG. 7(c)). The dialogue can be continued from here to search and narrow down the information further.
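The hierarchical narrowing walked through above can be sketched with a nested dictionary. The structure mirrors the keys named in the text ("copy", "type", "price", and the three prices); the function names and the text-based matching are assumptions for illustration only.

```python
# Hypothetical hierarchical store after FIG. 7(b): every piece of
# information is itself a search key for its subordinate information.
STORE = {
    "copy": {
        "type": {},
        "price": {
            "1 million yen": {},
            "2 million yen": {},
            "5 million yen": {},
        },
    },
}


def narrow(node, utterance):
    """Return (key, child) for the first key of `node` found in the
    utterance, or None when no key at this level was spoken."""
    text = utterance.lower()
    for key, child in node.items():
        if key in text:
            return key, child
    return None


def follow_dialogue(store, utterances):
    """Descend one level each time an utterance names a current key."""
    node, path = store, []
    for utterance in utterances:
        hit = narrow(node, utterance)
        if hit is None:
            continue              # nothing recognized, stay at this level
        key, node = hit
        path.append(key)
    return path


path = follow_dialogue(STORE, [
    "A copy machine, is that right?",
    "What price range do you have in mind?",
    "5 million yen, is that right?",
])
```

At each step only the keys of the current search result are recognition candidates, which is how the spoken confirmation narrows the result without any separate query input.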

[0028] According to this invention, therefore, the necessary information can be retrieved without halting the dialogue, and related information can be retrieved and presented whenever needed, so the dialogue proceeds smoothly and work such as taking and placing orders can be carried out efficiently. In particular, when there are a large number of search results, narrowing down the information means that only an appropriate amount of information matching the request needs to be consulted, so the dialogue proceeds smoothly without stalling.

[0029] [Invention of Claim 9] Another embodiment of the arrangement having the information retrieval unit described above is explained below. When the score of the voice recognition result is smaller than a specific value, misrecognition is possible, and the correct search key being sought is likely to be among the words that resemble the recognized word. As a first countermeasure, therefore, words that share a substring of the reading or notation with the recognition result are used as additional search keys.

[0030] This is explained concretely for the case where a voice recognition device having this information retrieval unit is used when receiving inquiries about products, taking as an example an inquiry from the orderer (speaker B in FIG. 2) about the product "FAX".

[0031] When an inquiry such as "I'd like to ask about FAX memory transmission" is received by telephone, the order taker (speaker A in FIG. 2) confirms it by repeating "FAX memory transmission, is that right?". The voice recognition means checks whether any word in the keyword table occurs in this utterance. Suppose that the keyword [FAX] is extracted with a score of 30 points and the keyword [polling transmission] with a score of 15 points, and that the threshold for recognition results is 20 points.

[0032] Because the score of the keyword [FAX] is above the threshold, the information storage unit is searched using the recognition result [FAX] as a search key. The score of the keyword [polling transmission], however, is not above the threshold, so the keyword table is checked for words that share part of the reading of [polling transmission]. [memory transmission] and [direct transmission] are found, so the search is made with [polling transmission], [memory transmission], and [direct transmission] as search keys, and the results are presented (FIG. 8(a)).
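The first countermeasure can be sketched as a search-key expansion step. The scores and the threshold of 20 come from the example; the minimum shared length `min_len`, the function names, and the use of English glosses in place of the Japanese readings are assumptions, since the text does not say how long the shared substring must be.

```python
THRESHOLD = 20  # threshold used in the example

# Hypothetical keyword table; English glosses stand in for the readings.
KEYWORDS = ["FAX", "polling transmission", "memory transmission",
            "direct transmission", "copy"]


def shares_substring(a, b, min_len):
    """True if some substring of `a` of length `min_len` occurs in `b`."""
    return any(a[i:i + min_len] in b for i in range(len(a) - min_len + 1))


def search_keys(results, min_len=6):
    """Build the search-key list from (word, score) recognition results.

    A word whose score is above the threshold is used as is. A word
    whose score is not above it is still used, and table words sharing
    a substring with it are added as further search keys.
    """
    keys = []
    for word, score in results:
        if word not in keys:
            keys.append(word)
        if score > THRESHOLD:
            continue
        for candidate in KEYWORDS:
            if candidate not in keys and shares_substring(word, candidate, min_len):
                keys.append(candidate)
    return keys


keys = search_keys([("FAX", 30), ("polling transmission", 15)])
```

Here the low-scoring [polling transmission] pulls in [memory transmission], the word actually spoken, because both share the "transmission" part.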

[0033] According to this invention, therefore, even when the recognition result contains an error, searching for words that resemble the recognized word yields the correct search result, so the dialogue proceeds smoothly and the work can be carried out efficiently.

[0034] [Invention of Claim 10] As a second countermeasure against the same kind of misrecognition, words that partially contain the notation or reading of the recognition result are used as search keys together with the recognition result. This is explained concretely for the case of taking a product order, taking as an example an inquiry from the orderer (speaker B in FIG. 2) about the product [copy].

[0035] When an order such as "I'd like to buy a color copy machine" is received by telephone, the order taker (speaker A in FIG. 2) confirms the order by repeating "A color copy machine, is that right?". The voice recognition means checks whether any word in the keyword table occurs in this utterance. Suppose that the keyword [copy] is extracted with a score of 15 points, and that the threshold for recognition results is, for example, 20 points.

[0036] Because the score of the keyword [copy] is not above the threshold, the keyword table is checked for search keys that contain the keyword [copy]. [color copy] is found, so the search is made with [color copy] and [copy] as search keys, and the results are presented (FIG. 8(b)).
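The second countermeasure reduces to a containment test over the keyword table. A minimal sketch, assuming English glosses for the notation and a hypothetical function name:

```python
def expand_by_containment(word, score, table, threshold=20):
    """Return the search keys for one recognition result.

    When the score is not above the threshold, every table word that
    contains the recognized word is searched together with it.
    """
    keys = [word]
    if score <= threshold:
        longer = [k for k in table if k != word and word in k]
        keys = longer + keys
    return keys


# Hypothetical keyword table with English glosses.
TABLE = ["copy", "color copy", "FAX"]

keys = expand_by_containment("copy", 15, TABLE)
```

In the example, [copy] was only the tail of what was said, so adding [color copy] recovers the product the orderer actually named.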

[0037] According to this invention, therefore, even when the recognition result contains an error, searching for words that resemble the recognized word yields the correct search result, so the dialogue proceeds smoothly and the work can be carried out efficiently.

[0038] [Invention of Claim 11] As a third countermeasure against the same kind of misrecognition, words that share the notation or reading of the temporally earlier (leading) substring of the recognition result are used as search keys together with the recognition result. This is explained concretely for the case of receiving inquiries about products, taking as an example an inquiry from the orderer (speaker B in FIG. 2) about the product [copy].

【００３９】電話などで「コピーのタイプ１００につい
て聞きたいんだけど」というような問い合わせを受けた
場合、受注側（図２の話者Ａ）はその注文の確認を「コ
ピーのタイプ１００でございますね」と繰り返す。キー
ワードテーブル内の単語が、この受注側の発話中にない
かどうかを音声認識手段で認識する。認識結果としてキ
ーワード［コピー］がスコア３０点、キーワード［１１
０］がスコア１５点で抽出されたとし、ここでは認識さ
れた結果に対するしきい値を２０点とする。When an inquiry such as "I'd like to ask about the copy Type 100" is received by telephone or the like, the order-receiving side (speaker A in FIG. 2) confirms it by repeating it back as "the copy Type 100." The voice recognition means checks whether any word in the keyword table appears in the order-receiving side's utterance. Assume that, as the recognition result, the keyword [copy] is extracted with a score of 30 points and the keyword [110] with a score of 15 points, and that the threshold for recognition results is 20 points.

【００４０】キーワード［ＦＡＸ］はしきい値より大な
ので、認識結果［ＦＡＸ］を検索キーとして、情報格納
部から検索する。次にキーワード［１１０］はスコアが
しきい値より大ではなく、かつ数字からなるので、キー
ワード［１１０］の前方の部分文字列のよみが共通であ
る語をキーワードテーブルで調べる。［１００］［１０
５］があるので、［１００］［１１０］［１０５］を検
索キーとして検索し、ディスプレイに表示１または紙に
印刷する（図８（ｃ））。Since the score of the keyword [FAX] exceeds the threshold, the recognition result [FAX] is used as a search key to search the information storage unit. Next, since the score of the keyword [110] does not exceed the threshold and the keyword consists of digits, the keyword table is checked for words whose reading shares the leading partial character string of the keyword [110]. Because [100] and [105] are found, the search is performed using [100], [110], and [105] as search keys, and the results are shown on the display or printed on paper (FIG. 8C).
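The numeric case can be sketched in the same way. The following is a rough illustration under stated assumptions, not the method as actually implemented: the romanized readings, their split into units, the extra entry "200", the parameter `prefix_units`, and the function name `expand_numeric_key` are all hypothetical. The text only says that words sharing the reading of the leading partial character string are collected.

```python
THRESHOLD = 20  # threshold value used in the example in the text

# Hypothetical keyword table: notation -> romanized reading, split into units.
READINGS = {
    "100": ["hyaku"],
    "105": ["hyaku", "go"],
    "110": ["hyaku", "juu"],
    "200": ["ni", "hyaku"],
}


def expand_numeric_key(keyword, score, readings=READINGS,
                       threshold=THRESHOLD, prefix_units=1):
    """Return search keys for a recognized keyword.

    A low-scoring numeric keyword is expanded to every table entry whose
    reading starts with the same leading units (assumption: the first
    `prefix_units` units count as the shared leading part).
    """
    if score > threshold or not keyword.isdigit() or keyword not in readings:
        return [keyword]
    head = readings[keyword][:prefix_units]
    return [k for k, r in readings.items()
            if k == keyword or r[:prefix_units] == head]


print(expand_numeric_key("110", 15))  # ['100', '105', '110']
```

Here [110] with a score of 15 expands to [100], [105], and [110], while "200" is excluded because its reading starts with a different unit.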

【００４１】従って、この発明によると、認識結果に誤
りがあった場合も、認識された語に似た語を検索するこ
とで、正しい検索結果をえ、対話を円滑にすすめ、業務
を効率的におこなうことができる。特に、数詞表現に対
して有効である。Therefore, according to this invention, even if the recognition result contains an error, searching for words similar to the recognized word yields the correct search result, so the dialogue proceeds smoothly and work can be performed efficiently. This is particularly effective for numeral expressions.

【００４２】〔請求項１２の発明〕また、前述の情報検
索部を有するものに関し、さらに、他の実施例を以下に
説明する。以上には、認識結果をそのまま検索キーに用
いた例について説明したが、以下の実施例では認識結果
を検索用の述語として用いることにしている。[Invention of Claim 12] Further, another embodiment of the invention having the above-mentioned information retrieval unit will be described below. The example in which the recognition result is directly used as the search key has been described above, but in the following embodiments, the recognition result is used as a predicate for search.

【００４３】図９（ａ）は検索述語対応表であり、検索
結果の語とそれに対応する検索式の述語を対応付けて記
述したものである。この例により、商品の注文を受ける
場において、この検索述語対応表を有する音声認識手段
を用いた場合について、例えば、発注元（図２の話者
Ｂ）からの問い合わせの商品が［コピーとＦＡＸ］の場
合を例により具体的に説明する。電話などで、「コピー
とＦＡＸを買いたいんだけど」というような注文を受け
た場合、受注側（図２の話者Ａ）はその注文の確認を
「コピーとＦＡＸでございますね」と繰り返す。キーワ
ードテーブルと検索述語対応表の語がこの受注側の発話
中にあるかどうかを音声認識手段で認識し、キーワード
［コピー］［と］［ＦＡＸ］を抽出し、これを認識結果
とする。FIG. 9A is a search predicate correspondence table, which associates each search result word with its corresponding search-expression predicate. Using this example, the case where a voice recognition means having this search predicate correspondence table is used in a setting where product orders are received is described concretely, taking as an example the case where the products requested by the ordering party (speaker B in FIG. 2) are [copy and FAX]. When an order such as "I want to buy a copy and a FAX" is received by telephone or the like, the order-receiving side (speaker A in FIG. 2) confirms the order by repeating it back as "a copy and a FAX." The voice recognition means checks whether words in the keyword table and the search predicate correspondence table appear in the order-receiving side's utterance, extracts the keywords [copy], [and], and [FAX], and takes these as the recognition result.

【００４４】検索述語対応表から認識結果の［と］は
［ＯＲ］の意味があるので、認識結果［コピー］［ＦＡ
Ｘ］をそれぞれ検索キーとし、「ＯＲ」を検索用の述語
として検索式を生成して情報格納部から検索し、商品
［コピー］［ＦＡＸ］に関する詳しい情報を話者に提示
する（図９（ｂ））。According to the search predicate correspondence table, the recognition result [and] has the meaning [OR]. The recognition results [copy] and [FAX] are therefore each used as search keys, a search expression is generated with "OR" as the search predicate, the information storage unit is searched, and detailed information about the products [copy] and [FAX] is presented to the speaker (FIG. 9B).
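Generating a search expression from the recognition result can be sketched as below. This is a minimal illustration under stated assumptions: the contents of `INFO_STORE`, the function names, and the string form of the expression are hypothetical, and only the OR predicate from the example in the text is handled.

```python
# Search predicate correspondence table, after FIG. 9A as described in the
# text: the recognized word [and] maps to the predicate OR.
PREDICATE_TABLE = {"and": "OR"}

# Hypothetical information storage unit: search key -> stored information.
INFO_STORE = {
    "copy": "details for copy (placeholder)",
    "FAX": "details for FAX (placeholder)",
}


def build_expression(tokens, predicates=PREDICATE_TABLE):
    """Turn recognized tokens into a search expression string."""
    return " ".join(predicates.get(t, t) for t in tokens)


def search(tokens, store=INFO_STORE, predicates=PREDICATE_TABLE):
    """Evaluate the expression; only OR is supported in this sketch."""
    keys = [t for t in tokens if t not in predicates]
    ops = {predicates[t] for t in tokens if t in predicates}
    if ops - {"OR"}:
        raise NotImplementedError("only the OR predicate is sketched here")
    # OR: return the information for every key found in the store.
    return {k: store[k] for k in keys if k in store}


tokens = ["copy", "and", "FAX"]
print(build_expression(tokens))  # copy OR FAX
print(sorted(search(tokens)))    # ['FAX', 'copy']
```

The recognized word [and] is not used as a search key itself; it only selects the predicate that combines the remaining keys.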

【００４５】従って、この発明によると、対話をとめる
ことなく、対話中の表現から検索式を生成するので、対
話中に他の作業をすることなく、業務中の対話に必要な
情報を得ることができ、効率的に業務を行なうことがで
きる。Therefore, according to this invention, a search expression is generated from the expressions used in the dialogue without stopping the dialogue, so the information needed for the dialogue at work can be obtained without doing other work during the dialogue, and work can be performed efficiently.

【００４６】[0046]

【発明の効果】以上のとおりの本発明の音声認識方法に
よって、次に示すような効果がもたらされることにな
る。The following effects are brought about by the voice recognition method of the present invention as described above.

【００４７】人間を相手に自然な会話をしながら、情報
が自動的に入手できたり、操作支援を受けたりできる。
また、音声認識をする話者を片方のみに特定し、その音
声のみを認識して提供すべき情報を決定するようにした
ため、特定しない方法に比べて高い認識率となる。一般
に、音声認識装置は、特定話者方式の方が不特定話者方
式の性能より高い。これは、ユーザーの個人の音声を辞
書として登録できるからである。請求項２，３の発明の
実施例で説明した会話の例では、話者Ｂは特定できない
が、話者Ａは特定でき、話者Ａの音声のみを認識して提
供すべき情報を決定するため、請求項１の方法に比べて
高い認識率で、すなわち安定して情報提供や操作支援が
可能になる。Information can be obtained automatically, and operation support can be received, while holding a natural conversation with a human. Further, because the speaker whose voice is recognized is limited to one party and the information to be provided is determined by recognizing only that voice, the recognition rate is higher than with a method that does not limit the speaker. In general, speaker-dependent voice recognition performs better than speaker-independent recognition, because the individual user's voice can be registered as a dictionary. In the conversation examples described in the embodiments of the inventions of claims 2 and 3, speaker B cannot be identified in advance, but speaker A can. Since only speaker A's voice is recognized to determine the information to be provided, information provision and operation support become possible at a higher recognition rate, that is, more stably, than with the method of claim 1.

【００４８】また、音声認識させるタイミングを指示す
る音声認識指示手段から、認識をさせる指示があったと
きのみ認識するようになるので、例えば、実施例のよう
な電話交換業務において、話者Ｂ「私、××商会の田中と申します」話者Ａ「××商会の田中様でございますね」というような会話中に音声認識が動作していると、電話
を取り次ぐ先の「田中」の内線番号を表示するという話
者が認識させることを意図していない音声の認識による
思わぬ動作で使用者が混乱してしまう。このような混乱
を招く誤動作を防ぐことができる。Further, recognition is performed only when the voice recognition instruction means, which indicates the timing for voice recognition, gives an instruction to recognize. For example, in telephone switchboard work as in the embodiment, if voice recognition were active during an exchange such as speaker B: "This is Tanaka of XX Shokai," speaker A: "Tanaka of XX Shokai, is that correct?", the extension number of a "Tanaka" to whom calls are transferred would be displayed. Such an unexpected operation, caused by recognizing speech that the speaker did not intend to have recognized, would confuse the user. Malfunctions that cause this kind of confusion can be prevented.

【００４９】さらに、語い指示手段によって指定された
語いだけを認識対象とするので、識別すべき語い数が減
少し、高い認識率および認識速度を維持することができ
る。また、認識結果を検索キーとして情報を検索すると
いう処理で情報を提供するので間違いのない情報を速く
利用することにより効率的に業務を行なうことができ
る。さらに、検索される情報の格納部を階層構造として
いるので、特に検索結果が大量にある場合に情報を絞り
込むことで、要求にあった適量の情報を参照すればよ
く、対話を滞らせず、円滑に進めることができる。Further, since only the vocabulary designated by the vocabulary instruction means is subject to recognition, the number of words to be distinguished decreases, and a high recognition rate and recognition speed can be maintained. In addition, since information is provided by searching with the recognition result as the search key, accurate information can be used quickly and work can be performed efficiently. Furthermore, since the storage unit for the information to be searched has a hierarchical structure, the information can be narrowed down, particularly when there are many search results, so that only an appropriate amount of information matching the request needs to be consulted, and the dialogue can proceed smoothly without stalling.

【００５０】また、音声認識の結果のスコアが特定の値
より小さい語についてもその語に似た語を検索キーと
し、特に商品の番号など数詞に対しても有効な手法でこ
の検索キーを得ているので、検索もれをなくすことがで
き、対話を円滑に進め、業務を効率的に行うことができ
る。Further, even for a word whose voice recognition score is smaller than a specific value, words similar to it are also used as search keys, and these search keys are obtained by a technique that is effective in particular for numerals such as product numbers. Search omissions can therefore be eliminated, the dialogue can proceed smoothly, and work can be performed efficiently.

【００５１】また、対話をとめることなく、必要な情報
が検索でき、しかも、関係のある情報を随時検索、提示
することで、対話を円滑に進め、例えば、受発注などの
業務を効率的に行なうことができる。また、認識結果に
誤りがあった場合も、認識された語に似た語を検索する
ことで、正しい検索結果をえ、対話を円滑にすすめ、業
務を効率的におこなうことができる。Further, necessary information can be retrieved without stopping the dialogue, and related information is retrieved and presented as needed, so that the dialogue proceeds smoothly and work such as receiving and placing orders can be performed efficiently. In addition, even if the recognition result contains an error, searching for words similar to the recognized word yields the correct search result, so the dialogue proceeds smoothly and work can be performed efficiently.

【００５２】さらに、音声認識の結果の語を検索キーに
用いるだけではなく、検索用の述語として用い、対話中
の表現から検索式を生成するので、対話をとめたり、対
話中に他の作業をすることなく、業務中の対話に必要な
情報を得ることができ、効率的に業務を行なうことがで
きる。Furthermore, the words resulting from voice recognition are used not only as search keys but also as search predicates, and a search expression is generated from the expressions used in the dialogue. The information needed for the dialogue at work can therefore be obtained without stopping the dialogue or doing other work during it, and work can be performed efficiently.

[Brief description of drawings]

【図１】本発明の一実施例の構成例を示す図である。FIG. 1 is a diagram showing a configuration example of an embodiment of the present invention.

【図２】本発明の他の実施例の構成例を示す図であ
る。FIG. 2 is a diagram showing a configuration example of another embodiment of the present invention.

【図３】本発明のさらに他の実施例の構成例を示す図
である。FIG. 3 is a diagram showing a configuration example of still another embodiment of the present invention.

【図４】本発明のさらに他の実施例の構成例を示す図
である。FIG. 4 is a diagram showing a configuration example of still another embodiment of the present invention.

【図５】語い指示手段の実施例に用いるディスプレー
の画面構成を示す図である。FIG. 5 is a diagram showing the screen configuration of a display used in an embodiment of the vocabulary instruction means.

【図６】検索キーとして用いるキーワードテーブルの
例を示す図である。FIG. 6 is a diagram showing an example of a keyword table used as a search key.

【図７】検索される情報の格納部の例を示す図であ
る。FIG. 7 is a diagram showing an example of a storage unit of information to be searched.

【図８】検索結果の例を示す図である。FIG. 8 is a diagram showing an example of a search result.

【図９】検索述語対応表及び検索結果の例を示す図で
ある。FIG. 9 is a diagram showing an example of a search predicate correspondence table and a search result.

[Explanation of symbols]

１…音声入出力部、１₁…音声入力手段、１₂…受話器、
２…通信経路、３…音声認識手段、３₁…認識対象語
い、４…認識結果処理手段、５…タイミング指示手段、
６…語い指示手段、７…ディスプレー画面、８…カーソ
ル。1 ... voice input/output unit, 1₁ ... voice input means, 1₂ ... handset,
2 ... communication path, 3 ... voice recognition means, 3₁ ... recognition target vocabulary, 4 ... recognition result processing means, 5 ... timing instruction means,
6 ... vocabulary instruction means, 7 ... display screen, 8 ... cursor.

Claims

[Claims]

1. A voice recognition method comprising: voice recognition means for extracting a voice signal from a communication path over which speakers converse with each other and performing voice recognition; and recognition result processing means for determining, using the result of the voice recognition, information to be provided to a speaker or an operation for supporting the speaker's operations, wherein the information provision or operation support is directed to only one of the speakers.

2. The voice recognition method according to claim 1, wherein only the voice of the one speaker to whom information is provided or for whom operation support is performed is recognized.

3. The voice recognition method according to claim 2, further comprising voice input means for extracting only the voice of the one speaker, wherein only the voice obtained from the voice input means is recognized.

4. The voice recognition method according to claim 2, further comprising timing instruction means for designating the timing at which voice should be recognized, wherein voice recognition is performed only when an instruction to extract a voice signal is given from the timing instruction means.

5. The voice recognition method according to claim 2, further comprising vocabulary instruction means for designating a group of vocabulary to be recognized, wherein voice recognition is performed with only the vocabulary designated by the vocabulary instruction means as the recognition target.

6. The voice recognition method according to claim 2, further comprising: vocabulary instruction means for designating a group of vocabulary to be recognized; and timing instruction means for designating the timing at which voice should be recognized, wherein, when the timing instruction means gives an instruction to recognize voice, voice recognition is performed with only the vocabulary designated by the vocabulary instruction means as the recognition target.

7. The voice recognition method according to claim 2, further comprising an information retrieval unit that retrieves information using the recognition result as a search key.

8. The voice recognition method according to claim 7, further comprising an information storage unit in which a search key and the information described in association with that search key form one unit, and which stores the units in association with one another.

9. The voice recognition method according to claim 7, comprising an information retrieval unit that, when the score of the recognition result is smaller than a specific value, searches using as search keys the recognition result and words that share a partial character string of the reading or notation of the recognition result.

10. The voice recognition method according to claim 7, comprising an information retrieval unit that, when the score of the recognition result is smaller than a specific value, searches using as search keys the recognition result and words that partially contain the notation or reading of the recognition result.

11. The voice recognition method according to claim 7, comprising an information retrieval unit that, when the score of the recognition result is smaller than a specific value and the recognition result is a numerical value, searches using as search keys the recognition result and words that share the notation or reading of the temporally preceding partial character string of the recognition result.

12. The voice recognition method according to claim 7, further comprising an information retrieval unit that generates a search expression from the recognition result and performs a search.