JP3139679B2

JP3139679B2 - Voice input device and voice input method

Info

Publication number: JP3139679B2
Application number: JP11011036A
Authority: JP
Inventors: 一郎森
Original assignee: NEC Corp
Current assignee: NEC Corp
Priority date: 1999-01-19
Filing date: 1999-01-19
Publication date: 2001-03-05
Anticipated expiration: 2019-01-19
Also published as: JP2000207166A

Description

DETAILED DESCRIPTION OF THE INVENTION

[0001]

BACKGROUND OF THE INVENTION

1. Field of the Invention

The present invention relates to a voice input device and a voice input method that recognize an input voice as a word and, based on the meaning of that word, select from a plurality of predetermined input items the input item into which the word is to be entered.

[0002]

2. Description of the Related Art

Voice input devices that accept on-screen input by voice have been developed. Such a device takes data as speech so that even a user who is unfamiliar with operating a computer can use it easily.

[0003] A voice input device recognizes speech uttered by a person as words and performs processing according to those words. When data is entered, the input item for that data must be specified. As one such device, Japanese Unexamined Patent Application Publication No. H8-129476 describes a voice input device in which the user specifies the input item by uttering it together with the data.

[0004] FIG. 3 is a flowchart explaining the speech recognition operation of the prior art. The voice input device described in that publication is applied to a pathological examination support system.

[0005] First, while viewing the specimen under examination through a microscope or the like, the user speaks a pathological finding or the input item for it (step S21). The uttered voice is picked up by a microphone or the like and input to a voice recognition and synthesis device (step S22). The input voice is then passed to a voice recognition unit.

[0006] The voice recognition unit recognizes the word corresponding to the input voice (step S23). The recognized word is passed in coded form to an OS such as Windows (step S25). So that the user can check whether the recognized word matches the input voice, the recognized word is played back through headphones or the like (step S26).

[0007] The OS then outputs the recognized word to the pathological examination support system, which is an application program (step S27). If the input word relates to a pathological finding, the pathological examination support system displays the word in the desired item on the screen of the display device (step S28). If, on the other hand, the input word relates to an input item, the system makes that input item ready to accept the desired word (step S29).

[0008] With this prior art, data can be entered by voice alone. Because the recognized word is read back by voice, the user can check whether the recognized word matches the input voice without looking at the screen of the display device. The user therefore does not have to interrupt work such as observing the specimen in order to enter data by voice.

[0009]

[Problems to Be Solved by the Invention]

However, the prior art described above has the following problems.

[0010] First, when using the prior-art voice input device, the user has to specify the input item for the data by voice or the like. To use the device, the user therefore needs to know in advance which input items are provided.

[0011] A user who does not know the available input items must look at the screen of the display device and check the input items shown there before giving an instruction.

[0012] In addition, although the prior-art voice input device can recognize input speech as words, as described above, it has no function for converting an input word into another word that expresses the same concept. Consequently, even when words of the same concept are entered, the device may treat them as different words.

[0013] To solve these problems, an object of the voice input device of the present invention is to identify, from the voice uttered by the user, both the word corresponding to the voice and the input item for that word.

[0014]

[Means for Solving the Problems]

To solve the above problems, the present invention provides a voice input device comprising: an input unit that inputs a voice; a recognition unit that recognizes a word from the speech waveform of the voice input by the input unit; a selection unit for which a plurality of input items for entering words are provided in advance and which selects the input item to which the word recognized by the recognition unit belongs; and a display unit that displays the word recognized by the recognition unit entered in the input item selected by the selection unit. The device further comprises a response unit that outputs, as synthesized speech, the word recognized by the recognition unit and the input item selected by the selection unit, and, according to the user's reply to that output, the device either displays the recognized word in the selected input item or prompts the user to input the voice again.

[0015] The present invention also provides a voice input method in which a voice is input, a word is recognized from the speech waveform of the input voice, the input item to which the recognized word belongs is selected from a plurality of input items provided in advance for entering words, and the recognized word is displayed entered in the selected input item. In this method, the recognized word and the selected input item are output as synthesized speech, and, based on that output, either the recognized word is displayed entered in the selected input item or the voice is input again.

The present invention further provides a method of using a voice input device that comprises: an input unit that inputs a voice; a recognition unit that recognizes a word from the speech waveform of the voice input by the input unit; a selection unit for which a plurality of input items for entering words are provided in advance and which selects the input item to which the word recognized by the recognition unit belongs; and a display unit that displays the word recognized by the recognition unit entered in the input item selected by the selection unit. The voice input device comprises a response unit that outputs, as synthesized speech, the word recognized by the recognition unit and the input item selected by the selection unit, and, based on the output of the response unit, either the word recognized by the recognition unit is displayed entered in the input item selected by the selection unit or the voice is input again.

[0016]

[Embodiments of the Invention]

Embodiments of the present invention are described below with reference to the drawings. In this embodiment, the voice input device is applied, by way of example, to household account book software.

[0017] FIG. 1 is a block diagram of the voice input device of this embodiment. In FIG. 1, the voice input device has an input unit 11 that inputs the voice uttered by the user, a recognition unit 12 that recognizes the voice input from the input unit 11 as words, and a recognition dictionary 13 that is connected to the recognition unit 12 and holds the information used to recognize speech.

[0018] The device also has a selection unit 14 that selects the input item for a word recognized by the recognition unit 12; a selection dictionary 15 that is connected to the selection unit 14 and holds the information used to select the input item for a word; a response unit 16 that reads back the recognized word and the selected input item by voice; an output unit 17 that outputs the data of the recognized word and the selected input item; a storage unit 18 that stores the recognized word and the selected input item; and a display unit 19 that displays words, their input items, and the like.

[0019] FIG. 2 shows the household account book table displayed on the display unit 19. In FIG. 2, the first row of the table consists of the columns "Breakdown", "Product name", "Amount", and "Remarks". For example, the food category is broken down into rows such as dairy products, staple foods, and meat and fish, and the clothing category into rows such as clothes and shoes.

[0020] FIG. 3 is a flowchart showing the operation of the voice input device of this embodiment. The description below uses the example of entering milk at a price of 200 yen.

[0021] First, when the user says "milk, 200 yen", the input unit 11, such as a microphone, inputs the voice (step S1). The input voice is passed to the recognition unit 12. The recognition unit 12 uses a filter (not shown) to remove noise and other components that are not needed for recognition, and then digitizes the speech waveform with an A/D converter (not shown).

[0022] The digitized speech waveform is compared with the waveforms of the words registered in the recognition dictionary 13, using a technique such as an HMM (Hidden Markov Model). The input voice is taken to be the registered word whose waveform is closest to the speech waveform. In this way, the input voice "milk, 200 yen" is recognized as the two words "milk" and "200 yen" (step S2).
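
The "closest registered word" rule in step S2 can be illustrated with a deliberately simplified sketch. It does not implement an HMM: it substitutes plain Euclidean distance between fixed-length feature vectors for HMM scoring. The dictionary contents, the vectors, and the name `recognize` are all hypothetical and are not taken from the patent.

```python
import math

# Hypothetical recognition dictionary: word -> reference feature vector.
# A real recognizer would hold HMM parameters here, not raw vectors.
RECOGNITION_DICTIONARY = {
    "milk": [0.9, 0.1, 0.3],
    "200 yen": [0.2, 0.8, 0.5],
    "salary": [0.4, 0.4, 0.9],
}


def recognize(features, dictionary=RECOGNITION_DICTIONARY):
    """Return the registered word whose reference vector is closest to `features`."""
    return min(dictionary, key=lambda word: math.dist(features, dictionary[word]))
```

Calling `recognize([0.85, 0.15, 0.25])` with these assumed vectors picks "milk", because that reference vector is the nearest of the three.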

[0023] Next, the recognized words are input to the selection unit 14. The selection unit 14 retrieves the selection information registered in the selection dictionary 15, in which related words are registered together under each item, such as "food" and "clothing". From the selection information, the selection unit 14 selects the input item to which each recognized word should be output (step S3). The selection information is described in detail later.
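
One way to picture the selection dictionary in step S3 is as a mapping from each input item to the set of words registered under it. The sketch below is an assumption about the data layout, not the patent's implementation; the registered words and the name `select_input_items` are hypothetical.

```python
# Hypothetical selection dictionary: input item -> words registered under it.
SELECTION_DICTIONARY = {
    "food": {"milk", "bread", "beef"},
    "clothing": {"shirt", "shoes"},
}


def select_input_items(word, dictionary=SELECTION_DICTIONARY):
    """Return every input item whose registered word set contains `word`."""
    return sorted(item for item, words in dictionary.items() if word in words)
```

Returning a list rather than a single item leaves room for the case where a word has no candidate or several candidates.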

[0024] When the selection unit 14 selects "food" as the input item for "milk", the response unit 16 outputs synthesized speech stating that "milk" is data to be entered under the "food" breakdown and that "200 yen" is data to be entered in the "Amount" column (step S4). For example, the synthesized speech is "food, milk" and "amount, 200 yen". The user can thereby confirm that the uttered voice "milk, 200 yen" was correctly recognized by the voice input device (step S5).

[0025] If the synthesized speech from the response unit 16 matches what the user said, the user confirms the recognized word and the selected input item by saying so, by voice or the like (step S7).

[0026] If, on the other hand, the synthesized speech from the response unit 16 does not match what the user said, that is, if the response unit 16 outputs anything other than "milk" and "200 yen", the user can correct the error by uttering the corrected word or the like (step S6).

[0027] Once the word and its input item have been confirmed in this way, the output unit 17 outputs to the storage unit 18 and the display unit 19 the instruction to enter the word "milk" in the "food" column and the word "200 yen" in the "Amount" column. The storage unit 18 stores the input words and their input items. The display unit 19 displays each input word in its confirmed input item (step S8).
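
The read-back and confirmation flow of steps S3 to S7 can be sketched as a small loop. This is only an illustration under stated assumptions: `select`, `speak`, and `listen` are hypothetical callbacks standing in for the selection unit, the response unit, and the input and recognition units; the literal reply "yes" and the `max_attempts` limit are also assumptions that the patent does not specify.

```python
def confirm_entry(word, select, speak, listen, max_attempts=3):
    """Read back (input item, word) until the user confirms; return the pair or None."""
    for _ in range(max_attempts):
        item = select(word)          # step S3: choose the input item for the word
        speak(f"{item}, {word}")     # step S4: synthesized-speech read-back
        reply = listen()             # step S5: the user checks the read-back
        if reply == "yes":           # step S7: the user confirms
            return item, word
        word = reply                 # step S6: the reply is treated as the corrected word
    return None
```

The confirmed pair is what would then be handed to the output unit for storage and display (step S8).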

[0028] Next, the selection information registered in the selection dictionary 15 is described. The selection information brings together several kinds of information, as follows. The first kind is used to select the input item that corresponds to an input word: it determines which of the predetermined items the word recognized by the recognition unit 12 relates to.

[0029] The second kind is information on the form of the characters shown on the screen of the display unit 19, for example the script (kanji, hiragana, or katakana) and the character width (half-width or full-width).
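
As a rough illustration of applying a character-width rule before display, the sketch below uses Unicode NFKC normalization from Python's standard `unicodedata` module, which folds full-width digits and Latin letters to half-width and half-width katakana to full-width. Treating the display form as "NFKC-normalized text" is an assumption made for this example; the patent does not say how the form information is encoded, and `to_display_form` is a hypothetical name.

```python
import unicodedata


def to_display_form(text):
    """Normalize character width with NFKC before the text is shown on screen."""
    return unicodedata.normalize("NFKC", text)
```

With this rule, an amount recognized with full-width digits is displayed with half-width digits, while kanji such as 円 are left unchanged.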

[0030] The third kind is voice information for the synthesized speech output from the response unit 16. For example, the selection dictionary 15 holds voice information such as "Which input item should the data be entered in?", and when there are several candidate input items for the data, the response unit 16 outputs synthesized speech based on this voice information. This information is described in Embodiment 2.

[0031] The fourth kind is information for converting a word uttered by the user into a word of the same concept. For example, even when the user says "ミルク" (miruku, a loanword meaning milk), the word can be converted to "牛乳" (gyūnyū, also meaning milk) before being displayed on the display unit 19.
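
The same-concept conversion can be pictured as a lookup table from each variant to one canonical word. This is a minimal sketch assuming a flat mapping; the table contents beyond the ミルク to 牛乳 example and the name `canonicalize` are hypothetical.

```python
# Hypothetical conversion table: spoken variant -> canonical word to store and display.
CANONICAL_WORDS = {
    "ミルク": "牛乳",
}


def canonicalize(word, table=CANONICAL_WORDS):
    """Return the canonical word for `word`, or `word` itself if no conversion is registered."""
    return table.get(word, word)
```

Words without a registered conversion pass through unchanged, so the table only needs entries for the variants that should be unified.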

[0032] (Embodiment 2) FIG. 4 shows the household account book table displayed on the display unit 19 of this embodiment. In FIG. 4, the first row of the table consists of the columns "Breakdown", "Product name", "Amount", and "Remarks". For example, the income category is broken down into rows such as husband and wife, and the "Product name" column accepts entries such as salary.

[0033] FIG. 5 is a flowchart showing the operation of the voice input device of this embodiment. The description below uses the example in which the input data is "salary, 200,000 yen" and the user is the husband.

[0034] First, when the user, the husband, says "salary, 200,000 yen", the input unit 11, such as a microphone, inputs the voice (step S11). The input voice is processed in the same way as in Embodiment 1: in the recognition unit 12, an A/D converter (not shown) digitizes the speech waveform.

[0035] The digitized speech waveform is then compared with the waveforms of the words registered in the recognition dictionary 13, and the input voice "salary, 200,000 yen" is recognized as the two words "salary" and "200,000 yen" (step S12).

[0036] Next, the recognized words are input to the selection unit 14. The selection unit 14 retrieves the selection information registered in the selection dictionary 15 and, from it, selects the input item to which each recognized word should be output (step S13).

[0037] When "Product name" is selected as the input item for "salary", the response unit 16 outputs synthesized speech stating that "salary" is data to be entered under "Product name" and that "200,000 yen" is data to be entered in the "Amount" column (step S14). The user can thereby confirm that the uttered voice "salary, 200,000 yen" was correctly recognized.

[0038] As in Embodiment 1, the user can confirm or correct the recognized word and the selected input item (steps S15 to S17).

[0039] A word may have more than one candidate input item. In the household account book software of this embodiment, the income category has input items for the husband and for the wife under "Breakdown". In that case, when the user says "salary, 200,000 yen", the device cannot determine which of the two is the intended input item.

[0040] In this embodiment, therefore, the response unit 16 outputs synthesized speech to the effect of "Is the input item husband or wife?" (step S18). The user can then specify the input item by saying "husband" (step S19).
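
The prompt-and-answer exchange of steps S18 and S19 can be sketched as follows. The callbacks `speak` and `listen` are hypothetical stand-ins for the response unit and the input and recognition units, the prompt wording is illustrative, and returning `None` for an unrecognized answer is an assumption about error handling that the patent does not describe.

```python
def resolve_input_item(candidates, speak, listen):
    """Pick one input item, asking the user by voice only when several candidates exist."""
    if not candidates:
        return None
    if len(candidates) == 1:
        return candidates[0]          # no ambiguity, so no prompt is needed
    speak("Is the input item " + " or ".join(candidates) + "?")   # step S18
    reply = listen()                                              # step S19
    return reply if reply in candidates else None
```

A single candidate is returned without any prompt, which matches Embodiment 1, where the selection unit decides the input item on its own.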

[0041] Once the word and its input item have been confirmed in this way, the output unit 17 outputs to the storage unit 18 and the display unit 19 the instruction to enter the word "salary" in the "Product name" column and "200,000 yen" in the "Amount" column. The storage unit 18 stores the input words and their input items. The display unit 19 displays each input word in its confirmed input item (step S20).

[0042] Although both embodiments above describe the voice input device applied to household account book software, the device can also be applied to a pathological examination support system and the like.

[0043]

[Effects of the Invention]

According to the present invention, when the user utters a voice, the word corresponding to the voice and the input item for that word are selected. The selection result is then output as synthesized speech for the user to confirm. The user therefore does not need to look at the display screen to check the input item each time he or she speaks, and can concentrate on reading out receipts, forms, and the like.

[0044] Furthermore, according to the present invention, even when there are several candidate input items for the voice uttered by the user, the device reports this by synthesized speech and has the user specify the input item. As above, the user therefore does not need to look at the display screen to check the input item for a word.

[0045] In addition, the voice input device of the present invention holds information for converting the voice uttered by the user into a word of the same concept. As long as the user's utterances express the same concept, the input is therefore stored and displayed consistently.

[Brief Description of the Drawings]

FIG. 1 is a block diagram of the voice input device of Embodiment 1.

FIG. 2 shows the household account book table displayed on the display unit in Embodiment 1.

FIG. 3 is a flowchart showing the operation of the voice input device of Embodiment 1.

FIG. 4 shows the household account book table displayed on the display unit in Embodiment 2.

FIG. 5 is a flowchart showing the operation of the voice input device of Embodiment 2.

[Explanation of Symbols]

11 Input unit
12 Recognition unit
13 Recognition dictionary
14 Selection unit
15 Selection dictionary
16 Response unit
17 Output unit
18 Storage unit
19 Display unit

Claims

(57) [Claims]

1. A voice input device comprising: an input unit that inputs a voice; a recognition unit that recognizes a word from the speech waveform of the voice input by the input unit; a selection unit for which a plurality of input items for entering words are provided in advance and which selects the input item to which the word recognized by the recognition unit belongs; and a display unit that displays the word recognized by the recognition unit entered in the input item selected by the selection unit, the voice input device further comprising a response unit that outputs, as synthesized speech, the word recognized by the recognition unit and the input item selected by the selection unit, wherein, according to the user's reply to the output, the device either displays the word recognized by the recognition unit in the input item selected by the selection unit or prompts the user to input the voice again.

2. The voice input device according to claim 1, wherein the selection unit is connected to a selection dictionary in which the words relating to each of the plurality of input items are registered together, and the selection unit selects the input item by extracting it from the words registered in the selection dictionary.

3. The voice input device according to claim 1, wherein, when there are a plurality of candidate input items for the recognized word, the response unit reports this by synthesized speech.

4. The voice input device according to claim 2, wherein the selection dictionary includes information for converting the recognized word into a word having the same concept.

5. A voice input method in which a voice is input, a word is recognized from the speech waveform of the input voice, the input item to which the recognized word belongs is selected from a plurality of input items provided in advance for entering words, and the recognized word is displayed entered in the selected input item, wherein the recognized word and the selected input item are output as synthesized speech, and, based on the output, either the recognized word is displayed entered in the selected input item or the voice is input again.

6. A method of using a voice input device comprising: an input unit that inputs a voice; a recognition unit that recognizes a word from the speech waveform of the voice input by the input unit; a selection unit for which a plurality of input items for entering words are provided in advance and which selects the input item to which the word recognized by the recognition unit belongs; and a display unit that displays the word recognized by the recognition unit entered in the input item selected by the selection unit, wherein the voice input device comprises a response unit that outputs, as synthesized speech, the word recognized by the recognition unit and the input item selected by the selection unit, and, based on the output of the response unit, either the word recognized by the recognition unit is displayed entered in the input item selected by the selection unit or the voice is input again.