JP6768323B2

JP6768323B2 - Speech recognition devices and methods, as well as computer programs and recording media

Info

Publication number: JP6768323B2
Application number: JP2016062072A
Authority: JP
Inventors: 麻衣海原; 裕松井
Original assignee: Pioneer Corp
Current assignee: Pioneer Corp
Priority date: 2016-03-25
Filing date: 2016-03-25
Publication date: 2020-10-14
Anticipated expiration: 2036-03-25
Also published as: JP2017173720A

Description

The present invention relates to the technical field of voice recognition devices and methods that, for example, recognize voice commands and switch among a plurality of modes, as well as computer programs and recording media.

In medical settings and the like, it is sometimes required that various devices be operable without being touched. To meet this requirement, Patent Document 1, for example, proposes a technique in which, when a voice for performing an image operation is recognized, a medical image is manipulated with the line-of-sight position coordinates as the base point.

Patent Document 1: JP-A-2015-93147

However, when a device is operated by recognizing voice as in Patent Document 1, an erroneous operation may be executed in response to a voice that was not intended to operate the device. For example, a word that occurs in ordinary conversation may be recognized as a voice command for operating the device, causing an unintended operation to be executed. In such a case, various problems can arise because the device is not operated properly. In the medical field in particular, where safe and prompt treatment is required, a single erroneous operation may cause extremely serious harm.

The above is one example of the problems to be solved by the present invention. An object of the present invention is to provide a voice recognition device and method, and a computer program and recording medium, capable of realizing accurate device operation by voice.

A voice recognition device for solving the above problem includes: first switching means for switching among a plurality of first modes, and from a first mode to a second mode, in response to a first voice command; second switching means for switching from the second mode to a first mode in response to a second voice command different from the first voice command; and a display control unit that causes a display unit to display an image corresponding to the first mode or the second mode.

A second voice recognition device for solving the above problem has: a first mode that can respond to a first voice command; a second mode that can respond only to a second voice command, different from the first voice command, for switching to the first mode; and second switching means for switching from the second mode to the first mode in response to the second voice command.

A voice recognition method for solving the above problem includes: a first switching step of switching among a plurality of first modes, and from a first mode to a second mode, in response to a first voice command; a second switching step of switching from the second mode to a first mode in response to a second voice command different from the first voice command; and a display control step of causing a display unit to display an image corresponding to the first mode or the second mode.

A computer program for solving the above problem causes a computer to execute: a first switching step of switching among a plurality of first modes, and from a first mode to a second mode, in response to a first voice command; a second switching step of switching from the second mode to a first mode in response to a second voice command different from the first voice command; and a display control step of causing a display unit to display an image corresponding to the first mode or the second mode.

A recording medium for solving the above problem has the above-described computer program recorded on it.

FIG. 1 is a block diagram showing the configuration of the voice recognition device according to the example.
FIG. 2 is a mode transition diagram showing the modes among which the voice recognition device according to the example can switch.
FIG. 3 is a flowchart showing the flow of operation of the voice recognition device according to the example.
FIG. 4 is a conceptual diagram showing how modes are switched from a normal mode.
FIG. 5 is a conceptual diagram showing how modes are switched from the special mode.
FIG. 6 is a conceptual diagram showing how to switch from the special mode to an arbitrary normal mode.
FIG. 7 is a conceptual diagram showing an example in which part of a normal mode is displayed in the special mode.
FIG. 8 is a conceptual diagram showing how to switch from a normal mode to an arbitrary special mode.

<1>
The voice recognition device according to the present embodiment includes: first switching means for switching among a plurality of first modes, and from a first mode to a second mode, in response to a first voice command; second switching means for switching from the second mode to a first mode in response to a second voice command different from the first voice command; and a display control unit that causes a display unit to display an image corresponding to the first mode or the second mode.

In the voice recognition device according to the present embodiment, during operation, the first switching means switches among the plurality of first modes and from a first mode to the second mode. The first switching means switches modes in response to a first voice command. A first voice command is preset for each mode as the voice command for switching to that mode.

In the present embodiment, the second switching means switches from the second mode to a first mode. It does so in response to the second voice command, which is preset as a voice command different from the first voice commands.

The display control unit causes the display unit to display an image corresponding to the first mode or the second mode. Consequently, when the first switching means or the second switching means switches the mode, the image displayed on the display unit is switched as well.

In the present embodiment in particular, while the device is in a first mode, a first voice command can switch it to any of the other modes: either to another first mode or to the second mode. While the device is in the second mode, by contrast, only the second voice command can switch it to another mode. Even if a first voice command is recognized in the second mode, no mode switch takes place.
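The asymmetric switching rule described above can be illustrated with a small state machine. This is only a minimal sketch: the class name, the mode labels "A", "B", and "SECOND", and the command words are hypothetical, and returning to mode "A" on the second voice command is an arbitrary choice made for the illustration.

```python
from dataclasses import dataclass

# Hypothetical command words. The text only requires that the second voice
# command differ from every first voice command.
FIRST_COMMANDS = {"mode a": "A", "mode b": "B", "dark": "SECOND"}
SECOND_COMMAND = "release"
FIRST_MODES = {"A", "B"}


@dataclass
class ModeSwitcher:
    mode: str = "A"

    def on_command(self, word: str) -> str:
        """Apply one recognized word and return the resulting mode."""
        if self.mode in FIRST_MODES:
            # First switching means: any first voice command is honoured.
            if word in FIRST_COMMANDS:
                self.mode = FIRST_COMMANDS[word]
        elif word == SECOND_COMMAND:
            # Second switching means: only the second voice command leaves
            # the second mode. Returning to "A" is an assumption of this sketch.
            self.mode = "A"
        return self.mode
```

In this sketch, a first voice command spoken while the device is in the second mode simply falls through both branches, so the mode stays unchanged.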

With this configuration, the ways of switching from the second mode to another mode are limited, so an accidental switch to a first mode is prevented when no switch from the second mode is intended. Specifically, it prevents a voice uttered with no intention of switching modes from being recognized as a mode-switching voice command and causing an inappropriate mode switch.

As described above, the voice recognition device according to the present embodiment can realize suitable mode switching between the plurality of first modes and the second mode.

<2>
In one aspect of the voice recognition device according to the present embodiment, the display control unit displays a darkened image in the second mode.

According to this aspect, the darkened image displayed in the second mode is prevented from being unintentionally switched to an image corresponding to a first mode. This prevents the surroundings from becoming bright (the state in which an image corresponding to a first mode is displayed) while work that must be done in dark surroundings (that is, with the darkened image displayed) is in progress.

A specific example of a situation in which a darkened image is displayed is a darkroom procedure in a medical setting (for example, fluorescence diagnosis).

<3>
In another aspect of the voice recognition device according to the present embodiment, the display control unit, in the second mode, displays at least part of the image corresponding to a first mode within part of the image corresponding to the second mode.

According to this aspect, at least part of the image corresponding to a first mode remains visible even in the second mode. That is, information shown in another mode can be checked even in the second mode. For example, even when a darkened image is displayed in the second mode, the information shown in a first mode can be checked while the surroundings are kept relatively dark.

<4>
In another aspect of the voice recognition device according to the present embodiment, the second switching means, in response to the second voice command, switches to the first mode that was active immediately before the switch to the second mode.

According to this aspect, a switch from the second mode to a first mode returns to the first mode that was active immediately before the second mode was entered. This makes mode switching easy even though a plurality of first modes exist.
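One way to picture this aspect is to remember the last first mode at the moment the second mode is entered. The following is a minimal sketch under that assumption; the class and method names are hypothetical, the label "SECOND" is a placeholder, and the mode names NAVI, GEF, and PREOPE are borrowed from the example given later in the description.

```python
class PreviousModeSwitcher:
    """Remembers which first mode was active before the second mode."""

    def __init__(self, first_modes, initial):
        self.first_modes = set(first_modes)
        self.mode = initial
        self.previous_first_mode = initial

    def enter_second_mode(self):
        # Record the first mode being left so it can be restored later.
        if self.mode in self.first_modes:
            self.previous_first_mode = self.mode
            self.mode = "SECOND"

    def leave_second_mode(self):
        # Called when the second voice command is recognized.
        if self.mode == "SECOND":
            self.mode = self.previous_first_mode
```

With this design the speaker does not need to name the destination mode when leaving the second mode.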

<5>
In another aspect of the voice recognition device according to the present embodiment, the second voice command is a word or non-speech sound that is unlikely to be uttered in the environment in which the voice recognition device may be used.

According to this aspect, unintentional utterance of the second voice command can be suppressed, so switching from the second mode to a first mode can be restricted effectively. A "word that is unlikely to be uttered in the environment in which the voice recognition device may be used" can be chosen by surveying in advance the words uttered in that environment and excluding those uttered frequently. Alternatively, a word with no meaning at all (a word not normally used) may be used. A "non-speech sound" (擬音 in the original) is a sound, other than the voice, that a person can produce; a tongue click is one example.
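The survey-and-exclude procedure for choosing the second voice command could look roughly like the sketch below. It assumes transcripts of speech from the target environment are available as whitespace-separated text, which is a simplification (Japanese speech would need a tokenizer). The function name, the `max_count` parameter, and the sample data are hypothetical.

```python
from collections import Counter


def rare_candidates(transcripts, candidates, max_count=0):
    """Keep only candidate words heard at most `max_count` times."""
    counts = Counter(
        word for line in transcripts for word in line.lower().split()
    )
    # Counter returns 0 for words that never occur in the transcripts.
    return [c for c in candidates if counts[c.lower()] <= max_count]


# Invented sample data for illustration only.
logs = ["scalpel please", "suction please", "lights off"]
print(rare_candidates(logs, ["please", "abracadabra", "off"]))
```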

<6>
In another aspect of the voice recognition device according to the present embodiment, the display control unit displays images to be viewed during surgery.

According to this aspect, a doctor or other person performing surgery can switch modes conveniently by voice. Moreover, if a darkened image is displayed in the second mode, darkroom procedures can be carried out appropriately.

<7>
In another aspect of the voice recognition device according to the present embodiment, the second switching means responds to a voice command that combines the second voice command with a third voice command corresponding to one of the first modes, and switches from the second mode to the first mode that corresponds to that third voice command.

According to this aspect, combining the second voice command with a third voice command makes it possible to switch from the second mode to any chosen first mode. A third voice command is preset for each mode as the voice command for switching to that one of the plurality of first modes. The third voice commands may be the same as the first voice commands.
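A combined command of this kind could be parsed as in the sketch below. The word "release" for the second voice command, the two-word ordering (second command first, then third command), and the function name are all assumptions; the text only says the two commands are combined. The mode names come from the example later in the description.

```python
SECOND_COMMAND = "release"  # hypothetical second voice command
THIRD_COMMANDS = {"navi": "NAVI", "gef": "GEF", "preope": "PREOPE"}


def target_from_second_mode(utterance: str):
    """Return the first mode named by a combined command, else None."""
    words = utterance.lower().split()
    if (
        len(words) == 2
        and words[0] == SECOND_COMMAND
        and words[1] in THIRD_COMMANDS
    ):
        return THIRD_COMMANDS[words[1]]
    # A third voice command on its own does not leave the second mode.
    return None
```

This sketch handles only the combined form; how a bare second voice command is treated in this aspect is left open here.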

<8>
In another aspect of the voice recognition device according to the present embodiment, the second voice command is a combination of a predetermined voice and a movement of the line of sight or a gesture.

According to this aspect, having the second voice command recognized requires not only uttering the predetermined voice but also making a line-of-sight movement, a gesture, or the like. This suitably avoids a voice uttered with no intention of switching modes being recognized as the second voice command.
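As a rough illustration, the combined check might require the voice and the gesture to occur close together in time. The text does not say how the two inputs are combined, so the time window, the word "release", the gesture label "nod", and the function signature below are all assumptions of this sketch.

```python
def is_second_command(
    voice_word,
    voice_time,
    gesture,
    gesture_time,
    expected_word="release",
    expected_gesture="nod",
    window_s=1.0,
):
    """True only if the expected word and gesture occur within window_s seconds."""
    return (
        voice_word == expected_word
        and gesture == expected_gesture
        and abs(voice_time - gesture_time) <= window_s
    )
```

Under this sketch, the predetermined voice alone (with `gesture=None`) is never accepted as the second voice command.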

<9>
In another aspect of the voice recognition device according to the present embodiment, the second switching means responds only to the second voice command, which is different from the first voice command and is for switching to the first mode, and switches from the second mode to the first mode.

According to this aspect, when switching from the second mode to the first mode, the second switching means responds only to the second voice command. This prevents the second mode from being switched to the first mode by mistake.

<10>
The second voice recognition device according to the present embodiment has: a first mode that can respond to a first voice command; a second mode that can respond only to a second voice command, different from the first voice command, for switching to the first mode; and second switching means for switching from the second mode to the first mode in response to the second voice command.

The second voice recognition device according to the present embodiment can switch between the first mode and the second mode. In particular, the first mode can respond to the first voice command, whereas the second mode can respond only to the second voice command. That is, switching from the second mode to the first mode can be performed only by the second voice command, not by the first voice command or any other voice command.

With this configuration, the second mode is prevented from being switched to another mode by unintended recognition of a voice command during work in the second mode.

<11>
The voice recognition method according to the present embodiment includes: a first switching step of switching among a plurality of first modes, and from a first mode to a second mode, in response to a first voice command; a second switching step of switching from the second mode to a first mode in response to a second voice command different from the first voice command; and a display control step of causing a display unit to display an image corresponding to the first mode or the second mode.

Like the voice recognition device according to the present embodiment described above, the voice recognition method according to the present embodiment can realize suitable mode switching between the plurality of first modes and the second mode.

The voice recognition method according to the present embodiment can also adopt aspects similar to the various aspects of the voice recognition device according to the present embodiment described above.

<12>
The computer program according to the present embodiment causes a computer to execute: a first switching step of switching among a plurality of first modes, and from a first mode to a second mode, in response to a first voice command; a second switching step of switching from the second mode to a first mode in response to a second voice command different from the first voice command; and a display control step of causing a display unit to display an image corresponding to the first mode or the second mode.

Because the computer program according to the present embodiment causes a computer to execute the same processing as the voice recognition method according to the present embodiment described above, it can realize suitable mode switching between the plurality of first modes and the second mode.

The computer program according to the present embodiment can also adopt aspects similar to the various aspects of the voice recognition device according to the present embodiment described above.

<13>
The recording medium according to the present embodiment has the above-described computer program recorded on it.

With the recording medium according to the present embodiment, having a computer execute the above-described computer program realizes suitable mode switching between the plurality of first modes and the second mode.

The operation and other advantages of the voice recognition device and voice recognition method, and of the computer program and recording medium, according to the present embodiment are described in more detail in the example below.

An example of the voice recognition device and method, and of the computer program and recording medium, is described in detail below with reference to the drawings. The description takes as its example a case in which the voice recognition device is applied to a display system used in an operating room in a medical setting.

<Device configuration>
First, the configuration of the voice recognition device according to this example will be described with reference to FIG. 1. FIG. 1 is a block diagram showing the configuration of the voice recognition device according to the example.

As shown in FIG. 1, the voice recognition device according to this example includes a voice acquisition unit 110, a voice recognition unit 120, a mode determination unit 130, a voice command determination unit 140, a mode change unit 150, and a screen transition unit 160.

The voice acquisition unit 110 includes, for example, a microphone, and is configured to output a voice signal representing the acquired voice. The voice signal is output to the voice recognition unit 120.

The voice recognition unit 120 is configured to recognize words contained in the voice represented by the voice signal (that is, words that may be recognized as voice commands for switching modes). A word recognized by the voice recognition unit 120 is output to the voice command determination unit 140 as a signal indicating that word.

The mode determination unit 130 is configured to acquire information on the current mode of the voice recognition device. The individual modes are described in detail later. The mode information acquired by the mode determination unit 130 is output to the voice command determination unit 140.

The voice command determination unit 140 is configured to determine whether a word recognized by the voice recognition unit 120 is a voice command appropriate to the current mode. To carry out the specific processing, it includes a mode recognition unit 141, a match rate calculation unit 142, and a match rate determination unit 143.

The mode recognition unit 141 determines which voice commands should be recognized, based on the current-mode information received from the mode determination unit 130. In other words, it selects the voice commands that correspond to the mode.

The match rate calculation unit 142 calculates the match rate between the word recognized by the voice recognition unit 120 and the voice commands registered in advance. Various existing techniques can be used to calculate the match rate, so a detailed description is omitted here.

The match rate determination unit 143 determines whether the acquired voice is a voice command that should be recognized, based on the mode recognized by the mode recognition unit 141 and the match rate calculated by the match rate calculation unit 142.

The voice command determination unit 140 may instead use an index other than the match rate to determine whether the voice is a voice command. The determination result of the voice command determination unit 140 is output to the mode change unit 150.

The mode change unit 150 is configured to switch modes in accordance with voice commands, and to output the result of the mode switch to the screen transition unit 160.

The screen transition unit 160 is configured so that, when the mode change unit 150 switches the mode, it can switch the display of an external display unit (for example, a liquid crystal display) to one corresponding to the new mode.

<Mode description>
Next, the modes switched by the voice recognition device described above will be described in detail with reference to FIG. 2. FIG. 2 is a mode transition diagram showing the modes among which the voice recognition device according to the example can switch.

As shown in FIG. 2, the voice recognition device according to this example can switch in either direction between three normal modes (NAVI mode, GEF mode, and PREOPE mode) and one special mode (darkening mode).

NAVI mode corresponds to the so-called default screen. It displays images of the patient's facial expression and limbs during surgery, together with the BIS value and T1/T2 images. The BIS value indicates the patient's level of sedation in surgery performed under anesthesia. T1/T2 images are images captured by MRI (Magnetic Resonance Imaging) when the substance to be emphasized is changed.

GEF (Gefrierschnitt) mode displays biopsy results and can show, for example, analysis results for each tissue sample collected.

PREOPE (preoperative diagnosis) mode displays preoperative images. In the example shown in the figure, an image showing information about the brain is displayed.

Darkening mode displays a darkened image (that is, a black screen). It is used when light from the screen needs to be blocked, for example to perform a darkroom procedure.

The modes above are only an example. The voice recognition device according to this example can be applied wherever a plurality of normal modes and at least one special mode are switched among one another.

<Process description>
Next, the operation of the voice recognition device according to this example will be described with reference to FIG. 3. FIG. 3 is a flowchart showing the flow of operation of the voice recognition device according to this example.

In FIG. 3, when the voice recognition device according to this example operates, the voice acquisition unit 110 first acquires voice (step S101). The acquired voice is recognized by the voice recognition unit 120 (step S102) and output to the voice command determination unit 140. In parallel with, or before or after, this acquisition and recognition, the mode determination unit 130 checks the current mode (step S103). Information on the current mode is output to the voice command determination unit 140.

The voice command determination unit 140 calculates the match rate between the word contained in the acquired voice and the words registered in advance as voice commands (step S104). A match rate is calculated for each of the registered words, but only the highest match rate is output as the result.
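Step S104 can be sketched as follows. The description leaves the match-rate technique open, so this illustration swaps in the similarity ratio from Python's standard `difflib.SequenceMatcher` as a stand-in. The registered words are English placeholders chosen for the sketch, and `best_match` is a hypothetical name.

```python
from difflib import SequenceMatcher

# Placeholder registered command words (assumed for illustration).
REGISTERED = ["navi", "gef", "preope", "darkening"]


def best_match(word, commands=REGISTERED):
    """Score the word against every registered command; return the best pair."""
    rates = {
        c: SequenceMatcher(None, word.lower(), c).ratio() for c in commands
    }
    # Only the command with the highest match rate is passed on.
    best = max(rates, key=rates.get)
    return best, rates[best]
```

A real recognizer would more likely score acoustic or phonetic similarity than spelling. The point of the sketch is only the "score every registered word, keep the maximum" structure.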

Once the match rate has been calculated, it is determined whether or not the current mode is a normal mode (step S105). That is, it is determined whether the current mode is a normal mode (NAVI mode, GEF mode, or PREOPE mode) or the special mode (the darkening mode).

When the current mode is a normal mode (step S105: YES), it is determined whether or not the command with the highest match rate is a normal command and the match rate is equal to or higher than a predetermined threshold value (step S106). A normal command is a specific example of the "first voice command", and is defined for each mode as a voice command for switching between normal modes and for switching from a normal mode to the special mode. Specifically, for switching between normal modes, the mode name itself serves as the voice command. For switching from a normal mode to the special mode, the word "darkening" is registered as the voice command. The predetermined threshold value is a threshold set for determining whether or not the recognized voice is a voice command, and an optimum value is set in advance.

When the command with the highest match rate is a normal command and the match rate is equal to or higher than the predetermined threshold value (step S106: YES), the mode change unit 150 switches from the current mode to the other mode indicated by the normal command (step S107). The screen transition unit 160 then transitions the screen of the display unit to the one corresponding to the new mode (step S108). If the command with the highest match rate is not a normal command, or if the match rate is below the predetermined threshold value (step S106: NO), the voice command determination results in an error, and the mode change unit 150 does not change the mode (step S109).

On the other hand, when the current mode is the special mode (step S105: NO), it is determined whether or not the command with the highest match rate is the special command and the match rate is equal to or higher than the predetermined threshold value (step S110). The special command is a specific example of the "second voice command", and is set as a voice command, different from the normal commands, for switching from the special mode to a normal mode. In this embodiment, the word "resume" is set as the special command.

When the command with the highest match rate is the special command and the match rate is equal to or higher than the predetermined threshold value (step S110: YES), the mode change unit 150 switches from the special mode to a normal mode (step S111). The screen transition unit 160 then transitions the screen of the display unit to the one corresponding to the new mode (step S112). If the command with the highest match rate is not the special command, or if the match rate is below the predetermined threshold value (step S110: NO), the voice command determination results in an error, and the mode change unit 150 does not change the mode (step S113).
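The branch structure of steps S105 to S113 can be sketched as follows. This is only an illustration under stated assumptions: the threshold value 0.8, the English command words, the mode labels, and the function name `decide` are hypothetical, and returning `None` stands in for the error outcome in which no mode change takes place. The screen transitions of steps S108 and S112 are omitted.

```python
from typing import Optional

NORMAL_MODES = {"NAVI", "GEF", "PREOPE"}
# Normal commands (first voice commands): mode names, plus the word for darkening.
NORMAL_COMMANDS = {
    "NAVI mode": "NAVI",
    "GEF mode": "GEF",
    "PREOPE mode": "PREOPE",
    "darkening": "DARKENING",
}
SPECIAL_COMMAND = "resume"  # second voice command
THRESHOLD = 0.8  # assumed value; the source only says it is set in advance


def decide(current_mode: str, previous_normal_mode: Optional[str],
           command: str, rate: float) -> Optional[str]:
    """Return the mode to switch to, or None when the determination is an error."""
    if current_mode in NORMAL_MODES:                            # S105: YES
        if command in NORMAL_COMMANDS and rate >= THRESHOLD:    # S106
            return NORMAL_COMMANDS[command]                     # S107
        return None                                             # S109
    # S105: NO (special mode)
    if command == SPECIAL_COMMAND and rate >= THRESHOLD:        # S110
        return previous_normal_mode                             # S111
    return None                                                 # S113
```

In this sketch a normal command recognized while in the special mode always falls through to the error branch, however high its match rate.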

<Specific mode switching operation>
Next, specific mode switching operations using voice commands will be described with reference to FIGS. 4 to 8. FIG. 4 is a conceptual diagram showing how modes are switched from a normal mode, and FIG. 5 is a conceptual diagram showing how modes are switched from the special mode. FIG. 6 is a conceptual diagram showing how to switch from the special mode to an arbitrary normal mode, and FIG. 7 is a conceptual diagram showing examples of partial display of a normal mode in the special mode. FIG. 8 is a conceptual diagram showing how to switch from a normal mode to an arbitrary special mode.

As shown in FIG. 4, assume that the current mode is the GEF mode, which is a normal mode. In this state, when voice containing the words "NAVI mode" is acquired, the normal command "NAVI mode" corresponding to switching to the NAVI mode is recognized, and the mode is switched from the GEF mode to the NAVI mode. Similarly, when voice containing the words "PREOPE mode" is acquired, the normal command "PREOPE mode" corresponding to switching to the PREOPE mode is recognized, and the mode is switched from the GEF mode to the PREOPE mode.

Likewise, when voice containing the word "darkening" is acquired, the normal command "darkening" corresponding to switching to the darkening mode is recognized, and the mode is switched from the GEF mode to the darkening mode.

In this way, from the GEF mode, which is a normal mode, the mode can be switched by a normal command.

As shown in FIG. 5, assume that the current mode is the darkening mode, which is the special mode. In this state, when voice containing the words "NAVI mode" is acquired, the normal command "NAVI mode" corresponding to switching to the NAVI mode is recognized, but in the special mode no mode switching is performed by a normal command. The mode is therefore not switched from the darkening mode to the NAVI mode.

On the other hand, when voice containing the word "resume" is acquired, the special command "resume" corresponding to switching from the special mode to a normal mode is recognized, and the mode is switched from the darkening mode to the GEF mode (the normal mode that was active immediately before switching to the darkening mode).

In this way, the mode cannot be switched by a normal command from the darkening mode, which is the special mode. With the special command, however, the mode can be switched from the special mode to a normal mode.

As shown in FIG. 6, consider again the case where the current mode is the darkening mode, which is the special mode. When voice containing the word "resume" is acquired in this state, the special command "resume" corresponding to switching from the special mode to a normal mode is recognized, as already described, and the mode is switched from the darkening mode to the GEF mode (the normal mode that was active immediately before switching to the darkening mode).

On the other hand, when the words "resume" and "NAVI mode" are acquired in succession, the special command "resume" corresponding to switching from the special mode to a normal mode is recognized, together with the designation command "NAVI mode" that designates the NAVI mode as the switching destination, and the mode is switched from the darkening mode to the NAVI mode (the mode indicated by the designation command). The designation command is a specific example of the "third voice command", and is set for each mode as a command for designating the destination mode. Specifically, as with the normal commands, the mode name itself serves as the designation command.

In this way, by combining the special command with a designation command, it is possible to switch from the special mode to an arbitrary normal mode. That is, it is possible to switch to a normal mode other than the one that was active immediately before switching to the special mode.
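The switching behavior described for FIGS. 4 to 6 can be summarized as a small state machine. The following Python sketch is a hypothetical illustration, not the patented implementation: the class name `ModeController`, the English command words, and the assumption that commands arrive as an already recognized, thresholded list in utterance order are all introduced here for the example.

```python
from typing import List

MODE_NAMES = {"NAVI mode": "NAVI", "GEF mode": "GEF", "PREOPE mode": "PREOPE"}


class ModeController:
    """Tracks the current mode and the normal mode active before darkening."""

    def __init__(self, mode: str = "GEF") -> None:
        self.mode = mode
        self.previous_normal = mode

    def handle(self, commands: List[str]) -> str:
        """Apply recognized commands (in utterance order); return the new mode."""
        if not commands:
            return self.mode
        first, rest = commands[0], commands[1:]
        if self.mode == "DARKENING":
            if first != "resume":
                return self.mode  # normal commands are ignored in the special mode
            target = self.previous_normal  # default: mode just before darkening
            if rest and rest[0] in MODE_NAMES:
                target = MODE_NAMES[rest[0]]  # designation command overrides it
            self.mode = target
        elif first == "darkening":
            self.previous_normal = self.mode
            self.mode = "DARKENING"
        elif first in MODE_NAMES:
            self.mode = MODE_NAMES[first]
        return self.mode
```

Starting from the GEF mode, `handle(["darkening"])` enters the darkening mode, a following `handle(["NAVI mode"])` leaves the mode unchanged, `handle(["resume"])` returns to GEF, and `handle(["resume", "NAVI mode"])` issued from the darkening mode goes to NAVI instead.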

As shown in FIG. 7, the darkening mode need not show only the darkened image. In this embodiment, when the NAVI mode is switched to the darkening mode, a darkening mode (NAVI) showing only the darkened image is realized. When the GEF mode is switched to the darkening mode, a darkening mode (GEF) that displays part of the important information of the GEF mode on the darkened image is realized, and when the PREOPE mode is switched to the darkening mode, a darkening mode (PREOPE) that displays part of the important information of the PREOPE mode on the darkened image is realized.

In this way, the information shown in the normal mode can be checked even during a procedure performed in the darkening mode.

As shown in FIG. 8, assume that the current mode is the NAVI mode, which is a normal mode. In this state, when voice containing the word "darkening" is acquired, the normal command "darkening" corresponding to switching to the darkening mode is recognized, as already described, and the mode is switched from the NAVI mode to the darkening mode.

On the other hand, when the words "darkening" and "GEF mode" are acquired in succession, the normal command "darkening" corresponding to switching to the darkening mode is recognized, together with the display command "GEF mode" requesting partial display of the GEF mode, and the mode is switched from the NAVI mode to the darkening mode (GEF), which displays part of the important information of the GEF mode. The display command is set for each mode as a command for designating the normal mode to be partially displayed on the darkened image. Specifically, as with the normal commands, the mode name itself serves as the command.

In this way, by combining a normal command with a display command, it is possible to switch from a normal mode to an arbitrary special mode. That is, part of an arbitrary normal mode can be displayed on the darkened image.
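The selection of the darkening variant described for FIGS. 7 and 8 can be sketched as below. The function name `enter_darkening`, the variant labels, and the English command words are hypothetical; the rule that the NAVI variant shows the darkened image only follows the embodiment, while treating an unrecognized display command as absent is an assumption of this example.

```python
from typing import Optional, Tuple

# Normal modes whose important information is partially shown on the darkened
# image; per the embodiment, the NAVI variant shows the darkened image only.
OVERLAY_SOURCES = {"GEF", "PREOPE"}
DISPLAY_COMMANDS = {"NAVI mode": "NAVI", "GEF mode": "GEF", "PREOPE mode": "PREOPE"}


def enter_darkening(current_mode: str,
                    display_command: Optional[str] = None) -> Tuple[str, Optional[str]]:
    """Return (darkening variant, normal mode partially displayed or None)."""
    # Without a display command, the variant follows the current normal mode.
    source = DISPLAY_COMMANDS.get(display_command, current_mode)
    overlay = source if source in OVERLAY_SOURCES else None
    return f"DARKENING({source})", overlay
```

For example, `enter_darkening("NAVI")` selects the plain darkened image, whereas `enter_darkening("NAVI", "GEF mode")` selects the variant that overlays part of the GEF mode information.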

<Effects of the embodiment>
Finally, the technical effects obtained by the voice recognition device according to this embodiment will be described in detail.

As described with reference to FIGS. 1 to 5, with the voice recognition device according to this embodiment, switching to each of the other modes can be performed by a normal command while in a normal mode. That is, from a normal mode, a normal command can switch to another normal mode or to the special mode. In the special mode, on the other hand, switching to another mode can be performed only by the special command. That is, even if a normal command is recognized in the special mode, the mode is not switched to a normal mode.

Accordingly, because the ways of switching from the special mode to a normal mode are limited, it is possible to prevent an erroneous switch to a normal mode when no such switch is intended.

Suppose that mode switching by a normal command were possible even in the special mode. If someone said "I saw it in NAVI mode earlier..." in conversation while in the darkening mode, the normal command "NAVI mode" would be recognized and the switch to the NAVI mode would be executed. If a darkroom procedure were being performed in the darkening mode, switching to the NAVI mode would end the darkroom state, and the procedure could no longer be performed properly. In a medical setting, such a problem could cause serious harm.

In contrast, the voice recognition device according to this embodiment can prevent voice uttered with no intention of switching modes from being recognized as a voice command for mode switching, and can thus prevent inappropriate mode switching.

Further, as described with reference to FIGS. 6 to 8, combining the designation command and the display command allows more appropriate mode switching. The information to be checked can therefore be displayed in a highly suitable manner.

The present invention is not limited to the embodiment described above, and can be modified as appropriate within a range that does not depart from the gist or idea of the invention as read from the claims and the specification as a whole. A voice recognition device and voice recognition method, as well as a computer program and recording medium, involving such modifications are also included in the technical scope of the present invention.

110 Voice acquisition unit
120 Voice recognition unit
130 Mode determination unit
140 Voice command determination unit
141 Mode recognition unit
142 Match rate calculation unit
143 Match rate determination unit
150 Mode change unit
160 Screen transition unit

Claims

A first switching means that performs mode switching between a plurality of first modes and mode switching from the first mode to the second mode in response to a first voice command.
A second switching means for switching the mode from the second mode to the first mode in response to a second voice command different from the first voice command.
It is provided with a display control unit for displaying an image corresponding to the first mode or the second mode on the display unit.
The display control unit is a voice recognition device characterized by displaying a darkened image in the second mode.

A first switching means that performs mode switching between a plurality of first modes and mode switching from the first mode to the second mode in response to a first voice command.
A second switching means for switching the mode from the second mode to the first mode in response to a second voice command different from the first voice command.
With a display control unit that displays an image corresponding to the first mode or the second mode on the display unit.
With
In the second mode, the display control unit causes a part of the image corresponding to the second mode to display at least a part of the image corresponding to the first mode.
A voice recognition device characterized by this.

A first switching means that performs mode switching between a plurality of first modes and mode switching from the first mode to the second mode in response to a first voice command.
A second switching means for switching the mode from the second mode to the first mode in response to a second voice command different from the first voice command.
With a display control unit that displays an image corresponding to the first mode or the second mode on the display unit.
With
The second switching means responds to the second voice command and performs mode switching to the first mode immediately before switching to the second mode.
A voice recognition device characterized by this.

The voice recognition device according to any one of claims 1 to 3, wherein the second voice command is a word or an onomatopoeia that is unlikely to be uttered in an environment in which the voice recognition device can be used.

The voice recognition device according to any one of claims 1 to 4, wherein the display control unit displays an image to be visually observed at the time of surgery.

The voice recognition device according to any one of claims 1 to 5, wherein the second switching means, in response to a voice command in which the second voice command is combined with a third voice command corresponding to each of the first modes, performs mode switching to the first mode corresponding to the third voice command.

The voice recognition device according to any one of claims 1 to 6, wherein the second voice command is a combination of a predetermined voice and a movement of the line of sight or a gesture.

The voice recognition device according to any one of claims 1 to 5, wherein the second switching means responds only to the second voice command, which is different from the first voice command for switching to the first mode, and performs mode switching from the second mode to the first mode.

A first switching step of performing mode switching between a plurality of first modes and mode switching from the first mode to the second mode in response to a first voice command.
A second switching step of switching the mode from the second mode to the first mode in response to a second voice command different from the first voice command.
A display control step of displaying an image corresponding to the first mode or the second mode on the display unit.
Including
The display control step is a voice recognition method characterized in that a darkened image is displayed in the second mode.

A first switching step of performing mode switching between a plurality of first modes and mode switching from the first mode to the second mode in response to a first voice command.
A second switching step of switching the mode from the second mode to the first mode in response to a second voice command different from the first voice command.
The computer is made to execute the display control step of displaying the image corresponding to the first mode or the second mode on the display unit.
The display control step is a computer program characterized in that a darkened image is displayed in the second mode.

A first switching step of performing mode switching between a plurality of first modes and mode switching from the first mode to the second mode in response to a first voice command.
A second switching step of switching the mode from the second mode to the first mode in response to a second voice command different from the first voice command.
A display control step of displaying an image corresponding to the first mode or the second mode on the display unit.
Let the computer run
In the display control step, at the time of the second mode, at least a part of the image corresponding to the first mode is displayed on a part of the image corresponding to the second mode.
A computer program characterized by that.

A first switching step of performing mode switching between a plurality of first modes and mode switching from the first mode to the second mode in response to a first voice command.
A second switching step of switching the mode from the second mode to the first mode in response to a second voice command different from the first voice command.
A display control step of displaying an image corresponding to the first mode or the second mode on the display unit.
Let the computer run
In the second switching step, in response to the second voice command, the mode is switched to the first mode immediately before switching to the second mode.
A computer program characterized by that.