JP2013061793A

JP2013061793A - Input support device, input support method, and input support program

Info

Publication number: JP2013061793A
Application number: JP2011199750A
Authority: JP
Inventors: Kiyoyuki Suzuki; 清幸鈴木; Janaca Atupelage Chamidu; ジャナカアトゥペラゲチャミドゥ; Motokazu Hozumi; 元一穂積; Masami Nakamura; 雅巳中村; Yutaka Kondo; 裕近藤
Original assignee: CYBER CLERK INSTITUTE; Advanced Media Inc
Current assignee: CYBER CLERK INSTITUTE; Advanced Media Inc
Priority date: 2011-09-13
Filing date: 2011-09-13
Publication date: 2013-04-04

Abstract

PROBLEM TO BE SOLVED: To provide an input support device capable of easily adding a voice input function to an existing application that requires high security. SOLUTION: An information processing device 200, which includes the input support device of the present invention, adds a voice input function to an electronic medical chart unit 203, which is application software that displays input areas for inputting information on an input screen. The information processing device includes: a screen configuration table storage unit 208 that stores a screen configuration table in which the display position of each input area is associated with first information specifying a reading that represents the input area; and an input information generation unit 210 that obtains second information indicating the result of voice recognition processing on uttered voice, refers to the screen configuration table to identify the display position of the input area corresponding to the obtained second information, and inputs operation information for selecting the identified display position to the electronic medical chart unit 203.

Description

The present invention relates to an input support device, an input support method, and an input support program for adding a voice input function to application software such as an electronic medical chart.

Application software for entering large amounts of data into a database (hereinafter simply “application”) is usually purpose-built for each database or each use. Hospital electronic medical charts are a typical example. As speech recognition technology has improved in recent years, there is a growing need to add a voice input function to such existing applications.

Techniques therefore exist for adding a voice input function to an existing application without modifying it (see, for example, Patent Document 1).

The technique described in Patent Document 1 (hereinafter “conventional technique”) accepts, from the user, reading settings for the user interface components that make up the input screen of a web application. When the speech recognition result for input speech matches one of the set readings, the conventional technique executes processing appropriate to the component for which that reading was set. For example, when the component is a text input field, the conventional technique sets, in that field, the character string spoken after the reading.

With this conventional technique, an existing application can be operated by voice without any particular modification.

Patent Document 1: JP 2007-164732 A

However, the conventional technique is difficult to apply to applications that require high security, such as electronic medical charts. Such applications are usually not open source, so it is difficult to perform component-level processing on them from outside.

An object of the present invention is to provide an input support device, an input support method, and an input support program that can easily add a voice input function to an existing application that requires high security.

An input support device according to one aspect of the present invention adds a voice input function to application software that displays, on an input screen, an input area for inputting information. The device includes: a screen configuration table storage unit that stores a screen configuration table in which the display position of the input area is associated with first information specifying a reading that represents the input area; and an input information generation unit that acquires second information representing the result of speech recognition processing on uttered speech, identifies the display position of the input area corresponding to the acquired second information by referring to the screen configuration table, and inputs, to the application software, operation information for selecting the identified display position.

An input support method according to one aspect of the present invention adds a voice input function to application software that displays, on an input screen, an input area for inputting information and label text indicating an attribute of the input area. The method includes the steps of: acquiring the display position of the input area and the display position of the label text; generating, based on the relative relationship between the display position of the input area and the display position of the label text, a screen configuration table in which information specifying the reading of the label text is associated with the display position of the input area as information specifying a reading that represents the input area; acquiring information representing the result of speech recognition processing on uttered speech; identifying, by referring to the screen configuration table, the display position of the input area corresponding to the acquired information representing the result of the speech recognition processing; and inputting, to the application software, operation information for selecting the identified display position.

An input support program according to one aspect of the present invention adds a voice input function to application software that displays, on an input screen, an input area for inputting information and label text for indicating an attribute of the input area. The program causes a computer that can access the application software to execute: a process of acquiring the display position of the input area and the display position of the label text; a process of generating, based on the relative relationship between the display position of the input area and the display position of the label text, a screen configuration table in which information specifying the reading of the label text is associated with the display position of the input area as information specifying a reading that represents the input area; a process of acquiring information representing the result of speech recognition processing on uttered speech; a process of identifying, by referring to the screen configuration table, the display position of the input area corresponding to the acquired information representing the result of the speech recognition processing; and a process of inputting, to the application software, operation information for selecting the identified display position.

According to the present invention, a voice input function can easily be added to an existing application that requires high security.

System configuration diagram showing the configuration of an input support system according to an embodiment of the present invention
Block diagram showing an example of the configuration of the information processing device according to the present embodiment
Diagram showing an example of the contents of the table creation rule in the present embodiment
Diagram showing an example of the operation conversion rule in the present embodiment
Block diagram showing an example of the configuration of the wireless communication terminal in the present embodiment
Block diagram showing an example of the configuration of the speech recognition server in the present embodiment
Diagram showing an example of the contents of the device information in the present embodiment
Diagram showing an example of the contents of the clerk information in the present embodiment
Diagram showing an example of the contents of the clerk decision rule in the present embodiment
Flowchart showing an example of the operation of the information processing device in the present embodiment
Plan view showing an example of the input screen in the present embodiment
Flowchart showing an example of the table generation processing in the present embodiment
Diagram showing an example of the contents of the screen configuration table in the present embodiment
Diagram showing an example of the contents of the dictionary information in the present embodiment
Diagram showing an example of the contents of the ID-text correspondence information in the present embodiment
Diagram showing an example of the structure of the data transmitted from the information processing device to the wireless communication terminal in the present embodiment
Flowchart showing an example of the operation of the wireless communication terminal in the present embodiment
Diagram showing an example of the structure of the first transmission data from the wireless communication terminal to the speech recognition server in the present embodiment
Diagram showing an example of the structure of the second transmission data from the wireless communication terminal to the speech recognition server in the present embodiment
Flowchart showing an example of the operation of the speech recognition device in the present embodiment
Diagram showing an example of the contents of the ID-reading correspondence table in the present embodiment
Diagram showing an example of the structure of the data transmitted from the speech recognition server to a calibration terminal in the present embodiment
Diagram showing an example of the structure of the data transmitted from a calibration terminal to the speech recognition server in the present embodiment
Diagram showing an example of the structure of the data transmitted from the speech recognition server to the information processing device in the present embodiment
Flowchart showing an example of the operation conversion processing by the information display device in the present embodiment

An embodiment of the present invention is described in detail below with reference to the drawings. In this embodiment, the present invention is applied to an input support system that adds a voice input function to a hospital's electronic medical chart.

First, an overview of the input support system according to the present invention is given.

FIG. 1 is a system configuration diagram showing the configuration of an input support system according to an embodiment of the present invention.

In FIG. 1, the input support system 100 includes an information processing device 200, an in-hospital LAN 300 to which the information processing device 200 is connected, and a wireless communication terminal 400. The information processing device 200 and the in-hospital LAN 300 are located inside the hospital. The wireless communication terminal 400 is carried by the user (a physician) of the information processing device 200 and is inside the hospital whenever the user operates the information processing device 200.

Although not shown, there are in practice a plurality of in-hospital LANs 300, information processing devices 200, and wireless communication terminals 400.

The input support system 100 also includes a public communication network 500, and a speech recognition server 600 and first to Nth calibration terminals 700-1 to 700-N that are connected to the public communication network 500. The public communication network 500, the speech recognition server 600, and the first to Nth calibration terminals 700-1 to 700-N are located outside the hospital.

The information processing device 200 is a personal computer on which an electronic medical chart application (hereinafter “electronic medical chart”) is installed. Information entered into the electronic medical chart is transmitted, for example, to an information server (not shown) on the in-hospital LAN 300. The information processing device 200 uses the wireless communication terminal 400, the public communication network 500, the speech recognition server 600, and the first to Nth calibration terminals 700-1 to 700-N to implement voice input operations on the electronic medical chart.

In the present embodiment, the electronic medical chart is existing application software that displays its input screen in a window whose position and size are variable. The input screen contains a plurality of input areas, such as text boxes, and a plurality of label texts. An input area (component) is an image region that accepts at least one of text input and item selection; examples are a text box and a selection button. Label text is text displayed on the input screen other than text entered by the user, for example by voice input (hereinafter “input text”); an example is a character string displayed as an attribute of an input area. Furthermore, the electronic medical chart in the present embodiment is an application in which no reading has been set for any input area and which requires high security.

The in-hospital LAN 300 is a communication network over which various kinds of highly confidential information, such as patient information, are exchanged.

The wireless communication terminal 400 is, for example, a mobile phone that has a wireless LAN function and an Internet communication function and is carried by the user (a physician) of the information processing device 200. While it is inside the hospital, the wireless communication terminal 400 can connect by wireless communication to both the information processing device 200 and the public communication network 500.

The wireless communication terminal 400 also receives the user's uttered speech as input. Using the public communication network 500, the speech recognition server 600, and the first to Nth calibration terminals 700-1 to 700-N, the wireless communication terminal 400 acquires a speech recognition result for the input speech (hereinafter simply “terminal input speech”). The wireless communication terminal 400 then transfers the acquired speech recognition result to the information processing device 200.

The public communication network 500 is a communication network, such as the Internet, to which an unspecified large number of terminals can connect.

The speech recognition server 600 performs highly accurate speech recognition processing on terminal input speech. The speech recognition server 600 also uses the public communication network 500 and the first to Nth calibration terminals 700-1 to 700-N to obtain an even more accurate speech recognition result.

The first to Nth calibration terminals 700-1 to 700-N accept calibration (correction) work on the result of the speech recognition processing performed by the speech recognition server 600 (hereinafter referred to, as appropriate, as the “uncalibrated speech recognition result”). Hereinafter, a person who performs calibration work using one of the first to Nth calibration terminals 700-1 to 700-N is called a “clerk”, and the result of the calibration work is referred to, as appropriate, as the “calibrated speech recognition result”.

By incorporating calibration work by clerks, the input support system 100 can obtain highly accurate speech recognition results. Moreover, by allowing calibration work to be performed at a plurality of calibration terminals 700 connected to the public communication network 500, the system can draw on more human resources and can obtain more accurate speech recognition results in a shorter time.

From the standpoint of securing the electronic medical chart and the in-hospital LAN 300, however, the information processing device 200 should not be connected directly to the public communication network 500.

The input support system 100 therefore makes the connection between the information processing device 200 and the public communication network 500 an indirect one, via wireless communication with the wireless communication terminal 400, as described above. In addition, because voice input is performed at the wireless communication terminal 400, the system can in principle limit traffic to a one-way transfer of speech recognition results, which greatly reduces the frequency of communication between the information processing device 200 and the public communication network 500.

As a result, the input support system 100 can add a highly accurate voice input function to the electronic medical chart while maintaining the security of the electronic medical chart and the in-hospital LAN 300.

As described above, the electronic medical chart in the present embodiment is an existing application, and there is a need to use it as it is.

As also described above, however, no readings are set in advance for the input areas of the electronic medical chart in the present embodiment.

The input support system 100 therefore analyzes the input screen in the information processing device 200 and associates label text with each input area position based on the relative relationship between the display position of the input area (hereinafter “input area position”) and the display position of the label text (hereinafter “label text position”). Then, when a speech recognition result corresponding to a label text is obtained, the information processing device 200 inputs, to the electronic medical chart, operation information for selecting the input area position associated with that label text (that is, operation information indicating a selection operation on the input area position).
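The lookup described above can be illustrated with a minimal Python sketch. It is not the patented implementation: the names `TableRow`, `SCREEN_TABLE`, and `to_operation`, the pixel coordinates, and the dictionary used as operation information are all assumptions made for illustration, and the recognition result is simplified to the label text itself.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class TableRow:
    label_text: str  # label text associated with the input area
    x: int           # input area position on the input screen (assumed pixel coordinates)
    y: int


# Hypothetical screen configuration table: one row per input area.
SCREEN_TABLE = [
    TableRow("Name", 120, 40),
    TableRow("Age", 120, 80),
]


def to_operation(recognized: str, table: list[TableRow]) -> Optional[dict]:
    """Return operation information that selects the input area whose label matches."""
    for row in table:
        if row.label_text == recognized:
            # Assumed representation of a "select this position" operation.
            return {"action": "select", "x": row.x, "y": row.y}
    return None  # no label matched, so nothing is sent to the application
```

Because the application only ever receives a position-based selection operation, this style of lookup does not require access to the application's internal components.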

As a result, the input support system 100 can add a voice input function to the electronic medical chart without modifying it. In other words, the input support system 100 can easily add a highly accurate voice input function to the electronic medical chart, an existing application that requires high security.

Alternatively, the input support system 100 may acquire the input area positions in the information processing device 200 and accept, from the user, a manual association of label text with each acquired input area position.

This concludes the overview of the input support system 100.

Next, the configuration of each device is described.

FIG. 2 is a block diagram showing an example of the configuration of the information processing device 200.

In FIG. 2, the information processing device 200 includes a LAN communication unit 201, a terminal communication unit 202, an electronic medical chart unit 203, an image output unit 204, and an operation input unit 205. The information processing device 200 also includes a table creation rule storage unit 206, a screen configuration analysis unit 207, a screen configuration table storage unit 208, an operation conversion rule storage unit 209, and an input information generation unit 210.

Of these functional units, for example, the table creation rule storage unit 206, the screen configuration analysis unit 207, and the screen configuration table storage unit 208 constitute an application that sets the associations between input area positions and label texts.

The operation conversion rule storage unit 209 and the input information generation unit 210 constitute an application that inputs speech recognition results into the electronic medical chart based on the associations that have been set.

The LAN communication unit 201 is, for example, a LAN interface and connects to the in-hospital LAN 300 so that it can communicate over it.

The terminal communication unit 202 is, for example, a wireless LAN interface and communicates wirelessly with the wireless communication terminal 400.

The electronic medical chart unit 203 is the functional unit that implements the electronic medical chart described above. The electronic medical chart unit 203 communicates, for example, with the information server on the in-hospital LAN 300 via the LAN communication unit 201.

The image output unit 204 is, for example, a liquid crystal display (not shown) and displays the input screen generated by the electronic medical chart unit 203.

The operation input unit 205 is, for example, a mouse and a keyboard (not shown) and accepts manual input operations on the electronic medical chart unit 203 from the user.

The table creation rule storage unit 206 stores a table creation rule in advance. The table creation rule defines the conditions for associating label text with an input area position in terms of the relative relationship between the input area position and the label text position. This relative relationship includes at least the placement direction and the distance.

FIG. 3 is a diagram showing an example of the contents of the table creation rule.

As shown in FIG. 3, the table creation rule 810 describes, for each type 811, association targets 813 to which priorities 812 are assigned. The type 811 indicates the input form of the input area (hereinafter simply “type”) and here includes “tab”, “text box”, “selection button”, and “selection menu”. The priority 812 indicates that, when an association target with a higher priority 812 exists, association targets with a lower priority 812 should be ignored. The association target 813 indicates the condition that label text must satisfy to be associated with the input area.
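The role of the priority 812 can be sketched in Python as follows. This is an illustration under stated assumptions, not the contents of FIG. 3: the `Rule` class, the `pick_label` helper, the second rule entry, and the convention that priority 1 is the highest are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    area_type: str  # type 811, e.g. "text box"
    priority: int   # priority 812; 1 is assumed to be the highest
    condition: str  # association target 813, kept as free text in this sketch


# Hypothetical excerpt; the second entry is invented purely for illustration.
RULES = [
    Rule("text box", 1, "label on the left within the distance thresholds"),
    Rule("text box", 2, "label directly above (assumed)"),
]


def pick_label(area_type, matches):
    """Pick the label matched at the highest priority and ignore lower ones.

    matches maps a priority to the label text that satisfied the rule
    with that priority for one input area.
    """
    candidates = (r for r in RULES if r.area_type == area_type)
    for rule in sorted(candidates, key=lambda r: r.priority):
        if rule.priority in matches:
            return matches[rule.priority]
    return None  # no rule for this type matched any label
```

For example, when labels match at both priorities 1 and 2, only the priority 1 label is returned, which mirrors the statement that lower-priority targets should be ignored.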

Here, the symbol D_x denotes the separation distance in the x-axis direction between the input area and the label text, and the symbol D_y denotes the separation distance in the y-axis direction between them. The symbol |Δx| denotes the difference between the x-axis coordinate of the input area and that of the label text, and the symbol |Δy| denotes the difference between their y-axis coordinates.

For example, “label text located on the left side with D_x < P_1 and |Δy| < P_2” is described in association with the priority 812 of “1” for the type 811 of “text box”. This means that, when there is label text that is located to the left of a text box and satisfies D_x < P_1 and |Δy| < P_2, that label text should be associated with the text box.

Here, P_1 is the upper limit of the separation distance in the x-axis direction within which a user would ordinarily perceive label text located to the left of an input area as information indicating an attribute of that input area. Likewise, P_2 is the upper limit of the difference in y-axis coordinates within which a user would ordinarily perceive label text located to the left of an input area as information indicating an attribute of that input area.

Associating a text box with label text that is located to its left and satisfies D_x < P_1 and |Δy| < P_2 therefore matches the user's intuition when looking at the input screen. The other associations in the table creation rule 810 are likewise designed to match the user's intuition.

Consequently, by associating label text with input areas based on such a table creation rule 810, label text can be assigned to each input area in a way that matches the user's intuition.
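As a minimal sketch of the text box condition D_x < P_1 and |Δy| < P_2, the check might look like the following Python. Several points are assumptions rather than details from the description: the `Box` representation, measuring D_x as the gap between the label's right edge and the input area's left edge, taking Δy between top edges, and the numeric values chosen for `P_1` and `P_2`.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    x: float       # left edge
    y: float       # top edge
    width: float
    height: float


P_1 = 50.0  # assumed upper limit for the horizontal gap D_x
P_2 = 10.0  # assumed upper limit for the vertical offset |dy|


def matches_left_rule(area: Box, label: Box) -> bool:
    """True if the label is left of the area with D_x < P_1 and |dy| < P_2."""
    # Gap between the label's right edge and the area's left edge;
    # a negative value means the label is not entirely to the left.
    d_x = area.x - (label.x + label.width)
    dy = abs(area.y - label.y)
    return 0 <= d_x < P_1 and dy < P_2
```

A label that sits just to the left of a text box on roughly the same row passes the check, while a label far away, on a different row, or to the right of the box does not.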

The screen configuration analysis unit 207 in FIG. 2 queries the electronic medical chart unit 203 to acquire the input area positions and the label text positions. It then generates a screen configuration table from the acquired input area positions and label text positions based on the table creation rule. The screen configuration table describes the associations of label text with input area positions, together with an ID for each association. In other words, an ID described in the screen configuration table is both information specifying a reading that represents the input area (first information) and information specifying the reading of the label text (third information).

また、画面構成解析部２０７は、画面構成テーブルに記述された対応付けごとに、ＩＤとラベルテキストとの組を、端末通信部２０２を介して無線通信端末４００へ送信する。 In addition, the screen configuration analysis unit 207 transmits a set of ID and label text to the wireless communication terminal 400 via the terminal communication unit 202 for each association described in the screen configuration table.

また、画面構成解析部２０７は、選択メニューが存在するとき、選択メニューのＩＤと複数のメニュー項目テキストとを組み付けた辞書用情報を、端末通信部２０２を介して無線通信端末へ送信する。選択メニューとは、予め用意された複数のメニュー項目テキストに対する選択を受け付ける入力エリアであり、例えばプルダウンメニューである。 Further, when there is a selection menu, the screen configuration analysis unit 207 transmits dictionary information in which the ID of the selection menu and a plurality of menu item texts are combined to the wireless communication terminal via the terminal communication unit 202. The selection menu is an input area that accepts selection of a plurality of menu item texts prepared in advance, and is, for example, a pull-down menu.

画面構成テーブル格納部２０８は、画面構成解析部２０７により生成された画面構成テーブルを記憶する。 The screen configuration table storage unit 208 stores the screen configuration table generated by the screen configuration analysis unit 207.

As noted above, the screen configuration analysis unit 207 may instead accept, from the user, manual associations of label texts with input area positions.

In this case, the information processing apparatus 200 need not include the table creation rule storage unit 206. An advantage of this case is that readings that are easy for the user to utter (for example, abbreviations or foreign words) or easy to remember can be set freely.

操作変換ルール格納部２０９は、操作変換ルールを予め格納する。操作変換ルールは、音声認識処理の結果を表す情報（第２の情報）を、画面構成テーブルを用いて操作情報に変換するためのルールである。 The operation conversion rule storage unit 209 stores operation conversion rules in advance. The operation conversion rule is a rule for converting information (second information) representing the result of the speech recognition processing into operation information using the screen configuration table.

図４は、操作変換ルールの内容の一例を示す図である。 FIG. 4 is a diagram illustrating an example of the contents of the operation conversion rule.

図４に示すように、操作変換ルール８２０は、タイプ８２１に対応付けて、操作情報の内容８２２を記述している。 As shown in FIG. 4, the operation conversion rule 820 describes the content 822 of the operation information in association with the type 821.

The operation information content 822 indicates the content of the operation information to be output for the input area corresponding to a label text (hereinafter, the “operation target input area”) when the label text corresponding to the result of the speech recognition processing exists in the screen configuration table.

For example, the operation information content 822 “one click at the acquired absolute coordinates + input text” is described in association with the type 821 “text box”. This indicates that the operation information to be output represents a single click at the absolute coordinates of the reference point (for example, the upper left corner) of the input area indicated by the result of the speech recognition processing, followed by input of the input text contained in that result.

When the information representing the result of the speech recognition processing on the terminal input voice is associated with the display position of any input area in the screen configuration table, the input information generation unit 210 in FIG. 2 inputs operation information for that display position to the electronic medical chart unit 203.

More specifically, the input information generation unit 210 searches the screen configuration table for the ID given by the information representing the result of the speech recognition processing sent from the wireless communication terminal 400. When that ID matches an ID described in the screen configuration table, the input information generation unit 210 inputs, to the electronic medical chart unit 203, operation information indicating a selection operation on the display position of the input area corresponding to the ID.

In addition, when the information representing the result of the speech recognition processing indicates the ID of a selection menu and an input text, the input information generation unit 210 first inputs, to the electronic medical chart unit 203, operation information indicating a selection operation on the display position of the input area corresponding to that ID. It then inputs operation information indicating a text input operation to the electronic medical chart unit 203.

なお、入力情報生成部２１０は、これらの操作情報を、操作変換ルール（図４参照）を用いて、音声認識処理の結果を変換することにより取得する。 Note that the input information generation unit 210 acquires the operation information by converting the result of the speech recognition process using the operation conversion rule (see FIG. 4).
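
The conversion from a recognition result to operation information can be sketched as follows. This is a minimal sketch under stated assumptions: the function name and the tuple encoding of operations are hypothetical, and only the “text box” entry and the selection menu sequence are taken from the description. Treating every other type as a single click is an assumption, since the remaining entries of FIG. 4 are not reproduced in the text.

```python
from typing import List, Optional, Tuple

Operation = Tuple[str, object]


def to_operations(area_type: str, abs_xy: Tuple[int, int],
                  input_text: Optional[str] = None) -> List[Operation]:
    """Build the operation sequence for one operation target input area."""
    # Selecting the input area: one click at its absolute coordinates.
    ops: List[Operation] = [("click", abs_xy)]
    # Text boxes and selection menus are followed by a text input operation.
    if area_type in ("text_box", "selection_menu") and input_text:
        ops.append(("type", input_text))
    return ops
```

For a text box at absolute coordinates (230, 190) and the input text “Suzuki”, the sketch yields a click followed by a text input.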

図５は、無線通信端末４００の構成の一例を示すブロック図である。 FIG. 5 is a block diagram illustrating an exemplary configuration of the wireless communication terminal 400.

図５において、無線通信端末４００は、網通信部４０１、装置通信部４０２、音声入力部４０３、および音声認識管理部４０４を有する。 In FIG. 5, the wireless communication terminal 400 includes a network communication unit 401, a device communication unit 402, a voice input unit 403, and a voice recognition management unit 404.

網通信部４０１は、例えば、公共通信網５００に接続された携帯電話網（図示せず）と無線通信を行うための無線通信インタフェースである。網通信部４０１は、携帯電話網および公共通信網５００を介して、音声認識サーバ６００および第１〜第Ｎの校正端末７００−１〜７００−Ｎのそれぞれと、通信可能に接続する。 The network communication unit 401 is a wireless communication interface for performing wireless communication with, for example, a mobile phone network (not shown) connected to the public communication network 500. The network communication unit 401 is communicably connected to each of the voice recognition server 600 and the first to Nth calibration terminals 700-1 to 700-N via the mobile phone network and the public communication network 500.

装置通信部４０２は、例えば無線ＬＡＮインタフェースであり、公共通信網５００とは別の通信経路（ここでは無線ＬＡＮ）により、情報処理装置２００に接続する。 The device communication unit 402 is, for example, a wireless LAN interface, and is connected to the information processing device 200 via a communication path different from the public communication network 500 (here, a wireless LAN).

音声入力部４０３は、ユーザの発話音声を入力する。 The voice input unit 403 inputs a user's uttered voice.

音声認識管理部４０４は、音声入力部４０３により入力された音声である端末入力音声を、音声認識処理の対象として、網通信部４０１を介して音声認識サーバ６００へ送信する。 The voice recognition management unit 404 transmits terminal input voice, which is voice input by the voice input unit 403, to the voice recognition server 600 via the network communication unit 401 as a target of voice recognition processing.

また、音声認識管理部４０４は、校正作業の結果（校正済音声認識結果）を、網通信部４０１を介して第１〜第Ｎの校正端末７００−１〜７００−Ｎから受信する。そして、音声認識管理部４０４は、受信した校正作業の結果を、電子カルテ部２０３に対する操作情報として、装置通信部４０２を介して情報処理装置２００へ送信する。 Further, the voice recognition management unit 404 receives the result of the calibration work (calibrated voice recognition result) from the first to Nth calibration terminals 700-1 to 700-N via the network communication unit 401. Then, the voice recognition management unit 404 transmits the received calibration work result as operation information for the electronic medical chart unit 203 to the information processing apparatus 200 via the apparatus communication unit 402.

また、音声認識管理部４０４は、情報処理装置２００から装置通信部４０２を介して受信したＩＤとラベルテキストとの組を、網通信部４０１を介して音声認識サーバ６００へ送信する。 In addition, the voice recognition management unit 404 transmits the set of ID and label text received from the information processing apparatus 200 via the apparatus communication unit 402 to the voice recognition server 600 via the network communication unit 401.

図６は、音声認識サーバ６００の構成の一例を示すブロック図である。 FIG. 6 is a block diagram illustrating an example of the configuration of the voice recognition server 600.

In FIG. 6, the voice recognition server 600 includes a network communication unit 601, a voice recognition database 602, a reverse recognition processing unit 603, a correspondence table storage unit 604, and a voice recognition processing unit 605. The voice recognition server 600 also includes a device information storage unit 606, a clerk information storage unit 607, a clerk determination rule storage unit 608, and a calibration management unit 609.

The network communication unit 601 is, for example, a LAN interface for performing wired communication with a provider network (not shown) connected to the public communication network 500. The network communication unit 601 is communicably connected, via the provider network and the public communication network 500, to the wireless communication terminal 400 and to each of the first to Nth calibration terminals 700-1 to 700-N.

音声認識データベース６０２は、音声認識処理に用いられる音響モデル、言語モデル、および辞書などのデータを格納する。 The speech recognition database 602 stores data such as acoustic models, language models, and dictionaries used for speech recognition processing.

For each pair of ID and label text received from the wireless communication terminal 400, the reverse recognition processing unit 603 obtains the reading of the label text by reverse recognition processing on the label text. The reverse recognition processing unit 603 then generates an ID reading correspondence table that associates the obtained readings with the IDs.

対応テーブル格納部６０４は、逆認識処理部６０３により生成されたＩＤ読み対応テーブルを格納する。 The correspondence table storage unit 604 stores the ID reading correspondence table generated by the reverse recognition processing unit 603.
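
The ID reading correspondence table can be pictured as a mapping from readings to IDs. The following is a rough sketch only: the function names are hypothetical, and the reverse recognition processing is replaced by a small hand-written lookup, because the description does not specify how readings are derived.

```python
from typing import Callable, Dict, Iterable, Optional, Tuple

# Stand-in for reverse recognition processing (assumption): a fixed lookup
# from label text to its reading.
SAMPLE_READINGS = {"氏名": "しめい", "性別": "せいべつ"}


def build_id_reading_table(pairs: Iterable[Tuple[int, str]],
                           get_reading: Callable[[str], str]) -> Dict[str, int]:
    """Associate the reading of each label text with its ID."""
    return {get_reading(label): field_id for field_id, label in pairs}


def lookup_id(table: Dict[str, int], reading: str) -> Optional[int]:
    """Return the ID for a recognized reading, or None if it is not registered."""
    return table.get(reading)


table = build_id_reading_table([(2, "氏名"), (3, "性別")], SAMPLE_READINGS.__getitem__)
```

A reading that is not in the table yields no ID, which corresponds to the case where nothing is sent to the wireless communication terminal 400.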

音声認識処理部６０５は、音声認識データベース６０２を用いて、無線通信端末４００から受信した端末入力音声に対する音声認識処理を行う。 The speech recognition processing unit 605 performs speech recognition processing on the terminal input speech received from the wireless communication terminal 400 using the speech recognition database 602.

The device information storage unit 606 stores device information. The device information includes attribute information on at least one of the information processing apparatus 200, the application targeted for voice input (here, the electronic medical chart unit 203), and the user of that application.

図７は、装置情報の内容の一例を示す図である。 FIG. 7 is a diagram illustrating an example of the contents of the device information.

図７に示すように、装置情報８３０は、端末ＩＤ８３１、装置ＩＤ８３２、分野８３３、希望レベル８３４、および希望単価８３５を、対応付けて記述する。 As illustrated in FIG. 7, the device information 830 describes terminal ID 831, device ID 832, field 833, desired level 834, and desired unit price 835 in association with each other.

The terminal ID 831 is the ID of a wireless communication terminal (including other wireless communication terminals). The device ID 832 is the ID of an information processing apparatus (including other information processing apparatuses) to which a voice recognition function is added using the wireless communication terminal indicated by the terminal ID 831. The field 833 indicates the field of the application targeted for voice input in the information processing apparatus indicated by the device ID 832. The desired level 834 indicates the level of accuracy of calibration of speech recognition results (hereinafter simply “level”) that the user of that information processing apparatus desires for the application. The desired unit price 835 indicates the unit price of speech recognition (for example, the amount per minute of speech; hereinafter simply “unit price”) that the user desires for the application.

For example, the device ID 832 “M1” is associated with the terminal ID 831 “T1”. This indicates that the speech recognition result for terminal input voice received from the wireless communication terminal whose terminal ID is “T1” should be transmitted to the information processing apparatus whose device ID is “M1”. Here, the terminal ID of the wireless communication terminal 400 is assumed to be “T1”, and the device ID of the information processing apparatus 200 is assumed to be “M1”. The field 833 “medical” is also described in association with the terminal ID “T1”. This indicates that the application targeted for voice input in the corresponding information processing apparatus 200 (here, the electronic medical chart unit 203) is an application in the medical field. In addition, the desired level 834 “2 or higher” and the desired unit price 835 “100 yen/minute or less” are described in association with the terminal ID “T1”. This indicates that the desired level of calibration accuracy is 2 or higher and that the desired unit price is 100 yen or less per minute.

装置情報８３０は、例えば、オペレータにより事前に手入力により設定されてもよいし、校正管理部６０９が情報処理装置および無線通信端末に必要な情報を問い合わせて作成してもよい。 For example, the device information 830 may be set manually by an operator in advance, or may be created by the calibration management unit 609 inquiring information necessary for the information processing device and the wireless communication terminal.

The clerk information storage unit 607 in FIG. 6 stores clerk information. The clerk information includes attribute information on the operators of the calibration terminals. In the present embodiment, the clerk information includes information indicating whether calibration work is currently possible at the calibration terminal and information indicating the range of calibration targets that the operator is good at.

図８は、クラーク情報の内容の一例を示す図である。 FIG. 8 is a diagram illustrating an example of the contents of the clerk information.

図８に示すように、クラーク情報８４０は、校正端末ＩＤ８４１、クラークＩＤ８４２、作業ステータス８４３、不得意分野８４４、得意分野８４５、レベル８４６、および単価８４７を記述する。 As shown in FIG. 8, the clerk information 840 describes a calibration terminal ID 841, a clerk ID 842, a work status 843, a weak field 844, a strong field 845, a level 846, and a unit price 847.

The calibration terminal ID 841 is the ID of a calibration terminal 700. The clerk ID 842 is the ID of the clerk who uses the calibration terminal 700 indicated by the calibration terminal ID 841. The work status 843 indicates whether the clerk indicated by the clerk ID 842 is currently able to perform calibration work immediately. The weak field 844 indicates a field in which the clerk is not good at calibration work. The strong field 845 indicates a field in which the clerk is good at calibration work. The level 846 indicates the level of the clerk. The unit price 847 indicates the unit price when calibration work is performed by the clerk.

For example, the clerk ID 842 “C1” is described in association with the calibration terminal ID 841 “P1”. This indicates that an uncalibrated speech recognition result transmitted to the calibration terminal whose calibration terminal ID 841 is “P1” is calibrated by the clerk whose clerk ID 842 is “C1”. The work status 843 “available”, the weak field 844 “literature”, and the strong field 845 “medical” are also described in association with the clerk ID 842 “C1”. This indicates that the clerk whose clerk ID is “C1” is currently able to perform calibration work immediately, is weak in literature, and is strong in the medical field. In addition, the level 846 “1” and the unit price 847 “90 yen/minute” are described in association with the clerk ID 842 “C1”. This indicates that the level of the clerk is 1 and that the unit price when the clerk performs calibration work is 90 yen per minute.

The clerk determination rule storage unit 608 in FIG. 6 stores a clerk determination rule in advance. The clerk determination rule defines the conditions for selecting a calibration terminal 700 for a result of the speech recognition processing in terms of the relationship between the device information and the clerk information. In the present embodiment, the clerk determination rule at least specifies that a calibration terminal at which calibration work is currently possible is to be selected. The clerk determination rule also at least specifies that, when there is an operator whose range of strengths includes the application targeted for voice input (here, the electronic medical chart unit 203) and the user, the calibration terminal 700 of that operator is to be selected.

図９は、クラーク決定ルールの内容の一例を示す図である。 FIG. 9 is a diagram illustrating an example of the contents of the clerk determination rule.

図９に示すように、クラーク決定ルール８５０は、優先順位８５１が設定された条件８５２を記述する。 As shown in FIG. 9, the clerk determination rule 850 describes a condition 852 in which a priority order 851 is set.

The priority 851 indicates that a clerk who satisfies a condition with a higher priority 851 should be selected preferentially. The condition 852 indicates a condition on the clerk who should perform the calibration work.

For example, the condition 852 “available for work” is described in association with the priority “1”. This indicates that being available for work is the highest-priority condition for a clerk who should perform the calibration work.
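
Selecting a clerk from the device information and the clerk information might look like the sketch below. It is hedged accordingly: only the top-priority condition (“available for work”) is stated in the text, so the order of the remaining conditions (strong field match, no weak field match, level, unit price) is an assumption, and all class and function names are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Clerk:
    terminal_id: str
    clerk_id: str
    available: bool
    weak_field: str
    strong_field: str
    level: int
    unit_price: int      # yen per minute


@dataclass
class Request:
    field: str
    min_level: int
    max_unit_price: int  # yen per minute


def choose_clerk(clerks: List[Clerk], req: Request) -> Optional[Clerk]:
    """Pick the available clerk that satisfies the most important conditions."""
    # Priority 1 (from the description): the clerk must be available for work.
    candidates = [c for c in clerks if c.available]
    if not candidates:
        return None

    def score(c: Clerk):
        # Assumed ordering of the lower-priority conditions; tuples of booleans
        # compare lexicographically, so earlier entries dominate.
        return (c.strong_field == req.field,
                c.weak_field != req.field,
                c.level >= req.min_level,
                c.unit_price <= req.max_unit_price)

    return max(candidates, key=score)
```

Under this assumed ordering, a medical request prefers an available clerk whose strong field is “medical” even if the clerk's level is below the desired level.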

図６の校正管理部６０９は、音声認識処理部６０５による音声認識処理の結果（未校正音声認識結果）ごとに、対応する装置情報を取得する。校正管理部６０９は、クラーク決定ルールに基づいて、第１〜第Ｎの校正端末７００−１〜７００−Ｎの中から１つまたは複数の校正端末７００を選択する。そして、校正管理部６０９は、選択した校正端末７００に音声認識処理の結果に対する校正作業を行わせる。その後、校正管理部６０９は、校正端末７００から受信した校正作業の結果（校正済音声認識結果）が示す読みを、ＩＤ読み対応テーブルで検索する。そして、校正管理部６０９は、当該読みがＩＤ読み対応テーブルに存在するとき、当該読みに対応するＩＤを、無線通信端末４００へ送信する。 The calibration management unit 609 in FIG. 6 acquires corresponding device information for each result of speech recognition processing (uncalibrated speech recognition result) by the speech recognition processing unit 605. The calibration management unit 609 selects one or a plurality of calibration terminals 700 from the first to Nth calibration terminals 700-1 to 700-N based on the Clark determination rule. Then, the calibration management unit 609 causes the selected calibration terminal 700 to perform a calibration operation on the result of the speech recognition process. Thereafter, the calibration management unit 609 searches the ID reading correspondence table for the reading indicated by the calibration work result (calibrated speech recognition result) received from the calibration terminal 700. When the reading exists in the ID reading correspondence table, the calibration management unit 609 transmits the ID corresponding to the reading to the wireless communication terminal 400.

The information processing apparatus 200, the wireless communication terminal 400, and the speech recognition server 600 described above each have, for example, a CPU (Central Processing Unit), a storage medium such as a ROM (Read Only Memory) storing a control program, and a working memory such as a RAM (Random Access Memory). In this case, the functions of the components described above are realized by the CPU executing the control program.

The functions described with reference to FIG. 1 are realized by the information processing apparatus 200, the wireless communication terminal 400, and the speech recognition server 600 configured as above. That is, by including these devices and the first to Nth calibration terminals 700-1 to 700-N, the input support system 100 can easily add a highly accurate voice input function to an electronic medical record, which is an existing application requiring high security.

以上で、各装置の構成についての説明を終える。 This is the end of the description of the configuration of each device.

以下、各装置の動作について説明する。但し、電子カルテへの音声認識機能の追加に関する処理のみに着目して説明を行う。 Hereinafter, the operation of each apparatus will be described. However, the description will be given focusing only on the processing related to the addition of the voice recognition function to the electronic medical record.

図１０は、情報処理装置２００の動作の一例を示すフローチャートである。 FIG. 10 is a flowchart illustrating an example of the operation of the information processing apparatus 200.

まず、ステップＳ１１００において、画面構成解析部２０７は、電子カルテ部２０３が新たな入力画面を表示したか否かを判断する。 First, in step S1100, the screen configuration analysis unit 207 determines whether the electronic medical record unit 203 has displayed a new input screen.

A new input screen includes not only an input screen whose content differs from that of an already analyzed input screen (that is, one for which the corresponding screen configuration table has been generated), but also an input screen with the same content that differs in at least one of size and layout.

図１１は、入力画面の一例を示す平面図である。 FIG. 11 is a plan view showing an example of the input screen.

図１１に示すように、入力画面８６１は、情報処理装置２００のデスクトップ画面８６２に表示されるウィンドウである。入力画面８６１は、例えば、操作入力部２０５を介してユーザにより行われるカーソル８６３の操作により、デスクトップ画面８６２における位置および大きさが可変となっている。 As shown in FIG. 11, the input screen 861 is a window displayed on the desktop screen 862 of the information processing apparatus 200. For example, the position and size of the input screen 861 on the desktop screen 862 can be changed by the operation of the cursor 863 performed by the user via the operation input unit 205.

In the present embodiment, the input screen 861 has a tab 865 containing the label text 864 “patient information” and a tab 867 containing the label text 866 “hospital information”. The input screen 861 has a text box 869 with the label text 868 “name” placed close to its left side. The input screen 861 has a selection button 872 with the label text 870 “sex” placed close to its left side and the label text 871 “male” placed close to its right side. The input screen 861 has a selection button 874 with the label text 870 “sex” placed close to its left side and the label text 873 “female” placed close to its right side. The input screen 861 has a selection menu 876 with the label text 875 “clinical department” placed close to its left side. The selection menu 876 has a menu item text group 877 consisting of “internal medicine”, “surgery”, “orthopedics”, and so on.

As described later, the screen configuration analysis unit 207 analyzes the input screen 861 (hereinafter referred to as “screen configuration analysis” where appropriate) using an absolute coordinate system 878, which is the coordinate system of the desktop screen 862, and a relative coordinate system 879, which is the coordinate system of the input screen 861. The absolute coordinate system 878 consists of an X axis and a Y axis. The relative coordinate system 879 consists of an x axis and a y axis. The absolute coordinate system 878 and the relative coordinate system 879 shown in FIG. 11 are not displayed on the desktop screen 862 or the input screen 861.

An input screen that differs only in its position in the absolute coordinate system could be treated as a new input screen, but the information processing apparatus 200 according to the present embodiment does not treat it as one. Instead, the information processing apparatus 200 successively acquires the position of the input screen (that is, of the relative coordinate system) in the absolute coordinate system and generates operation information expressed in the absolute coordinate system.
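
The conversion from relative coordinates to absolute coordinates can be illustrated as a simple translation. This sketch assumes that the two coordinate systems share the same axis directions and scale, and that the position of the input screen is given as the absolute coordinates of the origin of the relative coordinate system. The function name is hypothetical.

```python
from typing import Tuple

Point = Tuple[int, int]


def to_absolute(window_origin: Point, relative_point: Point) -> Point:
    """Translate a point in the input screen's (x, y) system to desktop (X, Y)."""
    origin_x, origin_y = window_origin
    x, y = relative_point
    return (origin_x + x, origin_y + y)
```

Because the window origin is re-read each time, moving the input screen changes the generated click position without requiring a new screen configuration table.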

If the electronic medical chart unit 203 has displayed a new input screen (S1100: YES in FIG. 10), the screen configuration analysis unit 207 proceeds to step S1200. If the electronic medical chart unit 203 has not displayed a new input screen (S1100: NO), the screen configuration analysis unit 207 proceeds to step S1300.

In step S1200, the screen configuration analysis unit 207 performs table generation processing and proceeds to step S1300. The table generation processing analyzes the input screen and generates a screen configuration table. Details of the table generation processing will be described later.

そして、ステップＳ１３００において、入力情報生成部２１０は、新たな音声認識結果を受信したか否かを判断する。入力情報生成部２１０は、新たな音声認識結果を受信した場合（Ｓ１３００：ＹＥＳ）、ステップＳ１４００へ進む。また、入力情報生成部２１０は、新たな音声認識結果を受信していない場合（Ｓ１３００：ＮＯ）、ステップＳ１５００へ進む。 In step S1300, the input information generation unit 210 determines whether a new speech recognition result has been received. When the input information generation unit 210 receives a new speech recognition result (S1300: YES), the input information generation unit 210 proceeds to step S1400. If the input information generation unit 210 has not received a new speech recognition result (S1300: NO), the process proceeds to step S1500.

ステップＳ１４００において、入力情報生成部２１０は、操作変換処理を行って、ステップＳ１５００へ進む。操作変換処理は、端末入力音声に対する音声認識処理の結果を、操作情報に変換して電子カルテ部２０３へ渡す処理である。操作変換処理の詳細については後述する。 In step S1400, the input information generation unit 210 performs an operation conversion process, and proceeds to step S1500. The operation conversion process is a process of converting the result of the voice recognition process for the terminal input voice into operation information and passing it to the electronic medical chart unit 203. Details of the operation conversion process will be described later.

そして、ステップＳ１５００において、入力情報生成部２１０は、ユーザ操作などにより処理の終了を指示されたか否かを判断する。入力情報生成部２１０は、処理の終了を指示されていない場合（Ｓ１５００：ＮＯ）、ステップＳ１１００へ戻る。また、入力情報生成部２１０は、処理の終了を指示された場合（Ｓ１５００：ＹＥＳ）、一連の処理を終了する。 In step S1500, the input information generation unit 210 determines whether an instruction to end the process is given by a user operation or the like. If the input information generation unit 210 is not instructed to end the process (S1500: NO), the process returns to step S1100. Further, when instructed to end the process (S1500: YES), the input information generation unit 210 ends the series of processes.

このような動作により、情報処理装置２００は、新たな入力画面が表示されるごとに、その入力画面に対応した画面構成テーブルを生成することができる。そして、情報処理装置２００は、新たな音声認識結果が受信されるごとに、その音声認識結果の操作情報への変換を行うことができる。 With this operation, the information processing apparatus 200 can generate a screen configuration table corresponding to an input screen each time a new input screen is displayed. Then, each time a new voice recognition result is received, the information processing apparatus 200 can convert the voice recognition result into operation information.
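
The flow of steps S1100 to S1500 can be summarized as a polling loop. The sketch below is illustrative only: the five callbacks are hypothetical stand-ins for the screen configuration analysis unit 207 and the input information generation unit 210, and a result of None is assumed to mean that no new speech recognition result has been received.

```python
from typing import Callable, Optional


def run(new_screen_shown: Callable[[], bool],
        build_table: Callable[[], None],
        poll_result: Callable[[], Optional[str]],
        convert: Callable[[str], None],
        should_stop: Callable[[], bool]) -> None:
    """Repeat S1100 to S1500 until termination is instructed."""
    while True:
        if new_screen_shown():       # S1100: new input screen displayed?
            build_table()            # S1200: table generation processing
        result = poll_result()       # S1300: new speech recognition result?
        if result is not None:
            convert(result)          # S1400: operation conversion processing
        if should_stop():            # S1500: end of processing instructed?
            break
```

Each pass checks for a new input screen before handling a recognition result, so a result is always converted against the most recently generated screen configuration table.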

図１２は、テーブル生成処理（図１０のステップＳ１２００）の一例を示すフローチャートである。 FIG. 12 is a flowchart illustrating an example of the table generation process (step S1200 in FIG. 10).

First, in step S1201, the screen configuration analysis unit 207 acquires screen information on the input screen from the electronic medical chart unit 203. The screen information includes the type and display position of each input area constituting the input screen, and each label text constituting the input screen together with its display position and size. When the input screen is a window in Microsoft Windows (registered trademark), the screen information can be acquired by passing the corresponding Windows messages to the electronic medical chart unit 203.

そして、ステップＳ１２０２において、画面構成解析部２０７は、取得した入力エリアのうち１つを選択する。 In step S1202, the screen configuration analysis unit 207 selects one of the acquired input areas.

そして、ステップＳ１２０３において、画面構成解析部２０７は、テーブル作成ルール（図３参照）に従って、選択中の入力エリアにラベルテキストを対応付ける。 In step S1203, the screen configuration analysis unit 207 associates the label text with the selected input area according to the table creation rule (see FIG. 3).

For example, when the text box 869 is selected on the input screen 861 shown in FIG. 11, the screen configuration analysis unit 207 associates the label text 868, "氏名" ("name"), with the text box 869.

In step S1204, the screen configuration analysis unit 207 registers the association made in step S1203, together with the component type, relative coordinates, and the like, in the screen configuration table. Through the processing of step S1207 described later, this registration is performed for every input area.

FIG. 13 is a diagram showing an example of the contents of the screen configuration table generated from the input screen shown in FIG. 11 based on the table creation rule shown in FIG. 3.

As shown in FIG. 13, the screen configuration table 880 describes, for each field ID 881, a parent ID 882, a type 883, relative coordinates 884, a property 885, and a label text 886.

The field ID 881 is the ID of an input area. The parent ID 882 is the ID of the input area positioned as the parent of that input area. The type 883 is the type of the input area indicated by the field ID 881. The relative coordinates 884 are the coordinates of the reference point of the input area indicated by the field ID 881 in the relative coordinate system 879 of the input screen 861 (see FIG. 11). The property 885 is the tab index when the input area indicated by the field ID 881 is a tab. The label text 886 is the label text associated with the input area indicated by the field ID 881.

For example, a parent ID 882 of "1", a type 883 of "text box", and relative coordinates 884 of "x₂, y₂" are described in association with the field ID 881 of "2". This indicates that the input area whose field ID 881 is "2" has the input area whose field ID 881 is "1" as its parent, is a text box, and has its reference point at the relative coordinates (x₂, y₂). In addition, the label text 886 "氏名" ("name") is described in association with the field ID 881 of "2". This indicates that the label text 886 "氏名" has been associated with the input area whose field ID 881 is "2".
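As an illustration only, one row of the screen configuration table 880 could be modeled as follows. This is a minimal sketch: the class name, attribute names, the type string "text_box", and the numeric coordinates standing in for (x₂, y₂) are assumptions and do not appear in the specification.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class ScreenConfigRow:
    """One row of the screen configuration table 880 (names are hypothetical)."""
    field_id: int                      # field ID 881
    parent_id: Optional[int]           # parent ID 882 (None for a top-level area)
    area_type: str                     # type 883
    relative_xy: Tuple[int, int]       # relative coordinates 884
    tab_index: Optional[int] = None    # property 885, used only when the area is a tab
    label_text: Optional[str] = None   # label text 886


# Row for field ID 2 from the example above; (120, 40) is a placeholder for (x_2, y_2).
row = ScreenConfigRow(field_id=2, parent_id=1, area_type="text_box",
                      relative_xy=(120, 40), label_text="氏名")

# The table itself is keyed by field ID so that a recognition result can look up its row.
screen_config_table = {row.field_id: row}
```

Keying the table by field ID reflects how it is used later, when a field ID received with a recognition result must be resolved to a type and a position.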

Then, in step S1205 of FIG. 12, the screen configuration analysis unit 207 determines whether or not the type of the selected input area is a selection menu. If the selected input area is a selection menu (S1205: YES), the screen configuration analysis unit 207 proceeds to step S1206. If the selected input area is not a selection menu (S1205: NO), the screen configuration analysis unit 207 proceeds to step S1207.

In step S1206, the screen configuration analysis unit 207 generates dictionary information from the menu item texts of the selection menu, and proceeds to step S1207.

FIG. 14 is a diagram showing an example of the contents of the dictionary information generated from the input screen shown in FIG. 11.

As shown in FIG. 14, the dictionary information 890 describes a list of menu item texts 892 in association with a field ID 891. The menu item texts 892 are the menu item texts that can be selected in the input area (selection menu) indicated by the field ID 891.

Then, in step S1207 of FIG. 12, the screen configuration analysis unit 207 determines whether or not all input areas constituting the input screen have been processed. If an unprocessed input area remains (S1207: NO), the screen configuration analysis unit 207 returns to step S1202. If all input areas have been processed (S1207: YES), the screen configuration analysis unit 207 proceeds to step S1208.

In step S1208, the screen configuration analysis unit 207 generates ID text correspondence information from the screen configuration table. The ID text correspondence information is information indicating the correspondence between field IDs and label texts.

FIG. 15 is a diagram showing an example of the contents of the ID text correspondence information generated from the screen configuration table shown in FIG. 13.

As shown in FIG. 15, the ID text correspondence information 900 describes pairs of a field ID 901 and a label text 902, together with a type 903. The label text 902 is the label text associated with the input area indicated by the field ID 901. The type 903 is the type of the input area indicated by the field ID 901.

Then, in step S1209 of FIG. 12, the screen configuration analysis unit 207 transmits the generated ID text correspondence information to the wireless communication terminal 400. If dictionary information has been generated, the screen configuration analysis unit 207 transmits it to the wireless communication terminal 400 as well. The screen configuration analysis unit 207 then returns to the process of FIG. 10.

FIG. 16 is a diagram illustrating an example of the configuration of transmission data from the information processing apparatus 200 to the wireless communication terminal 400.

As shown in FIG. 16, the transmission data 910 from the information processing apparatus 200 to the wireless communication terminal 400 includes, for example, ID text correspondence information 911 and, where applicable, dictionary information 912.

Through this table generation process, the information processing apparatus 200 can determine, from the input screen of the electronic medical chart unit 203, which label text is associated with each input area position. The information processing apparatus 200 can then hold that association as screen configuration data and also transmit it to the wireless communication terminal 400 as ID text correspondence information.
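The loop of steps S1202 to S1208 can be sketched roughly as below. This is a hedged illustration, not the implementation described in the specification: the function names, the dictionary keys, and in particular `nearest_label_to_left` (a stand-in for the table creation rule of FIG. 3, whose details are not given in this passage) are assumptions. The second input area, its label "診療科目", and its menu items are made-up sample data.

```python
def nearest_label_to_left(area, labels):
    """Hypothetical association rule standing in for the table creation rule of FIG. 3."""
    same_row = [lb for lb in labels if lb["y"] == area["y"] and lb["x"] < area["x"]]
    return max(same_row, key=lambda lb: lb["x"])["text"] if same_row else None


def generate_tables(areas, labels):
    table, dictionary_info = {}, {}
    for area in areas:  # S1202, repeated until S1207 finds no unprocessed area
        label = nearest_label_to_left(area, labels)  # S1203
        table[area["id"]] = {  # S1204
            "parent": area.get("parent"),
            "type": area["type"],
            "xy": (area["x"], area["y"]),
            "label": label,
        }
        if area["type"] == "select_menu":  # S1205
            dictionary_info[area["id"]] = list(area["items"])  # S1206
    # S1208: field ID, label text, and type for every input area
    id_text_info = [(fid, row["label"], row["type"]) for fid, row in table.items()]
    return table, dictionary_info, id_text_info


# Made-up screen information: two labels and the input areas to their right.
labels = [{"text": "氏名", "x": 10, "y": 40}, {"text": "診療科目", "x": 10, "y": 80}]
areas = [
    {"id": 2, "parent": 1, "type": "text_box", "x": 120, "y": 40},
    {"id": 3, "parent": 1, "type": "select_menu", "x": 120, "y": 80,
     "items": ["内科", "外科"]},
]
table, dictionary_info, id_text_info = generate_tables(areas, labels)
```

In this sketch the ID text correspondence information and the dictionary information are the two payloads that would be sent in step S1209, while the full table stays on the information processing apparatus.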

FIG. 17 is a flowchart illustrating an example of the operation of the wireless communication terminal 400.

First, in step S2010, the voice recognition management unit 404 determines whether or not at least one of ID text correspondence information and dictionary information has been received from the information processing apparatus 200. If at least one of them has been received (S2010: YES), the voice recognition management unit 404 proceeds to step S2020. If neither has been received (S2010: NO), the voice recognition management unit 404 proceeds to step S2030.

In step S2020, the voice recognition management unit 404 transfers whichever of the ID text correspondence information and the dictionary information it has received to the voice recognition server 600, and proceeds to step S2030. At this time, the voice recognition management unit 404 attaches the terminal ID of the wireless communication terminal 400 to the transferred information as the transmission source ID.

FIG. 18 is a diagram illustrating an example of the configuration of the first transmission data from the wireless communication terminal 400 to the voice recognition server 600.

As shown in FIG. 18, the first transmission data 920 from the wireless communication terminal 400 to the voice recognition server 600 includes, for example, a terminal ID 921 and ID text correspondence information 922 and, where applicable, dictionary information 923. The terminal ID 921 is the terminal ID of the wireless communication terminal 400 that is the transmission source of the first transmission data 920. The ID text correspondence information 922 and the dictionary information 923 correspond to the ID text correspondence information 911 and the dictionary information 912 in FIG. 16.

Then, in step S2030 of FIG. 17, the voice recognition management unit 404 determines whether or not the user has made a voice input via the voice input unit 403. If there is a voice input (S2030: YES), the voice recognition management unit 404 proceeds to step S2040. If there is no voice input (S2030: NO), the voice recognition management unit 404 proceeds to step S2050.

In step S2040, the voice recognition management unit 404 transmits voice information including the voice data of the terminal input voice to the voice recognition server 600, and proceeds to step S2050. At this time, the voice recognition management unit 404 attaches the terminal ID of the wireless communication terminal 400 to the transmitted information as the transmission source.

FIG. 19 is a diagram illustrating an example of the configuration of the second transmission data from the wireless communication terminal 400 to the voice recognition server 600.

As shown in FIG. 19, the second transmission data 930 from the wireless communication terminal 400 to the voice recognition server 600 includes, for example, a terminal ID 931 and voice information 932. The terminal ID 931 is the terminal ID of the wireless communication terminal 400 that is the transmission source of the second transmission data 930. The voice information 932 is information including the voice data of the terminal input voice.

Then, in step S2050 of FIG. 17, the voice recognition management unit 404 determines whether or not a voice recognition result has been received from the voice recognition server 600. If a voice recognition result has been received (S2050: YES), the voice recognition management unit 404 proceeds to step S2060. If no voice recognition result has been received (S2050: NO), the voice recognition management unit 404 proceeds to step S2070.

In step S2060, the voice recognition management unit 404 transfers the received voice recognition result to the information processing apparatus 200, and proceeds to step S2070. Details of the transferred data will be described later.

Then, in step S2070, the voice recognition management unit 404 determines whether or not it has been instructed to end the process, for example by a user operation. If it has not been instructed to end the process (S2070: NO), the voice recognition management unit 404 returns to step S2010. If it has been instructed to end the process (S2070: YES), the voice recognition management unit 404 ends the series of processes.

With this operation, the wireless communication terminal 400 can transfer the ID text correspondence information sent from the information processing apparatus 200 to the voice recognition server 600, and can transmit the voice information of the terminal input voice to the voice recognition server 600. The wireless communication terminal 400 can also transfer the voice recognition result sent from the voice recognition server 600 to the information processing apparatus 200.
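One pass through steps S2010 to S2060 amounts to a relay that tags outgoing data with the terminal ID. The following is a minimal sketch under stated assumptions: the function name, the dictionary keys, and the sample terminal ID "T-001" are hypothetical, and actual transport (wireless link, public communication network) is omitted entirely.

```python
def relay_once(terminal_id, from_device, audio, from_server):
    """One pass of S2010 to S2060.

    Each argument after terminal_id is a received payload, or None if nothing
    arrived in this pass. Returns (messages for the server, messages for the
    information processing apparatus).
    """
    to_server, to_device = [], []
    if from_device is not None:   # S2010: YES -> S2020, forward with the terminal ID
        to_server.append({"terminal_id": terminal_id, **from_device})
    if audio is not None:         # S2030: YES -> S2040, send voice information
        to_server.append({"terminal_id": terminal_id, "voice_info": audio})
    if from_server is not None:   # S2050: YES -> S2060, forward the recognition result
        to_device.append(from_server)
    return to_server, to_device


to_server, to_device = relay_once(
    "T-001",
    {"id_text_info": [(2, "氏名", "text_box")]},
    None,
    {"field_id": 2},
)
```

Because the terminal only forwards payloads in both directions, the information processing apparatus never needs its own connection to the server side.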

FIG. 20 is a flowchart illustrating an example of the operation of the voice recognition server 600.

First, in step S3010, the reverse recognition processing unit 603 determines whether or not ID text correspondence information (see FIG. 15) has been received from the wireless communication terminal 400. If the ID text correspondence information has not been received (S3010: NO), the reverse recognition processing unit 603 proceeds to step S3030 described later. If the ID text correspondence information has been received (S3010: YES), the reverse recognition processing unit 603 proceeds to step S3020. In step S3020, the reverse recognition processing unit 603 reverse-converts the label texts included in the ID text correspondence information into readings, and generates an ID reading correspondence table.

FIG. 21 is a diagram showing an example of the contents of the ID reading correspondence table generated from the ID text correspondence information shown in FIG. 15.

As shown in FIG. 21, the ID reading correspondence table 940 describes a reading 942 in association with a field ID 941. The reading 942 is the reading of the label text that was associated with the input area indicated by the field ID 941.

For example, the reading 942 "しめい" ("shimei") is described in association with the field ID 941 of "2". This indicates that the reading of the label text associated with the input area whose field ID 941 is "2" is "しめい".
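The reverse conversion from label text to reading can be pictured as a lookup applied to each entry of the ID text correspondence information. In the sketch below, `READINGS` is a hypothetical one-entry reading dictionary; how the server actually derives readings is not described in this passage, and the function name and tuple layout are assumptions.

```python
# Hypothetical reading dictionary; a real system would need a full kanji-to-kana resource.
READINGS = {"氏名": "しめい"}


def build_id_reading_table(id_text_info, readings=READINGS):
    """Map each field ID to the reading of its label text (cf. table 940 of FIG. 21).

    id_text_info is a list of (field ID, label text, type) tuples. Labels without
    a known reading are skipped in this sketch.
    """
    return {
        field_id: readings[label]
        for field_id, label, _area_type in id_text_info
        if label in readings
    }


id_reading_table = build_id_reading_table([(2, "氏名", "text_box")])
```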

Note that the reverse recognition processing unit 603 also constructs or reconstructs a language model from the input area types and label texts described in the received ID text correspondence information and from the readings obtained by the reverse conversion, and registers the language model in the voice recognition database 602.

For example, because the reading "しめい" corresponds to a text box and its underlying label text is "氏名" ("name"), the reverse recognition processing unit 603 adds to the language model a rule that a personal name follows immediately after the reading "しめい". Similarly, the reading "しんりょうかもく" ("shinryokamoku") corresponds to a selection menu. The reverse recognition processing unit 603 therefore adds to the language model a rule that one of the menu item texts described in the correspondingly received dictionary information follows immediately after the reading "しんりょうかもく".

Note that the voice recognition server 600 uses the terminal ID to manage the ID reading correspondence table 940, the language model, the dictionary information, and the like separately for each wireless communication terminal 400.

Then, in step S3030 of FIG. 20, the reverse recognition processing unit 603 determines whether or not dictionary information (see FIG. 14) has been received from the wireless communication terminal 400. If dictionary information has been received (S3030: YES), the reverse recognition processing unit 603 proceeds to step S3040. If dictionary information has not been received (S3030: NO), the reverse recognition processing unit 603 proceeds to step S3050.

In step S3040, the reverse recognition processing unit 603 registers the list of menu item texts included in the dictionary information in the voice recognition database 602 as voice recognition result candidates (a dictionary) for the field ID included in the dictionary information.

Note that the voice recognition server 600 uses the terminal ID to manage the voice recognition result candidates (dictionary) separately for each wireless communication terminal 400.

Then, in step S3050, the voice recognition processing unit 605 determines whether or not voice information has been received from the wireless communication terminal 400. If voice information has been received (S3050: YES), the voice recognition processing unit 605 proceeds to step S3060. If voice information has not been received (S3050: NO), the voice recognition processing unit 605 proceeds to step S3070.

In step S3060, the voice recognition processing unit 605 uses the voice recognition database 602 to perform voice recognition processing on the voice data included in the received voice information. The voice recognition processing yields a reading, obtained by converting the terminal input voice into kana, and a kana-kanji text, obtained by applying kana-kanji conversion to that reading.

Then, in step S3080, the calibration management unit 609 determines the clerk who is to perform the calibration (proofreading) work, based on the device information (see FIG. 7) and the clerk information (see FIG. 8) and in accordance with the clerk determination rule (see FIG. 9). The calibration management unit 609 then transmits the result of the voice recognition processing (the uncalibrated voice recognition result) to the calibration terminal 700 corresponding to the determined clerk, and proceeds to step S3070.

FIG. 22 is a diagram illustrating an example of the configuration of transmission data from the voice recognition server 600 to the calibration terminal 700.

As shown in FIG. 22, the transmission data 950 from the voice recognition server 600 to the calibration terminal 700 includes a terminal ID 951, voice information 952, and an uncalibrated voice recognition result 953.

The terminal ID 951 is the terminal ID of the wireless communication terminal 400 that is the transmission source of the voice information 952. The uncalibrated voice recognition result 953 is the voice recognition result for the voice information 952.

After the calibration work at the calibration terminal 700, this uncalibrated voice recognition result 953 is returned as a calibrated voice recognition result.

FIG. 23 is a diagram illustrating an example of the configuration of transmission data from the calibration terminal 700 to the voice recognition server 600.

As shown in FIG. 23, the transmission data 960 from the calibration terminal 700 to the voice recognition server 600 includes a terminal ID 961 and a calibrated voice recognition result 962. The terminal ID 961 is the terminal ID of the wireless communication terminal 400 that is the transmission source of the voice information 952 on which the calibrated voice recognition result 962 is based.

In step S3070 of FIG. 20, the calibration management unit 609 determines whether or not a calibration result (a calibrated voice recognition result) has been received from the calibration terminal 700. If the calibration result has been received (S3070: YES), the calibration management unit 609 proceeds to step S3090. If the calibration result has not been received (S3070: NO), the calibration management unit 609 proceeds to step S3100.

In step S3090, the calibration management unit 609 looks up, in the ID reading correspondence table (see FIG. 21), the field ID corresponding to the received calibrated voice recognition result. The calibration management unit 609 then transmits the acquired field ID to the wireless communication terminal 400 as the calibrated voice recognition result. In some cases the calibrated voice recognition result consists of a reading described in the ID reading correspondence table followed by an input text, as in "しめい、やまだはなこ" and "氏名、山田花子". In this case, the calibration management unit 609 transmits the field ID and the input text to the wireless communication terminal 400 as the calibrated voice recognition result. The input text that is transmitted is the kana-kanji converted text (in the above example, "山田花子" rather than "やまだはなこ"). The calibration management unit 609 then proceeds to step S3100.
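The lookup in step S3090 can be sketched as follows. This is an assumption-laden illustration: it supposes that the calibrated result arrives as a reading string and a kana-kanji string, each with the leading reading or label separated from the input text by the ideographic comma "、", as in the example above. The function name and the shape of the returned dictionary are hypothetical.

```python
def resolve_calibrated_result(reading_result, kanji_result, id_reading_table):
    """Return {"field_id": ..., optionally "input_text": ...}, or None if no field matches.

    reading_result is the kana form (e.g. "しめい、やまだはなこ") and kanji_result is
    the kana-kanji form (e.g. "氏名、山田花子") of the same calibrated result.
    """
    head_reading, _, _ = reading_result.partition("、")
    _, _, input_text = kanji_result.partition("、")
    for field_id, reading in id_reading_table.items():
        if reading == head_reading:
            message = {"field_id": field_id}
            if input_text:  # send the kana-kanji text, not the kana reading
                message["input_text"] = input_text
            return message
    return None


id_reading_table = {2: "しめい"}
with_text = resolve_calibrated_result("しめい、やまだはなこ", "氏名、山田花子", id_reading_table)
select_only = resolve_calibrated_result("しめい", "氏名", id_reading_table)
```

The two calls correspond to the two cases in the text: a result that carries an input text, and a result that only selects an input area.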

As described above, the transmission data from the voice recognition server 600 to the wireless communication terminal 400 is transferred by the wireless communication terminal 400 to the information processing apparatus 200.

FIG. 24 is a diagram illustrating an example of the configuration of transmission data from the voice recognition server 600 to the information processing apparatus 200.

As shown in FIG. 24, the transmission data 970 from the voice recognition server 600 to the information processing apparatus 200 includes a field ID 971 and, where applicable, an input text 972. The input text 972 is included when the type of the input area indicated by the field ID 971 is, for example, a text box or a selection menu.

Then, in step S3100 of FIG. 20, the calibration management unit 609 determines whether or not it has been instructed to end the process, for example by an operator operation. If it has not been instructed to end the process (S3100: NO), the calibration management unit 609 returns to step S3010. If it has been instructed to end the process (S3100: YES), the calibration management unit 609 ends the series of processes.

With this operation, the voice recognition server 600 can perform highly accurate voice recognition processing that reflects the attributes and configuration of the input screen. In addition, by selecting an appropriate clerk and having that clerk perform the calibration work, it can obtain even more accurate voice recognition results.

FIG. 25 is a flowchart illustrating an example of the operation conversion process (step S1400 in FIG. 10) performed by the information processing apparatus 200.

First, in step S1401, the input information generation unit 210 acquires the absolute coordinates of the input screen 861 (for example, the absolute coordinates of the origin o of the relative coordinate system 879; see FIG. 11) from the electronic medical chart unit 203. When the input screen is a window in Microsoft Windows (registered trademark), the absolute coordinates of the input screen 861 can be acquired by passing the corresponding Windows message to the electronic medical chart unit 203.

Then, in step S1402, the input information generation unit 210 acquires, from the screen configuration table, the relative coordinates of the input area corresponding to the field ID included in the received calibrated recognition result. It then converts the relative coordinates of the input area into absolute coordinates based on the absolute coordinates of the input screen. For example, when the relative coordinate system and the absolute coordinate system have the same scale, the input information generation unit 210 obtains the absolute coordinates of the input area by adding the relative coordinate values of the input area to the absolute coordinate values of the input screen.
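For the same-scale case described above, the conversion is a component-wise addition. The sketch below uses a hypothetical function name and made-up pixel values; it also shows that the stored relative coordinates stay valid when the window moves, since only the origin changes.

```python
def to_absolute(relative_xy, window_origin_xy):
    """Same-scale case of S1402: absolute = window origin + relative offset."""
    return (window_origin_xy[0] + relative_xy[0],
            window_origin_xy[1] + relative_xy[1])


field_relative = (120, 40)                            # placeholder for (x_2, y_2)
before_move = to_absolute(field_relative, (300, 200)) # window origin at (300, 200)
after_move = to_absolute(field_relative, (50, 75))    # same table row, window moved
```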

Then, in step S1403, the input information generation unit 210 generates operation information corresponding to the calibrated recognition result in accordance with the operation conversion rule (see FIG. 4). The input information generation unit 210 then inputs the generated operation information to the electronic medical chart unit 203, and returns to the process of FIG. 10.

For example, suppose that a calibrated recognition result including the field ID "2" and the input text "山田花子" is received. In this case, the input information generation unit 210 first obtains from the screen configuration table (see FIG. 13) that the type of the corresponding input area is a text box and that its relative coordinates are (x₂, y₂). Based on the operation conversion rule 820, the input information generation unit 210 then decides to generate operation information indicating a single click at the absolute coordinates of the corresponding input area, followed by an input operation for the input text. The input information generation unit 210 then converts the relative coordinates (x₂, y₂) into absolute coordinates, generates operation information with the decided contents, and inputs it to the electronic medical chart unit 203.
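The text box example can be sketched as a small function that turns a recognition result into a list of operations. Only the text box rule stated above (one click, then text input) is modeled; the function name, the tuple encoding of operations, and the numeric coordinates are assumptions, and the other rules of the operation conversion rule 820 are deliberately left unimplemented because they are not described in this passage.

```python
def make_operations(result, table, window_origin):
    """Build operation information for one calibrated recognition result (cf. S1401 to S1403)."""
    row = table[result["field_id"]]
    # Same-scale conversion from relative to absolute coordinates.
    x = window_origin[0] + row["xy"][0]
    y = window_origin[1] + row["xy"][1]
    if row["type"] == "text_box":
        # One click at the input area, followed by entry of the input text.
        return [("click", x, y), ("type_text", result["input_text"])]
    raise NotImplementedError(f"no rule sketched for type {row['type']!r}")


table = {2: {"type": "text_box", "xy": (120, 40)}}  # (120, 40) stands in for (x_2, y_2)
ops = make_operations({"field_id": 2, "input_text": "山田花子"}, table, (300, 200))
```

The resulting list would then be replayed against the application as ordinary pointer and keyboard input, which is why the application itself does not need to be modified.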

Through this operation conversion process, the information processing apparatus 200 can perform the operation corresponding to a voice recognition result even on an electronic medical chart unit 203 whose input screen varies in size, position, and configuration, and can do so without modifying the electronic medical chart unit 203.

In addition, through the operations described above, the information processing apparatus 200 can add a voice input function to the electronic medical chart without modifying the installed electronic application, by using an external device such as the wireless communication terminal 400, and while keeping the electronic application secure.

As described above, the input support system 100 according to the present embodiment associates a label text with each input area position based on the relative relationship between the input area position and the label text position, and thereby makes it possible to select an input area by means of a voice recognition result. As a result, the input support system 100 can easily add a voice input function to the electronic medical chart unit 203. That is, the input support system 100 can easily add a voice input function to an existing application in which no readings are set for the input areas.

In addition, because the input support system 100 uses coordinates relative to the input screen, it does not need to repeat the screen configuration analysis each time the input screen is merely moved, which reduces the processing load.

Further, the input support system 100 places the voice recognition server 600 and the plurality of proofreading terminals 700 on the public communication network 500 and uses them to add a voice recognition function to the electronic medical chart unit 203, while placing the wireless communication terminal 400 between them and the information processing apparatus 200. The input support system 100 can thus easily add a voice input function to the electronic medical chart unit 203 without connecting the information processing apparatus 200 to the public communication network 500 in a directly communicable manner. That is, the input support system 100 can easily add a highly accurate voice input function to an application that requires high security.

In the embodiment described above, the input information generation unit 210 is placed in the information processing apparatus 200 as the functional unit that converts a voice recognition result into operation information, but the placement of this functional unit is not limited to this. The functional unit that converts a voice recognition result into operation information may be placed, for example, in the wireless communication terminal 400. In this case, the screen configuration analysis unit 207 may transmit the generated screen configuration table to the wireless communication terminal 400 to be stored there.

Further, when the configuration and size of each page of the input screen are fixed, the screen configuration analysis unit 207 is no longer needed once the screen configuration table has been generated for all pages. In this case, therefore, another functional unit (for example, the input information generation unit 210) may delete the screen configuration analysis unit 207 from the information processing apparatus 200.

Further, the input support system 100 may use, as the voice recognition result, text data of the reading of the label text, the coordinates of the input area, or the like instead of an ID. In that case, however, the screen configuration analysis unit 207 needs to generate the screen configuration table so that the input area can be identified from the voice recognition result.

Further, the input support system 100 may input, to the application, operation information that emulates keyboard input signals.

In this case, for example, the voice recognition management unit 404 of the wireless communication terminal 400 converts the received result of the proofreading work into a keyboard input signal (hiragana character string) and transmits it to the information processing apparatus 200 via the device communication unit 402. This keyboard input signal indicates, for example, the keyboard operations that would be performed to enter the hiragana character string obtained by converting the mixed kana-kanji text resulting from the proofreading work, followed by a conversion operation (selecting the entire entered hiragana string and converting it) and a confirmation operation (committing the mixed kana-kanji result).

Further, when the input information generation unit 210 of the information processing apparatus 200 receives this keyboard input signal via the terminal communication unit 202 while one of the input areas is selected, it converts the keyboard input signal into kana-kanji text. The input information generation unit 210 then inputs, to the electronic medical chart unit 203, operation information indicating an operation of entering that kana-kanji text at the display position of the selected input area.
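The structure of such a keyboard input signal can be sketched as an ordered list of events: the keystrokes for the hiragana string, then the conversion operation, then the confirmation operation. The sketch assumes the hiragana reading is already available; the function name `build_keyboard_signal` and the event names are hypothetical and do not correspond to any real keyboard protocol.

```python
def build_keyboard_signal(hiragana):
    """Return the ordered events for typing, converting, and confirming a reading."""
    # One keystroke event per hiragana character.
    events = [("type", ch) for ch in hiragana]
    # Conversion operation: select the whole typed string, then convert it.
    events.append(("select_all_typed",))
    events.append(("convert",))
    # Confirmation operation: commit the mixed kana-kanji result.
    events.append(("confirm",))
    return events
```

For the reading "やまだ", this yields three keystroke events followed by the select, convert, and confirm events, in that order.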

Furthermore, the input support system 100 may accept the selection of an input area manually and use the voice recognition function only for text input. For example, when the keyboard input signal is implemented as the radio signal of an ordinary wireless keyboard, it reaches the electronic medical chart unit 203 over the same path as ordinary wireless keyboard input, so the input information generation unit 210 and the terminal communication unit 202 correspond to input functions that an ordinary computer terminal normally has.

Further, the kana-kanji conversion function of the input information generation unit 210 may be shared with a function already installed in the information processing apparatus 200. Because the voice recognition dictionary of the voice recognition server 600 includes a conversion dictionary (pairs of reading and written form) specialized for the target field, it is desirable to import this dictionary in advance into the kana-kanji conversion function (IME) of the information processing apparatus 200. Doing so increases the probability that kana-kanji conversion succeeds in a single conversion operation.

Further, the screen configuration analysis unit 207 may scan the image actually displayed and acquire (extract), from the input screen, the position and type of each input area and the display position of each label text. Such extraction can be performed by, for example, pattern matching or image feature extraction.
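As a rough illustration of the pattern matching option, the sketch below searches a small binary image for exact occurrences of a template, such as the outline of a text box. It is a deliberately simplified stand-in: real screen images would need tolerant matching, and the function name `find_template` and the grid representation are assumptions made for this example.

```python
def find_template(image, template):
    """Return top-left (row, col) positions where template matches image exactly.

    image and template are lists of rows, each row a list of pixel values.
    """
    image_h, image_w = len(image), len(image[0])
    tmpl_h, tmpl_w = len(template), len(template[0])
    hits = []
    for r in range(image_h - tmpl_h + 1):
        for c in range(image_w - tmpl_w + 1):
            # Compare each template row against the matching slice of the image.
            if all(image[r + i][c:c + tmpl_w] == template[i] for i in range(tmpl_h)):
                hits.append((r, c))
    return hits
```

Each hit gives a candidate display position for an input area, which could then be recorded in the screen configuration table.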

The present invention is, of course, also applicable to various applications other than electronic medical records.

The present invention is useful as an input support apparatus, an input support method, and an input support program that can easily add a voice input function to an existing application that requires high security.

Description of Symbols

100 Input support system
200 Information processing apparatus
201 LAN communication unit
202 Terminal communication unit
203 Electronic medical chart unit
204 Image output unit
205 Operation input unit
206 Table creation rule storage unit
207 Screen configuration analysis unit
208 Screen configuration table storage unit
209 Operation conversion rule storage unit
210 Input information generation unit
300 Hospital LAN
400 Wireless communication terminal
401 Device communication unit
402, 601 Network communication unit
403 Voice input unit
404 Voice recognition management unit
500 Public communication network
600 Voice recognition server
602 Voice recognition database
603 Reverse recognition processing unit
604 Correspondence table storage unit
605 Voice recognition processing unit
606 Device information storage unit
607 Clerk information storage unit
608 Clerk determination rule storage unit
609 Proofreading management unit
700 Proofreading terminal

Claims

An input support device that adds a voice input function to application software that displays, on an input screen, an input area for inputting information, the input support device comprising:
A screen configuration table storage unit that stores a screen configuration table in which the display position of the input area is associated with first information specifying a reading representing the input area; and
An input information generation unit that acquires second information representing a result of voice recognition processing on an uttered voice, identifies, with reference to the screen configuration table, the display position of the input area corresponding to the acquired second information, and inputs operation information for selecting the identified display position to the application software.

The input screen further displays label text indicating an attribute of the input area, and
The input support device further comprises a screen configuration analysis unit that acquires the display position of the input area and the display position of the label text and generates the screen configuration table by associating, based on the relative relationship between the display position of the input area and the display position of the label text, third information specifying the reading of the label text with the display position of the input area as the first information specifying the reading representing the input area,
The input support apparatus according to claim 1.

The third information for specifying the reading of the label text is information for specifying a result of reverse speech recognition processing for the label text.
The input support apparatus according to claim 2.

The screen configuration analysis unit
Queries the application software for the display position of the input area and the display position of the label text.
The input support apparatus according to claim 3.

The screen configuration analysis unit
Scanning the input screen, and detecting the input area and the label text from the input screen to obtain the display position of the input area and the display position of the label text,
The input support apparatus according to claim 3.

The screen configuration table associates the display position of the input area, expressed in the coordinate system of the input screen, with the first information specifying the reading representing the input area, and
The input information generation unit acquires the position of the input screen on the display screen on which the input screen is displayed and, based on the acquired position of the input screen, inputs to the application software the operation information for selecting the display position of the input area expressed in the coordinate system of the display screen,
The input support apparatus according to claim 3.

The relative relationship includes at least an arrangement direction and a distance,
The input support apparatus according to claim 3.

The input information generation unit, when the second information representing the result of the voice recognition processing consists of a combination of the first information specifying the reading representing the input area and input text, inputs to the application software, following the operation information for selecting the display position of the input area, operation information for entering the input text,
The input support apparatus according to claim 3.

The screen configuration analysis unit assigns, to each reading representing an input area, an ID as the first information specifying that reading, and transmits, for each ID, the pair of the ID and the label text to a voice recognition server that performs the reverse speech recognition processing and the voice recognition processing, and
The input information generation unit receives, as the second information representing the result of the voice recognition processing, the ID corresponding to the result of the voice recognition processing,
The input support apparatus according to claim 3.

An input support method for adding a voice input function to application software that displays an input area for inputting information and a label text indicating an attribute of the input area on an input screen,
Obtaining a display position of the input area and a display position of the label text;
Generating, based on the relative relationship between the display position of the input area and the display position of the label text, a screen configuration table in which information specifying the reading of the label text is associated with the display position of the input area as information specifying a reading representing the input area;
Obtaining information representing the result of the speech recognition process for the spoken speech;
Identifying the display position of the input area corresponding to the information representing the acquired result of the voice recognition processing with reference to the screen configuration table;
Inputting operation information for selecting the specified display position into the application software,
Input support method.

An input support program for adding a voice input function to application software for displaying an input area for inputting information and a label text for indicating an attribute of the input area on an input screen,
The program causing a computer capable of accessing the application software to execute:
Processing for obtaining the display position of the input area and the display position of the label text;
Processing for generating, based on the relative relationship between the display position of the input area and the display position of the label text, a screen configuration table in which information specifying the reading of the label text is associated with the display position of the input area as information specifying a reading representing the input area;
A process of acquiring information representing the result of the voice recognition process for the uttered voice;
Processing for specifying the display position of the input area corresponding to the information representing the acquired result of the speech recognition processing with reference to the screen configuration table;
A process of inputting operation information for selecting the specified display position into the application software, and
Input support program.