JP4003651B2

JP4003651B2 - Operation writing method, program, and recording medium recording the program

Info

Publication number: JP4003651B2
Application number: JP2003029765A
Authority: JP
Inventors: 勝宮本; 輝夫浜野
Original assignee: Nippon Telegraph and Telephone Corp
Current assignee: Nippon Telegraph and Telephone Corp
Priority date: 2003-02-06
Filing date: 2003-02-06
Publication date: 2007-11-07
Anticipated expiration: 2023-02-06
Also published as: JP2004240759A

Description

【０００１】
【発明の属する技術分野】
本発明は、操作の書き取り方法、プログラム及び該プログラムを記録した記録媒体に関する。詳細には、利用者の操作過程を文章として作成するものに関する。
【０００２】
【従来の技術及び発明が解決しようとする課題】
近年、操作履歴を収集分析し、得られた傾向に基づいて、情報の推薦又はマーケティングに役立てる研究が多くなされている。これらの分析結果は、入力である操作履歴の性質によって大きく変わってくる。従って、できるだけ詳細な操作履歴を把握することができれば、利用者の真のニーズにマッチした分析結果が得られる可能性がある。
【０００３】
ソフトウエアの詳細な操作履歴を取得するために、個々のソフトウエアに、操作履歴を取得する機能を組み込むことが考えられる。しかし、これでは、任意のソフトウエアの操作履歴を取得するのは困難であり、汎用性がない。
【０００４】
汎用的に操作履歴を取得するために、利用者が操作内容を音声で発話し、発話内容を音声認識の技術でディクテーションする方法があった（例えば非特許文献１参照）。しかし、発話の速度やなまりなどによる音声認識の認識率低下は、逃れられない問題である。また、利用者に操作内容の発話をしいるため、利用者の負荷が高いという問題もある。
【０００５】
【非特許文献１】
川端豪、中蔦信弥ら著「ここまできた音声認識・音声合成」、ＮＴＴ技術ジャーナル、１９９９年、Ｖｏｌ.１１、Ｎｏ.１２、ｐｐ１２−４０
【０００６】
そこで、本発明は、上記の問題を鑑みてなされたものであり、汎用的で、利用者に負担をかけることなく、詳細な操作履歴を取得する操作の書き取り方法、プログラム及び該プログラムを記録した記録媒体を提供することを目的とする。
【０００７】
【課題を解決するための手段】
本発明は、利用者によって操作される操作手段と、操作対象を含む画像情報を表示する表示手段とからなるユーザインタフェース装置を用いて、利用者の操作過程を文章として作成する処理装置における操作の書き取り方法、プログラム及び該プログラムを記録した記録媒体に関する。
【０００８】
本発明における操作の書き取り方法によれば、
前記操作手段における操作事象を監視する第１のステップと、
前記操作事象が発生した際に、前記表示手段に表示されている画像情報から前記操作対象及び操作種類を特定する第２のステップと、
前記操作対象を示す文章と、前記操作種類を示す文章とを組み合わせた文章を作成する第３のステップと、を有しており、
操作対象の特定は、
操作事象が発生した際の、操作手段が指示する画面上の座標の周囲の画像情報である周囲画像情報を取得するステップと、
取得した周囲画像情報と、予め蓄積している操作対象毎の部分画像情報とを比較して、一致又は類似度の最も高い部分画像情報を検索するステップと、
検索した部分画像情報の中の文字列を認識、あるいは、蓄積された部分画像情報に係る文書情報に基づき、検索した部分画像情報に係る文書情報を検索するステップとを含んでいることを特徴とする。これにより、汎用的で、利用者に負荷がかからない方法で、詳細な操作履歴を取得することが可能となる。
【００１１】
更に、本発明における操作の書き取り方法の他の実施形態によれば、操作対象に階層構造が存在する場合、
第２のステップは、操作対象に対する上位操作対象を特定し、
第３のステップは、上位操作対象を示す文章を更に組み合わせた文章を作成する、ことも好ましい。これにより、操作対象の位置を特定しやすい操作履歴を記述することが可能となる。
【００１２】
更に、本発明における操作の書き取り方法の他の実施形態によれば、第３のステップは、上位操作対象の部分画像情報に対する操作対象の部分画像情報の相対的な位置を示す文章を更に組み合わせた文章を作成することも好ましい。
【００１３】
本発明における操作の書き取りプログラムによれば、
前記操作手段における操作事象を監視する第１のステップと、
前記操作事象が発生した際に、前記表示手段に表示されている画像情報から前記操作対象及び操作種類を特定する第２のステップと、
前記操作対象を示す文章と、前記操作種類を示す文章とを組み合わせた文章を作成する第３のステップとをコンピュータに実行させ、
操作対象の特定のために、
操作事象が発生した際の、操作手段が指示する画面上の座標の周囲の画像情報である周囲画像情報を取得するステップと、
取得した周囲画像情報と、予め蓄積している操作対象毎の部分画像情報とを比較して、一致又は類似度の最も高い部分画像情報を検索するステップと、
検索した部分画像情報の中の文字列を認識、あるいは、蓄積された部分画像情報に係る文書情報に基づき、検索した部分画像情報に係る文書情報を検索するステップとをコンピュータに実行させることを特徴とする。
【００１６】
更に、本発明における操作の書き取りプログラムの他の実施形態によれば、操作対象に階層構造が存在する場合、
第２のステップは、操作対象に対する上位操作対象を特定し、
第３のステップは、上位操作対象を示す文章を更に組み合わせた文章を作成するようにコンピュータを実行させることも好ましい。
【００１７】
更に、本発明における操作の書き取りプログラムの他の実施形態によれば、第３のステップは、上位操作対象の部分画像情報に対する操作対象の部分画像情報の相対的な位置を示す文章を更に組み合わせた文章を作成するようにコンピュータを実行させることも好ましい。
【００１８】
前述した本発明における操作の書き取りプログラムを記録した記録媒体であってもよい。
【００１９】
【発明の実施の形態】
以下では、本発明の実施の形態について、図面を参照して詳細に説明する。
【００２０】
図１は、本発明による第１の実施形態のシステム構成図である。また、図２及び図３は、ディスプレイに表示された表示情報の例である。
【００２１】
図１によれば、マウス及びキーボートとディスプレイとに接続された処理装置は、操作監視部１と、操作対象特定部２と、操作種別特定部３と、説明文合成部４とから構成される。
【００２２】
操作監視部１は、操作対象に対する利用者の操作を監視する。操作対象としては、例えば、「メニュー」、「ボタン」、「ウィンドウ」、「ラジオボタン」等がある。また、利用者の操作としては、例えば、マウスのクリック、キーボードの入力、タッチパネルへの接触、音声入力等がある。このような操作事象の発生を検知したとき、操作監視部１は、操作識別情報と共に、操作対象の特定要求を操作対象特定部２へ出力し、操作種別の特定要求を操作種別特定部３へ出力する。
【００２３】
操作対象特定部２は、操作監視部１からの操作対象の特定要求の通知を受けた際に、操作識別情報に基づいて利用者が操作した操作対象を特定する。ここで、操作対象とは、アプリケーションを構成する画面情報のうち、利用者の操作が直接作用した部位のことをいう。
【００２４】
例えば、図２によれば、利用者がアプリケーションの画面中の「ＯＫボタン」をクリックした場合、利用者の操作が直接作用した「ＯＫボタン」が操作対象となる。また、例えば、図３によれば、利用者がアプリケーション画面中の「編集メニュー」ボタンをクリックした場合、利用者の操作が直接作用した「編集メニュー」ボタンが操作対象となる。
【００２５】
尚、操作対象は、操作監視部１によって取得される操作識別情報と、操作対象の対応表とを用いて特定してもよい。また、操作監視部１から取得可能な操作識別情報を用いて、何らかの方法で自動的に操作対象を特定してもよい。
【００２６】
操作種別特定部３は、操作監視部１からの操作対象の特定要求の通知を受けた際に、操作識別情報に基づいて、利用者が操作対象に対して行った操作種別を特定する。例えば、「アイコン」という操作対象に対しては、利用者の操作によって「クリックする」という操作又は「ドラッグする」という操作など、複数の種類が利用可能であり、こらら個々の操作の種類が操作種別となる。
【００２７】
そして、利用者が行った操作が、これら複数の操作のうち、いずれだったのかを特定することが、操作種別の特定という行為である。操作対象によっては、１種類しか操作種別が存在しない場合もあり、このような操作対象の場合は、必然的に操作種別は一意に決まる。
【００２８】
説明文合成部４は、操作対象特定部２で取得した操作対象を示す文章と、操作種別特定部３で取得した操作種別を示す文章とを組合せて、操作履歴を文章化する。文章化する上では、文章のテンプレートを用意し、取得した操作対象と操作種別を埋め込むといった方法などがある。具体的な例を以下に示す。
【００２９】
テンプレート＝「［操作対象］を［操作種別］。」であって、
操作対象＝「編集メニュー」、操作種別＝「選択する」である場合、
作成された文章＝「「編集メニュー」を「選択する」。」となる。
【００３０】
ここで、文章に、操作対象及び操作種別が分かるような画像を抽出して、付与してもよい。
【００３１】
図４は、図１に基づく書き取り方法のフローチャートである。
【００３２】
（Ｓ１）操作監視部１が操作事象を監視しており、操作事象が発生したらＳ２へ、発生していなかったら、Ｓ１自身に戻って、引き続き操作事象を監視する。
（Ｓ２）操作種別特定部２が、操作監視部１から取得した操作識別情報に基づいて、操作種別を識別し、Ｓ３へ進む。
（Ｓ３）操作対象特定部３が、操作監視部１から取得した操作識別情報に基づいて、操作対象を識別し、Ｓ４へ進む。
（Ｓ４）説明文合成部４が、操作対象特定部２で取得した操作対象を示す文章と、操作種別特定部３で取得した操作種別を示す文章とを組合せて、操作履歴を文章化する。その後、再びＳｌに戻り、次回の操作事象の監視を継続する。
【００３３】
図５は、本発明による第２の実施形態のシステム構成図である。
【００３４】
図５によれば、処理装置は、操作監視部１'と、操作対象特定部２'と、操作種別特定部３と、説明文合成部４と、画像認識部５'とを有する。尚、操作種別特定部３と説明文合成部４とは、図１のものと同一であるので、その説明は以下では省略する。
【００３５】
操作監視部１'は、操作事象が発生したら、パソコンにおけるマウスのような操作を指示するポインタの画面上の座標を取得し、当該座標を操作識別情報として、操作対象特定部２'及び操作種別特定部３へ通知する。
【００３６】
操作対象特定部２'は、周囲画像取得部２−１と、対象種別識別部２−２と、タイトル認識部２−３とから構成される。
【００３７】
周囲画像取得部２−１は、操作監視部１'から通知された操作識別情報の座標に基づいて、当該座標周囲の周囲画像情報、即ち操作対象の周囲画像情報を取得し、対象種別識別部２−２へ通知する。
【００３８】
対象種別識別部２−２は、周囲画像取得部２−１から通知された周囲画像情報を、形状認識部５−１へ通知し、周囲画像情報の中に、操作対象の候補が存在するかどうかを問い合わせる。存在すれば、その対象を操作対象として特定する。特定した操作対象の範囲の画像情報を、タイトル認識部２−３へ通知する。
【００３９】
タイトル認識部２−３は、対象種別識別部２−２から通知された画像情報を文字認識部５−２へ通知し、当該画像情報の中に文字列の候補が存在するかどうかを問い合わせる。存在すれば、その文字列を操作対象のタイトルとして特定する。
【００４０】
画像認識部５は、形状認識部５−１と、文字認識部５−２とから構成される。
【００４１】
形状認識部５−１は、操作対象毎の部分画像情報を予め蓄積している。そして、対象種別識別部２−２から通知された周囲画像情報と、蓄積された候補となる部分画像情報とを比較し、一致するか又は最も類似度の高い部分画像情報を検索する。検索された部分画像情報は、対象種別識別部２−２へ通知される。一方、いずれの部分画像情報にも類似しない場合は、その旨を返す。
【００４２】
文字認識部５−２は、タイトル認識部２−３から通知された部分画像情報に、文字列の候補が存在するかどうかを検出するものである。また、部分画像情報に係る文章情報を予め蓄積しておき、部分画像情報と一致する文章情報を検索するものであってもよい。検出又は検索された文章情報は、タイトル認識部２−３へ通知される。一方、いずれの文章情報にも一致しない場合は、その旨を返す。
【００４３】
図６は、図５に基づく書き取り方法のフローチャートである。
【００４４】
（Ｓ２−１）操作監視部１'は、操作事象を監視し、操作事象が発生したら、パソコンにおけるマウスのような操作を指示するポインタの画面上の座標を取得し、操作識別情報と共に当該座標情報を操作種別特定部３へ通知する。操作種別特定部３によって操作種別がマウスのクリックであると判定されると、操作対象特定部２'に、マウス座標と共に操作対象特定要求を通知し、Ｓ２−２へ進む。それ以外の操作種別の場合は、Ｓ２−１へ戻り、操作の監視を継続する。
（Ｓ２−２）操作対象特定部２'の周囲画像取得部２−１は、取得したマウス座標の周囲画像情報を取得する。この周囲画像情報は、マウス座標を中心とした円状の画像や、予め大きさの決まった長方形の画像が想定される。その後、Ｓ２−３へ進む。
（Ｓ２−３）周囲画像取得部２一１取得した画像を、対象種別識別部２−２が、形状認識部５−１へ通知し、周囲画像情報の中に、操作対象の候補が存在するかどうかを問い合わせる。周囲画像情報の中に「メニュー」や「ボタン」の形状と類似度が高い部分が存在すると判定されれば、Ｓ２−４へ進む。また、周囲画像情報の中に、特定の操作対象が存在しないと判定されれば、Ｓ２−１へ戻り、操作の監視を継続する。
（Ｓ２−４）タイトル認識部２−３が、対象種別識別部２−２から通知された部分画像情報を、文字認識部５−２へ通知し、部分画像情報の中に、文字列の候補が存在するかどうかを問い合わせる。存在すれば、その文字列を操作対象のタイトルとして認識し、Ｓ２−５へ進む。
（Ｓ２−５）特定した操作対象を含む画面の一部をキャプチャし、Ｓ２−６へ進む。
（Ｓ２−６）説明文合成部４が、操作対象特定部２'で取得した操作対象のタイトルと、操作種別特定部３で取得した操作種別を組合せて、操作の説明文を合成する。尚、文章に、Ｓ２−５でキャプチャした画像を、付与してもよい。
【００４５】
図７は、本発明による第３の実施形態のシステム構成図である。また、図８は、ディスプレイに表示された表示情報の例である。尚、上位階層特定部６以外は、第２の実施形態と同じであるため、以下では説明を省略する。
【００４６】
上位階層特定部６は、上位階層の対象特定部６−１と、タイトル認識部６−２とから構成される。
【００４７】
上位階層の対象特定部６−１は、操作対象特定部２'で特定した操作対象に階層構造が存在する場合に、操作対象の親の階層の操作対象を特定する。操作対象が「メニュー」である場合には、親のメニュー階層をたどり、親のメニュー名に対応する画像を取得する。また、操作対象が「ボタン」の場合、「ボタン」が描画されている「ウィンドウ」のタイトルが存在する画像を親の階層の操作対象として特定する。たどる階層数は１つだけに限らず、必要に応じて再帰的に更に親の階層をたどってもよい。
【００４８】
タイトル認識部６−２は、上位階層の対象特定部６−１が取得した画像を、文字認識部５−２へ通知し、画像の中に、文字列の候補が存在するかどうかを問い合わせる。存在すれば、その文字列を操作対象のタイトルとして認識する。
【００４９】
説明文合成部４は、操作対象特定部２で取得した操作対象と、操作種別特定部３で取得した操作種別と、上位階層特定部６で取得した親の操作対象とを組合せて、操作履歴を文章化する。
【００５０】
操作対象がメニューの場合、操作対象のメニューが、メニュー階層の中でどの位置に存在するかを記述する。例えば、「［ファイル］−[上書き保存］を選択する」のように、親のメニュー階層も併記する。
【００５１】
また、操作対象がボタンの場合は、親の操作対象であるウィンドウの中で、ボタンが配置している相対座標をもと、相対的な位置を記述する。例えば、「印刷ウィンドウの右下にあるＯＫボタンを押下する」のように相対的な位置を併記する。
【００５２】
更に、文章に、操作対象及び操作種別が分かるような画像を抽出して、付与してもよい。文章化する上では、文章のテンプレートを用意し、取得した操作対象と操作種別を埋め込むといった方法などがある。
【００５３】
図９は、図７に基づく操作の書き取り方法のフローチャートである。尚、Ｓ３−１、Ｓ３−２、Ｓ２−６'以外は、第２の実施形態と同じであるため、以下では説明を省略する。
【００５４】
（Ｓ３−１）上位階層の対象特定部６−１が、操作対象特定部２'で特定した操作対象に階層構造が存在する場合に、操作対象の親の階層の操作対象を特定する。また、タイトル認識部６−２が、上位階層の対象特定部６−１が取得した画像を、文字認識部５−２へ通知し、画像の中に、文字列の候補が存在するかどうかを問い合わせ、存在すれば、その文字列を操作対象のタイトルとして認識する。その後、Ｓ３−２へ進む。
（Ｓ３−２）上位階層の対象特定部６−１が、更に上位の親の操作対象が存在しないかを判定し、存在する場合には、Ｓ３−１に戻る。存在しない場合には、Ｓ２−５へ進む。
（Ｓ２−６'）説明文合成部４が、操作対象特定部２'で取得した操作対象のタイトルと、操作種別特定部３で取得した操作種別と、上位階層特定部６で取得した親の操作対象とを組合せて、操作履歴を文章化する。
【００５５】
【発明の効果】
上述のように、本発明における操作の書き取り方法、プログラム及び該プログラムを記録した記録媒体によれば、汎用的で、利用者に負担をかけることなく、詳細な操作履歴を取得することができる。特に、アプリケーション又はＯＳなどのプラットフォームに依存しない汎用的な方法で、操作対象の内容を特定しやすい操作履歴を記述することが可能となる。また、操作対象の位置を特定しやすい操作履歴を記述することもできる。
【図面の簡単な説明】
【図１】本発明による第１の実施形態のシステム構成図である。
【図２】ディスプレイに表示された表示情報の例である。
【図３】ディスプレイに表示された表示情報の例である。
【図４】図１に基づく書き取り方法のフローチャートである。
【図５】本発明による第２の実施形態のシステム構成図である。
【図６】図５に基づく書き取り方法のフローチャートである。
【図７】本発明による第３の実施形態のシステム構成図である。
【図８】ディスプレイに表示された表示情報の例である。
【図９】図７に基づく書き取り方法のフローチャートである。
【符号の説明】
１操作監視部
２操作対象特定部
２−１周囲画像取得部
２−２対象種別識別部
２−３タイトル認識部
３操作種別特定部
４説明文合成部
５画像認識部
５−１形状認識部
５−２文字認識部
６上位階層特定部
６−１上位階層の対象特定部
６−２タイトル認識部
[0001]
[Technical field to which the invention belongs]
The present invention relates to an operation writing method, a program, and a recording medium on which the program is recorded. Specifically, it relates to technology that records a user's operation process as written sentences.
[0002]
[Prior art and problems to be solved by the invention]
In recent years, many studies have collected and analyzed operation histories and used the trends obtained for information recommendation or marketing. The results of such analyses vary greatly depending on the nature of the operation history used as input. Therefore, if an operation history can be captured in as much detail as possible, the analysis may yield results that match the user's true needs.
[0003]
To acquire a detailed operation history of software, one conceivable approach is to build a history-acquisition function into each individual piece of software. With this approach, however, it is difficult to acquire the operation history of arbitrary software, so it lacks generality.
[0004]
As a general-purpose way of acquiring an operation history, there has been a method in which the user speaks the operation content aloud and the utterance is dictated by speech recognition technology (see, for example, Non-Patent Document 1). However, the drop in recognition rate caused by speaking speed, accent, and the like is an unavoidable problem. Another problem is the heavy burden on the user, who is forced to speak the operation content.
[0005]
[Non-Patent Document 1]
Kawabata Go, Nakamata Nobuya et al., “How Far Speech Recognition and Speech Synthesis Have Come”, NTT Technical Journal, 1999, Vol. 11, No. 12, pp. 12-40
[0006]
The present invention has therefore been made in view of the above problems, and its object is to provide an operation writing method, a program, and a recording medium recording the program that are general-purpose and acquire a detailed operation history without burdening the user.
[0007]
[Means for Solving the Problems]
The present invention relates to an operation writing method, a program, and a recording medium on which the program is recorded, for a processing device that creates sentences describing a user's operation process by using a user interface device comprising operation means operated by the user and display means for displaying image information including an operation target.
[0008]
The operation writing method of the present invention comprises:
A first step of monitoring an operation event in the operation means;
A second step of specifying the operation target and the operation type from the image information displayed on the display means when the operation event occurs; and
A third step of creating a sentence that combines a sentence indicating the operation target with a sentence indicating the operation type.
Specifying the operation target includes:
A step of obtaining surrounding image information, which is the image information around the on-screen coordinates indicated by the operation means when the operation event occurs;
A step of comparing the acquired surrounding image information with partial image information stored in advance for each operation target, and searching for the partial image information that matches or has the highest similarity; and
A step of recognizing a character string in the retrieved partial image information, or of retrieving the document information related to the retrieved partial image information based on document information stored in association with the partial image information. As a result, a detailed operation history can be acquired by a general-purpose method that places no burden on the user.
[0011]
Furthermore, according to another embodiment of the operation writing method of the present invention, when the operation target has a hierarchical structure, it is also preferable that
the second step specifies an upper operation target for the operation target, and
the third step creates a sentence that further combines a sentence indicating the upper operation target. This makes it possible to describe an operation history in which the position of the operation target is easy to identify.
[0012]
Further, according to another embodiment of the operation writing method of the present invention, it is also preferable that the third step creates a sentence that further combines a sentence indicating the relative position of the partial image information of the operation target with respect to the partial image information of the upper operation target.
[0013]
The operation writing program of the present invention causes a computer to execute:
A first step of monitoring an operation event in the operation means;
A second step of specifying the operation target and the operation type from the image information displayed on the display means when the operation event occurs; and
A third step of creating a sentence that combines a sentence indicating the operation target with a sentence indicating the operation type.
To specify the operation target, the program causes the computer to execute:
A step of obtaining surrounding image information, which is the image information around the on-screen coordinates indicated by the operation means when the operation event occurs;
A step of comparing the acquired surrounding image information with partial image information stored in advance for each operation target, and searching for the partial image information that matches or has the highest similarity; and
A step of recognizing a character string in the retrieved partial image information, or of retrieving the document information related to the retrieved partial image information based on document information stored in association with the partial image information.
[0016]
Furthermore, according to another embodiment of the operation writing program of the present invention, when the operation target has a hierarchical structure, it is also preferable that the program causes the computer to operate so that
the second step specifies an upper operation target for the operation target, and
the third step creates a sentence that further combines a sentence indicating the upper operation target.
[0017]
Furthermore, according to another embodiment of the operation writing program of the present invention, it is also preferable that the program causes the computer to operate so that the third step creates a sentence that further combines a sentence indicating the relative position of the partial image information of the operation target with respect to the partial image information of the upper operation target.
[0018]
The invention may also take the form of a recording medium on which the above-described operation writing program is recorded.
[0019]
DETAILED DESCRIPTION OF THE INVENTION
Hereinafter, embodiments of the present invention will be described in detail with reference to the drawings.
[0020]
FIG. 1 is a system configuration diagram of a first embodiment according to the present invention. FIGS. 2 and 3 are examples of the display information displayed on the display.
[0021]
According to FIG. 1, the processing device, which is connected to a mouse, a keyboard, and a display, comprises an operation monitoring unit 1, an operation target specifying unit 2, an operation type specifying unit 3, and a description synthesizing unit 4.
[0022]
The operation monitoring unit 1 monitors user operations on the operation target. Examples of the operation target include “menu”, “button”, “window”, and “radio button”. Examples of user operations include mouse clicks, keyboard input, contact with a touch panel, and voice input. When it detects the occurrence of such an operation event, the operation monitoring unit 1 outputs, together with the operation identification information, an operation target specifying request to the operation target specifying unit 2 and an operation type specifying request to the operation type specifying unit 3.
[0023]
When the operation target specifying unit 2 receives the notification of the operation target specifying request from the operation monitoring unit 1, the operation target specifying unit 2 specifies the operation target operated by the user based on the operation identification information. Here, the operation target refers to a part of the screen information constituting the application, on which the user's operation is directly applied.
[0024]
For example, according to FIG. 2, when the user clicks the “OK button” on the screen of the application, the “OK button” to which the user's operation directly acts becomes the operation target. For example, according to FIG. 3, when the user clicks the “edit menu” button in the application screen, the “edit menu” button to which the user's operation directly acts becomes the operation target.
[0025]
The operation target may be specified using the operation identification information acquired by the operation monitoring unit 1 and the operation target correspondence table. In addition, the operation identification information that can be acquired from the operation monitoring unit 1 may be used to automatically specify the operation target by some method.
[0026]
When it receives the specifying request from the operation monitoring unit 1, the operation type specifying unit 3 specifies, based on the operation identification information, the type of operation the user performed on the operation target. For example, for the operation target “icon”, several kinds of user operation are available, such as “click” or “drag”, and each of these individual kinds of operation is an operation type.
[0027]
Then, identifying which of the plurality of operations is the operation performed by the user is an act of identifying the operation type. Depending on the operation target, there may be only one operation type. In such an operation target, the operation type is inevitably determined uniquely.
[0028]
The description synthesizing unit 4 combines the sentence indicating the operation target acquired by the operation target specifying unit 2 with the sentence indicating the operation type acquired by the operation type specifying unit 3 to put the operation history into sentences. One way to do this is to prepare a sentence template and embed the acquired operation target and operation type in it. A specific example is shown below.
[0029]
Template = “[operation type] the [operation target].”,
Operation target = “Edit menu” and operation type = “Select”,
Created sentence = “‘Select’ the ‘Edit menu’.”
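The template substitution described above can be sketched as follows. This is only an illustration: the function name `synthesize_description`, the placeholder syntax, and the choice to quote only the operation target are assumptions, not something the patent specifies.

```python
def synthesize_description(target: str, op_type: str,
                           template: str = "{op_type} the {target}.") -> str:
    """Embed the operation target and operation type into a sentence template.

    The placeholder names and the English word order are assumptions made
    for this sketch; the source only says that a template is prepared and
    the two values are embedded in it.
    """
    # Quote the target so the name of the UI element stands out in the text.
    return template.format(op_type=op_type, target=f'"{target}"')


print(synthesize_description("Edit menu", "Select"))
```

A different template string can be passed in to change the wording without touching the rest of the pipeline.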
[0030]
Here, an image that indicates the operation target and the operation type may be extracted and added to the sentence.
[0031]
FIG. 4 is a flowchart of the writing method based on FIG. 1.
[0032]
(S1) The operation monitoring unit 1 monitors for operation events. If an operation event occurs, the process proceeds to S2. If not, the process returns to S1 and continues monitoring.
(S2) The operation type specifying unit 3 specifies the operation type based on the operation identification information acquired from the operation monitoring unit 1, and the process proceeds to S3.
(S3) The operation target specifying unit 2 specifies the operation target based on the operation identification information acquired from the operation monitoring unit 1, and the process proceeds to S4.
(S4) The description synthesizing unit 4 combines the sentence indicating the operation target acquired by the operation target specifying unit 2 with the sentence indicating the operation type acquired by the operation type specifying unit 3 to put the operation history into sentences. The process then returns to S1 and continues monitoring for the next operation event.
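Steps S1 to S4 can be sketched as a simple loop over operation events. The sketch assumes the correspondence-table approach mentioned for the first embodiment; the event fields, the table contents, and all names below are hypothetical and chosen only for illustration.

```python
from typing import Iterable, List, NamedTuple


class OperationEvent(NamedTuple):
    """Hypothetical operation identification information for one event."""
    control_id: str  # identifies the UI element that received the operation
    action: str      # raw action code reported by the input device


# Hypothetical correspondence tables (operation identification -> text).
TARGET_TABLE = {"btn_ok": "OK button", "menu_edit": "Edit menu"}
TYPE_TABLE = {"click": "Click", "drag": "Drag"}


def write_operations(events: Iterable[OperationEvent]) -> List[str]:
    history = []
    for event in events:                             # S1: an event occurred
        op_type = TYPE_TABLE.get(event.action)       # S2: specify the type
        target = TARGET_TABLE.get(event.control_id)  # S3: specify the target
        if op_type is None or target is None:
            continue  # unknown event: keep monitoring
        history.append(f'{op_type} the "{target}".')  # S4: synthesize text
    return history


log = write_operations([OperationEvent("menu_edit", "click"),
                        OperationEvent("btn_ok", "click")])
print(log)
```

In a real system the events would come from an input hook rather than a list; the list keeps the sketch self-contained.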
[0033]
FIG. 5 is a system configuration diagram of the second embodiment according to the present invention.
[0034]
According to FIG. 5, the processing device includes an operation monitoring unit 1 ′, an operation target specifying unit 2 ′, an operation type specifying unit 3, a description synthesizing unit 4, and an image recognition unit 5 ′. Since the operation type specifying unit 3 and the description synthesizing unit 4 are the same as those in FIG. 1, their description is omitted below.
[0035]
When an operation event occurs, the operation monitoring unit 1 ′ acquires the on-screen coordinates of the pointer used to direct operations, such as the mouse pointer on a personal computer, and notifies the operation target specifying unit 2 ′ and the operation type specifying unit 3 of those coordinates as the operation identification information.
[0036]
The operation target specifying unit 2 ′ includes a surrounding image acquisition unit 2-1, a target type identification unit 2-2, and a title recognition unit 2-3.
[0037]
Based on the coordinates in the operation identification information notified from the operation monitoring unit 1 ′, the surrounding image acquisition unit 2-1 acquires the image information around those coordinates, that is, the surrounding image information of the operation target, and notifies the target type identification unit 2-2 of it.
[0038]
The target type identification unit 2-2 passes the surrounding image information notified from the surrounding image acquisition unit 2-1 to the shape recognition unit 5-1 and inquires whether a candidate operation target exists in the surrounding image information. If one exists, it is specified as the operation target. The image information covering the extent of the specified operation target is then notified to the title recognition unit 2-3.
[0039]
The title recognition unit 2-3 notifies the character recognition unit 5-2 of the image information notified from the target type identification unit 2-2, and inquires whether a character string candidate exists in the image information. If it exists, the character string is specified as the title of the operation target.
[0040]
The image recognition unit 5 includes a shape recognition unit 5-1 and a character recognition unit 5-2.
[0041]
The shape recognition unit 5-1 stores partial image information for each operation target in advance. It compares the surrounding image information notified from the target type identification unit 2-2 with the stored candidate partial image information and searches for the partial image information that matches or has the highest similarity. The retrieved partial image information is notified to the target type identification unit 2-2. If the surrounding image information is not similar to any of the stored partial image information, a response to that effect is returned.
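The comparison performed by the shape recognition unit could look roughly like the sketch below. The patent does not say how similarity is computed, so the sliding-window pixel-agreement score, the binary image representation, and the threshold value are all assumptions made for illustration.

```python
from typing import Dict, List, Optional, Tuple

Image = List[List[int]]  # assumed representation: rows of 0/1 pixels


def match_score(window: Image, template: Image) -> float:
    """Fraction of pixels on which the window and the template agree."""
    total = len(template) * len(template[0])
    same = sum(1 for w_row, t_row in zip(window, template)
               for w, t in zip(w_row, t_row) if w == t)
    return same / total


def best_match(surrounding: Image, templates: Dict[str, Image],
               threshold: float = 0.9) -> Optional[Tuple[str, int, int]]:
    """Return (template name, x, y) of the most similar stored partial
    image, or None when nothing reaches the (assumed) threshold."""
    best, best_score = None, threshold
    for name, tpl in templates.items():
        t_h, t_w = len(tpl), len(tpl[0])
        # Slide the template over every position inside the surrounding image.
        for y in range(len(surrounding) - t_h + 1):
            for x in range(len(surrounding[0]) - t_w + 1):
                window = [row[x:x + t_w] for row in surrounding[y:y + t_h]]
                score = match_score(window, tpl)
                if score >= best_score:
                    best, best_score = (name, x, y), score
    return best


surrounding = [[0, 0, 0, 0],
               [0, 1, 1, 0],
               [0, 1, 0, 0],
               [0, 0, 0, 0]]
stored = {"button": [[1, 1], [1, 0]]}
print(best_match(surrounding, stored))
```

Returning `None` corresponds to the case where no stored partial image is similar enough.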
[0042]
The character recognizing unit 5-2 detects whether a character string candidate exists in the partial image information notified from the title recognizing unit 2-3. Alternatively, text information related to the partial image information may be stored in advance, and text information that matches the partial image information may be searched. The detected or searched text information is notified to the title recognition unit 2-3. On the other hand, if it does not match any text information, a message to that effect is returned.
[0043]
FIG. 6 is a flowchart of the writing method based on FIG. 5.
[0044]
(S2-1) The operation monitoring unit 1 ′ monitors for operation events. When an operation event occurs, it acquires the on-screen coordinates of the pointer used to direct operations, such as the mouse pointer on a personal computer, and notifies the operation type specifying unit 3 of the coordinate information together with the operation identification information. If the operation type specifying unit 3 determines that the operation type is a mouse click, an operation target specifying request is sent to the operation target specifying unit 2 ′ together with the mouse coordinates, and the process proceeds to S2-2. For any other operation type, the process returns to S2-1 and continues monitoring.
(S2-2) The surrounding image acquisition unit 2-1 of the operation target specifying unit 2 ′ acquires the surrounding image information around the acquired mouse coordinates. This surrounding image information is assumed to be, for example, a circular image centered on the mouse coordinates or a rectangular image of a predetermined size. The process then proceeds to S2-3.
(S2-3) The target type identification unit 2-2 passes the image acquired by the surrounding image acquisition unit 2-1 to the shape recognition unit 5-1 and inquires whether a candidate operation target exists in the surrounding image information. If it is determined that the surrounding image information contains a portion highly similar to the shape of a “menu” or a “button”, the process proceeds to S2-4. If it is determined that no particular operation target exists in the surrounding image information, the process returns to S2-1 and continues monitoring.
(S2-4) The title recognition unit 2-3 passes the partial image information notified from the target type identification unit 2-2 to the character recognition unit 5-2 and inquires whether a character string candidate exists in the partial image information. If one exists, the character string is recognized as the title of the operation target, and the process proceeds to S2-5.
(S2-5) A part of the screen including the specified operation target is captured, and the process proceeds to S2-6.
(S2-6) The description synthesizing unit 4 combines the title of the operation target acquired by the operation target specifying unit 2 ′ with the operation type acquired by the operation type specifying unit 3 to synthesize a description of the operation. The image captured in S2-5 may be attached to the sentence.
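For the rectangular case mentioned in step S2-2, the region to capture around the mouse coordinates can be computed as in the following sketch. The default box size and the decision to shift the box so that it stays on screen are assumptions; the patent only states that a rectangle of predetermined size is one possibility.

```python
from typing import Tuple


def surrounding_box(x: int, y: int, screen_w: int, screen_h: int,
                    box_w: int = 120, box_h: int = 40) -> Tuple[int, int, int, int]:
    """Return (left, top, right, bottom) of a fixed-size rectangle centered
    on the pointer, shifted inward where it would leave the screen."""
    # Center the box on the pointer, then keep it within the screen bounds.
    left = max(0, min(x - box_w // 2, screen_w - box_w))
    top = max(0, min(y - box_h // 2, screen_h - box_h))
    # Clip the far edges in case the screen is smaller than the box.
    right = min(left + box_w, screen_w)
    bottom = min(top + box_h, screen_h)
    return left, top, right, bottom


print(surrounding_box(400, 300, 800, 600))
```

The resulting rectangle would then be handed to whatever screen-capture facility the platform offers, which is outside the scope of this sketch.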
[0045]
FIG. 7 is a system configuration diagram of the third embodiment according to the present invention. FIG. 8 shows an example of display information displayed on the display. Everything other than the upper layer specifying unit 6 is the same as in the second embodiment, so its description is omitted below.
[0046]
The upper layer specifying unit 6 includes an upper layer target specifying unit 6-1 and a title recognizing unit 6-2.
[0047]
When the operation target specified by the operation target specifying unit 2 ′ has a hierarchical structure, the upper layer target specifying unit 6-1 specifies the operation target at the parent level of that operation target. When the operation target is a “menu”, it traces the parent menu hierarchy and acquires the image corresponding to the parent menu name. When the operation target is a “button”, it specifies, as the parent-level operation target, the image containing the title of the “window” in which the “button” is drawn. The number of levels traced is not limited to one; further parent levels may be traced recursively as necessary.
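Tracing the parent levels can be sketched as a walk up a linked structure. In the patent the parents are found from screen images; here the hierarchy is assumed to be already available as objects, and the `Target` class and its fields are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Target:
    """Hypothetical operation target with a link to its parent level."""
    title: str
    parent: Optional["Target"] = None


def ancestor_titles(target: Target) -> List[str]:
    """Collect the titles of all parent levels, outermost first."""
    titles = []
    node = target.parent
    while node is not None:  # keep tracing until no further parent exists
        titles.append(node.title)
        node = node.parent
    titles.reverse()
    return titles


file_menu = Target("File")
save_item = Target("Save", parent=file_menu)
print(ancestor_titles(save_item))
```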
[0048]
The title recognizing unit 6-2 notifies the character recognizing unit 5-2 of the image acquired by the upper layer target specifying unit 6-1 and inquires whether a character string candidate exists in the image. If it exists, the character string is recognized as an operation target title.
[0049]
The description synthesizing unit 4 combines the operation target acquired by the operation target specifying unit 2, the operation type acquired by the operation type specifying unit 3, and the parent operation target acquired by the upper layer specifying unit 6 to put the operation history into sentences.
[0050]
When the operation target is a menu, the sentence describes where the operated menu sits in the menu hierarchy. For example, the parent menu level is written alongside it, as in “Select [File]-[Save]”.
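Producing a sentence of the form shown in the menu example is a small formatting step. The function name and the fixed verb below are assumptions for illustration.

```python
from typing import Sequence


def describe_menu_selection(path: Sequence[str]) -> str:
    """Join menu titles from the outermost parent down to the selected item."""
    return "Select " + "-".join(f"[{name}]" for name in path)


print(describe_menu_selection(["File", "Save"]))
```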
[0051]
When the operation target is a button, the sentence describes its relative position, based on the relative coordinates at which the button is placed within the window that is its parent operation target. For example, the relative position is written alongside it, as in “Press the OK button at the lower right of the print window”.
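One way to turn relative coordinates into a phrase such as "lower right" is to divide the parent window into a 3x3 grid and name the cell containing the button's center. The grid, the wording of each cell, and the function names are assumptions; the patent does not specify how the phrase is derived.

```python
from typing import Tuple

Box = Tuple[int, int, int, int]  # (left, top, right, bottom) in pixels


def relative_position(button: Box, window: Box) -> str:
    """Name the 3x3 grid cell of the window that holds the button's center.

    Assumes the window has positive width and height.
    """
    b_left, b_top, b_right, b_bottom = button
    w_left, w_top, w_right, w_bottom = window
    # Button center expressed as a fraction of the window size.
    fx = ((b_left + b_right) / 2 - w_left) / (w_right - w_left)
    fy = ((b_top + b_bottom) / 2 - w_top) / (w_bottom - w_top)
    horiz = "left" if fx < 1 / 3 else "right" if fx > 2 / 3 else "center"
    vert = "upper" if fy < 1 / 3 else "lower" if fy > 2 / 3 else "middle"
    if vert == "middle" and horiz == "center":
        return "center"
    return f"{vert} {horiz}"


def describe_button_press(title: str, window_title: str,
                          button: Box, window: Box) -> str:
    position = relative_position(button, window)
    return f"Press the {title} at the {position} of the {window_title}."


print(describe_button_press("OK button", "print window",
                            (220, 160, 280, 190), (0, 0, 300, 200)))
```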
[0052]
Furthermore, an image that shows the operation target and the operation type may be extracted and attached to the sentence. One way to produce the sentence is to prepare a sentence template and embed the acquired operation target and operation type in it.
[0053]
FIG. 9 is a flowchart of the operation writing method based on FIG. 7. Steps other than S3-1, S3-2, and S2-6 ′ are the same as in the second embodiment, so their description is omitted below.
[0054]
(S3-1) When the operation target specified by the operation target specifying unit 2 ′ has a hierarchical structure, the upper layer target specifying unit 6-1 specifies the operation target at the parent level of that operation target. The title recognition unit 6-2 then passes the image acquired by the upper layer target specifying unit 6-1 to the character recognition unit 5-2 and inquires whether a character string candidate exists in the image. If one exists, the character string is recognized as the title of the operation target. The process then proceeds to S3-2.
(S3-2) The upper layer target specifying unit 6-1 determines whether a further parent operation target exists at a higher level. If one exists, the process returns to S3-1. If not, the process proceeds to S2-5.
(S2-6 ′) The description synthesizing unit 4 combines the title of the operation target acquired by the operation target specifying unit 2 ′, the operation type acquired by the operation type specifying unit 3, and the parent operation target acquired by the upper layer specifying unit 6 to put the operation history into sentences.
[0055]
[Effects of the invention]
As described above, the operation writing method, the program, and the recording medium recording the program according to the present invention are general-purpose and can acquire a detailed operation history without burdening the user. In particular, an operation history in which the content of the operation target is easy to identify can be described by a general-purpose method that does not depend on a platform such as an application or an OS. An operation history in which the position of the operation target is easy to identify can also be described.
[Brief description of the drawings]
FIG. 1 is a system configuration diagram of a first embodiment according to the present invention.
FIG. 2 is an example of display information displayed on a display.
FIG. 3 is an example of display information displayed on a display.
FIG. 4 is a flowchart of a writing method based on FIG. 1.
FIG. 5 is a system configuration diagram of a second embodiment according to the present invention.
FIG. 6 is a flowchart of a writing method based on FIG. 5.
FIG. 7 is a system configuration diagram of a third embodiment according to the present invention.
FIG. 8 is an example of display information displayed on a display.
FIG. 9 is a flowchart of a writing method based on FIG. 7.
[Explanation of symbols]
1 Operation monitoring unit
2 Operation target specifying unit
2-1 Surrounding image acquisition unit
2-2 Target type identification unit
2-3 Title recognition unit
3 Operation type specifying unit
4 Description synthesizing unit
5 Image recognition unit
5-1 Shape recognition unit
5-2 Character recognition unit
6 Upper layer specifying unit
6-1 Upper layer target specifying unit
6-2 Title recognition unit

Claims

An operation writing method in a processing device that creates a sentence describing a user's operation process, using a user interface device comprising an operation means operated by the user and a display means for displaying image information including an operation target, the method comprising:
a first step of monitoring an operation event in the operation means;
a second step of specifying the operation target and the operation type from the image information displayed on the display means when the operation event occurs; and
a third step of creating a sentence that combines a sentence indicating the operation target and a sentence indicating the operation type,
wherein specifying the operation target includes:
a step of acquiring surrounding image information, which is the image information around the coordinates on the screen indicated by the operation means when the operation event occurs;
a step of comparing the acquired surrounding image information with partial image information stored in advance for each operation target, and searching for the partial image information that matches or has the highest similarity; and
a step of recognizing a character string in the retrieved partial image information, or retrieving document information associated with the retrieved partial image information based on the document information associated with the stored partial image information.
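
The target identification steps recited above (acquire the surrounding image, compare it with the stored partial images, look up the associated document information) can be sketched as follows. This is a minimal illustration, not the claimed implementation: the nested-list grayscale image representation, the mean absolute difference similarity, and all function names are assumptions, and character recognition is replaced by looking up a title stored with each partial image.

```python
from typing import List, Sequence, Tuple

Image = List[List[int]]  # assumed representation: grayscale pixels 0-255, row-major


def crop_around(screen: Image, x: int, y: int, half: int) -> Image:
    """Return the square of side 2*half+1 centred on (x, y), clipped to the screen."""
    top, bottom = max(0, y - half), min(len(screen), y + half + 1)
    left, right = max(0, x - half), min(len(screen[0]), x + half + 1)
    return [row[left:right] for row in screen[top:bottom]]


def similarity(a: Image, b: Image) -> float:
    """1.0 for identical images, lower as pixels differ, 0.0 if the shapes differ."""
    if len(a) != len(b) or any(len(ra) != len(rb) for ra, rb in zip(a, b)):
        return 0.0
    count = sum(len(row) for row in a)
    if count == 0:
        return 0.0
    diff = sum(abs(pa - pb) for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))
    return 1.0 - diff / (255.0 * count)


def identify_target(surrounding: Image, stored: Sequence[Tuple[Image, str]]) -> str:
    """Return the document information (here a title) of the most similar stored image."""
    # max() raises ValueError when no partial images have been stored.
    _, best_title = max(stored, key=lambda entry: similarity(surrounding, entry[0]))
    return best_title
```

A practical system would more likely use an image library for the comparison and an OCR engine for the character string; the pure-Python version is chosen here only so that the sketch stays self-contained.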

The writing method according to claim 1, wherein, when the operation targets have a hierarchical structure,
the second step specifies an upper operation target of the operation target, and
the third step creates a sentence that further combines a sentence indicating the upper operation target.

The writing method according to claim 2, wherein the third step creates a sentence that further combines a sentence indicating the relative position of the partial image information of the operation target with respect to the partial image information of the upper operation target.
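
One way to turn the relative position of the operation target within the upper operation target into words is sketched below. The bounding-box representation, the 3x3 grid, and the English labels are assumptions made for illustration; the source does not specify how the relative position is expressed.

```python
from typing import Tuple

Box = Tuple[int, int, int, int]  # assumed: (left, top, width, height) in screen pixels


def relative_position(target: Box, parent: Box) -> str:
    """Describe where the centre of `target` lies within `parent` on a 3x3 grid."""
    parent_left, parent_top, parent_width, parent_height = parent
    if parent_width <= 0 or parent_height <= 0:
        raise ValueError("parent box must have a positive size")
    centre_x = target[0] + target[2] / 2
    centre_y = target[1] + target[3] / 2
    # Map the centre to a column and row index in 0..2, clamping points outside the parent.
    col = min(2, max(0, int((centre_x - parent_left) * 3 / parent_width)))
    row = min(2, max(0, int((centre_y - parent_top) * 3 / parent_height)))
    return f'{("top", "middle", "bottom")[row]} {("left", "center", "right")[col]}'
```

The returned phrase can then be combined with the sentences for the operation target and the upper operation target, for example as "the button at the top left of the dialog".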

An operation writing program for creating a sentence describing a user's operation process, using a user interface device comprising an operation means operated by the user and a display means for displaying image information including an operation target, the program causing a computer to execute:
a first step of monitoring an operation event in the operation means;
a second step of specifying the operation target and the operation type from the image information displayed on the display means when the operation event occurs; and
a third step of creating a sentence that combines a sentence indicating the operation target and a sentence indicating the operation type,
wherein, to specify the operation target, the program causes the computer to execute:
a step of acquiring surrounding image information, which is the image information around the coordinates on the screen indicated by the operation means when the operation event occurs;
a step of comparing the acquired surrounding image information with partial image information stored in advance for each operation target, and searching for the partial image information that matches or has the highest similarity; and
a step of recognizing a character string in the retrieved partial image information, or retrieving document information associated with the retrieved partial image information based on the document information associated with the stored partial image information.

The operation writing program according to claim 4, wherein, when the operation targets have a hierarchical structure,
the second step specifies an upper operation target of the operation target, and
the third step causes the computer to create a sentence that further combines a sentence indicating the upper operation target.

The operation writing program according to claim 5, wherein the third step causes the computer to create a sentence that further combines a sentence indicating the relative position of the partial image information of the operation target with respect to the partial image information of the upper operation target.

A recording medium on which the program according to any one of claims 4 to 6 is recorded.