JP2011034548A - System and method for acquiring handwritten pattern

Info

Publication number: JP2011034548A
Application number: JP2010026729A
Authority: JP
Inventors: Seiichi Uchida; 誠一内田; Masakazu Iwamura; 雅一岩村; Shinichiro Omachi; 真一郎大町; Koichi Kise; 浩一黄瀬; Kazumasa Iwata; 和将岩田
Original assignee: Osaka University NUC; Osaka Prefecture University
Current assignee: Osaka University NUC; Osaka Prefecture University
Priority date: 2009-07-10
Filing date: 2010-02-09
Publication date: 2011-02-17

Abstract

PROBLEM TO BE SOLVED: To provide a method capable of taking a content handwritten on paper into a database without needing special paper.

SOLUTION: The handwritten pattern acquisition system includes a pen unit 1 having a pen tip and a moving image camera 3 for imaging the vicinity thereof, and a processing unit which determines a trace of the moving pen tip between a series of frames of taken moving image. In the pen unit, a paper fingerprint that is an irregular pattern formed on a surface of paper by paper making is taken by the camera. The processing unit includes an extraction processing unit which extracts a plurality of feature points each showing a local feature of the paper fingerprint on each frame image; a correspondence processing unit which determines each corresponding feature point between sequential frame images; and a trace processing unit which determines a position change of the pen tip to the paper surface based on a position change of each corresponding feature point in the sequential frames to determine a handwritten pattern as the trace of the pen tip.

COPYRIGHT: (C)2011, JPO&INPIT

Description

The present invention relates to a handwritten pattern acquisition system and acquisition method that acquire a handwritten pattern in real time using a pen equipped with a camera.

Handwriting with "paper and pen" (handwritten content) is the oldest medium humanity has for generating and recording information. Despite calls to go paperless and a series of innovations in information technology, many people still prefer paper pocket notebooks, and many still take lecture notes in paper notebooks. If such handwritten content could be digitized and taken into a database, it would become easy to store, search, and duplicate, and a life log that preserves everyday handwriting, a "writing-life-log", could be realized.

Based on this idea, several devices that acquire handwritten patterns in real time have already been proposed. One example is the digital pen developed by Anoto of Sweden, product name Anoto penDocuments (hereinafter, for simplicity, the Anoto pen or the Anoto system) (see, for example, [retrieved February 12, 2009] Internet <URL: http://www.anoto.com/>). The Anoto system reads a fine dot pattern pre-printed on the paper with a camera in the pen, determines which position on which sheet is being written on, and stores the trajectory of the pen tip (the handwriting). If the paper is associated with a document when the document is printed on it, the handwriting can also be placed correctly on the document.

More specifically, the Anoto system consists of dedicated paper printed with a dot pattern ("ANOTO paper") and an "ANOTO pen" with a built-in miniature camera and Bluetooth communication. Handwriting information is obtained by recognizing the dot pattern on the dedicated paper with the pen's camera. The dots are laid out on a grid at a pitch of about 0.3 mm, each displaced slightly from its nominal position on the orthogonal grid. The displacement is in one of four directions (up, down, left, or right), and the pattern of displacements is read by the camera built into the pen. The camera reads 6 × 6 = 36 dots at a time, and the combination of the 36 displacements yields distinct position information. This gives 2⁷² combinations, enough to identify a single point within an area about as large as the Eurasian continent. Combinations of dot patterns can also give the paper various functions. For example, creating a region that sends the accumulated data to a PC or the like would make data transmission easy.
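The figure of 2⁷² follows directly from the numbers given above: each of the 36 dots read at once is displaced in one of four directions, so the number of distinct displacement patterns is

```latex
4^{36} = \left(2^{2}\right)^{36} = 2^{72} \approx 4.7 \times 10^{21}
```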

Another example of a camera-equipped pen is PaperLink by Arai et al. (see, for example, Non-Patent Document 1). It combines a small pen-type camera with a fluorescent highlighter and lets a paper document be handled like hypertext. Specifically, the user traces over the paper with the highlighter while holding down a button on the pen, and the camera clips out the pattern of that portion. If an action is defined for the clipped region, that action is triggered later by photographing the region.

Iwata et al. have also proposed a method that recovers handwriting and places it on a document by using document image retrieval, a technique for retrieving documents and/or images recorded on paper (hereinafter, document images) (see, for example, Non-Patent Document 3).

To actually obtain the handwritten content from the images captured by the camera, the whole pattern must be reconstructed from the fragments of handwriting captured in the individual frames of the video. The technique used for this is known as video mosaicing (see, for example, Non-Patent Document 2). Specifically, correspondences between feature points are found between two adjacent frames, the current frame image and the immediately preceding one, and the change in pose between the frames is computed from them. Doing this for every pair of adjacent frames gives the sequence of pose changes over all frames. Stitching the frame images together using that sequence then produces a single image that reconstructs the whole handwritten pattern.
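Purely as an illustration, the accumulation of pairwise pose changes into a pose for every frame can be sketched as follows. The sketch is in Python and is not taken from the cited literature. It assumes that each pose change is a 3×3 homogeneous matrix mapping coordinates of frame t into frame t-1, and it uses pure translations as made-up input; the function names are hypothetical.

```python
def matmul3(a, b):
    """Multiply two 3x3 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def translation(dx, dy):
    """Homogeneous matrix for a pure translation (the simplest pose change)."""
    return [[1.0, 0.0, dx], [0.0, 1.0, dy], [0.0, 0.0, 1.0]]

def accumulate_poses(pairwise):
    """Return, for every frame, the transform into the coordinates of frame 0.

    pairwise[t-1] is assumed to map frame t into frame t-1, so the pose of
    frame t is the product of all pairwise transforms up to t.
    """
    poses = [translation(0.0, 0.0)]  # frame 0 is the reference
    for h in pairwise:
        poses.append(matmul3(poses[-1], h))
    return poses

# Made-up pose changes between four consecutive frames.
pairwise = [translation(2.0, 0.0), translation(1.0, -1.0), translation(0.0, 3.0)]
poses = accumulate_poses(pairwise)

# Where the origin of each frame lands in the coordinates of frame 0.
origins = [(p[0][2], p[1][2]) for p in poses]
print(origins)
```

With these poses, each frame image can be pasted into a common coordinate system, which is the step that yields the mosaic.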

[Non-Patent Document 1] T. Arai, D. Aust, S. E. Hudson, "PaperLink: a technique for hyperlinking from real paper to electronic content," Proc. ACM Conf. Human Factors in Computing Systems (CHI'97), pp. 327-334, 1997.
[Non-Patent Document 2] M. Irani and P. Anandan, "Video indexing based on mosaic representations," Proc. IEEE, vol. 86, no. 5, pp. 905-921, 1998.
[Non-Patent Document 3] Kazumasa Iwata, Koichi Kise, Tomohiro Nakai, Masakazu Iwamura, Seiichi Uchida, Shinichiro Omachi, "Capturing Digital Ink as Retrieving Fragments of Document Images," Proceedings of the 10th International Conference on Document Analysis and Recognition (ICDAR2009), pp. 1236-1240, (2009-7).

As described above, if handwritten patterns can be automatically registered in a database as images, the writing-life-log described earlier can be realized.
The Anoto system, however, cannot use paper that has no special dot pattern printed on it, and this reduces its convenience. In other words, the need for dedicated paper is a drawback. The dot pattern can be printed on plain paper, but because the dots are placed about every 0.3 mm, a dedicated high-precision printer is required. Buying the paper as ready-made notebooks is expensive, and the ways of obtaining them are limited. For these reasons, using the Anoto system day to day takes money and effort.

What is desired is a camera pen system that provides the same functions without a special dot pattern, namely recovery of handwriting on blank paper and placement of handwriting on a document.
Moreover, PaperLink and the method of Iwata et al. described above presuppose that a document or the like is already recorded on the paper being traced with the highlighter.

The present invention has been made in view of the above circumstances. The first problem it addresses is to provide a technique that can take handwriting written on paper into a database as content without requiring special paper.

The second problem it addresses is to provide a technique that, by skillfully integrating the technique that solves the first problem (obtaining handwriting on paper) with a technique that places handwriting on a document using document image retrieval, can recover handwriting on blank paper and, when writing is done on paper on which a document image has been recorded beforehand, can obtain the writing position relative to that document image.

To solve the first problem, the present invention focuses on the paper fingerprint when detecting feature points in each frame image. A paper fingerprint is the pattern of surface irregularities, arising from the fine structure of the paper, that can be observed on the surface of a sheet made by papermaking. A close-up of the paper surface shows that the entangled plant fibers forming the paper produce a random pattern. If, for example, corner points or edge points in this pattern can be detected, they can be used as feature points for establishing correspondences. Feature points that occur naturally on the paper surface in this way have seldom been used before, so their use is one of the characteristics of this invention.

The name "paper fingerprint" comes from a technology that verifies the identity of a sheet of paper from its pattern. For example, in XAYA, the paper fingerprint matching technology developed by Fuji Xerox (see, for example, [retrieved February 12, 2009] Internet <URL: http://www.fujixerox.co.jp/company/technical/xaya/>), the pattern of the paper surface, that is, the paper fingerprint, is read optically with a scanner and recorded in a database or the like, and at identification time it is matched against the paper fingerprint of the input paper image. The technology is said to identify sheets with high accuracy.

As a first invention that solves the first problem described above, the present invention provides a handwritten pattern acquisition system comprising: a pen unit having a pen tip and a video camera that photographs the vicinity of the pen tip; and a processing unit that obtains the trajectory of the pen tip as it moves across the series of frames of the captured video, wherein, in the pen unit, the camera photographs a paper fingerprint, which is the pattern of irregularities formed on the paper surface when the paper was made, and the processing unit comprises an extraction processing unit that extracts a plurality of feature points each representing a local feature of the paper fingerprint captured in each frame image, a correspondence processing unit that determines the feature points that correspond between consecutive frame images, and a trajectory processing unit that determines the change in position of the pen tip relative to the paper surface from the change in position of each corresponding feature point between consecutive frames and obtains the handwritten pattern as the trajectory of the pen tip.
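The role of the trajectory processing unit can be illustrated with a minimal Python sketch. It is an assumption-laden simplification and not the disclosed implementation: the matched feature points are taken as given, the motion between frames is assumed to be a pure translation in image units, and the names `mean_shift` and `pen_trajectory` are hypothetical.

```python
def mean_shift(matches):
    """Average displacement of matched feature points between two frames.

    matches: list of ((x_prev, y_prev), (x_next, y_next)) pairs.
    """
    n = len(matches)
    dx = sum(q[0] - p[0] for p, q in matches) / n
    dy = sum(q[1] - p[1] for p, q in matches) / n
    return dx, dy

def pen_trajectory(frame_matches, start=(0.0, 0.0)):
    """Integrate per-frame shifts into a pen tip path on the paper.

    The camera is fixed to the pen, so the paper texture appears to move in
    the direction opposite to the pen tip; hence the sign flip.
    """
    x, y = start
    path = [(x, y)]
    for matches in frame_matches:
        dx, dy = mean_shift(matches)
        x, y = x - dx, y - dy
        path.append((x, y))
    return path

# Made-up matches for two frame transitions.
frame_matches = [
    [((10.0, 10.0), (8.0, 10.0)), ((20.0, 5.0), (18.0, 5.0))],  # texture shifts by (-2, 0)
    [((8.0, 10.0), (8.0, 7.0)), ((18.0, 5.0), (18.0, 2.0))],    # texture shifts by (0, -3)
]
path = pen_trajectory(frame_matches)
print(path)
```

A real pen also tilts and rotates, so a translation-only model is at best a first approximation of the position change.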

From a different viewpoint, the first invention provides a handwritten pattern acquisition method comprising: a step in which, when handwriting is done on paper using a pen unit having a pen tip and a video camera that photographs the vicinity of the pen tip, the video camera photographs the paper surface near the pen tip; and a step in which a processing unit obtains the trajectory of the pen tip as it moves across the series of frames of the captured video, wherein the camera photographs a paper fingerprint, which is the pattern of irregularities formed on the paper surface when the paper was made, and the processing unit extracts a plurality of feature points each representing a local feature of the paper fingerprint captured in each frame image, determines the feature points that correspond between consecutive frame images, determines the change in position of the pen tip relative to the paper surface from the change in position of each corresponding feature point between consecutive frames, and obtains the handwritten pattern as the trajectory of the pen tip.

To solve the second problem, the present invention integrates the first invention, which solves the first problem, with a technique that obtains a position on a document using document image retrieval. The greatest obstacle to this integration is the difference in the field of view each needs in order to work. The technique of the first invention requires a high-resolution image of a narrow area, whereas the method of Iwata et al. needs an image of a wide area containing many characters. One could use a wide-angle camera to obtain a wide-area image while shooting close up, but the geometric distortion of a wide-angle camera is a serious problem, and an expensive camera would be needed to meet the requirements on resolution, distortion, chromatic aberration, and so on. After repeated study in search of a different solution to these conflicting requirements, the inventors found that they can be reconciled by image mosaicing. That is, while a narrow area is photographed and the handwriting is recovered, the images are mosaiced to build up a progressively larger image. Once an image of sufficient size has been obtained, document image retrieval is used to find the position of the handwriting within the document. As described later, the inventors evaluated the effectiveness of this technique for solving the second problem through experiments with a prototype system.
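The idea of growing a mosaic until it is large enough for retrieval can be sketched in Python as follows. This is only an illustration under stated assumptions: each frame is reduced to a list of feature point coordinates plus a known offset in mosaic coordinates (a pure translation), and the size thresholds and function names are hypothetical. The retrieval step itself is not shown.

```python
def to_mosaic(points, offset):
    """Map frame-local points into mosaic coordinates (translation only)."""
    return [(x + offset[0], y + offset[1]) for x, y in points]

def extent(points):
    """Width and height of the bounding box of a set of points."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    return max(xs) - min(xs), max(ys) - min(ys)

def frames_until_large_enough(frames, min_width, min_height):
    """Add frames one by one until the mosaic covers the required area.

    frames: list of (offset, points) pairs.
    Returns (number of frames used, mosaic points), or (None, mosaic points)
    if the mosaic never reaches the required size.
    """
    mosaic = []
    for count, (offset, points) in enumerate(frames, start=1):
        mosaic.extend(to_mosaic(points, offset))
        width, height = extent(mosaic)
        if width >= min_width and height >= min_height:
            return count, mosaic
    return None, mosaic

# Made-up narrow frames captured while the pen moves to the right.
frames = [
    ((0.0, 0.0), [(0.0, 0.0), (10.0, 8.0)]),
    ((8.0, 0.0), [(1.0, 1.0), (10.0, 8.0)]),
    ((16.0, 2.0), [(0.0, 0.0), (10.0, 8.0)]),
]
count, mosaic = frames_until_large_enough(frames, min_width=25.0, min_height=10.0)
print(count, extent(mosaic))
```

Only once the threshold is met would the accumulated feature points be handed to document image retrieval.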

As described above, as a second invention that solves the second problem, the present invention also provides a handwritten pattern acquisition system comprising: a pen unit that has a pen tip and a video camera photographing its vicinity and that photographs, with the video camera, a paper fingerprint, which is the pattern of irregularities formed on the paper surface when the paper was made, and a document image recorded on the paper surface; a first trajectory processing unit that extracts local features of the paper fingerprint from each frame of the captured video as paper-surface feature points and obtains the trajectory of the pen tip relative to the paper surface from the amount of movement of corresponding paper-surface feature points between consecutive frames; and a second trajectory processing unit that, when a document image appears in the frames, extracts local features of the document image as document image feature points and obtains the position of the trajectory relative to the document image from the amount of movement of corresponding document image feature points between consecutive frames, wherein the second trajectory processing unit combines the frames so that corresponding document image feature points coincide, determines the arrangement of document image feature points over an area wider than one frame, and obtains the position of the trajectory relative to the document image from that arrangement.

In the handwritten pattern acquisition system according to the first invention, the camera of the pen unit photographs a paper fingerprint, which is the pattern of irregularities formed on the paper surface when the paper was made, and the processing unit comprises an extraction processing unit that extracts a plurality of feature points each representing a local feature of the paper fingerprint captured in each frame image, a correspondence processing unit that determines the feature points that correspond between consecutive frame images, and a trajectory processing unit that determines the change in position of the pen tip relative to the paper surface from the change in position of each corresponding feature point between consecutive frames and obtains the handwritten pattern as the trajectory of the pen tip. Content handwritten on paper can therefore be taken into a database without requiring special paper. In addition, because the trajectory is not determined from the handwritten pattern itself, the technique is robust to occlusion of the handwritten pattern. It also has the notable advantage of avoiding the so-called aperture problem, in which the handwritten pattern seen in successive frame images does not change even though the pen tip moves.
The handwritten pattern acquisition method according to the first invention has the same advantages.

In the handwritten pattern acquisition system according to the second invention, the second trajectory processing unit combines the frames so that corresponding document image feature points coincide, determines the arrangement of document image feature points over an area wider than one frame, and obtains the position of the trajectory relative to the document image from that arrangement. A single video camera can therefore serve both for photographing the paper fingerprint and for photographing the document image, which calls for a wider area. Consequently, the trajectory of the pen tip can be acquired even on blank paper without any special processing of the paper surface, and when marks, annotations, or the like are written on a document image already recorded on the paper, the position of the trajectory relative to that document image can be obtained.

In the present invention, the pen unit is used as an actual writing instrument to draw letters, numerals, figures, patterns, and the like on ordinary paper; that is, it is a ballpoint pen, pencil, felt-tip pen, or the like. The video camera of the pen unit is preferably a small camera of the kind built into mobile phones or computers, for example, but as long as it can capture time-series frame images there is no particular limitation on its size, image sensor, and so on.

The processing unit of the first invention and the first and second trajectory processing units of the second invention may be integrated with the pen or separate from it. In the integrated case, the pen has a storage device that stores the handwritten pattern data obtained, and it may communicate with a host computer, preferably wirelessly or else by wire, to transfer the stored data to the host. Alternatively, it may be configured so that the data can be copied or moved to the host via a medium such as a standardized memory card. The processing unit may consist of hardware built mainly around a microcomputer and memory, with its functions realized by the microcomputer executing a predetermined control program.

When the pen unit and the processing unit are separate, the pen unit may be configured to communicate with the processing unit, preferably wirelessly or else by wire. The processing unit may consist of hardware built mainly around a microcomputer and memory, with its functions realized by the microcomputer executing a predetermined control program. Alternatively, its functions may be realized by a CPU executing a predetermined application program on a general-purpose information processing device such as a personal computer.

A handwritten pattern acquired by the handwritten pattern acquisition system of the first and/or second invention may be accumulated in an external database as pattern data and used as such, or it may, for example, be converted into character data by known character recognition processing outside this system and used as character data.

FIG. 1 is an explanatory diagram showing the concept of realizing a writing-life-log with this invention.
FIG. 2 is an example of an image actually captured by the pen tip camera in this embodiment.
FIG. 3 is an image showing an example of a paper fingerprint according to this invention.
FIG. 4 is an explanatory diagram showing the result of actually determining point correspondences between adjacent frame images in an embodiment of this invention.
FIG. 5 shows the frame sequence of pen tip images captured when the numeral "2" was written with the camera-equipped pen according to this invention.
FIG. 6 is an explanatory diagram showing the video mosaic result based on the frame images of FIG. 5.
FIG. 7 is an image obtained by scanning, with an image scanner, the "2" actually written in the frame sequence of FIG. 5.
FIG. 8 is a graph showing the number of SURF feature point pairs in each frame of FIG. 5.
FIG. 9 is an explanatory diagram showing the video mosaic result when a handwritten "2" different from that of FIG. 7 (the handwritten character of FIG. 10) was written.
FIG. 10 is an image obtained by scanning, with an image scanner, a handwritten "2" different from that of FIG. 7.
FIG. 11 is a graph showing the number of feature point correspondences in each frame for the handwritten character of FIG. 9.
FIG. 12 is an explanatory diagram showing the result of extracting, from blank paper, the SURF feature points of the paper fingerprint used in this invention.
FIG. 13 is an explanatory diagram visualizing the correspondence of SURF feature points between two frames in this invention.
FIG. 14 is an explanatory diagram showing LLAH feature points extracted from a mosaic image according to the second invention.
FIG. 15 is an explanatory diagram showing the appearance of the camera pen used in the embodiment according to the second invention.
FIG. 16 is an explanatory diagram showing the flow of processing according to the second invention.
FIG. 17 is a graph showing the result of an experiment, in Experimental Example 2 according to the second invention, that evaluated the handwriting recovered from 50 writings made in the document area.
FIG. 18 is an explanatory diagram showing an example, in Experimental Example 2, in which the recovered handwriting is rotated.
FIG. 19 is an explanatory diagram showing the recovered handwriting, in Experimental Example 2, when writing was done in a margin of the document.
FIG. 20 is an explanatory diagram showing the recovered handwriting, in Experimental Example 2, when writing was done widely across the document area.
FIG. 21 is an explanatory diagram showing the recovered handwriting, in Experimental Example 2, when writing was done in a small area.

Preferred embodiments of the present invention are described below.
The extraction processing unit may judge feature points whose positions change between different frame images to be feature points of the paper surface and feature points whose positions do not change to be feature points of the pen tip portion, and may refrain from extracting feature points from the pen tip portion captured in each frame image. Because no feature points are then extracted from the pen tip portion, the extraction of feature points that would act as noise when determining the change in position relative to the paper surface is suppressed, and the correspondences between feature points can be determined more accurately.
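The distinction between paper-surface points and pen tip points can be illustrated with a short Python sketch. It assumes, for illustration only, that each feature point has already been tracked over several frames, and it uses a hypothetical pixel threshold `min_motion`; neither the data layout nor the threshold comes from the description.

```python
def split_by_motion(tracks, min_motion=1.0):
    """Separate feature tracks into paper points and pen tip points.

    tracks: dict mapping a feature id to its (x, y) positions over frames.
    A track whose positions stay within min_motion pixels is treated as part
    of the pen tip, which is fixed relative to the camera.
    """
    paper, pen = [], []
    for fid, pts in tracks.items():
        xs = [p[0] for p in pts]
        ys = [p[1] for p in pts]
        spread = max(max(xs) - min(xs), max(ys) - min(ys))
        (paper if spread >= min_motion else pen).append(fid)
    return sorted(paper), sorted(pen)

# Made-up tracks over three frames.
tracks = {
    "a": [(5.0, 5.0), (7.0, 5.5), (9.0, 6.0)],        # moves: paper texture
    "b": [(40.0, 60.0), (40.1, 60.0), (40.0, 59.9)],  # nearly still: pen tip
    "c": [(12.0, 30.0), (14.0, 30.4), (16.1, 31.0)],  # moves: paper texture
}
paper_ids, pen_ids = split_by_motion(tracks)
print(paper_ids, pen_ids)
```

Such a test only separates the two groups while the pen is actually moving; during a pause every point would look stationary.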

The correspondence processing unit may also refrain from establishing feature point correspondences in a region within a predetermined range of the pen tip in the later frame. Feature points are then not taken from the handwritten pattern portion, for which no corresponding feature points exist in the earlier frame, so the extraction of feature points that could act as noise when determining the change in position relative to the paper surface is suppressed, and the correspondences between feature points can be determined more accurately.
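A minimal Python sketch of such an exclusion region follows. It assumes a circular region around a known pen tip position in the image; the shape of the region, the radius, and the function name are illustrative assumptions, since the description only speaks of "a predetermined range".

```python
import math

def outside_pen_region(points, pen_tip, radius):
    """Keep only the points farther than `radius` pixels from the pen tip."""
    return [p for p in points
            if math.hypot(p[0] - pen_tip[0], p[1] - pen_tip[1]) > radius]

# Made-up feature points in the later frame; the pen tip is at (50, 50).
points = [(100.0, 100.0), (52.0, 48.0), (10.0, 80.0), (60.0, 60.0)]
kept = outside_pen_region(points, pen_tip=(50.0, 50.0), radius=20.0)
print(kept)
```

The points dropped here are the ones most likely to lie on freshly written ink, which has no counterpart in the earlier frame.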

Furthermore, when making its determination, the correspondence processing unit may correct the projective distortion of the paper surface captured in each frame image and then determine the change in position of corresponding feature points between the consecutive frame images. In this way, even if the pen unit is tilted relative to the paper during handwriting, the resulting projective distortion is corrected, so the position of the trajectory can be determined more accurately than without correction.
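The effect of correcting projective distortion before measuring displacement can be sketched in Python as below. The 3×3 rectifying matrices are assumed to be given (how they are estimated is not shown), the numeric values are made up, and the function names are hypothetical.

```python
def apply_homography(h, point):
    """Apply a 3x3 projective transformation to a 2D point."""
    x, y = point
    u = h[0][0] * x + h[0][1] * y + h[0][2]
    v = h[1][0] * x + h[1][1] * y + h[1][2]
    w = h[2][0] * x + h[2][1] * y + h[2][2]
    return (u / w, v / w)  # perspective division

def corrected_shift(h_prev, h_next, p_prev, p_next):
    """Displacement of one matched point after both frames are rectified."""
    a = apply_homography(h_prev, p_prev)
    b = apply_homography(h_next, p_next)
    return (b[0] - a[0], b[1] - a[1])

# A made-up rectifying matrix, used for both frames in this example.
h = [[1.0, 0.0, 0.0],
     [0.0, 1.0, 0.0],
     [0.0, 0.5, 1.0]]

raw = (6.0 - 4.0, 2.0 - 2.0)                              # shift measured in the image
fixed = corrected_shift(h, h, (4.0, 2.0), (6.0, 2.0))     # shift after rectification
print(raw, fixed)
```

In this example the uncorrected shift is twice the rectified one, which shows how a tilted pen can distort the measured motion.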

The correspondence processing unit may calculate a projective transformation matrix for each frame image in order to correct the projective distortion, and may exclude from the frames used to obtain the pen tip trajectory any frame whose matrix elements are discontinuous with those of the projective transformation matrix of the preceding and/or following frame. Frames that would make the pen tip trajectory discontinuous are thereby removed as noise, so the handwritten pattern can be extracted more appropriately than without this processing.
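One possible reading of this frame rejection is sketched below in Python. The rule used here (keep a frame if its matrix is within a tolerance of at least one neighbouring frame's matrix, element by element) and the tolerance value are assumptions made for illustration; the description does not fix a specific criterion.

```python
def shift_matrix(dx, dy):
    """A 3x3 matrix with a translation part, used as made-up test input."""
    return [[1.0, 0.0, dx], [0.0, 1.0, dy], [0.0, 0.0, 1.0]]

def max_abs_diff(a, b):
    """Largest element-wise absolute difference between two 3x3 matrices."""
    return max(abs(a[i][j] - b[i][j]) for i in range(3) for j in range(3))

def continuous_frames(mats, tol):
    """Indices of frames whose matrix is close to at least one neighbour.

    A frame that differs by more than tol from every existing neighbour is
    treated as discontinuous and dropped.
    """
    kept = []
    for i, m in enumerate(mats):
        neighbours = [mats[j] for j in (i - 1, i + 1) if 0 <= j < len(mats)]
        if not neighbours or any(max_abs_diff(m, n) <= tol for n in neighbours):
            kept.append(i)
    return kept

# Frame 2 has a matrix that jumps away from its neighbours.
mats = [shift_matrix(0.0, 0.0), shift_matrix(1.0, 0.0), shift_matrix(50.0, 0.0),
        shift_matrix(3.0, 0.0), shift_matrix(4.0, 0.0)]
kept = continuous_frames(mats, tol=5.0)
print(kept)
```

Note that with this particular rule an outlier next to the first or last frame would also cause that end frame to be dropped.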

The correspondence processing unit may also perform noise removal that discards incorrect correspondences. In this way, the position change between consecutive frame images can be determined more accurately than without this processing.

Furthermore, the extraction processing unit may use the SURF algorithm to extract feature points. Features extracted by SURF are invariant to rotation and scale changes and robust to illumination changes. SURF is also computationally lighter than SIFT, on which it is based, and is therefore better suited to video than SIFT. However, the method for extracting paper-surface feature points is not necessarily limited to SURF; other local features such as SIFT or PCA-SIFT could also be used.
SIFT, SURF, and similar methods combine the functions of a region detector and a feature descriptor, but the two functions may be separated and either or both replaced by another method. Applicable region detectors include Harris-Affine, Hessian-Affine, and MSER; applicable feature descriptors include shape context in addition to SIFT and the like.

In the second invention, the system may further include a document image search unit that searches a document image database, in which a plurality of document images are registered in association with document image feature points extracted from them, for the document image corresponding to the arrangement. The second trajectory processing unit may then determine the amount of geometric distortion of the document image captured in each frame, based on the correspondence between the arrangement and the document image feature points associated with the retrieved document image, and correct the trace using that amount. In this way, when the document image search succeeds, each frame can be projectively transformed onto a plane directly facing the paper surface, so the resulting handwriting has a smaller error than handwriting obtained from the paper fingerprint alone.

The first trajectory processing unit preferably uses the SURF algorithm to extract paper-surface feature points. However, the extraction method is not necessarily limited to SURF; other local features such as SIFT or PCA-SIFT could also be used.
SIFT, SURF, and similar methods combine the functions of a region detector and a feature descriptor, but the two functions may be separated and either or both replaced by another method. Applicable region detectors include Harris-Affine, Hessian-Affine, and MSER; applicable feature descriptors include shape context in addition to SIFT and the like.

The second trajectory processing unit may extract the centroids of connected components as document image feature points and represent each feature point using the LLAH algorithm. The present invention is not necessarily limited to LLAH; it is considered not impossible to implement it with features extracted by SURF or SIFT, using for the approximate nearest neighbor search a method such as that of Noguchi et al. referenced below. However, the features obtained with LLAH (LLAH features) are compact and need little storage, whereas features obtained with SURF, SIFT, and the like need a large amount of storage. Using such features is therefore expected to take longer to process and to enlarge the database compared with LLAH features. Accordingly, LLAH features are preferable.
The various preferred aspects described here may also be combined.

The present invention is described below in more detail with reference to the drawings. The following description is illustrative in all respects and should not be construed as limiting the invention.
First, an embodiment corresponding to the first invention is described.
This embodiment first sets out the principle for obtaining an entire handwritten pattern from a camera-equipped pen. It then describes feature point detection from the paper fingerprint, a method for estimating the pose change between frames using those feature points as clues, and a method for obtaining the whole pattern as a single image by video mosaicing. Experimental results are then presented, followed by future issues.

≪Overview of the configuration and processing of the camera-equipped pen≫
FIG. 1 is an explanatory diagram showing the concept of realizing a writing life log according to the present invention. FIG. 1(a) is a perspective view showing the schematic configuration of the camera-equipped pen according to this embodiment.
FIG. 1(b) is an explanatory diagram showing how handwritten content is captured into a database using video mosaicing. As shown in FIG. 1(a), this embodiment used a camera-equipped pen with an ultra-compact CCD camera (product name: Plumnet Handymini, model name: CCN3412Y) mounted at the pen tip. This pen-tip camera captures the handwritten pattern being written and the paper surface. Because the camera is fixed to the pen, the pen always appears at the same position in the field of view. FIG. 2 shows an example of an image actually captured by the pen-tip camera in this embodiment.

To obtain the entire handwritten content, the whole pattern must be restored from the fragmentary handwritten patterns captured in the individual frame images of the video, as shown in FIG. 1(b). This is what is called video mosaicing. In this embodiment, corresponding feature points in two adjacent frames, the frame being processed and the frame immediately before it, are detected using the known SIFT approach (more precisely SURF; for details of SURF see, for example, H. Bay, T. Tuytelaars, and L. V. Gool, "SURF: speeded up robust features," Proc. ECCV2006 (LNCS volume 3951), part 1, pp. 404-417, 2006). If feature points common to both frame images can be extracted, the positional change relative to the paper surface between the frames can be estimated from their correspondence. Specifically, a projective transformation matrix expressing the correspondence between the paper surface and each frame image, in which the paper surface appears with projective distortion, is obtained for every pair of adjacent frames. After the projective distortion of each frame image is corrected, the positional change on the paper surface is estimated. From these results, the positional change on the paper surface over the series of frames, that is, the trace of the pen-tip movement, can be obtained.

Various camera positions are conceivable. For example, a camera mounted near the tail of the pen could capture a wider character region. It might capture a whole character (or even the whole page) at once, and in that sense it has an advantage over a pen-tip camera, which requires mosaicing. However, a pen-tail camera can be occluded by the hand holding the pen. In addition, for applications that need the pen movement itself, motion estimation may be difficult at the pen tail, which swings with a large amplitude.

Thus a pen-mounted camera plays complementary roles depending on where it is mounted. Mounting the camera at the pen tip allows the vicinity of the handwritten pattern to be captured more reliably. Furthermore, because the paper fingerprint can be used effectively, as described later, detailed pen-tip movement can be estimated. The mounting position should be chosen to suit the purpose of the processing, bearing in mind that cameras at the pen tip and at the pen tail serve different roles.

≪Feature point detection from the paper fingerprint≫
The present invention focuses on the paper fingerprint as the reference for obtaining a position on the paper surface. As mentioned above, paper has a geometric pattern on its surface. FIG. 3 is an image showing an example of a paper fingerprint according to the present invention. It is an enlargement of part of FIG. 2, with the brightness adjusted and the contrast enhanced for visibility. If the positional relationship between frame images (that is, the amount and direction of inter-frame movement) can be determined from this paper fingerprint, the trace written on the paper can be obtained even when the handwritten pattern itself is not captured well. This approach is highly effective in the following two respects.

First, it avoids the problem of occlusion by the pen tip. For example, when the pen tip is visible as in FIG. 2 and the pen moves from right to left across the paper, the handwritten pattern is hidden by the pen tip and cannot be seen at all. Estimating the movement from the handwritten pattern is then impossible. If the movement is known from the paper fingerprint, however, it does not matter that the handwritten pattern is invisible.

Second, it avoids the aperture problem of the handwritten pattern. Consider a horizontal line being drawn continuously from left to right across the paper. The handwritten pattern visible in the frame always looks the same, so it cannot be determined whether the pen is moving or stationary. This is the aperture problem in motion estimation. A horizontal line is an extreme example, but locally unchanging patterns occur frequently when writing characters; unnatural movement estimates arise at those places, and the recovered handwritten pattern can end up nonlinearly stretched or compressed. With the paper fingerprint, by contrast, the fingerprint moves when the pen moves and is stationary when the pen is stationary, so the aperture problem is avoided.

The feature points should be rotation and scale invariant and robust to brightness changes. The former is required because the frame image rotates as the pen rotates and because the distance between the camera and the paper also changes with the pen strokes. To extract feature points stably under these conditions, invariance to rotation and scale changes is desirable. Strictly speaking, invariance to projective transformation would be desirable, but because the displacement between adjacent frames is not large, rotation and scale invariance is considered an adequate approximation. The latter is required to eliminate the influence of shadows that appear in part of each frame image.

To meet these requirements, this embodiment uses feature point detection and feature description within the framework of SIFT (Scale-Invariant Feature Transform). As is well known, features described by SIFT are invariant to rotation and scale changes and robust to illumination changes. This specification uses SURF (Speeded-Up Robust Features), a faster variant of SIFT.

≪Video mosaicing≫
The pose change between adjacent frames can be estimated by finding multiple pairs of points whose SURF features are very similar between the two frames. When the paper surface is planar, frame images captured with the camera-equipped pen are related to one another by a projective transformation. (A projective transformation is a kind of geometric transformation in which a rectangle receding in depth in three-dimensional space is represented on a two-dimensional plane as an arbitrary convex quadrilateral such as a trapezoid; another example is the affine transformation, in which such a rectangle is represented as a parallelogram.) Therefore, if the projective transformation is estimated from the point correspondences between adjacent frames, the adjacent frames can be superimposed with it. Performing this over all frames superimposes the whole series of frames. This is video mosaicing.
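
The chaining of adjacent-frame transformations described above can be sketched as follows. This is only an illustration under stated assumptions, not the implementation of the embodiment: each pairwise matrix is assumed to map coordinates of frame t+1 into frame t, the first frame is taken as the mosaic reference plane, and the names `apply_homography`, `pen_tip_trace`, and `tip_xy` are hypothetical. Because the camera is fixed to the pen, the pen tip is assumed to sit at one constant pixel position in every frame.

```python
import numpy as np

def apply_homography(H, pt):
    """Map a 2-D point through a 3x3 homography using homogeneous coordinates."""
    x, y, w = H @ np.array([pt[0], pt[1], 1.0])
    return np.array([x / w, y / w])

def pen_tip_trace(pairwise_H, tip_xy):
    """Accumulate frame-to-frame homographies and map the fixed pen-tip pixel.

    Assumption: pairwise_H[t] maps frame t+1 coordinates into frame t
    coordinates, and frame 0 is the reference (mosaic) plane.
    Returns an (N+1, 2) array of pen-tip positions in frame-0 coordinates.
    """
    H_to_ref = np.eye(3)
    trace = [apply_homography(H_to_ref, tip_xy)]
    for H in pairwise_H:
        H_to_ref = H_to_ref @ H  # now maps frame t+1 -> frame 0
        trace.append(apply_homography(H_to_ref, tip_xy))
    return np.array(trace)
```

Since every position is obtained through the accumulated product, an error in any one pairwise matrix propagates to all later points of the trace.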

The technical significance of finding corresponding points between adjacent frames on the basis of the similarity of SURF features is as follows.
As stated in the description of video mosaicing, the first step of the processing is to establish a number of point correspondences between adjacent frames.
That is, for a point A in the t-th frame image, the point B in the (t+1)-th frame image to which it corresponds is found using the similarity of the two points as a clue. At this stage the projective transformation between the two frames has not yet been estimated, so even if A and B are the same point on the paper, their neighborhoods do not look identical. Pairs of similar points must therefore be found as stably as possible despite the effect of the projective transformation.
Because SURF is rotation and scale invariant, if the features around A and B are each described with SURF, the two descriptors will be nearly the same as long as the projective transformation is not large. Searching the (t+1)-th frame image for a point with features similar to those of A can therefore be expected to find the corresponding point B.
In practice, a large number of SURF feature points are first detected in both frames, and the point correspondences between the frames are established simply by finding multiple point pairs whose features are within a threshold Euclidean distance of each other.
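
The thresholded matching step can be sketched as below. The description only states that pairs whose feature distance is at or below a threshold are taken; restricting each point of frame t to its single nearest neighbour in frame t+1 is an added assumption, and the function name `match_by_distance` and its parameters are hypothetical. Descriptors are assumed to be given as non-empty NumPy arrays, one row per feature point.

```python
import numpy as np

def match_by_distance(desc_a, desc_b, max_dist):
    """Return (i, j) index pairs whose descriptor distance is <= max_dist.

    desc_a: (Na, D) descriptors from frame t
    desc_b: (Nb, D) descriptors from frame t+1 (Nb >= 1)
    Assumption: only the nearest neighbour of each frame-t descriptor
    is considered as a candidate match.
    """
    # Pairwise Euclidean distances, shape (Na, Nb).
    diff = desc_a[:, None, :] - desc_b[None, :, :]
    dist = np.linalg.norm(diff, axis=2)
    nearest = dist.argmin(axis=1)
    return [(i, int(j)) for i, j in enumerate(nearest) if dist[i, j] <= max_dist]
```

The brute-force distance table is adequate for a sketch; for the large numbers of feature points mentioned in the text, an approximate nearest neighbour index would likely be preferable.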

FIG. 4 is an explanatory diagram showing the point correspondences actually obtained between adjacent frame images in the embodiment of the present invention. Points with similar SURF features are joined by line segments. Because many SURF feature points are detected on the paper surface, the figure is hard to read, but on close inspection many of the lines join the same position on the paper.

The figure also shows that some correspondences are completely wrong. If the projective transformation matrix is computed with these grossly wrong correspondences included, their adverse effect may be amplified, depending on the method, and an incorrect matrix may result. Since the paper fingerprint is expected to have its limits, such wrong correspondences are best regarded as unavoidable.

A robust estimation method is therefore needed to estimate the projective transformation matrix. This specification uses RANSAC (see M. A. Fischler and R. C. Bolles, "Random sample consensus: a paradigm for model fitting with applications to image analysis and automated cartography," Comm. of the ACM, vol. 24, no. 6, pp. 381-395, 1981). RANSAC computes a projective transformation matrix from a small number of point correspondences and evaluates how well that matrix explains the other correspondences. By repeating this evaluation while randomly changing the set of correspondences used to compute the matrix, the projective transformation matrix can be obtained robustly.
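
A minimal sketch of this RANSAC procedure is given below, assuming the correspondences are supplied as two (N, 2) NumPy arrays. The direct linear transform used to fit each hypothesis, the iteration count, the pixel tolerance, and all function names are assumptions made for illustration; the source does not specify them. Coordinate normalisation and handling of the case where no usable hypothesis is found are omitted for brevity.

```python
import numpy as np

def fit_homography(src, dst):
    """Direct linear transform: H such that dst ~ H @ src (needs >= 4 pairs)."""
    rows = []
    for (x, y), (u, v) in zip(src, dst):
        rows.append([-x, -y, -1, 0, 0, 0, u * x, u * y, u])
        rows.append([0, 0, 0, -x, -y, -1, v * x, v * y, v])
    # The right singular vector of the smallest singular value solves A h = 0.
    _, _, vt = np.linalg.svd(np.asarray(rows, dtype=float))
    return vt[-1].reshape(3, 3)

def project(H, pts):
    """Apply H to an (N, 2) array of points."""
    homog = np.hstack([pts, np.ones((len(pts), 1))]) @ H.T
    return homog[:, :2] / homog[:, 2:3]

def ransac_homography(src, dst, n_iter=500, tol=2.0, seed=0):
    """Keep the 4-pair hypothesis that explains the most correspondences."""
    rng = np.random.default_rng(seed)
    best_inliers = np.zeros(len(src), dtype=bool)
    for _ in range(n_iter):
        sample = rng.choice(len(src), size=4, replace=False)
        H = fit_homography(src[sample], dst[sample])
        with np.errstate(divide="ignore", invalid="ignore"):
            err = np.linalg.norm(project(H, src) - dst, axis=1)
        inliers = err < tol  # NaN errors compare as False
        if inliers.sum() > best_inliers.sum():
            best_inliers = inliers
    # Refit on all correspondences explained by the best hypothesis.
    return fit_homography(src[best_inliers], dst[best_inliers]), best_inliers
```

The returned inlier mask also indicates how many correspondences supported the estimate, which is one possible signal for judging whether a frame's matrix is trustworthy.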

Estimating the projective transformation from feature point correspondences between frame images involves three measures specific to the present invention, listed below.
The first is the removal of feature points that appear on the pen tip. Because the camera is fixed to the pen, the pen always occupies the same position in the image. Even when feature points on the paper fingerprint move between frames, feature points on the pen appear stationary. If the movement is estimated from all of them together, the pen's feature points have an adverse effect and the result is wrong. The SIFT feature points that appear on the pen must therefore be ignored.

The second is the removal of the handwritten pattern (black ink) near the pen tip. This part of the pattern was newly written during the interval between the previous frame and the current frame. It therefore has no corresponding points in the previous frame and, in the worst case, causes wrong correspondences. Like the pen tip itself, this part must be ignored. This is where video mosaicing of pen-tip images differs from ordinary video mosaicing: the present invention deals with a pattern that is still being generated.
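
The first two measures, ignoring feature points on the pen and near the pen tip, can be sketched as a simple filter. The representation below is an assumption for illustration: the pen region is given as a fixed boolean mask (reasonable because the pen does not move in the image), the freshly written ink is approximated by a circle of radius `tip_radius` around the pen tip, and the name `keep_paper_keypoints` is hypothetical.

```python
import numpy as np

def keep_paper_keypoints(points, pen_mask, tip_xy, tip_radius):
    """Return a boolean mask that is True for keypoints worth matching.

    points:     (N, 2) array of (x, y) pixel coordinates
    pen_mask:   (H, W) boolean image, True where the pen body is visible
    tip_xy:     (x, y) pixel position of the pen tip (fixed in every frame)
    tip_radius: radius in pixels of the region treated as freshly written
    """
    pts = np.asarray(points, dtype=float)
    cols = np.clip(np.rint(pts[:, 0]).astype(int), 0, pen_mask.shape[1] - 1)
    rows = np.clip(np.rint(pts[:, 1]).astype(int), 0, pen_mask.shape[0] - 1)
    on_pen = pen_mask[rows, cols]
    tip = np.asarray(tip_xy, dtype=float)
    near_tip = np.linalg.norm(pts - tip, axis=1) < tip_radius
    return ~(on_pen | near_tip)
```

Applying the mask before matching keeps stationary pen features and unmatched new ink out of the correspondence set.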

The third is discarding incorrect projective transformation matrices. Handwriting is basically continuous, so the projective transformations obtained should also vary continuously. However, a completely wrong projective transformation matrix is sometimes obtained abruptly, for reasons such as unstable SURF feature point correspondences. In that case the current frame can be skipped and the projective transformation computed between the previous frame and the next frame. Because frames largely overlap, skipping a few frames has little effect. Care is needed, however: if consecutive skips leave too large a gap, the pose change between the two frames becomes large and matching SURF features becomes correspondingly harder.
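
One way to flag such frames is sketched below. The source does not state how discontinuity between matrices is measured, so the criterion here is an assumption: each matrix is scaled so that its bottom-right element is 1, and a frame is rejected when the Frobenius norm of its difference from the last accepted matrix exceeds a threshold `max_jump`. The function name `select_frames` is hypothetical, and re-estimating the transformation across a skipped frame is left out.

```python
import numpy as np

def select_frames(pairwise_H, max_jump):
    """Flag frames whose homography jumps away from the last accepted one.

    pairwise_H: list of 3x3 matrices, one per frame, each with H[2, 2] != 0
    max_jump:   largest accepted Frobenius-norm difference (assumed criterion)
    Returns a list of booleans, True where the frame is kept.
    """
    keep, last = [], None
    for H in pairwise_H:
        Hn = H / H[2, 2]  # remove the arbitrary overall scale
        ok = last is None or np.linalg.norm(Hn - last) <= max_jump
        keep.append(bool(ok))
        if ok:
            last = Hn
    return keep
```

Comparing against the last accepted matrix rather than the immediately preceding one means a run of bad frames is rejected as a whole; the first frame is accepted unconditionally, which is a weakness of this simple sketch.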

≪Experimental example 1≫
FIG. 5 shows the sequence of pen-tip image frames captured while the numeral "2" was written with the camera-equipped pen according to the present invention. The paper was copy paper (non-recycled), and the written "2" measured approximately 3.5 cm × 2.5 cm. There were approximately 340 frames. At the camera's frame rate of 30 fps this corresponds to roughly 11 seconds; the numeral was written slowly to avoid motion blur.

FIG. 6 is an explanatory diagram showing the result of video mosaicing based on the frame images of FIG. 5. The pen-tip position in each frame is plotted on the mosaic image as a small black dot (●). The sequence of black dots is the restored handwritten pattern. FIG. 7 is an image of the "2" actually written in the frame sequence of FIG. 5, scanned with an image scanner.

Compared with FIG. 7, the restored image of FIG. 6 is quite jagged, but it is still recognizable as a "2". In this experiment the mosaic image was built simply by stitching frames together one after another, so the estimation errors of the projective transformations accumulate. Moreover, if the initial frame does not directly face the paper, that affects the whole result. The shape is therefore inherently prone to instability. That restoration of this quality is nevertheless achieved shows that mosaicing with paper fingerprint feature points is promising.

FIG. 8 is a graph showing the number of SURF feature point pairs in each frame of FIG. 5. The horizontal axis is the frame ID and the vertical axis is the number of SURF feature point pairs. Vertical lines in the graph mark frames that were skipped because an unnatural projective transformation matrix was obtained. Detailed examination remains future work, but taken together with FIG. 6, accuracy appears to drop where feature point pairs become scarce and skips occur. Specifically, many skips occur near the bend of the "2" and near its top, and the restored pattern near the top is indeed jagged.

FIG. 10 is an image, scanned with an image scanner, of a handwritten "2" different from that of FIG. 7. FIG. 9 is an explanatory diagram showing the video mosaicing result when the "2" of FIG. 10 was written. This "2" measures approximately 1.6 cm × 1.1 cm, smaller than that of FIG. 6. It was written in 118 frames in total, that is, in about 4 seconds. Some nonlinear stretching is visible, but the accuracy is sufficient for the "2" to be clearly recognizable. FIG. 11 is a graph showing the number of feature point correspondences in each frame for the handwritten character of FIG. 9.

In these experiments no processing such as weighting the black ink portion was performed; the projective transformation was estimated from SURF features obtained without distinguishing paper from ink. As the inventors showed in Katsuhiro Ito, Seiichi Uchida, Masakazu Iwamura, Shinichiro Omachi, and Koichi Kise, "Restoration of handwritten patterns from pen-tip camera images," IEICE General Conference 2008, ISS Special Program Student Poster Session, ISS-P-323, 2008, mosaicing is in fact possible with considerable accuracy using only an evaluation of the overlap of the black ink portions. In future it may therefore be possible, for example, to use the paper fingerprint only when the black ink is hidden by the pen tip.

The technique of the present invention can also be used in combination with the "information embedding pen" proposed in Kazuhiro Tanaka, Seiichi Uchida, Masakazu Iwamura, Shinichiro Omachi, and Koichi Kise, "Basic study on a data embedding pen," Journal of the Human Interface Society, vol. 10, no. 4, pp. 559-567, 2008. This pen embeds various information (for example, a URL or writer ID) in handwritten content by depositing minute ink dots while writing on paper. Registering the overall shape of the handwritten content obtained by the present invention together with the embedded information as a pair in the writing life log makes it possible to add cybermedia-like functions to handwritten content.

As described above, the present invention provides a concrete video mosaicing method for restoring a handwritten pattern from the video of a camera attached to the pen tip. Its principal feature is the use of the pattern on the paper surface (the paper fingerprint). By extracting SURF feature points from the paper fingerprint and using them for video mosaicing, the pen movement, that is, the handwritten pattern, can be restored even when the handwritten pattern itself is hidden by the pen tip. Although the study is at a very early stage, it was found that even such a simple scheme can restore the outline of a handwritten pattern.

Possible future improvements include the following. First, SURF feature point matching could be stabilized, for example by applying an image transformation that enhances the paper fingerprint before computing SURF features, or by combining the paper fingerprint correspondences with correspondences in the black ink portion, giving weight to the latter. Improving the video mosaicing method is also considered important. Because this specification repeatedly aligns adjacent frames, errors accumulate in later frames and the mosaicing result sometimes breaks down. Stabilization, for example by using reappearing points (see Akihiko Iketani, Tomokazu Sato, Sei Ikeda, Masayuki Kanbara, Noboru Nakajima, and Naokazu Yokoya, "Super-resolution video mosaicing for paper surfaces based on camera parameter estimation," IEICE Trans. (Japanese Edition), vol. J88-D-II, no. 8, pp. 1490-1498, 2005), appears essential. Removing motion blur also deserves study. Depending on the pen speed, motion blur becomes pronounced, SURF feature points can no longer be detected, and matching fails as a result. Motion blur removal therefore becomes important; for frames in which handwriting is visible, deblurring that exploits the linear nature of handwriting is conceivable (see, for example, X. Y. Qi, L. Zhang, and C. L. Tan, "Motion deblurring for optical character recognition," Proc. ICDAR2005, pp. 389-393, 2005).

For evaluation, quantitative assessment of restoration accuracy and measurement of robustness to writing speed are also conceivable. In addition, since the present invention obtains the mosaic image using the paper fingerprint as a clue, investigating the influence of paper quality may yield findings that lead to further improvements.

Next, the second invention will be described. To make the explanation easier to follow, two techniques are described first: the first invention, which forms the basis of the second invention, and a technique that uses document image retrieval.

≪Method according to the first invention≫
The technique of the first invention is briefly summarized again. The first invention uses the paper fingerprint in order to handle blank paper and margins. The paper fingerprint can be photographed by taking a close-up of the paper surface. By extracting feature points from this pattern and determining how far the pen tip moves between frames, the writing motion can be tracked. This makes it possible to restore handwriting without requiring special paper.

The extracted feature points must be invariant to rotation and scale changes, because the pen rotates and its angle to the paper surface changes during writing. SURF is therefore used for both feature point extraction and feature description. SURF features are invariant to rotation and scale changes and are also robust to illumination changes. FIG. 12 shows the result of extracting SURF feature points from blank paper.

The displacement of the pen tip coordinates can be calculated by obtaining a projective transformation matrix from the point correspondences between SURF feature points. FIG. 13 visualizes these correspondences: similar SURF feature points are connected by lines, and most of the points are matched correctly. Some points, however, are matched incorrectly, and computing the matrix with such false correspondences included may yield a wrong projective transformation matrix. RANSAC is therefore used as a robust estimation method. RANSAC computes a projective transformation matrix from randomly selected point correspondences and evaluates how well that matrix explains the remaining correspondences. Repeating this evaluation yields a robust estimate of the projective transformation matrix.
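The RANSAC procedure described above can be sketched in plain Python as follows. This is only an illustrative sketch, not the implementation used in the specification: the function names, the normalization h33 = 1, the fixed iteration count, and the pixel tolerance `tol` are all assumptions made for the example, and a practical system would normally rely on a tested computer vision library instead.

```python
import math
import random


def solve_linear(a, b):
    """Solve a*x = b by Gauss-Jordan elimination; return None if singular."""
    n = len(b)
    m = [list(map(float, row)) + [float(b[i])] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            return None
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                for c in range(col, n + 1):
                    m[r][c] -= f * m[col][c]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_homography(pairs):
    """3x3 projective matrix (h33 fixed to 1) from four ((x, y), (u, v)) pairs."""
    a, b = [], []
    for (x, y), (u, v) in pairs:
        a.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        b.append(u)
        a.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        b.append(v)
    h = solve_linear(a, b)
    return None if h is None else [h[0:3], h[3:6], [h[6], h[7], 1.0]]


def apply_homography(h, point):
    """Map (x, y) through h; points sent to infinity map to (inf, inf)."""
    x, y = point
    w = h[2][0] * x + h[2][1] * y + h[2][2]
    if abs(w) < 1e-12:
        return (float("inf"), float("inf"))
    return ((h[0][0] * x + h[0][1] * y + h[0][2]) / w,
            (h[1][0] * x + h[1][1] * y + h[1][2]) / w)


def ransac_homography(pairs, iterations=300, tol=1.0, seed=0):
    """Keep the randomly sampled model that explains the most correspondences."""
    rng = random.Random(seed)
    best_h, best_inliers = None, []
    for _ in range(iterations):
        h = fit_homography(rng.sample(pairs, 4))
        if h is None:
            continue  # degenerate sample, e.g. three collinear points
        inliers = []
        for src, dst in pairs:
            px, py = apply_homography(h, src)
            if math.hypot(px - dst[0], py - dst[1]) < tol:
                inliers.append((src, dst))
        if len(inliers) > len(best_inliers):
            best_h, best_inliers = h, inliers
    return best_h, best_inliers
```

Given the matrix returned by `ransac_homography`, the pen tip displacement between two frames would be obtained by passing the pen tip's image coordinates through `apply_homography`.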

≪Method using document image retrieval≫
One camera pen system that restores writing on a document uses LLAH (Locally Likely Arrangement Hashing) (see Tomohiro Nakai, Koichi Kise, Masakazu Iwamura, "Real-time document image retrieval using a Web camera," IEICE Transactions D, J90-D, 8, pp. 2262-2265, Aug. 2007). This technique obtains writing information by the following procedure.
1. Photograph the document on the paper with the camera attached to the pen.
2. Extract the centroids of connected components (LLAH feature points) from the captured image and perform document image retrieval using LLAH. This yields the corresponding document image and a projective transformation matrix for that image.
3. From the projective transformation matrix, estimate the coordinates of the pen tip in the document image.
4. Evaluate whether the estimated pen tip coordinates are plausible, and record them if so.
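Steps 3 and 4 of the procedure above might look like the following sketch. The text does not state how plausibility is judged, so the two checks used here (the point must lie on the retrieved page and must not jump too far from the previously accepted point) are assumptions for illustration, as are the names `project`, `update_stroke`, `page_size`, and `max_jump`.

```python
import math


def project(h, point):
    """Apply a 3x3 projective matrix h to an image point; None if undefined."""
    x, y = point
    w = h[2][0] * x + h[2][1] * y + h[2][2]
    if w == 0:
        return None
    return ((h[0][0] * x + h[0][1] * y + h[0][2]) / w,
            (h[1][0] * x + h[1][1] * y + h[1][2]) / w)


def update_stroke(stroke, h, tip_in_frame, page_size, max_jump):
    """Append the estimated document coordinate if it passes the sanity checks."""
    estimate = project(h, tip_in_frame)
    if estimate is None:
        return False
    x, y = estimate
    width, height = page_size
    if not (0 <= x <= width and 0 <= y <= height):
        return False  # assumed check: estimate falls outside the retrieved page
    if stroke and math.dist(stroke[-1], estimate) > max_jump:
        return False  # assumed check: implausibly far from the previous point
    stroke.append(estimate)
    return True
```

Calling `update_stroke` once per frame and connecting the accepted points in order gives the handwriting in document coordinates.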

Handwriting information is obtained by repeating the above process and connecting the estimated pen tip positions (see Non-Patent Document 3 cited above). Because this system obtains feature points from the document printed on the paper, it does not require special paper as the writing surface. It can also identify both the document written on and the writing position.

≪Problems in integrating the two≫
The problem with both techniques is that neither can handle all everyday writing on its own. Everyday writing ranges from notes on blank paper to underlining in a document. A system is therefore needed that handles writing on blank paper and, when the writing is on a document, also determines the document name and the writing position on that document. The technique of the first invention restores writing using feature points extracted from the paper fingerprint. Using the paper fingerprint makes it possible to restore writing on ordinary paper, but the technique does not consider the relationship between the writing and a document printed on the paper. The technique using document image retrieval, in contrast, makes that relationship visible. However, LLAH, the document image retrieval method used, extracts the LLAH feature points required for retrieval from printed characters, so writing cannot be restored unless a region where the document is printed is photographed.

≪Solution according to the second invention≫
The system proposed in the second invention can acquire handwriting even on blank paper without any special processing of the paper surface and, when the writing is on a document, can determine the writing position on that document. To implement the second invention, the problems that lie between the technique of the first invention and the technique using document image retrieval must be solved.

A major problem between the two techniques is where to mount the camera. The mounting position matters because obtaining as much information as possible from a blank paper surface and obtaining as much information as possible from a document call for different fields of view. The paper fingerprint consists of paper fibers and is a very fine feature, so high-resolution images are needed to capture it. Mounting the camera close to the pen tip therefore allows the paper fingerprint to be captured stably and the handwriting to be restored with high accuracy. For document image retrieval, on the other hand, capturing a larger document area makes it easier to distinguish between documents, so mounting the camera farther from the pen tip improves retrieval accuracy. Raising the accuracy of both handwriting restoration and document image retrieval thus calls for conflicting camera positions. Attaching two cameras to the pen is conceivable, but for practicality there should be only one camera.

Another problem with the technique of the first invention is the accumulation of errors as projective transformation matrices are computed one after another. When projective transformations are chained using only information between adjacent frames, errors accumulate in later frames and the pen tip position can no longer be estimated accurately. As a result, the shapes of the characters become distorted compared with the actual writing.
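The accumulation effect can be illustrated with a small numerical sketch. It assumes, purely for illustration, that each frame-to-frame transformation is a translation of one pixel and that every estimate carries the same 1% bias; real estimation errors would be random and would involve full projective matrices. The helper names are hypothetical.

```python
def matmul3(a, b):
    """Product of two 3x3 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def translation(dx, dy):
    """Projective matrix for a pure translation."""
    return [[1.0, 0.0, dx], [0.0, 1.0, dy], [0.0, 0.0, 1.0]]


def chain(transforms):
    """Accumulate adjacent-frame transforms into frame-to-reference transforms."""
    total = translation(0.0, 0.0)
    accumulated = []
    for t in transforms:
        total = matmul3(total, t)
        accumulated.append(total)
    return accumulated


true_steps = [translation(1.0, 0.0)] * 100
biased_steps = [translation(1.01, 0.0)] * 100  # assumed 1% error per frame

# Horizontal position error of the pen tip after each frame.
drift = [est[0][2] - ref[0][2]
         for est, ref in zip(chain(biased_steps), chain(true_steps))]
```

In this toy setting the error is about 0.01 pixel after the first frame and about 1 pixel after 100 frames, which is the kind of growth that distorts the restored character shapes.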

1. Addressing the camera position problem
The camera position problem is addressed with image mosaicing. Image mosaicing combines captured frames to produce an image equivalent to one covering a wide area (see, for example, Tomokazu Sato, Akihiko Iketani, Sei Ikeda, Masayuki Kanbara, Noboru Nakajima, Naokazu Yokoya, "Super-resolution video mosaicing for planes based on camera external parameter estimation," Proceedings of the 9th Pattern Measurement Symposium, pp. 13-20, Nov. 2004). In the second invention, a projective transformation matrix is obtained for each frame from the correspondences between SURF feature points. Each frame image is therefore projectively transformed and merged into a mosaic image, giving a captured image with a wide field of view. FIG. 14 shows LLAH feature points extracted from a mosaic image. This increases the number of LLAH feature points and improves the accuracy of document image retrieval.
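A very small sketch of the mosaicing step is given below. It forward-maps each pixel of a frame into a shared canvas using that frame's projective matrix. The dictionary canvas, the nearest-pixel rounding, and the rule of keeping the first observation of each canvas pixel are simplifying assumptions for the example; they are not taken from the specification, and a practical implementation would use inverse mapping with interpolation.

```python
def warp_into_mosaic(mosaic, frame, h):
    """Forward-map every pixel of frame into mosaic, a dict keyed by (x, y)."""
    for row, line in enumerate(frame):
        for col, value in enumerate(line):
            w = h[2][0] * col + h[2][1] * row + h[2][2]
            if abs(w) < 1e-12:
                continue  # pixel maps to infinity, skip it
            mx = round((h[0][0] * col + h[0][1] * row + h[0][2]) / w)
            my = round((h[1][0] * col + h[1][1] * row + h[1][2]) / w)
            mosaic.setdefault((mx, my), value)  # keep the first observation
    return mosaic


def mosaic_size(mosaic):
    """Width and height of the area covered by the mosaic."""
    xs = [x for x, _ in mosaic]
    ys = [y for _, y in mosaic]
    return (max(xs) - min(xs) + 1, max(ys) - min(ys) + 1)


identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
shift_right = [[1.0, 0.0, 3.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
frame_a = [[10] * 4 for _ in range(3)]  # 4x3 frame at the reference position
frame_b = [[20] * 4 for _ in range(3)]  # 4x3 frame taken 3 pixels to the right

mosaic = {}
warp_into_mosaic(mosaic, frame_a, identity)
warp_into_mosaic(mosaic, frame_b, shift_right)
```

Two overlapping 4x3 frames yield a 7x3 mosaic here, which is how a camera with a narrow field of view can still supply a region wide enough for feature extraction.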

FIG. 15 shows the camera pen actually used for the second invention. A miniature CCD camera (NCM03-K, Asahi Electronics Laboratory Co., Ltd.) is attached midway along the pen body 11 (camera 13 in FIG. 15), and an ultraviolet light 15 is attached above the camera 13. The ultraviolet light 15 is used because feature points are hard to obtain from the paper fingerprint under white light. White light such as natural light or fluorescent light is strongly reflected by the paper, so the paper surface photographed by the camera 13 becomes bright and the paper fingerprint is washed out. To make feature points easier to extract from the paper fingerprint, the paper must be illuminated with light that is not strongly reflected. Attaching the ultraviolet light 15 to the pen body 11 makes it possible to obtain feature points from the paper fingerprint even when the camera 13 is mounted away from the paper surface.

2. Reappearance of feature points
Handwriting drift caused by accumulated error is addressed by checking for the reappearance of SURF feature points. Characters often cross themselves or return to an earlier position, so if feature points in a region photographed again can be matched with SURF feature points extracted in the past, a projective transformation with little accumulated error becomes possible. For a character such as '8', whose stroke crosses itself partway and rejoins the starting line, the camera 13 photographs the same region more than once. SURF feature points are therefore stored once they have appeared, and when the pen passes through the same region again, the error can be corrected by matching against all stored SURF feature points.
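The idea of matching against stored feature points can be sketched as follows. For brevity the sketch uses toy two-dimensional descriptors, a linear scan, and a translation-only offset in place of the projective transformation; the class `FeatureMap`, the threshold `max_distance`, and `estimate_offset` are hypothetical names introduced only for this example.

```python
import math


class FeatureMap:
    """Stores each feature seen so far with its position in reference coordinates."""

    def __init__(self, max_distance=0.2):
        self.entries = []
        self.max_distance = max_distance  # assumed descriptor-distance threshold

    def register(self, descriptor, position):
        self.entries.append((tuple(descriptor), position))

    def find(self, descriptor):
        """Return the stored position of the closest descriptor, or None."""
        best, best_d = None, self.max_distance
        for stored, position in self.entries:
            d = math.dist(stored, descriptor)
            if d < best_d:
                best, best_d = position, d
        return best


def estimate_offset(feature_map, frame_features):
    """Mean offset from frame coordinates to stored reference coordinates."""
    offsets = []
    for descriptor, (x, y) in frame_features:
        ref = feature_map.find(descriptor)
        if ref is not None:
            offsets.append((ref[0] - x, ref[1] - y))
    if not offsets:
        return None
    n = len(offsets)
    return (sum(o[0] for o in offsets) / n, sum(o[1] for o in offsets) / n)
```

Because the offset is computed directly against positions stored in the reference coordinate system, it does not inherit the error built up over the intermediate frames.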

If all past SURF feature points are stored, their number becomes enormous over time, and the more SURF feature points there are, the longer the point correspondence computation takes. The search for corresponding points is therefore accelerated with a hash (see, for example, Kazuhito Noguchi, Koichi Kise, Masakazu Iwamura, "High-speed object recognition by multi-stage approximate nearest neighbor search," Proceedings of the Meeting on Image Recognition and Understanding (MIRU2007), OS-B2-02, pp. 111-118, July 2007). The hash value is computed from 16 of the 64 dimensions of the SURF feature. Those dimensions are binarized using μ_j to create a bit vector, where μ_j is the median of each feature dimension, obtained from images captured for a preliminary experiment. The hash value is then obtained from the bit vector using mod, the remainder operation, and H_size, the size of the hash table. (The equations themselves are not reproduced in this text.)
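Since the equations are not reproduced here, the following sketch shows only one plausible reading of the hashing scheme. Several details are assumptions rather than statements of the specification: that the first 16 dimensions are the ones used, that a dimension maps to bit 1 when it is greater than or equal to its median, and that bit j carries weight 2^j. The names `hash_index` and `FeatureHashTable` are hypothetical.

```python
def hash_index(descriptor, medians, table_size):
    """Binarize the leading dimensions against their medians and fold into an index."""
    # Assumption: bit j is 1 when dimension j is at or above its median mu_j.
    bits = [1 if x >= mu else 0 for x, mu in zip(descriptor, medians)]
    # Assumption: bit j has weight 2**j before the remainder is taken.
    value = sum(bit << j for j, bit in enumerate(bits))
    return value % table_size


class FeatureHashTable:
    """Buckets of (descriptor, position) entries keyed by hash index."""

    def __init__(self, medians, table_size):
        self.medians = medians
        self.table_size = table_size
        self.buckets = {}

    def register(self, descriptor, position):
        key = hash_index(descriptor, self.medians, self.table_size)
        self.buckets.setdefault(key, []).append((descriptor, position))

    def candidates(self, descriptor):
        """Stored entries that share the query's hash index."""
        key = hash_index(descriptor, self.medians, self.table_size)
        return self.buckets.get(key, [])
```

Only the entries returned by `candidates` need to be compared with the query descriptor, so the cost of finding a corresponding point no longer grows with the total number of stored feature points.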

≪Processing flow≫
The concrete processing flow of the second invention, shown in FIG. 16, is as follows. First, SURF feature points are extracted from the paper fingerprint and printed characters in the captured image of the paper surface. Next, they are compared with the features retrieved from the hash table to obtain point correspondences. From these correspondences, the projective transformation matrix to the reference frame image is computed, and this matrix is used to obtain the position of the pen tip 17 on the plane. At the same time, the coordinates of feature points for which no corresponding point was found are projectively transformed and registered in the hash table. Repeating this process yields a series of coordinates of the pen tip 17. In addition, image mosaicing is performed every m (> 1) frames. Every n (> m) frames, LLAH feature points are extracted from the mosaic image, and document image retrieval is performed if a certain number of feature points has been obtained; otherwise image mosaicing simply continues. When retrieval has been performed and its result is judged correct, the pen tip 17 coordinates obtained so far are projectively transformed and connected, restoring the handwriting on the document image. Whenever document image retrieval is performed, the mosaic image is reinitialized to prevent projective transformation errors from accumulating.
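The timing of the steps in this flow can be sketched as a simple scheduler. The sketch models only when each action is triggered, not the image processing itself. The assumption that every merged frame contributes a fixed number of LLAH feature points (`points_per_merge`) is artificial and serves only to make the example runnable, and the function and action names are hypothetical.

```python
def plan_actions(num_frames, m, n, points_per_merge, min_points):
    """List the (frame, action) events for mosaicing every m and retrieval every n frames."""
    events = []
    merged = 0  # frames merged into the current mosaic
    for frame in range(1, num_frames + 1):
        events.append((frame, "track"))  # SURF matching and pen tip update
        if frame % m == 0:
            merged += 1
            events.append((frame, "mosaic"))
        if frame % n == 0:
            # Assumed stand-in for counting LLAH feature points in the mosaic.
            if merged * points_per_merge >= min_points:
                events.append((frame, "retrieve"))
                merged = 0  # the mosaic is reinitialized after retrieval
            else:
                events.append((frame, "keep mosaicing"))
    return events


events = plan_actions(num_frames=90, m=10, n=30, points_per_merge=20, min_points=100)
retrieval_frames = [frame for frame, action in events if action == "retrieve"]
```

With these illustrative numbers the mosaic is too sparse at frame 30, retrieval runs at frame 60, and the mosaic starts again from empty afterwards.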

≪Experimental example≫
Writing was performed on documents and the results were evaluated. Copy paper (recycled paper) was used. The frame rate of the camera 13 attached to the pen body 11 was 30 fps. Mosaicing was performed every 10 frames and document image retrieval every 30 frames; these values were set based on knowledge gained from preliminary experiments. Writing was relatively slow, at 5 to 10 seconds per character, to avoid motion blur. Each character or figure was written as a single continuous line, because pen-up and pen-down of the pen tip 17 are not currently taken into account. The size and shape of the characters and figures were not standardized and were left arbitrary. The writing in the experiment also includes samples captured across both blank and document regions.

The experimental results were evaluated by comparison with handwriting information obtained from a pen tablet. As the evaluation value, the rate of agreement between the ground truth created with the pen tablet and the handwriting produced by the second invention was computed. The handwriting information from the pen tablet, however, deviates from the actual writing, because the tilt of the paper, margin settings, and similar factors at printing time cause the coordinates of the stored image file and those on the paper not to coincide. The evaluation therefore tolerated deviations within which the restoration could be visually judged equivalent. In the experiment, the tolerated deviation was about 3 mm.
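The text does not define exactly how the rate of agreement is computed, so the following is only one possible reading: the fraction of ground-truth points that have a restored point within the tolerated deviation. The function name, the point-based formulation, and the use of millimetre coordinates are assumptions for illustration.

```python
import math


def match_ratio(restored, reference, tolerance_mm=3.0):
    """Fraction of reference points with a restored point within the tolerance."""
    if not reference:
        return 0.0
    hits = sum(
        1 for ref in reference
        if any(math.dist(ref, point) <= tolerance_mm for point in restored)
    )
    return hits / len(reference)
```

For example, with ground truth `[(0, 0), (10, 0), (20, 0), (30, 0)]` and restored points `[(0, 1), (10, 2.5), (20, 4), (30, 10)]`, only the first two ground-truth points are matched within 3 mm.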

FIG. 17 shows the results of 50 writing samples made in document regions. In FIG. 17, the degree to which the handwriting agreed with the ground truth (accuracy) is shown over the range 0 to 100% in ten bins of 10% each. The average processing time per frame was 128 ms.
FIG. 17 also gives the result of evaluating handwriting restored from the point correspondences of SURF feature points. Many of these evaluation values are around 30% because the tilt of the pen body 11 and the rotation of the camera 13 cannot be corrected in the handwriting obtained. FIG. 18 shows an example in which the restored handwriting is rotated. In the experiment, a reference frame image is chosen and the handwriting is restored on the plane of that frame image. When this reference frame is tilted relative to the vertical and horizontal directions of the paper, the result appears tilted, as in FIG. 18. The evaluation values for restoration from SURF point correspondences were therefore lowered not only by handwriting disturbances due to projective transformation errors but also by distortion and rotation arising from the positional relationship of the camera 13.

FIG. 17 also shows the rate of agreement with the ground-truth image when document image retrieval was performed and the handwriting was restored on the document. In this case the evaluation values rose compared with restoration from SURF point correspondences alone, because a successful document image retrieval allows projective transformation onto a plane that squarely faces the paper surface. With the handwriting projected onto the paper plane, rotation and distortion such as those in FIG. 18 were corrected and the result came closer to the ground-truth image. FIG. 19 shows a restoration result for writing in the margin of a document. As FIG. 19 shows, when the surrounding document region could be photographed, the position on the document could be determined even for writing in a margin. FIG. 20 shows an example with a high evaluation value. As in FIG. 20, document image retrieval was more accurate when the writing extended widely over the document region, because the large movement of the pen tip 17 allowed mosaicing to yield many feature points. FIG. 21 shows an example with a low evaluation value. As in FIG. 21, when the writing was confined to a small area, retrieval accuracy dropped and positional shifts occurred, because many frame images captured the same region and mosaicing did not increase the number of feature points. Solving this problem requires improving LLAH; specifically, the features must be improved so that accurate retrieval is possible even when few feature points are available.

In FIG. 17, results with an evaluation value of 0% are cases in which handwriting tracking or LLAH document image retrieval failed. The main reason for such failures in this experiment is the large variation in the captured images caused by the experimental environment. Strong natural light and fluorescent light affect the captured images most. As a countermeasure, the experiment used a camera pen fitted with the ultraviolet light 15. The camera pen of FIG. 15, however, is not built to block external light, so the amount of outside light differed from image to image and influenced the results. This problem is expected to be solvable by building the camera pen, based on the results of this experiment, so that it is not affected by such disturbances.

In addition to the embodiments described above, various modifications of the present invention are possible. Such modifications should not be construed as falling outside the scope of the present invention. The present invention is intended to include all modifications within the meaning and range of equivalency of the claims.

1: Pen unit
2: Video camera
5: Database
11: Pen body
13: Camera
15: Ultraviolet light
17: Pen tip

Claims

1. A handwritten pattern acquisition system comprising:
a pen unit having a pen tip and a video camera that photographs the vicinity of the pen tip; and
a processing unit that obtains the trajectory of the pen tip as it moves across a series of frames of the captured video,
wherein, in the pen unit, the camera photographs a paper fingerprint, which is an uneven pattern formed on the paper surface by papermaking, and
the processing unit comprises an extraction processing unit that extracts a plurality of feature points, each representing a local feature of the paper fingerprint captured in each frame image; a correspondence processing unit that determines corresponding feature points between preceding and following frame images; and a trajectory processing unit that determines the change in position of the pen tip relative to the paper surface based on the change in position of each corresponding feature point between the preceding and following frames and obtains the handwritten pattern as the trajectory of the pen tip.

2. The system according to claim 1, wherein the extraction processing unit determines that a feature point whose position changes between different frame images is a feature point on the paper surface and that a feature point whose position does not change is a feature point of the pen tip portion, and does not extract feature points from the pen tip portion shown in each frame image.

3. The system according to claim 1, wherein the correspondence processing unit does not establish correspondences for feature points in a region within a predetermined range of the pen tip in the following frame.

4. The described system, wherein the correspondence processing unit, when determining the change in position of corresponding feature points between preceding and following frame images, does so after correcting the projection distortion of the paper surface appearing in each frame image.

5. The system according to claim 4, wherein the correspondence processing unit calculates a projective transformation matrix for each frame image in order to correct the projection distortion, and excludes any frame whose matrix elements are discontinuous with those of the projective transformation matrices of the preceding and/or following frames from the frames used to obtain the pen tip trajectory.

6. The system according to claim 1, wherein the correspondence processing unit performs processing that removes erroneous correspondences as noise.

7. The system according to claim 1, wherein the extraction processing unit uses the SURF algorithm as its feature point extraction method.

8. A handwritten pattern acquisition method comprising:
a step of photographing the paper surface near the pen tip while handwriting is performed on paper using a pen unit having a pen tip and a video camera that photographs the vicinity of the pen tip; and
a step in which a processing unit obtains the trajectory of the pen tip as it moves across a series of frames of the captured video,
wherein the camera photographs a paper fingerprint, which is an uneven pattern formed on the paper surface by papermaking, and the processing unit extracts a plurality of feature points each representing a local feature of the paper fingerprint shown in each frame image, determines corresponding feature points between preceding and following frame images, determines the change in position of the pen tip relative to the paper surface based on the change in position of the corresponding feature points between the preceding and following frames, and obtains the handwritten pattern as the trajectory of the pen tip.

9. A handwritten pattern acquisition system comprising:
a pen unit that has a pen tip and a video camera photographing the vicinity of the pen tip, and that uses the video camera to photograph a paper fingerprint, which is an uneven pattern formed on the paper surface by papermaking, and a document image recorded on the paper surface;
a first trajectory processing unit that extracts local features of the paper fingerprint from each frame of the captured video as paper surface feature points and obtains the trajectory of the pen tip relative to the paper surface based on the amount of movement of corresponding paper surface feature points between preceding and following frames; and
a second trajectory processing unit that, when a document image is captured in the frames, extracts local features of the document image as document image feature points and obtains the position of the trajectory relative to the document image based on the amount of movement of corresponding document image feature points between preceding and following frames,
wherein the second trajectory processing unit combines the frames so that corresponding document image feature points overlap, thereby determining the arrangement of document image feature points over an area wider than one frame, and obtains the position of the trajectory relative to the document image based on that arrangement.

10. The system according to claim 9, further comprising a document image retrieval unit that searches a document image database, in which a plurality of document images are registered in association with document image feature points extracted from them, for the document image corresponding to the arrangement,
wherein the second trajectory processing unit determines the amount of geometric distortion of the document image captured in each frame based on the correspondence between the arrangement and the document image feature points associated with the document image found by the retrieval unit, and corrects the trajectory using that distortion amount.

11. The system according to claim 9 or 10, wherein the first trajectory processing unit uses the SURF algorithm as its paper surface feature point extraction method.

12. The system according to any one of claims 9 to 11, wherein the second trajectory processing unit, as its document image feature point extraction method, extracts the centroids of connected components and represents each feature point using the LLAH algorithm.