JP5944338B2

JP5944338B2 - Information processing apparatus, information processing program, and information processing method

Info

Publication number: JP5944338B2
Application number: JP2013060966A
Authority: JP
Inventors: 祐宮崎
Original assignee: Yahoo Japan Corp
Current assignee: Yahoo Japan Corp
Priority date: 2013-03-22
Filing date: 2013-03-22
Publication date: 2016-07-05
Anticipated expiration: 2033-03-22
Also published as: JP2014186546A

Description

The present invention relates to an information processing apparatus, an information processing program, and an information processing method.

Book digitization technology, which creates text data from scanned images of books, has become widespread. To digitize a book, the book is first scanned and character images are extracted from the scanned image. Each character image is converted into text data by obtaining the corresponding character code from a character pattern dictionary. Creating text data from the scanned images in this way converts the book into an electronic file.

A book submitted to a business operator for digitization may already contain handwriting at the time of the request. For example, when reading academic literature, a user may mark a part considered particularly important with a wavy line, or write words such as "key point" or "point", so that the important parts can be identified at a glance when rereading later. Such handwriting is usually deleted as noise when the book is converted into an electronic file.

JP 2011-053889 A; JP 2009-231172 A; JP 2009-212655 A

However, there is a problem in that handwriting is not put to effective use when a book is digitized.

The disclosed technology has been made in view of the above, and its object is to provide an information processing apparatus, an information processing program, and an information processing method that can make effective use of handwriting.

The information processing apparatus according to the present application comprises: acquisition means for specifying the range of a handwritten portion included in a scanned image using coordinates corresponding to the scanned image, and for acquiring target text corresponding to the text before and after the specified range; and storage means for storing the scanned image, information specifying the range, and the target text in association with one another.

According to an embodiment of the disclosed technology, handwriting can be used effectively.

FIG. 1 is a diagram for explaining the flow of processing according to the first embodiment, from acquiring handwriting data to posting target text on a Web page for posting.
FIG. 2 is a diagram showing an example of the configuration of a book digitization system.
FIG. 3 is a functional block diagram showing the configuration of the digitization processing server.
FIG. 4 is a diagram showing an example of image data.
FIG. 5 is a diagram showing an example of text data.
FIG. 6 is a diagram showing an example of handwriting data.
FIG. 7 is a flowchart showing the flow up to the creation of each piece of handwriting data from image data.
FIG. 8 is a diagram showing the transition from a scanned image to a Web page for posting.
FIG. 9 is a diagram showing an example of handwriting data when a hashtag is attached to the target text.
FIG. 10 is a diagram showing an example in which the terminal sets a hashtag in the input field of the posting site and enters the target text.

Hereinafter, embodiments for carrying out the information processing apparatus, the information processing program, and the information processing method according to the present application will be described in detail with reference to the drawings. Note that the information processing apparatus, the information processing program, and the information processing method according to the present application are not limited by these embodiments. In the following embodiments, identical parts are denoted by the same reference numerals, and redundant description is omitted. The embodiments can be combined as appropriate to the extent that their processing contents do not contradict each other.

[Process overview]
First, the display processing according to the first embodiment will be described with reference to FIG. 1. FIG. 1 is a diagram for explaining the flow of processing from acquiring handwriting data to posting target text on a Web page for posting. The information processing server 200 mainly performs two processes: acquiring the target text corresponding to the text before and after a handwritten portion included in the scanned image 11, and storing the scanned image 11 and the target text in association with each other. A more specific description follows.

First, the information processing server 200 acquires the target text corresponding to the text before and after a handwritten portion included in the scanned image 11. To do so, the information processing server 200 first receives image data 221a read by a scanner or the like, and acquires each location where handwritten characters or handwritten marks appear as handwritten character data. For example, in the image data 221a the term "exchange rate" is underlined and the word "point" is handwritten beneath it, so the information processing server 200 encloses these handwritten marks together in a single rectangle and extracts them. When extracting the handwritten portion, the information processing server 200 associates the handwritten character data with the page number "3" of the image data 221 containing the handwritten portion and with the upper-left and lower-right coordinates "(90, 80)-(130, 85)" of the enclosing rectangle, and further associates the handwriting ID "1", which is assigned sequentially to each handwritten portion.

Next, the information processing server 200 acquires the target text located before and after the extracted handwritten portion. For example, the information processing server 200 identifies the handwritten portion based on its page number "3" and its coordinates "(90, 80)-(130, 85)", and acquires the target text data corresponding to the text before and after the handwritten portion. For example, the information processing server 200 acquires, as text data, a total of three lines: the line to which the handwritten portion relates plus one line before and one line after it.

Next, the information processing server 200 stores the handwriting data 10 in association with the scanned image 11. When acquiring the target text, the information processing server 200 also acquires the lines "3-6" that the target text occupies. The information processing server 200 then associates the lines "3-6" and the text data of the target text with the handwriting ID "1" to form the handwriting data 10. In other words, the information processing server 200 stores the handwriting ID, page number, coordinates, and handwritten character data described above, together with the acquired lines of the target text and the target text data, in association with one another as the handwriting data 10.

The terminal 30a downloads the handwriting data 10 and the scanned image 11, and displays the scanned image 11 on the display unit 31a. When a handwritten portion of the scanned image 11 is pressed, the terminal 30a acquires the target text corresponding to that handwritten portion from the handwriting data 10. On that press, the terminal 30a also switches the display from the scanned image on the display unit 31a to the Web page for posting, shown as the display unit 31b. The terminal 30b then enters the target text in the input field of the Web page for posting. When the terminal 30b detects that the post button has been pressed, it posts the target text to the Web page for posting.

In this way, the information processing server 200 acquires the target text corresponding to the text before and after a handwritten portion included in the scanned image, and stores the scanned image 11 and the target text in association with each other. A terminal 30 that has downloaded the scanned image 11 and the handwriting data 10 containing the associated target text can, when a handwritten portion displayed on the display unit 31a is pressed, acquire the target text from the handwriting data 10 and enter it in the input field of the Web page for posting. The user can therefore post the target text corresponding to the selected handwritten portion to the posting site with only a simple operation, such as selecting the post button on the Web page for posting shown on the display unit 31b.

[Overall configuration of the digitization system]
FIG. 2 is a diagram showing an example of the overall configuration of the digitization system 100. The user terminal 101, the digitization server 110, and the management system 120 are connected to a network (not shown) and exchange various kinds of information. The network may be any communication network, wired or wireless, such as a LAN (Local Area Network), a VPN (Virtual Private Network), or a mobile communication network.

The user terminal 101 is a terminal device operated by a user. For example, the user terminal 101 is an information processing apparatus such as a desktop PC (personal computer), a tablet PC, or a notebook PC. The user terminal 101 may also be a smartphone, a PDA (Personal Digital Assistant), or a mobile phone. The example of FIG. 2 shows one desktop PC and one smartphone as user terminals 101, but these are only examples, and other types of terminal may be used.

The digitization server 110 is a server owned by a business operator that digitizes books. A device such as a scanner is connected to the digitization server 110, which digitizes books from the scanned images and creates the text data 222.

The management system 120 is a system that performs various kinds of management. The management system 120 manages electronic book data and the like, and includes a receiving server 121, a file management server 122, an authentication server 123, and a settlement server 124. The receiving server 121, the file management server 122, the authentication server 123, and the settlement server 124 are connected to the digitization server 110 via the network. The receiving server 121 receives the text data 222 and the scanned image 11 from the digitization server 110. The file management server 122 has a storage area for each registered user.

Next, the user operations accepted by the digitization system 100, and the processing the digitization system 100 performs in response, will be described. The digitization server 110 provides a business operator Web page 103 and accepts requests for book digitization through it. To have a book digitized, the user registers as a member on the business operator Web page 103 (1). The business operator Web page 103 displays various kinds of information, such as the fees for book digitization.

The business operator Web page 103 has an input area for entering the user ID and password of the management system 120, and the authentication server 123 authenticates the user with that user ID and password. The user logs in by entering the user ID and password of the management system 120 on the business operator Web page 103 (1). The business operator Web page 103 has the authentication server 123 authenticate the user with the entered user ID and password, and if authentication succeeds, accepts a request for book digitization (2). On accepting a request for book digitization, the business operator Web page 103 notifies the receiving server 121 of the request details.

When the business operator receives the book shipped by the user (3), the business operator reads the book with a scanner (not shown), and the digitization server 110 acquires the scanned image 11. The digitization server 110 acquires the character strings contained in the scanned image 11 and converts them into text data (4).

When the digitization server 110 confirms payment by the user (5), it transmits the electronic book data 112, which includes the scanned image 11 and the text data 222, to the receiving server 121 and writes it to the business operator's storage area in the storage unit 21 (6).

When the electronic book data 112 has been written to the business operator's storage area, the receiving server 121 moves the electronic book data 112 to the requesting user's storage area on the file management server 122. The user can then view the scanned image 11 and the text data 222 by accessing that storage area from the user terminal 101 (7).

[Processing in the information processing server]
An example of the functional configuration of the information processing server 200 according to the first embodiment will be described. FIG. 3 is a functional block diagram showing the configuration of the digitization processing server according to the first embodiment. As shown in FIG. 3, the information processing server 200 includes a control unit 210 and a storage unit 220. The information processing server 200 is also connected to an input unit 201 and a communication I/F 202. The file management server 122 of FIG. 2 described above is an example of the information processing server 200.

The input unit 201 is a device for inputting the scanned image 11 captured by a scanner or the like. The input unit 201 inputs each page of the book to the storage unit 220 as the scanned image 11. The communication I/F 202 is an interface such as a NIC (Network Interface Card). The communication I/F 202 transmits to the terminal 30 the image data 221, the text data 222, and the handwriting data 10, which associates the handwriting ID, page number, coordinates, handwritten character data, lines of the target text, and target text data.

The storage unit 220 is a device that stores various kinds of information. The storage unit 220 holds the image data 221, the text data 222, and the handwriting data 10. The storage unit 220 also stores the OS (Operating System) executed by the control unit 210 and various programs, including the program that executes the storage processing described later.

The image data 221 held in the storage unit 220 is the image of each page of the scanned image 11. FIG. 4 is a diagram showing an example of the image data 221. As shown in FIG. 4, the storage unit 220 may divide the scanned image 11 into pages and store the image of each page as image data 221. When storing the image data 221, the storage unit 220 may store it in association with the corresponding page number in the scanned image 11. The image data 221 may include handwritten portions. The storage unit 220 may also separately store an image covering all pages of the book.

The text data 222 held in the storage unit 220 is the scanned image 11 converted into text. FIG. 5 is a diagram showing an example of the text data 222. The storage unit 220 stores the scanned image 11 in association with the text data 222. A terminal 30 that has received the scanned image 11 and the text data 222 can therefore identify the position in the scanned image 11 of each character of the text data 222. In this way, the scanned image 11 is associated with the text data.

The handwriting data 10 held in the storage unit 220 is data that associates the coordinates and other attributes of each handwritten portion included in the scanned image 11. The handwriting data 10 associates the page number, coordinates, handwritten character data, lines of the target text, and target text data with a handwriting ID assigned sequentially to each handwritten portion. The storage unit 220 may set the handwriting ID as the primary key and manage each piece of handwriting data 10 in a database. Because the handwriting data 10 associates the coordinates with the handwritten character data, the information processing server 200 can also remove each piece of handwriting from the scanned image 11.

Next, each item of the handwriting data 10 will be described with reference to FIG. 6. FIG. 6 is a diagram showing an example of the handwriting data 10. As shown in FIG. 6, the handwriting data 10 includes a handwriting ID, page, coordinates, handwritten character data, lines, and target text data. The acquisition unit 212 sets the "handwriting ID" by numbering the handwritten portions sequentially. Because the information processing server 200 sets each "handwriting ID" to a unique number, the "handwriting ID" can serve as the primary key of each record when the handwriting data 10 is managed in a database. The "page" is the page number on which the handwritten portion appears. The "coordinates" are the coordinates of the upper-left and lower-right corners of the rectangle enclosing the handwritten portion in the image data 221. The "handwritten character data" is the handwritten portion corresponding to the "coordinates", extracted as an image. The "lines" are the lines of the target text in the image data 221. The "target text data" is the target text portion of the scanned image 11 extracted as text data. The "target text data" may instead be the part of the image data 221 containing the target text, extracted as an image.

For example, the handwriting data 10 in FIG. 6 describes the handwritten portion in FIG. 4. It associates the handwriting ID "1", the page "3" of the image data 221 containing the handwritten portion, the coordinates "(90, 80)-(130, 85)" indicating the range of the handwritten portion, the handwritten character data extracted as an image, the lines "3-6" of the target text, and the target text data extracted as text.
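The record of FIG. 6 can be pictured as a small data structure. The following is only an illustrative sketch: the class name `HandwritingRecord` and its field names are hypothetical and do not appear in the patent, and the image and text values are placeholders rather than data from the figures.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class HandwritingRecord:
    """One entry of the handwriting data 10 (field names are assumptions)."""
    handwriting_id: int              # sequential ID, usable as a primary key
    page: int                        # page number of the image data
    box: Tuple[int, int, int, int]   # (left, top, right, bottom) of the rectangle
    handwriting_image: bytes         # cropped handwriting image (placeholder here)
    lines: Tuple[int, int]           # first and last line of the target text
    target_text: str                 # target text extracted as text data


# Values taken from the FIG. 6 example; image and text are placeholders.
example = HandwritingRecord(
    handwriting_id=1,
    page=3,
    box=(90, 80, 130, 85),
    handwriting_image=b"",
    lines=(3, 6),
    target_text="(text of lines 3 to 6)",
)
```

Keeping the rectangle as a single tuple makes later coordinate comparisons straightforward, but a different layout (for example, four separate columns in a database table) would serve equally well.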

The control unit 210 includes a reception unit 211, an acquisition unit 212, and a transmission unit 213. Each function of the control unit 210 can be realized, for example, by a CPU (Central Processing Unit) executing a predetermined program.

The reception unit 211 of the control unit 210 receives the image data 221. Because the image data 221 is associated with each page of the scanned image 11 in the storage unit 220, the reception unit 211 may receive the corresponding page number together with each piece of image data 221. Alternatively, the reception unit 211 may assign a page number to each piece of image data 221 as it is received.

The acquisition unit 212 of the control unit 210 acquires the target text corresponding to the text before and after a handwritten portion included in the scanned image 11. The acquisition unit 212 first obtains from the reception unit 211 the image data 221 of a page and its page number. The acquisition unit 212 then searches the image data 221 from top to bottom for handwritten portions; each time it finds one, it assigns the next sequential handwriting ID and associates the page number, the coordinates of the handwritten portion, and the handwritten character data. To distinguish handwritten characters from printed characters, a spectral-domain local fluctuation detection method is used, for example: the fluctuation characteristic of handwriting is detected, and the character is judged to be handwritten or printed on that basis.

Next, the acquisition unit 212 acquires the target text corresponding to the text before and after the handwritten portion, based on the coordinates in the handwriting data 10. To do so, the acquisition unit 212 first identifies the position of the handwritten portion in the image data 221 from the "coordinates" of the handwriting data 10.

The acquisition unit 212 then specifies the range of the target text before and after the handwritten portion in the image data 221. For example, the acquisition unit 212 may take a set number of lines above and below the handwritten portion as the range of the target text. In that case, the acquisition unit 212 locates in the text data 222 the character string that appears within the target text range of the image data 221, and acquires the target text as text data. At this time, the acquisition unit 212 also acquires the lines of the target text in the image data 221. The acquisition unit 212 then associates the handwriting ID, page number, coordinates, handwritten character data, lines of the target text, and target text data, and stores them in the storage unit 220 as the handwriting data 10.
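The line-based range selection described above could be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: it assumes the vertical extent of each printed line is already known as a `(top, bottom)` pair (`line_boxes`, a hypothetical input), and the nearest-line fallback for handwriting that overlaps no line is an added assumption.

```python
from typing import List, Tuple


def target_line_range(line_boxes: List[Tuple[int, int]],
                      hw_top: int, hw_bottom: int,
                      margin: int = 1) -> Tuple[int, int]:
    """Return 1-based (first, last) line numbers of the target text.

    line_boxes holds the (top, bottom) y-extent of each printed line,
    ordered from the top of the page; it must not be empty.
    """
    # Lines whose vertical extent overlaps the handwriting rectangle.
    hit = [i for i, (top, bottom) in enumerate(line_boxes)
           if top <= hw_bottom and bottom >= hw_top]
    if not hit:
        # Assumption: fall back to the line whose centre is closest.
        centre = (hw_top + hw_bottom) / 2
        hit = [min(range(len(line_boxes)),
                   key=lambda i: abs(sum(line_boxes[i]) / 2 - centre))]
    first = max(hit[0] - margin, 0)
    last = min(hit[-1] + margin, len(line_boxes) - 1)
    return first + 1, last + 1


def target_text(lines: List[str], line_boxes: List[Tuple[int, int]],
                hw_top: int, hw_bottom: int, margin: int = 1) -> str:
    """Join the printed lines that fall inside the target range."""
    first, last = target_line_range(line_boxes, hw_top, hw_bottom, margin)
    return "\n".join(lines[first - 1:last])
```

With `margin=1`, handwriting that touches a single line yields that line plus one line above and one below, clipped at the page edges.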

Alternatively, the acquisition unit 212 may take a range of characters before and after the handwritten portion as the range of the target text. In this case, the acquisition unit 212 first identifies the position of the handwritten portion in the image data 221 from the "coordinates" of the handwriting data 10. Next, the acquisition unit 212 determines the position of the handwritten portion in the text data 222 from its position in the image data 221, and acquires from the text data 222, as the target text, a character string of a predetermined number of characters before and after the handwritten portion. The acquisition unit 212 may also acquire the target text by converting the target text portion of the image data 221 into text.
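The character-count variant can be sketched in the same spirit. The code below assumes that a bounding box is available for every character of the text data (`char_boxes`, a hypothetical structure reflecting the image-to-text association described for the text data 222) and maps the handwriting rectangle to the nearest character by centre distance; that mapping rule and the function names are assumptions made for illustration.

```python
from typing import List, Tuple

Box = Tuple[int, int, int, int]  # (left, top, right, bottom)


def nearest_char_index(char_boxes: List[Box], hw_box: Box) -> int:
    """Index of the character whose centre is closest to the handwriting."""
    cx = (hw_box[0] + hw_box[2]) / 2
    cy = (hw_box[1] + hw_box[3]) / 2

    def dist2(b: Box) -> float:
        # Squared distance between the character centre and the handwriting centre.
        return ((b[0] + b[2]) / 2 - cx) ** 2 + ((b[1] + b[3]) / 2 - cy) ** 2

    return min(range(len(char_boxes)), key=lambda i: dist2(char_boxes[i]))


def context_by_chars(text: str, anchor: int, n_chars: int = 20) -> str:
    """Return n_chars characters before and after the anchor position."""
    start = max(anchor - n_chars, 0)
    end = min(anchor + n_chars, len(text))
    return text[start:end]
```

A caller would first resolve the anchor with `nearest_char_index` and then pass it to `context_by_chars`; the window is clipped at the start and end of the text.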

The acquisition unit 212 can thereby acquire the target text corresponding to each handwritten portion included in the scanned image 11. Although the description above covers the case in which the acquisition unit 212 acquires the target text as text data, it may instead acquire the target text as image data.

The transmission unit 213 of the control unit 210 transmits the handwriting data 10 and the scanned image 11 to the terminal 30 via the communication I/F 202. The transmission unit 213 may also transmit the text data 222 associated with the scanned image 11.

Next, the procedure for registering the handwritten character data and the target sentence of each handwritten portion of the image data 221 in the handwritten data 10 is described with reference to FIG. 7. FIG. 7 is a flowchart showing the flow for creating the handwritten data 10 from the image data 221. The acquisition unit 212 searches the image data 221 received by the reception unit 211 for handwritten portions, in order from the top of the image data 221 (step S10). The search method is conventional and is disclosed, for example, in Japanese Patent Application Laid-Open No. 2009-212655. If the image data 221 contains no handwritten portion (step S10: No), the acquisition unit 212 ends the process and starts processing the image data 221 of the next page. If the image data 221 contains a handwritten portion (step S10: Yes), the acquisition unit 212 extracts that portion as handwritten character data (step S11) and registers the extracted data in the handwritten data 10 in association with the page number of the image data 221 and the coordinates of the handwritten portion (step S12). Based on those coordinates, the acquisition unit 212 then identifies the range, in the image data 221, of the target sentence located before and after the handwritten portion, and extracts the target sentence from the text data 222 based on that range (step S13). The acquisition unit 212 then registers the line of the target sentence and the target sentence itself in the handwritten data 10 having the same handwriting ID (step S14). These steps are repeated until step S10 determines that all handwritten portions of the image data 221 have been extracted.
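The flow of steps S10 to S14 can be illustrated with a minimal Python sketch. It is not the implementation of the embodiment: the record layout, the function name `register_page`, and the inputs (`regions` from some handwriting detector, `text_lines` from the text data 222) are hypothetical, and the rule used for step S13 (nearest text line plus `context` neighbouring lines) is only one assumed way of choosing the text "before and after" a handwritten portion.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

Box = Tuple[int, int, int, int]   # (x1, y1, x2, y2) in page-image pixels (assumed layout)
TextLine = Tuple[int, int, str]   # (y_top, y_bottom, text) for one recognized text line


@dataclass
class HandwritingRecord:
    """One entry of the handwritten data 10 (field names are hypothetical)."""
    handwriting_id: int
    page: int
    box: Box
    strokes: bytes                          # extracted handwritten character data
    target_line: Optional[int] = None       # 1-based line number of the target sentence
    target_sentence: Optional[str] = None


def register_page(page: int,
                  regions: List[Tuple[Box, bytes]],
                  text_lines: List[TextLine],
                  records: List[HandwritingRecord],
                  context: int = 1) -> List[HandwritingRecord]:
    """Register every handwritten region of one page and its target sentence."""
    # S10: visit handwritten portions from the top of the page downward.
    for box, strokes in sorted(regions, key=lambda r: r[0][1]):
        # S11/S12: store the extracted data with page number and coordinates.
        record = HandwritingRecord(len(records) + 1, page, box, strokes)
        records.append(record)
        if not text_lines:
            continue
        # S13 (assumed rule): pick the text line whose vertical centre is
        # closest to the top edge of the mark, plus `context` lines around it.
        mark_y = box[1]
        nearest = min(
            range(len(text_lines)),
            key=lambda i: abs((text_lines[i][0] + text_lines[i][1]) / 2 - mark_y),
        )
        first = max(0, nearest - context)
        last = min(len(text_lines), nearest + context + 1)
        # S14: attach the line number and the sentence to the same record.
        record.target_line = nearest + 1
        record.target_sentence = "".join(t[2] for t in text_lines[first:last])
    return records
```

Calling `register_page` once per page with a shared `records` list mirrors the per-page loop described above; a page with no regions simply adds nothing.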

[Terminal]
The flow from operating the screen that shows the scanned image 11 to posting the target sentence on a posting Web page is described with reference to FIG. 8. FIG. 8 shows the transition from the scanned image 11 to the posting Web page. The terminal 30a first displays, on the display unit 31a, the scanned image 11 downloaded from the information processing server. When a scroll operation is performed on the display unit 31a, the terminal 30a displays the next page.

When the handwritten portion under "exchange rate" is pressed, the terminal 30a searches for the corresponding handwritten data 10. The terminal 30a first looks up the handwritten data 10 for the page of the scanned image currently shown on the display unit 31a, and compares the coordinates of the pressed position with the "coordinates" of each handwritten data 10. If the coordinates of the pressed position fall within the range given by the "coordinates" of a handwritten data 10, the terminal 30a determines that this handwritten data 10 corresponds to the pressed handwritten portion. The terminal 30a then acquires, from that handwritten data 10, the target sentence data corresponding to the pressed handwritten portion.
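The lookup described above amounts to a point-in-rectangle test restricted to the current page. The following Python sketch shows one way to express it; the dictionary keys, the function name `find_record`, and the sample sentence are hypothetical, while the coordinate range (90, 80)-(130, 85) is the one used as an example in this description.

```python
from typing import Dict, List, Optional


def find_record(records: List[Dict], page: int, x: int, y: int) -> Optional[Dict]:
    """Return the handwritten-data entry whose coordinate range contains (x, y)."""
    for record in records:
        x1, y1, x2, y2 = record["box"]
        # Only entries for the page currently displayed are candidates.
        if record["page"] == page and x1 <= x <= x2 and y1 <= y <= y2:
            return record
    return None  # the press did not land on any registered handwritten portion


# Hypothetical handwritten data for page 1.
records = [
    {"id": 1, "page": 1, "box": (90, 80, 130, 85), "target_sentence": "Policy is deflation ..."},
]

hit = find_record(records, page=1, x=100, y=82)
sentence = hit["target_sentence"] if hit else None
```

A linear scan is enough for the handful of marks on one page; a terminal holding many pages could index the entries by page first.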

Next, the terminal 30b displays the posting site and enters the acquired target sentence in the input field 41 of the posting site. When the post button 42 is pressed on the display unit 31b, the terminal 30b posts the target sentence. The user can thus post a quotation from the document simply by selecting a handwritten portion in the scanned image 11 and performing a predetermined operation on the posting Web page. In other words, when sharing information about the book, the user is spared the effort of typing in the information related to the handwritten portion.

The posting site may be, for example, the user's personal blog, or a site on which short messages of a predetermined number of characters are posted in chronological order. When the target sentence is to be posted to a site that limits the number of characters per post, the acquisition unit 212 of the information processing server 200 may, when acquiring the target sentence, set the number of characters taken before and after the handwritten portion to the character limit of the posting Web page.
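As an illustration of limiting the target sentence to a site's character limit, the sketch below cuts a window of at most `limit` characters around the marked span. The function name `clip_around`, the use of character indices for the marked span, and the choice to split the spare characters roughly evenly before and after the mark are all assumptions; the description only states that the number of characters before and after the handwritten portion may be set to the limit.

```python
def clip_around(text: str, mark_start: int, mark_end: int, limit: int) -> str:
    """Return at most `limit` characters of `text` around text[mark_start:mark_end]."""
    if limit <= 0:
        return ""
    if len(text) <= limit:
        return text                      # already short enough
    marked = mark_end - mark_start
    if marked >= limit:
        return text[mark_start:mark_start + limit]
    spare = limit - marked               # characters available for context
    start = max(0, mark_start - spare // 2)
    end = min(len(text), start + limit)
    start = max(0, end - limit)          # shift left if the window hit the end
    return text[start:end]
```

Because the window is shifted rather than shrunk at either end of the text, the result uses the full limit whenever the text is long enough.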

[Effects]
As described above, the information processing server 200 according to the embodiment includes the acquisition unit 212 and the storage unit 220. The acquisition unit 212 acquires the target sentence located before and after a handwritten portion included in the scanned image 11. The storage unit 220 stores the scanned image 11 and the target sentence in association with each other. As a result, the terminal 30 that has received the scanned image 11 and the associated target sentence from the information processing server 200 can obtain the target sentence from the handwritten characters written on the scanned image 11, so that handwritten characters are put to effective use.

In the information processing server 200 according to the embodiment, the acquisition unit 212 also acquires the target sentence from the scanned image 11 as text data. The terminal 30 that has received the scanned image 11 and the target sentence acquired as text data from the information processing server 200 can therefore enter the target sentence in the input field of the posting website with a simple operation, and the target sentence can be posted without being typed by hand.

[Process of Attaching Hashtags]
In the first embodiment described above, the handwritten data 10 associates the scanned image 11 with the target sentence. The handwritten data 10 may additionally be associated with a hashtag for use on a posting website that supports hashtags. The second embodiment therefore describes an example in which the handwritten data 10 includes a hashtag. The information processing server 200 according to the second embodiment has the same configuration as above, except that the control unit 210 further includes an extraction unit.

The information processing server 200 may further include an extraction unit in the control unit 210. The extraction unit extracts, as a hashtag, the designated character string in the scanned image that is designated by the handwritten portion, and associates the hashtag with the handwritten data 10. For example, in FIG. 4, the handwritten portion of the image data 221 emphasizes the part reading "exchange rate", so the extraction unit extracts the character string corresponding to "exchange rate" from the text data 222. The extraction unit then sets the extracted "exchange rate" in the "hashtag" field of the handwritten data 10, thereby associating the hashtag with the handwritten data 10.

For example, the extraction unit takes the character string "exchange rate", located above the range (90, 80)-(130, 85) given in the "coordinates" of the handwritten data 10, as the designated character string and sets it in the "hashtag" field of that handwritten data 10.
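One way to picture this step: given the coordinate range of the handwritten mark and the bounding boxes of recognized words, choose the words that sit directly above the mark. The Python sketch below assumes image coordinates in which y increases downward, word boxes supplied as `(box, text)` pairs, and a hypothetical tolerance `max_gap`; none of these details come from the description, and the word boxes in the example are invented for illustration.

```python
from typing import List, Tuple

Box = Tuple[int, int, int, int]   # (x1, y1, x2, y2), y increasing downward (assumed)


def designated_string(mark: Box, words: List[Tuple[Box, str]], max_gap: int = 10) -> str:
    """Concatenate the words lying just above `mark` and overlapping it horizontally."""
    mark_x1, mark_top, mark_x2, _ = mark
    hits = []
    for (x1, _y1, x2, y2), text in words:
        overlaps = x1 < mark_x2 and x2 > mark_x1     # shares some horizontal extent
        gap = mark_top - y2                          # distance from word bottom to mark top
        if overlaps and 0 <= gap <= max_gap:
            hits.append((x1, text))
    # Keep reading order (left to right) when several words qualify.
    return "".join(text for _, text in sorted(hits))


# "為替レート" is the "exchange rate" string of the example; the boxes are hypothetical.
words = [
    ((90, 65, 130, 78), "為替レート"),
    ((20, 65, 85, 78), "政策"),
    ((90, 40, 130, 53), "other"),
]
hashtag = designated_string((90, 80, 130, 85), words)
```

The tolerance keeps a line two rows up from being picked by mistake; a real page layout would likely need a gap tied to the line height instead of a fixed pixel count.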

The terminal 30 that has received the handwritten data 10 from the information processing server 200 can attach a hashtag to the target sentence when posting to the posting site. FIG. 9 shows an example of the handwritten data 10 in which a hashtag is attached to the target sentence. The information processing server 200 sets, as the hashtag, the designated character string that the handwritten portion designates. For example, in FIG. 4, a handwritten portion lies under "exchange rate" in the image data 221, so the information processing server 200 treats "exchange rate" as the designated character string.

FIG. 10 shows an example in which the terminal 30 has set a hashtag and entered the target sentence in the input field of the posting site. Having received the handwritten data 10 from the information processing server 200, the terminal 30 can, when posting the target sentence, set the "hashtag" of the handwritten data 10 as a hashtag on the posting Web page. For example, "# exchange rate" is placed at the head of the input field 41 of the posting site, followed by a half-width space and then the target sentence "policy is deflation ...". The part that the user considered important and marked by hand is thus used as the hashtag of the post, and other users of the posting site can reach the target sentence through that hashtag.
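Composing the text of such a post is a small string operation, sketched below in Python. The function name `compose_post` and the optional `limit` parameter are hypothetical, and the sketch assumes the posting site expects each tag written immediately after "#" and separated by a half-width space; actual hashtag syntax depends on the site. It also accepts several hashtags for one entry of the handwritten data.

```python
from typing import List, Optional


def compose_post(hashtags: List[str], sentence: str, limit: Optional[int] = None) -> str:
    """Build the post text: hashtags first, a half-width space, then the target sentence."""
    prefix = " ".join("#" + tag for tag in hashtags)
    body = f"{prefix} {sentence}" if prefix else sentence
    if limit is not None and len(body) > limit:
        body = body[:limit]            # assumed policy: truncate the tail of the sentence
    return body


# "為替レート" is the "exchange rate" hashtag of the example.
post = compose_post(["為替レート"], "政策がデフレ")
```

Truncating from the tail keeps the hashtags intact, which matters if other users are expected to find the post through them.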

In the information processing server 200, the extraction unit may set a plurality of hashtags for a single handwritten data 10. For example, the extraction unit extracts a plurality of important terms from the text data acquired as the target sentence and sets each of them as a hashtag. This makes it easier for other users of the posting site to reach the target sentence that the user has posted.

In the first embodiment, the information processing server 200 acquires the target sentence as text data, but it may instead acquire the target sentence as image data. The user can then publish image data that includes the handwritten portion, as the target sentence, on the user's own blog or the like.

Although the file management server 122 in FIG. 2 has been described as an example of the information processing server 200, the digitization server 110 in FIG. 2 may instead serve as an example of the information processing server 200.

Several embodiments of the present application have been described in detail above with reference to the drawings. These are examples only, and the present invention can be carried out in other forms with various modifications and improvements based on the knowledge of those skilled in the art, including the aspects described in the disclosure of the invention.

The "means" recited in the claims may be read as "unit (section, module, unit)", "circuit", or the like. For example, the reception means may be read as a reception unit or a reception circuit.

10 Handwritten data
11 Scanned image
200 Information processing server
201 Input unit
202 Communication I/F
210 Control unit
211 Reception unit
212 Acquisition unit
213 Transmission unit
220 Storage unit
221 Image data
222 Text data

Claims

An acquisition unit that specifies a range of a handwritten portion included in a scan image using coordinates corresponding to the scan image, and acquires target sentences corresponding to before and after the specified range;
Storage means for storing the scanned image, information for specifying the range, and the target sentence in association with each other;
An extraction means for extracting a designated character string in the scanned image designated by the handwritten location as a hash tag,
The storage means stores the hash tag in association with the target sentence.
An information processing apparatus characterized by that.

The information processing apparatus according to claim 1, wherein the acquisition unit acquires the target sentence as image data including the handwritten part.

The information processing apparatus according to claim 1, wherein the acquisition unit acquires the target sentence as text data from the scanned image.

An information processing method executed by a computer,
Specify the range of the handwritten part included in the scanned image using the coordinates corresponding to the scanned image, obtain the target sentence corresponding to before and after the identified range,
Storing the scanned image, the information specifying the range, and the target sentence in association with each other;
Extract a designated character string in the scanned image designated by the handwritten location as a hash tag,
Storing the hash tag in association with the target sentence;
An information processing method characterized by executing processing.

On the computer,
Specify the range of the handwritten part included in the scanned image using the coordinates corresponding to the scanned image, obtain the target sentence corresponding to before and after the identified range,
Storing the scanned image, the information specifying the range, and the target sentence in association with each other;
Extract a designated character string in the scanned image designated by the handwritten location as a hash tag,
Storing the hash tag in association with the target sentence;
An information processing program for executing a process.