JP5989479B2

JP5989479B2 - Character recognition device, method for controlling character recognition device, control program, and computer-readable recording medium on which control program is recorded

Info

Publication number: JP5989479B2
Application number: JP2012207588A
Authority: JP
Inventors: ▲輝▼ 九鬼
Original assignee: Sharp Corp
Current assignee: Sharp Corp
Priority date: 2012-09-20
Filing date: 2012-09-20
Publication date: 2016-09-07
Anticipated expiration: 2032-09-20
Also published as: JP2014063318A

Description

The present invention relates to a character recognition device and the like that recognizes characters included in an image.

A character recognition device recognizes characters included in an image by using a technique such as optical character recognition. When the character recognition device is equipped with a camera, a user can photograph characters on shop signs, documents, and the like encountered while out, and have those characters recognized.

In this case, however, the user must specify to the character recognition device which region of the captured image is the character region to be recognized.

Among character recognition devices with a function for designating the character region, one known device displays the captured image on a display screen and designates the character region by accepting, at an operation unit, a user operation such as enclosing or filling in part of the image with a marker.

Another known device designates the character region by having the user capture the image so that the characters to be recognized fall within a frame displayed on the camera's monitor screen.

A further known device automatically extracts the character region from the captured image without the user specifying anything.

Patent Documents 1 to 3 below also disclose character recognition devices with a function for designating a character region.

Patent Document 1 discloses a terminal device that inputs a captured image from a CCD camera, displays the captured image on a display, has the user designate an image range on a touch panel with an input pen, and performs pattern recognition of characters in the displayed image within the designated range.

Patent Document 2 discloses a device that processes an image of a document captured by a camera together with an image of a finger, within the camera's field of view, pointing at a region of the document, and determines from these images the web page referenced in the region pointed at.

Patent Document 3 discloses a calendar device that recognizes the month of a paper calendar by recognizing a character string denoting the month, acquires schedule data corresponding to the recognized month, and determines the display format and display period of the schedule data by detecting the date frame whose straight line is blocked by the tip of the finger region.

Patent Document 1: Japanese Patent Laid-Open No. 11-282836 (published October 15, 1999)
Patent Document 2: JP 2005-507526 A (published March 17, 2005)
Patent Document 3: JP 2012-43348 A (published March 1, 2012)

With the character recognition device of Patent Document 1, after the camera captures an image, the user must designate the character region to be recognized with an input pen. The procedure leading up to character recognition is therefore cumbersome.

In the character recognition devices of Patent Documents 2 and 3, the user places a finger on the subject being photographed and the range indicated by the finger is taken as the character region, so the user does not need to designate the character region after imaging. However, in the inventions of Patent Documents 2 and 3, the processing performed with the recognized characters is fixed in advance. This is inconvenient when the user wants to perform some other processing with the recognized characters.

Moreover, even if a character recognition device had a function for selecting the type of processing that uses the recognized characters, the user would have to perform two operations in succession: one designating the character region and one designating the type of processing. The procedure is therefore cumbersome when the user wants to go straight from character recognition to a task that uses the recognized characters.

The present invention has been made in view of the above problems, and its object is to provide a character recognition device and the like that can perform character recognition and the processing desired by the user with a simple operation.

To solve the above problems, a character recognition device according to one aspect of the present invention is a character recognition device that recognizes characters included in an image, comprising: image acquisition means for acquiring an image including a finger and characters; gesture recognition means for recognizing a gesture, in the image acquired by the image acquisition means, from position information of a finger portion and a character portion adjacent to the finger portion; finger gesture determination means for determining which of a plurality of gestures stored in advance matches the finger gesture recognized by the gesture recognition means; character recognition means for recognizing the character or series of characters at the position indicated by the finger recognized by the gesture recognition means; and process execution means for executing, on the character or series of characters recognized by the character recognition means, the process associated with the gesture that the finger gesture determination means has determined to match.

A control method for a character recognition device according to one aspect of the present invention is a method for controlling a character recognition device that recognizes characters included in an image, comprising: an image acquisition step of acquiring an image including a finger and characters; a gesture recognition step of recognizing a gesture, in the image acquired in the image acquisition step, from position information of a finger portion and a character portion adjacent to the finger portion; a finger gesture determination step of determining which of a plurality of gestures stored in advance matches the finger gesture recognized in the gesture recognition step; a character recognition step of recognizing the character or series of characters at the position indicated by the finger recognized in the gesture recognition step; and a process execution step of executing, on the character or series of characters recognized in the character recognition step, the process associated with the gesture determined to match in the finger gesture determination step.

The finger portion recognized by the gesture recognition means, and the finger portion recognized in the gesture recognition step, may be the shape of the finger or the movement of the finger.

According to one aspect of the present invention, the user can have character recognition and a desired process on the recognized characters performed merely by the simple and intuitive operation of indicating the desired character region with a gesture.

FIG. 1 is a schematic block diagram showing the configuration of a character recognition device according to an embodiment of the present invention.
FIG. 2 is an external view showing examples of gestures registered in the character recognition device: (a) shows a gesture pointing from the top of the screen, (b) shows a gesture pointing from the bottom of the screen, and (c) shows a gesture in which a region is pinched between fingers.
FIG. 3 is a diagram showing an example of a gesture registered in the character recognition device.
FIG. 4 is a table showing the correspondence between the gesture types and the processes executed by the character recognition device.
FIG. 5 is a flowchart showing an example of the flow of processing in the character recognition device.

An embodiment of the present invention is described in detail below with reference to FIGS. 1 to 5.

(Configuration of character recognition device 1)
FIG. 1 is a schematic block diagram showing the configuration of the character recognition device 1 according to this embodiment. As shown in FIG. 1, the character recognition device 1 includes a control unit 2, a storage unit 3, a camera (image acquisition means) 4, a communication unit (communication means) 5, a display unit 6, and an operation unit 7.

The character recognition device 1 is an information processing device with a function for recognizing characters included in an image acquired by imaging. The character recognition device 1 may be an information processing device such as a personal computer, a mobile phone, a smartphone, a tablet PC, or a game device.

The control unit 2 performs overall control of the character recognition device 1 and can be implemented by, for example, a CPU (Central Processing Unit). The control unit 2 controls each of the storage unit 3, the camera 4, the communication unit 5, and the display unit 6. The detailed configuration of the control unit 2 is described later.

The storage unit 3 stores (1) the control programs for each unit, (2) the OS (Operating System) program, and (3) the application programs executed by the control unit 2, as well as (4) the various data read when these programs are executed. The storage unit 3 is implemented by a nonvolatile storage device such as flash memory. The storage unit 3 also has an area implemented by a volatile storage device such as RAM (Random Access Memory), which serves as a work area for temporarily holding data while the control unit 2 executes the above programs. The storage unit 3 does not necessarily have to be provided inside the character recognition device 1; it may instead be connected to the character recognition device 1 as an external storage device that is attachable to and detachable from the character recognition device 1, or as an external storage device on a network that is reachable via the communication unit 5.

In particular, the storage unit 3 stores finger image patterns, a finger image gesture correspondence table, a dictionary database, memos (characters), and the like.

The camera 4 has the functions of an ordinary camera: based on a user operation received by the operation unit 7, it images objects such as characters, backgrounds, and people as subjects. In particular, the camera 4 images a target including a finger and characters as a subject, and outputs the image acquired by this imaging to the control unit 2. The camera 4 may have a function for capturing a plurality of images at regular intervals, and may also have a function for capturing video. In these cases, the camera 4 captures a plurality of images or video in accordance with a user operation received by the operation unit 7. The camera 4 may also have a function for switching part of its processing between imaging backgrounds, people, and the like and imaging characters for the purpose of character recognition.

The camera 4 may also output the acquired video to the control unit 2 in real time.

The communication unit 5 is a communication interface through which the character recognition device 1 performs Internet communication over a communication network. The character recognition device 1 searches the Internet for information via the communication unit 5, using the recognized characters as a key.

The display unit 6 is a display device that displays images based on instructions from the control unit 2. An LC (Liquid Crystal) display panel, an EL (Electro Luminescence) display panel, or the like can be used as the display unit 6. Although not shown, the components needed to display images, such as a VDP (Video Display Processor) and VRAM (Video RAM), are provided as appropriate inside the control unit 2 or the display unit 6. The display unit 6 may be a touch panel that provides both image display and operation input. In the character recognition device 1, the display unit 6 displays the result of the process executed using the recognized characters. The display unit 6 also serves as the monitor screen of the camera 4.

The operation unit 7 receives operations from the user of the character recognition device 1 and is typically a physical key, a keyboard, a touch panel, or the like. When the display unit 6 is a touch panel, the display unit 6 also serves as the operation unit 7.

(Detailed configuration of the control unit 2)
The configuration of the control unit 2 is described in detail below. The control unit 2 includes a gesture determination unit (gesture recognition means, finger gesture determination means) 10, an image combining unit (combining means) 11, a character cutout unit (character recognition means) 12, a character recognition processing unit (character recognition means) 13, a keyword search unit (process execution means) 14, a dictionary search unit (process execution means) 15, and a storage processing unit (process execution means) 16.

When the camera 4 images a target including a finger and characters as a subject, the gesture determination unit 10 acquires from the camera 4 the image showing that subject and determines from the image whether the gesture made by the finger is one registered in advance in the character recognition device 1.

Specifically, the gesture determination unit 10 cuts out from the image a finger region, that is, the region along the contour of the finger, and determines the finger gesture shown by the finger region using a technique such as pattern recognition.

For example, the storage unit 3 stores finger image patterns showing the appearance of finger gestures. The storage unit 3 also stores a finger image gesture correspondence table that associates each finger image pattern with the type of finger gesture it corresponds to. The character recognition processing unit 13 searches the finger image patterns for the image pattern most similar to the cut-out image of the finger region. The character recognition processing unit 13 then refers to the finger image gesture correspondence table and takes the gesture corresponding to the most similar image pattern as the candidate for the finger gesture shown by the finger region. If the similarity of the candidate is below a predetermined value, the gesture determination unit 10 determines that the candidate is not a gesture registered in advance in the character recognition device 1. If the similarity of the candidate is equal to or greater than the predetermined value, the gesture determination unit 10 determines that the candidate is a gesture registered in advance in the character recognition device 1.
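The matching and thresholding described above can be sketched as follows. This is a minimal illustration, not the method of the embodiment: the binary-mask representation, the pixel-agreement `similarity` measure, the pattern identifiers, and the threshold value 0.8 are all assumptions introduced for the example.

```python
from typing import Dict, List, Optional

Mask = List[List[int]]  # hypothetical binary finger-region image, 1 = finger pixel


def similarity(a: Mask, b: Mask) -> float:
    """Fraction of pixels on which two equal-sized binary masks agree (assumed measure)."""
    total = sum(len(row) for row in a)
    same = sum(pa == pb for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))
    return same / total


def classify_gesture(
    finger: Mask,
    patterns: Dict[str, Mask],   # stored finger image patterns, keyed by a made-up id
    table: Dict[str, str],       # finger image gesture correspondence table: id -> gesture
    threshold: float = 0.8,      # the "predetermined value"; 0.8 is an arbitrary choice
) -> Optional[str]:
    """Return the registered gesture, or None if the best candidate is too dissimilar."""
    best_id = max(patterns, key=lambda pid: similarity(finger, patterns[pid]))
    score = similarity(finger, patterns[best_id])
    candidate = table[best_id]
    return candidate if score >= threshold else None
```

A finger region that matches no stored pattern closely enough yields `None`, which corresponds to the "not a registered gesture" outcome.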

The storage unit 3 may also store the relative position of the finger region and the character region. In this case, the gesture determination unit 10 determines the gesture from the finger gesture together with the relative position of the finger and the adjacent character region. This configuration gives higher determination accuracy than a determination based on the finger alone or on the character region alone.

In this embodiment, four types of gesture are registered in advance in the character recognition device 1: gesture A (second gesture), gesture B (second gesture), gesture C (third gesture), and gesture D (first gesture). Details such as the appearance of gestures A to D are described later. Although this embodiment is described with four registered gestures, the number of gestures is not limited to four and may be any number.

The method by which the gesture determination unit 10 determines the type of gesture is not limited to the pattern recognition method described above.

The image from which the gesture determination unit 10 determines the type of gesture is not limited to an image captured by the camera 4 and may be, for example, an image acquired from a web page.

The camera 4 may also have the functions of the gesture determination unit 10. In this case, the camera 4 may further have a function for imaging a target including the finger and characters as a subject, triggered by the finger remaining stationary for a while.
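One way such a stillness trigger might work is sketched below. The source does not specify how "stationary for a while" is detected; the per-frame fingertip coordinates, the `hold_frames` count, and the pixel `tolerance` are assumptions made for illustration.

```python
from math import hypot
from typing import List, Optional, Tuple

Point = Tuple[float, float]


def first_still_frame(
    tips: List[Point],        # hypothetical fingertip position in each successive frame
    hold_frames: int = 3,     # assumed number of consecutive frames that counts as "a while"
    tolerance: float = 2.0,   # assumed maximum drift, in pixels, still treated as stationary
) -> Optional[int]:
    """Return the index of the frame at which the capture would be triggered, or None."""
    run_start = 0
    for i in range(1, len(tips)):
        dx = tips[i][0] - tips[run_start][0]
        dy = tips[i][1] - tips[run_start][1]
        if hypot(dx, dy) > tolerance:
            run_start = i                      # finger moved: restart the stationary run
        elif i - run_start + 1 >= hold_frames:
            return i                           # finger held still long enough
    return None
```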

When the gesture that the gesture determination unit 10 has determined to be registered in advance in the character recognition device 1 is a gesture enclosing a character region, the image combining unit 11 acquires from the camera 4 the plurality of images or the video captured by the camera 4 and combines them. Here, the gesture that starts the motion of enclosing the character region corresponds to gesture D. The specific processing performed by the image combining unit 11 is described later.

When the gesture determination unit 10 determines that the gesture shown by the finger region is one registered in advance in the character recognition device 1, the character cutout unit 12 cuts out part of the image, using a different method depending on the type of gesture. The details of this cutout process are described later.

The character recognition processing unit 13 performs character recognition on the image containing the character region acquired by the character cutout unit 12, and obtains the recognized character or series of characters as character codes. The series of characters is, for example, an English word included in the character region. As the character recognition method, the character recognition processing unit 13 uses a technique such as optical character recognition (hereinafter, OCR (Optical Character Recognition)).

When the gesture that the gesture determination unit 10 has determined to be registered in advance in the character recognition device 1 is a gesture pointing at characters from above with a finger, the keyword search unit 14 searches the Internet for information via the communication unit 5, using the character or series of characters corresponding to the character codes as a key (hereinafter, keyword search). Here, the gesture pointing at characters from above with a finger corresponds to gesture A. The keyword search unit 14 then displays the keyword search results on the display unit 6.

When the gesture that the gesture determination unit 10 has determined to be registered in advance in the character recognition device 1 is a gesture pointing at characters from below with a finger, the dictionary search unit 15 searches a dictionary database, which is stored in the storage unit 3 and associates terms with their explanations, for a term, using the character or series of characters corresponding to the character codes as a key (hereinafter, dictionary search). Here, the gesture pointing at characters from below with a finger corresponds to gesture B. The dictionary search unit 15 then displays the dictionary search results on the display unit 6. The type of dictionary database stored in the storage unit 3 is not particularly limited; it may be a database of terms such as a Japanese dictionary, an English-Japanese dictionary, a Japanese-English dictionary, an encyclopedia, or a technical dictionary.

When the gesture that the gesture determination unit 10 has determined to be registered in advance in the character recognition device 1 is a gesture pinching characters between two fingers, or is gesture D, the storage processing unit 16 stores the character or series of characters corresponding to the character codes in the storage unit 3 as a memo (hereinafter, memo recording). Here, the gesture pinching characters between two fingers corresponds to gesture C. Based on a user operation received by the operation unit 7, the storage processing unit 16 may read the content of the memo from the storage unit 3 and display it on the display unit 6 at any time the user chooses.
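The correspondence between gesture and process described for units 14 to 16 (gesture A: keyword search, gesture B: dictionary search, gestures C and D: memo recording) can be sketched as a simple dispatch. The function name `run_process`, the `web_search` callback, and the in-memory `dictionary` and `memos` containers are hypothetical stand-ins for the communication unit 5 and the storage unit 3.

```python
from typing import Callable, Dict, List


def run_process(
    gesture: str,
    text: str,
    dictionary: Dict[str, str],          # stand-in for the dictionary database
    memos: List[str],                    # stand-in for memos kept in the storage unit
    web_search: Callable[[str], str],    # stand-in for an Internet search via the network
) -> str:
    """Execute the process associated with the gesture on the recognized text."""
    if gesture == "A":                   # pointing from above: keyword search
        return web_search(text)
    if gesture == "B":                   # pointing from below: dictionary search
        return dictionary.get(text, "(no entry)")
    if gesture in ("C", "D"):            # pinching or enclosing: memo recording
        memos.append(text)
        return "saved memo: " + text
    raise ValueError("unregistered gesture: " + gesture)
```

Because the gesture alone selects the process, the user never has to choose the processing type in a separate step.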

(Character cutout unit 12)
The cutout process performed by the character cutout unit 12 is described in detail below.

FIGS. 2(a) to 2(c) are external views each showing an example of a gesture registered in the character recognition device 1. Each gesture shown in FIGS. 2(a) to 2(c) is displayed on the display unit 6 of the character recognition device 1.

FIG. 2(a) shows a gesture pointing at characters from above with a finger, which corresponds to gesture A. In this case, the character cutout unit 12 cuts out the character or series of characters directly below the fingertip and outputs the resulting image to the character recognition processing unit 13 as the character region to be recognized.

FIG. 2(b) shows a gesture pointing at characters from below with a finger, which corresponds to gesture B. In this case, the character cutout unit 12 cuts out the character or series of characters directly above the fingertip and outputs the resulting image to the character recognition processing unit 13 as the character region to be recognized.

FIG. 2(c) shows a gesture pinching characters between two fingers, which corresponds to gesture C. In this case, the character cutout unit 12 cuts out the character or series of characters lying between the two fingertips and outputs the resulting image to the character recognition processing unit 13 as the character region to be recognized.
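The three cutout rules above can be sketched as a selection over word bounding boxes. This is an illustration under stated assumptions: the source does not say how text is segmented, so the `Word` boxes, the image coordinate convention (y grows downward), and the geometric tests for "directly below", "directly above", and "between" are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Tuple

Point = Tuple[float, float]  # (x, y) in image coordinates, y increasing downward


@dataclass
class Word:
    text: str
    x0: float  # left
    y0: float  # top
    x1: float  # right
    y1: float  # bottom


def select_words(gesture: str, tips: List[Point], words: List[Word]) -> List[Word]:
    """Pick the word boxes to cut out for gestures A, B, and C."""
    if gesture == "A":  # finger points from above: nearest word below the fingertip
        x, y = tips[0]
        below = [w for w in words if w.x0 <= x <= w.x1 and w.y0 >= y]
        return [min(below, key=lambda w: w.y0 - y)] if below else []
    if gesture == "B":  # finger points from below: nearest word above the fingertip
        x, y = tips[0]
        above = [w for w in words if w.x0 <= x <= w.x1 and w.y1 <= y]
        return [min(above, key=lambda w: y - w.y1)] if above else []
    if gesture == "C":  # two fingers: words on the fingers' line, between the two tips
        (xa, ya), (xb, yb) = tips
        left, right = sorted((xa, xb))
        mid_y = (ya + yb) / 2
        return [
            w for w in words
            if left <= (w.x0 + w.x1) / 2 <= right and w.y0 <= mid_y <= w.y1
        ]
    raise ValueError("unsupported gesture: " + gesture)
```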

(Image combining unit 11)
The specific processing performed by the image combining unit 11 is described below.

FIG. 3 is an external view showing an example of a gesture registered in the character recognition device 1. It shows a gesture in which the user encloses a character region by moving a finger, or by moving a finger and the camera 4.

First, to capture a character region by region designation, the user presents to the camera 4 a gesture that starts the motion of enclosing the region with a finger. The gesture determination unit 10 then determines that this gesture is part of gesture D registered in advance in the character recognition device 1.

Next, as shown in FIG. 3, the user designates the character region to be recognized by moving a finger around the character region while moving the camera 4 over a newspaper 50. During this movement, the camera 4 captures parts of the character region, either as images taken at regular intervals or as video, and thereby acquires a plurality of images or a video each containing part of the character region. The user then presents to the camera 4 a gesture indicating that the motion of enclosing the region with the finger is finished, whereupon the camera 4 stops acquiring images or video.

Meanwhile, the image combining unit 11 combines the plurality of images or the video. Specifically, the image combining unit 11 matches overlapping regions among the plurality of images, or among a plurality of images obtained by sampling the video, and joins the images together based on the matched regions. When joining them, the image combining unit 11 uses the image containing gesture D as the start point and the image containing the gesture that indicates the end of region designation as the end point. The image combining unit 11 then outputs the single image generated in this way to the character cutout unit 12.
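
The overlap matching described above could be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of the embodiment: images are modeled as left-to-right lists of pixel columns, the frames are assumed to arrive in capture order (from the start gesture to the end gesture) and to overlap only horizontally with exact pixel agreement, and the names `find_overlap` and `stitch` are hypothetical.

```python
from typing import List, Sequence

Column = Sequence[int]   # one pixel column (hypothetical representation)
Image = List[Column]     # an image as a left-to-right list of columns


def find_overlap(left: Image, right: Image, min_overlap: int = 1) -> int:
    """Return the widest k such that the last k columns of `left`
    equal the first k columns of `right` (0 if there is no match)."""
    max_k = min(len(left), len(right))
    for k in range(max_k, min_overlap - 1, -1):
        tail = [tuple(c) for c in left[-k:]]
        head = [tuple(c) for c in right[:k]]
        if tail == head:
            return k
    return 0


def stitch(frames: List[Image]) -> Image:
    """Join frames in capture order, keeping each matched overlap once."""
    if not frames:
        return []
    result: Image = list(frames[0])
    for frame in frames[1:]:
        k = find_overlap(result, frame)
        result.extend(frame[k:])   # append only the non-overlapping part
    return result
```

A real device would need tolerant matching (for example, feature-based alignment) because consecutive camera frames rarely agree pixel for pixel; exact comparison is used here only to keep the idea visible.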

(Modification)
In the configuration described above, when the camera 4 captures a subject that includes a finger and characters, the character recognition device 1 acquires the captured image from the camera 4 and determines the gesture from that image. Consequently, an image or video has to be captured by the camera 4 before the blocks of the control unit 2 can determine the finger gesture.

Accordingly, the device may instead be configured so that the video being captured by the camera 4 is output to the control unit 2 in real time and the control unit 2 continuously analyzes the incoming video.

With this configuration, for example, a word translated by the dictionary search unit 15 can be displayed in real time at the moment the user holds a finger over characters in view of the camera 4.

This configuration also makes it possible to treat fine finger movement patterns (such as moving the finger back and forth vertically or horizontally) as gesture types. For example, the gesture that starts the enclosing operation and the gesture that indicates its end can be defined not only by a special finger shape but also by a fine finger movement (such as tapping the page with the index finger), and the gesture determination unit 10 can detect them. In addition, when the shape of the finger stays the same but the position of the surrounding character region changes relative to it, the gesture determination unit 10 can determine that the enclosing operation is in progress.
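
As an illustration of how back-and-forth finger movement might be told apart from a still finger, the following sketch counts direction reversals in a fingertip track. It is an assumption-laden example rather than the method of the embodiment: the fingertip coordinates are assumed to have been extracted from each frame already, and `count_reversals`, `classify_motion`, `min_travel`, and `min_reversals` are hypothetical names and thresholds.

```python
from typing import List, Tuple

Point = Tuple[float, float]  # fingertip (x, y) in image coordinates


def count_reversals(values: List[float], min_travel: float) -> int:
    """Count direction changes, ignoring moves smaller than min_travel."""
    if not values:
        return 0
    reversals = 0
    direction = 0          # -1, 0 (not yet known), or +1
    anchor = values[0]     # last position that counted as a real move
    for v in values[1:]:
        delta = v - anchor
        if abs(delta) < min_travel:
            continue       # treat small jitter as no movement
        step = 1 if delta > 0 else -1
        if direction != 0 and step != direction:
            reversals += 1
        direction = step
        anchor = v
    return reversals


def classify_motion(track: List[Point], min_travel: float = 5.0,
                    min_reversals: int = 2) -> str:
    """Label a track as 'left-right', 'up-down', or 'none'."""
    horizontal = count_reversals([p[0] for p in track], min_travel)
    vertical = count_reversals([p[1] for p in track], min_travel)
    if horizontal >= min_reversals and horizontal > vertical:
        return "left-right"
    if vertical >= min_reversals and vertical > horizontal:
        return "up-down"
    return "none"
```

The thresholds would have to be tuned to the frame rate and image resolution of the actual camera.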

Furthermore, with this configuration, real-time recognition of the finger through the camera 4 can substitute for some of the user operations otherwise received by the operation unit 7.

(Correspondence between gestures and processing)
FIG. 4 is a table showing the correspondence between the types of gestures registered in the character recognition device 1 and the processing that the character recognition device 1 executes in response to each gesture.

As shown in FIG. 4, when the camera 4 captures a gesture of pointing at characters from above (gesture A), the keyword search unit 14 executes a keyword search. When the camera 4 captures a gesture of pointing at characters from below (gesture B), the dictionary search unit 15 executes a dictionary search. When the camera 4 captures a gesture of pinching characters between two fingers (gesture C) or a gesture of enclosing a region with a finger (gesture D), the storage processing unit 16 records the character or series of characters as a memo.
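
The correspondence in FIG. 4 can be pictured as a simple lookup table. The sketch below is illustrative only: the enum and function names are hypothetical, and only the gesture-to-processing pairs are taken from the description.

```python
from enum import Enum


class Gesture(Enum):
    A = "point at characters from above"
    B = "point at characters from below"
    C = "pinch characters between two fingers"
    D = "enclose a region with a finger"


class Action(Enum):
    KEYWORD_SEARCH = "keyword search (keyword search unit 14)"
    DICTIONARY_SEARCH = "dictionary search (dictionary search unit 15)"
    MEMO = "record as memo (storage processing unit 16)"


# Gesture-to-processing pairs as described for FIG. 4.
GESTURE_TO_ACTION = {
    Gesture.A: Action.KEYWORD_SEARCH,
    Gesture.B: Action.DICTIONARY_SEARCH,
    Gesture.C: Action.MEMO,
    Gesture.D: Action.MEMO,
}


def action_for(gesture: Gesture) -> Action:
    """Look up the processing associated with a registered gesture."""
    return GESTURE_TO_ACTION[gesture]
```

Keeping the pairs in a table rather than in branching code reflects the statement that the registered gestures and their associated processing are examples that may be changed.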

When recording characters as a memo, the character recognition device 1 may preserve the image layout, as in a newspaper clipping, or may record only the text obtained by character recognition. The character recognition device 1 may also be configured so that, after recognizing the pinching gesture or the enclosing gesture, it determines a further special finger pattern and selects between these recording methods based on the result.

In this way, the character recognition device 1 executes different processing on the characters recognized by the character recognition processing unit 13, depending on the type of gesture determined by the gesture determination means.

In the present embodiment, the gestures registered in advance in the character recognition device 1 are gestures A to D, but the types of gestures that can be registered in the character recognition device 1 are not limited to these.

Likewise, the processing executed by the character recognition device 1 is not limited to the processing described above.

The correspondence between the gesture types and the processing that the character recognition device 1 executes in response to them is only an example, and the scope of the present invention is not limited to this correspondence.

(Processing flow)
FIG. 5 is a flowchart showing an example of the processing flow in the character recognition device 1.

First, the camera 4 captures a subject that includes a finger and characters (S1). The gesture determination unit 10 then acquires the image in which this subject appears.

Next, the gesture determination unit 10 cuts out from the image a finger region, that is, a region following the outline of the finger (S2).

The gesture determination unit 10 then determines whether the gesture indicated by the finger region is one registered in advance in the character recognition device 1 (S3). That is, it determines whether the gesture corresponds to one of the four types (gesture A, gesture B, gesture C, or gesture D) or to none of them.

If the gesture is one registered in advance in the character recognition device 1 (Yes in S3), that is, if the gesture indicated by the finger region corresponds to one of the four types of gestures, the process proceeds to step S4.

If, on the other hand, the gesture is not registered in advance in the character recognition device 1 (No in S3), that is, if the gesture indicated by the finger region corresponds to none of the four types of gestures, the process returns to step S1.

Next, the image combining unit 11 determines whether the gesture determined by the gesture determination unit 10 is the gesture of enclosing a character region (that is, gesture D) (S4).

If the gesture is the gesture of enclosing a character region (Yes in S4), the process proceeds to step S5.

If, on the other hand, the gesture is not the gesture of enclosing a character region (No in S4), the process proceeds to step S6.

The image combining unit 11 then acquires the plurality of captured images or the video from the camera 4 and combines them (S5).

Next, the character cutout unit 12 cuts out a partial region from the image, using a different method depending on the type of gesture determined by the gesture determination unit 10 (S6). For example, if the gesture indicated by the finger region is gesture A, the character cutout unit 12 cuts out, as the character region, the character or series of characters directly below the tip of the finger in the image.
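
The gesture-dependent cutout in step S6 might be sketched as a selection among word bounding boxes. The rule for gesture A is stated above; the rules for gestures B and C follow the description of FIG. 2 and the claims. Everything else is an assumption made for illustration: word bounding boxes and fingertip coordinates are taken as already available, the y axis is assumed to grow downward, and `WordBox`, `word_below`, `word_above`, and `words_between` are hypothetical names.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

Point = Tuple[float, float]  # (x, y); y grows downward (assumption)


@dataclass(frozen=True)
class WordBox:
    text: str
    x0: float  # left
    y0: float  # top
    x1: float  # right
    y1: float  # bottom


def word_below(tip: Point, boxes: List[WordBox]) -> Optional[WordBox]:
    """Gesture A: nearest word directly below the fingertip."""
    x, y = tip
    hits = [b for b in boxes if b.x0 <= x <= b.x1 and b.y0 >= y]
    return min(hits, key=lambda b: b.y0 - y, default=None)


def word_above(tip: Point, boxes: List[WordBox]) -> Optional[WordBox]:
    """Gesture B: nearest word directly above the fingertip."""
    x, y = tip
    hits = [b for b in boxes if b.x0 <= x <= b.x1 and b.y1 <= y]
    return min(hits, key=lambda b: y - b.y1, default=None)


def words_between(tip1: Point, tip2: Point,
                  boxes: List[WordBox]) -> List[WordBox]:
    """Gesture C: words whose centers lie between the two fingertips
    on the text line that passes through the fingertips' mean height."""
    left, right = sorted((tip1[0], tip2[0]))
    mid_y = (tip1[1] + tip2[1]) / 2
    return [
        b for b in boxes
        if left <= (b.x0 + b.x1) / 2 <= right and b.y0 <= mid_y <= b.y1
    ]
```

Gesture D is not covered here because its region comes from the enclosing movement over the combined image rather than from a single fingertip position.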

The character recognition processing unit 13 then executes character recognition on the character region cut out by the character cutout unit 12 (S7), and obtains the recognized character or series of characters as character codes.

Subsequently, the gesture determination unit 10 selects the processing that will use the character codes, according to the type of gesture it has determined (S8).

Specifically, if the gesture determined by the gesture determination unit 10 is pointing at characters from above with a finger (gesture A in S8), the gesture determination unit 10 instructs the keyword search unit 14 to execute a keyword search.

Triggered by this instruction from the gesture determination unit 10, the keyword search unit 14 executes a keyword search using the character or series of characters corresponding to the character codes (S9).

If the gesture determined by the gesture determination unit 10 is pointing at characters from below with a finger (gesture B in S8), the gesture determination unit 10 instructs the dictionary search unit 15 to execute a dictionary search.

Triggered by this instruction from the gesture determination unit 10, the dictionary search unit 15 executes a word search using the character or series of characters corresponding to the character codes and the dictionary database stored in the storage unit 3 (S10).

If the gesture determined by the gesture determination unit 10 is pinching characters between two fingers (gesture C in S8) or enclosing a character region with a finger (gesture D in S8), the gesture determination unit 10 instructs the storage processing unit 16 to record a memo.

Triggered by this instruction from the gesture determination unit 10, the storage processing unit 16 stores the character or series of characters corresponding to the character codes in the storage unit 3 as a memo (S11).
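
The control flow of steps S1 to S11 can be summarized in a short sketch. It is a schematic reading of the flowchart, not the actual control program: every processing stage is passed in as a callable so that the branching stays visible, the parameter names are hypothetical, and `max_attempts` is an added assumption that keeps the retry loop (S3: No, back to S1) finite.

```python
from typing import Any, Callable, Dict, List, Optional


def process(
    capture: Callable[[], Any],                  # S1: take one image
    capture_sequence: Callable[[], List[Any]],   # images taken while enclosing
    classify: Callable[[Any], Optional[str]],    # S2, S3: "A".."D" or None
    combine: Callable[[List[Any]], Any],         # S5: join images
    cut_out: Callable[[Any, str], Any],          # S6: gesture-specific cutout
    recognize: Callable[[Any], str],             # S7: character recognition
    handlers: Dict[str, Callable[[str], Any]],   # S9, S10, S11 by gesture
    max_attempts: int = 10,                      # assumption, not in FIG. 5
) -> Any:
    for _ in range(max_attempts):
        image = capture()                         # S1
        gesture = classify(image)                 # S2, S3
        if gesture not in handlers:               # S3: No, return to S1
            continue
        if gesture == "D":                        # S4: Yes
            image = combine(capture_sequence())   # S5
        region = cut_out(image, gesture)          # S6
        text = recognize(region)                  # S7
        return handlers[gesture](text)            # S8, then S9/S10/S11
    return None
```

In use, `handlers` would map "A" to the keyword search, "B" to the dictionary search, and both "C" and "D" to memo recording, matching the branches described above.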

(Summary)
A character recognition device according to one aspect of the present invention is a character recognition device that recognizes characters included in an image, and includes: image acquisition means (camera 4) for acquiring an image including a finger and characters; gesture recognition means (gesture determination unit 10) for recognizing a gesture, in the image acquired by the image acquisition means, from position information of a finger portion and a character portion adjacent to the finger portion; finger gesture determination means (gesture determination unit 10) for determining which of a plurality of gestures stored in advance matches the finger gesture recognized by the gesture recognition means; character recognition means (character recognition processing unit 13) for recognizing a character or a series of characters at the position indicated by the finger recognized by the gesture recognition means; and processing execution means (keyword search unit 14, dictionary search unit 15, storage processing unit 16) for executing, on the character or series of characters recognized by the character recognition means, the processing associated with the gesture that the finger gesture determination means has determined to match.

A control method for a character recognition device according to one aspect of the present invention is a control method for a character recognition device that recognizes characters included in an image, and includes: an image acquisition step (S1) of acquiring an image including a finger and characters; a gesture recognition step (S2, S3) of recognizing a gesture, in the image acquired in the image acquisition step, from position information of a finger portion and a character portion adjacent to the finger portion; a finger gesture determination step (S3, S4) of determining which of a plurality of gestures stored in advance matches the finger gesture recognized in the gesture recognition step; a character recognition step (S7) of recognizing a character or a series of characters at the position indicated by the finger recognized in the gesture recognition step; and a processing execution step (S9, S10, S11) of executing, on the character or series of characters recognized in the character recognition step, the processing associated with the gesture determined to match in the finger gesture determination step.

With this configuration, the character recognition device recognizes, in an image including a finger and characters, the character (or series of characters) at the position indicated by the finger, and executes the processing associated with the finger gesture on the recognized character (or series of characters). Here, a series of characters is, for example, an English word in an English sentence.

As a result, the character recognition device can recognize the desired characters and execute the desired processing on them through finger gestures alone.

The user can therefore have character recognition and the desired processing of the recognized characters executed in succession, using only the simple and intuitive operation of indicating the desired character region with a gesture.

Further, the image acquisition means may acquire a plurality of images that share a common portion with one another, and the character recognition device may include combining means (image combining unit 11) for combining the plurality of images by superimposing the common portions when the finger gesture determination means determines that the finger gesture included in each of the plurality of images matches a first gesture (gesture D), which is one of the plurality of gestures.

With this configuration, the character recognition device acquires a plurality of images that share a common portion with one another and, when the finger gesture included in each of the images is one of the plurality of gestures (the first gesture), combines the images by superimposing the common portions.

The user can therefore easily set the target range for character recognition even for a subject that does not fit within the imaging range of a single shot.

Further, the character recognition device may include communication means (communication unit 5) for communicating with external devices via a communication network, and the processing execution means (keyword search unit 14) may, when the finger gesture determination means determines that the finger gesture recognized by the gesture recognition means matches a second gesture (gesture A), which is one of the plurality of gestures, retrieve information from the communication network via the communication means, using the character or series of characters recognized by the character recognition means as a key.

With this configuration, when the recognized finger gesture is one of the plurality of gestures (the second gesture), the character recognition device retrieves information from the communication network using the recognized character (or series of characters) as a key.

The user can therefore have characters recognized and retrieve information from the communication network with the recognized characters as a key, using only the simple and intuitive operation of indicating the desired characters with a gesture.

Further, the character recognition device may include a dictionary database in which terms are associated with their explanations, and the processing execution means (dictionary search unit 15) may, when the finger gesture determination means determines that the finger gesture recognized by the gesture recognition means matches a third gesture (gesture B), which is one of the plurality of gestures, search the terms in the dictionary database using the character or series of characters recognized by the character recognition means as a key.

With this configuration, when the recognized finger gesture is one of the plurality of gestures (the third gesture), the character recognition device searches for a term in the dictionary database, in which terms are associated with their explanations, using the recognized character (or series of characters) as a key.

The user can therefore have characters recognized and run a dictionary search with the recognized characters as a key, using only the simple and intuitive operation of indicating the desired characters with a gesture.

Further, the character recognition device may include a storage unit that stores characters, and the processing execution means (storage processing unit 16) may, when the finger gesture determination means determines that the finger gesture recognized by the gesture recognition means matches a fourth gesture (gesture C), which is one of the plurality of gestures, store the character or series of characters recognized by the character recognition means in the storage unit.

With this configuration, when the recognized finger gesture is one of the plurality of gestures (the fourth gesture), the character recognition device stores the recognized character (or series of characters).

The user can therefore have characters recognized and the recognized characters stored, using only the simple and intuitive operation of indicating the desired characters with a gesture.

The character recognition device may be realized by a computer. In that case, a control program for the character recognition device that realizes the device on a computer by causing the computer to operate as each of the means described above, and a computer-readable recording medium on which the program is recorded, also fall within the scope of the present invention.

The present invention is not limited to the embodiment described above, and various modifications are possible within the scope of the claims. That is, embodiments obtained by combining technical means modified as appropriate within the scope of the claims are also included in the technical scope of the present invention.

(Example of software implementation)
Finally, each block of the character recognition device 1, in particular the control unit 2, may be realized in hardware by logic circuits formed on an integrated circuit (IC chip), or in software using a CPU (Central Processing Unit).

In the latter case, the character recognition device 1 includes a CPU that executes the instructions of a program realizing each function, a ROM (Read Only Memory) that stores the program, a RAM (Random Access Memory) into which the program is loaded, and a storage device (recording medium) such as a memory that stores the program and various data. The object of the present invention can also be achieved by supplying the character recognition device 1 with a recording medium on which the program code (executable program, intermediate code program, or source program) of the control program of the character recognition device 1, which is software realizing the functions described above, is recorded in a computer-readable form, and by having the computer (or a CPU or MPU) read and execute the program code recorded on the recording medium.

Examples of the recording medium include non-transitory tangible media, such as: tapes such as magnetic tape and cassette tape; disks, including magnetic disks such as floppy (registered trademark) disks and hard disks, and optical discs such as CD-ROM, MO, MD, DVD, and CD-R; cards such as IC cards (including memory cards) and optical cards; semiconductor memories such as mask ROM, EPROM, EEPROM (registered trademark), and flash ROM; and logic circuits such as PLDs (programmable logic devices) and FPGAs (field-programmable gate arrays).

The character recognition device 1 may also be configured to be connectable to a communication network, and the program code may be supplied via the communication network. The communication network is not particularly limited as long as it can transmit the program code. For example, the Internet, an intranet, an extranet, a LAN, ISDN, a VAN, a CATV communication network, a virtual private network, a telephone line network, a mobile communication network, or a satellite communication network can be used. The transmission medium constituting the communication network is likewise not limited to a specific configuration or type, as long as it can transmit the program code. For example, wired media such as IEEE 1394, USB, power line carrier, cable TV lines, telephone lines, and ADSL (Asymmetric Digital Subscriber Line) lines can be used, as can wireless media such as infrared (for example, IrDA or a remote control), Bluetooth (registered trademark), IEEE 802.11 wireless, HDR (High Data Rate), NFC (Near Field Communication), DLNA (Digital Living Network Alliance), mobile phone networks, satellite links, and terrestrial digital networks. The present invention can also be realized in the form of a computer data signal embedded in a carrier wave, in which the program code is embodied by electronic transmission.

The present invention can be used in information processing apparatuses equipped with a character recognition function for recognizing characters included in an image. In particular, it is widely applicable to various information processing apparatuses such as personal computers, mobile phones, smartphones, tablet PCs, and game machines.

1 Character recognition device
3 Storage unit
4 Camera (image acquisition means)
5 Communication unit (communication means)
10 Gesture determination unit (gesture recognition means, finger gesture determination means)
11 Image combining unit (combining means)
12 Character cutout unit (character recognition means)
13 Character recognition processing unit (character recognition means)
14 Keyword search unit (processing execution means)
15 Dictionary search unit (processing execution means)
16 Storage processing unit (processing execution means)

Claims

A character recognition device for recognizing characters included in an image, comprising:
image acquisition means for acquiring an image including a finger and characters;
gesture recognition means for recognizing a gesture, in the image acquired by the image acquisition means, from position information of a finger portion and a character portion adjacent to the finger portion;
finger gesture determination means for determining which of a plurality of gestures stored in advance matches the finger gesture recognized by the gesture recognition means;
character recognition means for recognizing a character or a series of characters at the position indicated by the finger recognized by the gesture recognition means; and
processing execution means for executing, on the character or series of characters recognized by the character recognition means, the processing associated with the gesture that the finger gesture determination means has determined to match,
wherein the position indicated by the finger recognized by the gesture recognition means is:
when the finger points at a character from the upper part of the display unit, the character or series of characters directly below the tip of the finger in the display unit;
when the finger points at a character from the lower part of the display unit, the character or series of characters directly above the tip of the finger in the display unit; and
when a character is pinched between two fingers, the character or series of characters in the portion between the tips of the two fingers.

The character recognition device according to claim 1, wherein
the image acquisition means acquires a plurality of images that share a common portion with one another, and
the character recognition device further comprises combining means for combining the plurality of images by superimposing the common portions when the finger gesture determination means determines that the finger gesture included in each of the plurality of images matches a first gesture that is one of the plurality of gestures.

The character recognition device according to claim 1, further comprising communication means for communicating with an external device via a communication network,
wherein, when the finger gesture determination means determines that the finger gesture recognized by the gesture recognition means matches a second gesture that is one of the plurality of gestures, the processing execution means retrieves information from the communication network via the communication means, using the character or series of characters recognized by the character recognition means as a key.

The character recognition device according to claim 1, further comprising a dictionary database in which terms are associated with descriptions of the terms,
wherein, when the finger gesture determination means determines that the finger gesture recognized by the gesture recognition means matches a third gesture that is one of the plurality of gestures, the processing execution means searches the dictionary database for a term, using the character or series of characters recognized by the character recognition means as a key.

The character recognition device according to any one of claims 1 to 4, further comprising a storage unit for storing characters,
wherein, when the finger gesture determination means determines that the finger gesture recognized by the gesture recognition means matches a fourth gesture that is one of the plurality of gestures, the processing execution means stores the character or series of characters recognized by the character recognition means in the storage unit.
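
The association between a matched gesture and its processing (network search, dictionary lookup, storage) can be pictured as a dispatch table. The Python sketch below is only one possible arrangement under stated assumptions: the `Gesture` enum, the `ProcessingExecutor` class, and the injected `search_fn` callable standing in for the communication means are all hypothetical names, and no real network access is performed.

```python
from enum import Enum, auto


class Gesture(Enum):
    """Hypothetical identifiers for the stored gestures."""
    SECOND = auto()  # search the communication network
    THIRD = auto()   # look up the dictionary database
    FOURTH = auto()  # store the recognized characters


class ProcessingExecutor:
    def __init__(self, dictionary, search_fn):
        self.dictionary = dictionary  # term -> description
        self.search_fn = search_fn    # stand-in for the communication means
        self.stored = []              # stand-in for the storage unit
        # Each gesture is associated with exactly one kind of processing.
        self._handlers = {
            Gesture.SECOND: self._search_network,
            Gesture.THIRD: self._lookup_dictionary,
            Gesture.FOURTH: self._store,
        }

    def execute(self, gesture, text):
        """Run the processing associated with `gesture` on `text`."""
        handler = self._handlers.get(gesture)
        return handler(text) if handler else None

    def _search_network(self, text):
        return self.search_fn(text)

    def _lookup_dictionary(self, text):
        return self.dictionary.get(text)

    def _store(self, text):
        self.stored.append(text)
        return text
```

Passing the search function in as a parameter keeps the sketch self-contained; in an actual device it would be replaced by whatever the communication means provides.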

A method for controlling a character recognition device that recognizes characters included in an image, the method including:
An image acquisition step of acquiring an image including a finger and a character;
A gesture recognition step of recognizing a finger gesture from position information of a finger portion, and of a character portion adjacent to the finger portion, in the image acquired in the image acquisition step;
A finger gesture determination step of determining which of a plurality of gestures stored in advance the finger gesture recognized in the gesture recognition step matches;
A character recognition step of recognizing a character or a series of characters at the position indicated by the finger recognized in the gesture recognition step; and
A processing execution step of executing, on the character or series of characters recognized in the character recognition step, the processing associated with the gesture determined to match in the finger gesture determination step,
The position indicated by the finger recognized in the gesture recognition step is:
When the finger points at a character from the upper part of the display unit, the character or series of characters directly below the tip of the finger on the display unit;
When the finger points at a character from the lower part of the display unit, the character or series of characters directly above the tip of the finger on the display unit; and
When a character is sandwiched between two fingers, the character or series of characters in the portion sandwiched between the tips of the two fingers.
A control method for a character recognition device.

A control program for operating a computer included in the character recognition device according to any one of claims 1 to 5, wherein the control program causes the computer to function as each of the means.

A computer-readable recording medium on which the control program according to claim 7 is recorded.