JP2012160023A

JP2012160023A - Character extracting device, display method, and character extracting method

Info

Publication number: JP2012160023A
Application number: JP2011019243A
Authority: JP
Inventors: Toyokazu Itakura; 豊和板倉; Takehide Yano; 武秀屋野; Osahiro Ogawa; 修太小川; Yuichi Togashi; 雄一富樫
Original assignee: Toshiba Corp
Current assignee: Toshiba Corp
Priority date: 2011-01-31
Filing date: 2011-01-31
Publication date: 2012-08-23

Abstract

PROBLEM TO BE SOLVED: To provide a character extracting device capable of executing a suitable extraction process on a character image included in a video, and to provide a display method and a character extraction method. SOLUTION: The character extracting device according to the present embodiment includes: reproduction means for reproducing video data; output means for outputting the video data reproduced by the reproduction means to a display device; acceptance means for accepting an external input that corresponds to any position in the video displayed by the display device using the video data; image extraction means for extracting a first-shape image that is present within a predetermined range from the position corresponding to the external input, in the video displayed by the display device; and character extraction means for extracting a character from the image extracted by the image extraction means.

Description

Embodiments of the present invention relate to a character extraction device, a display method, and a character extraction method for extracting characters from video data.

In the main part of a television broadcast program, in a commercial, and the like, video that prompts viewers to search on the Internet is sometimes displayed. In such video, images of a search button and a search box imitating an Internet search page are arranged. An image of the characters to be searched for is placed inside the search box, prompting viewers to search for those characters.

JP 2009-44658 A

When a user performs processing such as a search using those characters, it is preferable that a device such as a television receiver or a display device extract the characters from the character image in the video, sparing the user the trouble of memorizing or writing them down. It is also preferable that the device be able to execute this character extraction processing suitably.

Accordingly, an object of the embodiments of the present invention is to provide a character extraction device, a display method, and a character extraction method that can execute suitable extraction processing on a character image included in video.

To solve the above problem, the character extraction device of the present embodiment includes: playback means for playing back video data; output means for outputting the video data played back by the playback means to a display device; accepting means for accepting an external input corresponding to any position in the video that the display device displays using the video data; image extraction means for extracting an image of a first shape located, in the video displayed by the display device, within a certain range of the position to which the external input corresponds; and character extraction means for extracting characters from the image extracted by the image extraction means.

FIG. 1 is a diagram showing a usage example of the display device according to the first embodiment.
FIG. 2 is a diagram showing a system configuration example of the display device according to the first embodiment.
FIG. 3 is a diagram showing a configuration example of the extracted character history database generated by the display device according to the first embodiment.
FIG. 4 is a diagram showing an example of character extraction processing by the display device according to the first embodiment.
FIG. 5 is a diagram showing an example of transitions of the screen displayed by the display device according to the first embodiment.
FIG. 6 is a diagram showing an example of the processing flow of character extraction by the display device according to the first embodiment.
FIG. 7 is a diagram showing a usage example of the display device and the information terminal according to the second embodiment.
FIG. 8 is a diagram showing a system configuration example of the display device and the information terminal according to the second embodiment.
FIG. 9 is a diagram showing an example of a character extraction processing sequence performed by the display device and the information terminal according to the second embodiment.
FIG. 10 is a diagram showing an example of transitions of the screens displayed by the display device and the information terminal according to the third embodiment.
FIG. 11 is a diagram showing an example of a character extraction processing sequence performed by the display device and the information terminal according to the third embodiment.

(First Embodiment)
A first embodiment will now be described with reference to the drawings. The character extraction device according to the present embodiment is realized as, for example, a display device 100. The display device 100 has a function of receiving video data such as a television broadcast and displaying video based on the received video data on a display unit 107. The display device 100 also receives operation signals from an operation device such as a pointing device 200 and executes processing according to those signals.

The display device 100 displays, for example, a screen P20 of commercial (CM) video on the display unit 107. The screen P20 shows a search box image P11 and a search button image P12 included in the CM video, a cursor image P21 generated by the display device 100, and the like. The position of the cursor image P21 changes according to movement instruction operation inputs from the pointing device 200. When the display device 100 according to the embodiment receives, for example, an operation input to the button 201 or 202 of the pointing device 200, it can detect the search box image P11 within a certain range of the position where the cursor image P21 is placed and extract the characters included in the image P11. Details are described later with reference to FIGS. 2 to 6.

FIG. 2 shows a system configuration example of the display device 100. The display device 100 includes a receiving unit 101, a storage unit 102, a recording/playback control unit 103, a decoding unit 104, a combining unit 105, a signal processing unit 106, a display unit 107, an operation receiving unit 108, a GUI processing unit 109, an image extraction unit 110, a character extraction unit 111, an extracted character database 112, a search unit 113, a communication unit 114, and the like.

The receiving unit 101 receives video data of, for example, satellite digital television broadcasts or terrestrial digital television broadcasts. The receiving unit 101 receives the video data of the channel specified by an instruction from the recording/playback control unit 103 and outputs the received video data to the recording/playback control unit 103. The receiving unit 101 may also receive video data such as IPTV distributed via a network such as the Internet. Furthermore, the receiving unit 101 may receive program information superimposed on broadcast waves or program information stored on a server on a network. In this program information, a broadcast channel, broadcast time, program title, performer names, a description of the program content, and the like are associated with each broadcast program.

The storage unit 102 has a function of storing (recording) the stream of video data input from the recording/playback control unit 103. When the storage unit 102 receives a video data playback instruction from the recording/playback control unit 103, it outputs the stored video data that corresponds to the instruction to the recording/playback control unit 103.

The recording/playback control unit 103 controls playback and recording of video data. In playback processing, the recording/playback control unit 103 causes the decoding unit 104 to decode (play back) the video data received by the receiving unit 101 or the video data stored in the storage unit 102. For example, when the operation receiving unit 108 receives from the user an instruction to play back the broadcast program of a given channel, the recording/playback control unit 103 instructs the receiving unit 101 to receive and output the video data of that channel. During playback, the recording/playback control unit 103 outputs the program information for the video data being played back to the character extraction unit 111.

When the operation receiving unit 108 receives an instruction to play back recorded video data, the recording/playback control unit 103 instructs the storage unit 102 to output the video data corresponding to that instruction. At this time, the recording/playback control unit 103 controls playback and stopping of the video data according to the operation inputs received by the operation receiving unit 108.

In recording processing, the recording/playback control unit 103 outputs the stream of video data received by the receiving unit 101 to the storage unit 102 and causes the storage unit 102 to store (record) the video data.

The decoding unit 104 decodes the stream of video data input from the recording/playback control unit 103 and outputs the decoded data to the combining unit 105 and the image extraction unit 110.

The combining unit 105 superimposes the video data input from the decoding unit 104 and the GUI video data input from the GUI processing unit 109, and outputs the superimposed video data to the signal processing unit 106. The signal processing unit 106 converts the input video data into a video signal in a format that the display can show and outputs the video signal to the display unit 107. The display unit 107 displays video using the input video signal.

The operation receiving unit 108 receives operation input signals from an operation device such as the pointing device 200 or a remote controller. The operation receiving unit 108 receives, for example, a movement instruction operation input for moving the cursor image P21, an extraction instruction operation input for instructing character extraction, and playback and stop operation inputs for recorded video data. The operation receiving unit 108 outputs the received operation inputs to the GUI processing unit 109 and the recording/playback control unit 103 (the output path to the recording/playback control unit 103 is not illustrated).

The GUI processing unit 109 generates the video data of the GUI (Graphical User Interface) displayed by the display device 100 and processes the user's operation inputs to the GUI. The GUI processing unit 109 generates, for example, the data of the cursor image P21 and outputs the generated data to the combining unit 105. The GUI processing unit 109 generates the video data so that the cursor image P21 is placed at the position corresponding to the movement instruction operation input received from the operation receiving unit 108.

When the GUI processing unit 109 receives an extraction instruction operation input, it instructs the image extraction unit 110 to extract an image. The extraction instruction operation input is, for example, an operation input to the button 201 or 202 of the pointing device 200.

Video data is input to the image extraction unit 110 from the decoding unit 104. In response to an instruction from the GUI processing unit 109, the image extraction unit 110 extracts the search box image included in the input video data. The image extraction unit 110 then extracts the character image included in the extracted search box image and outputs the extracted character image to the character extraction unit 111.

The character extraction unit 111 analyzes the input character image and detects and extracts the characters included in it. In other words, the character extraction unit 111 generates character data (text data) from the character image. The character extraction unit 111 also receives, from the recording/playback control unit 103, the program information for the video data being played back (channel, broadcast time, program title, performer names, program description, and the like). The character extraction unit 111 outputs the extracted character data, together with the program information for the video data from which the character data was extracted, to the extracted character database 112 and the search unit 113.

The extracted character database 112 stores the character data input from the character extraction unit 111 in association with the program information of the video data from which the character data was extracted. This database is described later with reference to FIG. 3.

The search unit 113 executes a search using the character data extracted by the character extraction unit 111 and, for example, a search engine hosted on a server on the Internet. To execute the search, the search unit 113 first starts browser software for using the Internet. The search unit 113 then controls the browser software so that data corresponding to the extracted character data is transmitted to the URI of the search engine via the communication unit 114.
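As a minimal sketch of how data corresponding to the extracted characters might be sent to a search engine URI, the following builds a query URL. The endpoint `https://search.example.com/search` and the parameter name `q` are hypothetical placeholders, not anything specified in this publication.

```python
from urllib.parse import urlencode


def build_search_url(extracted_text: str,
                     engine_uri: str = "https://search.example.com/search",
                     param: str = "q") -> str:
    """Return a search URL for the extracted characters.

    engine_uri and param are illustrative assumptions; a real search
    engine defines its own URI and query parameter names.
    """
    # urlencode percent-encodes non-ASCII text as UTF-8 and turns spaces into '+'.
    query = urlencode({param: extracted_text.strip()})
    return f"{engine_uri}?{query}"
```

A browser component could then be asked to open the returned URL; how the browser software is driven is left open here.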

Next, the search unit 113 receives the web page data for the search results transmitted from the server. The search unit 113 uses the browser software to compose a search result screen based on the received web page data, and instructs the GUI processing unit 109 to generate the video data of the composed search result screen.

The search unit 113 can also instruct the GUI processing unit 109 to generate a history screen that displays a history of the character data input from the character extraction unit 111 or the extracted character database 112 and the program information related to that character data.

FIG. 3 is a diagram showing a data configuration example of the extracted character database 112. The database 112 stores each extracted character string B1 in association with the date and time B2 at which it was extracted, the program information B3 of the program from which it was extracted, and the like. The program information B3 is information based on program information superimposed on broadcast waves, program information distributed by a program information distribution server on a network, and the like, and includes, for example, the program name, broadcast channel, broadcast time, performer names, and a description of the program content.
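The association of B1, B2, and B3 can be pictured as a simple record type. The sketch below is illustrative only: the class and field names are assumptions, and an in-memory list stands in for whatever storage the device actually uses.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import List, Optional


@dataclass
class ProgramInfo:
    """Program information B3 (field names are hypothetical)."""
    title: str
    channel: str
    broadcast_time: str
    performers: List[str] = field(default_factory=list)
    description: str = ""


@dataclass
class ExtractedEntry:
    text: str               # B1: extracted characters
    extracted_at: datetime  # B2: extraction date and time
    program: ProgramInfo    # B3: program the characters came from


class ExtractedCharacterDatabase:
    """In-memory stand-in for the extracted character database 112."""

    def __init__(self) -> None:
        self._entries: List[ExtractedEntry] = []

    def store(self, text: str, program: ProgramInfo,
              now: Optional[datetime] = None) -> ExtractedEntry:
        entry = ExtractedEntry(text, now or datetime.now(), program)
        self._entries.append(entry)
        return entry

    def history(self) -> List[ExtractedEntry]:
        # Newest first, as a history screen would typically list entries.
        return sorted(self._entries, key=lambda e: e.extracted_at, reverse=True)
```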

FIG. 4 shows an example of character extraction processing by the display device 100. FIG. 4(A) shows an example of the screen that the display unit 107 displays while the display device 100 is playing back video data. The screen P10 is an example of one screen of commercial video or the like, and includes a search box image P11 and a search button image P12 based on the played-back video data.

The search box image P11 and the search button image P12 imitate, for example, the search term input box and the search execution button displayed on a search web page on the Internet. The search box image P11 is placed immediately to the left of the search button image P12. The search box image P11 contains an image of the characters that the producer or distributor of the video data including the screen P10 wants the user to search for.

When the operation receiving unit 108 receives an operation input from, for example, the pointing device 200, the display unit 107 displays a screen P20 in which the cursor image P21 is superimposed on the screen P10. The position of the cursor image P21 on the screen P20 changes according to movement operation inputs from the pointing device 200.

When the operation receiving unit 108 receives, for example, a button operation input from the pointing device 200, the GUI processing unit 109 instructs the image extraction unit 110 to extract an image. With this instruction, the GUI processing unit 109 outputs information on the position (coordinates) of the cursor image P21 on the screen P20 to the image extraction unit 110.

Upon receiving the image extraction instruction, the image extraction unit 110 extracts, for example, one frame from the video data input from the decoding unit 104. Based on the position information of the cursor image P21, the image extraction unit 110 then extracts, from the extracted frame, the search box image P11 located within a certain distance of the position of the cursor image P21. Specifically, within a range A2 extending, for example, a distance L1 upward, a distance L2 downward, and a distance L3 leftward from the position of the tip A1 of the cursor image P21, the image extraction unit 110 distinguishes flat portions, where the luminance change of the image is small, from edge portions, where the luminance change is large, and extracts an image of a predetermined size (resolution) that has a substantially rectangular edge portion.
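The range A2 and the flat/edge discrimination can be sketched as follows. This is an illustration under stated assumptions, not the method of the publication: the frame is a 2D list of luminance values, the right edge of A2 is taken to be the cursor tip (the text gives no rightward distance), an "edge" is a pixel whose luminance differs from its left or upper neighbor by more than a hypothetical threshold, and all function names are invented for the example.

```python
from typing import List, Optional, Tuple

Box = Tuple[int, int, int, int]  # (left, top, right, bottom); right/bottom exclusive


def search_region(tip: Tuple[int, int], l1: int, l2: int, l3: int,
                  width: int, height: int) -> Box:
    """Range A2: L1 up, L2 down, L3 left of the cursor tip A1, clamped to the frame."""
    x, y = tip
    return (max(0, x - l3), max(0, y - l1),
            min(width, x + 1), min(height, y + l2 + 1))


def edge_mask(luma: List[List[int]], region: Box, threshold: int) -> List[List[bool]]:
    """Mark pixels in the region whose luminance change exceeds the threshold."""
    left, top, right, bottom = region
    mask = []
    for yy in range(top, bottom):
        row = []
        for xx in range(left, right):
            dx = abs(luma[yy][xx] - luma[yy][xx - 1]) if xx > 0 else 0
            dy = abs(luma[yy][xx] - luma[yy - 1][xx]) if yy > 0 else 0
            row.append(max(dx, dy) > threshold)  # False means "flat portion"
        mask.append(row)
    return mask


def edge_bounding_box(mask: List[List[bool]], region: Box) -> Optional[Box]:
    """Bounding box (frame coordinates) of all edge pixels, or None if there are none."""
    left, top, _, _ = region
    xs = [left + i for row in mask for i, is_edge in enumerate(row) if is_edge]
    ys = [top + j for j, row in enumerate(mask) if any(row)]
    if not xs:
        return None
    return (min(xs), min(ys), max(xs) + 1, max(ys) + 1)
```

A practical detector would also need to confirm that the edge pixels actually trace a roughly rectangular outline; the bounding box here is only the simplest stand-in for that step.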

As described above, in commercial video and the like, the search box image P11 is often placed immediately to the left of the search button image P12. The search box image is also often a rectangle that is longer horizontally than vertically, and it usually has a size (resolution) within a predetermined range that the user can see clearly. Therefore, when the cursor image P21 is at a position corresponding to the search button image P12, the operation receiving unit 108 receives an operation input at the position corresponding to that image P12, and the image extraction unit 110 detects a substantially rectangular image of a predetermined size within a certain range of the position of the search button image P12. In this way, the search box image P11 can be detected with high accuracy, and the character image included in the search box image P11 can be detected and extracted.
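The shape and size heuristics described above can be expressed as a small candidate filter. The numeric limits below are hypothetical defaults chosen only for illustration; the publication does not give concrete values.

```python
def looks_like_search_box(width: int, height: int,
                          min_width: int = 120, max_width: int = 1200,
                          min_height: int = 20, max_height: int = 200) -> bool:
    """Return True if a detected rectangle is plausibly a search box image.

    Checks the two properties named in the text: the box is longer
    horizontally than vertically, and its size lies within a range that a
    viewer could read. All thresholds are illustrative assumptions.
    """
    if width <= height:
        return False
    return (min_width <= width <= max_width
            and min_height <= height <= max_height)
```

For example, a 400 x 40 pixel rectangle passes, while a 40 x 400 pixel rectangle is rejected because it is taller than it is wide.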

However, the search box image P11 is not necessarily rectangular, so the image extraction unit 110 may extract not only rectangular images but also images of other predetermined shapes, such as an image whose edge portion forms a closed loop or a horizontally long image. The image extraction unit 110 may also detect and extract the search box image P11 when an operation is received at a position overlapping the search box image P11 or at a position near the image P11. In this case as well, the image extraction unit 110 can detect the search box image P11 by detecting an image of a predetermined shape within a predetermined range of the position where the operation was received.

When the image extraction unit 110 extracts the search box image P11 as shown in FIG. 4(C), it extracts the character string images P30 and P40 included in the extracted image, as shown in FIG. 4(D). The character extraction unit 111 then extracts the characters T1 and T2 included in the images P30 and P40.

FIG. 5 is a diagram showing an example of the screen transitions displayed when the display device 100 executes a search using the extracted characters.
First, the display device 100 displays the screen P20 shown in FIG. 5(A). On the screen P20, the search box image P11 and the search button image P12 are placed based on the video received by the receiving unit 101 or the video stored in the storage unit 102, and the cursor image P21 is placed according to the user's operation input. When the operation receiving unit 108 receives, for example, a button operation input from the pointing device 200, the image extraction unit 110 and the character extraction unit 111 convert the character image included in the search box image P11 into character data. The search unit 113 then executes a search using that character data. When the display device 100 obtains the search result information, it displays the screen P50 shown in FIG. 5(B).

FIG. 5(B) shows a screen configuration example for when the display device 100 displays search results. The screen P50 shows playback video P60, search result video P70, history video P80, and the like. The playback video P60 is the video of the video data being decoded (played back) by the decoding unit 104, that is, the video data being received by the receiving unit 101 or the video data stored in the storage unit 102.

The search result video P70 is, for example, a browser software window, and shows, as the search results obtained from an Internet server, for example a retrieved web page P71. The history video P80 is based on the information stored in the extracted character database 112, and shows the characters P81 extracted by the character extraction unit 111 in association with their extraction time P82, program information P83, and the like.

The screen configuration for displaying search results is not limited to this example. For instance, when an extraction instruction operation input is received from the user on the screen P20, the search result video P70 may be displayed over the entire display screen of the display unit 107.

FIG. 6 is a diagram showing an example of the processing flow for character extraction by the display device 100.
First, the display device 100 plays back video data and outputs the playback video to the display unit 107 (S301). When the image extraction unit 110 receives an extraction operation input from the user (Yes in S302), it detects and extracts the search box image P11 from within a certain range around the position where the cursor image P21 is displayed in the playback video (S303). The image extraction unit 110 then extracts the character image included in the extracted search box image P11 (S304). The character extraction unit 111 analyzes the character image, converts it into character data (S305), and stores the character data in the extracted character database 112 (S306).

The search unit 113 then executes an Internet search using the extracted character data (S307) and, upon receiving the search results, causes the display unit 107 to display a search result screen (S308). In this processing flow, the display device 100 executes the search as soon as the characters are extracted, but the timing of the search is not limited to this. For example, the search may be executed when an operation input instructing a search is received from the user after the extracted characters are stored in S306.
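The steps S302 to S308 can be summarized as a small pipeline. The sketch below is a hedged illustration: each processing stage is passed in as a callable so that no particular image processing or OCR library is implied, and the function name, parameters, and the early return when no search box is found are assumptions not described in the publication.

```python
from typing import Any, Callable, List, Optional, Tuple


def run_extraction_flow(extract_requested: bool,
                        detect_search_box: Callable[[], Optional[Any]],
                        extract_char_image: Callable[[Any], Any],
                        recognize: Callable[[Any], str],
                        database: List[str],
                        search: Optional[Callable[[str], Any]] = None
                        ) -> Optional[Tuple[str, Any]]:
    """Run S302-S308 once; return (text, search_result) or None."""
    if not extract_requested:             # S302: no extraction operation input
        return None
    box = detect_search_box()             # S303: find search box near the cursor
    if box is None:                       # assumption: stop if nothing is found
        return None
    char_image = extract_char_image(box)  # S304: cut out the character image
    text = recognize(char_image)          # S305: convert image to character data
    database.append(text)                 # S306: store in the extracted character database
    # S307/S308: search immediately, or defer by passing search=None
    result = search(text) if search is not None else None
    return text, result
```

Passing `search=None` models the variation in which the search is run later, on a separate user instruction, after the characters have been stored.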

The display device 100 may receive a video playback stop instruction from the user before receiving the extraction operation input in S302. In this case, the display device 100 displays the paused video on the display unit 107 and receives the user's extraction operation input on the paused video. If the stop instruction is received while the display device 100 is playing back video data that the receiving unit 101 is currently receiving, the display device 100 can pause playback, start recording the video data, and begin playing back the recorded video after receiving the extraction operation input.

In the present embodiment, the display device 100 has been described as receiving operation inputs from, for example, the pointing device 200 or a remote controller, but the display device 100 may also receive operation inputs from an information terminal such as a smartphone or a PC. In this case, the information terminal is connected to the display device 100 via a network such as a wireless or wired LAN and transmits the operation inputs it receives to the display device 100 over the network. The display device 100 then executes processing according to the received operation inputs.

(Second Embodiment)
Next, a second embodiment will be described with reference to FIGS. 7 and 8. FIG. 7A is a diagram showing an example of how the display device 100b and the information terminal 300 according to the present embodiment are used. Here, the display device 100b is connected to the information terminal 300 via a network such as a wireless or wired LAN. The information terminal 300 has a touch screen on the display screen of the display unit 307 and receives operation inputs from the user. The information terminal 300 receives the user's touch operation input on the display screen and transmits the received touch operation input to the display device 100b via the network.

The display device 100b displays the screen P20 on the display unit 107b, in the same way as the display device 100 in the first embodiment. The screen P20 shows a search box image P11, a search button image P12, a cursor image P21, and the like. The position of the cursor image P21 changes according to movement instruction operation inputs from the pointing device 200 or the information terminal 300. When the display device 100b according to the present embodiment receives an extraction instruction operation input from the pointing device 200 or the information terminal 300, it can detect the search box image P11 within a certain range of the position where the cursor image P21 is placed and extract the characters included in the image P11. The information terminal 300 can then execute an Internet search using the extracted characters and display the search result.

FIGS. 7B and 7C are examples of display screens displayed by the information terminal 300. The screen P90 shows a frame image P91, a seek bar image P92, a stop button image P93, a playback button image P94, and the like. Here, the frame image P91 is an area for receiving operation inputs directed at the cursor image P21 displayed by the display device 100b. The information terminal 300 functions as a touch pad for the display device 100b by transmitting, to the display device 100b, instructions corresponding to touch operations on the area inside the image P91. That is, in response to a touch operation on the area inside the image P91, the information terminal 300 outputs to the display device 100b a movement instruction for moving the position of the cursor image P21 or an extraction instruction for instructing character extraction.

シークバー画像Ｐ９２、停止ボタン画像Ｐ９３及び再生ボタン画像Ｐ９４は、表示装置１００ｂが録画映像を再生している場合に、映像の再生位置、再生停止及び再生開始を指示するための画像である。情報端末３００は、これらボタンに対するユーザのタッチ操作に応じた指示を表示装置１００ｂに送信する。また画面Ｐ１００は、検索結果を表示する画面であり、検索に応じてインターネット上のサーバから受信したウェブページの情報を表示する。なお図７には図示しないが、情報端末３００は、表示装置１００ｂが表示する放送番組の映像に関して、表示番組のチャンネル変更を指示するための画像を表示してもよい。 The seek bar image P92, the stop button image P93, and the playback button image P94 are images for instructing the video playback position, playback stop, and playback start when the display device 100b plays back the recorded video. The information terminal 300 transmits an instruction according to the user's touch operation on these buttons to the display device 100b. The screen P100 is a screen for displaying a search result, and displays information on a web page received from a server on the Internet according to the search. Although not shown in FIG. 7, the information terminal 300 may display an image for instructing to change the channel of the display program regarding the video of the broadcast program displayed by the display device 100b.

FIG. 8 is a diagram illustrating a system configuration example of the display device 100b and the information terminal 300 according to the second embodiment. Here, the display device 100b and the information terminal 300 are connected by a wireless or wired network 400.

First, the display device 100b will be described. The display device 100b includes a receiving unit 101b, a storage unit 102b, a recording/playback control unit 103b, a decoding unit 104b, a combining unit 105b, a signal processing unit 106b, a display unit 107b, an operation receiving unit 108b, a GUI processing unit 109b, an image extraction unit 110b, a character extraction unit 111b, an extracted character database 112b, a search unit 113b, a communication unit 114b, and the like. Since each component of the display device 100b has the same function as its counterpart in the first embodiment, the description here focuses on the functions that differ from the first embodiment.

When receiving an extraction instruction operation input, the GUI processing unit 109b instructs the image extraction unit 110b to extract an image. The GUI processing unit 109b also outputs, to the image extraction unit 110b, the position information of the cursor image at the time the extraction instruction operation input was received.

The image extraction unit 110b extracts, from the video data input from the decoding unit 104b, the frame image corresponding to the extraction instruction operation input. Then, based on the position information of the cursor image, the image extraction unit 110b extracts the search box image that lies within a certain distance of the cursor image position in the extracted frame image. The image extraction unit 110b further extracts the character image included in the search box image. The image extraction unit 110b then outputs the extracted character image to the character extraction unit 111b or to the communication unit 301 of the information terminal 300.
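The selection of a search box image near the cursor can be illustrated with a minimal Python sketch. It assumes that candidate rectangles have already been detected in the frame by some other means; the `Rect` type, the `pick_search_box` function, and the `max_distance` threshold are hypothetical names introduced only for this illustration and do not appear in the embodiment.

```python
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple


@dataclass(frozen=True)
class Rect:
    """Axis-aligned candidate rectangle in frame pixel coordinates (hypothetical type)."""
    x: int  # left edge
    y: int  # top edge
    w: int  # width
    h: int  # height

    def distance_to(self, px: int, py: int) -> float:
        """Distance from a point to the nearest point of the rectangle (0 if inside)."""
        dx = max(self.x - px, 0, px - (self.x + self.w))
        dy = max(self.y - py, 0, py - (self.y + self.h))
        return (dx * dx + dy * dy) ** 0.5


def pick_search_box(candidates: Sequence[Rect],
                    cursor: Tuple[int, int],
                    max_distance: float) -> Optional[Rect]:
    """Return the candidate closest to the cursor, or None if none is within range."""
    px, py = cursor
    in_range = [r for r in candidates if r.distance_to(px, py) <= max_distance]
    return min(in_range, key=lambda r: r.distance_to(px, py), default=None)


boxes = [Rect(100, 50, 300, 40), Rect(600, 400, 100, 30)]
print(pick_search_box(boxes, cursor=(420, 70), max_distance=50.0))
```

Measuring the distance to the nearest edge rather than to the rectangle centre is a design choice of this sketch; it lets a cursor placed on a search button next to a wide search box still select that box.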

文字抽出部１１１ｂは、入力された文字画像から文字データ(テキストデータ）を生成する。また、ここで文字抽出部１１１ｂには、録画再生制御部１０３ｂから、再生中の映像データについての番組情報が入力される。そして文字抽出部１１１ｂは、抽出した文字データと、当該文字データを抽出した映像データについての番組情報とを抽出文字データベース１１２ｂ又は情報端末３００の通信部３０１に出力する。 The character extraction unit 111b generates character data (text data) from the input character image. Here, the program information about the video data being reproduced is input to the character extraction unit 111b from the recording / reproduction control unit 103b. Then, the character extraction unit 111b outputs the extracted character data and the program information about the video data from which the character data is extracted to the extracted character database 112b or the communication unit 301 of the information terminal 300.

抽出文字データベース１１２ｂは、文字抽出部１１１ｂから入力された文字データと番組情報とを対応付けて格納する。検索部１１３ｂは、文字抽出部１１１ｂにより抽出された文字データを用いて検索を実行する。 The extracted character database 112b stores character data input from the character extraction unit 111b and program information in association with each other. The search unit 113b performs a search using the character data extracted by the character extraction unit 111b.
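As a rough sketch of how extracted character data might be stored in association with program information, the following Python class keeps both together in one record. The fields `program_title` and `channel` are assumptions made for illustration; the embodiment does not specify what the program information contains.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class ExtractedCharacterStore:
    """In-memory stand-in for an extracted character database (hypothetical)."""
    _records: List[Dict[str, str]] = field(default_factory=list)

    def add(self, text: str, program_title: str, channel: str) -> None:
        """Store extracted text together with its (assumed) program information."""
        self._records.append(
            {"text": text, "program_title": program_title, "channel": channel}
        )

    def find_by_program(self, program_title: str) -> List[str]:
        """Return every extracted text recorded for the given program, in insertion order."""
        return [r["text"] for r in self._records
                if r["program_title"] == program_title]


store = ExtractedCharacterStore()
store.add("example keyword", program_title="Evening News", channel="041")
print(store.find_by_program("Evening News"))
```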

When the communication unit 114b receives, from the information terminal 300, a movement instruction operation input for moving the position of the cursor image P21, an extraction instruction operation input for instructing character extraction, a playback or stop operation input for recorded video data, or the like, it outputs the operation input to the operation receiving unit 108b. The operation receiving unit 108b then outputs the received operation input to the GUI processing unit 109b or the recording/playback control unit 103b. The communication unit 114b also has a function of transmitting, to the communication unit 301 of the information terminal 300, any one of the frame image, the search box image, and the character image extracted by the image extraction unit 110b and the character data extracted by the character extraction unit 111b. When transmitting the frame image extracted by the image extraction unit 110b to the communication unit 301, the communication unit 114b also transmits the position information of the cursor image.

次に情報端末３００のシステム構成例を説明する。情報端末３００は、通信部３０１、画像抽出部３０２、文字抽出部３０３、検索部３０４、ＧＵＩ処理部３０５、信号処理部３０６、表示部３０７、操作受付部３０８等を備える。 Next, a system configuration example of the information terminal 300 will be described. The information terminal 300 includes a communication unit 301, an image extraction unit 302, a character extraction unit 303, a search unit 304, a GUI processing unit 305, a signal processing unit 306, a display unit 307, an operation reception unit 308, and the like.

When the communication unit 301 receives from the GUI processing unit 305 a movement instruction operation input for moving the position of the cursor image P21 displayed on the display device 100b, an extraction instruction operation input for instructing character extraction, an operation input instructing playback or stop of video data recorded by the display device 100b, or the like, it transmits the operation input to the communication unit 114b of the display device 100b via the network 400.

また通信部３０１は、表示装置１００ｂから、フレーム画像及び検索ボックス画像の何れかが入力されると、当該画像を画像抽出部３０２に出力する。なお通信部３０１はカーソル画像の位置情報が入力された場合にも当該情報を画像抽出部３０２に出力する。また通信部３０１は、表示装置１００ｂから文字画像が入力されると当該文字画像を文字抽出部３０３に、文字データが入力されると当該文字データを検索部３０４に出力する。 Further, when any one of the frame image and the search box image is input from the display device 100b, the communication unit 301 outputs the image to the image extraction unit 302. Note that the communication unit 301 also outputs the information to the image extraction unit 302 when the position information of the cursor image is input. The communication unit 301 outputs the character image to the character extraction unit 303 when a character image is input from the display device 100b, and outputs the character data to the search unit 304 when character data is input.
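The routing rule described for the communication unit 301 amounts to a dispatch on the kind of data received. A minimal Python sketch follows; the `Payload` enum and the destination strings are hypothetical labels chosen for this illustration, not identifiers from the embodiment.

```python
from enum import Enum, auto


class Payload(Enum):
    """Kinds of data the display device may send (hypothetical labels)."""
    FRAME_IMAGE = auto()
    SEARCH_BOX_IMAGE = auto()
    CURSOR_POSITION = auto()
    CHARACTER_IMAGE = auto()
    CHARACTER_DATA = auto()


# Destination of each kind of data, following the description of unit 301.
ROUTES = {
    Payload.FRAME_IMAGE: "image extraction unit 302",
    Payload.SEARCH_BOX_IMAGE: "image extraction unit 302",
    Payload.CURSOR_POSITION: "image extraction unit 302",
    Payload.CHARACTER_IMAGE: "character extraction unit 303",
    Payload.CHARACTER_DATA: "search unit 304",
}


def route(kind: Payload) -> str:
    """Return the name of the unit that should receive this kind of data."""
    return ROUTES[kind]


print(route(Payload.CHARACTER_IMAGE))
```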

画像抽出部３０２は、通信部３０１からフレーム画像とカーソル位置情報とが入力されると、フレーム画像の中のカーソル位置情報が示す位置の一定の範囲内から検索ボックス画像を検出・抽出する。また画像抽出部３０２は、通信部３０１から検索ボックス画像が入力された場合又は画像抽出部３０２が検索ボックス画像を抽出した場合、当該検索ボックス画像に含まれる文字画像を抽出し、当該文字画像を文字抽出部３０３に出力する。 When the frame image and the cursor position information are input from the communication unit 301, the image extraction unit 302 detects and extracts a search box image from within a certain range of the position indicated by the cursor position information in the frame image. Further, when a search box image is input from the communication unit 301 or when the image extraction unit 302 extracts a search box image, the image extraction unit 302 extracts a character image included in the search box image, and extracts the character image. The data is output to the character extraction unit 303.

文字抽出部３０３は、通信部３０１又は画像抽出部３０２から文字画像が入力されると、当該文字画像に含まれる文字を検出して文字データを生成し、検索部３０４に出力する。 When a character image is input from the communication unit 301 or the image extraction unit 302, the character extraction unit 303 detects characters included in the character image, generates character data, and outputs the character data to the search unit 304.

検索部３０４は、通信部３０１又は文字抽出部３０３から文字データが入力されると、当該文字データを用いたインターネット検索を実行して、インターネット上のサーバから検索結果を取得する。そして検索部３０４は、検索結果表示画面の生成をＧＵＩ処理部３０５に指示する。 When character data is input from the communication unit 301 or the character extraction unit 303, the search unit 304 performs an Internet search using the character data and acquires a search result from a server on the Internet. Then, the search unit 304 instructs the GUI processing unit 305 to generate a search result display screen.
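How the extracted character data could be turned into a search request is sketched below in Python. The endpoint `https://search.example/search` and the query parameter name `q` are placeholders assumed for illustration; the embodiment does not name a search service or a request format.

```python
from urllib.parse import urlencode

# Hypothetical endpoint; the embodiment does not specify a search service.
SEARCH_ENDPOINT = "https://search.example/search"


def build_search_url(character_data: str) -> str:
    """Build a GET URL that carries the extracted text as a URL-encoded query."""
    query = character_data.strip()
    if not query:
        raise ValueError("no characters were extracted")
    return f"{SEARCH_ENDPOINT}?{urlencode({'q': query})}"


print(build_search_url("toshiba tv"))
```

URL-encoding the text matters here because text extracted from video will often contain spaces or non-ASCII characters such as Japanese.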

ＧＵＩ処理部３０５は、図７（Ｂ）及び（Ｃ）で示した画面Ｐ９０及びＰ１００の映像データを生成する。またＧＵＩ処理部３０５には、表示部３０７の画面に対する操作入力の位置情報が操作受付部３０８から入力される。ここでＧＵＩ処理部３０５は、当該位置情報に応じて、表示画面の何れの画像に対する操作を受けたかを判別し、操作を受けた画像に応じた指示の送信を通信部３０１に指示する。 The GUI processing unit 305 generates video data of the screens P90 and P100 shown in FIGS. 7B and 7C. Further, position information of operation input on the screen of the display unit 307 is input from the operation reception unit 308 to the GUI processing unit 305. Here, the GUI processing unit 305 determines which image on the display screen has been operated according to the position information, and instructs the communication unit 301 to transmit an instruction corresponding to the received image.
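Determining which on-screen image received an operation is a hit test on the touch position. The following Python sketch assumes rectangular widgets; the `Widget` type, the coordinates, and the instruction strings are hypothetical values chosen for illustration.

```python
from typing import NamedTuple, Optional, Sequence


class Widget(NamedTuple):
    """A rectangular on-screen image and the instruction it triggers (hypothetical)."""
    name: str
    x: int
    y: int
    w: int
    h: int
    instruction: str


def hit_test(widgets: Sequence[Widget], x: int, y: int) -> Optional[Widget]:
    """Return the first widget whose rectangle contains the touch position."""
    for widget in widgets:
        if widget.x <= x < widget.x + widget.w and widget.y <= y < widget.y + widget.h:
            return widget
    return None


screen_p90 = [
    Widget("frame image P91", 0, 0, 320, 240, "cursor operation"),
    Widget("stop button P93", 0, 260, 60, 40, "stop playback"),
    Widget("playback button P94", 70, 260, 60, 40, "start playback"),
]
touched = hit_test(screen_p90, 80, 270)
print(touched.instruction if touched else "no widget touched")
```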

The signal processing unit 306 converts the video data into a video signal for a display device, and the display unit 307 outputs video using the video signal.
The operation receiving unit 308 is, for example, a touch panel provided on the screen of the display unit 307, and receives operation inputs from the user. The operation receiving unit 308 outputs information about the position (coordinates) on the screen at which the operation was received to the GUI processing unit 305.

Next, an example of a processing sequence performed by the display device 100b and the information terminal 300 will be described with reference to FIG. 9. First, the display device 100b outputs the video of the screen P20 (S501). When the display device 100b receives an extraction instruction from the pointing device 200 or the information terminal 300 (Yes in S502), it extracts a frame of the video data being reproduced (S503). The display device 100b then extracts a search box image from a range of the extracted frame determined by the position information of the cursor image P21 (S504).

Subsequently, the display device 100b extracts a character image from the search box image (S505) and detects the characters included in the character image as character data (S506). Next, the display device 100b transmits the detected character data to the information terminal 300 (S507). The information terminal 300 then executes a search using the received character data (S508) and displays a screen showing the search result (S509).

なお図９においては、表示装置１００ｂは文字データを情報端末３００に送信するとして説明したが、表示装置１００ｂが情報端末３００に送信するデータはこれに限るものではない。つまり表示装置１００ｂは、Ｓ５０３にて抽出したフレーム画像を情報端末３００に送信し、続くＳ５０４乃至Ｓ５０６の処理を情報端末３００が処理してもよい。また表示装置１００ｂは、Ｓ５０４にて抽出した検索ボックス画像を情報端末３００に送信し、情報端末３００がＳ５０５及びＳ５０６の処理を実行してもよい。更に表示装置１００ｂはＳ５０５で抽出した文字画像を情報端末３００に送信して、情報端末３００がＳ５０６の処理を実行しても良い。 In FIG. 9, the display device 100b has been described as transmitting character data to the information terminal 300, but the data that the display device 100b transmits to the information terminal 300 is not limited to this. That is, the display device 100b may transmit the frame image extracted in S503 to the information terminal 300, and the information terminal 300 may process the subsequent processes of S504 to S506. The display device 100b may transmit the search box image extracted in S504 to the information terminal 300, and the information terminal 300 may execute the processes of S505 and S506. Further, the display device 100b may transmit the character image extracted in S505 to the information terminal 300, and the information terminal 300 may execute the process of S506.
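The alternatives above differ only in where the chain S503 to S506 is cut. A small Python sketch makes the division of work explicit; the payload names and the `split_steps` function are hypothetical and introduced only for this illustration.

```python
from typing import List, Tuple

# Steps of FIG. 9 in order: frame, search box image, character image, character data.
STEPS = ["S503", "S504", "S505", "S506"]

# Which step produces each kind of data that may be transmitted (hypothetical names).
PRODUCED_BY = {
    "frame_image": "S503",
    "search_box_image": "S504",
    "character_image": "S505",
    "character_data": "S506",
}


def split_steps(transmitted: str) -> Tuple[List[str], List[str]]:
    """Return (steps run by the display device, steps run by the information terminal)."""
    cut = STEPS.index(PRODUCED_BY[transmitted]) + 1
    return STEPS[:cut], STEPS[cut:]


for payload in PRODUCED_BY:
    on_display, on_terminal = split_steps(payload)
    print(payload, on_display, on_terminal)
```

Sending data from an earlier step moves more of the processing to the information terminal, at the cost of transmitting a larger image over the network.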

(Third Embodiment)
Next, a third embodiment will be described with reference to FIGS. 10 and 11. The third embodiment is realized by the display device 100b and the information terminal 300 according to the second embodiment.
FIG. 10 is a diagram illustrating an example of transitions of the screens displayed by the display device 100b and the information terminal 300 according to the third embodiment. First, the information terminal 300 displays the screen P200. The screen P200 shows an extraction instruction button P201 and the like. If the information terminal 300 receives an operation on the extraction instruction button P201 while the display device 100b is displaying the screen P300, the information terminal 300 instructs the display device 100b to extract and transmit the screen P300. The display device 100b then extracts the frame image of the screen P300 and transmits it to the information terminal 300. Note that the display device 100b continues to reproduce and display the video while extracting and transmitting the frame image.

When receiving the frame image, the information terminal 300 displays the frame image on the screen P210 and accepts touch operation inputs from the user. When the information terminal 300 receives an operation on the screen P210, it detects, in the image of the screen P210, an image of a predetermined shape within a certain range of the position where the operation was received, and extracts the characters contained in that image as character data. Meanwhile, the display device 100b continues to reproduce and display the video, and displays the screen P310 of the video being reproduced.

続いて情報端末３００は、抽出した文字データに基づいた検索要求をインターネットサーバに送信し、当該要求に応じてインターネットサーバから受信した情報を画面Ｐ２２０に表示する。ここで表示装置１００ｂは、前述の通り映像の再生及び表示を継続しており、再生中の映像の画面Ｐ３２０を表示する。 Subsequently, the information terminal 300 transmits a search request based on the extracted character data to the Internet server, and displays information received from the Internet server in response to the request on the screen P220. Here, the display device 100b continues to reproduce and display the video as described above, and displays the screen P320 of the video being reproduced.

Next, an example of a processing sequence performed by the display device 100b and the information terminal 300 will be described with reference to FIG. 11. First, the display device 100b reproduces and displays video (S601), and the information terminal 300 displays an operation screen such as the screen P200 shown in FIG. 10 (S602). When the information terminal 300 receives an operation on the extraction instruction button P201 (Yes in S603), it transmits a frame extraction instruction to the display device 100b (S604). On receiving the frame extraction instruction, the display device 100b extracts a frame image of the video being reproduced (S605) and transmits it to the information terminal 300 (S606).

そして情報端末３００は、受信したフレーム画像を表示し（Ｓ６０７）、当該表示画像に対する操作入力を受けた場合（Ｓ６０８のＹｅｓ）、表示画像のうち当該操作入力に対応する位置の一定範囲内にある所定の形状の画像を抽出する（Ｓ６０９）。次に情報端末３００は、当該所定形状の画像から文字画像を抽出し（Ｓ６１０）、当該文字画像に含まれる文字を文字データに変換する（Ｓ６１１）。そして情報端末３００は、当該文字データを用いて検索を実行し（Ｓ６１２）、当該検索により得た情報を表示する（Ｓ６１３）。 The information terminal 300 displays the received frame image (S607), and when receiving an operation input for the display image (Yes in S608), the information terminal 300 is within a certain range of the position corresponding to the operation input in the display image. An image having a predetermined shape is extracted (S609). Next, the information terminal 300 extracts a character image from the image having the predetermined shape (S610), and converts characters included in the character image into character data (S611). Then, the information terminal 300 performs a search using the character data (S612), and displays information obtained by the search (S613).

In FIG. 11, the information terminal 300 has been described as executing the processes of S609 to S611, but these processes may instead be executed by the display device 100b. In this case, when the information terminal 300 receives an operation input on the displayed image in S608, it determines which position in the displayed image the operation input corresponds to and transmits the position information of the operation input to the display device 100b. The display device 100b then extracts, from the frame image extracted in S605, an image of a predetermined shape within a certain range of the position indicated by the position information, in the same way as the process of S609. Subsequently, the display device 100b extracts the character image and the character data and transmits the character data to the information terminal 300.
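Determining which position in the frame a touch corresponds to requires converting terminal screen coordinates into frame coordinates. The Python sketch below assumes that the frame image is stretched to fill a rectangular view on the terminal screen; that layout and every name in the code are assumptions made for illustration and are not stated in the embodiment.

```python
from typing import Tuple


def touch_to_frame(touch: Tuple[int, int],
                   view_origin: Tuple[int, int],
                   view_size: Tuple[int, int],
                   frame_size: Tuple[int, int]) -> Tuple[int, int]:
    """Map a touch on the terminal screen to pixel coordinates in the original frame.

    Assumes the frame is scaled to exactly fill the view rectangle.
    The result is clamped so that it always lies inside the frame.
    """
    tx, ty = touch
    vx, vy = view_origin
    vw, vh = view_size
    fw, fh = frame_size
    fx = int((tx - vx) * fw / vw)
    fy = int((ty - vy) * fh / vh)
    return min(max(fx, 0), fw - 1), min(max(fy, 0), fh - 1)


# A 1920x1080 frame shown in a 960x540 view at the top-left of the terminal screen.
print(touch_to_frame((480, 270), (0, 0), (960, 540), (1920, 1080)))
```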

本発明のいくつかの実施形態を説明したが、これらの実施形態は、例として提示したものであり、発明の範囲を限定することは意図していない。例えば実施形態に係る文字抽出装置は、チューナを備え、ディスプレイに映像を出力する受信装置として実現されても良い。また、これら実施形態は、その他の様々な形態で実施されることが可能であり、発明の要旨を逸脱しない範囲で、種々の省略、置き換え、変更を行うことができる。これら実施形態やその変形は、発明の範囲や要旨に含まれると同様に、特許請求の範囲に記載された発明とその均等の範囲に含まれるものである。 Although several embodiments of the present invention have been described, these embodiments are presented by way of example and are not intended to limit the scope of the invention. For example, the character extraction device according to the embodiment may be realized as a reception device that includes a tuner and outputs an image on a display. Also, these embodiments can be implemented in various other forms, and various omissions, replacements, and changes can be made without departing from the spirit of the invention. These embodiments and their modifications are included in the scope and gist of the invention, and are also included in the invention described in the claims and the equivalents thereof.

DESCRIPTION OF SYMBOLS
100 ... display device, 101 ... receiving unit, 102 ... storage unit, 103 ... recording/playback control unit, 104 ... decoding unit, 105 ... combining unit, 106 ... signal processing unit, 107 ... display unit, 108 ... operation receiving unit, 109 ... GUI processing unit, 110 ... image extraction unit, 111 ... character extraction unit, 112 ... extracted character database, 113 ... search unit, 114 ... communication unit, 200 ... pointing device, 300 ... information terminal, 301 ... communication unit, 302 ... image extraction unit, 303 ... character extraction unit, 304 ... search unit, 305 ... GUI processing unit, 306 ... signal processing unit, 307 ... display unit, 308 ... operation receiving unit, 400 ... network

Claims

A character extraction device comprising:
playback means for playing back video data;
output means for outputting the video data played back by the playback means to a display device;
accepting means for accepting an external input corresponding to any position in the video displayed by the display device using the video data;
image extracting means for extracting an image of a first shape that lies within a certain range of the position corresponding to the external input in the video displayed by the display device; and
character extraction means for extracting characters from the image extracted by the image extracting means.

The character extracting apparatus according to claim 1, wherein the image extracting unit extracts the image having a substantially rectangular shape.

The character extracting apparatus according to claim 1, wherein the image extracting unit extracts the image having a certain size.

The character extraction device according to claim 1, wherein the image extraction unit extracts the image in a left direction within a certain range from the position of the external input.

The character extraction device according to claim 1, wherein the output unit outputs video data of an image indicating a position corresponding to the external input and the reproduced video data to the display device.

The character extraction device according to claim 1, further comprising search means for performing a search using the characters extracted by the character extraction means,
wherein the output means outputs video data of search information obtained by the search to the display device.

The character extraction device according to claim 1, wherein the output unit outputs both the reproduced video data and video data for the search information to the display device.

The character extraction device according to claim 1, further comprising character storage means for storing the extracted character.

The character extraction device according to claim 8, wherein the output unit outputs video data indicating the one or more characters stored in the character storage unit to the display device.

The character extraction device according to claim 1, wherein the playback means stops playback of the video data when the accepting means accepts a second external input different from the external input, and
the accepting means accepts the external input after playback of the video data is stopped.

The character extraction device according to claim 1, further comprising:
receiving means for receiving video data of a broadcast program; and
video storage means for storing the received video data in a storage device,
wherein the playback means plays back the received video data, and
the video storage means stores the received video data in the storage device when the playback means stops playback of the video data.

The character extracting device according to claim 1, wherein the accepting unit accepts the external input from an external device connected to the character extracting device via a network.

A display method in a display system comprising a first device and a second device that are information processing devices connected to each other by a network, the method comprising:
playing back video data and displaying video using the played-back video data, by the first device;
accepting an external input corresponding to any position in the displayed video, extracting an image of a first shape within a certain range of the position corresponding to the external input in the displayed video, extracting characters from the extracted image of the first shape, and performing a search using the extracted characters, by the first device or the second device; and
displaying information obtained by the search, by the second device.

The display method according to claim 13, wherein the second device accepts the external input, and
the first device further displays the video without a cursor image indicating the position corresponding to the external input while the second device is accepting the external input.

A character extraction method in a device that extracts characters from video displayed on a display device, the method comprising:
playing back video data;
outputting the played-back video data to the display device;
accepting an external input corresponding to any position in the video displayed by the display device using the video data;
extracting an image of a first shape within a certain range of the position corresponding to the external input in the video displayed by the display device; and
extracting characters from the extracted image of the first shape.