JP4907635B2

JP4907635B2 - Method, system and computer readable recording medium for extracting text based on the characteristics of a web page

Info

Publication number: JP4907635B2
Application number: JP2008295183A
Authority: JP
Inventors: ヒュンリ、ユン; 圭一金; 振洙朴
Original assignee: Naver Corp
Current assignee: Naver Corp
Priority date: 2007-11-21
Filing date: 2008-11-19
Publication date: 2012-04-04
Anticipated expiration: 2028-11-19
Also published as: KR20090052757A; JP2009129456A; CN101441648A; KR100958934B1; CN101441648B

Description

The present invention relates to a method, a system, and a computer-readable recording medium for extracting text based on the characteristics of a web page. More specifically, the present invention relates to a method, a system, and a computer-readable recording medium that, when text is extracted from a web page and then used to provide a text-based service such as speech conversion or translation, extract text of differing ranges, such as a word, a sentence, a paragraph, or the full text, according to the characteristics of the web page.

Recently, as use of the Internet has become widespread, a wide variety of information can be obtained through it. Providers of Internet services through websites offer diverse services to meet increasingly varied user needs, and the number of service types continues to grow.

Internet users encounter the services offered by such providers in various forms. In particular, they seek to obtain diverse Internet content through websites, such as news, dictionary, full-text, regional, and shopping information.

Such users search through a website to obtain the content they want, and when they obtain the desired content from a particular web page, they generally read that content, which consists mainly of text, with their own eyes. From the user's standpoint, however, relying only on content provided in this text-centric manner is not very desirable in what is called the multimedia era. In practice, as the amount of information contained in web pages keeps growing, users cannot take their eyes off a display such as the monitor of the user computer until they have finished reading the text. In addition, some users have a so-called multitasking need, wanting to carry out other work while obtaining the desired information from the content, and this need has also been difficult to satisfy.

Meanwhile, it is also true that CTI (Computer Telephony Integration) technologies such as VoIP (Voice over Internet Protocol), speech recognition, speech conversion, speech synthesis, and automatic response systems have recently attracted considerable interest. With such technologies, users in an Internet environment can enjoy more advanced Internet services in which they give instructions by voice, receive information by voice, and communicate by voice.

Against this background, TTS (Text To Speech) technology was developed both to solve the problems of text-centric content delivery and to be widely used in CTI technology. TTS is more widely used than speech recognition and is a human interface technology that converts various kinds of text information into speech. TTS on a web page is implemented mainly by extracting text from the web page, converting it into speech, and providing the speech to the user. For example, a mouse-over event, which occurs when the user keeps the mouse at a certain position on the web page for a certain time, can be used to extract the word at the current mouse pointer position and convert it into speech. Alternatively, the user can drag over a certain portion (region) of the text on the web page and have that portion converted into speech.

However, the TTS services currently available through web pages cannot be called a complete human interface technology. Specifically, a current TTS service either converts only the word at the position recognized through the user's mouse-over operation, or requires the user to drag the mouse to designate exactly the text to be converted. In the former case, only the moused-over word is uniformly converted into speech, regardless of the user's intention. In the latter case, to have the desired range of text converted, the user must first read the text, at least roughly, and then designate the range to be converted. This departs from the purpose of TTS technology, which is to eliminate as far as possible the situations in which users must read the text themselves, and the act of designating the text also takes additional time.

Therefore, an approach is needed that increases user convenience by extracting text of a specific range (for example, a word, a sentence, a paragraph, or the full text) according to the characteristics of the web page, in line with the user's intention, and by providing various text-based services.
Korean Laid-Open Patent Publication No. 2003-62876; Korean Laid-Open Patent Publication No. 2002-7423

An object of the present invention is to actively extract text based on the characteristics of a web page.

Another object of the present invention is to allow a user of a web page to conveniently obtain data converted from text by actively extracting text of differing ranges based on the characteristics of the web page.

Yet another object of the present invention is to reduce unnecessary user operations by automatically extracting the required wide range of text based on the characteristics of the web page, thereby reducing the inconvenience of having to drag the mouse each time the user wants to extract a wide range of text on the web page.

Still another object of the present invention is to provide a computer-readable recording medium on which a program for executing the above-described method is recorded.

Representative configurations of the present invention for achieving these objects are as follows.

In one aspect of the present invention, there is provided a method of extracting text based on the characteristics of a web page, the method including: recognizing a text pointer on the web page; checking information on a text extraction range stored in correspondence with at least part of an identifier of the web page; determining a text extraction range based on the text pointer information and the checked information on the text extraction range; and extracting the text in the determined range.
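The last two steps of this method (determining a range from the pointer and the stored range information, then extracting it) can be illustrated with a minimal Python sketch. The patent text does not define how words, sentences, or paragraphs are delimited, so the regular expressions, the scope names, and the function `extract_by_scope` are all assumptions made for illustration, and the pointer is simplified to a character offset.

```python
import re

# Assumed delimiters: words are runs of word characters, sentences end at
# ., ! or ?, and paragraphs are separated by blank lines.
PATTERNS = {
    "word": r"\w+",
    "sentence": r"[^.!?\n]+[.!?]?",
    "paragraph": r"(?:[^\n]|\n(?!\n))+",
}


def extract_by_scope(text: str, offset: int, scope: str) -> str:
    """Return the word, sentence, paragraph, or full text around `offset`."""
    if not 0 <= offset < len(text):
        raise ValueError("offset is outside the text")
    if scope == "full":
        return text.strip()
    for match in re.finditer(PATTERNS[scope], text):
        # Pick the unit whose character span contains the pointer offset.
        if match.start() <= offset < match.end():
            return match.group().strip()
    return ""  # the pointer is on a delimiter, so no unit is selected
```

With the same pointer position, a scope of "word" yields a single word while "paragraph" yields the enclosing paragraph, which is the behavior the method attributes to page-dependent extraction ranges.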

In another aspect of the present invention, there is provided a method of extracting text based on the characteristics of a web page, the method including: recognizing a text pointer on the web page; checking whether information on a text extraction range corresponding to at least part of an identifier of the web page is stored in a text extraction information database; receiving the information on the text extraction range when it is confirmed that the information is not stored in the text extraction information database; determining a text extraction range based on the text pointer information and the received information on the text extraction range; and extracting the text in the determined range.

In yet another aspect of the present invention, there is provided a method of converting text into speech, the method further including generating speech data associated with the text extracted by the text extraction method described above.

In yet another aspect of the present invention, there is provided a system for extracting text based on the characteristics of a web page, the system including: a text pointer recognition unit that recognizes a text pointer on the web page; a text extraction range information confirmation unit that checks information on a text extraction range stored in correspondence with at least part of an identifier of the web page; a text extraction range determination unit that determines a text extraction range based on the text pointer information and the checked information on the text extraction range; and a text extraction unit that extracts the text in the determined range.

In yet another aspect of the present invention, there is provided a system for extracting text based on the characteristics of a web page, the system including: a text extraction information database; a text pointer recognition unit that recognizes a text pointer on the web page; a text extraction range information receiving unit that checks whether information on a text extraction range corresponding to at least part of an identifier of the web page is stored in the text extraction information database and, if it is not, receives the information on the text extraction range; a text extraction range determination unit that determines a text extraction range based on the text pointer information and the received information on the text extraction range; and a text extraction unit that extracts the text in the determined range.

In yet another aspect of the present invention, there is provided a system for converting text into speech, the system including: a text pointer recognition unit that recognizes a text pointer on a web page; a text extraction range information confirmation unit that checks information on a text extraction range stored in correspondence with at least part of an identifier of the web page; a text extraction range determination unit that determines a text extraction range based on the text pointer information and the checked information on the text extraction range; a text extraction unit that extracts the text in the determined range; and a speech data generation unit that generates speech data associated with the extracted text.

In yet another aspect of the present invention, there is provided a system for converting text into speech, the system including: a text extraction information database; a text pointer recognition unit that recognizes a text pointer on a web page; a text extraction range information receiving unit that checks whether information on a text extraction range corresponding to at least part of an identifier of the web page is stored in the text extraction information database and, if it is not, receives the information on the text extraction range; a text extraction range determination unit that determines a text extraction range based on the text pointer information and the received information on the text extraction range; a text extraction unit that extracts the text in the determined range; and a speech data generation unit that generates speech data associated with the extracted text.

In addition, the present invention further provides other methods and systems for extracting text based on the characteristics of a web page, and a computer-readable recording medium on which a computer program for executing such a method is recorded.

According to the present invention, text is actively extracted based on the characteristics of a web page and a text-based service such as a speech conversion service or a translation service is provided, so that the user can obtain text-based data matching the user's intention without performing many operations.

Also, according to the present invention, even when a user accesses a web page without knowing its characteristics in detail, text of a range suited to those characteristics is extracted automatically, so that the user can efficiently grasp the content displayed on the web page.

Furthermore, according to the present invention, when a user wants to extract a wide range of text on a web page, the inconvenience of having to drag over all of it is eliminated, and text extraction errors caused by mouse drag errors are also prevented.

Hereinafter, the present invention is described in detail with reference to the accompanying drawings, which show by way of example specific embodiments in which the invention can be practiced. These embodiments are described in sufficient detail to enable those skilled in the art to practice the invention. It should be understood that the various embodiments of the invention, although different from one another, need not be mutually exclusive. For example, a particular shape, structure, or characteristic described here in connection with one embodiment may be implemented in other embodiments without departing from the spirit and scope of the invention. It should also be understood that the position or arrangement of individual components within each disclosed embodiment may be changed without departing from the spirit and scope of the invention. The following detailed description is therefore not to be taken in a limiting sense, and the scope of the invention, if properly described, is defined by the appended claims together with the full range of equivalents to which those claims are entitled.

Overall system configuration
FIG. 1 is a diagram showing a schematic configuration of a text extraction system according to an embodiment of the present invention.

As shown in FIG. 1, a text extraction system according to an embodiment of the present invention may include a user computer 100 and a TTS server 300. The user computer 100 and the TTS server 300 can communicate through various network environments, such as a local area network (LAN) or a wide area network (WAN) using a dedicated line. One such network environment is the well-known World Wide Web (WWW). The TTS server 300 can communicate bidirectionally with one or more user computers 100 through the Internet protocol in a known network environment. In response to a request from the user computer 100, the TTS server 300 can also perform processing with reference to the latest extraction range information database 500 and the speech conversion database 700.

The user computer 100 is, for example, a communication terminal device such as a personal computer, a mobile phone, or a PDA. It includes a display device for showing the results of various processing in the user computer 100 on a screen, an operation input unit such as a keyboard, mouse, or touch panel for accepting operation input from the terminal user, and a speaker or the like for audio output. The user computer 100 is communicably connected to the communication unit 190 via a communication network such as the Internet or a LAN.

Configuration of the user computer
FIG. 2A is a diagram showing the detailed configuration of the user computer 100 in the text extraction system shown in FIG. 1, and FIG. 2B is a diagram showing the detailed configuration of the TTS server 300.

As shown in FIG. 2A, the user computer 100 may include a calculation unit 110, a text extraction range information database 130, a program storage unit 150, a user input unit 170, an output unit 180, and a communication unit 190.

The calculation unit 110 may include a mouse-over recognition unit 111, an extraction range information confirmation unit 112, an extraction range information request unit 113, a latest extraction range information request unit 115, an extraction method determination unit 117, a text extraction unit 118, and a speech data providing unit 119. According to an embodiment of the present invention, at least some of these units may be included in the calculation unit 110 as program modules, and they may physically be stored on various known storage devices. Such program modules may also be stored in a remote storage device capable of communicating with the calculation unit 110. Such program modules include, but are not limited to, routines, subroutines, programs, objects, components, and data structures that perform the specific tasks described later or implement specific abstract data types according to the present invention.

When necessary, the calculation unit 110 can refer to the text extraction range information database 130, which stores, in correspondence with a web page identifier such as a URL (Uniform Resource Locator), information on the extraction range of text in the web page (for example, information on which range of text, among a word, a sentence, a paragraph, and the full text, is to be extracted according to the characteristics of the web page). The text extraction range information database 130 may be included in the calculation unit 110 as one of its components.

The calculation unit 110 may further include a program driving unit (not shown) so that the program stored in the program storage unit 150, that is, the program for extracting text according to the present invention or for providing a text-based service using the extracted text, is started together with the user's web browser when the browser is executed. The program storage unit 150 need not be included as a component of the user computer 100. It can be replaced by a known computer-readable recording medium such as a hard disk, floppy disk, floptical disk, magnetic tape, CD-ROM, or DVD.

The user input unit 170 may be an ordinary computer input means such as a keyboard or a mouse. The output unit 180 can be implemented as a computer monitor for visually presenting the web browser and/or the web page, a speaker for outputting text as speech, or the like.

Configuration of the server
The TTS server 300 shown in FIG. 2B is a server for providing a TTS service, that is, for converting at least part of the text in a web page into speech and providing it to the user. The TTS server 300 may be a web server of an Internet portal site or a server operated by a provider specializing in TTS services. In other embodiments of the present invention, the TTS server 300 can be replaced with a general web server that is not directly associated with a TTS service.

The TTS server 300 according to an embodiment of the present invention includes a latest extraction range information determination unit 310, a latest extraction range information acquisition unit 330, and a TTS conversion unit 370. According to an embodiment of the present invention, at least some of these units may be included in the TTS server 300, or they may be program modules that communicate with the TTS server 300. Such program modules may be included in the TTS server 300 in the form of an operating system, application program modules, and other program modules, and may physically be stored on various known storage devices. They may also be stored in a remote storage device capable of communicating with the TTS server 300. Such program modules include, but are not limited to, routines, subroutines, programs, objects, components, and data structures that perform the specific tasks described later or implement specific abstract data types according to the present invention.

For reference, it should be understood that the components shown in FIGS. 1 and 2 can exchange signals with one another as needed. A detailed description of the known communication means used for this signal exchange, which are needed to implement the present invention, is omitted.

Text extraction and speech conversion
FIG. 3 is a flowchart showing a process of extracting text and converting the extracted text into speech in an embodiment of the present invention. The process of extracting text from a web page and the process of converting the extracted text into speech and outputting it are described in detail below with reference to FIG. 3 together with FIGS. 2A and 2B.

When the user runs a web browser on the user computer 100, the program for extracting text and for converting the extracted text into speech and outputting it, according to an embodiment of the present invention, is started together with the browser. As described above, this program may be recorded in the program storage unit 150 included in the user computer 100 or on a separate recording medium.

After that, the user can connect to the Internet and access a web page having a given URL through the web browser. Many servers provide content accessible through a web browser, and URLs are normally used to indicate their addresses. A URL specifies the address of a file on a server on the Internet, but because URLs can be defined relatively freely, a URL can also contain other information indicating the characteristics of the web page (for example, information on a text extraction range according to an embodiment of the present invention). In either case, the URL or a part of the URL can correspond to the information on the text extraction range according to the present invention.
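One way to read "the URL or a part of the URL" is as a lookup that tries progressively shorter portions of the URL. The following Python sketch shows that reading only. The choice of host plus path prefixes as keys, the function names, and the example table entries are assumptions for illustration and are not taken from the patent.

```python
from typing import Dict, List, Optional
from urllib.parse import urlsplit


def candidate_keys(url: str) -> List[str]:
    """List lookup keys from most specific (host + full path) to host only."""
    parts = urlsplit(url)
    host = parts.netloc.lower()
    segments = [s for s in parts.path.split("/") if s]
    keys = []
    for n in range(len(segments), 0, -1):
        keys.append(host + "/" + "/".join(segments[:n]))
    keys.append(host)
    return keys


def lookup_scope(url: str, table: Dict[str, str]) -> Optional[str]:
    """Return the extraction range stored for the longest matching URL part."""
    for key in candidate_keys(url):
        if key in table:
            return table[key]
    return None
```

Under this assumption, a single entry for a host covers every page on that host, while a more specific entry for a path prefix takes precedence over it.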

The example of FIG. 3 illustrates the process, in an embodiment of the present invention, of extracting text from a web page and outputting data obtained by converting that text into speech.

First, when the user places the mouse pointer on text contained in a web page displayed by the web browser of the user computer 100, the mouse-over recognition unit 111 of the calculation unit 110 determines in step S310 whether a mouse-over event has occurred.
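The background section characterizes a mouse-over event as the pointer resting at one position for a certain time. A minimal sketch of that check over a list of timestamped pointer samples follows. The 500 ms threshold, the 3-pixel tolerance, and the function name are assumptions, and a real browser extension would use the browser's own mouse events rather than a sample list.

```python
from typing import List, Optional, Tuple

Sample = Tuple[int, int, int]  # (time in ms, x, y), sorted by time


def first_dwell(samples: List[Sample], dwell_ms: int = 500,
                tolerance_px: int = 3) -> Optional[Tuple[int, int]]:
    """Return the position where the pointer first rests for dwell_ms, if any."""
    anchor: Optional[Sample] = None
    for t, x, y in samples:
        moved = (anchor is None
                 or abs(x - anchor[1]) > tolerance_px
                 or abs(y - anchor[2]) > tolerance_px)
        if moved:
            anchor = (t, x, y)  # pointer moved: restart the timer here
        elif t - anchor[0] >= dwell_ms:
            return (anchor[1], anchor[2])
    return None
```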

In step S330, the extraction range information confirmation unit 112 determines whether information on a text extraction range stored in correspondence with the URL of the current web page exists in the text extraction range information database 130. As briefly noted above, the text extraction range information database 130 stores information on text extraction ranges in correspondence with web page URLs. This information can be stored separately for each URL, or it can be stored collectively, grouped by several types of web page. This is described in more detail below.
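The two storage options mentioned here, one entry per URL or one entry per type of web page, can be combined in a small lookup. The sketch below is only one possible arrangement. The page types, the example keys, and the rule that a per-URL entry overrides a per-type entry are assumptions, since this passage does not specify them.

```python
from typing import Optional

# Hypothetical page types and the range extracted for each type.
SCOPE_BY_TYPE = {"dictionary": "word", "news": "paragraph", "blog": "full"}

# Hypothetical assignment of URL keys to page types.
TYPE_BY_URL = {
    "dict.example.com": "dictionary",
    "news.example.com": "news",
}

# Hypothetical per-URL entries, assumed to take precedence over the type.
SCOPE_BY_URL = {"news.example.com/headlines": "sentence"}


def stored_scope(url_key: str) -> Optional[str]:
    """Look up the range stored per URL first, then per page type."""
    if url_key in SCOPE_BY_URL:
        return SCOPE_BY_URL[url_key]
    page_type = TYPE_BY_URL.get(url_key)
    return SCOPE_BY_TYPE.get(page_type)
```

Grouping by type keeps the table small when many sites share the same range, while per-URL entries handle the exceptions.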

If, in step S330, the extraction range information confirmation unit 112 determines that information on a text extraction range corresponding to the URL of the current web page does not exist in the text extraction range information database 130, then in step S331 the extraction range information request unit 113 requests the information on the text extraction range for the URL of the current web page from the TTS server 300. According to an embodiment of the present invention, the latest extraction range information database 500 referred to by the TTS server 300 stores the latest information on text extraction ranges by periodically updating the various information needed to provide the TTS service, that is, the information on the text extraction range of each URL and the information on the type of web page provided at each URL. When the extraction range information request unit 113 requests the information on the text extraction range for the URL of the current web page, the latest extraction range information acquisition unit 330 of the TTS server 300 refers to the latest extraction range information database 500 and transmits the latest information corresponding to that URL to the calculation unit 110 of the user computer 100.
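Steps S330 and S331 amount to a local lookup with a server fallback, which the following Python sketch illustrates. The server call is passed in as a plain function so the example stays self-contained. The function names and the decision to store the received value in the local table are assumptions, because this passage does not say whether the client keeps the received information.

```python
from typing import Callable, Dict, Optional


def scope_for_url(url: str, local_db: Dict[str, str],
                  request_from_server: Callable[[str], Optional[str]]
                  ) -> Optional[str]:
    """Return the extraction range for url, asking the server on a local miss."""
    scope = local_db.get(url)              # S330: check the local database
    if scope is None:
        scope = request_from_server(url)   # S331: request it from the server
        if scope is not None:
            local_db[url] = scope          # assumed: keep it for later lookups
    return scope
```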

If, in step S330, the extraction range information confirmation unit 112 determines that the text extraction range information database 130 does hold text extraction range information for the URL, then in step S333 the latest extraction range information request unit 115 of the calculation unit 110 transmits to the TTS server 300 a request to determine whether the information in the text extraction range information database 130 is the latest and, if it is not, to provide the latest information. The latest extraction range information determination unit 310 of the TTS server 300 responds to this request by referring to the information stored in the latest extraction range information database 500 and determining whether the information in the text extraction range information database 130 is the latest. If it is, the TTS server 300 transmits a predetermined signal indicating this to the user computer 100. If it is not, the latest extraction range information acquisition unit 330 can retrieve the latest text extraction range information from the latest extraction range information database 500 and transmit it to the calculation unit 110 of the user computer 100.

In step S340, the calculation unit 110 receives the text extraction range information that the TTS server 300 transmits in response to the request issued in step S331 or step S333. That is, the calculation unit 110 receives information indicating whether, out of the text of the web page currently displayed in the web browser, to extract only the word at the mouse-over position, a sentence or paragraph, or the full text of the web page. Because the latest extraction range information database 500 referred to by the TTS server 300 stores up-to-date text extraction range information for each URL, the text extraction range information received from the TTS server 300 in step S340 is always the latest.

The calculation unit 110 also stores the latest text extraction range information received from the TTS server 300 in the text extraction range information database 130. If text extraction range information for the current web page exists in the text extraction range information database 130 but the TTS server 300 determines that it is not the latest, the stored information is updated to the latest information transmitted by the TTS server 300; otherwise, the update can be omitted. If the determination in step S330 is "No", the received latest text extraction range information is newly stored in the text extraction range information database 130.
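As a non-limiting illustration, the client-side flow of steps S330 to S340 can be sketched as follows. This is only a sketch under stated assumptions: the specification defines no data format, so `RangeInfo`, its `version` field, `resolve_range`, and the two callbacks standing in for the requests to the TTS server 300 are all hypothetical names.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional


@dataclass
class RangeInfo:
    """Hypothetical record for one URL's text extraction range."""
    extraction_range: str  # "word", "sentence", "paragraph" or "full_text"
    version: int           # assumed freshness marker; not in the specification


def resolve_range(
    url: str,
    local_db: Dict[str, RangeInfo],
    fetch_from_server: Callable[[str], RangeInfo],
    refresh_if_stale: Callable[[str, RangeInfo], Optional[RangeInfo]],
) -> RangeInfo:
    """Return the extraction range for url, updating local_db along the way."""
    cached = local_db.get(url)             # S330: is there a local entry?
    if cached is None:
        info = fetch_from_server(url)      # S331: request it from the server
        local_db[url] = info               # newly stored in the local database
        return info
    newer = refresh_if_stale(url, cached)  # S333: ask whether the entry is latest
    if newer is not None:
        local_db[url] = newer              # stale entry is overwritten
        return newer
    return cached                          # already latest, update omitted
```

Here `refresh_if_stale` is assumed to return `None` when the server confirms that the local entry is already the latest, which plays the role of the predetermined signal.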

In step S350, the extraction method needed to extract a word, sentence, paragraph, or the full text can be determined based on the text extraction range information received from the TTS server 300 or on the latest text extraction range information already stored in the text extraction range information database 130. Examples of text extraction methods according to the present invention are described later.

In step S360, text is extracted based on the text extraction range and text extraction method determined in the preceding steps. The extracted text can be visually distinguished from the text that was not extracted, for example by displaying it in reverse video. The user can therefore see which part of the web page's text was extracted, and can thereby indirectly confirm what characteristics the web page has. Furthermore, if the user judges that the text extraction range assigned to the web page is inappropriate, the user can report this to the TTS server 300 as user feedback.

In step S370, the text extracted in step S360 is transmitted to the TTS server 300. The TTS conversion unit 370 of the TTS server 300 converts the received text into audio data by referring to the speech conversion database 700, which stores the information needed to convert text into speech, and can then transmit the audio data to the user computer 100. The speech conversion database 700 can store audio data for each encoded text, or corresponding audio data for each word, sentence, or paragraph.

In step S380, the audio data transmitted from the TTS server 300 is delivered to the user computer 100.

In step S390, the received audio data is provided by the audio data providing unit 119 of the calculation unit 110 and can be output from the output unit 180, such as a speaker.

This specification describes an embodiment in which the user computer 100 has a text extraction range information database 130 separate from the TTS server 300, and in which text is basically extracted based on the text extraction range information stored there. However, it should be understood that this component can be omitted, with the latest extraction range information database 500 serving as the single reference database for deciding which range of text to extract. It should also be understood that, unlike the exemplary description above, speech conversion can be performed on the user computer 100 rather than by the TTS server 300 referring to the speech conversion database 700. Likewise, in a modification of the present invention, the text extraction referred to herein can be performed by the TTS server 300 as an alternative to the user computer 100.

Utilization of Information on Text Extraction Range
According to an embodiment of the present invention, different ranges of text can be extracted and used depending on the characteristics of a web page. Examples of the web page characteristics that serve as criteria for differentiating the text extraction range are described below.

Each web page that a user accesses with the user computer 100 has its own URL and has certain characteristics. Depending on the attributes of their content, web pages can be classified into various categories, such as news article pages, lifestyle information pages, shopping information pages, encyclopedia pages, language dictionary pages, full-text information pages, and blog pages. If the content of a web page is a news article, a user viewing the page will try to grasp the full text of the article or the content of a given paragraph rather than focus on a particular word or sentence. By contrast, a user viewing a web page that handles full-text information, such as the applicant's well-known knowledge service "Knowledge iN", will be interested only in the question and its answers. A user viewing an encyclopedia or language dictionary page is likely to be interested only in the definition of a particular word and the example sentences that illustrate it. The range of text extracted as the basis for a text-based service must therefore vary with the attributes and type of the content on the web page. For example, for a web page containing a news article, it is preferable to extract the text in paragraph or full-text units, whereas for a dictionary page it is preferable to first extract only the text corresponding to the word and its associated explanation.

To this end, the text extraction range information database 130 according to an embodiment of the present invention can store text extraction range information that differs according to the characteristics of each web page. In the text extraction range information database 130, a web page's URL or similar identifier can be stored in correspondence with its text extraction range information.
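By way of example only, a lookup that first consults per-URL entries and then falls back to per-type entries might look like the sketch below. The table contents, the `example.com` hosts, and the function name `lookup_range` are illustrative assumptions; the specification states only that range information can be stored per URL or collectively per page type.

```python
from typing import Optional
from urllib.parse import urlparse

# Hypothetical per-URL entries (range information stored for individual URLs).
RANGE_BY_URL = {"http://dic.example.com/entry/1": "word"}

# Hypothetical per-type entries (range information stored collectively by type).
PAGE_TYPE_BY_HOST = {
    "news.example.com": "news",
    "dic.example.com": "dictionary",
}
RANGE_BY_PAGE_TYPE = {"news": "paragraph", "dictionary": "word"}


def lookup_range(url: str) -> Optional[str]:
    """Return the stored extraction range for url, or None if nothing is stored."""
    if url in RANGE_BY_URL:                      # exact per-URL entry wins
        return RANGE_BY_URL[url]
    host = urlparse(url).netloc                  # part of the identifier
    page_type = PAGE_TYPE_BY_HOST.get(host)      # map the host to a page type
    return RANGE_BY_PAGE_TYPE.get(page_type)     # None when the type is unknown
```

A result of `None` corresponds to the case in which no information is stored locally and the information must be requested from the TTS server 300.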

If necessary, the information in the extraction range information database 130 can be changed or deleted at the user's online or offline request. Preferably, only the provider of the TTS service has authority to access the information in the extraction range information database 130. As described above, the information in the extraction range information database 130 can be updated to the latest information through communication with the TTS server 300. For this purpose, the TTS server 300 may be configured to include the extraction range information database 130, or may be configured to use the latest extraction range information database 500 with which it communicates.

Acquisition of Information on Latest Extraction Range
The following describes in detail the process, in an embodiment of the present invention, by which the extraction range information confirmation unit 112 of the calculation unit 110 checks whether text extraction range information exists and, based on the result, the latest extraction range information is acquired from the TTS server 300.

As described above, the extraction range information confirmation unit 112 checks whether text extraction range information for the current web page exists in the text extraction range information database 130 of the user computer 100.

For example, if it is determined that the text extraction range information database 130 holds no text extraction range information for the URL of the current web page, the extraction range information request unit 113 of the calculation unit 110 requests that information from the TTS server 300.

In response, the latest extraction range information acquisition unit 330 of the TTS server 300 refers to the latest extraction range information database 500, acquires the text extraction range information requested by the extraction range information request unit 113, and transmits it to the calculation unit 110 of the user computer 100. The calculation unit 110 receives the text extraction range information, stores it in the text extraction range information database 130, and extracts the text of the current web page based on it.

On the other hand, according to an embodiment of the present invention, if the extraction range information confirmation unit 112 determines that text extraction range information for the current web page already exists in the text extraction range information database 130, the latest extraction range information request unit 115 of the calculation unit 110 can ask the TTS server 300 to determine whether that information is the latest.

The latest extraction range information determination unit 310 of the TTS server 300 can then refer to the latest extraction range information database 500 to determine whether the text extraction range information currently in the text extraction range information database 130 is the latest.

If the information in the text extraction range information database 130 is identical to the information in the latest extraction range information database 500, the TTS server 300 can transmit to the calculation unit 110 of the user computer 100 a predetermined signal confirming that the information in the text extraction range information database 130 is the latest.

If, on the other hand, the information in the text extraction range information database 130 is determined to differ from the information in the latest extraction range information database 500, the TTS server 300 can transmit the information in the latest extraction range information database 500 to the calculation unit 110 of the user computer 100. The calculation unit 110 can then replace the information stored in the text extraction range information database 130 with the received information.
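The server-side comparison and the client-side replacement described above can be sketched, purely for illustration, as follows. The reply format, the constants `UP_TO_DATE` and `REPLACED`, and the function names are assumptions; the specification speaks only of "a predetermined signal" and of transmitting the latest information. The sketch also assumes that the latest database always has an entry for the URL.

```python
from typing import Dict, Optional, Tuple

UP_TO_DATE = "up_to_date"  # stands in for the "predetermined signal"
REPLACED = "replaced"      # reply that carries the latest information

Reply = Tuple[str, Optional[str]]


def judge_latest(url: str, client_range: str, latest_db: Dict[str, str]) -> Reply:
    """Server side: compare the client's stored range with the latest one."""
    latest = latest_db[url]          # assumed to exist for every known URL
    if client_range == latest:
        return UP_TO_DATE, None      # identical: only the signal is sent
    return REPLACED, latest          # different: send the latest information


def apply_reply(url: str, reply: Reply, local_db: Dict[str, str]) -> None:
    """Client side: replace the stored range when the server sent a newer one."""
    status, latest = reply
    if status == REPLACED and latest is not None:
        local_db[url] = latest
```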

Text Extraction Method
According to an embodiment of the present invention, text contained in a web page can be extracted in units of a word, sentence, paragraph, or full text using the MSAA (Microsoft Active Accessibility) method or the IHTML (InnerHTML) method. According to an embodiment of the present invention, the extraction method is likewise determined, as needed, based on the characteristics of the web page. Here, the MSAA method extracts a predetermined range of text in a web page using predetermined functions provided with the widely used Internet Explorer (registered trademark) web browser, and the IHTML method extracts text in tag units from a web page written in HTML (for example, by extracting the text between predetermined tags). The text extraction method according to the present invention can be determined by the extraction method determination unit 117 shown in FIG. 2A.

For example, assume that a user is accessing a web page written with the following HTML source.

<div class='knCnt' style='overflow:hidden;word-wrap:break-word;word-break:break-all;'>
<P>Mathematics&nbsp;is closely related to science and is an important discipline that is needed in many disciplines</P>
<P>Why is there no Nobel Prize?</P>
<P>Please also write in detail about the Fields Medal</P>
<P>They say it is the Nobel Prize of mathematics...</P>
</div>

If the mouse-over recognition unit 111 of the calculation unit 110 recognizes that a mouse-over event occurred at the word "science" in the phrase "closely related to science", then under the MSAA method the text between the nearest tags before and after that position (that is, <P> and </P> in the example) can be extracted: "Mathematics is closely related to science and is an important discipline that is needed in many disciplines". Under the IHTML method, on the other hand, text can be extracted in units of the <P> HTML tag, but it is also possible to fetch the entire HTML document and extract text based on the <div> tag. If text is extracted based on the <div> tag in this way, the entire text is extracted.
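The difference between paragraph-level and block-level extraction can be illustrated with the following sketch. It does not use MSAA or the browser's innerHTML property; it substitutes Python's standard `html.parser` module purely to show the idea of extracting text between tags. The class and function names and the shortened sample markup are assumptions made for illustration.

```python
from html.parser import HTMLParser
from typing import List, Optional


class ParagraphCollector(HTMLParser):
    """Collects the text found between each <p> ... </p> pair."""

    def __init__(self) -> None:
        super().__init__(convert_charrefs=True)
        self.paragraphs: List[str] = []
        self._buf: Optional[List[str]] = None  # None while outside a <p>

    def handle_starttag(self, tag, attrs):
        if tag == "p":                 # tag names arrive lower-cased
            self._buf = []

    def handle_endtag(self, tag):
        if tag == "p" and self._buf is not None:
            text = "".join(self._buf).replace("\xa0", " ").strip()
            self.paragraphs.append(text)
            self._buf = None

    def handle_data(self, data):
        if self._buf is not None:
            self._buf.append(data)


def _collect(html: str) -> List[str]:
    parser = ParagraphCollector()
    parser.feed(html)
    parser.close()
    return parser.paragraphs


def paragraph_at(html: str, hovered_text: str) -> Optional[str]:
    """Paragraph-level: text between the <p> tags that enclose hovered_text."""
    for paragraph in _collect(html):
        if hovered_text in paragraph:
            return paragraph
    return None


def whole_block(html: str) -> str:
    """Block-level: all paragraph text inside the enclosing block."""
    return "\n".join(_collect(html))


SAMPLE = (
    "<div class='knCnt'>"
    "<P>Mathematics&nbsp;is closely related to science</P>"
    "<P>Why is there no Nobel Prize?</P>"
    "</div>"
)
```

With this sample, `paragraph_at(SAMPLE, "science")` yields only the first paragraph, while `whole_block(SAMPLE)` yields both paragraphs, mirroring the <P>-based and <div>-based extraction described above.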

That is, when text at the mouse-over position on a web page is extracted based on the information in the text extraction range information database 130 or the latest extraction range information database 500, and sentence-level extraction is preferable, it is useful for the extraction method determination unit 117 of the calculation unit 110 to select the MSAA method. When, on the other hand, the characteristics of the web page call for extracting a paragraph or the full text, it is preferable to select the IHTML method, which can easily extract text based on predetermined HTML tags.
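The selection rule above can be written, as a minimal sketch, in the following form. The string labels and the function name are hypothetical. The text states a preference only for sentence-level extraction (MSAA) and for paragraph or full-text extraction (IHTML); treating word-level extraction like sentence-level extraction is an assumption of this sketch.

```python
def choose_extraction_method(extraction_range: str) -> str:
    """Pick an extraction method label for a given extraction range."""
    if extraction_range in ("paragraph", "full_text"):
        return "IHTML"   # tag-based extraction suits larger units
    if extraction_range == "sentence":
        return "MSAA"    # preferred for sentence-level extraction
    if extraction_range == "word":
        return "MSAA"    # assumption: not stated in the specification
    raise ValueError(f"unknown extraction range: {extraction_range!r}")
```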

The method for extracting text based on the characteristics of a web page according to embodiments of the present invention can be implemented as program instructions for executing various computer-implemented operations, and can be provided as a computer-readable recording medium on which those programs are recorded. The computer-readable recording medium can include program instructions, data files, data structures, and the like, alone or in combination. The program instructions recorded on the medium may be specially designed and configured for the present invention, or may be known and available to those skilled in the art of computer software. Examples of computer-readable recording media include magnetic media such as hard disks, floppy (registered trademark) disks, and magnetic tape; optical media such as CD-ROMs and DVDs; magneto-optical media such as floptical disks; and hardware devices specially configured to store and execute program instructions, such as read-only memory (ROM), random access memory (RAM), and flash memory. Examples of program instructions include not only machine code such as that produced by a compiler, but also high-level language code that a computer can execute using an interpreter or the like. The hardware devices described above can be configured to operate as one or more software modules in order to perform the operations of the present invention, and vice versa.

As described above, the present invention has been explained with reference to specific matters such as concrete components, and to limited embodiments and drawings. These are provided only to aid overall understanding of the present invention. The present invention is not limited to the embodiments described above, and a person having ordinary knowledge in the field to which the present invention belongs can make various modifications and variations from this description.

Although specific embodiments have been described in the detailed description of the present invention, various modifications can be made without departing from the gist of the present invention. The scope of rights of the present invention should therefore not be limited to the embodiments described above, but should be determined by the claims and their equivalents.

A diagram showing the schematic configuration of a text extraction system according to an embodiment of the present invention.
A diagram showing the detailed configuration of the user computer in the text extraction system shown in FIG. 1.
A diagram showing the detailed configuration of the TTS server in the text extraction system shown in FIG. 1.
A flowchart showing the process of extracting text and converting the extracted text into speech according to an embodiment of the present invention.

Explanation of symbols

100 User computer
110 Calculation unit
130 Text extraction range information database
150 Program storage unit
170 User input unit
180 Output unit
300 TTS server
500 Latest extraction range information database
700 Speech conversion database

Claims

1. A method for extracting text based on characteristics of a web page, the method comprising the following steps performed by a computer:
recognizing a text pointer on the web page;
identifying, from among text extraction range information that is stored in correspondence with at least a part of an identifier of a web page and that indicates, for each web page characteristic, which range among a word, a sentence, a paragraph, and a full text is to be extracted, the text extraction range information for the web page on which the text pointer was recognized;
determining, based on the position of the recognized text pointer on the web page and the identified text extraction range information, whether the text in the web page corresponding to the position of the text pointer is to be extracted in word, sentence, paragraph, or full-text units, thereby determining a text extraction range that differs according to the characteristics of the web page on which the text pointer was recognized; and
extracting text in the web page on which the text pointer was recognized, based on the determined text extraction range.

2. A method for extracting text based on characteristics of a web page, the method comprising the following steps performed by a computer:
recognizing a text pointer on the web page;
referring to a text extraction information database that stores, in correspondence with at least a part of an identifier of a web page, text extraction range information set in advance for each web page characteristic and indicating which range among a word, a sentence, a paragraph, and a full text is to be extracted, and determining whether text extraction range information for the web page on which the text pointer was recognized is stored there;
if it is determined that the text extraction range information is not stored in the text extraction information database, receiving the text extraction range information of the web page on which the text pointer was recognized from a separate latest text extraction information database that stores text extraction range information for each web page characteristic;
determining, based on the position of the recognized text pointer on the web page and the received text extraction range information, whether the text in the web page corresponding to the position of the text pointer is to be extracted in word, sentence, paragraph, or full-text units, thereby determining a text extraction range that differs according to the characteristics of the web page on which the text pointer was recognized; and
extracting text in the web page on which the text pointer was recognized, based on the determined text extraction range.

3. The method according to claim 1 or 2, wherein the step of recognizing the text pointer on the web page recognizes the text pointer by determining whether a mouse-over event has occurred for text in the web page.

4. The method according to claim 3, wherein the mouse-over event is determined to have occurred when the mouse pointer remains in a predetermined area of the web page for a predetermined time or longer.

5. The method according to claim 1 or 2, wherein the identifier of the web page is a URL.

6. The method according to claim 2, wherein only the latest text extraction range information is stored in the text extraction information database.

7. The method according to claim 1 or 2, wherein the step of determining the text extraction range further comprises determining whether the text of the web page is to be extracted by the MSAA (Microsoft Active Accessibility) method or by the IHTML (InnerHTML) method.

8. A method of converting text to speech, the method comprising the following step performed by a computer:
generating audio data associated with text extracted by the method according to claim 1 or 2.

9. The method according to claim 8, wherein the generated audio data is audio data corresponding to the extracted text.

10. The method according to claim 8, wherein the generated audio data is audio data corresponding to text obtained by translating the extracted text.

11. A system for extracting text based on characteristics of a web page, the system comprising:
a text pointer recognition unit that recognizes a text pointer on the web page;
a text extraction range information confirmation unit that identifies, from among text extraction range information that is stored in correspondence with at least a part of an identifier of a web page and that indicates, for each web page characteristic, which range among a word, a sentence, a paragraph, and a full text is to be extracted, the text extraction range information for the web page on which the text pointer was recognized;
a text extraction range determination unit that determines, based on the position of the recognized text pointer on the web page and the identified text extraction range information, whether the text in the web page corresponding to the position of the text pointer is to be extracted in word, sentence, paragraph, or full-text units, and thereby determines a text extraction range that differs according to the characteristics of the web page on which the text pointer was recognized; and
a text extraction unit that extracts text in the web page on which the text pointer was recognized, based on the determined text extraction range.

A system for extracting text based on the characteristics of a web page, the system comprising:
a text extraction information database that stores, in correspondence with at least a part of the identifier of a web page, text extraction range information that is set in advance for each characteristic of the web page and that relates to which range among a word, a sentence, a paragraph, and the full text is to be extracted;
a text pointer recognition unit that recognizes a text pointer on the web page;
a text extraction range information receiving unit that refers to the text extraction information database to determine whether the text extraction range information for the web page on which the text pointer is recognized is stored therein and that, in accordance with that determination, receives the text extraction range information of the web page on which the text pointer is recognized from a separate latest text extraction information database that stores text extraction range information for each characteristic of the web page;
a text extraction range determination unit that determines, based on the position of the recognized text pointer on the web page and the received text extraction range information, whether the text in the web page corresponding to the position of the text pointer is to be extracted in units of a word, a sentence, a paragraph, or the full text, and thereby determines a text extraction range that differs according to the characteristics of the web page on which the text pointer is recognized; and
a text extraction unit that extracts text in the web page on which the text pointer is recognized, based on the determined text extraction range.
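
The interaction between the local text extraction information database and the separate latest database can be sketched roughly as follows. This Python sketch makes an assumption that the translated claim text does not confirm: that the latest database is consulted only when the local database has no entry, and that the received entry is then kept locally. The class RangeInfoReceiver, its method, and the dictionary-based databases are all hypothetical.

```python
from typing import Dict, Optional


class RangeInfoReceiver:
    """Hypothetical stand-in for the text extraction range information receiving unit."""

    def __init__(self, local_db: Dict[str, str], latest_db: Dict[str, str]) -> None:
        # Both databases are modeled as plain dicts keyed by a page identifier.
        self.local_db = local_db
        self.latest_db = latest_db

    def get_range(self, page_id: str) -> Optional[str]:
        """Return the extraction range for `page_id`, or None if unknown."""
        # Step 1: check whether the local database already holds an entry.
        if page_id in self.local_db:
            return self.local_db[page_id]
        # Step 2 (assumed behavior): on a miss, ask the latest database.
        info = self.latest_db.get(page_id)
        if info is not None:
            # Keep the received entry so the next lookup is served locally.
            self.local_db[page_id] = info
        return info
```

In a deployed system the latest database would presumably sit behind a network call; a dictionary is used here only to keep the sketch self-contained.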

The system according to claim 11 or 12, wherein the text pointer recognition unit recognizes the text pointer on the web page by determining whether a mouse-over event has occurred for text in the web page.

The system according to claim 13, wherein the mouse-over event is determined to have occurred when the mouse pointer remains in a predetermined area of the web page for a predetermined time or longer.
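
The dwell condition described here (the pointer staying inside a predetermined area for at least a predetermined time) might be checked as in the Python sketch below. The rectangle representation, the DwellDetector name, and the explicit `now` timestamp argument are illustrative assumptions, not details from the patent.

```python
from dataclasses import dataclass, field
from typing import Optional, Tuple

Rect = Tuple[int, int, int, int]  # (left, top, right, bottom) in page pixels


@dataclass
class DwellDetector:
    """Hypothetical detector: reports a mouse-over once the pointer has
    stayed inside `area` for at least `dwell_seconds`."""
    area: Rect
    dwell_seconds: float
    _entered_at: Optional[float] = field(default=None, init=False)

    def update(self, x: int, y: int, now: float) -> bool:
        """Feed one pointer sample; return True if the dwell time is reached."""
        left, top, right, bottom = self.area
        inside = left <= x < right and top <= y < bottom
        if not inside:
            # Leaving the area resets the timer.
            self._entered_at = None
            return False
        if self._entered_at is None:
            # First sample inside the area starts the timer.
            self._entered_at = now
        return now - self._entered_at >= self.dwell_seconds
```

Passing the timestamp in explicitly, instead of reading a clock inside the method, keeps the sketch deterministic and easy to exercise with recorded pointer samples.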

The system according to claim 11 or 12, wherein the identifier of the web page is a URL.
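
Because the extraction range information is stored against at least a part of the identifier, and the identifier may be a URL, one plausible reading is a lookup keyed by a URL prefix. The Python sketch below assumes keys of the form host plus optional path prefix and picks the longest matching key; the function name lookup_range, the table layout, and the example.com hosts are hypothetical.

```python
from typing import Dict, Optional, Tuple
from urllib.parse import urlsplit


def lookup_range(url: str, table: Dict[str, str]) -> Optional[str]:
    """Return the extraction range whose key is the longest prefix of the URL.

    Assumption: table keys are "host" or "host/path-prefix" strings, so only
    part of the URL needs to match.
    """
    parts = urlsplit(url)
    key = parts.netloc + parts.path  # ignore scheme, query string, fragment
    best: Optional[Tuple[str, str]] = None
    for prefix, info in table.items():
        if key.startswith(prefix) and (best is None or len(prefix) > len(best[0])):
            best = (prefix, info)
    return best[1] if best is not None else None
```

Preferring the longest prefix lets a site-wide default (for example, paragraphs on a news site) coexist with a more specific rule for one section of that site.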

The system according to claim 12, wherein only the latest text extraction range information is stored in the text extraction information database.

The system according to claim 11 or 12, wherein the text extraction range determination unit determines whether the text of the web page is extracted by an MSAA method or an IHTML method.

A system for converting text to speech, the system comprising:
a text pointer recognition unit that recognizes a text pointer on a web page;
a text extraction range information confirmation unit that identifies, from among text extraction range information that is stored in correspondence with at least a part of the identifier of a web page and that relates to which range among a word, a sentence, a paragraph, and the full text is to be extracted for each characteristic of the web page, the text extraction range information for the web page on which the text pointer is recognized;
a text extraction range determination unit that determines, based on the position of the recognized text pointer on the web page and the identified text extraction range information, whether the text in the web page corresponding to the position of the text pointer is to be extracted in units of a word, a sentence, a paragraph, or the full text, and thereby determines a text extraction range that differs according to the characteristics of the web page on which the text pointer is recognized;
a text extraction unit that extracts text in the web page on which the text pointer is recognized, based on the determined text extraction range; and
an audio data generation unit that generates audio data associated with the extracted text.

A system for converting text to speech, the system comprising:
a text extraction information database that stores, in correspondence with at least a part of the identifier of a web page, text extraction range information that is set in advance for each characteristic of the web page and that relates to which range among a word, a sentence, a paragraph, and the full text is to be extracted;
a text pointer recognition unit that recognizes a text pointer on the web page;
a text extraction range information receiving unit that refers to the text extraction information database to determine whether the text extraction range information for the web page on which the text pointer is recognized is stored therein and that, in accordance with that determination, receives the text extraction range information of the web page on which the text pointer is recognized from a separate latest text extraction information database that stores text extraction range information for each characteristic of the web page;
a text extraction range determination unit that determines, based on the position of the recognized text pointer on the web page and the received text extraction range information, whether the text in the web page corresponding to the position of the text pointer is to be extracted in units of a word, a sentence, a paragraph, or the full text, and thereby determines a text extraction range that differs according to the characteristics of the web page on which the text pointer is recognized;
a text extraction unit that extracts text in the web page on which the text pointer is recognized, based on the determined text extraction range; and
an audio data generation unit that generates audio data associated with the extracted text.

The system according to claim 18 or 19, wherein the audio data generation unit generates audio data corresponding to the extracted text.

The system according to claim 18 or 19, wherein the audio data generation unit generates audio data corresponding to text obtained by translating the extracted text.
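
The two variants of the audio data generation unit (speech for the extracted text itself, or for its translation) can be pictured as one function with an optional translation step. In the Python sketch below, `synthesize` and `translate` are hypothetical callables supplied by the caller; no particular speech or translation engine is implied by the patent or assumed here.

```python
from typing import Callable, Optional


def generate_audio(
    extracted_text: str,
    synthesize: Callable[[str], bytes],
    translate: Optional[Callable[[str], str]] = None,
) -> bytes:
    """Produce audio data for the extracted text or for its translation.

    `synthesize` and `translate` are placeholders for real engines
    (assumption); they are injected so the sketch stays self-contained.
    """
    # Without a translator, the audio corresponds to the extracted text as is.
    # With a translator, the audio corresponds to the translated text.
    text = translate(extracted_text) if translate is not None else extracted_text
    return synthesize(text)
```

A caller would pass the output of the text extraction unit as `extracted_text` and choose whether to supply `translate` depending on the service being offered.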

A computer-readable recording medium storing a program for causing a computer to execute the steps of the method according to claim 1 or 2.