JP2010282612A - Web reader system using TTS server and method thereof

Info

Publication number: JP2010282612A
Application number: JP2010103816A
Authority: JP
Inventors: Young Gug Kim
Original assignee: VOICEWARE CO Ltd
Current assignee: VOICEWARE CO Ltd
Priority date: 2009-06-05
Filing date: 2010-04-28
Publication date: 2010-12-16
Also published as: KR20100131172A; KR101040585B1

Abstract

PROBLEM TO BE SOLVED: To provide a TTS service that plays back, as speech, the text indicated by the mouse pointer on a web page, regardless of the type of operating system or web browser.

SOLUTION: A web reader system includes a web reader WAS client 30, a web reader WAS 40, a TTS server 50, a TTS engine 60, and a voice database 70. The system operates through the steps of: extracting text; transferring the extracted text to the TTS engine; synthesizing speech; transferring the synthesized voice data to a TTS web reader client; modifying the web page so that the web browser can play speech based on the transferred voice data; and playing the speech. JavaScript is used to send the text information of the web page to the TTS server and to output the TTS server's voice data to the web page.

COPYRIGHT: (C) 2011, JPO & INPIT

Description

The present invention relates to a web reader system and method using a TTS server. More specifically, it relates to a system and method that extract the text at the position of the mouse pointer on a web page, synthesize it into speech, and play it back.

With the recent rapid development of TTS (Text To Speech) technology, there is active research into applying TTS functions in various ways to make everyday life more convenient.

There are systems that provide information such as bank account, stock, and weather information by voice over the telephone, and products have recently appeared that read received e-mail aloud through TTS.

On the Internet in particular, technologies have been proposed that synthesize Internet content into speech or that let users access a desired web page by voice.

However, to use the TTS function on existing web pages of this kind, a function that depends on the type of operating system or web browser (for example, ActiveX (registered trademark)) must be used, which restricts users to a particular operating system (for example, Windows (registered trademark)) and a particular web browser (for example, Internet Explorer (registered trademark)).

Therefore, to make the service work regardless of the type of operating system or web browser, the present inventor developed a system that uses JavaScript (registered trademark) to bring the text information of a web page to the TTS server and to deliver the TTS server's voice data to the web page.

An object of the present invention is to provide a TTS service on a web page.

Another object of the present invention is to provide a TTS service that can be used regardless of the type of operating system or web browser.

The above and other objects of the present invention can all be achieved by the present invention described below.

The web reader system using the TTS server of the present invention is characterized in that, to provide a TTS service that operates regardless of the type of operating system or web browser, it uses JavaScript to bring the text information of a web page to the TTS server and to deliver the TTS server's voice data to the web page.

The present invention provides a TTS service that can be used regardless of the type of operating system or web browser.

FIG. 1 is a configuration diagram of a web reader system using a TTS server according to the present invention.
FIG. 2 is a flowchart of the operation of the web reader system using the TTS server of the present invention.
FIG. 3 is an example of source code that extracts text in the TTS web reader WAS client of the present invention.

FIG. 1 is a schematic configuration diagram of a system according to the present invention.

Referring to FIG. 1, the present invention comprises a web reader WAS client (30), a web reader WAS (40), a TTS server (50), a TTS engine (60), and a voice database (70).

The web reader WAS client (30) extracts the text at the position of the mouse pointer on the web page, passes the extracted text to the web reader WAS, and modifies the web page (20) in real time so that the voice data returned from the web reader WAS can be played in the web browser (10). JavaScript is used both to extract the text and to modify the web page in real time so that the voice data is played.

The web reader WAS (40) transmits the text extracted by the web reader WAS client (30) to the TTS server, receives voice data from the TTS server, and retransmits it to the web reader WAS client (30) using an appropriate web protocol so that the web browser can interpret it.

The TTS server (50) transmits the text information received from the web reader WAS (40) to the TTS engine and serves the voice data synthesized by the TTS engine to the web reader WAS over TCP/IP.
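
The description states only that the TTS server and the web reader WAS communicate over TCP/IP; it does not describe a wire format. Because TCP delivers a byte stream with no message boundaries, some framing is needed to know where one text request ends. The following is a minimal sketch for Node.js under the assumption of a 4-byte big-endian length prefix followed by a UTF-8 payload. The function names `encodeFrame` and `decodeFrame` and the frame layout are hypothetical and not taken from the patent.

```javascript
// Assumed wire format (NOT from the patent):
// 4-byte big-endian payload length, then the UTF-8 encoded text.
function encodeFrame(text) {
  const payload = Buffer.from(text, "utf8");
  const header = Buffer.alloc(4);
  header.writeUInt32BE(payload.length, 0);
  return Buffer.concat([header, payload]);
}

// Returns { text, rest } once a whole frame has been buffered,
// or null if more bytes must arrive first.
function decodeFrame(buffer) {
  if (buffer.length < 4) return null;
  const length = buffer.readUInt32BE(0);
  if (buffer.length < 4 + length) return null;
  return {
    text: buffer.subarray(4, 4 + length).toString("utf8"),
    rest: buffer.subarray(4 + length),
  };
}
```

A receiver would append each incoming chunk to a buffer and call `decodeFrame` repeatedly until it returns `null`, carrying `rest` over to the next chunk.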

The TTS engine (60) synthesizes voice data using the voice database (70), based on the text information extracted by the web reader WAS client (30).

The voice database (70) is a database containing the voices used by the TTS engine.

FIG. 2 is a flowchart of the operation of the web reader system using the TTS server of the present invention. The operations proceed in the following order.

The first step is extracting text. The TTS web reader WAS client (30) uses JavaScript to extract the valid text at the current position of the mouse pointer on the web page (20), which is located on the web server accessed through the web browser (10).

FIG. 3 is an example of source code with which the TTS web reader WAS client (30) extracts text. The copyright of the source code shown in FIG. 3 belongs to the applicant.
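
The code of FIG. 3 is not reproduced in this text. As an independent illustration of the first step, the sketch below collects the readable text beneath a DOM-like node, skips script and style content, and collapses whitespace. It is not the applicant's code: the function name `extractValidText`, the list of skipped tags, and the reading of "valid text" as "visible, whitespace-normalized text" are all assumptions.

```javascript
// Hypothetical helper, not the FIG. 3 source code.
// Works on any object exposing nodeType, nodeValue, tagName and childNodes,
// which is the shape of browser DOM nodes.
const ELEMENT_NODE = 1;
const TEXT_NODE = 3;
const SKIPPED_TAGS = new Set(["SCRIPT", "STYLE", "NOSCRIPT"]);

function extractValidText(node) {
  const parts = [];
  (function walk(current) {
    if (current.nodeType === TEXT_NODE) {
      parts.push(current.nodeValue);
    } else if (
      current.nodeType === ELEMENT_NODE &&
      !SKIPPED_TAGS.has(String(current.tagName).toUpperCase())
    ) {
      for (const child of current.childNodes || []) walk(child);
    }
  })(node);
  // Joining with a space keeps text from adjacent elements apart
  // (it can also split a word that spans inline elements).
  // Whitespace runs are then collapsed to single spaces.
  return parts.join(" ").replace(/\s+/g, " ").trim();
}
```

In a browser, the node could be obtained in a `mousemove` handler with `document.elementFromPoint(event.clientX, event.clientY)` and then passed to `extractValidText`.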

The second step is transmitting the text extracted in the first step to the TTS engine. The TTS web reader WAS client (30) passes the extracted text to the TTS web reader WAS (40). The TTS web reader WAS in turn passes the text to the TTS server (50), and the TTS server transmits it to the TTS engine.
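
The description does not say how the client hands the text to the web reader WAS. One simple possibility is to encode the text into a request URL that the WAS answers with audio. The sketch below shows that idea using the standard `URL` API. The `/tts` path, the `text` parameter name, and the function name `buildSpeechRequestUrl` are hypothetical.

```javascript
// Hypothetical endpoint and parameter names; the patent does not specify them.
function buildSpeechRequestUrl(wasBaseUrl, text) {
  const url = new URL("/tts", wasBaseUrl);
  // searchParams percent-encodes the text, so spaces, "&" and
  // non-ASCII characters survive the round trip.
  url.searchParams.set("text", text);
  return url.toString();
}
```

Carrying the text in the URL keeps the request a plain GET, which means the resulting URL can be used directly as the source of a media element. Long passages would be better sent in a POST body.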

The third step is synthesizing speech from the text transmitted to the TTS engine (60) in the second step. The TTS engine searches the voice database (70) for the voice data corresponding to the transmitted text and synthesizes speech.
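
The lookup-and-synthesize step can be pictured with a toy example that looks up a recorded clip for each unit of text and concatenates the clips. This is only a sketch: the patent does not describe the engine's internals, real TTS engines do far more (text normalization, unit selection, prosody), and the word-level units, the `Map`-based database, and the function name `synthesize` are assumptions.

```javascript
// Toy concatenative synthesis; voiceDatabase is a Map from a lowercase
// word to an Int16Array of PCM samples (hypothetical data layout).
function synthesize(text, voiceDatabase) {
  const units = text.toLowerCase().split(/\s+/).filter(Boolean);
  const clips = units.map((unit) => {
    const clip = voiceDatabase.get(unit);
    if (!clip) throw new Error(`no voice data for "${unit}"`);
    return clip;
  });
  // Concatenate the clips into one contiguous sample buffer.
  const total = clips.reduce((sum, clip) => sum + clip.length, 0);
  const output = new Int16Array(total);
  let offset = 0;
  for (const clip of clips) {
    output.set(clip, offset);
    offset += clip.length;
  }
  return output;
}
```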

The fourth step is transmitting the synthesized voice data to the TTS web reader client. The synthesized voice data is transmitted through the TTS server (50) to the web reader WAS (40), and the web reader WAS retransmits it to the web reader WAS client (30) in a form that conforms to the web protocol.

The fifth step is modifying the web page based on the transmitted voice data. The web reader WAS client (30) modifies the web page (20) in real time so that the voice data received from the web reader WAS (40) can be played in the web browser.
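
The patent does not name the markup that is added to the page to make the browser play the voice data. As one possible illustration, the sketch below builds an HTML `audio` element whose source is the URL of the synthesized audio. The use of the `audio` element, and the function names `escapeAttribute` and `buildAudioMarkup`, are assumptions.

```javascript
// Escape a value for safe use inside a double-quoted HTML attribute.
function escapeAttribute(value) {
  return String(value)
    .replace(/&/g, "&amp;")
    .replace(/"/g, "&quot;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

// Hypothetical helper: markup that asks the browser to play the audio URL.
function buildAudioMarkup(audioUrl) {
  return `<audio src="${escapeAttribute(audioUrl)}" autoplay></audio>`;
}
```

In a browser, the client script could place this markup in a dedicated container element, replacing the previous player each time the pointer moves to new text. Browsers may block autoplay until the user has interacted with the page.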

The sixth step is playing the speech. Once the web reader WAS client (30) has modified the web page (20), the web browser (10) plays the speech through the modified web page.

Thus, with the web reader system using the TTS server of the present invention and its method, when the mouse pointer is placed over a web page, the text at the position of the mouse pointer is extracted, voice data is synthesized from it, and the synthesized voice data is played in real time in the web browser.

Simple modifications or changes of the present invention can easily be made by those of ordinary skill in the art, and all such modifications and changes fall within the scope of the present invention.

Description of symbols: 10: web browser; 20: web page; 30: web reader WAS client; 40: web reader WAS; 50: TTS server; 60: TTS engine; 70: voice database.

Claims

1. A web reader system using a TTS server, comprising:
a web reader WAS client (30) that extracts the text of the portion of a web page where the mouse pointer is located and modifies the web page in real time so that voice data can be played back by the web browser;
a web reader WAS (40) that receives text information from the web reader WAS client and transmits voice data to the web reader WAS client;
a TTS server (50) that receives text information from the web reader WAS and serves voice information to the web reader WAS using TCP/IP;
a voice database (70) that stores and manages voice data; and
a TTS engine (60) that receives, from the TTS server, the text information extracted by the web reader WAS client, synthesizes voice data using the voice database, and retransmits the synthesized voice data to the TTS server.

2. The web reader system using a TTS server according to claim 1, wherein the web reader WAS client uses JavaScript to extract text and to modify the web page in real time.

3. A method of driving a web reader system using a TTS server, the method comprising the steps of:
extracting the text of the portion where the mouse pointer is located using a TTS web reader client;
transmitting the extracted text to a TTS engine;
synthesizing speech based on the transmitted text;
transmitting the synthesized voice data to the TTS web reader client;
modifying the web page so that the web browser can play the speech based on the transmitted voice data; and
playing the speech.

4. The method of driving a web reader system using a TTS server according to claim 3, wherein JavaScript is used to extract the text of the portion where the mouse pointer is located and to modify the web page so that the speech can be played back by the web browser.