JP2009086597A

JP2009086597A - Text-to-speech conversion service system and method

Info

Publication number: JP2009086597A
Application number: JP2007259847A
Authority: JP
Inventors: Shiyunsuke Akifuji; 俊介秋藤
Original assignee: Hitachi Ltd
Current assignee: Hitachi Ltd
Priority date: 2007-10-03
Filing date: 2007-10-03
Publication date: 2009-04-23

Abstract

PROBLEM TO BE SOLVED: To prevent listeners from hearing an inappropriate term when audio data is created from parts of a text and the playback order of that audio data is arranged so that it sounds like the utterance of the inappropriate term.
SOLUTION: When text data is converted into audio data, text data indicating the reading of the audio data is also generated. If changing the playback order of the audio data corresponding to a plurality of partial text data makes the concatenated reading text match a preset reading-prohibited term, the audio data corresponding to the partial text data is replaced with predetermined audio data.
COPYRIGHT: (C)2009, JPO&INPIT

Description

The present invention relates to a text-to-speech conversion service system for blog sites (sites that publish and operate blogs (weblogs) on a Web server on behalf of users) and SNS sites (Social Networking Service: community-type Web sites). In this system, a sentence consisting of text data is input by a user via a network, converted into audio data, and published, so that the sentence is output as speech when another user views it.

Speech synthesizers have appeared that accumulate and analyze human voices, extract feature data representing their characteristics, and use that feature data to convert arbitrary text data into audio data, speaking with a natural, human-like accent. One such speech synthesizer is described in Hideyuki Mizuno et al., “Text-to-Speech Synthesis Technology Using Corpus-Based Approach,” NTT Technical Review, Vol. 2, No. 3, pp. 70-75, March 2004 (Non-Patent Document 1).

With such a speech synthesizer, depending on the listening environment, the output may sound as if the person who provided the voice underlying the feature data (hereinafter, the source speaker) were actually speaking. The source speaker may be an actor, or a voice actor who dubs the voice of an animated character (hereinafter, a character). If such a speech synthesizer is used in public, it can be made to utter inappropriate words, such as rough language that these actors or characters would never speak. Depending on the content converted into audio data, this may damage the image of the actors or voice actors.

Several techniques have been developed to solve this problem.

For example, Japanese Patent Application Laid-Open No. 5-165486 (Patent Document 1) describes a text-to-speech conversion device that converts an input sentence consisting of text data into a speech signal and outputs it as sound. The device comprises a reading-prohibition table that stores reading-prohibited terms; reading-prohibited term determination means that cuts the input sentence into words and searches the table to determine whether each word in the input sentence is a reading-prohibited term; and pronunciation prohibition means that, based on that determination, prohibits pronunciation of any word that is a reading-prohibited term.

The technique of Patent Document 1 also provides, in the same kind of text-to-speech conversion device, a reading-prohibition table that stores pairs of a reading-prohibited term and a replacement expression for that term; reading-prohibited term determination means that cuts the input sentence into words and searches the table to determine whether each word in the input sentence is a reading-prohibited term; and reading-prohibited term replacement means that, based on that determination, converts a reading-prohibited term into its replacement expression before it is pronounced.

Furthermore, Japanese Patent Application Laid-Open No. 2004-271727 (Patent Document 2) describes an audio data providing system that performs speech synthesis using the utterance content of a voice message specified by an orderer and the voice feature data of a specific speaker, and provides the resulting synthesized speech as audio data. The party that accepts the order receives the utterance content of the voice message specified by the orderer together with information selecting the speaker who is to utter the message. It then determines whether the utterance content contains an expression that is inappropriate for the selected speaker to utter. Only if no inappropriate expression is found does it perform speech synthesis using the utterance content and the selected speaker's voice feature data and provide the synthesized speech as audio data.

Non-Patent Document 1: Hideyuki Mizuno et al., “Text-to-Speech Synthesis Technology Using Corpus-Based Approach,” NTT Technical Review, Vol. 2, No. 3, pp. 70-75, March 2004
Patent Document 1: Japanese Patent Application Laid-Open No. 5-165486 (JP-A-5-165486)
Patent Document 2: Japanese Patent Application Laid-Open No. 2004-271727 (JP 2004-271727 A)

With the spread of the Internet, it has become common for individuals to set up Web pages to publish information. A typical example is the blog: a collection of Web pages in which an individual or group describes daily events or a specific topic. Many blogs carry photo, music, and video files, and people other than the author can access them with a Web browser to read the posted text, view the photos and videos, and listen to the music. Audio data created by a speech synthesizer can likewise be made public on a blog site or the like.

With the techniques described above, when audio data is published on a blog site or the like, a user can create audio data by converting fragments of a sentence and arrange the playback order of that audio data so that it reproduces the utterance of an inappropriate term. A listener then hears what sounds like the inappropriate term being uttered, which may damage the image of the actor or voice actor who is the source speaker.

The text-to-speech conversion service system and method of the present invention solve the above problem as follows. A plurality of partial text data is extracted from text data received from a connected terminal. The extracted partial text data is converted into audio data, and text data representing the reading of that audio data is generated. When the reading text data for the audio data corresponding to the plurality of partial text data is concatenated (that is, when the playback order of the audio data is changed) and the concatenated reading text matches a preset reading-prohibited term, the audio data corresponding to the partial text data is replaced with predetermined audio data.

In a further aspect, the speech synthesis site that performs speech synthesis and the content examination site that checks for reading-prohibited terms are each made independent sites and shared with other blog sites, which improves efficiency.

By using the reading text data corresponding to the audio data, the present invention can prevent the utterance of inappropriate terms that would otherwise result from speech synthesis of partial text data.

Embodiments in which text data is converted into audio data and played back on a blog site or the like are described below.

Embodiment 1 is described in detail with reference to FIGS. 1 to 9. Embodiment 1 consists of three parts: personal computers equipped with a Web browser and an audio playback function, a blog site, and a speech synthesis site.

In Embodiment 1, when a user accesses the blog site from a personal computer and posts a sentence, part of the sentence can be replaced with speech that approximates the voice of an actor or an animated character. When another user accesses the blog site from another personal computer and views the sentence, that part can be played back as if the actor or animated character were speaking.

FIG. 1 is a configuration diagram of Embodiment 1. Reference numerals 1 and 2 denote personal computers (PCs) running an operating system. Reference numerals 3 and 4 denote display devices that present characters and figures in human-readable form. Reference numerals 5 and 6 denote keyboards with which a user inputs characters. Reference numerals 7 and 8 denote mice (pointing devices) with buttons, used to point to figures and characters displayed on display devices 3 and 4, respectively. Reference numerals 9 and 10 denote Web browsers, programs running on the personal computers that display text data written in HTML on display devices 3 and 4. Reference numerals 11 and 12 denote audio playback units, programs running on the personal computers that play back audio data recorded in the WAVE format. Reference numerals 13 and 14 denote speakers that convert the outputs of audio playback units 11 and 12, respectively, into audible sound.

The WAVE format is an audio file format for storing an audio signal that has been converted into digital data. HTML (HyperText Markup Language) is a markup language for describing documents on the Web.

The blog site 20 has a Web server 22 that communicates with the PCs over HTTP (Hypertext Transfer Protocol), an editing unit 24 that edits HTML text data, an HTML text database 26 that stores HTML text data, a speech-text database 28 that stores audio data and text data, a conversion request unit 30, a reading-prohibited term database 32, and a content examination unit 34. The conversion request unit 30 requests that part of the text data stored in the HTML text database 26 be converted into speech, instructs that the converted audio data and its reading text data be stored in the speech-text database 28, and replaces part of the text data stored in the HTML text database 26. The reading-prohibited term database 32 stores reading-prohibited terms, that is, terms unsuitable for being read aloud. The content examination unit 34 refers to the HTML text database 26, the speech-text database 28, and the reading-prohibited term database 32 and, when a term to be read aloud is a reading-prohibited term, changes the audio data corresponding to that term.

The blog site 20 is a Web site on the Internet that hosts blogs on behalf of users, and consists of computers such as servers and the software for the Web site. By entering the URI (Uniform Resource Identifier) that uniquely corresponds to the Web site into the Web browser 9 of PC 1, a user can access the blog site 20 and view the Web pages of a blog. The blog site 20 also provides functions such as a login function that authenticates users and a search function that searches Web pages.

The editing unit 24, the conversion request unit 30, and the content examination unit 34 are programs that run on the blog site 20, and the HTML text database 26, the speech-text database 28, and the reading-prohibited term database 32 are databases used by the blog site 20. They are implemented with the hardware (computers) that makes up the blog site 20, its operating system, and the file system and other facilities these provide.

The speech synthesis site 40 comprises a receiving unit 42 that receives a URI and text data from the blog site 20; a speech synthesis unit 44 that takes text data in mixed kanji and kana as input and outputs audio data in an audio file format such as WAVE together with reading text that expresses the pronunciation in, for example, Roman letters; and a transmitting unit 46 that sends data to the blog site 20 using the specified URI. Techniques for implementing a speech synthesis function that converts mixed kanji and kana text data into audio data are described in detail in Non-Patent Document 1.

In this embodiment, “aho” and “baka” are utterances of inappropriate terms, and a listener who hears them perceives an inappropriate term as having been uttered.

The operation of the editing unit 24 of the blog site 20 in response to end-user operations on personal computer 1 is described below with reference to FIGS. 2 and 3.

FIG. 2 shows an example of the input screen 200 displayed on display device 3 of personal computer 1 when an end user writes a post to a blog.

The input screen 200 for entering a post displays: a title input portion 202 for entering the title; a body input portion 204 for entering the body text; a voice-pictogram correspondence display portion 206 that shows which pictogram corresponds to which actor or animated character whose voice reads the converted speech; a “Cancel” button 208 that ends the session without storing the text written in the title input portion 202 and body input portion 204 in the HTML text database 26 of the blog site 20; and a “Write” button 210 that stores that text in the HTML text database 26 of the blog site 20.

The voice-pictogram correspondence display portion 206 indicates that, in the body input portion 204, text enclosed by star pictograms is converted into the voice of actor 1 and text enclosed by triangle pictograms is converted into the voice of actor 2. Two voices are selectable in this example, but more may be offered. The display of portion 206 may differ from user to user; for example, a square pictogram could denote actor 1.

FIG. 3 is a flowchart of the processing of the editing unit 24. Assume that the Web browser 9 of personal computer 1 has accessed the blog site 20 and that the editing unit 24 has started processing via the Web server 22.

The editing unit 24 starts processing (step 300) and creates a screen in which the title input portion 202 and body input portion 204 of FIG. 2 are blank (step 302). It sends a command to display this screen to the Web browser 9 of personal computer 1 via the Web server 22; the Web browser 9 displays it on display device 3 (step 304) and waits for input from the user (step 306).

When text is entered, the editing unit 24 sends a command to display the entered text to the Web browser 9 of personal computer 1 via the Web server 22, and the Web browser 9 displays it on display device 3. Text is thus entered and displayed by repeating steps 304 and 306. When the input at step 306 is not text but a selection of the “Cancel” button 208 or the “Write” button 210, processing branches according to the selected button (step 308). When the “Cancel” button 208 is selected, the title input portion 202 and body input portion 204 are cleared (step 310) and processing ends (step 314). When the “Write” button 210 is selected, the contents of the title input portion 202 and body input portion 204 are stored in the HTML text database 26 under a suitable unique URI (step 312) and processing ends (step 314).
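
The branch at steps 308 to 314 can be sketched in Python as follows. This is only an illustrative sketch: the function name `handle_button`, the button labels "cancel" and "write", the dictionary standing in for the HTML text database 26, and the `make_uri` callback are assumptions and do not appear in the source.

```python
def handle_button(button, title, body, html_db, make_uri):
    """Branch on the selected button (step 308); return the new field contents.

    html_db is a plain dict standing in for HTML text database 26 (assumption).
    make_uri is a hypothetical callback returning a unique URI for the post.
    """
    if button == "cancel":
        # Step 310: blank both input portions; nothing is stored.
        return "", ""
    if button == "write":
        # Step 312: store the title and body under a unique URI.
        html_db[make_uri()] = {"title": title, "body": body}
        return title, body
    raise ValueError(f"unknown button: {button!r}")
```

Passing the database and the URI generator as arguments keeps the sketch free of global state; a real editing unit would also handle the text-entry loop of steps 304 and 306, which is omitted here.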

Here, assume that the user, using keyboard 5 and mouse 7, has entered “近所の公園” (“Neighborhood park”) in the title input portion 202 and “面白い場所だった。また、いこうかな。” (“It was an interesting place. Maybe I'll go again.”) in the body input portion 204 from the Web browser 9, as shown in FIG. 2 (steps 300 to 304).

Next, assume that, in order to have “場” and “か” converted into audio data, the user has inserted a star pictogram immediately before and after each of them. The example of FIG. 2 shows the screen on which conversion of “場” and “か” into the synthesized voice of actor 1 has been specified.

If the user does not want to store the text, the user presses the “Cancel” button 208. The entered data is erased at step 310; the Web browser 9 of personal computer 1 shows the title input portion 202 and body input portion 204 of FIG. 2 as blank on display device 3, and processing ends (steps 308 to 314).

Here, assume that the user has selected the “Write” button 210 with mouse 7 and that the text written in the title input portion 202 and body input portion 204 has been stored as a file in the HTML text database 26 of the blog site 20 under a suitable URI (steps 310 to 314). The assigned URI is taken to be “http://blog1.com/u1/10/honbun.html”.

FIG. 4 shows the HTML document stored in the HTML text database 26 at this point. FIG. 4 shows an HTML document 400, which is a single file. The numbers 4010 to 4090 at the left edge of FIG. 4 are line numbers added for this explanation and are not part of the actual HTML document.

The conversion request unit 30 periodically refers to the HTML text database 26 and detects newly stored HTML documents. It extracts the portions of text data enclosed by pictograms such as stars or triangles, generates a unique URI for each extracted text, and sends the text data and the generated URIs to the speech synthesis site 40. In return it obtains, for each URI, audio data under the file name corresponding to that URI, and the reading text data for that audio data under the URI obtained by replacing the trailing extension “.wav” with “.txt”.

Here, assume that one item sent to the speech synthesis site 40 has the text data “場” and the generated URI
“http://blog1.com/u1/10/01.wav”
and that the other has the text data “か” and the generated URI
“http://blog1.com/u1/10/02.wav”.

Here the generated URI is formed by appending a unique number to the path of the HTML document stored in the HTML text database 26, but it may be unrelated to the path of the HTML document.
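
The extraction and URI assignment performed by the conversion request unit 30 might look like the following Python sketch. The marker characters ★ and ▲, the voice labels, and the function name `extract_marked_segments` are assumptions chosen for illustration; the source only states that stars select actor 1, that triangles select actor 2, and that each URI here is the document path plus a unique number.

```python
import re

# Hypothetical marker-to-voice table (the actual pictogram characters are not
# given in the source).
MARKERS = {"★": "actor1", "▲": "actor2"}


def extract_marked_segments(body, base_uri):
    """Return (text, voice, wav_uri) for each span enclosed by a matching marker pair."""
    marker_class = "".join(re.escape(m) for m in MARKERS)
    # Group 1 is the opening marker; \1 requires the same marker to close the span.
    pattern = re.compile(f"([{marker_class}])(.+?)\\1")
    segments = []
    for index, match in enumerate(pattern.finditer(body), start=1):
        marker, text = match.group(1), match.group(2)
        # Unique URI: the document path plus a two-digit sequence number.
        segments.append((text, MARKERS[marker], f"{base_uri}/{index:02d}.wav"))
    return segments
```

Called with the body “面白い★場★所だった。また、いこう★か★な。” and the base `http://blog1.com/u1/10`, the sketch yields one entry for “場” and one for “か”, with URIs ending in `01.wav` and `02.wav` as in the example above.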

On receiving this data, the receiving unit 42 of the speech synthesis site 40 passes the text data to the speech synthesis unit 44 and the URI to the transmitting unit 46. The speech synthesis unit 44 converts the text data into audio data and outputs the audio data together with reading text data that expresses how the audio data is pronounced.

The transmitting unit 46 obtains the audio data and the reading text data from the speech synthesis unit 44 and the URI from the receiving unit 42. It returns to the conversion request unit 30 of the blog site 20 the audio data file corresponding to the URI, and the reading text data associated with the URI whose trailing extension “.wav” has been replaced with “.txt”. Here, assume that “場” is converted into the sound “ba” and “か” into the sound “ka”, so that the reading text data is “ba” and “ka”, respectively.

At this stage, the URI of the reading text data corresponding to “場” is
“http://blog1.com/u1/10/01.txt”
and the content of the file designated by this URI, interpreted as characters, is “ba”. The URI of the reading text data corresponding to “か” is
“http://blog1.com/u1/10/02.txt”
and the content of the file designated by this URI, interpreted as characters, is “ka”.
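
The pairing of each audio file with a reading file can be sketched in Python as follows. The synthesizer here is a stub: the table `TOY_READINGS`, the placeholder audio bytes, and the function names are assumptions for illustration and stand in for the speech synthesis unit 44 and transmitting unit 46 without reproducing them.

```python
# Toy reading table standing in for a real speech synthesizer (assumption).
TOY_READINGS = {"場": "ba", "か": "ka"}


def synthesize(text):
    """Stub for speech synthesis unit 44: return (audio_bytes, reading)."""
    reading = TOY_READINGS[text]
    audio = f"<audio for {reading}>".encode("utf-8")  # placeholder, not real WAVE data
    return audio, reading


def reading_uri(wav_uri):
    """Replace the trailing '.wav' of a URI with '.txt'."""
    if not wav_uri.endswith(".wav"):
        raise ValueError(f"not a .wav URI: {wav_uri}")
    return wav_uri[: -len(".wav")] + ".txt"


def build_response(wav_uri, text):
    """Stub for transmitting unit 46: map each URI to the data returned under it."""
    audio, reading = synthesize(text)
    return {wav_uri: audio, reading_uri(wav_uri): reading}
```

For the URI `http://blog1.com/u1/10/01.wav` and the text “場”, the returned mapping holds the placeholder audio under the `.wav` URI and the reading “ba” under `http://blog1.com/u1/10/01.txt`.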

On receiving this data, the conversion request unit 30 stores it in the speech-text database 28, accesses the HTML text database 26, and replaces each pictogram-enclosed portion of the HTML document for which audio data and text data were obtained with a suitable tag containing the URI, so that a Web browser can play the audio.

FIG. 5 shows the HTML document 500 after the replacement. The numbers 5010 to 5090 at the left edge are line numbers added for this explanation and are not part of the actual HTML document.

Because each pictogram-enclosed portion of the HTML document for which reading text data was obtained is to be replaced by the audio data, the star-enclosed text is replaced with a suitable tag containing the URI of the audio data. Here it is assumed that the Web browser interprets the tail of a link's URI as a file extension and automatically launches the application program associated with that extension, so the replacement uses an “<a>” start tag carrying the URI of the link target and a closing “</a>” tag.

Accordingly, the star-enclosed portion “場” on line 4060 of FIG. 4 is replaced, as shown on lines 5060 to 5066 of FIG. 5, with
“面白い
<a href="http://blog1.com/u1/10/01.wav">
場</a>
所だった。”
Likewise, the star-enclosed portion “か” on line 4070 of FIG. 4 is replaced, as shown on lines 5070 to 5076 of FIG. 5, with
“また、いこう
<a href="http://blog1.com/u1/10/02.wav">
か</a>
な。”
From this point on, the HTML document shown in FIG. 5 is stored in the HTML text database 26 in place of the HTML document shown in FIG. 4.
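
The replacement of marked spans by links can be sketched in Python as follows. The marker characters, the function name `link_marked_segments`, and the numbering scheme are assumptions for illustration; unlike FIG. 5, the sketch emits each link on a single line rather than spreading it over several lines.

```python
import html
import re


def link_marked_segments(body, base_uri, markers="★▲"):
    """Replace each marker-enclosed span with an <a> tag pointing at its .wav URI."""
    pattern = re.compile(f"([{re.escape(markers)}])(.+?)\\1")
    counter = 0

    def to_anchor(match):
        nonlocal counter
        counter += 1
        uri = f"{base_uri}/{counter:02d}.wav"
        # Escape both the attribute value and the visible text for safe HTML output.
        return f'<a href="{html.escape(uri, quote=True)}">{html.escape(match.group(2))}</a>'

    return pattern.sub(to_anchor, body)
```

Applied to “面白い★場★所だった。” with the base `http://blog1.com/u1/10`, the sketch wraps “場” in a link to `01.wav` and leaves the surrounding text unchanged.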

FIG. 6 illustrates an example of the audio data and reading text data stored in the speech-text database 28. It shows audio data 602 and 606 in WAVE format and text data 604 and 608 that express the pronunciation of the audio data.

The content examination unit 34 refers to all HTML documents in the HTML text database 26 and to the audio data and text data in the speech-text database 28, and processes them using the reading-prohibited terms stored in the reading-prohibited term database 32.

The processing of the content examination unit 34 is described in detail below with reference to FIGS. 5 to 8. FIG. 7 is a flowchart of the processing of the content examination unit 34. FIG. 8 shows examples of the reading-prohibited terms stored in the reading-prohibited term database 32.

The content examination unit 34 periodically performs steps 700 to 730 of FIG. 7 on each HTML document in the HTML text database 26. When it starts processing (step 700), it empties the variable $F, which holds a list of URIs, and the variable $S, which holds character strings (step 702). It then reads one line from the beginning of one HTML document in the HTML text database 26, that is, one file designated by a URI (step 704), and determines whether the end of the file (EOF) has been reached (step 706). If so, it proceeds to step 714. If not, it proceeds to step 708, performs pattern matching to extract the URI of an audio data file (a WAVE file), and proceeds to step 710.

At step 710, if the pattern match succeeded and a WAVE-format URI was found, processing proceeds to step 712: the URI is appended to $F, and the text data stored in the file designated by the URI with its extension changed from “.wav” to “.txt” is appended to $S. Processing then returns to step 704, and steps 704 to 712 are repeated. If no WAVE-format URI was found, processing likewise returns to step 704 and steps 704 to 712 are repeated.

Suppose that the content examination unit 34 processes the HTML document of FIG. 5 and repeats steps 702 to 712. Line 5062 of the HTML document 500 of FIG. 5 then matches the pattern in step 708, and in step 712 "http://blog1.com/u1/10/01.wav" is appended to $F. The URI with ".wav" replaced by ".txt" is "http://blog1.com/u1/10/01.txt", and the text data stored in that file, "ba", is appended to $S.

Processing returns to step 704 and steps 704 to 712 are repeated. This time line 5072 of the HTML document 500 of FIG. 5 matches the pattern in step 708, and in step 712 "http://blog1.com/u1/10/02.wav" is appended to $F. The URI with ".wav" replaced by ".txt" is "http://blog1.com/u1/10/02.txt", and the text data stored in that file, "ka", is appended to $S.

Consequently, $F becomes [http://blog1.com/u1/10/01.wav, http://blog1.com/u1/10/02.wav] and $S becomes [ba, ka].

Processing proceeds to step 714. If $S is empty, processing ends (step 730). Otherwise, $S is pattern-matched against the reading-prohibited terms in the reading-prohibited term database 32 (step 716). Processing then proceeds to step 718. If the pattern match succeeded, the content of the corresponding audio data is replaced with predetermined audio data (step 720), and $F and $S are shifted left by one element (step 722). If the pattern match did not succeed, processing proceeds directly to step 722. After step 722, steps 714 to 722 are repeated, and processing ends when $S is empty (step 730).

Suppose that, as shown in FIG. 8, the reading-prohibited term database 32 stores two reading-prohibited terms, 802 and 804, whose contents are "aho" and "baka", respectively. In step 714, $S is [ba, ka] and is not empty, so processing proceeds to step 716. The reading-prohibited term 802, "aho", does not match, but the reading-prohibited term 804, "baka", does. Accordingly, in step 720 the content of the corresponding audio data is replaced with predetermined audio data. Here, the replacement is assumed to overwrite the content of the audio data file specified by the URI at the head (left end) of $F with silence.

At this stage, $F is [http://blog1.com/u1/10/01.wav, http://blog1.com/u1/10/02.wav], so the content of "http://blog1.com/u1/10/01.wav", which is the audio data 602 of FIG. 6, is no longer the sound corresponding to "ba" but silence.

In step 722, $F and $S are shifted left by one element, so that $F becomes [http://blog1.com/u1/10/02.wav] and $S becomes [ka]. Nothing matches in steps 714 to 718, so processing proceeds to step 722 and $F and $S are again shifted left by one element. At step 714, both $F and $S are now empty, and the processing of the content examination unit 34 ends (step 730).
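The examination loop of steps 714 to 722 can be sketched as below. The text does not spell out how $S is matched against a prohibited term, so this sketch assumes the elements of $S are concatenated and a match means the concatenation begins with a prohibited term. The function name `examine` and the `silence` callback are hypothetical.

```python
def examine(uris, readings, prohibited_terms, silence):
    """Silence the head audio file whenever the remaining readings,
    joined in playback order, begin with a prohibited term.

    silence is a hypothetical callback that overwrites the audio file
    at the given URI with predetermined audio data.
    Returns the list of URIs that were silenced.
    """
    uris, readings = list(uris), list(readings)   # work on copies of $F, $S
    silenced = []
    while readings:                               # step 714: stop when $S is empty
        joined = "".join(readings)
        # steps 716/718: assumed prefix match against every prohibited term
        if any(joined.startswith(term) for term in prohibited_terms):
            silence(uris[0])                      # step 720: replace head audio
            silenced.append(uris[0])
        uris.pop(0)                               # step 722: shift $F and $S
        readings.pop(0)                           #           left by one element
    return silenced
```

With $F and $S from the walk-through and the terms "aho" and "baka", only the first file is silenced, which matches the description above.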

Suppose that, in this state, the Web browser 10 of the personal computer 2 accesses the blog site 20 and, via the Web server 22, views the HTML document (FIG. 5) in the HTML text database 26. The display 4 of the personal computer 2 then shows the screen illustrated in FIG. 9.

FIG. 9 is an example of the screen shown on the display 4 of the personal computer 2: 900 is the display screen, 902 is the title of the document, 904 is the body of the document, and 906 and 908 are underlines indicating the presence of links. In the Web browser, clicking underlined text with the mouse launches the application program associated with the extension of the link written inside the <a> tag. That program loads the file indicated by the link, plays it, and outputs it as sound from the speaker 14.

Suppose that the user operating the personal computer 2 selects the underlines 906 and 908 with the mouse, in that order. The links corresponding to the underlines 906 and 908 are on lines 5062 to 5064 and lines 5072 to 5074 of FIG. 5, respectively, and point to "http://blog1.com/u1/10/01.wav" and "http://blog1.com/u1/10/02.wav". Because the extension is ".wav", the audio playback unit 12 of the personal computer 2 attempts to play both files. The audio playback unit 12 accesses the audio text database 28 via the Web server 22 of the blog site 20, downloads the corresponding audio files 602 and 606 to the personal computer 2, and plays them.

As described above, the content of "http://blog1.com/u1/10/01.wav" is now silence. When the underline 906 is selected with the mouse, the sound "ba" is therefore not played and the speaker 14 stays silent. Only when the underline 908 is selected is the sound "ka" played, so "baka" is never heard from the speaker 14.

In this embodiment, the audio data replacement in step 720 silences the audio data file corresponding to the word on the left; a predetermined fixed sound may be used instead of silence.
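As one way to picture the replacement in step 720, the sketch below overwrites a Wave file with silence using Python's standard `wave` module. The function name, the mapping from URI to local path, and the duration, sample rate, and mono 16-bit format are all assumptions; the text only says that the content becomes silence or a predetermined sound.

```python
import wave


def write_silence(path, seconds=0.5, rate=16000):
    """Overwrite the file at path with mono 16-bit PCM silence.

    path is assumed to be the local file behind the audio URI;
    seconds and rate are illustrative defaults.
    """
    frame_count = int(seconds * rate)
    with wave.open(path, "wb") as wav_file:
        wav_file.setnchannels(1)      # mono
        wav_file.setsampwidth(2)      # 16-bit samples
        wav_file.setframerate(rate)
        # All-zero samples play back as silence.
        wav_file.writeframes(b"\x00\x00" * frame_count)
```

To use a fixed sound instead of silence, the same function could copy the frames of a prepared Wave file rather than writing zeros.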

This embodiment has described the case in which the user of the personal computer 2 views, unmodified, the Web page containing synthesized-speech playback that the user of the personal computer 1 stored through steps 300 to 314. Even if the user of the personal computer 1 later re-edits the stored Web page and changes the position or order of the links to the audio files to be played, the content examination unit 34 periodically performs steps 700 to 730 and can therefore detect inappropriate utterances.

In Embodiment 1 described above, the speech synthesis site 40 may also examine the text data received by the receiving unit 42 before it is input to the speech synthesis unit 44 and, if the content is judged inappropriate, send modified text data to the speech synthesis unit 44 instead.

In this embodiment, the text data 604 and 608 stored in the audio text database 28 are accessed only by the conversion request unit 30 and the content examination unit 34, and never by other programs such as the editing unit 24. As a result, once an audio file has been created and stored in the audio text database 28, it is not possible to alter only the text data in the audio text database 28 so that the content examination unit 34 judges the content acceptable while an inappropriate term is still uttered.

本実施例では、日本語の場合を用いたが、他の言語、例えば英語や中国語なども、発声を表現するテキストデータを用いることができるので、同様に扱うことができる。 In this embodiment, the case of Japanese is used, but other languages such as English and Chinese can also be handled in the same manner because text data expressing utterances can be used.

実施例１では、変換したテキストデータを、そのまま、テキストデータとしてファイルに格納し、音声データとともにテキストデータをブログサイトへ送信したが、テキストデータとしてファイルに格納する代わりに、音声データにテキストデータを電子透かしで埋め込んで、電子透かし入りの音声データだけをブログサイトへ送信してもよい。 In the first embodiment, the converted text data is directly stored in a file as text data, and the text data is transmitted to the blog site together with the voice data. Instead of storing the text data in the file as text data, the text data is stored in the voice data. It may be embedded with a digital watermark and only the audio data with the digital watermark may be transmitted to the blog site.

This case is described below as Embodiment 2, with reference to FIGS. 2 to 4 and FIGS. 8 to 13. FIGS. 2 to 4 and FIGS. 8 and 9 are the same as in Embodiment 1.

Embodiment 2 consists of three parts: a personal computer having a Web browser and an audio playback function, a blog site A whose configuration differs from that of Embodiment 1, and a speech synthesis site A whose configuration differs from that of Embodiment 1. FIG. 10 shows the configuration of Embodiment 2. In FIG. 10, components identical to those in FIG. 1 bear the same reference numerals.

The blog site A1000 differs from Embodiment 1 mainly in its audio database 1002, conversion request unit A1004, digital watermark detection unit 1006, and content examination unit A1008. The audio database 1002 stores watermarked audio data, that is, audio data into which a digital watermark has been inserted. The conversion request unit A1004 requests that portions of the text data stored in the HTML text database 26 be converted into watermarked audio, directs that the resulting watermarked audio data be stored in the audio database 1002, and replaces those portions of the text data stored in the HTML text database 26. The digital watermark detection unit 1006 extracts the text data embedded as a digital watermark from the watermarked audio data stored in the audio database 1002. The content examination unit A1008 refers to the HTML text database 26, the reading-prohibited term database 32, and the text data obtained from the digital watermark detection unit 1006, determines whether a reading-prohibited term is present, and, if so, modifies the corresponding audio data stored in the audio database 1002.

The audio database 1002, the conversion request unit A1004, the digital watermark detection unit 1006, and the content examination unit A1008 are programs that run on the blog site A1000. They are realized by the hardware (computer) constituting the blog site A1000, its operating system, and the file system and other facilities these provide.

The speech synthesis site A1010 includes a receiving unit A1012 that receives a URI and text data from the blog site A1000, a digital watermark insertion unit 1014 that inserts the text data into the audio data as a digital watermark, and a transmitting unit A1016 that transmits the watermarked audio data to the blog site A1000 as an audio data file whose file name corresponds to the URI.

本実施例でも、「ａｈｏ」と「ｂａｋａ」は、不適切な用語の発声であり、これを聞いた側では、不適切な用語を発声したように聞こえる。 Also in the present embodiment, “aho” and “baka” are utterances of inappropriate terms, and the person who hears them sounds as if they have uttered inappropriate terms.

パーソナルコンピュータ１でのエンドユーザの操作の流れと、編集部２４の処理の流れは、実施例１と同じである（図２と図３）。ここでは、エンドユーザは、書き込むボタン２１０をマウス７で操作し、タイトル入力部分２０２と本文入力部分２０４に書き込まれた文章をブログサイトＡ１０００のＨＴＭＬテキストデータベース２６へ適当なＵＲＩで、ファイルとして格納したとする（ステップ３１０から３１４）。 The operation flow of the end user on the personal computer 1 and the processing flow of the editing unit 24 are the same as those in the first embodiment (FIGS. 2 and 3). Here, the end user operates the write button 210 with the mouse 7, and stores the text written in the title input part 202 and the text input part 204 as a file in the HTML text database 26 of the blog site A1000 with an appropriate URI. (Steps 310 to 314).

Here, as in Embodiment 1, the assigned URI is "http://blog1.com/u1/10/honbun.html", and the HTML document stored in the HTML text database 26 is that of FIG. 4.

The conversion request unit A1004 periodically refers to the HTML text database 26 and detects newly stored files, that is, new HTML documents. It extracts the portions of text data enclosed by pictographs such as stars or triangles, generates a unique URI for each extracted piece of text data, sends the text data and URIs to the speech synthesis site A1010, and obtains watermarked audio data whose file names correspond to the URIs.
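The extraction step can be sketched as follows. The exact pictograph used in FIG. 4 is not reproduced in this section, so the sketch assumes the marker is "★". The function name, the two-digit sequential numbering, and the `start` parameter are likewise assumptions chosen only so that the output lines up with the example URIs in the text.

```python
import re
from itertools import count

# Assumed marker: text enclosed between two "★" characters.
MARKED = re.compile(r"★(.+?)★")


def extract_requests(html, base_uri, start=1):
    """Return (text, uri) pairs for each marked portion of html.

    URIs are made unique here by simple sequential numbering under
    base_uri; the actual naming scheme is not specified in the text.
    """
    counter = count(start)
    return [
        (match.group(1), f"{base_uri}{next(counter):02d}.wav")
        for match in MARKED.finditer(html)
    ]
```

Each returned pair corresponds to one request sent to the speech synthesis site: the text to synthesize and the URI under which the resulting audio file is stored.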

Suppose that one item of data sent to the speech synthesis site has the text data "場" and the URI "http://blog1.com/u1/10/03.wav", and that the other has the text data "か" and the URI "http://blog1.com/u1/10/04.wav".

On receiving this data, the receiving unit A1012 of the speech synthesis site A1010 outputs the text data to the speech synthesis unit 44 and the URI to the transmitting unit A1016. The speech synthesis unit 44 converts the text data into audio data and outputs the audio data, together with text data representing its pronunciation, to the digital watermark insertion unit 1014.

The digital watermark insertion unit 1014 inserts the text data as a digital watermark into the audio data received from the speech synthesis unit 44 and outputs the watermarked audio data to the transmitting unit A1016. The insertion of digital watermarks into audio data, and their detection and extraction, are described in Japanese Unexamined Patent Application Publication No. 2003-99077.
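To make the idea of carrying the reading inside the audio concrete, the sketch below uses least-significant-bit embedding in PCM sample values. This is a deliberately simple stand-in and not the method of the cited publication, which this document does not describe. It would not survive re-encoding or editing, and the function names and the 16-bit length header are assumptions.

```python
import struct


def embed_text(samples, text):
    """Return a copy of samples with text hidden in the lowest bit of
    each sample (toy LSB scheme: 16-bit length header, then UTF-8 bytes)."""
    payload = text.encode("utf-8")
    data = struct.pack(">H", len(payload)) + payload
    bits = [(byte >> (7 - i)) & 1 for byte in data for i in range(8)]
    if len(bits) > len(samples):
        raise ValueError("audio too short for payload")
    marked = list(samples)
    for index, bit in enumerate(bits):
        marked[index] = (marked[index] & ~1) | bit   # overwrite lowest bit
    return marked


def extract_text(samples):
    """Recover the text hidden by embed_text."""
    def read_bytes(first_sample, byte_count):
        out = bytearray()
        for b in range(byte_count):
            value = 0
            for i in range(8):
                value = (value << 1) | (samples[first_sample + b * 8 + i] & 1)
            out.append(value)
        return bytes(out)

    (length,) = struct.unpack(">H", read_bytes(0, 2))
    return read_bytes(16, length).decode("utf-8")
```

Each sample changes by at most one quantization step, which is why such a mark is inaudible in this toy setting. A production watermark would need to be robust against processing, which this sketch is not.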

The transmitting unit A1016 returns the watermarked audio data obtained from the digital watermark insertion unit 1014 to the conversion request unit A1004 of the blog site A1000 as an audio data file whose file name corresponds to the URI obtained from the receiving unit A1012. Here, the text data "場" is assumed to have been converted into the sound "ba", and "か" into the sound "ka".

The watermarked audio data file in which "ba" is embedded as a digital watermark has the URI "http://blog1.com/u1/10/03.wav", and the file in which "ka" is embedded has the URI "http://blog1.com/u1/10/04.wav".

On receiving this data, the conversion request unit A1004 stores it in the audio database 1002, accesses the HTML text database 26, and replaces the pictograph-enclosed portions of the HTML document from which the text data was taken with suitable tags containing the URIs, so that a Web browser can play the audio data.

FIG. 11 shows the HTML document 1100 after replacement. The numbers 11010 to 11090 at the left edge are line numbers added for the purpose of this description and are not part of the actual HTML document.

Because the pictograph-enclosed portions of the HTML document from which the text data was taken are to be replaced, each portion of text data enclosed by stars is replaced with a suitable tag containing the URI of the audio data. Here, the Web browser is assumed to interpret the tail of the link-target URI as an extension and to launch the application program associated with that extension automatically, so the replacement consists of an <a> tag, the URI of the link target, and a closing </a> tag.

Accordingly, the star-enclosed portion "場" on line 4060 of FIG. 4 is replaced as shown on lines 11060 to 11066 of FIG. 11:
"面白い
<a href="http://blog1.com/u1/10/03.wav">
場</a>
所だった。"
(The sentence means "It was an interesting place"; the link encloses the character 場, read "ba".) Likewise, the star-enclosed portion "か" on line 4070 of FIG. 4 is replaced as shown on lines 11070 to 11076 of FIG. 11:
"また、いこう
<a href="http://blog1.com/u1/10/04.wav">
か</a>
な。"
(The sentence means "I think I will go again"; the link encloses the character か, read "ka".) From this point on, the HTML document shown in FIG. 11 is stored in the HTML text database 26 in place of the HTML document shown in FIG. 4. FIG. 12 shows examples 1202 and 1204 of the watermarked audio data stored in the audio database 1002.
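The tag substitution described above can be sketched as a single regular-expression replacement. As before, the "★" marker is an assumption about the pictograph, and the function name and the flat one-line output (FIG. 11 breaks the tag across lines) are illustrative choices.

```python
import re

# Assumed marker: text enclosed between two "★" characters.
MARKED = re.compile(r"★(.+?)★")


def link_marked_text(html, uris):
    """Replace each marked portion with an <a> element pointing at the
    next URI in uris, keeping the enclosed text as the link text."""
    remaining = iter(uris)

    def to_anchor(match):
        return f'<a href="{next(remaining)}">{match.group(1)}</a>'

    return MARKED.sub(to_anchor, html)
```

The URIs are consumed in document order, so the first marked portion receives the first URI, matching the pairing of "場" with 03.wav and "か" with 04.wav.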

内容審査部Ａ１００８は、ＨＴＭＬテキストデータベース２６にあるＨＴＭＬ文章と音声データベース１００２にある電子透かし入り音声データとを参照し、読み上げ禁止用語データベース３２に格納した読み上げ禁止用語を用いて処理する。 The content examination unit A 1008 refers to the HTML text in the HTML text database 26 and the digital watermarked voice data in the voice database 1002 and performs processing using the read-inhibited terminology stored in the read-in prohibition term database 32.

以下、内容審査部Ａ１００８の処理を図８、図１１から図１３を用いて詳細に説明する。図１３は、内容審査部Ａ１００８の処理を示すフローチャートである。 Hereinafter, the processing of the content examination unit A1008 will be described in detail with reference to FIGS. 8 and 11 to 13. FIG. 13 is a flowchart showing the processing of the content examination unit A1008.

内容審査部Ａ１００８は、定期的にＨＴＭＬテキストデータベース２６にあるファイルであるＨＴＭＬ文章の各々について、図１３のステップ１３００からステップ１３３０の処理を行う。 The content examination unit A1008 periodically performs the processing from step 1300 to step 1330 in FIG. 13 for each HTML sentence that is a file in the HTML text database 26.

内容審査部Ａ１００８は、処理を開始する(ステップ１３００)と、ＵＲＩのリストを記憶する変数である＄Ｆと、文字列を記憶する変数である＄Ｓを空にする(ステップ１３０２)。次にＨＴＭＬテキストデータベース２６のひとつのＨＴＭＬ文章、つまりＵＲＩで指定されるひとつのファイルの先頭から１行を読み込み(ステップ１３０４)、ファイルの終了（ＥＯＦ）か否かを判定する(ステップ１３０６)。 When the content examination unit A1008 starts processing (step 1300), $ F which is a variable for storing a list of URIs and $ S which is a variable for storing character strings are emptied (step 1302). Next, one HTML sentence in the HTML text database 26, that is, one line from the head of one file specified by the URI is read (step 1304), and it is determined whether the end of file (EOF) or not (step 1306).

If the end of the file has been reached, processing proceeds to step 1314. If not, processing proceeds to step 1308, where pattern matching is performed to extract any audio data file specified by a URI, and then to step 1310.

In step 1310, if the pattern match succeeded and an audio data URI is present, processing proceeds to step 1311. There the audio data in the file specified by the matched URI is output to the digital watermark detection unit 1006, which extracts the text data embedded as a digital watermark. Processing then proceeds to step 1312, where the matched URI is appended to $F and the text data is appended to $S, and returns to step 1304; steps 1304 to 1312 are repeated. If no audio data URI is present, processing returns to step 1304 and steps 1304 to 1312 are repeated.

Suppose that the content examination unit A1008 processes the HTML document of FIG. 11 and repeats steps 1302 to 1312. Line 11062 of the HTML document 1100 of FIG. 11 then matches the pattern in step 1308, and in step 1311 the text data "ba" embedded as a digital watermark is extracted from "http://blog1.com/u1/10/03.wav". In step 1312, "http://blog1.com/u1/10/03.wav" is appended to $F and "ba" is appended to $S.

Processing returns to step 1304 and steps 1304 to 1312 are repeated. This time line 11072 of the HTML document 1100 of FIG. 11 matches the pattern in step 1308, and in step 1311 the text data "ka" embedded as a digital watermark is extracted from "http://blog1.com/u1/10/04.wav". In step 1312, "http://blog1.com/u1/10/04.wav" is appended to $F and "ka" is appended to $S.

Consequently, $F becomes [http://blog1.com/u1/10/03.wav, http://blog1.com/u1/10/04.wav] and $S becomes [ba, ka].

Processing proceeds to step 1314. If $S is empty, processing ends (step 1330). Otherwise, $S is pattern-matched against the reading-prohibited terms in the reading-prohibited term database 32 (step 1316). Processing then proceeds to step 1318. If the pattern match succeeded, the content of the corresponding audio data is replaced with predetermined audio data (step 1320), and $F and $S are shifted left by one element (step 1322). If the pattern match did not succeed, processing proceeds directly to step 1322. After step 1322, steps 1314 to 1322 are repeated, and processing ends when $S is empty (step 1330).

Suppose that, as shown in FIG. 8, the reading-prohibited term database 32 stores two reading-prohibited terms, 802 and 804, whose contents are "aho" and "baka", respectively. In step 1314, $S is [ba, ka] and is not empty, so processing proceeds to step 1316. The reading-prohibited term 802, "aho", does not match, but the reading-prohibited term 804, "baka", does. Accordingly, in step 1320 the content of the corresponding audio data is replaced with predetermined audio data. Here, the replacement is assumed to overwrite the content of the audio data file specified by the URI at the head (left end) of $F with silence.

At this stage, $F is [http://blog1.com/u1/10/03.wav, http://blog1.com/u1/10/04.wav], so the content of "http://blog1.com/u1/10/03.wav", which is the audio data 1202 of FIG. 12, is no longer the sound corresponding to "ba" but silence.

Next, in step 1322, $F and $S are shifted left by one element, so that $F becomes [http://blog1.com/u1/10/04.wav] and $S becomes [ka]. Nothing matches in steps 1314 to 1318, so processing proceeds to step 1322 and $F and $S are again shifted left by one element. At step 1314, both $F and $S are now empty, and the processing of the content examination unit A1008 ends (step 1330).

Suppose that, in this state, the user of the personal computer 2 uses the Web browser 10 to access the blog site A1000 and, via the Web server 22, views the HTML document (FIG. 11) in the HTML text database 26. The display 4 of the personal computer 2 then shows the screen illustrated in FIG. 9, as in Embodiment 1. Suppose further that the user operating the personal computer 2 selects the underlines 906 and 908 with the mouse, in that order.

The links corresponding to the underlines 906 and 908 are on lines 11062 to 11064 and lines 11072 to 11074 of FIG. 11, respectively, and point to "http://blog1.com/u1/10/03.wav" and "http://blog1.com/u1/10/04.wav". Because the extension is ".wav", the audio playback unit 12 of the personal computer 2 attempts to play both files, as in Embodiment 1.

ここでは、上述のように、「ｈｔｔｐ：／／ｂｌｏｇ１．ｃｏｍ／ｕ１／１０／０３．ｗａｖ」の内容は、無音となっているので、下線９０６をマウスで選択したときは、「ｂａ」の音は再生されず、スピーカ１４は、無音のままであり、下線９０８をマウスで選択したときだけ「ｋａ」の音が再生されるので、スピーカ１４から「ｂａｋａ」と聞こえることは無い。 Here, as described above, since the content of “http://blog1.com/u1/10/03.wav” is silent, when the underline 906 is selected with the mouse, “ba” The sound is not reproduced, the speaker 14 remains silent, and the sound “ka” is reproduced only when the underline 908 is selected with the mouse, so that the speaker 14 does not hear “baka”.

本実施例では、ステップ１３２０で、音声データの置換は、左にある単語に対応する方の音声データのファイルを無音にするとしたが、無音の代わりに、予め固定した音にしても良い。 In this embodiment, in the voice data replacement in step 1320, the voice data file corresponding to the word on the left is silenced. However, instead of silence, a sound fixed in advance may be used.

本実施例で、音声合成サイトＡ１０１０で、受信部Ａ１０１２が出力したテキストデータを、音声合成部４４へ入力する前に、テキストデータで内容を審査し、音声合成部４４へ送るテキストデータを変更しても良い。 In this embodiment, before inputting the text data output from the receiving unit A1012 to the speech synthesizing unit 44 at the speech synthesizing site A1010, the contents are examined with the text data and the text data to be sent to the speech synthesizing unit 44 is changed. May be.

In Embodiment 1, when there are multiple blog sites, registering a new reading-prohibited term requires updating the reading-prohibited term database of each blog site. If the content examination unit and the reading-prohibited term database are instead placed on a separate site outside the blog sites, as a content examination site shared by those blog sites, then only the reading-prohibited term database of the shared content examination site needs to be updated when a new term is registered, which saves effort.

以下、この場合の実施例３を図２から図９、図１４を用いて説明する。この実施例は、Ｗｅｂブラウザと音声再生機能を備えるパーソナルコンピュータ、実施例１と異なる構成のブログサイト、実施例１と同じ音声合成サイト、及び内容審査サイトの４つの部分から構成される。図２から図９の各構成要素の動作は、実施例１の図２から図９の各構成要素の動作と同じである。内容審査サイトは、インターネット上のＷｅｂサイトであり、サーバなどのコンピュータとソフトウェアで構成される。 Hereinafter, Embodiment 3 in this case will be described with reference to FIGS. 2 to 9 and FIG. This embodiment is composed of four parts: a personal computer having a Web browser and a voice reproduction function, a blog site having a configuration different from that of the first embodiment, the same voice synthesis site as that of the first embodiment, and a content examination site. The operation of each component in FIGS. 2 to 9 is the same as the operation of each component in FIGS. 2 to 9 of the first embodiment. The content examination site is a website on the Internet, and is composed of a computer such as a server and software.

FIG. 14 shows the configuration of Embodiment 3. In FIG. 14, components identical to those in FIG. 1 are denoted by the same reference numerals.

Blog site B1400 differs from Embodiment 1 in that it has no components for content review. Those components are provided in the content review site 1402, a site separate from blog site B1400.

The content review site 1402 has a reading-prohibited term database 1404 and a content review unit 1406. The reading-prohibited term database 1404 stores reading-prohibited terms, that is, terms unsuitable for being read aloud. The content review unit 1406 refers to the HTML text database 26 and the speech text database 28 of blog site B1400 and to the reading-prohibited term database 1404, determines whether a reading matches a reading-prohibited term, and, if it does, changes the corresponding speech data.
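The benefit of sharing one content review site can be illustrated with a minimal Python sketch. The class and method names (ContentReviewSite, BlogSite, register_term, is_prohibited) and the second site name "blog2.example" are hypothetical and do not appear in this description. The direction of the calls is also simplified: here each blog site asks the shared site, whereas in the embodiment the content review unit 1406 reads the blog site's databases. The sketch only shows that a term registered once at the shared site takes effect for every blog site that uses it.

```python
class ContentReviewSite:
    """Hypothetical stand-in for content review site 1402."""

    def __init__(self):
        # Plays the role of the reading-prohibited term database 1404.
        self.prohibited_terms = set()

    def register_term(self, reading):
        """Register a new reading-prohibited term (one update for all sites)."""
        self.prohibited_terms.add(reading)

    def is_prohibited(self, reading):
        return reading in self.prohibited_terms


class BlogSite:
    """Hypothetical blog site that keeps no term database of its own."""

    def __init__(self, name, review_site):
        self.name = name
        self.review_site = review_site

    def reading_allowed(self, reading):
        return not self.review_site.is_prohibited(reading)


shared = ContentReviewSite()
blog_a = BlogSite("blog1.com", shared)
blog_b = BlogSite("blog2.example", shared)  # assumed second site

shared.register_term("baka")  # a single update at the shared site
```

With per-site databases, as in Embodiment 1, the same registration would have to be repeated once per blog site.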

Blog site B1400 also has functions such as a login function for authenticating users and a search function for searching Web pages; their description is omitted in this embodiment as well.

The reading-prohibited term database 1404 and the content review unit 1406 are a database and a program, respectively, that run on the content review site 1402. They are implemented using the hardware (computers) constituting the content review site 1402, its operating system, and the file system and other facilities these provide.

In the following, assume that, as in Embodiment 1, the user enters text on personal computer 1 as shown in FIG. 2 and the editing unit 24 of blog site B1400 executes the processing shown in FIG. 3. The HTML text stored in the HTML text database 26 at this point is the same as in FIG. 4, and, as in Embodiment 1, the file name assigned to it is "http://blog1.com/u1/10/honbun.html".

As in Embodiment 1, the conversion request unit 30 of blog site B1400 periodically refers to the HTML text database 26 and detects newly stored HTML files. It extracts the portions of text data enclosed by pictograms such as star or triangle marks, generates a unique URI for each extracted piece of text data, and sends the text data and URIs to the speech synthesis site 40. In return it obtains, for each URI, speech data under the file name corresponding to that URI, and text data of the reading under the file name formed by replacing the trailing ".wav" of the URI with ".txt". Here, as in Embodiment 1, assume that one item sent to the speech synthesis site has the text data "場" and the generated URI "http://blog1.com/u1/10/01.wav", and that the other has the text data "か" and the generated URI "http://blog1.com/u1/10/02.wav".
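The extraction and URI generation performed by the conversion request unit 30 might be sketched in Python as follows. This is an illustration under stated assumptions, not the implementation of the embodiment: the delimiter "★", the two-digit sequential numbering, and the function names build_requests and reading_uri are all hypothetical. Only the ".wav" to ".txt" substitution is taken directly from the description.

```python
import re

# Assumed delimiter: text enclosed by a pair of star pictograms.
MARKED = re.compile(r"★(.+?)★")


def build_requests(html, base_uri):
    """Return (text, audio URI) pairs for each pictogram-enclosed portion."""
    requests = []
    for index, match in enumerate(MARKED.finditer(html), start=1):
        # Assumed naming scheme: sequential two-digit file names.
        requests.append((match.group(1), f"{base_uri}{index:02d}.wav"))
    return requests


def reading_uri(audio_uri):
    """Replace the trailing '.wav' with '.txt' to locate the reading text."""
    if not audio_uri.endswith(".wav"):
        raise ValueError("expected a URI ending in .wav")
    return audio_uri[: -len(".wav")] + ".txt"


pairs = build_requests("★場★と★か★", "http://blog1.com/u1/10/")
```

For the example text, pairs holds "場" with the URI ending in 01.wav and "か" with the URI ending in 02.wav, which matches the two items assumed in the description.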

On receiving this data, the speech synthesis site 40 operates as in Embodiment 1. It returns to the conversion request unit 30 of blog site B1400 a speech data file whose file name corresponds to the URI, together with the text data of the reading under the URI formed by replacing the trailing ".wav" with ".txt". Here, as in Embodiment 1, assume that "場" is converted into the sound "ba" and "か" into the sound "ka", so that the text data of the readings are "ba" and "ka".

At this stage, as in Embodiment 1, the URI of the text data corresponding to "場" is "http://blog1.com/u1/10/01.txt", and the content of the file designated by this URI, interpreted as characters, is "ba". The URI of the text data corresponding to "か" is "http://blog1.com/u1/10/02.txt", and the content of the file designated by this URI, interpreted as characters, is "ka".

On receiving this data, the conversion request unit 30 stores it in the speech text database 28 as in Embodiment 1. It then accesses the HTML text database 26 and replaces each pictogram-enclosed portion of the HTML text for which speech data and text data were obtained with an appropriate tag containing the URI, so that a Web browser can play the speech. The HTML text after replacement is the same as in FIG. 5, and it is stored in the HTML text database 26 in place of the HTML text shown in FIG. 4. The speech data and the text data of the readings stored in the speech text database 28 are as shown in FIG. 6, as in Embodiment 1.
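The tag replacement step could look like the following Python sketch. The exact tag used in FIG. 5 is not reproduced in this passage, so the anchor element below is an assumption, as are the "★" delimiter, the sequential numbering, and the function name link_marked_text.

```python
import re
from itertools import count

# Assumed delimiter: text enclosed by a pair of star pictograms.
MARKED = re.compile(r"★(.+?)★")


def link_marked_text(html, base_uri):
    """Replace each pictogram-enclosed portion with a link to its audio URI."""
    counter = count(1)

    def to_link(match):
        # Assumed tag form: a plain hyperlink to the .wav file.
        uri = f"{base_uri}{next(counter):02d}.wav"
        return f'<a href="{uri}">{match.group(1)}</a>'

    return MARKED.sub(to_link, html)


linked = link_marked_text("★場★と★か★", "http://blog1.com/u1/10/")
```

Each link then corresponds to one selectable underlined item on the displayed page, and selecting it requests the associated ".wav" file.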

The content review unit 1406 of the content review site 1402 refers to the HTML text in the HTML text database 26 of blog site B1400 and to the speech data and text data in the speech text database 28, and processes them using the reading-prohibited terms stored in the reading-prohibited term database 1404.

The processing of the content review unit 1406 is the same as that of the content review unit 34 of Embodiment 1: it periodically performs steps 700 to 730 of FIG. 7 on each HTML file in the HTML text database 26. Here, assume that the content review unit 1406 processes the HTML text of FIG. 5 and repeats steps 702 to 712.

Then line 5062 of the HTML text 500 in FIG. 5 matches the pattern in step 708, and in step 712 "http://blog1.com/u1/10/01.wav" is appended to $F. The URI obtained by replacing ".wav" with ".txt" is "http://blog1.com/u1/10/01.txt", and the text data stored there, "ba", is appended to $S.

As a result, $F becomes [http://blog1.com/u1/10/01.wav, http://blog1.com/u1/10/02.wav] and $S becomes [ba, ka].

Processing then proceeds to step 714. If $S is empty, processing ends (step 730); otherwise, $S is pattern-matched against the reading-prohibited terms in the reading-prohibited term database 1404 (step 716).

Processing then proceeds to step 718. If the pattern match succeeded, the content of the corresponding speech data is replaced with predetermined speech data (step 720), and $F and $S are shifted left by one element (step 722). If the pattern match did not succeed, processing goes directly to step 722. After step 722, steps 714 to 722 are repeated, and processing ends when $S is empty (step 730).

Assume that the reading-prohibited term database 1404 stores the two reading-prohibited terms 802 and 804 shown in FIG. 8, whose contents are "aho" and "baka", respectively.

Here, in step 714, $S is [ba, ka] and is not empty, so processing proceeds to step 716. Reading-prohibited term 802 is "aho", so the pattern match fails, but reading-prohibited term 804 is "baka", so the pattern match succeeds. Therefore, in step 720, the content of the corresponding speech data is replaced with predetermined speech data. Here, the replacement consists of replacing the content of the speech data file designated by the URI at the head (left end) of $F with silence.

At this stage, $F is [http://blog1.com/u1/10/01.wav, http://blog1.com/u1/10/02.wav], so the content of "http://blog1.com/u1/10/01.wav", which is speech data 602 in FIG. 6, becomes silence instead of the sound corresponding to "ba".
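One way to produce the silent replacement content is sketched below with Python's standard wave module. The description does not specify the audio parameters, so the mono 16-bit PCM format, the 16 kHz sample rate, the half-second duration, and the function name silent_wav_bytes are assumptions made only for illustration.

```python
import io
import wave


def silent_wav_bytes(duration_s=0.5, sample_rate=16000):
    """Return the bytes of a mono 16-bit PCM WAV file containing only silence."""
    n_frames = int(duration_s * sample_rate)
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav_file:
        wav_file.setnchannels(1)          # mono (assumed)
        wav_file.setsampwidth(2)          # 16-bit samples (assumed)
        wav_file.setframerate(sample_rate)
        # A zero-valued sample is silence in signed 16-bit PCM.
        wav_file.writeframes(b"\x00\x00" * n_frames)
    return buffer.getvalue()


replacement = silent_wav_bytes()
```

Writing these bytes over the file designated by the URI at the head of $F leaves the URI and the HTML text unchanged, so the link still works but plays nothing audible.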

Next, in step 722, $F and $S are shifted left by one element, so that $F becomes [http://blog1.com/u1/10/02.wav] and $S becomes [ka]. Nothing matches in steps 714 to 718, so processing proceeds to step 722 and $F and $S are again shifted left by one element. At the next pass through step 714 both $F and $S are empty, and the processing of the content review unit 1406 ends (step 730).
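The loop of steps 714 to 722 traced above can be sketched in Python as follows. The description says only that $S is pattern-matched against the prohibited terms; the rule used here, that a term matches when the concatenation of one or more leading elements of $S equals it, is an assumption chosen to reproduce the worked example. The function names are hypothetical, and the sketch returns the URIs to be silenced instead of overwriting any files.

```python
def matches_at_head(readings, term):
    """True if joining one or more leading readings yields exactly `term`."""
    joined = ""
    for reading in readings:
        joined += reading
        if joined == term:
            return True
        if not term.startswith(joined):
            return False
    return False


def review(uris, readings, prohibited_terms):
    """Return the audio URIs whose content should be replaced (e.g. silenced)."""
    f = list(uris)       # corresponds to $F
    s = list(readings)   # corresponds to $S
    to_replace = []
    while s:                                                  # step 714
        if any(matches_at_head(s, t) for t in prohibited_terms):  # steps 716, 718
            to_replace.append(f[0])                           # step 720
        f.pop(0)                                              # step 722: shift left
        s.pop(0)
    return to_replace


silenced = review(
    ["http://blog1.com/u1/10/01.wav", "http://blog1.com/u1/10/02.wav"],
    ["ba", "ka"],
    ["aho", "baka"],
)
```

For the example data, only the URI ending in 01.wav is selected, so "ka" remains playable on its own while the sequence "baka" can no longer be heard.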

In this state, assume that the user of personal computer 2 accesses blog site B1400 using Web browser 10 and, via the Web server 22, views the HTML text (FIG. 5) in the HTML text database 26. The display 4 of personal computer 2 then shows the page as in FIG. 9, as in Embodiment 1.

Assume that, as in Embodiment 1, the user operating personal computer 2 selects underlines 906 and 908 with the mouse, in that order. As described above, the content of "http://blog1.com/u1/10/01.wav" is silence, so selecting underline 906 does not play the sound "ba" and speaker 14 stays silent. Only when underline 908 is selected is the sound "ka" played, so "baka" is never heard from speaker 14.

In this embodiment, the replacement in step 720 silences the speech data file corresponding to the word on the left. As in Embodiment 1, however, a predetermined fixed sound may be used instead of silence.

In Embodiment 3 described above, the speech synthesis site 40 may review the text data output by the receiving unit 42 before it is input to the speech synthesis unit 44 and, if the text data is judged inappropriate, change it and send the changed text data to the speech synthesis unit 44.

In this embodiment, the text data 604 and 608 stored in the speech text database 28 are accessed only by the conversion request unit 30 and the content review site 1402, and never by other programs such as the editing unit 24. Consequently, once a speech file has been created and stored in the speech text database 28, it is not possible to alter only the text data in the speech text database 28 so that the content review site 1402 judges the content acceptable while the speech still utters an inappropriate term.

In Embodiment 2, the blog site contains the content review unit, the digital watermark detection unit, and the reading-prohibited term database. These three components may instead be placed on a separate site outside the blog site, as a content review site shared by multiple blog sites.

Embodiment 4, which covers this case, is described below with reference to FIGS. 2 to 4, 8, 9, 11 to 13, and 15. This embodiment consists of four parts: personal computers each having a Web browser and an audio playback function, a blog site configured differently from that of Embodiment 2, the same speech synthesis site as in Embodiment 2, and a content review site configured differently from that of Embodiment 3.

The content review site is a Web site on the Internet and consists of computers such as servers and software.

The components shown in FIGS. 4, 8, 9, and 11 to 13 operate as in Embodiment 2.

FIG. 15 shows the configuration of Embodiment 4. In FIG. 15, components identical to those in FIG. 10 are denoted by the same reference numerals.

Blog site C1500 differs from Embodiment 2 in that it has no components for content review. Those components are provided in the content review site 1502, a site separate from blog site C1500.

The content review site 1502 has a digital watermark detection unit 1504, a reading-prohibited term database 1506, and a content review unit A1508. The digital watermark detection unit 1504 is the same as the digital watermark detection unit 1006 of FIG. 10. The reading-prohibited term database 1506 is the same as the reading-prohibited term database 32 of FIG. 10. The content review unit A1508 refers to the speech database 1002 of blog site C1500, obtains text data from the digital watermark detection unit 1504, and refers to the HTML text database 26 of blog site C1500 and to the reading-prohibited term database 1506. It determines whether a reading matches a reading-prohibited term and, if it does, changes the corresponding speech data stored in the speech database 1002.

The blog site also has functions such as a login function for authenticating users and a search function for searching Web pages; their description is omitted in this embodiment as well.

The reading-prohibited term database 1506, the digital watermark detection unit 1504, and the content review unit A1508 are a database and programs that run on the content review site A1502. They are implemented using the hardware (computers) constituting the content review site A1502, its operating system, and the file system and other facilities these provide.

The flow of end-user operations on personal computer 1 and the flow of processing in the editing unit 24 are the same as in Embodiments 1 to 3 (FIGS. 2 and 3).

Here, assume that the end user operates the write button 210 with the mouse 7, and that the text written in the title input area 202 and the body input area 204 is stored as a file, under a suitable file name, in the HTML text database 26 of blog site C1500 (steps 310 to 314). Assume that, as in Embodiments 1 to 3, the assigned file name is "http://blog1.com/u1/10/honbun.html", and that the HTML text stored in the HTML text database 26 at this point is that of FIG. 4, as in Embodiment 1.

As in Embodiments 1 to 3, the conversion request unit A1004 periodically refers to the HTML text database 26 and detects newly stored HTML files. It extracts the portions of text data enclosed by pictograms such as star or triangle marks, generates a unique URI for each extracted piece of text data, sends the text data and URIs to the speech synthesis site A1010, and obtains digitally watermarked speech data under the file name corresponding to each URI.

Here, assume that one item sent to the speech synthesis site A1010 has the text data "場" and the URI "http://blog1.com/u1/10/03.wav", and that the other has the text data "か" and the URI "http://blog1.com/u1/10/04.wav".

On receiving this data, the receiving unit A1012 of the speech synthesis site A1010 outputs the text data to the speech synthesis unit 44 and the URI to the transmitting unit A1016. The speech synthesis unit 44 converts the text data into speech data and outputs the text data of the reading and the speech data to the digital watermark insertion unit 1014. The digital watermark insertion unit 1014 inserts the text data of the reading, as a digital watermark, into the speech data received from the speech synthesis unit 44 and outputs the result to the transmitting unit A1016. Insertion, detection, and extraction of the digital watermark are as in Embodiment 2.
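To make the idea of carrying the reading inside the speech data concrete, the following Python sketch embeds and recovers a short ASCII reading using least-significant-bit (LSB) substitution on integer PCM samples. This passage does not state which watermarking method Embodiment 2 uses, so LSB substitution is a substituted, deliberately simple technique chosen only for illustration; the one-byte length prefix and the function names embed_reading and extract_reading are likewise assumptions.

```python
def embed_reading(samples, reading):
    """Hide an ASCII reading in the least significant bits of PCM samples."""
    payload = reading.encode("ascii")
    if len(payload) > 255:
        raise ValueError("reading too long for a one-byte length prefix")
    data = bytes([len(payload)]) + payload  # assumed framing: length, then text
    bits = [(byte >> (7 - i)) & 1 for byte in data for i in range(8)]
    if len(bits) > len(samples):
        raise ValueError("not enough samples to carry the reading")
    marked = list(samples)
    for i, bit in enumerate(bits):
        # Clear the lowest bit, then set it to the payload bit.
        marked[i] = (marked[i] & ~1) | bit
    return marked


def extract_reading(samples):
    """Recover the reading embedded by embed_reading."""
    def read_byte(offset):
        value = 0
        for sample in samples[offset:offset + 8]:
            value = (value << 1) | (sample & 1)
        return value

    length = read_byte(0)
    raw = bytes(read_byte(8 + 8 * i) for i in range(length))
    return raw.decode("ascii")


original = list(range(-20, 20))           # stand-in for PCM sample values
watermarked = embed_reading(original, "ba")
```

Each sample changes by at most one quantization step, so the speech sounds essentially the same, yet a detector such as the digital watermark detection unit can recover "ba" from the audio alone. A practical audio watermark would also need to survive re-encoding, which plain LSB substitution does not.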

As in Embodiment 2, the transmitting unit A1016 returns the digitally watermarked speech data obtained from the digital watermark insertion unit 1014 to the conversion request unit A1004 of blog site C1500, as a speech data file whose file name corresponds to the URI obtained from the receiving unit A1012. Here, assume that "場" is converted into the sound "ba" and "か" into the sound "ka".

The watermarked speech data file in which "ba" is embedded as a digital watermark has the URI "http://blog1.com/u1/10/03.wav", and the watermarked speech data file in which "ka" is embedded as a digital watermark has the URI "http://blog1.com/u1/10/04.wav".

On receiving this data, the conversion request unit A1004 stores it in the speech database 1002. It then accesses the HTML text database 26 and replaces each pictogram-enclosed portion of the HTML text for which speech data and text data were obtained with an appropriate tag containing the URI, so that a Web browser can play the speech. The HTML text after replacement is the same as in FIG. 11 of Embodiment 2. From then on, the HTML text shown in FIG. 11 is stored in the HTML text database 26 in place of the HTML text shown in FIG. 4. The watermarked speech data stored in the speech database 1002 is the same as in FIG. 12 of Embodiment 2.

The content review unit A1508 of the content review site A1502 refers to the HTML text in the HTML text database 26 of blog site C1500 and to the watermarked speech data in the speech database 1002, and processes them using the reading-prohibited terms stored in the reading-prohibited term database 1506.

The processing of the content review unit A1508 is the same as that of the content review unit of Embodiment 2: it periodically performs steps 1300 to 1330 of FIG. 13 on each HTML file in the HTML text database 26. Here, assume that the content review unit A1508 processes the HTML text of FIG. 11 and repeats steps 1302 to 1312.

Then line 11062 of the HTML text 1100 in FIG. 11 matches the pattern in step 1308, and in step 1311 the text data "ba", embedded as a digital watermark, is extracted from "http://blog1.com/u1/10/03.wav". In step 1312, "http://blog1.com/u1/10/03.wav" is appended to $F and "ba" is appended to $S.

As a result, as in Embodiment 2, $F becomes [http://blog1.com/u1/10/03.wav, http://blog1.com/u1/10/04.wav] and $S becomes [ba, ka].

Processing then proceeds to step 1314. If $S is empty, processing ends (step 1330); otherwise, $S is pattern-matched against the reading-prohibited terms in the reading-prohibited term database 1506 (step 1316).

Processing then proceeds to step 1318. If the pattern match succeeded, the content of the corresponding speech data is replaced with predetermined speech data (step 1320), and $F and $S are shifted left by one element (step 1322). If the pattern match did not succeed, processing goes directly to step 1322. After step 1322, steps 1314 to 1322 are repeated, and processing ends when $S is empty (step 1330).

Here, assume that, as in Embodiment 2, the reading-prohibited term database 1506 stores the two reading-prohibited terms 802 and 804 shown in FIG. 8, whose contents are "aho" and "baka", respectively.

In step 1314, $S is [ba, ka] and is not empty, so processing proceeds to step 1316. Reading-prohibited term 802 is "aho", so the pattern match fails, but reading-prohibited term 804 is "baka", so the pattern match succeeds. Therefore, in step 1320, the content of the corresponding speech data is replaced with predetermined speech data. Here, the replacement consists of silencing the speech data file corresponding to the word at the head (left end) of $F.

At this stage, $F is [http://blog1.com/u1/10/03.wav, http://blog1.com/u1/10/04.wav], so the content of "http://blog1.com/u1/10/03.wav", which is speech data 1202 in FIG. 12, becomes silence instead of the sound corresponding to "ba".

Next, in step 1322, $F and $S are shifted left by one element, so that $F becomes [http://blog1.com/u1/10/04.wav] and $S becomes [ka]. Nothing matches in steps 1314 to 1318, so processing proceeds to step 1322 and $F and $S are again shifted left by one element. At the next pass through step 1314 both $F and $S are empty, and the processing of the content review unit A1508 ends (step 1330).

In this state, assume that the user of personal computer 2 accesses blog site C1500 using Web browser 10 and, via the Web server 22, views the HTML text (FIG. 11) in the HTML text database 26. The display 4 of personal computer 2 then shows the page as in FIG. 9, as in Embodiments 1 to 3.

Assume that the user operating personal computer 2 selects underlines 906 and 908 with the mouse, in that order.

The links corresponding to underlines 906 and 908 are at lines 11062 to 11064 and lines 11072 to 11074 of FIG. 11, respectively, and point to "http://blog1.com/u1/10/03.wav" and "http://blog1.com/u1/10/04.wav". Because the extension is ".wav", the audio playback unit 12 of personal computer 2 attempts, as in Embodiment 1, to play "http://blog1.com/u1/10/03.wav" and "http://blog1.com/u1/10/04.wav".

As described above, the content of "http://blog1.com/u1/10/03.wav" is silence, so selecting underline 906 does not play the sound "ba" and speaker 14 stays silent. The sound "ka" is played when underline 908 is selected, so "baka" is never heard from speaker 14.

In Embodiment 4 described above, the speech synthesis site A1010 may review the text data output by the receiving unit A1012 before it is input to the speech synthesis unit 44, and may change the text data sent to the speech synthesis unit 44.

FIG. 1 is a configuration diagram of Embodiment 1.
FIG. 2 shows an example of the blog input screen.
FIG. 3 is a processing flowchart of the editing unit.
FIG. 4 shows an example of stored HTML text.
FIG. 5 shows an example of HTML text after replacement.
FIG. 6 shows an example of the data stored in the speech text database.
FIG. 7 is a processing flowchart of the content review unit.
FIG. 8 shows an example of the terms stored in the reading-prohibited term database.
FIG. 9 shows an example of the blog display screen.
FIG. 10 is a configuration diagram of Embodiment 2.
FIG. 11 shows an example of HTML text after replacement.
FIG. 12 shows an example of the digitally watermarked speech data stored in the speech database.
FIG. 13 is a processing flowchart of the content review unit.
FIG. 14 is a configuration diagram of Embodiment 3.
FIG. 15 is a configuration diagram of Embodiment 4.

Explanation of symbols

1, 2: personal computer; 3, 4: display; 5, 6: keyboard; 7, 8: mouse; 9, 10: Web browser; 11, 12: audio playback unit; 13, 14: speaker; 20, 1000, 1400, 1500: blog site; 22: Web server; 24: editing unit; 26: HTML text database; 28: speech text database; 30: conversion request unit; 32: reading-prohibited term database; 34: content review unit; 40, 1010: speech synthesis site; 42: receiving unit; 44: speech synthesis unit; 46: transmitting unit; 200: input screen; 202: title input area; 204: body input area; 206: voice-pictogram correspondence display area; 208: cancel button; 210: write button; 400, 500: HTML text; 602: speech data; 604: text data; 606: speech data; 608: text data; 802, 804: reading-prohibited term; 1402, 1502: content review site.

Claims

1. A text-to-speech conversion service system comprising:
a speech synthesis site that, in response to receiving text data, converts the text data into voice data and transmits the voice data and text data representing the reading of the voice data; and
a blog site including:
a text database for storing text data received from a connected terminal;
a conversion request unit that extracts a plurality of partial text data from the text data in response to an instruction from the terminal, transmits the extracted plurality of partial text data to the speech synthesis site, and stores, in a speech text database, the voice data corresponding to the plurality of partial text data transmitted from the speech synthesis site and the text data representing the reading of the voice data; and
a content review unit that, when the text data representing the readings of the voice data corresponding to the plurality of partial text data are concatenated and the concatenated reading text corresponds to a preset reading-prohibited term, replaces the voice data corresponding to the partial text data with predetermined voice data.

2. The text-to-speech conversion service system according to claim 1, wherein text data representing the reading of the voice data is inserted into the voice data as a digital watermark.

3. A text-to-speech conversion service system comprising:
a speech synthesis site that, in response to receiving text data, converts the text data into voice data and transmits the voice data and text data representing the reading of the voice data;
a blog site including a text database for storing text data received from a connected terminal, and a conversion request unit that extracts a plurality of partial text data from the text data in response to an instruction from the terminal, transmits the extracted plurality of partial text data to the speech synthesis site, and stores, in a speech text database, the voice data corresponding to the plurality of partial text data transmitted from the speech synthesis site and the text data representing the reading of the voice data; and
a content review site that, when the text data representing the readings of the voice data corresponding to the plurality of partial text data are concatenated and the concatenated reading text corresponds to a preset reading-prohibited term, replaces the voice data corresponding to the partial text data with predetermined voice data.

4. The text-to-speech conversion service system according to claim 3, wherein text data representing the reading of the voice data is inserted into the voice data as a digital watermark.

5. A text-to-speech conversion service method comprising:
receiving text data from a connected terminal;
extracting a plurality of partial text data from the text data in response to an instruction from the terminal;
converting the extracted partial text data into voice data;
generating text data representing the reading of the converted voice data; and
when the text data representing the readings of the voice data corresponding to the plurality of partial text data are concatenated and the concatenated reading text corresponds to a preset reading-prohibited term, replacing the voice data corresponding to the partial text data with predetermined voice data.
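
The content review step recited in the claims can be illustrated with a minimal Python sketch. It is not the implementation described in the specification: the names `Clip`, `review`, and `REPLACEMENT_AUDIO` are hypothetical, the readings are romanized placeholders, and treating "corresponds to a reading-prohibited term" as a substring match over the concatenated readings is an assumption made for illustration.

```python
from dataclasses import dataclass

# Hypothetical placeholder for the "predetermined voice data" of the claims.
REPLACEMENT_AUDIO = b"<replacement>"


@dataclass
class Clip:
    audio: bytes   # voice data synthesized from one partial text
    reading: str   # text data representing the reading of that voice data


def review(clips, prohibited_terms, replacement=REPLACEMENT_AUDIO):
    """Concatenate the readings in playback order and replace the audio of
    every clip that contributes to an occurrence of a prohibited term.

    Assumption: a term "corresponds" when it appears as a substring of the
    concatenated readings.
    """
    joined = "".join(c.reading for c in clips)

    # Character span [start, end) that each clip occupies in the joined text.
    spans = []
    pos = 0
    for c in clips:
        spans.append((pos, pos + len(c.reading)))
        pos += len(c.reading)

    flagged = set()
    for term in prohibited_terms:
        if not term:
            continue  # an empty term would match everywhere
        start = joined.find(term)
        while start != -1:
            end = start + len(term)
            for i, (s, e) in enumerate(spans):
                if s < end and start < e:  # clip overlaps the match
                    flagged.add(i)
            start = joined.find(term, start + 1)

    return [
        Clip(replacement, c.reading) if i in flagged else c
        for i, c in enumerate(clips)
    ]


# Usage: two harmless readings that form a prohibited term when played
# back to back (all values are made-up examples).
clips = [Clip(b"a1", "ba"), Clip(b"a2", "ka"), Clip(b"a3", "neko")]
reviewed = review(clips, ["baka"])
```

Working on the concatenated readings rather than on each clip separately is what allows the check to catch a term that only arises from the playback order, which is the problem stated in the abstract. Each clip's reading is harmless on its own here, so a per-clip check would pass all three.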