JP2003140672A - Phoneme business system - Google Patents

Phoneme business system

Info

Publication number
JP2003140672A
Authority
JP
Japan
Prior art keywords
phoneme
means
copyright
phonemes
data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
JP2001340687A
Other languages
Japanese (ja)
Other versions
JP2003140672A5 (en
Inventor
Kazunori Hayashi
Masaru Mase
和典 林
優 間瀬
Original Assignee
Matsushita Electric Ind Co Ltd
松下電器産業株式会社
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Matsushita Electric Ind Co Ltd, 松下電器産業株式会社
Priority to JP2001340687A
Publication of JP2003140672A
Publication of JP2003140672A5
Application status is Withdrawn

Abstract

PROBLEM TO BE SOLVED: To enable the owner of the copyright on a phoneme to obtain a copyright fee corresponding to the use of the phoneme, and to enable a user of a service that uses the phoneme to receive the service easily. SOLUTION: A phoneme business system is equipped with a phoneme input means for inputting phonemes; a copyright owner registering means for registering the owner of the copyright on the phonemes; a voice synthesizing means that analyzes the data to be synthesized (for example, text data of a document) using the database of phonemes generated by the phoneme input means, extracts and connects the optimum phonemes for each piece of data, and computes the amount of phonemes used; and a selling means for selling the service using the phonemes to users in response to their service requests. The system is further equipped with a copyright fee calculating means that calculates the copyright fee for each owner of copyright on the phonemes according to the usage calculated by the voice synthesizing means.

Description

[Detailed Description of the Invention]
[0001] [Technical Field of the Invention] The present invention relates to a phoneme business system that uses phonemes, the smallest constituent elements of speech.
[0002] [Prior Art] In recent years, personal computers have come to be equipped with functions that convert text data, such as e-mail and word-processor documents, into speech. However, a function that merely converts text data into speech has problems such as poor cost performance. In addition, the available output voices are generic ones, such as a male or a female voice, and are not necessarily the voice the user wants, so users tend not to find listening to them enjoyable.
[0003] Japanese Unexamined Patent Publication No. H7-140999 discloses a speech synthesis apparatus and a speech synthesis method capable of generating synthesized speech close to human utterance. Specifically, accent command values and/or phoneme duration information are prepared in advance in a dictionary together with information such as readings and accent types; a parameter sequence of phoneme-segment data is generated using the phoneme durations, and a speech waveform is synthesized from these, thereby outputting synthesized speech that is still closer to human utterance.
Japanese Unexamined Patent Publication No. H11-143483 discloses a system that, for generating synthesized speech when using a personal computer, word processor, game machine, or the like, provides a means by which the user can freely choose among a variety of synthesized voices.
[0004] As a means of solving these problems, a read-aloud system has been devised that uses a phoneme database created by sampling the natural voice of a speaker.
[0005] [Problems to Be Solved by the Invention] Phonemes collected from a real person carry the individuality unique to that speaker, and a right similar to copyright should be recognized in them. Conventional read-aloud systems, however, have no means of recognizing that right (hereinafter referred to as copyright). Copyright should properly be recognized, and appropriate compensation (a copyright fee) should be paid to the copyright owner of the phonemes according to their use; yet in conventional read-aloud systems no copyright fee is paid even when the phonemes are used.
[0006] When phonemes collected from a real person are used, the rights holder suffers a disadvantage because no copyright fee is paid to the copyright owner. Moreover, because there was no sales means, it was difficult for users to obtain the service easily. These points could therefore become obstacles to developing a business that uses phonemes. Conventional systems also lacked a sales means through which users could easily obtain the service from the system.
[0007] [Means for Solving the Problems] In view of the above problems, when phonemes are used, two things are needed: a means of paying the copyright owner of those phonemes a copyright fee according to their use, and a sales means through which users can easily obtain the service.
To achieve this, the invention provides a phoneme business system in which the smallest constituent element of speech is defined as a phoneme, and which comprises: phonemes having such individuality; a phoneme capture means that captures the phonemes; a copyright owner registration means that registers the copyright owner of the phonemes; a speech synthesis means that analyzes the data to be synthesized (for example, text data such as a document) using the phoneme database generated by the phoneme capture means, extracts and concatenates the optimal phonemes for each piece of data, and calculates the amount of phonemes used; a distribution means that provides the synthesized speech data processed by the speech synthesis means to the user; a copyright fee calculation means that calculates a copyright fee for each copyright owner of the phonemes according to the usage calculated by the speech synthesis means; a payment means that pays the copyright fee to the copyright owner of the phonemes based on the copyright fee calculation information; a sales means that sells services using the phonemes to users; a synthesis target data recording means that records the data to be synthesized; a phoneme database recording means that records the phoneme database created by the phoneme capture means; and a user interface means that accepts service requests from users and provides the interface between the user and the system.
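The role of the speech synthesis means described above (extract a stored phoneme for each piece of the target data, concatenate them, and count usage) can be illustrated with a minimal sketch. The source does not specify any data format or selection algorithm, so everything here is an assumption made for illustration: the unit labels, the `PHONEME_DB` contents, and the `synthesize` function are hypothetical, and "optimal" selection is reduced to a plain dictionary lookup.

```python
from collections import Counter

# Hypothetical phoneme database for one voice character: unit label -> waveform samples.
# Short lists stand in for real recorded audio data.
PHONEME_DB = {
    "a": [0.1, 0.2],
    "k": [0.3],
    "i": [0.4, 0.5],
}


def synthesize(units, phoneme_db):
    """Concatenate the stored samples for each unit and count how often each unit is used."""
    samples = []
    usage = Counter()
    for unit in units:
        if unit not in phoneme_db:
            raise KeyError(f"no phoneme registered for unit {unit!r}")
        samples.extend(phoneme_db[unit])  # join the unit onto the output
        usage[unit] += 1                  # record one use of this phoneme
    return samples, usage


# Synthesize the unit sequence a-k-i and keep the usage counts for later fee calculation.
samples, usage = synthesize(["a", "k", "i"], PHONEME_DB)
```

Returning the usage counts alongside the audio mirrors the text, where the same means both synthesizes and calculates the amount of phonemes used.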
[0008] [Embodiments of the Invention] The invention according to claim 1 is a phoneme business system in which the smallest constituent element of speech is defined as a phoneme, and which comprises: phonemes having such individuality; a phoneme capture means that captures the phonemes; a copyright owner registration means that registers the copyright owner of the phonemes; a speech synthesis means that analyzes the data to be synthesized (for example, text data such as a document) using the phoneme database generated by the phoneme capture means, extracts and concatenates the optimal phonemes for each piece of data, and calculates the amount of phonemes used; a distribution means that provides the synthesized speech data processed by the speech synthesis means to the user; a copyright fee calculation means that calculates a copyright fee for each copyright owner of the phonemes according to the usage calculated by the speech synthesis means; a payment means that pays the copyright fee to the copyright owner of the phonemes based on the copyright fee calculation information; a sales means that sells services using the phonemes to users; a synthesis target data recording means that records the data to be synthesized; a phoneme database recording means that records the phoneme database created by the phoneme capture means; and a user interface means that accepts service requests from users and provides the interface between the user and the system. With this system, the copyright owner of the phonemes obtains a copyright fee according to the use of the phonemes, and users of services using the phonemes can obtain those services easily.
[0009] (Embodiment) An embodiment of the phoneme business system of the present invention is described below with reference to FIGS. 1 to 4.
[0010] FIG. 2 is a schematic explanatory diagram of the phoneme business system of the present invention. An overview of the system follows. (201) is the phoneme business system of the present invention, which sells services using phonemes to users and pays the copyright owners of the phonemes copyright fees according to the use of the phonemes. (202) is a phoneme provider, who provides phonemes to the phoneme business system. (203) is a general user, who receives services using phonemes from the phoneme business system. (204) is a content provider, such as a company, an administrative body such as a city hall, an educational institution such as a school, or a religious organization, that receives services using phonemes from the phoneme business system and in turn provides services such as audio information to general users.
[0011] When a phoneme provider provides phonemes to the system, the system registers the copyright owner of the provided phonemes (205). Next, content providers and general users request from the system, via a network, telephone, fax, mail, in person, or any combination of these, a speech synthesis service that renders synthesis target data in a desired voice character (206). Synthesis target data is data describing the text to be synthesized into speech; its format is not limited. Its content may be, for example, news, administrative guidance, textbooks, text such as novels recorded in the system in advance, text written by the user, personal histories, dramas, or regional dialects. The phoneme business system performs speech synthesis using the phoneme database of the voice character requested by the user, delivers the synthesized speech data to the user via a network, or by mail or by hand on a recording medium such as an optical disc, magnetic disk, or semiconductor memory, and collects a fee for the service (207).
[0012] A general user loads the delivered synthesized speech data into a terminal device equipped with a synthesized speech data input means and an audio output means, and plays it back to listen to synthesized speech in the desired voice character. The synthesized speech data input means is, for example, a network interface such as a modem, or a data input means for a storage medium such as an optical disc, magnetic disk, or semiconductor memory. The audio output means is a speaker, headphones, earphones, or the like. A content provider records the delivered synthesized speech data on such a recording medium in preparation for service requests from general users.
A general user may also request news, administrative guidance, or the like in a character voice from a content provider via a network, telephone, fax, mail, in person, or any combination of these (208). The content provider delivers the requested service to the general user via a network, or by mail or by hand on a recording medium such as an optical disc, magnetic disk, or semiconductor memory (209). The general user then loads the delivered synthesized speech data by the means described above and can listen to the synthesized speech. The system then pays copyright fees to the copyright owners of the phonemes used, according to the use of the phonemes in the service (210). This concludes the overview of the system.
[0013] Next, the system is described in detail. FIG. 1 is a block diagram of the phoneme business system of the present invention. (101) is the natural voice uttered by a phoneme registrant, and (102) is a phoneme capture means that extracts phonemes from the uttered voice and builds them into a database. (103) is a copyright owner registration means that registers the copyright owner of the phonemes captured by the phoneme capture means. (104) is a speech synthesis means that, using the phoneme database generated by the phoneme capture means, analyzes the data to be synthesized, combines the optimal phonemes to produce speech, and also calculates the amount of phonemes used.
[0014] (105) is a copyright fee calculation means that calculates the copyright fee for each copyright owner of the phonemes according to the usage calculated by the speech synthesis means, and (106) is a payment means that pays the copyright fee to the copyright owner of the phonemes based on the calculation information from the copyright fee calculation means. (107) is a distribution means for providing the synthesized speech data processed by the speech synthesis means to the user; it provides the data via a network such as the Internet, or by mail or by hand on a recording medium such as an optical disc, magnetic disk, or semiconductor memory. (108) is a sales means that sells services using phonemes to users.
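The phoneme capture means (102) and the copyright owner registration means (103) described above amount to two records kept side by side: the captured phoneme data and the owner of the rights in it. A minimal sketch of that bookkeeping follows. The class name `PhonemeStore`, its methods, and the choice to register one owner per voice character (rather than per individual phoneme) are assumptions made for illustration; the source does not prescribe a data model.

```python
from dataclasses import dataclass, field


@dataclass
class PhonemeStore:
    """Hypothetical in-memory stand-in for the phoneme database and the owner registry."""
    phoneme_db: dict = field(default_factory=dict)  # (voice character, unit) -> recorded data
    owners: dict = field(default_factory=dict)      # voice character -> copyright owner

    def capture(self, character, unit, data):
        """Store one captured phoneme for a voice character."""
        self.phoneme_db[(character, unit)] = data

    def register_owner(self, character, owner):
        """Register the copyright owner of a voice character's phonemes."""
        self.owners[character] = owner

    def owner_of(self, character):
        """Look up the registered copyright owner of a voice character."""
        return self.owners[character]


# Capture one phoneme and register its owner; the two calls are independent of each other.
store = PhonemeStore()
store.capture("voice_a", "a", b"\x01\x02")
store.register_owner("voice_a", "owner_1")
```

Because the two methods touch separate dictionaries, capture and registration can be performed in either order.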
[0015] (109) is a synthesis target data recording means for recording the data to be synthesized, for example text data of novels or other documents, and (110) is a phoneme database recording means that records the phoneme database created by the phoneme capture means. The synthesis target data recording means and the phoneme database recording means are optical discs, magnetic disks, semiconductor memories, or the like, and are not limited to those listed here as long as they can record data. (111) is a user interface means that accepts service requests from users and provides the interface between the user and the system. The user interface means may be a web system used on the Internet, or the interaction may take place by telephone, fax, or mail, or directly in person.
[0016] Next, the operation is described. The operation of the system divides broadly into two parts. One covers capturing the natural voice and accumulating phonemes; the other covers the use of the accumulated phonemes, sales, fee accounting for the copyright owners, and payment of the copyright fees.
[0017] First, the phoneme accumulation operation of the system is described. FIG. 3 is a flowchart of the operation up to phoneme accumulation in the phoneme business system of the present invention.
When a phoneme registrant speaks, the phoneme capture means, which is equipped with a microphone or the like, builds the uttered voice into a database in an arbitrary format and records it in the phoneme database recording means (s301). Next, the copyright owner registration means registers the copyright owner of the phonemes captured by the phoneme capture means (s302). The order of (s301) and (s302) may be reversed. This completes the operation up to phoneme accumulation.
[0018] FIG. 4 is a flowchart of the operation of the phoneme business system of the present invention from acceptance of a service using phonemes, through sales, to payment of the copyright fee. When a user, that is, a content provider or a general user, specifies a desired voice character and synthesis target data and requests the speech synthesis service from the system via a network such as the Internet, telephone, fax, mail, in person, or any combination of these, the user interface means of the system accepts the service request (s401). The synthesis target data specified by the user is either data recorded in advance in the synthesis target data recording section inside the system, or data that the content provider or general user submits to the system for synthesis. Data submitted for synthesis by a content provider or general user is recorded in the synthesis target data recording means inside the system.
[0019] Next, the sales means identifies the content of the service request accepted by the user interface means and calculates the fee for the requested service. It presents the fee to the user through the user interface means, obtains the user's consent, and collects the fee (s402). Fee collection can take several forms, for example: a fee according to the number of voice characters provided to the user; a fee according to the quality (market rate) of the voice character; a fee according to the amount of phoneme data of each character; a fee according to the number of items or the volume of data to be synthesized using the phonemes; or a fee according to the number of items or the volume of synthesized data. Fees that combine these factors in various ways are of course also possible.
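The fee factors listed for step (s402) can be combined in many ways, and the source leaves the formula open. The following is only one possible sketch, assuming a fee made of two of the listed factors: a per-character amount reflecting each voice character's market rate, plus an amount proportional to the volume of text to be synthesized. The `ServiceRequest` type, the field names, and all prices are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ServiceRequest:
    """Hypothetical service request accepted by the user interface means."""
    character_rates: dict  # voice character -> fee reflecting its market rate
    text_chars: int        # volume of the synthesis target data, in characters


def service_fee(request, fee_per_text_char=2):
    """Combine a per-character fee with a fee proportional to the text volume."""
    character_fee = sum(request.character_rates.values())
    volume_fee = request.text_chars * fee_per_text_char
    return character_fee + volume_fee


# Two voice characters (300 and 500 currency units) and 1000 characters of text.
request = ServiceRequest(character_rates={"narrator": 300, "actor": 500}, text_chars=1000)
fee = service_fee(request)  # 300 + 500 + 1000 * 2 = 2800
```

Integer currency units are used so that the arithmetic is exact; other factors from the list, such as the amount of phoneme data per character, could be added as further terms in the same way.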
The fee need not be collected at this stage; user information such as the user's name and credit card number may be recorded, the charge billed, and the fee collected at a later date.
[0020] Next, the speech synthesis means reads the synthesis target data specified by the user from the synthesis target data recording section, converts it if necessary into data that can be synthesized, and analyzes it sequentially. It reads the phoneme data best suited to each piece of data from the phoneme database of the voice character specified by the user and concatenates them to create synthesized speech data. If necessary, the synthesized speech data is converted into the data format best suited to the distribution means or to the terminal device used by the user (s403). Next, the distribution means delivers the synthesized speech data created by the speech synthesis means to the user (s404).
[0021] A general user loads the synthesized speech data delivered from the system into a terminal device equipped with a synthesized speech data input means and an audio output means, and plays it back to listen to the text read aloud in the desired voice character. A content provider records the delivered synthesized speech data on the recording medium described above in preparation for service requests from general users. Next, the speech synthesis means calculates the amount of phonemes used in the speech synthesis (s405). Although the amount of phonemes used is taken as the measure here, the amount of synthesis target data used or the amount of synthesized speech used may be taken instead. The term usage naturally also covers data volume and synthesis time.
[0022] Next, the copyright fee calculation means calculates the copyright fee according to the usage, based on the usage calculation result from the speech synthesis means (s406). Then, based on this calculation information, the payment means pays the copyright fee to the copyright owner of the phonemes (s407).
[0023] [Effects of the Invention] With the system of the present invention, the copyright owner of the phonemes obtains a copyright fee according to the use of the phonemes, and users of services using the phonemes can obtain those services easily. The business of using phonemes itself may therefore develop considerably.
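Steps (s405) to (s407) above turn usage counts into a fee per copyright owner. A minimal sketch of that calculation follows, assuming a flat rate per phoneme use and one owner per voice character; neither assumption comes from the source, which does not give a rate or a formula, and the function and variable names are hypothetical.

```python
from collections import defaultdict


def fees_by_owner(usage, owners, rate_per_use):
    """Sum usage counts per copyright owner and multiply by a flat per-use rate.

    usage:  {(voice character, unit): number of uses}
    owners: {voice character: copyright owner}
    """
    totals = defaultdict(int)
    for (character, _unit), count in usage.items():
        totals[owners[character]] += count * rate_per_use
    return dict(totals)


# Hypothetical usage figures produced by the speech synthesis means.
usage = {("voice_a", "a"): 3, ("voice_a", "k"): 1, ("voice_b", "i"): 2}
owners = {"voice_a": "owner_1", "voice_b": "owner_2"}
fees = fees_by_owner(usage, owners, rate_per_use=5)
```

Here `owner_1` is credited for four uses and `owner_2` for two, so the payment means would pay 20 and 10 currency units respectively. Measuring usage by data volume or synthesis time, which the text also allows, would only change what the counts represent.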

[Brief Description of the Drawings]
[FIG. 1] Block diagram of the phoneme business system of the present invention
[FIG. 2] Schematic explanatory diagram of the phoneme business system of the present invention
[FIG. 3] Flowchart of the operation up to phoneme accumulation in the phoneme business system of the present invention
[FIG. 4] Flowchart of the operation of the phoneme business system of the present invention from acceptance of a service using phonemes, through sales, to payment of the copyright fee
[Description of Reference Numerals]
(101) Natural voice uttered by a phoneme registrant
(102) Phoneme capture means
(103) Copyright owner registration means
(104) Speech synthesis means
(105) Copyright fee calculation means
(106) Payment means
(107) Distribution means
(108) Sales means
(109) Synthesis target data recording means
(110) Phoneme database recording means
(111) User interface means

Continuation of front page (51) Int.Cl.7 Identification symbol FI Theme code (reference) G10L 13/00 G10L 3/00 E 13/06 5/04 Z

Claims (1)

  1. [Claims]
    [Claim 1] A phoneme business system in which the smallest constituent element of speech is defined as a phoneme, comprising: phonemes having such individuality; a phoneme capture means that captures the phonemes; a copyright owner registration means that registers the copyright owner of the phonemes; a speech synthesis means that analyzes the data to be synthesized (for example, text data such as a document) using the phoneme database generated by the phoneme capture means, extracts and concatenates the optimal phonemes for each piece of data, and calculates the amount of phonemes used; a distribution means that provides the synthesized speech data processed by the speech synthesis means to the user; a copyright fee calculation means that calculates a copyright fee for each copyright owner of the phonemes according to the usage calculated by the speech synthesis means; a payment means that pays the copyright fee to the copyright owner of the phonemes based on the copyright fee calculation information; a sales means that sells services using the phonemes to users; a synthesis target data recording means that records the data to be synthesized; a phoneme database recording means that records the phoneme database created by the phoneme capture means; and a user interface means that accepts service requests from users and provides the interface between the user and the system.
    [Claim 2] The phoneme business system according to claim 1, wherein the phoneme is a sound consisting of a vowel or a combination of a consonant and a vowel, such as 「あ」 (a), 「い」 (i), 「か」 (ka), or 「き」 (ki).
    [Claim 3] The phoneme business system according to claim 1, wherein the phoneme is a single phone, the smallest unit of continuous speech (for example, 「秋(あき)」 (aki) consists of the single phones "a", "k", and "i").
    [Claim 4] The phoneme business system according to claim 1, wherein the phoneme is a word.
    [Claim 5] The phoneme business system according to claim 1, wherein the phoneme is a phrase or a sentence.
    [Claim 6] The phoneme business system according to claim 1, wherein the phoneme is an onomatopoeic word imitating a sound, an onomatopoeic word imitating a voice, or a mimetic word.
    [Claim 7] The phoneme business system according to claim 1, wherein the phoneme is digitally synthesized speech.
    [Claim 8] The read-aloud system according to claim 1, wherein the synthesized speech data input means of the terminal device is a storage device such as a memory card, optical disc, or magnetic disk, or a network interface such as a modem.
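Claims 2 and 3 define the phoneme unit at two different granularities, and the example 「あき」 in claim 3 makes the difference concrete. The sketch below splits the same text both ways. The lookup table `KANA_TO_PHONES` is a hypothetical and deliberately tiny mapping covering only the four kana named in claim 2; a real system would need a full table and proper text analysis.

```python
# Hypothetical kana-to-phone table limited to the kana named in claim 2.
KANA_TO_PHONES = {
    "あ": ["a"],
    "い": ["i"],
    "か": ["k", "a"],
    "き": ["k", "i"],
}


def kana_units(text):
    """Claim 2 granularity: one unit per kana (a vowel or a consonant plus vowel)."""
    return list(text)


def phone_units(text):
    """Claim 3 granularity: split each kana into its single phones."""
    phones = []
    for kana in text:
        phones.extend(KANA_TO_PHONES[kana])
    return phones


units_claim2 = kana_units("あき")   # two kana-level units
units_claim3 = phone_units("あき")  # three single phones, as in the claim 3 example
```

The choice of granularity changes how many units are counted for the same text, which in turn would affect any usage-based fee.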
JP2001340687A 2001-11-06 2001-11-06 Phoneme business system Withdrawn JP2003140672A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
JP2001340687A JP2003140672A (en) 2001-11-06 2001-11-06 Phoneme business system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
JP2001340687A JP2003140672A (en) 2001-11-06 2001-11-06 Phoneme business system

Publications (2)

Publication Number Publication Date
JP2003140672A true JP2003140672A (en) 2003-05-16
JP2003140672A5 JP2003140672A5 (en) 2005-04-14

Family

ID=19154843

Family Applications (1)

Application Number Title Priority Date Filing Date
JP2001340687A Withdrawn JP2003140672A (en) 2001-11-06 2001-11-06 Phoneme business system

Country Status (1)

Country Link
JP (1) JP2003140672A (en)

Cited By (28)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9083798B2 (en) 2004-12-22 2015-07-14 Nuance Communications, Inc. Enabling voice selection of user preferences
US8571872B2 (en) 2005-06-16 2013-10-29 Nuance Communications, Inc. Synchronizing visual and speech events in a multimodal application
US8781840B2 (en) 2005-09-12 2014-07-15 Nuance Communications, Inc. Retrieval and presentation of network service results for mobile device using a multimodal browser
US9208785B2 (en) 2006-05-10 2015-12-08 Nuance Communications, Inc. Synchronizing distributed speech recognition
US9292183B2 (en) 2006-09-11 2016-03-22 Nuance Communications, Inc. Establishing a preferred mode of interaction between a user and a multimodal application
US8494858B2 (en) 2006-09-11 2013-07-23 Nuance Communications, Inc. Establishing a preferred mode of interaction between a user and a multimodal application
US8600755B2 (en) 2006-09-11 2013-12-03 Nuance Communications, Inc. Establishing a multimodal personality for a multimodal application in dependence upon attributes of user interaction
US9343064B2 (en) 2006-09-11 2016-05-17 Nuance Communications, Inc. Establishing a multimodal personality for a multimodal application in dependence upon attributes of user interaction
US8862471B2 (en) 2006-09-12 2014-10-14 Nuance Communications, Inc. Establishing a multimodal advertising personality for a sponsor of a multimodal application
US8498873B2 (en) 2006-09-12 2013-07-30 Nuance Communications, Inc. Establishing a multimodal advertising personality for a sponsor of multimodal application
US8239205B2 (en) 2006-09-12 2012-08-07 Nuance Communications, Inc. Establishing a multimodal advertising personality for a sponsor of a multimodal application
US8938392B2 (en) 2007-02-27 2015-01-20 Nuance Communications, Inc. Configuring a speech engine for a multimodal application based on location
US9208783B2 (en) 2007-02-27 2015-12-08 Nuance Communications, Inc. Altering behavior of a multimodal application based on location
US8843376B2 (en) 2007-03-13 2014-09-23 Nuance Communications, Inc. Speech-enabled web content searching using a multimodal browser
US7945851B2 (en) 2007-03-14 2011-05-17 Nuance Communications, Inc. Enabling dynamic voiceXML in an X+V page of a multimodal application
US9123337B2 (en) 2007-03-20 2015-09-01 Nuance Communications, Inc. Indexing digitized speech with words represented in the digitized speech
US8706490B2 (en) 2007-03-20 2014-04-22 Nuance Communications, Inc. Indexing digitized speech with words represented in the digitized speech
US8909532B2 (en) 2007-03-23 2014-12-09 Nuance Communications, Inc. Supporting multi-lingual user interaction with a multimodal application
US8862475B2 (en) 2007-04-12 2014-10-14 Nuance Communications, Inc. Speech-enabled content navigation and control of a distributed multimodal browser
US9076454B2 (en) 2008-04-24 2015-07-07 Nuance Communications, Inc. Adjusting a speech engine for a mobile computing device based on background noise
US9396721B2 (en) 2008-04-24 2016-07-19 Nuance Communications, Inc. Testing a grammar used in speech recognition for reliability in a plurality of operating environments having different background noise
US8229081B2 (en) 2008-04-24 2012-07-24 International Business Machines Corporation Dynamically publishing directory information for a plurality of interactive voice response systems
US9349367B2 (en) 2008-04-24 2016-05-24 Nuance Communications, Inc. Records disambiguation in a multimodal application operating on a multimodal device
US8380513B2 (en) 2009-05-19 2013-02-19 International Business Machines Corporation Improving speech capabilities of a multimodal application
US9530411B2 (en) 2009-06-24 2016-12-27 Nuance Communications, Inc. Dynamically extending the speech prompts of a multimodal application
US8290780B2 (en) 2009-06-24 2012-10-16 International Business Machines Corporation Dynamically extending the speech prompts of a multimodal application
US8510117B2 (en) 2009-07-09 2013-08-13 Nuance Communications, Inc. Speech enabled media sharing in a multimodal application
US8416714B2 (en) 2009-08-05 2013-04-09 International Business Machines Corporation Multimodal teleconferencing

Similar Documents

Publication Publication Date Title
US6775651B1 (en) Method of transcribing text from computer voice mail
KR101066741B1 (en) Semantic object synchronous understanding for highly interactive interface
US5724481A (en) Method for automatic speech recognition of arbitrary spoken words
US6556972B1 (en) Method and apparatus for time-synchronized translation and synthesis of natural-language speech
JP3994368B2 (en) Information processing apparatus, information processing method, and recording medium
US10372891B2 (en) System and method for identifying special information verbalization timing with the aid of a digital computer
US6570964B1 (en) Technique for recognizing telephone numbers and other spoken information embedded in voice messages stored in a voice messaging system
US8583418B2 (en) Systems and methods of detecting language and natural language strings for text to speech synthesis
US8352272B2 (en) Systems and methods for text to speech synthesis
US8355919B2 (en) Systems and methods for text normalization for text to speech synthesis
US5893902A (en) Voice recognition bill payment system with speaker verification and confirmation
US8712776B2 (en) Systems and methods for selective text to speech synthesis
US20110314132A1 (en) Method and system for interacting with a user in an experiential environment
US20020032591A1 (en) Service request processing performed by artificial intelligence systems in conjunctiion with human intervention
US8396714B2 (en) Systems and methods for concatenation of words in text to speech synthesis
CN101124623B (en) Voice authentication system and method
CN1157710C (en) Speech datas extraction
EP1704560B1 (en) Virtual voiceprint system and method for generating voiceprints
US7457397B1 (en) Voice page directory system in a voice page creation and delivery system
US9813366B2 (en) Method and system for communicating between a sender and a recipient via a personalized message including an audio clip extracted from a pre-existing recording
US7283973B1 (en) Multi-modal voice-enabled content access and delivery system
US20100082327A1 (en) Systems and methods for mapping phonemes for text to speech synthesis
US20030033161A1 (en) Method and apparatus for generating and marketing supplemental information
Iskra et al. Speecon-speech databases for consumer devices: Database specification and validation
US7624044B2 (en) System for marketing goods and services utilizing computerized central and remote facilities

Legal Events

Date Code Title Description
A521 Written amendment. Free format text: JAPANESE INTERMEDIATE CODE: A523. Effective date: 20040603
A621 Written request for application examination. Free format text: JAPANESE INTERMEDIATE CODE: A621. Effective date: 20040603
RD01 Notification of change of attorney. Free format text: JAPANESE INTERMEDIATE CODE: A7421. Effective date: 20050701
A977 Report on retrieval. Free format text: JAPANESE INTERMEDIATE CODE: A971007. Effective date: 20060502
A131 Notification of reasons for refusal. Free format text: JAPANESE INTERMEDIATE CODE: A131. Effective date: 20060516
A761 Written withdrawal of application. Free format text: JAPANESE INTERMEDIATE CODE: A761. Effective date: 20060713