JP2000194384A

JP2000194384A - System and method for recording and synthesizing sound, and infrastructure for distributing recorded sound for reproduction at a remote location

Info

Publication number: JP2000194384A
Application number: JP11256692A
Authority: JP
Inventors: Steven D Curtin; ディーカーティンスティーヴン
Original assignee: Lucent Technologies Inc
Current assignee: Nokia of America Corp
Priority date: 1998-09-10
Filing date: 1999-09-10
Publication date: 2000-07-14
Also published as: EP0986046A1

Abstract

PROBLEM TO BE SOLVED: To record sound with high quality and independently of the sampling rate by identifying a selected one of a common spectral structure and a common formant structure in a frame and generating a record containing the fundamental frequency and the selected structure. SOLUTION: After a sampler 120 performs sampling, which may occur at 1 ms intervals, a frame generator 130 extracts the fundamental frequency. The frame generator 130 generates frames, extracting the fundamental frequency and the spectral envelope from a sound source 110 through the sampler 120. A frame analyzer 140 then identifies a selected one of the common spectral structure and the common formant structure in the frame and generates a record containing the selected structure and the appropriate fundamental frequency. The generated record is stored in a first storage unit 150.

Description

DETAILED DESCRIPTION OF THE INVENTION

[0001]

FIELD OF THE INVENTION The present invention relates generally to sound recording and playback and, more particularly, to a resolution-independent system and method for creating a record of a sound and subsequently using that record to synthesize the sound.

[0002]

BACKGROUND OF THE INVENTION Current recordings of musical instrument performances are based either on sampling the analog signal of the instrument or on recording the gestures that are input to a controller. On playback, this means that a sampled performance can be edited only in the fixed time domain, while with the Musical Instrument Digital Interface (MIDI) only keyboard and percussion can be realistically recorded. Additive spectral synthesis techniques divide a musical performance into discrete notes rather than treating it as a continuous performance. In general, the ability to change tempo or key, and to synchronize a performance with external events on playback, is often difficult to achieve at reasonable cost and without unacceptable distortion.

[0003] A sound waveform can be characterized by a number of parameters, such as frequency and amplitude. Using Fourier analysis, the waveform can be represented in the frequency domain as a spectral frame made up of spectral components. The spectral frame contains the lowest, or fundamental, frequency of the waveform together with its harmonics (spectral components occurring at multiples of the fundamental frequency). Spectral components from stringed instruments, or from vowels in speech, usually occur near integer multiples of the fundamental frequency, whereas spectral components from percussion instruments occur at non-integer multiples of the fundamental frequency.
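The relationship between a fundamental and its harmonics can be illustrated with a small sketch. It is not taken from the patent: the test tone, the 1000 Hz sample rate, the threshold and the function names are all assumptions chosen so that each component falls exactly on a DFT bin, and a naive DFT stands in for whatever Fourier analysis an implementation would actually use.

```python
import cmath
import math

def dft_magnitudes(samples):
    """Magnitudes of DFT bins 0..N/2 of a real signal (naive O(N^2) DFT)."""
    n = len(samples)
    mags = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                  for t, x in enumerate(samples))
        mags.append(abs(acc) / n)
    return mags

def spectral_peaks(samples, sample_rate, threshold=0.05):
    """(frequency_hz, magnitude) for every bin whose magnitude exceeds threshold."""
    n = len(samples)
    return [(k * sample_rate / n, m)
            for k, m in enumerate(dft_magnitudes(samples)) if m > threshold]

# Hypothetical test tone: 100 Hz fundamental plus its 2nd and 3rd harmonics.
SAMPLE_RATE = 1000  # Hz (assumed)
N = 200             # 0.2 s of signal, so bins are 5 Hz apart
tone = [math.sin(2 * math.pi * 100 * t / SAMPLE_RATE)
        + 0.5 * math.sin(2 * math.pi * 200 * t / SAMPLE_RATE)
        + 0.25 * math.sin(2 * math.pi * 300 * t / SAMPLE_RATE)
        for t in range(N)]

peaks = spectral_peaks(tone, SAMPLE_RATE)
fundamental_hz = peaks[0][0]  # lowest strong component
harmonic_ratios = [round(f / fundamental_hz, 3) for f, _ in peaks]
```

For this harmonic tone the ratios come out as integers (1, 2, 3); for a percussive source one would expect non-integer ratios instead.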

[0004] Current sound recordings are typically sample-rate dependent, or prone to other recording or playback characteristics, so that changes to a recording come at unacceptable cost and distortion levels. These limitations have also severely restricted the formats in which current recordings can be selected for playback. Radio stations offer a selection of recordings that may be programmed days in advance, and they allow only an occasional specific request, usually by telephone. Even when a specific request is honored, the format of the recording is completely fixed with respect to tempo and key, as well as its basic arrangement.

[0005] Accordingly, what is needed in the art is a way to provide high-quality, sample-rate-independent recordings that can be selected in an arbitrary and convenient manner and that can be modified or adapted for a particular listener.

[0006]

SUMMARY OF THE INVENTION To address the above-described deficiencies of the prior art, the present invention provides systems and methods for recording and synthesizing sound in a resolution-independent manner, and an infrastructure for distributing resolution-independent recordings for remote playback. In one embodiment, one system includes: (1) a frame generator that extracts a fundamental frequency and a spectral envelope from the sound and generates frames from them; and (2) a frame analyzer that identifies a selected one of a common spectral structure and a common formant structure in the frames and generates a record containing the fundamental frequency and the selected structure.

[0007] The present invention therefore introduces the broad concept of recording the fundamental frequencies and selected structures in a sound and generating a record containing those fundamental frequencies and selected structures, thereby providing a basis for subsequent synthesis (the equivalent of playback). Preferably, the present invention analyzes the sound as a continuous performance, independent of individual tones or notes.

[0008] In one embodiment of the present invention, the frames are discrete, so that they can correspond to discrete periods of time. In one embodiment to be illustrated and described, the common spectral structures or common formant structures can be included in a dictionary to compress the overall size of the record.

[0009] In one embodiment of the present invention, a musical instrument produces the sound. Those skilled in the art are familiar with the formant content of certain musical instruments, such as stringed and wind instruments. The human voice likewise contains formants that can be captured and employed in later synthesis. The present invention can, however, be applied to any sound.

[0010] In one embodiment of the present invention, the frame generator samples the sound and then extracts the fundamental frequency from the samples. In the embodiment illustrated and described, sampling may occur at 1 ms intervals. Those skilled in the art will understand, however, that the present invention is not limited to a particular sampling frequency.

[0011] In one embodiment of the present invention, the system further includes a mapping circuit that applies a time quantization map to the record. Once the record has been generated, the present invention can apply a wide range of conventional or later-developed sound manipulation techniques to it.

[0012] In one embodiment of the present invention, the system further includes an editor that modifies a selected one of the content and the order of the record. The editor allows a sound, once recorded, to be manipulated further.

[0013] In one embodiment of the present invention, the frame analyzer identifies the selected one of the common spectral structure and the common formant structure by Fourier-analyzing the frames. Those skilled in the art are familiar with frequency analysis techniques, in particular the fast Fourier transform. The present invention is also compatible with other conventional or later-developed spectral analysis techniques, such as wavelets.

[0014] The present invention further provides an infrastructure for distributing recordings for remote playback. One infrastructure includes: (1) a radio station having an associated recording database; (2) a plurality of recordings contained in the recording database, each containing a selected one of the corresponding common spectral structures and common formant structures; (3) a request receiver, coupled to the recording database, that receives a remote request for one of the plurality of recordings; and (4) a transmitter, coupled to the recording database, that transmits the one of the plurality of recordings in response to the request.

[0015] Accordingly, the present invention provides what is substantially equivalent to "audio on demand," in which formatted audio files are provided to remote "radios" that can synthesize the audio on the fly. Thus, in one embodiment of the present invention, the infrastructure further includes a plurality of remote radios that can receive and digitally manipulate any of the plurality of recordings. A remote radio may include software that is downloaded to, and executed on, data processing and storage hardware to enable one of the plurality of recordings to be played. This infrastructure stands in stark contrast to the conventional analog AM or FM radio infrastructure, in which remote radios simply demodulate and amplify the radio waves they receive.

[0016] In one embodiment of the present invention, the transmitter broadcasts one of the plurality of recordings to the receivers. Alternatively, any of the plurality of recordings can be addressed to an individual remote "radio."

[0017] In one embodiment of the present invention, any of the plurality of recordings is embodied in a plurality of bitstream files. The bitstream files contain data relating to the fundamental frequency and the selected structure described above.

[0018] In one embodiment of the present invention, the recording database contains a record of the requests. This makes it possible to track the popularity of songs and the reach of advertisements, and to calculate accurate royalty payments automatically.
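As a rough illustration of how a stored request record could drive such a calculation, the following sketch tallies requests per recording and multiplies by a per-request rate. The log layout, the identifiers and the rates are hypothetical; the source does not specify how requests are stored or how royalties are priced.

```python
from collections import Counter

def tally_requests(request_log):
    """Count how many times each recording was requested."""
    return Counter(recording_id for _client_id, recording_id in request_log)

def royalties_due(counts, rate_cents_per_request):
    """Royalty per recording in integer cents (integers avoid rounding drift)."""
    return {rec: n * rate_cents_per_request.get(rec, 0)
            for rec, n in counts.items()}

# Hypothetical request record: (client id, recording id) pairs.
request_log = [
    ("radio-1", "song-a"),
    ("radio-2", "song-a"),
    ("radio-1", "ad-7"),
    ("radio-3", "song-a"),
]
rates = {"song-a": 8}  # cents per request (assumed); unlisted items earn nothing

counts = tally_requests(request_log)
due = royalties_due(counts, rates)
most_requested = counts.most_common(1)[0][0]
```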

[0019] The foregoing has outlined, rather broadly, preferred and alternative features of the present invention so that those skilled in the art may better understand the detailed description of the invention that follows. Additional features of the invention that form the subject of the claims are described below. Those skilled in the art should appreciate that they can readily use the disclosed concepts and specific embodiments as a basis for designing or modifying other structures for carrying out the same purposes as the present invention. Those skilled in the art should also realize that such equivalent constructions do not depart from the spirit and scope of the invention in its broadest form.

[0020]

DETAILED DESCRIPTION OF THE INVENTION Referring first to FIG. 1, there is shown a block diagram 100 of a resolution-independent system for recording a musical instrument, constructed in accordance with the principles of the present invention. The resolution-independent system of block diagram 100 includes a sound source 110, a sampler 120, a frame generator 130, a frame analyzer 140, a first storage unit 150, a mapping circuit 160, an editor 170 and a second storage unit 180.

[0021] The present invention provides systems and methods for recording and synthesizing sound in a resolution-independent manner, and an infrastructure for distributing resolution-independent recordings for remote playback. In this embodiment, a musical instrument may be used to produce the sound. Those skilled in the art are familiar with the formant content of certain musical instruments, such as stringed and wind instruments. The human voice also contains certain formants, which can be captured and employed in the later synthesis used to play back the recording. The present invention can, however, be applied to any sound.

[0022] After sampler 120 performs the sampling, which may occur at 1 ms intervals, frame generator 130 extracts the fundamental frequency. Those skilled in the art will understand, however, that the present invention is not limited to a particular sampling frequency or to the use of a separate sampler as illustrated. Sampler 120 may be included as part of frame generator 130. This embodiment allows the data comprising the sound source to be independent of the sampling rate or the desired tempo. Frame generator 130 generates frames, extracting the fundamental frequency and the spectral envelope from sound source 110 through sampler 120. Frame analyzer 140 then identifies a selected one of the common spectral structures and common formant structures in the frames and generates a record containing the selected structure and the appropriate fundamental frequency. The record is then stored in first storage unit 150.
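The framing stage described above can be sketched as follows. The source does not say how the fundamental frequency or the spectral envelope is extracted, so this sketch assumes autocorrelation for the pitch estimate and takes the amplitudes of the first few harmonics as a crude envelope. The `Frame` type, the 8 kHz sample rate and the 50 ms frame length are likewise illustrative assumptions.

```python
import math
from dataclasses import dataclass

@dataclass
class Frame:
    start_s: float         # start time of the frame, in seconds
    fundamental_hz: float  # estimated fundamental frequency
    envelope: tuple        # amplitudes of harmonics 1..n (crude spectral envelope)

def estimate_fundamental(frame, sample_rate, min_hz=50.0):
    """Autocorrelation pitch estimate: strongest peak after the first negative lag."""
    n = len(frame)
    max_lag = min(int(sample_rate / min_hz), n - 1)
    acf = [sum(frame[i] * frame[i + lag] for i in range(n - lag))
           for lag in range(max_lag + 1)]
    # Skip the lobe around lag 0, which always dominates the autocorrelation.
    start = next((lag for lag, v in enumerate(acf) if v < 0), None)
    if start is None:
        return None  # no periodicity found (for example silence or DC)
    best_lag = max(range(start, max_lag + 1), key=lambda lag: acf[lag])
    return sample_rate / best_lag

def harmonic_amplitude(frame, sample_rate, freq_hz):
    """Amplitude of one sinusoidal component, by correlation with cos and sin."""
    w = 2 * math.pi * freq_hz / sample_rate
    re = sum(x * math.cos(w * i) for i, x in enumerate(frame))
    im = sum(x * math.sin(w * i) for i, x in enumerate(frame))
    return 2 * math.hypot(re, im) / len(frame)

def make_frames(samples, sample_rate, frame_len, n_harmonics=3):
    """Cut the samples into discrete frames and describe each one."""
    frames = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        chunk = samples[start:start + frame_len]
        f0 = estimate_fundamental(chunk, sample_rate)
        if f0 is None:
            continue
        env = tuple(round(harmonic_amplitude(chunk, sample_rate, k * f0), 3)
                    for k in range(1, n_harmonics + 1))
        frames.append(Frame(start / sample_rate, f0, env))
    return frames

# Hypothetical source: 100 Hz tone with a half-amplitude second harmonic.
RATE = 8000
signal = [math.sin(2 * math.pi * 100 * t / RATE)
          + 0.5 * math.sin(2 * math.pi * 200 * t / RATE)
          for t in range(800)]
frames = make_frames(signal, RATE, frame_len=400)
```

Each resulting frame carries only a frequency and an envelope, not raw samples, which is what lets later stages ignore the original sampling rate.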

[0023] The present invention therefore introduces the broad concept of recording the fundamental frequencies and selected structures in a sound and generating a record containing those fundamental frequencies and selected structures, thereby providing a basis for the subsequent synthesis used in playback. The present invention analyzes the sound as a continuous performance, independent of individual tones or notes. The frames may be discrete, so that they correspond to discrete periods of time. Frame analyzer 140 identifies the selected one of the common spectral structures and common formant structures by Fourier-analyzing the frames. Those skilled in the art are familiar with the fast Fourier transform (FFT) technique for frequency analysis. The present invention can equally be used with other conventional or later-developed spectral analysis techniques, such as wavelets.

[0024] In this embodiment, the common spectral structures or formant structures can be included in a dictionary to compress the overall size of the record. Identifying and grouping common spectral or formant structures organizes them into a record structure that can be accessed in much the same way as the words of a dictionary, here used to reconstruct a particular sound composition. The dictionary may be a custom collection of the common structures associated with the sound being sampled, framed and analyzed. The dictionary may, however, also contain the common structures of a broader collection of sound sources that have been recognized and tagged as corresponding to the particular sound sources to be addressed. It may further be a collection, such as a dictionary, containing appropriate spectral and formant structures that have been recognized and tagged.
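A minimal sketch of the dictionary idea follows, assuming each frame is a pair of a fundamental frequency and an envelope tuple. The data layout and function names are hypothetical; the source does not describe the record format. Repeated structures are stored once, and the record refers to them by index.

```python
def build_record(frames):
    """Replace each frame's envelope with an index into a shared dictionary."""
    dictionary = []  # unique envelopes, in order of first appearance
    index_of = {}    # envelope -> position in `dictionary`
    record = []      # one (fundamental_hz, dictionary_index) entry per frame
    for fundamental_hz, envelope in frames:
        if envelope not in index_of:
            index_of[envelope] = len(dictionary)
            dictionary.append(envelope)
        record.append((fundamental_hz, index_of[envelope]))
    return dictionary, record

def expand_record(dictionary, record):
    """Inverse of build_record: look each envelope back up by index."""
    return [(f0, dictionary[idx]) for f0, idx in record]

# Hypothetical frames: five frames, but only two distinct envelopes.
frames = [
    (220.0, (1.0, 0.5, 0.25)),
    (220.0, (1.0, 0.5, 0.25)),
    (247.0, (1.0, 0.5, 0.25)),
    (262.0, (0.8, 0.6, 0.1)),
    (220.0, (1.0, 0.5, 0.25)),
]
dictionary, record = build_record(frames)
```

The saving grows with the number of frames that share a structure; a dictionary prepared in advance for a known instrument would play the same role as the one built here on the fly.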

[0025] The resolution-independent system of block diagram 100 further includes mapping circuit 160, which applies a time quantization map to the record. Once the record has been generated, the present invention can apply a wide range of conventional and later-developed sound manipulation techniques. The time quantization map allows the record to be synthesized with the ability to change tempo or key, thereby providing different "feel factors." It also provides the ability to synchronize the playback performance with an external clock when other factors so require.
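The source does not define the time quantization map in detail. One plausible reading, sketched here purely as an assumption, is a mapping that rescales each frame's timestamp for the new tempo, snaps it to the tick grid of an external clock, and scales the fundamental frequency for a key change. Because the record holds frequencies rather than samples, tempo and key can be changed independently of each other.

```python
def apply_time_map(record, tempo_ratio=1.0, semitones=0, tick_s=0.001):
    """Retime and transpose a record of (time_s, fundamental_hz, envelope_id) frames.

    tempo_ratio > 1 plays faster; semitones shifts the key in equal temperament;
    tick_s is the period of the (assumed) external clock the output is snapped to.
    Returns (tick_index, fundamental_hz, envelope_id) tuples.
    """
    pitch_ratio = 2 ** (semitones / 12)
    mapped = []
    for time_s, fundamental_hz, envelope_id in record:
        ticks = round((time_s / tempo_ratio) / tick_s)  # quantize to the clock grid
        mapped.append((ticks, fundamental_hz * pitch_ratio, envelope_id))
    return mapped

# Hypothetical record: three frames over one second.
record = [(0.0, 220.0, 0), (0.5, 220.0, 0), (1.0, 330.0, 1)]

# Double the tempo and raise the key by one octave, on a 1 ms clock.
mapped = apply_time_map(record, tempo_ratio=2.0, semitones=12, tick_s=0.001)
```

Expressing the output times as integer tick counts is a design choice of this sketch: it makes synchronization with an external clock a matter of counting ticks rather than comparing floating-point times.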

[0026] The resolution-independent system of block diagram 100 further includes editor 170, which allows a sound, once recorded, to be manipulated further. Editor 170 can modify the content or the order of the record. Where required, the spectral structures or formant structures can be altered to provide effects not present in the original sound source. Editor 170 can also be used to rearrange the sequence of records in time relative to the frequency contour. The resolution-independent system then stores these mapped and edited alternative records in second storage unit 180.

[0027] Turning now to FIG. 2, there is shown a flowchart 200 of a resolution-independent method of recording a musical instrument that can be carried out in the system of FIG. 1. Flowchart 200 illustrates one example of a method of recording sound, which includes extracting a fundamental frequency and a spectral envelope from the sound and generating frames from that fundamental frequency and spectral envelope. One of the common spectral structures and common formant structures in the frames is then identified and selected, and a record containing both the fundamental frequency and the selected structure is generated.

[0028] The method begins at step 205, where a decision is made to create a record. A sound source, which includes an instrument producing the sound, is selected at step 210. The sound is sampled at step 215, after which the fundamental frequency is extracted from the sampled sound signal. Frames, which may be discrete, are then generated at step 220 from the extracted fundamental frequency and spectral envelope. The frames are analyzed at step 225 using Fourier analysis, and the common spectral and formant structures are identified at step 230. The common structures are then stored at step 235, which generates the record of the common structures. A time quantization map may then be applied to the record at step 240, or a selected one of the content and the order of the record may be modified at step 245 to edit the record as needed. The method ends at step 250; in this embodiment the sound has by then been selected, sampled, framed, analyzed, recorded, and either mapped or modified.

[0029] Turning now to FIG. 3, there is shown a block diagram 300 of a resolution-independent system for synthesizing a recorded musical instrument, constructed in accordance with the principles of the present invention. The resolution-independent system of block diagram 300 includes storage unit 305, mapping circuit 310, editor 315, waveform shaper 320, output device 325 and speaker 330. Storage unit 305 contains the records and the dictionary generated by sampling, framing and analyzing a sound source as described above with respect to FIGS. 1 and 2.

[0030] Mapping circuit 310 applies a time quantization map to the record. As stated previously, the present invention can apply a wide range of conventional and later-developed sound manipulation techniques, allowing the record to be synthesized with an altered tempo or key and providing the ability to synchronize the playback performance with an external clock. Further manipulation of the recorded sound is provided by editor 315, which can modify the content or the order of the record to provide effects not present in the original sound source. As described above, editor 315 can also be used to rearrange the sequence of records in time relative to the frequency contour.

[0031] Waveform shaper 320 is coupled to storage unit 305, mapping circuit 310 and editor 315. Waveform shaper 320 receives the fundamental frequency and applies a waveshaping transfer function to generate a waveform from the stored record, the mapped record or the edited record. Waveform shaper 320 may also generate the waveform using some combination of the three. In addition, waveform shaper 320 may select from among several waveshaping transfer functions stored within it to accommodate the waveshaping process. In this embodiment, waveform shaper 320 is driven by an external clock so that the waveform is synchronized with external events. The waveform is then converted to output sound using output device 325 and speaker 330. The synthesis process defined here allows the originally recorded sound to be reproduced with appropriate fidelity, or to be altered as appropriate for the user. This does not preclude the use of other synthesis techniques, such as FFT-based or direct sinusoidal synthesis.
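One well-known way to realize a waveshaping transfer function is with Chebyshev polynomials, since T_k(cos θ) = cos(kθ): driving a weighted sum of T_k with a cosine at the fundamental yields harmonics with exactly those weights. The source does not state which transfer functions waveform shaper 320 uses, so the sketch below is an assumption, as are the function names and the example amplitudes.

```python
import math

def chebyshev(k, x):
    """Chebyshev polynomial of the first kind, T_k(x), by the standard recurrence."""
    t_prev, t_cur = 1.0, x
    if k == 0:
        return t_prev
    for _ in range(k - 1):
        t_prev, t_cur = t_cur, 2 * x * t_cur - t_prev
    return t_cur

def make_transfer_function(harmonic_amplitudes):
    """Transfer function whose output contains harmonic k with amplitude a_k."""
    def transfer(x):
        return sum(a * chebyshev(k, x)
                   for k, a in enumerate(harmonic_amplitudes, start=1))
    return transfer

def synthesize(fundamental_hz, harmonic_amplitudes, duration_s, sample_rate):
    """Shape a cosine at the fundamental into a waveform at any output rate."""
    transfer = make_transfer_function(harmonic_amplitudes)
    n = round(duration_s * sample_rate)
    return [transfer(math.cos(2 * math.pi * fundamental_hz * i / sample_rate))
            for i in range(n)]

# Hypothetical record entry: 100 Hz fundamental with three harmonic amplitudes.
samples = synthesize(100.0, (1.0, 0.5, 0.25), duration_s=0.01, sample_rate=8000)
```

Note that `sample_rate` is chosen at synthesis time; the same record entry could be rendered at 8 kHz or 48 kHz, which is one way to read the "resolution-independent" claim.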

[0032] Turning now to FIG. 4, there is shown a flowchart 400 of a resolution-independent method of synthesizing a recorded musical instrument that can be carried out in the system of FIG. 3. The method shown in flowchart 400 makes it possible to apply a time quantization map, or to modify the content or order of a selected record, when generating and playing back sound recorded in accordance with the present invention.

[0033] The method begins at start step 405, where a decision is made to synthesize a record, and the record is selected at step 410. The record is then processed with a time quantization map at step 415 and modified by an editing function at step 420. The selected, mapped and edited record is then waveshaped at step 425, and the waveshaped record is delivered for playback at step 430. The method ends at step 435.

[0034] Turning now to FIG. 5, there is shown a block diagram 500 of a communications infrastructure capable of distributing resolution-independent recordings for remote playback. Block diagram 500 includes wireless server 505 and interactive music player 510, shown in FIG. 5A, and random access playlist 515, download function 520, player 525 and speaker 530 for sound reproduction, shown in FIG. 5B.

[0035] The present invention further provides an infrastructure for distributing recordings for remote playback. One infrastructure includes wireless server 505, depicted as a radio station, which has an associated recording database and a plurality of recordings contained in that database. Each of the recordings contains selected ones of the common spectral structures and common formant structures corresponding to a record of each individual instrument or vocalist in the recording; these records are resynthesized and combined in real time at playback.
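Because each instrument or vocalist is kept as its own record, the final combination can happen at the player. The sketch below shows only that last step, under the assumption that each record has already been resynthesized to an equally long list of samples; the track names and the per-track gain parameter are hypothetical and stand in for whatever listener adaptation a player might offer.

```python
def mix(tracks, gains=None):
    """Sum equally long per-instrument sample lists, with an optional gain per track."""
    if not tracks:
        return []
    gains = gains or {}
    length = len(next(iter(tracks.values())))
    return [sum(gains.get(name, 1.0) * samples[i]
                for name, samples in tracks.items())
            for i in range(length)]

# Hypothetical resynthesized tracks (values chosen to be exact in binary).
tracks = {
    "guitar": [0.25, 0.5, -0.25],
    "vocal": [0.5, 0.0, 0.5],
}
full_mix = mix(tracks)
instrumental = mix(tracks, gains={"vocal": 0.0})  # listener mutes the vocalist
```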

[0036] Interactive music player 510 issues requests to a corresponding request receiver, which is coupled to the recording database associated with wireless server 505 and receives remote requests for various ones of the plurality of recordings. A transmitter coupled to the recording database is also associated with wireless server 505 and transmits the recordings in response to the requests. There is an additional one-way mode, in which the receiver waits until the desired selection is transmitted and then downloads and plays it.

[0037] Accordingly, the present invention provides the equivalent of audio on demand, in which formatted audio files are provided to remote radios or players (acting as "client" receivers) that can synthesize the audio on the fly. In this embodiment of the present invention, the infrastructure further includes a plurality of remote digital radios that can receive and digitally manipulate the plurality of recordings. This infrastructure stands in stark contrast to the conventional analog AM or FM radio infrastructure, in which remote radios simply demodulate and amplify the radio waves they receive. The infrastructure can work with any currently proposed or later-developed digital transmitter and receiver standard. The program material for the recordings may include, but is not limited to, weather forecasts, news, stock information or other topical information.

【００３８】ランダム・アクセスの演奏リスト５１５は
複数のビット・ストリーム・ファイルに具体化されてい
る複数の録音を表す。そのビット・ストリーム・ファイ
ルは上記のように基本周波数及び選択された周波数に関
係するデータを含んでいる。選択の集合または提案の集
合を表す可能性があるビット・ストリーム・ファイル
は、単独のシリアル・ループの中で発生する可能性があ
る。ユーザはこれらのうちのどれかを選択し、それらを
ダウンロードして再生することができる。代わりに、ダ
ウンロード５２０をより迅速にユーザが実行することが
できる並列ループの集合の中にそれらを発生させること
ができる。無線サーバ５０５に関連付けられている送信
機は複数の録音のどれかをすべての遠隔ラジオに対して
放送することができる。代わりに、複数の録音のどれか
を個々の遠隔ラジオに対してだけ向けることができる。
録音データベースは要求のレコードを含むことができ
る。これによって歌の流行（popurality)又は広告の普
及度を追跡することができ、正確なロイヤルティ支払い
金額を自動的に計算することができる。The random access playlist 515 represents a plurality of recordings embodied in a plurality of bit stream files. As described above, each bit stream file contains data relating to the fundamental frequency and the selected frequency. The bit stream files, which may represent a set of selections or a set of offerings, can be generated in a single serial loop. The user can select any of them, then download and play it. Alternatively, the files can be generated in a set of parallel loops, which lets the user perform the download 520 more quickly. A transmitter associated with the wireless server 505 can broadcast any of the plurality of recordings to all remote radios. Alternatively, any of the recordings can be directed only to individual remote radios. The recording database may include a record of the requests. This makes it possible to track the popularity of a song or the reach of an advertisement and to calculate accurate royalty payments automatically.
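
The request record and royalty calculation can be sketched as follows. This is a minimal illustration, assuming a flat royalty rate per request for each recording; the patent does not state a royalty formula, and the class `RecordingDatabase`, its methods, and the example rates are all hypothetical.

```python
from collections import Counter
from decimal import Decimal


class RecordingDatabase:
    """Keeps a count of requests per recording (hypothetical model)."""

    def __init__(self, royalty_per_request):
        # Assumption: each recording has a flat royalty owed per request.
        self._royalty_per_request = dict(royalty_per_request)
        self._requests = Counter()

    def log_request(self, recording_id):
        """Record one remote request for a known recording."""
        if recording_id not in self._royalty_per_request:
            raise KeyError(f"unknown recording: {recording_id}")
        self._requests[recording_id] += 1

    def popularity(self):
        """Return (recording_id, request_count) pairs, most requested first."""
        return self._requests.most_common()

    def royalties_due(self):
        """Return the royalty owed per recording under the flat-rate assumption."""
        return {
            rid: self._royalty_per_request[rid] * count
            for rid, count in self._requests.items()
        }


db = RecordingDatabase({"song-a": Decimal("0.05"), "ad-b": Decimal("0.01")})
for rid in ["song-a", "ad-b", "song-a", "song-a"]:
    db.log_request(rid)
ranking = db.popularity()
owed = db.royalties_due()
```

`Decimal` is used instead of floating point so that monetary totals stay exact when many small per-request amounts are added up.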

【図面の簡単な説明】[Brief description of the drawings]

【図１】本発明の原理に従って構築された楽器の録音の
ための、分解能に無関係なシステムのブロック図を示
す。FIG. 1 illustrates a block diagram of a resolution-independent system, constructed in accordance with the principles of the present invention, for recording a musical instrument.

【図２】図１のシステムにおいて実行することができる
楽器の録音の、分解能に無関係な方法のフローチャート
を示す。FIG. 2 shows a flowchart of a resolution-independent method of recording an instrument that can be performed in the system of FIG. 1.

【図３】本発明の原理に従って構築された、レコードさ
れた楽器を合成するための、分解能に無関係なシステム
のブロック図を示す。FIG. 3 illustrates a block diagram of a resolution-independent system for synthesizing a recorded instrument constructed in accordance with the principles of the present invention.

【図４】図３のシステムにおいて実行することができる
レコードされた楽器の合成のための、分解能に無関係な
方法のフローチャートを示す。FIG. 4 shows a flowchart of a resolution-independent method for synthesizing a recorded instrument that can be performed in the system of FIG. 3.

【図５Ａ】リモートの再生のために、分解能に無関係な
録音を分配することができる通信インフラストラクチャ
のブロック図を示す。FIG. 5A shows a block diagram of a communication infrastructure that can distribute resolution-independent recordings for remote playback.

【図５Ｂ】リモートの再生のために、分解能に無関係な
録音を分配することができる通信インフラストラクチャ
のブロック図を示す。FIG. 5B shows a block diagram of a communication infrastructure that can distribute resolution-independent recordings for remote playback.

Claims

1. A system for recording sound, comprising: a frame generator for extracting a fundamental frequency and a spectral envelope from the sound and generating a frame therefrom; and a frame analyzer for identifying a selected one of a common spectral structure and a common formant structure in the frame and generating a record including the fundamental frequency and the selected one.

2. The system according to claim 1, wherein said frames are discrete.

3. The system according to claim 1, wherein a musical instrument produces said sound.

4. The system of claim 1, wherein the frame generator samples the sound before extracting the fundamental frequency from the sound.

5. The system according to claim 1, further comprising a mapping circuit for applying a time quantization map to said record.

6. The system according to claim 1, further comprising an editor for changing a selected one of the contents and order of the record.

7. The system of claim 1, wherein the frame analyzer identifies the selected one of the common spectral structure and the common formant structure by performing a Fourier analysis on the frame.

8. A method for recording a sound, comprising: extracting a fundamental frequency and a spectral envelope from the sound; generating a frame from the fundamental frequency and the spectral envelope; identifying a selected one of a common spectral structure and a common formant structure in the frame; and generating a record containing the fundamental frequency and the selected one.

9. The method according to claim 8, wherein the frames are discrete.

10. The method of claim 8, wherein said sound is generated by a musical instrument.

11. The method according to claim 8, further comprising the step of sampling said sound before extracting said fundamental frequency from said sound.

12. The method of claim 8, further comprising applying time quantization to the record.

13. The method of claim 8, further comprising the step of changing a selected one of the contents and order of the record.

14. The method of claim 8, further comprising performing a Fourier analysis on the frame.

15. An infrastructure for distributing recordings for remote playback, comprising: a radio station having an associated recording database; a plurality of recordings contained within the recording database, each of the plurality of recordings including a fundamental frequency and a corresponding selected one of a common spectral structure and a common formant structure; a request receiver, coupled to the recording database, for receiving a remote request for any of the plurality of recordings; and a transmitter, coupled to the recording database, for transmitting any of the plurality of recordings in response to the request.

16. The infrastructure of claim 15, wherein the transmitter broadcasts any of the plurality of recordings to a receiver.

17. The infrastructure of claim 15, further comprising a plurality of remote radios capable of receiving and digitally manipulating any of the plurality of recordings.

18. The infrastructure of claim 15, wherein the plurality of recordings are generated by a musical instrument.

19. The infrastructure of claim 15, wherein any of the plurality of recordings is embodied in a plurality of bitstream files.

20. The infrastructure of claim 15, wherein the recording database includes a record of the request.

21. A radio, comprising: a receiver for receiving a recording including a fundamental frequency and a corresponding selected one of a common spectral structure and a common formant structure; a waveform shaper, coupled to the receiver, for applying a waveform shaping transfer function based on the selected one to the fundamental frequency to generate a waveform; and a speaker, coupled to the waveform shaper, for converting the waveform to an output sound.

22. The radio according to claim 21, further comprising a mapping circuit, coupled to said receiver, for applying time quantization to the selected one.

23. The radio according to claim 21, wherein a transfer function of the waveform shaper is selected from a plurality of waveform shaping transfer functions stored in the waveform shaper.

24. The radio according to claim 21, wherein the waveform shaper is externally clocked.

25. The radio according to claim 21, further comprising an editor, coupled to said receiver, for altering the content of said recording.