WO1999065016A1 - Synthesizer system utilizing mass storage devices for real time, low latency access of musical instrument digital samples - Google Patents

Synthesizer system utilizing mass storage devices for real time, low latency access of musical instrument digital samples Download PDF

Info

Publication number
WO1999065016A1
WO1999065016A1 PCT/US1999/013182 US9913182W WO9965016A1 WO 1999065016 A1 WO1999065016 A1 WO 1999065016A1 US 9913182 W US9913182 W US 9913182W WO 9965016 A1 WO9965016 A1 WO 9965016A1
Authority
WO
WIPO (PCT)
Prior art keywords
digital
stream
digital waveform
memory
waveform signals
Prior art date
Application number
PCT/US1999/013182
Other languages
French (fr)
Inventor
James E. Van Buskirk
Joseph A. Bibbo
Original Assignee
Conexant Systems, Inc.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority claimed from US09/095,519 external-priority patent/US6008446A/en
Application filed by Conexant Systems, Inc. filed Critical Conexant Systems, Inc.
Priority to JP2000553943A priority Critical patent/JP2002518693A/en
Priority to EP99927461A priority patent/EP1101217A1/en
Publication of WO1999065016A1 publication Critical patent/WO1999065016A1/en

Links

Classifications

    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10HELECTROPHONIC MUSICAL INSTRUMENTS; INSTRUMENTS IN WHICH THE TONES ARE GENERATED BY ELECTROMECHANICAL MEANS OR ELECTRONIC GENERATORS, OR IN WHICH THE TONES ARE SYNTHESISED FROM A DATA STORE
    • G10H7/00Instruments in which the tones are synthesised from a data store, e.g. computer organs
    • G10H7/02Instruments in which the tones are synthesised from a data store, e.g. computer organs in which amplitudes at successive sample points of a tone waveform are stored in one or more memories

Definitions

  • the present invention is related generally to digital sound synthesizer systems. More particularly, the present invention is related to a sound synthesizer system utilizing mass storage media or devices.
  • Digital synthesizers or other electronic systems generally create sound or music and can be utilized as an electronic musical instrument or electronic sound machine.
  • Digital synthesizers are ordinarily arranged to accept input signals from a musician or operator interface and produce digital output signals representing analog signals in the audio frequency range.
  • the interface can provide keystroke signals, mouse signals, touch pad signals, or keyboard signals representing the musician's action.
  • the digital output signals from the digital synthesizer can be converted to analog signals and directed to equipment such as a loudspeaker, tape recorder, mixer, or other device and reproduced in the form of sound.
  • the digital synthesizer can be arranged to provide output signals simulating the sounds of conventional, known musical instruments.
  • the synthesizer may be arranged to simulate sounds which would be emitted by a theoretical instrument having predetermined characteristics different from those of any conventional, known musical instruments.
  • cymbal or the like are an aperiodic function which cannot fully be described by any simple mathematical expression. Accordingly, the production or synthesis of digital signals representing sounds as rich and complex as those of a real instrument is a daunting digital signal processing and sampling task. Moreover, the synthesizer must respond to the nuances of the musician's manipulation of the interface. For example, a snare drum has many different audio characteristics when played in various locations such as toward the center of the drum, toward the rim or on the rim. Additionally, the audio characteristics of the snare drum also vary with the striking force, playing technique and stylistic inflection of the musician.
  • a large enough sample database for each of these characteristics and combinations thereof (facilitated by huge sampling capability) is needed to adequately capture the behavior of the instrument.
  • By having a large enough waveform database more realistic and expressive synthesis can be achieved.
  • Many synthesizers using waveform sampling technology are used extensively in the music and multimedia fields for their ability to create musical sounds that closely emulate the sound of a musical instrument.
  • MIDI Musical Instrument Digital Interface
  • the MIDI interface creates control signals or MIDI control data.
  • the MIDI control data represents music events such as the occurrence of specific notes (e.g. , Middle C, to be realized by a specific musical sound, e.g., piano, horn or drum) .
  • Conventional synthesizer systems utilize a large solid state memory which stores the digital waveform signals representing the real sound of each note played on a particular instrument.
  • the memory can be a static random access memory (SRAM) , a dynamic Ram (DRAM) or a read only memory (ROM) .
  • SRAM static random access memory
  • DRAM dynamic Ram
  • ROM read only memory
  • the appropriate waveform signal is selected depending on the key activated and the intensity of the strike.
  • the waveform signal is converted into an analog output signal.
  • the digital waveform signal can be combined with other notes which are simultaneously being played before being converted to the analog output signal.
  • the synthesizer in effect merely plays back digital recordings of individual sounds or notes.
  • Each waveform signal is stored as a series of individual data words, each representing a single sample of the waveform at a particular time.
  • any such stored waveform signal must include thousands of samples per second in stored sound.
  • the memory required to store each waveform signal is substantial, and the solid state memory required to store all the required waveform signals is accordingly extremely large.
  • the solid state memory is required because its speed allows an essentially real time playback (e.g., no audio perceivable delay) .
  • the high cost per unit of sound samples storage of any type of solid state memory has somewhat prohibited the use of large amounts of digital waveform signals for accurate representation of musical instruments and all their nuances.
  • a major limitation of current synthesizers is the lack of sufficient memory to store the entire sample of a wide range of sounds associated with musical instruments (e.g., due to cost).
  • synthesizer systems have used techniques to more efficiently store instrument samples. These techniques generally result in lower quality sound generation.
  • the technique of "looping" reuses samples of the sound. By replicating and reusing groups of samples, the overall memory requirement is reduced.
  • Other techniques store samples in a compressed state in solid state memory.
  • these systems require significant CPU power to implement the decompression
  • the decompression algorithm can be lossy and suffer audio quality degradation.
  • other prior art synthesizer systems have utilized hard disks or other mass storage devices from which the instrument samples are loaded into a solid state memory prior to musical tone generation or playing the instrument. However, these systems still require large amounts of solid state memory because all of the digital waveform signals for the instrument must be loaded into the solid state memory before playing.
  • Some synthesizer systems have utilized ROM or other types of less expensive, slower solid state memory to store the digital waveform signals. These synthesizers employ a data caching technique to move the waveform signals from the slower ROM to a high speed RAM such as a SRAM for tone generation. Nonetheless, the slower ROM and high speed RAM add significant cost to the synthesizer.
  • the present invention relates to an audio processor for providing a digital output signal representative of a sound at an output.
  • the audio processor includes a mass storage device input for receiving a plurality of digital waveform signals, a control input for receiving digital control signals, and a control circuit coupled to the control input and the mass storage device input.
  • the digital control signals are indicative of at least one selected digital waveform signal of the digital waveform signals.
  • the control circuit stores a first part of the digital waveform signals in a host memory.
  • the control circuit provides from the host memory a first part of the selected digital waveform signal to the output in response to the digital control signals and subsequently provides a remaining part of the selected digital waveform signal from the mass storage device input to the output.
  • the control circuit provides the remaining part of the selected digital waveform signal without a loss of continuity between the first part of the selected digital waveform signal and the remaining part of the selected digital waveform signal.
  • the present invention also relates to a digital synthesizer system including a mass storage device means for storing a plurality of digital waveform signals, a control input means for receiving a digital control signals, a host memory means for storing a first part for each of the digital waveform signals and a processor means for generating a digital sound signal at an output.
  • a digital synthesizer system including a mass storage device means for storing a plurality of digital waveform signals, a control input means for receiving a digital control signals, a host memory means for storing a first part for each of the digital waveform signals and a processor means for generating a digital sound signal at an output.
  • a digital synthesizer system including a mass storage device means for storing a plurality of digital waveform signals, a control input means for receiving a digital control signals, a host memory means for storing a first part for each of the digital waveform signals and a processor means for generating a digital sound signal at an output.
  • Each of the digital waveform signals
  • the digital control signal is indicative of a selected sound of the sounds.
  • the processor means provides from the host memory the first part of the digital waveform signal corresponding to the selected sound to the output and provides a second part of the digital waveform signal corresponding to the selected sound from the mass storage device means to the output.
  • the processor means utilizes a caching techniques to provide the second part to the output.
  • the present invention further relates to a memory architecture for a digital synthesizer system.
  • the digital synthesizer system includes a mass storage device and a processor.
  • the mass storage device stores a plurality of digital waveform signals. Each of the digital waveform signals corresponds to a particular sound of a plurality of sounds.
  • the processor receives a digital control signal.
  • the digital control signal is indicative of a selected sound of the sounds.
  • the audio processor generates a digital sound signal at a output in response to the digital control signal.
  • the processor provides a first part of the digital waveform signal corresponding to a selected sound to the output followed by the second part of the digital waveform signal corresponding to the selected sound from the mass storage device.
  • the memory architecture includes a sample buffer and a stream buffer.
  • the sample buffer stores a first part of each of the digital waveform signals.
  • the stream buffer temporarily stores the second part of the digital waveform signal corresponding to the selected sound.
  • the present invention still further relates to a method of digitally synthesizing sounds in a synthesizer system.
  • the synthesizer system includes a mass storage device, a host memory and a processor.
  • the mass storage device stores a plurality of digital waveform signals.
  • the processor has a digital control signal input and a digital output.
  • the processor is coupled to the host memory and the
  • the method includes downloading a first part for each of the digital waveform signals into the host memory, receiving a digital control signal on the digital control signal input, the digital control signal indicating a selected digital waveform signal of the digital waveform signals, providing the first part of the selected digital waveform signal from the host memory to the digital output, and providing a second part of the selected digital waveform signal from the mass storage device to the digital output.
  • the first part and the second part are provided so there is no audio perceivable delay between the first part and the second part.
  • the present invention even further relates to a method of digitally synthesizing sounds in a synthesizer system.
  • the synthesizer system includes a mass storage device, a host memory, and a processor.
  • the mass storage device stores a plurality of digital waveform signals. Each of the digital waveform signals corresponds to a particular sound of a plurality of sounds. The first part for each of the digital waveform signals is stored in the host memory.
  • the processor has a digital control signal input and a digital output. The processor is coupled to the host memory and the mass storage device.
  • the method includes receiving a digital control signal on the digital control signal input, the digital control signal indicating a selected sound of the plurality of sounds, providing a first part of a digital waveform signal corresponding to the selected sound from the host memory to the digital output, and providing a second part of the digital waveform signal from the mass storage device to the digital output corresponding to the selected sound.
  • the first part and the second part are provided so there is no loss of continuity between the first part and the second part.
  • the present invention also relates to an audio processor for providing a digital output signal
  • the audio processor includes a mass storage device input for receiving a plurality of digital waveform signals, a sample pool for storing a first part of the digital waveform signals, a stream buffer, a stream engine coupled to the mass storage device input and to the stream buffer, and a synthesizer engine coupled to the stream buffer and the output.
  • the stream engine provides a second part of the digital waveform signals from the mass storage device input to the stream buffer.
  • the synthesizer engine provides the second part of the digital waveform signals from the stream buffer to the output.
  • a digital synthesizer stores a first part of a digital waveform signal in a host memory and caches the remaining parts of the digital waveform signal from a mass storage device.
  • the digital music synthesizer preferably utilizes a hard disk drive as a storage device and is implemented in software which is run by a computer.
  • the computer includes a central processing unit and host memory operating in a Windows*-based environment.
  • the synthesizer system utilizes an audio processor configured in software.
  • the audio processor includes a solid state sample pool, a stream engine, a synthesizer engine and a solid state stream buffer.
  • the stream engine provides portions of the digital waveform samples or signals from a mass storage device to the stream buffer.
  • the synthesizer engine provides the portions of digital waveform signals from the sample pool and the stream buffer to a digital output.
  • the processor preferably responds to MIDI control signals to select particular digital waveform signals.
  • the sample pool stores a first part of each digital waveform signal stored on a hard drive.
  • the sample pool includes an array of sample cells, each including a deck buffer and a control data buffer.
  • Each cell of the array of stream cells preferably includes a deck buffer pointer, a stream buffer and a control data buffer.
  • the deck buffer stores the first part of the waveform.
  • the stream buffer is utilized to store subsequent parts of the waveform which are stored on the hard drive.
  • Figure 1 is an exemplary block diagram of a digital sound synthesizer system including an audio processor in accordance with an exemplary embodiment of the present invention
  • Figure 2 is a more detailed block diagram showing the audio processor, illustrated in Figure 1, including a stream engine, an array of stream buffer cells, and a sample pool including sample pool cells;
  • FIG. 3 is a more detailed block diagram of the stream engine illustrated in Figure 2;
  • Figure 4 is a block diagram of one of the stream buffer cells illustrated in Figure 2.
  • Figure 5 is a block diagram of one of the sample pool cells illustrated in Figure 2.
  • a digital synthesizer system 10 includes an audio processor 12, a sound selection
  • Audio processor 12 has a digital output 28 coupled to converter 25.
  • Converter 25 has an output coupled to speaker system 26.
  • audio processor 12 receives digital control signals at a digital control input 20 from sound selection source 18 and digital samples or digital waveform signals at a mass storage device input 24 from digital waveform signal source 22. Audio processor 12 processes the digital control signals at input 20 and the digital waveform signals at input 24 to provide a digital output signal at output 28.
  • the digital output signal is provided to converter 25 which provides an analog signal representative of the digital output signal to speaker system 26.
  • processor 12 receives a number of control signals at input 20
  • processor 12 can provide a combination, mixture or a number of different digital output signals at output 28, representative of the digital waveform signals indicated by the control signals at input 20.
  • Speaker system 26 preferably generates sound or music representative of the digital output signal provided at output 28.
  • Converter 25 and speaker system 26 can be replaced with other devices coupled to digital output 28.
  • a mixer, recorder, digital storage device such as a hard disk drive, or other audio apparatus can be coupled to output 28 of processor 12.
  • Speaker system 26 can be a stereo system including two or more speakers for providing a stereo production of sound indicated by the digital output signal which preferably represents stereo sound.
  • Sound selection source 18 can be any device for generating control signals which are most preferably MIDI data or control signals.
  • MIDI control signals are well known in the art and represent music events such as the occurrence of a specific musical note, e.g., middle C, to be realized by a specific musical sound, e.g., piano, horn, drum or other device.
  • audio processor 12 may be utilized with a different type of sound selection source 18 which provides control signals to input 20.
  • Sound selection source 18 can be a mouse interface, piano keyboard, electronic drum pad, key pad, touch pad, button interface, or other musician interface for providing digital control signals to input 20.
  • Sound waveform source 22 is preferably a mass storage device such as a hard disk drive, optical disk drive, tape source, floppy disk drive, or other memory device. Sound waveform source 22 stores a wide range of digital samples of one or more musical instruments. For example, sound waveform source 22 can store approximately 1100 independent digital waveform signals associated with a grand piano.
  • the number of waveform signals can be adjusted for various applications and instruments.
  • the discussion of the use of system 10 to reproduce sounds of a grand piano is not for a limiting purpose and is merely discussed as an example.
  • the types of notes, types of sources, types of instruments, and size of samples are discussed only in an exemplary fashion in this patent application.
  • System 10 can be implemented on a PC platform, as an embedded system, as a music synthesizer, as a dedicated rack mount music sampler, or as another sound generation device.
  • system 10 is a computer based system comprised of a CPU, a hard disk drive and host memory.
  • the CPU is configured by software to perform the operations of audio processor 12 described throughout the present application.
  • system 10 can be a dedicated digital signal processor system, a dedicated general purpose processor system, or a hardware system.
  • the host memory is preferably at least 32 megabytes of DRAM, although other types of solid state memory can be used.
  • the host memory can be combinations of ROM, SRAM, flash memory or other devices.
  • the CPU is preferably a 486 or above processor.
  • System 10 is configured to provide digital waveform samples or signals stored in source 22 in real
  • audio processor 12 provides digital output signals representative of those manipulations at output 28 without an audio perceivable latency between the manipulation of source 18 and the hearing the music from speaker system 26.
  • audio processor 12 is able to provide the digital output signals in less than 15 milliseconds from the manipulation of source 18.
  • an audio perceivable delay refers to a loss of continuity, an objectionable latency, a stall, or a detainment of the music or sound as heard by the human ear.
  • audio perceivable delay is used in two different contexts.
  • the first context is related to the time difference discussed above between a musician's actuations and the production of sound.
  • the second context is related to the continuity of signals between the first part and remaining parts of the digital waveform signals. Also, the second context is related to the continuity of signals between portions of the remaining parts of the digital waveform signals.
  • the second context requires that the delay be less than several micro-seconds (e.g., so that there is no loss of continuity) to prevent noise artifacts, which are detectable by the human ear.
  • System parameters and application criteria can affect the definition of audio perceivable delay and the
  • system 10 can advantageously utilize a slower speed, lower cost, mass storage device as source 22 (e.g., a hard disk drive) , and yet provide digital output signals at output 28 in a sufficient amount of time.
  • source 22 e.g., a hard disk drive
  • the use of source 22 to store the bulk of the digital waveform signals greatly reduces the size of the solid state memory (not shown) required by processor 12 (e.g., and hence the cost of processor 12) for real time production of sound.
  • processor 12 utilizes a caching system so that most of the digital waveform signals are not stored in the solid state memory of processor 12. Most portions of the digital waveform signals are only stored in solid state memory when actually needed.
  • processor 12 utilizes its solid state memory (not shown) to store only a first portion of each of the digital waveform signals stored in source 22.
  • processor 12 downloads the first portion for each digital waveform signal stored for the selected instrument.
  • the musician may choose a portion of an instrument or
  • processor 12 can download a first portion of each digital waveform in accordance with the musician's choice. Therefore, the first portions of digital waveforms signals stored in the solid state memory of audio processor 12 preferably correspond on a one-to-one basis to the digital waveform signals stored on source 22 for a selected instrument or group of instruments.
  • Audio processor 12 responds to the digital control signals provided at input 20. Audio processor 12 provides the first portions of a selected digital waveform signal from its solid state memory to output 28 in response to the digital control signal provided at input 20. Preferably, at least the first 0.5 seconds of the digital waveform signal is stored in the solid state memory. The remaining portions of the selected digital waveform signal are then provided from source 22 to processor 12 and to output 28. The remaining portions can be retrieved while the first portion is being played. The remaining portion can include portions of the digital waveform signals representing up to three minutes of sound. Therefore, processor 12 utilizes a caching mechanism to reduce or eliminate an audio perceivable delay or loss of continuity between providing a first portion of digital waveform signals stored in solid state memory and remaining portions stored in source 22.
  • Audio processor 12 has input 24 coupled to source 22 which is shown in FIGURES 2 and 3 as a hard disk drive.
  • Audio processor 12 includes a sample pool 40, an array 42 of stream cells 43, a command parser 50, a command queue 52, a synthesizer engine 66, a download module 64, a system thread manager 68, and a stream engine 80.
  • Command parser 50 is coupled to input 20 and command queue 52.
  • Command queue 52 is coupled to a control input of synthesizer engine 66.
  • Input 24 is coupled to download module 64 and stream engine 80.
  • Download module 64 is coupled to sample pool 40.
  • Sample pool 40 is comprised of a plurality of sample pool cells 41. Sample pool cells 41 are provided for each of digital waveform signals for a particular instrument, portion of an instrument, or group of instruments selected by the musician. Sample pool 40 is configurable for a various number of cells. In the grand piano example, pool 40 has approximately 1100 cells 41.
  • Stream engine 80 is also coupled to system thread manager 68.
  • System thread manager 68 operates as a master to stream engine 80.
  • Manager 68 ensures that stream engine 80 receives the necessary amount of operational time of the CPU (not shown) utilized to implement processor 12 by cooperating with the operating system of the computer.
  • Manager 68 preferably keeps stream engine 80 consistently
  • manager 68, engine 66, engine 80, module 64, and parser 50 are preferably implemented by the CPU of the PC platform (not shown) .
  • the storage operations of array 42, pool 40 and queue 52 are provided by the host memory for the PC platform.
  • Stream engine 80 is also coupled to array 42.
  • Stream engine 80 is capable of providing portions of digital waveform signals from source 22 to stream cell array 42.
  • each of cells 43 includes a stream buffer 47 and a run-time control data buffer 49.
  • Stream cell array 42 is preferably scalable from approximately 24 to 256 cells although any size of array 42 is possible.
  • Each of cells 43 represents a single voice or stream of polyphony where two voice polyphony is representative of a concurrent musical tone (e.g., the combination of a C and E note). Therefore, a one-to-one voice to stream cell 43 relationship is utilized by processor 12.
  • Stream engine 80 manages the flow of data or portions of digital waveform signals into stream cell array 42.
  • Engine 80 maintains streams of data or portions of signals from source 22 to cells 43.
  • the streams of data can be
  • Stream engine 80 is preferably a state machine configured to move portions of digital waveform signals from source 22 to array 42 so processor 12 can create sound in a real time or near real time fashion. As stream engine 80 provides the portions of digital waveform signals to cells 43 in array 42, synthesizer engine 60 consumes the portion of digital waveform signals in cells 43 and provides a digital waveform output representative of a selected sound at output 28.
  • stream engine 80 fills cells 43 in array 42 with portions of the digital waveform signals, stream engine 80 provides synchronization and error correction, and decompression operations such as a lossless second order decompression on the signals.
  • stream engine 80 is shown and described in the application as a throughput from input 24 to array 42 for discussion purposes, stream engine 80 in actuality moves data from source 22 to array 42 (e.g., performs more of a transmission management operation of signals from source 22 to array 42) .
  • download module 64 receives a first part for each of the digital waveform signals for the particular instrument from source 22.
  • Download module 64 stores the first part of the digital
  • sample pool cells 41 in sample pool 40 include a deck buffer 51 and a global control data buffer 53.
  • deck buffer 51 of sample cells 41 stores the first 0.5 seconds (approximately 64K bytes depending on characteristics of system 10 such as RAM size, CPU speed and hard disk speed) of each of the digital waveform signals.
  • source 22 and engine 80 can provide signals to array 42 more quickly.
  • Global control data buffer 53 includes control information or global control data.
  • Global control data refers to control data for a particular sample. Global control data is available to system 10 at load time. Global control data provides system 10 with characteristics of the sample, or digital waveform signal, including the sampling rate, the length of the sample, the playback parameters and control words for compression algorithms. The control words are used to decompress the particular packets or portions of the digital waveform signals.
  • Memory space can be saved by using a pointer (not shown) in buffer 49 of cells 43 which merely addresses the corresponding deck buffer 51 in pool 40, rather than storing
  • parser 50 manages the allocation of all new stream cells 43. Parser 50 responds to the asynchronous digital control signals at input 20 to reference stream cells 43 back to the appropriate sample cell 41. Parser 50 manages the allocation and deallocation of the active stream cells 43 in response to on/off messages received at input 20. For example, as a musician strikes a key, an event is sent to parser 50 to turn on a particular sample cell 41. Parser 50 finds the appropriate cell 41 from the sample pool 40, based on the velocity at which the key was struck along with current state of the synthesizer system 10. Parser 50 is responsible for allocating a stream cell 43 from the available cells 43 in array 42. Parser 50 first, searches for an unused stream cell 43. Parser 50 uses the run-time control data stored in buffer 49 of a stream cell 43 to determine its availability. In the case that all stream cells 43 are allocated, parser 50 uses the following rules to determine which stream cell 43 is to be preempted:
  • parser 50 looks for a voice with identical characteristic as the new voice. For example, if a middle C on the keyboard is currently being played, and the musician hits another middle C when system 10 is at maximum polyphony, the parser 50 will preempts the older middle C, in favor of the newer middle C.
  • Parser 50 finds the voice (e.g., cell 43) with the lowest run-time volume value in buffer 49, and selects it as the preempted voice. Parser 50, on allocating a stream cell 43, sets the run-time control data in buffer 49. The run-time control data references a single sample cell 41. The runtime control data also signals both stream engine 80 and synthesizer engine 66 to begin processing this particular cell 43. On allocation of new voice, parser 50 is responsible for all initialization of stream cells 43. Likewise, parser 50 passes all control signals through to the synthesizer engine 66, via the command queue 52. These commands allow engine 66 to control the playback experience as defined by industry accepted standards, such as the MIDI 1.0 specification. For example, the natural termination of voice can be signaled when a musician releases a key on an instrument. This off signal allows the
  • synthesizer engine 66 to naturally decay the active voice in manner which is consistent with the sounds of an acoustic instrument.
  • Parser 50 initializes the run-time control data by flagging it as a "new voice". The flag is used in conjunction with the stream engine 80 for priority scheduling and error correction. Parser 50 also initializes the progress counter for a new voice/stream cell 43. This progress counter is maintained by stream engine 80 and synthesizer engine 66 for priority scheduling and stream data integrity.
  • Parser 50 initializes the run-time control data in buffer 49 to point to deck buffer 51 in the associated sample cell 41. In addition, parser 50 sets the stream buffer 47 corresponding to the stream cell 43. Synthesizer engine 66 then begins processing out of deck buffer 51. As data is consumed from deck buffer 51, stream engine 80 begins filling stream buffer 47 of cell 43. Eventually, synthesizer engine 66 consumes all the data available in deck buffer 51. Engine 66 dynamically switches its internal pointer to begin consuming data from stream buffer 47 of cell 43 once all the data in buffer 51 is consumed. Engine 66 signals stream engine 80 by setting the appropriate flag in the run-time control data and advancing the progress counter.
  • stream engine 80 is shown having input 24 coupled to source 22.
  • Engine 80 is also coupled to system thread manager 68.
  • An output of stream engine 80 is coupled to array 42 of cells 43.
  • Stream engine 80 includes a priority stream scheduler 82, a buffer management selector 84, a stream count manager 86, a disk synchronizer 88 having a control output coupled to source 22, a control word parser 90, a data mover 94, and a decompressor 96.
  • Data mover 94 is also coupled to a command synchronization circuit 102 from command parser 50.
  • Stream engine 80 is responsible for supplying data from source 22 to active stream cells 43, while maintaining the real-time integrity of the sample audio (e.g., digital output signal) .
  • Engine 80 acts as the producer to the individual cell 43, whereas synthesizer engine 66 acts as the consumer.
  • Synthesizer engine 66 and stream engine 80 are coupled by the run-time control data, which is initialized by parser 50 when a new voice is allocated.
  • the run-time control data guarantees audible integrity out of the entire system 10 by ensuring that the consumer (e.g., engine 66) and producer (e.g., engine 80) of data always remain in synchronization.
  • Stream engine 80 supplies data to n-channels (e.g., n cells 43), where n is the current polyphony level.
  • the data is streamed from source 22 to stream buffer 47 of
  • Priority stream scheduler 82 is based off a priority level scheduling mechanism.
  • Cells 43 are dynamically assigned a priority level. Cells 43 are handled with the highest levels processed first.
  • the priority levels are assigned to cells 43 based on the current position of the data stream. When a new voice is allocated, the data stream is initialized to the starting position on source 22. The particular cell 43 is assigned a new stream position. Stream/cells 43 flagged with a new stream position are given the highest priority in system 10. If there are multiple cells 43 with the new stream flag set, then a round-robin method is implemented.
  • Stream cells 43 that do not have the new flag set are prioritized based on the progress counter which is a component of the run-time control data in buffer 49. This counter will be incremented when a single quadrant of data has been consumed from the stream buffer 47 by the synthesizer engine 66.
  • the priority levels are determined by the value of the progress counter. This counter is initialized to zero by parser 50 when a new voice has been allocated. In the case where multiple cells 43 have the same priority levels, a round-robin scheduler is implemented.
  • Stream engine 80 is a state machine, which is given control of system 10 via the system thread manager 68.
  • System thread manager 68 is responsible for allocating all remaining bandwidth of the host processor (not shown) to stream engine 80.
  • Priority stream scheduler 82 determines which, if any, stream cell 43 needs to be processed next.
  • control is passed to buffer management selector 84.
  • the buffer management selector 84 initializes the transfer from source 22 to a stream buffer 47, based on the currently used buffer scheme.
  • system 10 implements two different buffering schemes.
  • the choice of buffering scheme is a function of the system parameters, including CPU bandwidth, speed of source 22 (including data access time and seek time) .
  • System 10 employs two buffering schemes: a single quadrant transfer and a double quadrant transfer.
  • All stream buffers 47 in cells 43 can be viewed as four quadrants.
  • Synthesizer engine 66 consumes data on a quadrant by quadrant basis. Each time a quadrant is consumed the progress counter is incremented.
  • Stream engine 80 fills stream buffer 47 either a single quadrant or a double quadrant at a time, depending on the buffering scheme currently deployed. Stream engine 80 is potentially maintaining n streams of real-time data that may be
  • n is the polyphony of system 10.
  • set-up time e.g., seek time
  • data rate e.g., data rate
  • system 10 incurs the cost of a set-up time.
  • System 10 uses disk synchronizer 88 to set-up the source 22 from the correct location for the current stream.
  • the accumulated time required to set-up the stream will be half of the time required when using the single quadrant buffer.
  • the negative to using the double quadrant buffering scheme is that each time a cell 43 needs to be processed, twice as much data must be moved from the source 22 to the stream buffer 47. Because stream engine 80 maintains a single thread of execution when processing the stream cells 43, if a single cell 45 maintains control of source 22 for too long, other streams have the potential to starve.
  • the buffering scheme for a particular stream cell 43 is stored in the run-time control data buffer 49.
  • Stream count manager 80 manages the progress counter for the purpose of indicating the current status of stream engine 80 for any given stream cell 43.
  • the progress counter is used by synthesizer engine 66 as a method to signal the stream engine 80 for more data. Likewise, the progress counter can be used to indicate that data is available.
  • disk synchronizer 80 sets up source 22 by issuing a command to move the reading apparatus to the correct location on source 22.
  • the exact location on source 22 for any stream cell 43 is maintained in the run-time control data buffer 49.
  • the run-time control data is updated on each pass through the stream manager 80.
  • System 10 accounts for compressed data. Any compression algorithm can be implemented. However, for the application of musical tone generation a lossless scheme is preferably implemented. The lossless algorithm guarantees that quality does not degrade due to the compression/decompression process.
  • the musical samples will be compressed prior to storage in source 22.
  • This embodiment implements a second order differential compression/decompression algorithm.
  • the decompression of a stream is handled by a decompressor 96.
  • the compressed stream is packetized. Each packet contains a signal control word, which is used as a control data for decompressor 96.
  • the control words are initially stored with the instrument on source 22. When an instrument is loaded into system 10, download module 64 transfers the control words from the source 22 to the global control data buffer 53 in cells 41 of pool 40.
  • the decompression algorithm can accept various packet sizes, providing they are of a 2" size. The size cf
  • the packet offsets the compression ratio of the sampled instrument. This embodiment use 2 kiloword packet sizes.
  • the data mover is responsible for directing the sampled instrument from source 22 to either the second order decompressor 96 or the stream buffer 47.
  • command synchronization module 102 executes. Module 102 is responsible for handling any error correction and all synchronization between the stream engine 80 and parser 50.
  • the command synchronization 102 handles the previously discussed preemption of stream cell 43. If a stream cell 43 is prematurely deallocated because system 10 has hit maximum polyphony and a new voice has been started, then stream engine 80 must check the integrity of the active stream cell 43 before proceeding. This situation is a result of stream engine 80 and parser 50 running asynchronously.
  • system 10 can be configured to provide the first part of digital waveform signals stored in deck buffer 51 of cells 41 and pool 40 to cells 43 in array 42 rather than directly providing the first part from pool 40 to output 28 through synthesizer engine 66.
  • engine 80 or engine 66 moves the first part of the digital waveform signal from pool 40 to cell 43 in array 42.
  • Synthesizer engine 66 then moves the first part of the digital waveform signal from cell 43 of array 42 to digital output 28.
  • Deck buffer 45 is preferably comprised of 64K bytes.
  • Stream buffer 47 is preferably comprised of from two to four quadratures of 32 Kilobytes (Kb) .
  • Control buffer 49 is comprised of 200 bytes.
  • Source 22 preferably has a seek time of approximately 7 milliseconds (ms) and a data rate of 8-12 MB/S although other times are possible.
  • Source 22 is preferably a one gigabyte or more hard disk drive.

Landscapes

  • Engineering & Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Electrophonic Musical Instruments (AREA)

Abstract

A synthesizer system includes a CPU and host memory operating a software routine. The software routine stores a first part of each waveform signal in a sample pool (40) of host memory and provides remaining portions of selected musical sounds from the hard drive (22) to a stream cell array (42) without an audio perceivable delay. The synthesizer system utilizes a caching system which allow low cost, high storage devices to be utilized in an audio synthesizer system. MIDI control signals are provided to an audio processor for selecting appropriate digital waveform signals.

Description

SYNTHESIZER SYSTEM UTILIZING MASS STORAGE DEVICES FOR REAL TIME, LOW LATENCY ACCESS OF MUSICAL INSTRUMENT DIGITAL
SAMPLES
FIELD OF THE INVENTION
The present invention is related generally to digital sound synthesizer systems. More particularly, the present invention is related to a sound synthesizer system utilizing mass storage media or devices.
BACKGROUND OF THE INVENTION
Digital synthesizers or other electronic systems generally create sound or music and can be utilized as an electronic musical instrument or electronic sound machine. Digital synthesizers are ordinarily arranged to accept input signals from a musician or operator interface and produce digital output signals representing analog signals in the audio frequency range. The interface can provide keystroke signals, mouse signals, touch pad signals, or keyboard signals representing the musician's action.
The digital output signals from the digital synthesizer can be converted to analog signals and directed to equipment such as a loudspeaker, tape recorder, mixer, or other device and reproduced in the form of sound. The digital synthesizer can be arranged to provide output signals simulating the sounds of conventional, known musical instruments. Alternatively, the synthesizer may be arranged to simulate sounds which would be emitted by a theoretical instrument having predetermined characteristics different from those of any conventional, known musical instruments.
Music and sound synthesis is a formidable technical task. Real musical instruments produce complex blends of many different frequencies imparting what is commonly referred to as "tone color" to the sound. For example, percussion sounds such as those made by a drum,
cymbal or the like are an aperiodic function which cannot fully be described by any simple mathematical expression. Accordingly, the production or synthesis of digital signals representing sounds as rich and complex as those of a real instrument is a formidable digital signal processing and sampling task. Moreover, the synthesizer must respond to the nuances of the musician's manipulation of the interface. For example, a snare drum has many different audio characteristics when played in various locations such as toward the center of the drum, toward the rim or on the rim. Additionally, the audio characteristics of the snare drum also vary with the striking force, playing technique and stylistic inflection of the musician.
A large enough sample database for each of these characteristics and combinations thereof (facilitated by huge sampling capability) is needed to adequately capture the behavior of the instrument. By having a large enough waveform database, more realistic and expressive synthesis can be achieved. Many synthesizers using waveform sampling technology are used extensively in the music and multimedia fields for their ability to create musical sounds that closely emulate the sound of a musical instrument.
Prior synthesizer systems often utilize a Musical Instrument Digital Interface (MIDI) to control digital synthesizers. The MIDI interface creates control signals or MIDI control data. The MIDI control data represents music events such as the occurrence of specific notes (e.g. , Middle C, to be realized by a specific musical sound, e.g., piano, horn or drum) . Conventional synthesizer systems utilize a large solid state memory which stores the digital waveform signals representing the real sound of each note played on a particular instrument. The memory can be a static random access memory (SRAM) , a dynamic Ram (DRAM) or a read only memory (ROM) . When the musician actuates a key or other
interface, the appropriate waveform signal is selected depending on the key activated and the intensity of the strike. The waveform signal is converted into an analog output signal. The digital waveform signal can be combined with other notes which are simultaneously being played before being converted to the analog output signal.
In this arrangement, the synthesizer in effect merely plays back digital recordings of individual sounds or notes. Each waveform signal is stored as a series of individual data words, each representing a single sample of the waveform at a particular time. To achieve acceptable fidelity, any such stored waveform signal must include thousands of samples per second in stored sound. The memory required to store each waveform signal is substantial, and the solid state memory required to store all the required waveform signals is accordingly extremely large. The solid state memory is required because its speed allows an essentially real time playback (e.g., no audio perceivable delay) . The high cost per unit of sound samples storage of any type of solid state memory has somewhat prohibited the use of large amounts of digital waveform signals for accurate representation of musical instruments and all their nuances. A major limitation of current synthesizers is the lack of sufficient memory to store the entire sample of a wide range of sounds associated with musical instruments (e.g., due to cost).
To reduce the memory requirements, synthesizer systems have used techniques to more efficiently store instrument samples. These techniques generally result in lower quality sound generation. The technique of "looping" reuses samples of the sound. By replicating and reusing groups of samples, the overall memory requirement is reduced. Other techniques store samples in a compressed state in solid state memory. However, these systems require significant CPU power to implement the decompression
algorithm. Also, the decompression algorithm can be lossy and suffer audio quality degradation. other prior art synthesizer systems have utilized hard disks or other mass storage devices from which the instrument samples are loaded into a solid state memory prior to musical tone generation or playing the instrument. However, these systems still require large amounts of solid state memory because all of the digital waveform signals for the instrument must be loaded into the solid state memory before playing.
Some synthesizer systems have utilized ROM or other types of less expensive, slower solid state memory to store the digital waveform signals. These synthesizers employ a data caching technique to move the waveform signals from the slower ROM to a high speed RAM such as a SRAM for tone generation. Nonetheless, the slower ROM and high speed RAM add significant cost to the synthesizer.
Other prior art systems store only one or several waveforms representing each musical instrument. These waveforms are then adjusted by digital signal processing techniques or other electronic techniques (e.g., non-linear distortion) to reflect frequency and amplitude changes associated with different musical characteristics as indicated by the MIDI control data. For example, the frequency and amplitude of a sample waveform representing middle C of a piano can be adjusted to synthesize a different piano note and volume. However, these types of synthesizers are unable to produce the complex blends or tone color to a high enough fidelity for the musically trained ear. In another example, some systems utilize digital filters to adjust the harmonic content of a particular note. However, these systems require significant CPU power and can suffer audio quality degradation.
Thus, there is a need for a music synthesizer which can store large amounts of musical samples without
utilizing substantial amounts of expensive solid state memory. Further still, there is a need for a music synthesizer which can utilize a mass storage device and yet provide a real time production of musical tones.
SUMMARY OF INVENTION The present invention relates to an audio processor for providing a digital output signal representative of a sound at an output. The audio processor includes a mass storage device input for receiving a plurality of digital waveform signals, a control input for receiving digital control signals, and a control circuit coupled to the control input and the mass storage device input. The digital control signals are indicative of at least one selected digital waveform signal of the digital waveform signals. The control circuit stores a first part of the digital waveform signals in a host memory. The control circuit provides from the host memory a first part of the selected digital waveform signal to the output in response to the digital control signals and subsequently provides a remaining part of the selected digital waveform signal from the mass storage device input to the output. The control circuit provides the remaining part of the selected digital waveform signal without a loss of continuity between the first part of the selected digital waveform signal and the remaining part of the selected digital waveform signal.
The present invention also relates to a digital synthesizer system including a mass storage device means for storing a plurality of digital waveform signals, a control input means for receiving a digital control signals, a host memory means for storing a first part for each of the digital waveform signals and a processor means for generating a digital sound signal at an output. Each of the digital waveform signals corresponds to a particular sound
of a plurality of sounds. The digital control signal is indicative of a selected sound of the sounds. The processor means provides from the host memory the first part of the digital waveform signal corresponding to the selected sound to the output and provides a second part of the digital waveform signal corresponding to the selected sound from the mass storage device means to the output. The processor means utilizes a caching techniques to provide the second part to the output. The present invention further relates to a memory architecture for a digital synthesizer system. The digital synthesizer system includes a mass storage device and a processor. The mass storage device stores a plurality of digital waveform signals. Each of the digital waveform signals corresponds to a particular sound of a plurality of sounds. The processor receives a digital control signal. The digital control signal is indicative of a selected sound of the sounds. The audio processor generates a digital sound signal at a output in response to the digital control signal. The processor provides a first part of the digital waveform signal corresponding to a selected sound to the output followed by the second part of the digital waveform signal corresponding to the selected sound from the mass storage device. The memory architecture includes a sample buffer and a stream buffer. The sample buffer stores a first part of each of the digital waveform signals. The stream buffer temporarily stores the second part of the digital waveform signal corresponding to the selected sound. The present invention still further relates to a method of digitally synthesizing sounds in a synthesizer system. The synthesizer system includes a mass storage device, a host memory and a processor. The mass storage device stores a plurality of digital waveform signals. The processor has a digital control signal input and a digital output. The processor is coupled to the host memory and the
mass storage device. The method includes downloading a first part for each of the digital waveform signals into the host memory, receiving a digital control signal on the digital control signal input, the digital control signal indicating a selected digital waveform signal of the digital waveform signals, providing the first part of the selected digital waveform signal from the host memory to the digital output, and providing a second part of the selected digital waveform signal from the mass storage device to the digital output. The first part and the second part are provided so there is no audio perceivable delay between the first part and the second part.
The present invention even further relates to a method of digitally synthesizing sounds in a synthesizer system. The synthesizer system includes a mass storage device, a host memory, and a processor. The mass storage device stores a plurality of digital waveform signals. Each of the digital waveform signals corresponds to a particular sound of a plurality of sounds. The first part for each of the digital waveform signals is stored in the host memory. The processor has a digital control signal input and a digital output. The processor is coupled to the host memory and the mass storage device. The method includes receiving a digital control signal on the digital control signal input, the digital control signal indicating a selected sound of the plurality of sounds, providing a first part of a digital waveform signal corresponding to the selected sound from the host memory to the digital output, and providing a second part of the digital waveform signal from the mass storage device to the digital output corresponding to the selected sound. The first part and the second part are provided so there is no loss of continuity between the first part and the second part.
The present invention also relates to an audio processor for providing a digital output signal
representative of a sound and an output. The audio processor includes a mass storage device input for receiving a plurality of digital waveform signals, a sample pool for storing a first part of the digital waveform signals, a stream buffer, a stream engine coupled to the mass storage device input and to the stream buffer, and a synthesizer engine coupled to the stream buffer and the output. The stream engine provides a second part of the digital waveform signals from the mass storage device input to the stream buffer. The synthesizer engine provides the second part of the digital waveform signals from the stream buffer to the output.
According to one exemplary aspect of the present invention, a digital synthesizer stores a first part of a digital waveform signal in a host memory and caches the remaining parts of the digital waveform signal from a mass storage device. The digital music synthesizer preferably utilizes a hard disk drive as a storage device and is implemented in software which is run by a computer. The computer includes a central processing unit and host memory operating in a Windows*-based environment.
In another exemplary aspect of the present invention, the synthesizer system utilizes an audio processor configured in software. The audio processor includes a solid state sample pool, a stream engine, a synthesizer engine and a solid state stream buffer. The stream engine provides portions of the digital waveform samples or signals from a mass storage device to the stream buffer. The synthesizer engine provides the portions of digital waveform signals from the sample pool and the stream buffer to a digital output. The processor preferably responds to MIDI control signals to select particular digital waveform signals.
In yet another exemplary embodiment of the present invention, a memory architecture for a digital synthesizer
utilizes a host memory which includes a sample pool and an array of stream cells. The sample pool stores a first part of each digital waveform signal stored on a hard drive. The sample pool includes an array of sample cells, each including a deck buffer and a control data buffer. Each cell of the array of stream cells preferably includes a deck buffer pointer, a stream buffer and a control data buffer. The deck buffer stores the first part of the waveform. The stream buffer is utilized to store subsequent parts of the waveform which are stored on the hard drive.
BRIEF DESCRIPTION OF THE DRAWINGS
Preferred embodiments of the invention will hereafter be described in conjunction with the appended drawings, wherein like numerals denote like elements, and:
Figure 1 is an exemplary block diagram of a digital sound synthesizer system including an audio processor in accordance with an exemplary embodiment of the present invention; Figure 2 is a more detailed block diagram showing the audio processor, illustrated in Figure 1, including a stream engine, an array of stream buffer cells, and a sample pool including sample pool cells;
Figure 3 is a more detailed block diagram of the stream engine illustrated in Figure 2;
Figure 4 is a block diagram of one of the stream buffer cells illustrated in Figure 2; and
Figure 5 is a block diagram of one of the sample pool cells illustrated in Figure 2.
DETAILED DESCRIPTION OF PREFERRED EXEMPLARY EMBODIMENTS OF THE PRESENT INVENTION
With reference to FIGURE 1, a digital synthesizer system 10 includes an audio processor 12, a sound selection
source 18, such as a MIDI source, a sound waveform source 22, such as a mass storage device for storing musical instrument digital waveform signals, a digital-to-analog (D/A) converter 25, and a speaker system 26. Audio processor 12 has a digital output 28 coupled to converter 25. Converter 25 has an output coupled to speaker system 26.
In operation, audio processor 12 receives digital control signals at a digital control input 20 from sound selection source 18 and digital samples or digital waveform signals at a mass storage device input 24 from digital waveform signal source 22. Audio processor 12 processes the digital control signals at input 20 and the digital waveform signals at input 24 to provide a digital output signal at output 28. The digital output signal is provided to converter 25 which provides an analog signal representative of the digital output signal to speaker system 26. As processor 12 receives a number of control signals at input 20, processor 12 can provide a combination, mixture or a number of different digital output signals at output 28, representative of the digital waveform signals indicated by the control signals at input 20. Speaker system 26 preferably generates sound or music representative of the digital output signal provided at output 28. Converter 25 and speaker system 26 can be replaced with other devices coupled to digital output 28. For
example, a mixer, recorder, digital storage device such as a hard disk drive, or other audio apparatus can be coupled to output 28 of processor 12. Speaker system 26 can be a stereo system including two or more speakers for providing a stereo production of sound indicated by the digital output signal which preferably represents stereo sound.
Sound selection source 18 can be any device for generating control signals which are most preferably MIDI data or control signals. MIDI control signals are well known in the art and represent music events such as the occurrence of a specific musical note, e.g., middle C, to be realized by a specific musical sound, e.g., piano, horn, drum or other device. Alternatively, audio processor 12 may be utilized with a different type of sound selection source 18 which provides control signals to input 20.
Sound selection source 18 can be a mouse interface, piano keyboard, electronic drum pad, key pad, touch pad, button interface, or other musician interface for providing digital control signals to input 20. Sound waveform source 22 is preferably a mass storage device such as a hard disk drive, optical disk drive, tape source, floppy disk drive, or other memory device. Sound waveform source 22 stores a wide range of digital samples of one or more musical instruments. For example, sound waveform source 22 can store approximately 1100 independent digital waveform signals associated with a grand piano. The
independent waveforms signals are representative of the 88 keys of the piano played at 6 variations of timbre in stereo (e.g., 88 x 6 x 2 = 1056). The number of waveform signals can be adjusted for various applications and instruments. The discussion of the use of system 10 to reproduce sounds of a grand piano is not for a limiting purpose and is merely discussed as an example. The types of notes, types of sources, types of instruments, and size of samples are discussed only in an exemplary fashion in this patent application.
System 10 can be implemented on a PC platform, as an embedded system, as a music synthesizer, as a dedicated rack mount music sampler, or as another sound generation device. Preferably, system 10 is a computer based system comprised of a CPU, a hard disk drive and host memory. The CPU is configured by software to perform the operations of audio processor 12 described throughout the present application. Alternatively, system 10 can be a dedicated digital signal processor system, a dedicated general purpose processor system, or a hardware system. The host memory is preferably at least 32 megabytes of DRAM, although other types of solid state memory can be used. The host memory can be combinations of ROM, SRAM, flash memory or other devices. The CPU is preferably a 486 or above processor. System 10 is configured to provide digital waveform samples or signals stored in source 22 in real
time, e.g., without audio perceivable delay, at output 28. As the musician manipulates sound selection source 18, audio processor 12 provides digital output signals representative of those manipulations at output 28 without an audio perceivable latency between the manipulation of source 18 and the hearing the music from speaker system 26. Preferably, audio processor 12 is able to provide the digital output signals in less than 15 milliseconds from the manipulation of source 18. The term an audio perceivable delay, as used in this application, refers to a loss of continuity, an objectionable latency, a stall, or a detainment of the music or sound as heard by the human ear. The term audio perceivable delay is used in two different contexts. The first context is related to the time difference discussed above between a musician's actuations and the production of sound. The second context is related to the continuity of signals between the first part and remaining parts of the digital waveform signals. Also, the second context is related to the continuity of signals between portions of the remaining parts of the digital waveform signals. The second context requires that the delay be less than several micro-seconds (e.g., so that there is no loss of continuity) to prevent noise artifacts, which are detectable by the human ear. System parameters and application criteria can affect the definition of audio perceivable delay and the
acceptability of the latency. Most preferably, there is no delay between portions of digital waveform signals as they are provided at output 28.
Due to the configuration of audio processor 12, system 10 can advantageously utilize a slower speed, lower cost, mass storage device as source 22 (e.g., a hard disk drive) , and yet provide digital output signals at output 28 in a sufficient amount of time. The use of source 22 to store the bulk of the digital waveform signals greatly reduces the size of the solid state memory (not shown) required by processor 12 (e.g., and hence the cost of processor 12) for real time production of sound. Unlike conventional synthesizer systems which download all of the digital waveform signals from source 220 into solid state memory, processor 12 utilizes a caching system so that most of the digital waveform signals are not stored in the solid state memory of processor 12. Most portions of the digital waveform signals are only stored in solid state memory when actually needed. More particularly, processor 12 utilizes its solid state memory (not shown) to store only a first portion of each of the digital waveform signals stored in source 22. When the musician selects a particular instrument, processor 12 downloads the first portion for each digital waveform signal stored for the selected instrument. Alternatively, the musician may choose a portion of an instrument or
several instruments, and processor 12 can download a first portion of each digital waveform in accordance with the musician's choice. Therefore, the first portions of digital waveforms signals stored in the solid state memory of audio processor 12 preferably correspond on a one-to-one basis to the digital waveform signals stored on source 22 for a selected instrument or group of instruments.
Once the solid state memory of audio processor 12 is loaded with the first portions, audio processor 12 responds to the digital control signals provided at input 20. Audio processor 12 provides the first portions of a selected digital waveform signal from its solid state memory to output 28 in response to the digital control signal provided at input 20. Preferably, at least the first 0.5 seconds of the digital waveform signal is stored in the solid state memory. The remaining portions of the selected digital waveform signal are then provided from source 22 to processor 12 and to output 28. The remaining portions can be retrieved while the first portion is being played. The remaining portion can include portions of the digital waveform signals representing up to three minutes of sound. Therefore, processor 12 utilizes a caching mechanism to reduce or eliminate an audio perceivable delay or loss of continuity between providing a first portion of digital waveform signals stored in solid state memory and remaining portions stored in source 22.
With reference to FIGURE 2 , audio processor 12 has input 24 coupled to source 22 which is shown in FIGURES 2 and 3 as a hard disk drive. Audio processor 12 includes a sample pool 40, an array 42 of stream cells 43, a command parser 50, a command queue 52, a synthesizer engine 66, a download module 64, a system thread manager 68, and a stream engine 80. Command parser 50 is coupled to input 20 and command queue 52. Command queue 52 is coupled to a control input of synthesizer engine 66. Input 24 is coupled to download module 64 and stream engine 80. Download module 64 is coupled to sample pool 40. Sample pool 40 is comprised of a plurality of sample pool cells 41. Sample pool cells 41 are provided for each of digital waveform signals for a particular instrument, portion of an instrument, or group of instruments selected by the musician. Sample pool 40 is configurable for a various number of cells. In the grand piano example, pool 40 has approximately 1100 cells 41.
Stream engine 80 is also coupled to system thread manager 68. System thread manager 68 operates as a master to stream engine 80. Manager 68 ensures that stream engine 80 receives the necessary amount of operational time of the CPU (not shown) utilized to implement processor 12 by cooperating with the operating system of the computer. Manager 68 preferably keeps stream engine 80 consistently
active as the CPU operates in a multi-tasking environment such as the Windows® environment.
The operations of manager 68, engine 66, engine 80, module 64, and parser 50 are preferably implemented by the CPU of the PC platform (not shown) . The storage operations of array 42, pool 40 and queue 52 are provided by the host memory for the PC platform.
Stream engine 80 is also coupled to array 42. Stream engine 80 is capable of providing portions of digital waveform signals from source 22 to stream cell array 42.
The portions of the digital waveform signals are stored in cells 43 of cell array 42. With reference to FIGURE 4, each of cells 43 includes a stream buffer 47 and a run-time control data buffer 49. Stream cell array 42 is preferably scalable from approximately 24 to 256 cells although any size of array 42 is possible. Each of cells 43 represents a single voice or stream of polyphony where two voice polyphony is representative of a concurrent musical tone (e.g., the combination of a C and E note). Therefore, a one-to-one voice to stream cell 43 relationship is utilized by processor 12.
Stream engine 80 manages the flow of data or portions of digital waveform signals into stream cell array 42. Engine 80 maintains streams of data or portions of signals from source 22 to cells 43. The streams of data can
be up to the number cells 43 and can be maintained as virtual streams from source 22.
Stream engine 80 is preferably a state machine configured to move portions of digital waveform signals from source 22 to array 42 so processor 12 can create sound in a real time or near real time fashion. As stream engine 80 provides the portions of digital waveform signals to cells 43 in array 42, synthesizer engine 60 consumes the portion of digital waveform signals in cells 43 and provides a digital waveform output representative of a selected sound at output 28.
As stream engine 80 fills cells 43 in array 42 with portions of the digital waveform signals, stream engine 80 provides synchronization and error correction, and decompression operations such as a lossless second order decompression on the signals. Although stream engine 80 is shown and described in the application as a throughput from input 24 to array 42 for discussion purposes, stream engine 80 in actuality moves data from source 22 to array 42 (e.g., performs more of a transmission management operation of signals from source 22 to array 42) .
When an instrument or group of instruments or portion thereof is selected by the musician, download module 64 receives a first part for each of the digital waveform signals for the particular instrument from source 22.
Download module 64 stores the first part of the digital
waveform signals in sample cells 41 of sample pool 40. Alternatively, the first part of the digital waveform signals can be permanently stored in pool 40 or received from another source (not shown) . With reference to FIGURE 5, sample pool cells 41 in sample pool 40 (FIGURE 2) include a deck buffer 51 and a global control data buffer 53. Preferably, deck buffer 51 of sample cells 41 stores the first 0.5 seconds (approximately 64K bytes depending on characteristics of system 10 such as RAM size, CPU speed and hard disk speed) of each of the digital waveform signals. Of course, smaller samples can be stored if source 22 and engine 80 can provide signals to array 42 more quickly.
Global control data buffer 53 includes control information or global control data. Global control data refers to control data for a particular sample. Global control data is available to system 10 at load time. Global control data provides system 10 with characteristics of the sample, or digital waveform signal, including the sampling rate, the length of the sample, the playback parameters and control words for compression algorithms. The control words are used to decompress the particular packets or portions of the digital waveform signals.
Memory space can be saved by using a pointer (not shown) in buffer 49 of cells 43 which merely addresses the corresponding deck buffer 51 in pool 40, rather than storing
the first portion of the digital waveform signal in cells 43. Similarly, memory space can be saved by providing pointer (not shown) which merely points to global control data buffer 53. In operation, when digital control signals such as
MIDI control signals are provided at input 20, parser 50 manages the allocation of all new stream cells 43. Parser 50 responds to the asynchronous digital control signals at input 20 to reference stream cells 43 back to the appropriate sample cell 41. Parser 50 manages the allocation and deallocation of the active stream cells 43 in response to on/off messages received at input 20. For example, as a musician strikes a key, an event is sent to parser 50 to turn on a particular sample cell 41. Parser 50 finds the appropriate cell 41 from the sample pool 40, based on the velocity at which the key was struck along with current state of the synthesizer system 10. Parser 50 is responsible for allocating a stream cell 43 from the available cells 43 in array 42. Parser 50 first, searches for an unused stream cell 43. Parser 50 uses the run-time control data stored in buffer 49 of a stream cell 43 to determine its availability. In the case that all stream cells 43 are allocated, parser 50 uses the following rules to determine which stream cell 43 is to be preempted:
1. If applicable, parser 50 looks for a voice with identical characteristic as the new voice. For example, if a middle C on the keyboard is currently being played, and the musician hits another middle C when system 10 is at maximum polyphony, the parser 50 will preempts the older middle C, in favor of the newer middle C.
2. Parser 50 finds the voice (e.g., cell 43) with the lowest run-time volume value in buffer 49, and selects it as the preempted voice. Parser 50, on allocating a stream cell 43, sets the run-time control data in buffer 49. The run-time control data references a single sample cell 41. The runtime control data also signals both stream engine 80 and synthesizer engine 66 to begin processing this particular cell 43. On allocation of new voice, parser 50 is responsible for all initialization of stream cells 43. Likewise, parser 50 passes all control signals through to the synthesizer engine 66, via the command queue 52. These commands allow engine 66 to control the playback experience as defined by industry accepted standards, such as the MIDI 1.0 specification. For example, the natural termination of voice can be signaled when a musician releases a key on an instrument. This off signal allows the
synthesizer engine 66 to naturally decay the active voice in manner which is consistent with the sounds of an acoustic instrument.
Parser 50 initializes the run-time control data by flagging it as a "new voice". The flag is used in conjunction with the stream engine 80 for priority scheduling and error correction. Parser 50 also initializes the progress counter for a new voice/stream cell 43. This progress counter is maintained by stream engine 80 and synthesizer engine 66 for priority scheduling and stream data integrity.
Parser 50 initializes the run-time control data in buffer 49 to point to deck buffer 51 in the associated sample cell 41. In addition, parser 50 sets the stream buffer 47 corresponding to the stream cell 43. Synthesizer engine 66 then begins processing out of deck buffer 51. As data is consumed from deck buffer 51, stream engine 80 begins filling stream buffer 47 of cell 43. Eventually, synthesizer engine 66 consumes all the data available in deck buffer 51. Engine 66 dynamically switches its internal pointer to begin consuming data from stream buffer 47 of cell 43 once all the data in buffer 51 is consumed. Engine 66 signals stream engine 80 by setting the appropriate flag in the run-time control data and advancing the progress counter.
With reference to Figure 3, stream engine 80 is shown having input 24 coupled to source 22. Engine 80 is also coupled to system thread manager 68. An output of stream engine 80 is coupled to array 42 of cells 43. Stream engine 80 includes a priority stream scheduler 82, a buffer management selector 84, a stream count manager 86, a disk synchronizer 88 having a control output coupled to source 22, a control word parser 90, a data mover 94, and a decompressor 96. Data mover 94 is also coupled to a command synchronization circuit 102 from command parser 50.
Stream engine 80 is responsible for supplying data from source 22 to active stream cells 43, while maintaining the real-time integrity of the sample audio (e.g., digital output signal) . Engine 80 acts as the producer to the individual cell 43, whereas synthesizer engine 66 acts as the consumer. Synthesizer engine 66 and stream engine 80 are coupled by the run-time control data, which is initialized by parser 50 when a new voice is allocated. The run-time control data guarantees audible integrity out of the entire system 10 by ensuring that the consumer (e.g., engine 66) and producer (e.g., engine 80) of data always remain in synchronization.
Stream engine 80 supplies data to n-channels (e.g., n cells 43), where n is the current polyphony level. The data is streamed from source 22 to stream buffer 47 of
the appropriate stream cell 43. The order of processing multiple stream cells 43 is determined at run-time by priority stream scheduler 82. Priority stream scheduler 82 is based off a priority level scheduling mechanism. Cells 43 are dynamically assigned a priority level. Cells 43 are handled with the highest levels processed first. The priority levels are assigned to cells 43 based on the current position of the data stream. When a new voice is allocated, the data stream is initialized to the starting position on source 22. The particular cell 43 is assigned a new stream position. Stream/cells 43 flagged with a new stream position are given the highest priority in system 10. If there are multiple cells 43 with the new stream flag set, then a round-robin method is implemented. Stream cells 43 that do not have the new flag set are prioritized based on the progress counter which is a component of the run-time control data in buffer 49. This counter will be incremented when a single quadrant of data has been consumed from the stream buffer 47 by the synthesizer engine 66. The priority levels are determined by the value of the progress counter. This counter is initialized to zero by parser 50 when a new voice has been allocated. In the case where multiple cells 43 have the same priority levels, a round-robin scheduler is implemented.
Stream engine 80 is a state machine, which is given control of system 10 via the system thread manager 68. System thread manager 68 is responsible for allocating all remaining bandwidth of the host processor (not shown) to stream engine 80.
Priority stream scheduler 82 determines which, if any, stream cell 43 needs to be processed next. When an appropriate stream cell 43 has been tagged for processing, control is passed to buffer management selector 84. The buffer management selector 84 initializes the transfer from source 22 to a stream buffer 47, based on the currently used buffer scheme. To guarantee real-time accuracy for a variety of hardware systems, system 10 implements two different buffering schemes. The choice of buffering scheme is a function of the system parameters, including CPU bandwidth, speed of source 22 (including data access time and seek time) . System 10 employs two buffering schemes: a single quadrant transfer and a double quadrant transfer.
All stream buffers 47 in cells 43 can be viewed as four quadrants. Synthesizer engine 66 consumes data on a quadrant by quadrant basis. Each time a quadrant is consumed the progress counter is incremented. Stream engine 80 fills stream buffer 47 either a single quadrant or a double quadrant at a time, depending on the buffering scheme currently deployed. Stream engine 80 is potentially maintaining n streams of real-time data that may be
segmented across source 22, where n is the polyphony of system 10. In accessing multiple streams, two parameters are of concern to the efficient utilization of source 22, set-up time (e.g., seek time), and data rate. Every time a stream cell 43 needs to be processed, system 10 incurs the cost of a set-up time. System 10 uses disk synchronizer 88 to set-up the source 22 from the correct location for the current stream. In a double quadrant buffering scheme, over the life of the stream, the accumulated time required to set-up the stream will be half of the time required when using the single quadrant buffer.
The negative to using the double quadrant buffering scheme is that each time a cell 43 needs to be processed, twice as much data must be moved from the source 22 to the stream buffer 47. Because stream engine 80 maintains a single thread of execution when processing the stream cells 43, if a single cell 45 maintains control of source 22 for too long, other streams have the potential to starve. The buffering scheme for a particular stream cell 43 is stored in the run-time control data buffer 49. Stream count manager 80 manages the progress counter for the purpose of indicating the current status of stream engine 80 for any given stream cell 43. The progress counter is used by synthesizer engine 66 as a method to signal the stream engine 80 for more data. Likewise, the progress counter can be used to indicate that data is available.
After the buffer management selector 84 has chosen the appropriate buffering scheme, disk synchronizer 80 sets up source 22 by issuing a command to move the reading apparatus to the correct location on source 22. The exact location on source 22 for any stream cell 43 is maintained in the run-time control data buffer 49. The run-time control data is updated on each pass through the stream manager 80.
System 10 accounts for compressed data. Any compression algorithm can be implemented. However, for the application of musical tone generation a lossless scheme is preferably implemented. The lossless algorithm guarantees that quality does not degrade due to the compression/decompression process. The musical samples will be compressed prior to storage in source 22. This embodiment implements a second order differential compression/decompression algorithm. The decompression of a stream is handled by a decompressor 96. The compressed stream is packetized. Each packet contains a signal control word, which is used as a control data for decompressor 96. The control words are initially stored with the instrument on source 22. When an instrument is loaded into system 10, download module 64 transfers the control words from the source 22 to the global control data buffer 53 in cells 41 of pool 40. The decompression algorithm can accept various packet sizes, providing they are of a 2" size. The size cf
the packet offsets the compression ratio of the sampled instrument. This embodiment use 2 kiloword packet sizes.
The data mover is responsible for directing the sampled instrument from source 22 to either the second order decompressor 96 or the stream buffer 47. Before the data is moved, command synchronization module 102 executes. Module 102 is responsible for handling any error correction and all synchronization between the stream engine 80 and parser 50. The command synchronization 102 handles the previously discussed preemption of stream cell 43. If a stream cell 43 is prematurely deallocated because system 10 has hit maximum polyphony and a new voice has been started, then stream engine 80 must check the integrity of the active stream cell 43 before proceeding. This situation is a result of stream engine 80 and parser 50 running asynchronously.
With referenced FIGURE 2 , system 10 can be configured to provide the first part of digital waveform signals stored in deck buffer 51 of cells 41 and pool 40 to cells 43 in array 42 rather than directly providing the first part from pool 40 to output 28 through synthesizer engine 66. In this method, engine 80 or engine 66 moves the first part of the digital waveform signal from pool 40 to cell 43 in array 42. Synthesizer engine 66 then moves the first part of the digital waveform signal from cell 43 of array 42 to digital output 28.
Deck buffer 45 is preferably comprised of 64K bytes. Stream buffer 47 is preferably comprised of from two to four quadratures of 32 Kilobytes (Kb) . Control buffer 49 is comprised of 200 bytes. Source 22 preferably has a seek time of approximately 7 milliseconds (ms) and a data rate of 8-12 MB/S although other times are possible. Source 22 is preferably a one gigabyte or more hard disk drive.
It is understood that, while the detailed drawings, specific examples and particular component values given describe a preferred exemplary embodiment of the present invention, they are for the purpose of illustration only. The apparatus and method of the invention is not limited to the precise details and conditions disclosed. For example, although the system is utilized as a software configured computer based synthesizer, other hardware and software schemes can be utilized. Further, although specific memory sizes and time periods are discussed, they are discussed only for the purpose of illustration. Further, single lines in the various drawings may represent multiple conductors. Various changes made to the details disclosed without departing from the spirit of the invention which is defined by the following claims.

Claims

CLAIMSWe claim:
1. A digital synthesizer system, having an input and an output, that uses a plurality of digital waveform signals, the digital synthesizer system comprising: a mass storage device that stores at least a second portion of each of the plurality of digital waveform signals; a memory that stores a first portion of each of the plurality of digital waveform signals; and a control circuit, coupled to the memory and the mass storage device, that provides to the output from the memory the first portion of a selected one of the plurality of digital waveform signals and then provides to the output from the mass storage device the second portion of the selected one of the plurality of digital waveform signals.
2. The digital synthesizer system of claim 1 wherein the second portion of the selected one of the plurality of digital waveform signals is stored in the memory after the first portion of the selected one of the plurality of digital waveform signals has been selected.
3. The digital synthesizer system of claim 2 wherein the second portion of the selected one of the plurality of digital waveform signals is retrieved from the memory when stored in the memory.
4. The digital synthesizer system of claim 1 wherein the control circuit comprises a
CPU.
5. The digital synthesizer system of any of the claims 1 , 2, 3 or 4 wherein the mass storage device comprises an optical disk drive, tape drive, hard disk drive, or a floppy disk drive.
6. A digital synthesizer system, having an input and an output, that uses a plurality of digital waveform signals, the digital synthesizer system comprising: a first memory, having relatively slower access timing, that stores a first portion and a second portion of each of the plurality of digital waveform signals; a second memory having relatively faster access timing; a control circuit, coupled to the first memory and the second memory, that provides to the output the first portion followed by the second portion of a selected one of the plurality of digital waveform signals; and the control circuit selectively moving the first portion of the selected one of the plurality of digital waveform signals into the second memory.
7. The digital synthesizer system of claim 6 wherein the first portion of the selected one of the plurality of digital waveform signals is stored in the second memory after the control circuit provides the first portion of the plurality of digital waveform signals to the output.
8. The digital synthesizer system of claim 7 wherein the first portion of the selected one of the plurality of digital waveform signals is retrieved from the second memory when stored in the second memory.
9. The digital synthesizer system of claim 6 wherein the control circuit comprises a CPU.
10. The digital synthesizer system of any of the claims 6, 7, 8 or 9 wherein the first memory comprises an optical disk drive, tape drive, hard disk drive, or a floppy disk drive.
PCT/US1999/013182 1998-06-10 1999-06-10 Synthesizer system utilizing mass storage devices for real time, low latency access of musical instrument digital samples WO1999065016A1 (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
JP2000553943A JP2002518693A (en) 1998-06-10 1999-06-10 Synthesizer system using mass storage device for real-time, low-latency access of musical instrument digital samples
EP99927461A EP1101217A1 (en) 1998-06-10 1999-06-10 Synthesizer system utilizing mass storage devices for real time, low latency access of musical instrument digital samples

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US09/095,519 1998-06-10
US09/095,519 US6008446A (en) 1997-05-27 1998-06-10 Synthesizer system utilizing mass storage devices for real time, low latency access of musical instrument digital samples

Publications (1)

Publication Number Publication Date
WO1999065016A1 true WO1999065016A1 (en) 1999-12-16

Family

ID=22252375

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US1999/013182 WO1999065016A1 (en) 1998-06-10 1999-06-10 Synthesizer system utilizing mass storage devices for real time, low latency access of musical instrument digital samples

Country Status (3)

Country Link
EP (1) EP1101217A1 (en)
JP (1) JP2002518693A (en)
WO (1) WO1999065016A1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2003007290A2 (en) * 2001-07-10 2003-01-23 Post Musical Instruments B.V. Digital musical instrument

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US4508001A (en) * 1983-07-29 1985-04-02 Nippon Gakki Seizo Kabushiki Kaisha Electronic musical instrument using large-capacity recording medium
WO1992015087A1 (en) * 1991-02-15 1992-09-03 Studio Audio & Video Limited Musical data storage techniques
US5811706A (en) * 1997-05-27 1998-09-22 Rockwell Semiconductor Systems, Inc. Synthesizer system utilizing mass storage devices for real time, low latency access of musical instrument digital samples

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US4508001A (en) * 1983-07-29 1985-04-02 Nippon Gakki Seizo Kabushiki Kaisha Electronic musical instrument using large-capacity recording medium
WO1992015087A1 (en) * 1991-02-15 1992-09-03 Studio Audio & Video Limited Musical data storage techniques
US5811706A (en) * 1997-05-27 1998-09-22 Rockwell Semiconductor Systems, Inc. Synthesizer system utilizing mass storage devices for real time, low latency access of musical instrument digital samples

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2003007290A2 (en) * 2001-07-10 2003-01-23 Post Musical Instruments B.V. Digital musical instrument
WO2003007290A3 (en) * 2001-07-10 2003-10-16 Post Musical Instr B V Digital musical instrument

Also Published As

Publication number Publication date
EP1101217A1 (en) 2001-05-23
JP2002518693A (en) 2002-06-25

Similar Documents

Publication Publication Date Title
US5811706A (en) Synthesizer system utilizing mass storage devices for real time, low latency access of musical instrument digital samples
JP3161561B2 (en) Multimedia system
US5515474A (en) Audio I/O instruction interpretation for audio card
KR0149251B1 (en) Micromanipulation of waveforms in a sampling music synthesizer
US7728213B2 (en) System and method for dynamic note assignment for musical synthesizers
US6100461A (en) Wavetable cache using simplified looping
US5747714A (en) Digital tone synthesis modeling for complex instruments
EP0853802B1 (en) Audio synthesizer
US7045700B2 (en) Method and apparatus for playing a digital music file based on resource availability
EP0600639B1 (en) System and method for dynamically configuring synthesizers
CN110299128B (en) Electronic musical instrument, method, and storage medium
US5918302A (en) Digital sound-producing integrated circuit with virtual cache
US4805511A (en) Electronic bell-tone generating system
WO1999065016A1 (en) Synthesizer system utilizing mass storage devices for real time, low latency access of musical instrument digital samples
EP2015855B1 (en) System and method for dynamic note assignment for musical synthesizers
WO2008008418A2 (en) System and method for creation of a virtual musical instrument that controls multiple samplers or synthesizers by remapping portions of each to a single keyboard
US20050076770A1 (en) System and method for dynamic note assignment for musical synthesizers
JP3518716B2 (en) Music synthesizer
US5380949A (en) Key assigner for an electronic musical instrument having multiple tone channels and priority level value data
JP7332002B2 (en) Electronic musical instrument, method and program
WO2007123549A1 (en) System and method for dynamic note assignment for musical synthesizers
US9418641B2 (en) Swap Divisi process
JP7124370B2 (en) Electronic musical instrument, method and program
JPH0266597A (en) Musical sound composition system
WO2008016649A2 (en) System and method for consistent power balance of notes played on sampled or synthesized sounds

Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A1

Designated state(s): JP

AL Designated countries for regional patents

Kind code of ref document: A1

Designated state(s): AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE

121 Ep: the epo has been informed by wipo that ep was designated in this application
ENP Entry into the national phase

Ref country code: JP

Ref document number: 2000 553943

Kind code of ref document: A

Format of ref document f/p: F

WWE Wipo information: entry into national phase

Ref document number: 1999927461

Country of ref document: EP

DFPE Request for preliminary examination filed prior to expiration of 19th month from priority date (pct application filed before 20040101)
WWW Wipo information: withdrawn in national office

Ref document number: 1999927461

Country of ref document: EP

WWP Wipo information: published in national office

Ref document number: 1999927461

Country of ref document: EP