WO2008008425A2 - Musical performance desk spread simulator - Google Patents

Musical performance desk spread simulator

Publication number
WO2008008425A2
Authority
WO
WIPO (PCT)
Prior art keywords
player
note
notes
pitch
players
Prior art date
Application number
PCT/US2007/015882
Other languages
French (fr)
Other versions
WO2008008425A3 (en
Inventor
Christopher L. Stone
Original Assignee
The Stone Family Trust Of 1992
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by The Stone Family Trust Of 1992
Publication of WO2008008425A2
Publication of WO2008008425A3

Classifications

    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10HELECTROPHONIC MUSICAL INSTRUMENTS; INSTRUMENTS IN WHICH THE TONES ARE GENERATED BY ELECTROMECHANICAL MEANS OR ELECTRONIC GENERATORS, OR IN WHICH THE TONES ARE SYNTHESISED FROM A DATA STORE
    • G10H1/00Details of electrophonic musical instruments
    • G10H1/02Means for controlling the tone frequencies, e.g. attack or decay; Means for producing special musical effects, e.g. vibratos or glissandos
    • G10H1/06Circuits for establishing the harmonic content of tones, or other arrangements for changing the tone colour
    • G10H1/08Circuits for establishing the harmonic content of tones, or other arrangements for changing the tone colour by combining tones
    • G10H1/10Circuits for establishing the harmonic content of tones, or other arrangements for changing the tone colour by combining tones for obtaining chorus, celeste or ensemble effects
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10HELECTROPHONIC MUSICAL INSTRUMENTS; INSTRUMENTS IN WHICH THE TONES ARE GENERATED BY ELECTROMECHANICAL MEANS OR ELECTRONIC GENERATORS, OR IN WHICH THE TONES ARE SYNTHESISED FROM A DATA STORE
    • G10H2210/00Aspects or methods of musical processing having intrinsic musical character, i.e. involving musical theory or musical parameters or relying on musical knowledge, as applied in electrophonic musical tools or instruments
    • G10H2210/155Musical effects
    • G10H2210/195Modulation effects, i.e. smooth non-discontinuous variations over a time interval, e.g. within a note, melody or musical transition, of any sound parameter, e.g. amplitude, pitch, spectral response or playback speed
    • G10H2210/221Glissando, i.e. pitch smoothly sliding from one note to another, e.g. gliss, glide, slide, bend, smear or sweep
    • G10H2210/225Portamento, i.e. smooth continuously variable pitch-bend, without emphasis of each chromatic pitch during the pitch change, which only stops at the end of the pitch shift, as obtained, e.g. by a MIDI pitch wheel or trombone
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10HELECTROPHONIC MUSICAL INSTRUMENTS; INSTRUMENTS IN WHICH THE TONES ARE GENERATED BY ELECTROMECHANICAL MEANS OR ELECTRONIC GENERATORS, OR IN WHICH THE TONES ARE SYNTHESISED FROM A DATA STORE
    • G10H2210/00Aspects or methods of musical processing having intrinsic musical character, i.e. involving musical theory or musical parameters or relying on musical knowledge, as applied in electrophonic musical tools or instruments
    • G10H2210/155Musical effects
    • G10H2210/245Ensemble, i.e. adding one or more voices, also instrumental voices
    • G10H2210/251Chorus, i.e. automatic generation of two or more extra voices added to the melody, e.g. by a chorus effect processor or multiple voice harmonizer, to produce a chorus or unison effect, wherein individual sounds from multiple sources with roughly the same timbre converge and are perceived as one
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10HELECTROPHONIC MUSICAL INSTRUMENTS; INSTRUMENTS IN WHICH THE TONES ARE GENERATED BY ELECTROMECHANICAL MEANS OR ELECTRONIC GENERATORS, OR IN WHICH THE TONES ARE SYNTHESISED FROM A DATA STORE
    • G10H2210/00Aspects or methods of musical processing having intrinsic musical character, i.e. involving musical theory or musical parameters or relying on musical knowledge, as applied in electrophonic musical tools or instruments
    • G10H2210/155Musical effects
    • G10H2210/245Ensemble, i.e. adding one or more voices, also instrumental voices
    • G10H2210/251Chorus, i.e. automatic generation of two or more extra voices added to the melody, e.g. by a chorus effect processor or multiple voice harmonizer, to produce a chorus or unison effect, wherein individual sounds from multiple sources with roughly the same timbre converge and are perceived as one
    • G10H2210/255Unison, i.e. two or more voices or instruments sounding substantially the same pitch, e.g. at the same time
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10HELECTROPHONIC MUSICAL INSTRUMENTS; INSTRUMENTS IN WHICH THE TONES ARE GENERATED BY ELECTROMECHANICAL MEANS OR ELECTRONIC GENERATORS, OR IN WHICH THE TONES ARE SYNTHESISED FROM A DATA STORE
    • G10H2240/00Data organisation or data communication aspects, specifically adapted for electrophonic musical tools or instruments
    • G10H2240/171Transmission of musical instrument data, control or status information; Transmission, remote access or control of music data for electrophonic musical instruments
    • G10H2240/281Protocol or standard connector for transmission of analog or digital data to or from an electrophonic musical instrument
    • G10H2240/311MIDI transmission
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10HELECTROPHONIC MUSICAL INSTRUMENTS; INSTRUMENTS IN WHICH THE TONES ARE GENERATED BY ELECTROMECHANICAL MEANS OR ELECTRONIC GENERATORS, OR IN WHICH THE TONES ARE SYNTHESISED FROM A DATA STORE
    • G10H2250/00Aspects of algorithms or signal processing methods without intrinsic musical character, yet specifically adapted for or used in electrophonic musical processing
    • G10H2250/541Details of musical waveform synthesis, i.e. audio waveshape processing from individual wavetable samples, independently of their origin or of the sound they represent
    • G10H2250/641Waveform sampler, i.e. music samplers; Sampled music loop processing, wherein a loop is a sample of a performance that has been edited to repeat seamlessly without clicks or artifacts

Definitions

  • This system and methods relate to the field of increasing the realism of the resultant sound when multiple, discretely sampled acoustic instruments (or small clusters of instruments or synthesized instruments) are played in a manner intended to be an ensemble performance.
  • the methods are intended to emulate the natural deviations in player performance that occur, and with scaling to varying degrees of player skill and precision.
  • the system and methods for manipulation of sampled or synthesized sounds described herein relate to the playback of orchestral sounds, choirs, or any type of music in a manner that simulates the concurrent live performance of said sounds in an ensemble.
  • the proposed system and methods will have little if any effect on the playing of a single note or chord; rather, the major musical benefit is to emulate the "blur" of sound that occurs as an ensemble plays notes in motion.
  • each individual instrument (typically) or perhaps each pair of instruments is represented by a separate audio sample (or a sample set comprised of multiple samples that play as a unit, such as a sampled attack, a sampled sustain and a sampled release for a given instrument, which set functions to play a given note).
  • Most of the sampled instruments commercially available today are either solo instruments or whole sections of instruments. It is possible to have individual sampled or synthesized instruments that alone sound realistic and yet when multiple such "solo" instruments are played together, the result seldom if ever sounds like these "players" are actually performing live in the same space at the same time. There are several reasons why this is so, and the methods addressed herein present a technical remedy to one aspect of this lack of ensemble realism.
  • PTO reference number 11/208798 confirmed by postcard with PTO STAMP: 112949, and also a Microphone Bleed Simulator patent being filed this week which claims priority from provisional patent application Number 60807166 submitted July 12, 2006, as confirmed by electronic acknowledgement receipt EFS ID 1110859, Confirmation
  • String instruments, like all acoustic instruments, are mechanical devices that are subject to tonal differences and timing anomalies just like any other machine that lacks absolute precision; and since musical instruments are played by humans, they definitely lack precision.
  • a group such as the violin section
  • effects is a musical term that covers such sounds as for example vibrato, tremolo, col legno, sul ponticello and so forth, all of which are produced by variations in the live player's technique.
  • these typical effects involve such factors as the position of the bow relative to the neck and bridge of the instrument, how much 'flutter' in the bow hand is
  • the violin for example, is not a fretted instrument which means that tuning accuracy from note to note is also relatively flexible and random as it depends upon the player's hand position and the string tension he or she applies to the string.
  • prior art ensemble samples are too perfect and predictable to sound entirely realistic.
  • One prior art sample library and software product uses recorded instruments playing musical runs of notes, or phrases, from which databases of the phrase characteristics are modeled, and from which models a synthesized process
  • FIG. 1 Desk spread simulator block diagram depicting the functional elements that comprise a representative system.
  • Pitch spread simplified diagram - depicts the way pitch changes according to player position, as indicated by desk number. This is an example for 8 players (8 desks).
  • Pitch spread method diagram - depicts a more detailed version of the pitch spread of Fig. 2 (for 16 player positions) with a logical flow diagram to depict an example of what functions may occur in the pitch spread system and what functions may occur in the sampler.
  • Time spread simplified diagram depicts the way initial playing of the note is progressively delayed according to player position, as indicated by desk number. This is an example for 8 players (8 desks).
  • Time spread method diagram depicts a more detailed version of the time spread of Fig. 3 (for 16 player positions) with a logical flow diagram to depict an example of what functions may occur in the time spread system and what functions may occur in the sampler.
  • Hysteresis system block diagram depicts a method whereby the speed of successively played notes creates a scaling value (which is hysteresis) and how that value is applied to the remaining elements of the desk spread system (MIDI Pitch Shift, MIDI Delay, and Envelope-DSR) to command the various deviations that comprise desk spread.
  • sample player In order to play samples one typically uses a so-called sample player, which is generally implemented as a software module running on a general purpose computer, either as a stand-alone program or as a plug-in to a host program environment. Some sample players are implemented on special-purpose dedicated
  • samplers which, internally, still use sample software and computer components, but which are constituted as an embedded system. Most such samplers are caused to play notes either by direct control from an external device (typically a MIDI keyboard, where MIDI is an acronym for Musical Instrument Digital Interface), from a different type of note generator (such as a MIDI guitar pickup) or from a device
  • MIDI refers to the Musical Instrument Digital Interface, a standard promulgated almost 3 decades ago and widely used in the musical industry today, albeit with many modifications and updates; we refer to MIDI control in this specification since it is prevalent, but any means to control sample playing or
  • Synthesizers use wave tables to build sounds or they use various oscillators or frequency modulation methods to create sounds, but generally synthesizers also function under software control, referring to lookup tables or other specification files, and they operate under control of a host program of some sort;
  • generally, synthesizers also use the MIDI protocol, MIDI note generators and sequencers for control of note playing.
  • Sequencers are typically implemented as software programs running on a general purpose computer, either the same computer or a different computer from the one on which the sampler runs (one sequencer or keyboard can address
  • block 10 refers to the overall system of playing sampled or synthesized sounds and comprises an input device 105 which may be a MIDI keyboard or other note generator or a sequencer, a Desk
  • Spread Processor 110 which is the actual controlling system proposed in this document, and a sample player (or synthesizer) 120, which may be one or more actual "samplers" or "synthesizers" on one or multiple hardware platforms. Samplers and synthesizers are considered to be (or to play) virtual instruments, in contrast to the actual physical musical instruments whose sounds they may emulate.
  • played by a given section of like instruments, i.e., by the first violins
  • keyboard hysteresis automatically increases the effectiveness of other controls that set the Pitch Spread
  • hysteresis may also affect Glide Speed (glide refers to a smooth change in pitch from one note to the next, as contrasted with a single jump in pitch that occurs without glide).
  • Musically, the effects of these various control methods accord with the manner in which musicians tend to get "sloppier" as they have to play faster and faster. While hysteresis is a shared control factor for multiple methods of increasing realism, we'll first describe the various methods themselves and then show how hysteresis works to influence them.
  • keyboard hysteresis scaling must be empirically determined with a given set of sampled or synthesized sounds but the method itself
  • PITCH SPREAD This function commands differential pitch shift for musical notes sent from the system (Fig. 1 110, 155) to the various sampler/synthesizer paths 120; successively higher numbered instrument desks play notes that diverge increasingly from the default pitch of the note that is supposed to be sounding, and which is being sounded by player number 1.
  • the pitch spread should be adjustable and for this purpose we provide a control such as the examples shown in Fig. 2 205/215/225.
  • This fader-like implementation of a user interface adjusts an Outgoing note pitch shifter in Desk Spread Processor 155 of Fig.
  • the lowest numbered desk (#1) is always the closest to the conductor/audience, and would typically be the "first seat" in an orchestral section - the most skilled player.
  • the highest numbered desk (#8 or #16 in these examples) is always furthest away and would typically be the least skilled player.
  • path refers to a specific MIDI port and channel by which the system addresses (commands to play) a given sample; a path may also be considered to represent a "desk" (which is a music stand that may have one, two or even a few musicians playing at once and captured on a single sample). The path, therefore, represents the smallest discretely addressable player of a note, typically one or two musicians on a sampler channel. As indicated in Fig.
  • the last desk (the highest numbered instrument path, #8 in Fig. 2, 230) will be a half tone above or below the default pitch for the note nominally being played by that desk.
  • each desk in succession does not simply go up or down in pitch; the first desk is unchanged, the second desk rises in pitch, the one after that goes lower, the one after that higher, and so forth. For this reason, the highest numbered desk will be higher in pitch if it's
  • MIDI CC control change
  • each unique MIDI port/channel may have up to 128 MIDI
  • Each MIDI CC# can have a range of value spanning 128 increments from value 0 through value 127.
  • the "meaning" or interpretation of what a given MIDI CC# and CC value do is based on how the specific sampler and sample set (or synthesizer) are programmed; there are a few standard usages but the majority of CC information is not fixed in the industry.
  • MIDI CC#s are reserved for common control functions, many are available to be used for whatever purpose is desired; any non-conflicting "available" CC#s can be allocated for the CC value communication to implement the present system.
  • Pitch spread MIDI CC values are generated for all notes on each of the pertinent instrument paths as commanded by the pitch spread control 305.
  • the values are derived in 310 either by consulting a set of lookup tables based on the Pitch Spread control setting, the number of active instrument paths, and the desired curve set, or by an algorithm which performs
  • the pitch spread CC values are not sent out to the sampler 315 until they receive one additional bit of processing; they are in fact scaled according to the keyboard hysteresis value in step 313.
  • hysteresis in greater detail subsequently. Hysteresis can be applied so that maximum hysteresis values cause
  • the pitch spread to increase beyond the value set with the Pitch Spread control, at middling hysteresis values the Pitch Spread control's output can be unaltered, and at low hysteresis values the pitch spread can be reduced to less than the set Pitch Spread control values; or Hysteresis can be applied so that maximum hysteresis values cause the Pitch Spread control to output its current set value, and any less
  • hysteresis scales down the Pitch Spread control output.
  • the choice of methods to apply Hysteresis may vary with user preference or overall system implementation considerations.
  • After hysteresis scaling the pitch bend CC values 315 are sent to the sampler(s) paths along with the note-on and other information so the sampler "knows" what to play.
  • implementation may well change as sampler/synthesizer and computer technology evolve, but the basic concept remains the same: command automatically-scaled pitch shift per sample.
  • a suitably written script interprets incoming CC values for each note 370. The script may be bypassed for path #1 as shown in 375 because path #1 never has any pitch bend
  • based on desk spread; this step is optional but may be useful to reduce unnecessary pitch bend calculations at the sampler.
  • the script checks to see if the CC value is at, above or below a value of 63 (step 380). If it is below 63 (step 390) and moving toward zero then the script commands negative (downward bending) pitch excursion, scaled such that a value of zero is equal to ½
  • the script interprets CC values above 63 and moving toward 127 (step 385) to command positive (upward bending) pitch excursion scaled such that a value of 127 is equal to ½ tone upward shift.
  • the appropriately pitch-bent note for each path is thereby played by the sampler 395.
  • TIME SPREAD This function introduces progressively greater delays in note onset when multiple desks are playing in response to "simultaneous" note-on events; the meaning of "simultaneous" may be defined to include note-on events that fall within a specified time window such as 2 milliseconds, rather than truly
  • Time Spread itself may be implemented using a straightforward MIDI time delay function.
  • the delay technique presented here is unique in that it sets an automatically generated and appropriately scaled timing value for each of the
  • the delay can be visualized to be established by overlaying the path number on the chosen curve whose maximum value equals the set value of the Time Spread control at the last path (Fig. 5 path #16). In reality, this time spread mapping will be accomplished either by lookup tables or by algorithmic
  • the note-on commands plus a MIDI CC value (or other control parameter) that commands the sampler to wait for the designated time interval before sounding the note-on data may be sent from the Desk Spread Processor (Fig. 1, 110) immediately to the sampler (Fig. 1, 120); in the sampler itself the delay is actually implemented (except as otherwise noted below).
  • CC value i.e., CC value 0
  • Sliding the Time Spread control all the way up to the top (100 ms) position per 545 causes the extreme end of the curve to reach CC value 127, and intermediate control settings merely scale the highest numbered path's maximum curve deviation and all intermediate paths accordingly.
  • the values are derived in 510 either (a) by consulting a set of lookup tables based on the Time Spread control setting, the number of active instrument paths, and the desired curve, or (b) by an algorithm which performs essentially the same function by means of calculation.
  • time delay CC values are not sent out to the sampler 520 until they receive one additional bit of processing; they are in fact scaled according to the keyboard hysteresis value in step 515.
  • the time spread value of the highest numbered path equals that which is set with the Time Spread control, and at
  • the time delay CC values are proportionately reduced to less than the set Time Spread control value.
  • the time delay CC values are sent to the sampler(s) along with the note-on and other information so the sampler "knows" what to play.
  • sampler 515 can be scripted to produce time delay for a given sample based on MIDI CC values; such samplers are commercially available. However, it is equally viable to delay the output of notes issued to the sampler or synthesizer from the Time Spread controlling module; the authors opted at this time for sampler scripting because it balances the workload of the various system modules and provides (based on
  • the script scales the time delay such that a value of zero is equal to no delay and a value of 127 is equal to 100 milliseconds time delay (or whatever value one wishes to establish as the maximum; such high values yield unnatural sounds and yield more of a special effect than a
  • DECAY-SUSTAIN-RELEASE ENVELOPE The concept of applying a volume envelope to a sound dates back to very early synthesizers.
  • an envelope generator has four controls: Attack, Decay, Sustain and Release (A-D-S-R). This can be altered to a smaller or larger set of envelope controlling parameters such as the D-S-R envelope proposed herein.
  • Envelopes have been used on samples as well as synthesized sounds, but they are applied somewhat differently. That's because samples include their own envelope - the level changes that were present in the original recorded sound. Raw sample recordings are subsequently "cut" into several segments for suitable playback control, and it is still possible to apply an envelope to the sound.
  • Each sampled note for each instrument in a given sample library will have its own unique sustain (if the note is looped), decay and release characteristics, which means that empirical testing is necessary to "fine tune" the actual envelope for each sampled note to work correctly with the particular sample set; it is not worthwhile to publish general values or an algorithmic rule for this function as ideally it should be scripted per library.
  • a CC scale of 0 through 127 can be used internally within the system's ENVELOPE subsystem, subject to overall hysteresis scaling and to sensitivity scaling per sample set (or per sampled or synthesized instrument).
  • Hysteresis affects multiple subsystems and itself is scaled according to the time difference between notes
  • a hysteresis generator 605 examines the incoming stream of note-on commands and times the successive intervals from note to note to derive a smoothed moving average value for this inter-note time. The shorter the time between notes, the greater the hysteresis value.
  • the output of the hysteresis generator will be a coefficient stream clamped to 0 minimum and 1 maximum, and will flow through a Hysteresis control 610 that scales the generated hysteresis value from 100 percent (1) down to zero percent (0).
  • the hysteresis coefficient value coming out of control 610 is then distributed to the various components
  • hysteresis serves to increase the effectiveness of the Pitch Spread, Time Spread and D-S-R Envelope Modification subsystems as notes are played in more rapid succession (or
  • Fig. 6 depicts this output value graphically as a vertical bar in the representation of the various controls.
  • control slider "knobs" in 610, 620, 630 and 640 represent the setting of the control, but not the actual output of each control; the gray bars to the right of
  • Hysteresis control 610 is set at maximum but is not putting out maximum scaling multiplier of 1.0 for hysteresis in the illustration, presumably because the rate at which notes are incoming is less than the 16 note-per-second speed required to drive the hysteresis to full value. (If the control of 610 were pulled down to 50% instead of the 100%
  • This hysteresis value is distributed to the Pitch Spread, Time Spread and Release Envelope controls (620, 630, 640) through distribution system 615.
  • the Pitch Spread control 620 is set slightly above UNISON, but because the
  • the Time Spread control 630 is set at maximum, or 100 milliseconds in this example, but because it's receiving that 75% hysteresis value (e.g., a scalar multiplier of about 0.75), the time spread output is 75% of maximum, or a CC that would produce about 75 milliseconds maximum time delay.
  • Envelope also works a bit differently. Notice that the Envelope Control 640 has a zero value at mid-point in the control scale. That's because the default envelope applied to the actual samples is constructed to partially truncate the sampled notes; otherwise notes would "ring out" too much in ordinary usage. Thus
  • the incoming 75% hysteresis value has a different effect on the Envelope Control and its subsystem.
  • any Hysteresis values above 0.5 push the envelope out to a longer decay-sustain-release characteristic, with a maximum occurring at 1.0, whereas Hysteresis values below 0.5 pull the envelope in to a shorter decay-sustain-release characteristic; a Hysteresis value of 0.5 is "neutral", having no effect.
  • the Release Envelope 640 slider is set at mid-point (0 value)
  • the effective envelope is longer due to the hysteresis input.
  • the Hysteresis Distribution block of the illustration 615 indicates that it may be desirable to scale the hysteresis effect before it's applied to one or another subsystem; again empirical testing will be necessary here, and it is anticipated that scaling the applied hysteresis values between subsystems may benefit from user-adjustability based on preferences or style of music being played.
  • the scaling is thus optional as it also may not be required or it may be accomplished by other means such as altering the effective sensitivity of each subsystem by means of sampler scripting.
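
The hysteresis generator described above (605) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the system's actual implementation: the class name, the exponentially weighted form of the "smoothed moving average", the smoothing weight and the linear mapping from note rate to coefficient are all hypothetical choices. Only the 0-to-1 clamping, the 16 note-per-second full-scale rate and the 0-to-100-percent Hysteresis control come from the text.

```python
class HysteresisGenerator:
    """Turns note-on timing into a 0..1 scaling coefficient (illustrative sketch)."""

    def __init__(self, full_rate_hz=16.0, smoothing=0.5, control=1.0):
        self.full_rate_hz = full_rate_hz  # note rate that yields full hysteresis (from the text)
        self.smoothing = smoothing        # ASSUMPTION: weight given to the newest interval
        self.control = control            # Hysteresis control setting, 0.0 (0%) to 1.0 (100%)
        self._last_time = None
        self._avg_interval = None

    def note_on(self, time_s):
        """Register a note-on at time_s (seconds) and return the current coefficient."""
        if self._last_time is not None:
            interval = time_s - self._last_time
            if interval > 0:
                if self._avg_interval is None:
                    self._avg_interval = interval
                else:
                    a = self.smoothing
                    # ASSUMPTION: exponentially weighted moving average of inter-note time
                    self._avg_interval = a * interval + (1 - a) * self._avg_interval
        self._last_time = time_s
        return self.value()

    def value(self):
        """Shorter inter-note time gives a larger value, clamped to 0..1, then scaled."""
        if self._avg_interval is None:
            return 0.0
        rate = 1.0 / self._avg_interval
        raw = min(1.0, max(0.0, rate / self.full_rate_hz))  # ASSUMPTION: linear in note rate
        return raw * self.control


if __name__ == "__main__":
    gen = HysteresisGenerator()
    for i in range(10):
        coefficient = gen.note_on(i / 12.0)  # twelve notes per second
    print(round(coefficient, 3))
```

Under these assumptions, a steady stream of twelve notes per second settles at a coefficient of about 0.75, which is in the region of the 75% hysteresis value used in the Fig. 6 discussion; the real scaling would have to be determined empirically, as the text notes.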
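
The pitch spread mapping described above can likewise be sketched in a few lines. The sketch assumes a linear curve (the text allows lookup tables or other curve sets), a Pitch Spread control expressed as 0.0 (UNISON) to 1.0 (last desk a half tone off), and the variant in which hysteresis only scales the control output downward; the function names are hypothetical. The alternating up/down direction per desk, the unshifted path #1, the CC center value of 63 and the half-tone extremes at CC 0 and CC 127 follow the text.

```python
CENTER_CC = 63           # CC value that commands no pitch bend
HALF_TONE_CENTS = 100.0  # a half tone expressed in cents


def _clamp01(x):
    return min(1.0, max(0.0, x))


def pitch_spread_cc(num_paths, control, hysteresis=1.0):
    """Return one pitch-bend CC value per path (desk), path #1 first.

    ASSUMPTION: a linear curve, so deviation grows in proportion to desk number.
    """
    depth_max = _clamp01(control) * _clamp01(hysteresis)
    values = []
    for path in range(1, num_paths + 1):
        if path == 1:
            values.append(CENTER_CC)  # the first desk is never shifted
            continue
        depth = depth_max * (path - 1) / (num_paths - 1)
        if path % 2 == 0:
            # even-numbered desks rise in pitch, toward CC 127
            values.append(CENTER_CC + round(depth * (127 - CENTER_CC)))
        else:
            # odd-numbered desks after the first go lower, toward CC 0
            values.append(CENTER_CC - round(depth * CENTER_CC))
    return values


def cc_to_cents(cc):
    """Sampler-side interpretation: 0 is a half tone down, 127 is a half tone up."""
    if cc > CENTER_CC:
        return HALF_TONE_CENTS * (cc - CENTER_CC) / (127 - CENTER_CC)
    if cc < CENTER_CC:
        return -HALF_TONE_CENTS * (CENTER_CC - cc) / CENTER_CC
    return 0.0


if __name__ == "__main__":
    for path, cc in enumerate(pitch_spread_cc(8, control=1.0), start=1):
        print(path, cc, round(cc_to_cents(cc), 1))
```

Because 63 is not the exact midpoint of 0 through 127, the sketch scales the upward and downward halves separately so that both extremes land on a half tone.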
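
The time spread mapping can be illustrated in the same way. Again this is a sketch under assumptions: a linear curve across paths, a Time Spread control expressed in milliseconds with a 100 ms maximum, and hypothetical function names. The conventions that CC 0 means no delay, that CC 127 means the maximum delay, that path #1 is undelayed and that the values are scaled by the hysteresis coefficient before being sent are taken from the text.

```python
MAX_DELAY_MS = 100.0  # delay represented by CC 127 in the text's example


def time_spread_cc(num_paths, control_ms, hysteresis=1.0):
    """Return one delay CC value per path (desk), path #1 first.

    ASSUMPTION: a linear curve from no delay at path #1 to the scaled
    Time Spread control value at the highest numbered path.
    """
    control = min(MAX_DELAY_MS, max(0.0, control_ms)) / MAX_DELAY_MS
    top_cc = 127 * control * min(1.0, max(0.0, hysteresis))
    if num_paths < 2:
        return [0] * num_paths
    return [round(top_cc * (path - 1) / (num_paths - 1))
            for path in range(1, num_paths + 1)]


def cc_to_delay_ms(cc):
    """Sampler-side interpretation: 0 is no delay, 127 is the maximum delay."""
    return MAX_DELAY_MS * cc / 127


if __name__ == "__main__":
    ccs = time_spread_cc(16, control_ms=100.0, hysteresis=0.75)
    print(ccs)
    print(round(cc_to_delay_ms(ccs[-1]), 1))
```

With the control at its 100 ms maximum and a hysteresis coefficient of 0.75, the last of 16 paths receives a CC value that the sampler-side function converts to roughly 75 milliseconds, consistent with the Fig. 6 example.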

Landscapes

  • Physics & Mathematics (AREA)
  • Engineering & Computer Science (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Electrophonic Musical Instruments (AREA)

Abstract

A system and methods are proposed for manipulation of playback of sampled orchestral and other sounds, simulating multiple factors which characterize live performance of notes in motion by an ensemble. The methods rely upon use of samples or synthesized sounds which are created for discrete individual instruments or desks, which desks typically comprise a pair of instruments, rather than large numbers of instruments per sample. Following the manner in which musicians tend to play somewhat out-of-synchronization with one another, somewhat off-pitch from the intended notes, and with a bit of overlap between musicians with respect to ending notes, particularly as these characteristics increase when notes are played faster, the system simulates these characteristics. Playing speed is measured and used to alter the relative pitch, relative note timing and note envelopes so that samples are not played in perfect unison, which is to say they are played less precisely and thus more realistically.

Description

MUSICAL PERFORMANCE DESK SPREAD SIMULATOR
PRIOR PATENT APPLICATION REFERENCE
This application claims priority from provisional patent Application 60807169, e-filed by the author of the present patent on July 12, 2006, per EFS ID 1110877, Confirmation Number 6630, the entire disclosure of which is incorporated herein by reference.
TECHNICAL FIELD
This system and methods relate to the field of increasing the realism of the resultant sound when multiple, discretely sampled acoustic instruments (or small clusters of instruments or synthesized instruments) are played in a manner intended to be an ensemble performance. The methods are intended to emulate the natural deviations in player performance that occur, and with scaling to varying degrees of player skill and precision.
BACKGROUND
The system and methods for manipulation of sampled or synthesized sounds described herein relate to the playback of orchestral sounds, choirs, or any type of music in a manner that simulates the concurrent live performance of said sounds in an ensemble. The proposed system and methods will have little if any effect on the playing of a single note or chord; rather, the major musical benefit is to emulate the "blur" of sound that occurs as an ensemble plays notes in motion.
The system and methods described herein apply to multiple, discretely sampled or synthesized musical instruments or voices, whereby each individual instrument (typically) or perhaps each pair of instruments is represented by a separate audio sample (or a sample set comprised of multiple samples that play as a unit, such as a sampled attack, a sampled sustain and a sampled release for a given instrument which set functions to play a given note). Most of the sampled instruments commercially available today are either solo instruments or whole sections of instruments. It is possible to have individual sampled or synthesized instruments that alone sound realistic and yet when multiple such "solo" instruments are played together, the result seldom if ever sounds like these "players" are actually performing live in the same space at the same time. There are several reasons why this is so, and the methods addressed herein present a technical remedy to one aspect of this lack of ensemble realism.
The first factor that prevents sampled or synthesized instruments from sounding like "live ensemble" instruments is the issue of note allocation when playing chords (as contrasted to single notes). We'll use orchestral stringed instruments as an example in this document because they are the most difficult to simulate in ensemble, but in fact most of what's presented here applies equally to other types of instruments as well as human voices and even to non-musical sampled sounds. In a "real world" situation with live musicians, a section of string players divides itself into sub-groups in order to parse out the discrete notes within a chord; one or more musicians will play a particular note, another musician or group will play another note, and so forth so that all notes are played. When a different number of notes is played, the musicians re-allocate themselves once again to ensure all notes are played with approximately equal energy; this process is called "divisi" or subtractive polyphony. The divisi process allows the overall sound power and sonic balance to be maintained. With sampled strings, however, the sound power multiplies with each successively stacked note because string samples are typically samples of multiple instruments; play 3 notes using a sample of 16 violins and you'll in fact hear the sound of 48 violins - three times as many instruments. This unnatural process, called additive polyphony, creates an "organ effect" which seriously degrades the realistic harmonic structure and destroys the orchestral balance of string simulations. Conventionally sampled strings often tend to sound like an organ because organs use additive rather than natural subtractive polyphony. The author of this patent application has already been granted a separate U.S. patent (Patent Number 7,109,406, granted on September 19, 2006) addressing the matter of note allocation among individual instrument samples rather than using large sections of sampled instruments; that patent is cited here by reference because although it is not in any way incorporated in this specification, its underlying assumption of separate samples for each instrument (or each pair of instruments, otherwise known as a "desk") is useful to implement the methods of the current application's technology and to gain the greatest degree of realism.
A second factor that detracts from the realism of various samples that are supposed to be playing in ensemble but in fact were recorded discretely is the lack of the natural bleed of sound from each instrument into the presumed multiple spot microphones and room microphones that would all be picking up all the sounds in a live recording or stage performance situation. Yet another pair of patent applications by the authors of this patent application addresses this issue by recreating the effect of multiple instrument locations in a virtual recording environment from which sound bleeds into all the other virtual microphones; see the Microphone Bleed Simulator Utility Patent Application submitted August 22, 2005, U.S. PTO reference number 11/208798 confirmed by postcard with PTO STAMP: 112949, and also a Microphone Bleed Simulator patent being filed this week which claims priority from provisional patent application Number 60807166 submitted July 12, 2006, as confirmed by electronic acknowledgement receipt EFS ID 1110859, Confirmation
Number 6620. While the microphone bleed simulation process is not integral to the Desk Spread process described herein, it does complement it by providing a greater degree of overall realism and blending.
Even with the correct allocation of notes among a fixed number of instruments (divisi), and with simulated bleed across the multiple "virtual microphones" in a
presumed live performance environment, the realism of discretely sampled instruments when played to sound like an ensemble still tends to fall short of a realistic "live" performance. For example, although string samples or synthesized strings, in general, can do a decent job of representing the static sound of real string instruments, such as the violin, they rarely if ever manage to correctly imitate live
players in motion (that is, playing a series of notes or chords in succession). One simply cannot get the realism of violins in motion out of fixed samples of multiple violins that were playing the same note at the same time when initially recorded to make the samples, which is unfortunately the method by which prior art sample libraries are recorded and played.
String instruments, like all acoustic instruments, are mechanical devices that are subject to tonal differences and timing anomalies just like any other machine that lacks absolute precision; and since musical instruments are played by humans, they definitely lack precision. When playing in a group, such as the violin section, each player has a unique sound and timing accuracy when switching between notes and when playing various effects; effects is a musical term that covers such sounds as, for example, vibrato, tremolo, col legno, sul ponticello and so forth, all of which are produced by variations in the live player's technique. In non-musical terms, we can say that these typical effects involve such factors as the position of the bow relative to the neck and bridge of the instrument, how much "flutter" in the bow hand is
applied (the speed and depth of back-and-forth bowing), and how much "flutter" in the position and pressure of fingers on the strings (which in turn affects the pitch and intonation of the notes played). In addition, each player sits in a different location in the room which creates multiple time delays due to overall mic placement; while physical position-related timing issues are addressed in the Microphone Bleed Simulator patents referenced above, simple variations that occur based on the moment at which a given player actually starts to play the note are not; this more random note-timing variation requires additional methods if it is to be realistically simulated when using sampled sounds.
Note timing issues notwithstanding, each player does not necessarily play the
notes precisely as written in the score. Once again, players are humans, and the musical instruments they use can be played out of tune. From a purely physical standpoint, the violin, for example, is not a fretted instrument which means that tuning accuracy from note to note is also relatively flexible and random as it depends upon the player's hand position and the string tension he or she applies to the string.
Thus, a realistic sampled violin section (or almost any ensemble) would not have every single player playing perfectly pitched notes; they would vary in a somewhat random fashion, influenced by each player's personal inability to accurately "keep up" with more rapidly changing notes or to accurately play notes which span a wider physical fingering range. It is this very imprecision and semi-random quality that
imparts realism, warmth and beauty to a string ensemble. In short, the reason why a conventional (prior art) sample of twenty violins playing from note to note can't sound real is because it's too perfect; even if "imperfection" is part of the sample, it isn't going to be random and so the predictability itself is unnatural.
Another factor affecting the realism of sampled ensemble playback involves the precision with which one note is abandoned to play the next note; the imprecision of this change for each player is part of what makes a performance sound realistic. Each player within a string section, for example, may hold a given note for a slightly different time.
As a result of the individual players' differences in note pitch, initiation of
playing notes, and duration of holding notes, the overall section sound is constantly in flux and slightly out of synch. We'll call this naturally-occurring effect Desk Spread, for lack of a widely accepted industry term to describe it, because it spreads out the performance characteristics of multiple desks of musicians, where in a typical performance environment a desk comprises a music stand and one or
two musicians playing the music on that stand. A major reason why conventional sampled sections of instruments don't sound realistic is that they always have the same timing, pitch and note durations for everyone playing because they were all sampled at once. It is true that some sample libraries record a few versions of a given sound, and these libraries can be set to randomly invoke one or the other
version of the sampled sound from note to note, but such an approach does not produce the realistic desk spread that occurs in a live ensemble because the deviations in sound do not increase with the speed of playing and they do not become greater, which is to say more inaccurate, with the lesser skill of the more distantly seated musicians as actually occurs in a real orchestra, for example. In
other words, such prior art ensemble samples are too perfect and predictable to sound entirely realistic. One prior art sample library and software product uses recorded instruments playing musical runs of notes, or phrases, from which databases of the phrase characteristics are modeled, and from which models a synthesized process
produces new sounds that purport to retain some of the characteristics of the original live players and their imperfections. This prior-art method does not permit the user to play the sounds in real time with the benefit of such processing, but rather requires a post-playing process that looks ahead to notes which will be played in order to perform anticipatory processes that create the required effects. Because
it is based on modeling of a limited set of predetermined already-played sounds, and because it is not a real-time process, this prior art does not approach the methods set forth herein, which methods allow discrete samples of actual instruments to be played in real time with the natural deviations we describe below. For more information you may refer to the website at:
http://www.synful.com/synfulorchestra.htm
The methods presented herein intentionally make the playback of multiple, separately sampled or synthesized instruments become imperfect with respect to timing, pitch and note duration. This means that with this desk spread function active, at no time can all the members of the violin section (or any grouping of
sampled or synthesized sounds intended to be heard as an ensemble) be heard playing the exact same note at the exact same time as occurs with prior art sampled sections of instruments. While it might be possible to go out and sample a host of different instruments individually, then write and play music by manually applying the effects described here, such an effort would be a painstaking and horribly inefficient
process; even a skilled composer would be seriously challenged to do it. It would be a great waste of precious time attempting to manually create what a large group of individuals do naturally and inadvertently as a byproduct of the physical process of live ensemble playing. The methods presented herein are particularly powerful because they allow for automatic control of the timing (note initiation), pitch
deviation, and note duration among collections of individual sampled or synthesized instruments. The effects created by these methods seem to occur naturally when playing sampled or synthesized sounds in that they do not require any extra effort or preparation by the composer or performer.
The specific system implementation and methods discussed herein are
exemplary only; many deviations are possible and are anticipated by the author of the present specification, and actual implementations ideally should provide for user-adjustable or presettable parameters that change the way the methods are scaled and applied. The overall scheme is this: increase the deviation in pitch from the intended pitch of the notes to be played by a scaling factor that increases (a) with
greater playing speed between notes and (b) with greater player distance from the front of the section. This is in accord with the principles that (a) players get sloppier when they have to play faster, and (b) the least skilled players sit further back in the section. Similarly, the desk spread method adds incremental time delay to the onset of notes as notes are played faster and faster, and causes more delay deviation to
occur as the player positions are further from the front of the section for the same reasons we cite to justify pitch deviation. Also the decay-sustain-release time of the note volume envelopes increases with faster playing speed to lengthen, with variability, the note durations. In summary, the functions described herein prevent sampled instruments from doing perfectly what live musicians cannot do perfectly
due to their various skills and due to natural variations that occur when playing actual instruments in a live, real-time environment.
BRIEF DESCRIPTIONS OF THE DRAWINGS
Fig. 1. Desk spread simulator block diagram depicting the functional elements that comprise a representative system.
Fig. 2. Pitch spread simplified diagram - depicts the way pitch changes according to player position, as indicated by desk number. This is an example for 8 players (8 desks).
Fig. 3. Pitch spread method diagram - depicts a more detailed version of the pitch spread of Fig. 2 (for 16 player positions) with a logical flow diagram to depict an example of what functions may occur in the pitch spread system and what functions may occur in the sampler.
Fig. 4. Time spread simplified diagram - depicts the way initial playing of the note is progressively delayed according to player position, as indicated by desk number. This is an example for 8 players (8 desks).
Fig. 5. Time spread method diagram - depicts a more detailed version of the time spread of Fig. 4 (for 16 player positions) with a logical flow diagram to depict an example of what functions may occur in the time spread system and what functions may occur in the sampler.
Fig. 6. Hysteresis system block diagram - depicts a method whereby the speed of successively played notes creates a scaling value (which is hysteresis) and how that value is applied to the remaining elements of the desk spread system (MIDI Pitch Shift, MIDI Delay, and Envelope-DSR) to command the various deviations that comprise desk spread.
DETAILED DESCRIPTION
In order to play samples one typically uses a so-called sample player, which is generally implemented as a software module running on a general purpose computer, either as a stand-alone program or as a plug-in to a host program environment. Some sample players are implemented on special-purpose dedicated
hardware which, internally, still uses sample software and computer components, but which are constituted as an embedded system. Most such samplers are caused to play notes either by direct control from an external device (typically a MIDI keyboard, where MIDI is an acronym for Musical Instrument Digital Interface), from a different type of note generator (such as a MIDI guitar pickup) or from a device
which stores and subsequently "plays back" note instructions, which device is known as a sequencer. MIDI refers to the Musical Instrument Digital Interface, a standard promulgated almost 3 decades ago and widely used in the musical industry today, albeit with many modifications and updates; we refer to MIDI control in this specification since it is prevalent, but any means to control sample playing or
synthesizer playing can be substituted without appreciably altering the system or methods set forth herein. Synthesizers use wave tables to build sounds or they use various oscillators or frequency modulation methods to create sounds, but generally synthesizers also function under software control, referring to lookup tables or other specification files, and they operate under control of a host program of some sort;
generally, synthesizers also use MIDI protocol, MIDI note generators and sequencers for control of note playing.
Sequencers are typically implemented as software programs running on a general purpose computer, either the same computer or a different computer from the one on which the sampler runs (one sequencer or keyboard can address
multiple samplers or synthesizers). Some samplers, synthesizers and sequencers are embodied as discrete hardware devices, but their function is essentially the same as the more common software implementations. In order to implement the methods described herein there is no need to reinvent the basic sequencer-keyboard-sampler (or synthesizer) metaphor, but there is a need to insert an
additional controller ahead of the sampler, which controller handles communication between any keyboard or sequencer and whatever sampler or samplers are used to play the discretely sampled sounds. Refer to Fig. 1; block 10 refers to the overall system of playing sampled or synthesized sounds and comprises an input device 105 which may be a MIDI keyboard or other note generator or a sequencer, a Desk
Spread Processor 110 which is the actual controlling system proposed in this document, and a sample player (or synthesizer) 120, which may be one or more actual "samplers" or "synthesizers" on one or multiple hardware platforms. Samplers and synthesizers are considered to be (or to play) virtual instruments, in contrast to the actual physical musical instruments whose sounds they may emulate.
A common function to the several methods presented here is their ability to automatically change the playback characteristics of virtual instruments based on the rapidity of successively played notes. For lack of any existing term for what is a new concept, we call this type of scalar control parameter "hysteresis." The hysteresis method evaluates the time difference between successive notes being
played by a given section of like instruments (e.g., by the first violins), calculates the delta of note initiation times, smooths this value, and uses it to control various other aspects of sample playing (or synthesizer playing) for each individually addressable sampled (or synthesized) player within that section. Specifically, keyboard hysteresis automatically increases the effectiveness of other controls that set the Pitch Spread,
Time Spread and Slur Amount as notes are played in ever more rapid succession. Optionally, hysteresis may also affect Glide Speed (glide refers to a smooth change in pitch from one note to the next, as contrasted with a single jump in pitch that occurs without glide). Musically the effects of these various control methods accord with the manner in which musicians tend to get "sloppier" as they have to play faster and faster. While hysteresis is a shared control factor for multiple methods of increasing realism, we'll first describe the various methods themselves and then show how hysteresis works to influence them.
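By way of illustration only, the hysteresis calculation described above (take the delta between successive note initiation times, smooth it, and expose it as a scalar) might be sketched as follows. The class name, the 0-to-1 output range, the 500 ms and 50 ms thresholds and the simple exponential smoothing are all assumptions made for this sketch; the specification leaves the actual scaling to empirical determination. Notes of a chord arriving together are not treated specially here.

```python
class KeyboardHysteresis:
    """Hypothetical tracker: faster successive notes push the value toward 1.0."""

    def __init__(self, slow_ms=500.0, fast_ms=50.0, smoothing=0.5):
        self.slow_ms = slow_ms      # assumed: gaps this long (or longer) map to 0.0
        self.fast_ms = fast_ms      # assumed: gaps this short (or shorter) map to 1.0
        self.smoothing = smoothing  # assumed: weight given to the newest measurement
        self.last_onset_ms = None
        self.value = 0.0

    def note_on(self, onset_ms):
        """Register a note initiation time and return the smoothed hysteresis value."""
        if self.last_onset_ms is not None:
            delta = onset_ms - self.last_onset_ms
            # Convert the delta into a raw 0..1 "rapidity" figure.
            raw = (self.slow_ms - delta) / (self.slow_ms - self.fast_ms)
            raw = min(1.0, max(0.0, raw))
            # Smooth so that a single fast interval does not jump straight to maximum.
            self.value += self.smoothing * (raw - self.value)
        self.last_onset_ms = onset_ms
        return self.value
```

In this sketch a run of notes 50 ms apart moves the value halfway toward 1.0 with each note, while widely spaced notes let it relax back toward zero.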
The specific values used for keyboard hysteresis scaling must be empirically determined with a given set of sampled or synthesized sounds but the method itself
will work for almost any sounds so long as they are accessed as individual sounds or as pairs of sounds (or very small groups of sounds). For simplicity, we'll use the term "player" to denote a single sampled or synthesized instrument or a desk of two or a few instruments since there is no significant functional difference.
Each player must be separately addressable in order to be correctly
controlled; in current technology this typically means that each player will be addressed on a discrete MIDI port and channel (in the near future we anticipate that other methods of designating instruments will be more widely used than the older MIDI protocols, but this doesn't change the basic functions described here, only the addressing scheme). We'll refer to each unique MIDI port/channel as a unique
"path." For convenience and in keeping with orchestral practice, we will assume that the "first chair" or lead player in a given section of instruments has the lowest path number and that successively higher numbered paths correspond with players who sit further and further away from the audience in a live ensemble; this convention makes it easier to describe the functions here and of course any numbering scheme
may be used so long as the result is the same.
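One possible way to represent the path convention just described is sketched below, assuming the usual 16 channels per MIDI port. The names Path and assign_paths, and the choice to fill one port before moving to the next, are hypothetical and used only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Path:
    """A unique MIDI port/channel pair addressing one player (or desk)."""
    port: int     # zero-based port index (assumed numbering)
    channel: int  # MIDI channel 1..16


def assign_paths(num_players, channels_per_port=16):
    """Map player numbers (1 = first chair) to paths, lowest path number first."""
    paths = {}
    for player in range(1, num_players + 1):
        index = player - 1
        paths[player] = Path(port=index // channels_per_port,
                             channel=index % channels_per_port + 1)
    return paths
```

With this numbering, player 1 lands on port 0, channel 1, and a seventeenth player would spill over to channel 1 of the next port.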
PITCH SPREAD: This function commands differential pitch shift for musical notes sent from the system (Fig. 1 110, 155) to the various sampler/synthesizer paths 120; successively higher numbered instrument desks play notes that diverge increasingly from the default pitch of the note that is supposed to be sounding, and which is being sounded by player number 1. The pitch spread should be adjustable and for this purpose we provide a control such as the examples shown in Fig. 2 205/215/225. This fader-like implementation of a user interface adjusts an outgoing note pitch shifter in Desk Spread Processor 155 of Fig. 1 that simulates natural deviations in pitch like those which occur in a section of instruments due largely to differences between individual players who for example may finger notes differently (on stringed instruments) or blow a bit differently (on winds and brass instruments). Note that in Fig. 2 through Fig. 5 we show a graphic curve to which specific instrument "paths" are mapped to derive MIDI CC values; in reality calculations (e.g., algorithms) and/or arrays (e.g., look up tables) may be used to derive these MIDI CC values. The graphs are shown for ease of visualization of the method. We also refer to either 8 or 16 desks in these illustrations. The lowest numbered desk (#1) is always the closest to the conductor/audience, and would typically be the "first seat" in an orchestral section - the most skilled player. The highest numbered desk (#8 or #16 in these examples) is always furthest away and would typically be the least skilled player.
The term "path" used above and elsewhere in this document refers to a specific MIDI port and channel by which the system addresses (commands to play) a given sample; a path may also be considered to represent a "desk" (which is a music stand that may have one, two or even a few musicians playing at once and captured on a single sample). The path, therefore, represents the smallest discretely addressable player of a note, typically one or two musicians on a sampler channel. As indicated in Fig. 2 at minimum (Unison) setting of the control 205 there is no pitch shift and all desks (paths) play in unison with respect to pitch 210 (in actuality, they may be playing different notes, but the pitch of each note is precisely what it should be at the Unison setting; if all instruments were actually supposed to be playing the same note, then exactly the same pitch would be commanded for every one of their paths to the sampler).
At a maximum setting of 1/2 Tone 225 the last desk (the highest numbered instrument path, #8 in Fig. 2, 230) will be a half tone above or below the default pitch for the note nominally being played by that desk. As can be seen, each desk in succession does not simply go up or down in pitch; the first desk is unchanged, the second desk rises in pitch, the one after that goes lower, the one after that higher, and so forth. For this reason, the highest numbered desk will be higher in pitch if it's
even numbered, or lower in pitch if odd numbered; there is really no significance to whether one commands desk #2 to go higher or lower in pitch so this choice within the method shown is arbitrary.
As can be seen in Fig. 2 and 3, the amount of pitch change increases as the desk number increases. However, the last desk will always exhibit the maximum
amount of pitch change as set by the Pitch Spread control, which means that if fewer paths are selected, the relative pitch change between paths is greater for a given setting of the Pitch Spread control; this can be seen by comparing Desk 8 in Fig. 2 230 to Desk 16 in Fig. 3 360; in both cases the Pitch Spread control (225, 355) is set at the maximum of 1/2-T and the highest numbered of the selected desks has a
pitch deviation of a half-tone. The scaling of the pitch shift may sound better with one or another curve (linear, logarithmic, exponential, custom) for various applications as may be determined through empirical testing; the end user may be given access to select alternative curves. The inverse curve pairs shown in Fig. 3 (345 - 350 and 360 - 365) are merely exemplary curves.
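The alternating, progressively widening pitch spread described above can be visualized with a short calculation. The sketch below expresses the deviation in cents (a half tone being 100 cents) and assumes a simple power curve with an exponent of 2; the function name and the curve are illustrative assumptions only, since the specification states that the actual curve is to be chosen empirically.

```python
def pitch_offsets_cents(num_desks, max_cents, exponent=2.0):
    """Return one pitch offset per desk, in cents, for desks 1..num_desks.

    Desk 1 is never shifted; even desks go sharp, odd desks go flat, and the
    last desk always reaches the full magnitude set by the control (max_cents).
    The power curve is an assumed stand-in for the empirically chosen curve.
    """
    if num_desks < 1:
        raise ValueError("need at least one desk")
    offsets = []
    for desk in range(1, num_desks + 1):
        if desk == 1:
            offsets.append(0.0)  # the lead player is the reference pitch
            continue
        frac = (desk - 1) / (num_desks - 1)   # 0 at desk 1, 1 at the last desk
        magnitude = max_cents * frac ** exponent
        offsets.append(magnitude if desk % 2 == 0 else -magnitude)
    return offsets
```

Calling the function for 8 desks and for 16 desks at the same control setting shows the behavior noted above: the last desk reaches the same maximum in both cases, while desk 2 deviates more when only 8 desks are selected.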
MIDI CC (control change) values are used to convey the pitch spread amount (and other parameters in this system) to the sampler/synthesizer because MIDI CC is a commonly available means to control today's electronic instruments; in the future other parameters may be used to accomplish these functions. According to the MIDI specification, each unique MIDI port/channel may have up to 128 MIDI
CC#s associated with it. Each MIDI CC#, in turn, can have a range of value spanning 128 increments from value 0 through value 127. The "meaning" or interpretation of what a given MIDI CC# and CC value do is based on how the specific sampler and sample set (or synthesizer) are programmed; there are a few standard usages but the majority of CC information is not fixed in the industry. Thus
while some MIDI CC#s are reserved for common control functions, many are available to be used for whatever purpose is desired; any non-conflicting "available" CC#s can be allocated for the CC value communication to implement the present system.
In Fig. 3, the Pitch Spread function is depicted in greater detail than in Fig. 2.
One can see a pair of inverse curves (345-350 or 360-365) mapped across a grid of 16 instrument paths (16 desks). The curves begin at a zero point coinciding with instrument path #1. When the Pitch Spread Control is set at unison (330) it outputs a middle value on the CC scale (i.e., half way from zero to 127 is a CC value of 63). Regardless of the Pitch Spread setting, the first player (path #1) never
exhibits any pitch spread as this represents the lead player who is our "perfect" reference point and always presumed to be playing at precise pitch and time; this is why both the up going and down going curves (345-350 or 360-365) are at the zero reference (value 63) for path #1. Pitch spread will occur with increasing deviation from the nominal pitch as the path numbers increase (from 2 through 16 in Fig. 3),
as can be seen by the graphed dots on the curves when pitch spread is moved up to 1/4-tone as shown in 340 and then 1/2-tone as shown in 355. The actual pitch bend CC values used for this function are obtained for even numbered paths by referring to the rising curve (345 or 360) whereas CC values are obtained for odd numbered paths by referring to the descending curve (350 or 365).
Sliding the Pitch Spread control down to the bottom (Unison) position per 330 essentially flattens the curves per 335 so that all paths receive the same zero reference CC value (i.e., CC value 63), meaning there is no pitch spread. Sliding the Pitch Spread control all the way up to the top (1/2-T) position per 355 causes the extreme ends of the inverse curves to reach values of 127 (upward going) and
nearly zero (downward going), and intermediate control settings merely scale the maximum curve deviations accordingly. Any active instrument paths to which Pitch Spread is applied receive the correspondingly mapped CC values, as indicated by the gray dots in Fig. 3.
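As a rough sketch of how the curve mapping above could be computed rather than looked up, the following function returns a CC value centered on 63, rising toward 127 for even numbered paths and falling toward zero for odd numbered paths. The function name, the normalized 0-to-1 control setting and the power curve are assumptions for illustration and are not taken from the figures.

```python
CENTER = 63  # zero-reference CC value stated in the text


def pitch_spread_cc(path, num_paths, control, exponent=2.0):
    """Return the pitch spread CC value (0..127) for one instrument path.

    control is the Pitch Spread setting normalized to 0.0 (Unison) .. 1.0 (1/2-T);
    this normalization and the power curve are assumptions of the sketch.
    """
    if path == 1 or num_paths < 2:
        return CENTER  # path #1 never exhibits pitch spread
    frac = (path - 1) / (num_paths - 1)
    depth = control * frac ** exponent
    if path % 2 == 0:
        return CENTER + round((127 - CENTER) * depth)  # rising curve
    return CENTER - round(CENTER * depth)              # descending curve
```

At the Unison setting every path receives 63; at the maximum setting path #16 of 16 receives 127 and path #15 receives a value close to zero, which is consistent with the description of the curve ends above.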
The method for accomplishing this pitch shift is shown in Fig. 3 by the block
diagram at the left side of the illustration. Pitch spread MIDI CC values are generated for all notes on each of the pertinent instrument paths as commanded by the pitch spread control 305. The values are derived in 310 either by consulting a set of lookup tables based on the Pitch Spread control setting, the number of active instrument paths, and the desired curve set, or by an algorithm which performs
essentially the same function as a lookup table but does so by means of calculation.
The pitch spread CC values are not sent out to the sampler 315 until they receive one additional bit of processing; they are in fact scaled according to the keyboard hysteresis value in step 313. We discuss hysteresis in greater detail subsequently. Hysteresis can be applied so that maximum hysteresis values cause
the pitch spread to increase beyond the value set with the Pitch Spread control, at middling hysteresis values the Pitch Spread control's output can be unaltered, and at low hysteresis values the pitch spread can be reduced to less than the set Pitch Spread control values; or Hysteresis can be applied so that maximum hysteresis values cause the Pitch Spread control to output its current set value, and any less
hysteresis scales down the Pitch Spread control output. The choice of methods to apply Hysteresis may vary with user preference or overall system implementation considerations. After hysteresis scaling the pitch bend CC values 315 are sent to the sampler(s) paths along with the note-on and other information so the sampler "knows" what to play.
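The two alternative ways of applying hysteresis described above might be sketched as follows. Hysteresis is assumed here to be a value from 0.0 to 1.0, and the specific factors (0.5x to 1.5x in the first mode) are arbitrary illustrative choices; the mode names "bipolar" and "ceiling" are hypothetical labels, not terms from the specification.

```python
def scale_cc_by_hysteresis(cc, hysteresis, mode="ceiling", center=63):
    """Scale a pitch spread CC value's deviation from center by hysteresis.

    mode "bipolar": mid hysteresis (0.5) leaves the value unaltered, higher
                    values exaggerate the spread, lower values reduce it.
    mode "ceiling": maximum hysteresis (1.0) yields the set value, and any
                    lesser hysteresis scales the spread down.
    """
    if mode == "ceiling":
        factor = hysteresis
    elif mode == "bipolar":
        factor = 0.5 + hysteresis  # assumed range of 0.5x .. 1.5x
    else:
        raise ValueError("unknown mode: " + mode)
    scaled = center + round((cc - center) * factor)
    return max(0, min(127, scaled))  # keep within the legal CC range
```

Because only the deviation from the center value is scaled, a path sitting at the zero reference (63) stays there in either mode.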
This particular implementation relies upon a sampler per 30 in Fig. 3 which can be scripted (automated by means of coding accessible to those who prepare the sample library) to produce proportional pitch shift for a given sample based on MIDI CC values; various such samplers are commercially available as for example the software sampler Kontakt 2 by Native Instruments of Germany. However, it is possible, alternatively, to issue MIDI pitch shift commands directly to the sampler or synthesizer input from the Pitch Spread controlling module using a common MIDI pitch shift CC#. One benefit of relying upon sampler scripting is that it balances the workload of the various system modules 110, 120 and provides (based on current technology) a more flexible data communication scheme. The optimum
implementation may well change as sampler/synthesizer and computer technology evolve, but the basic concept remains the same; command automatically-scaled pitch shift per sample. At the sampler in this exemplary implementation, a suitably written script interprets incoming CC values for each note 370. The script may be bypassed for path #1 as shown in 375 because path #1 never has any pitch bend
based on desk spread; this step is optional but may be useful to reduce unnecessary pitch bend calculations at the sampler. For all other paths, the script checks to see if the CC value is at, above or below a value of 63 (step 380). If it is below 63 (step 390) and moving toward zero then the script commands negative (downward bending) pitch excursion, scaled such that a value of zero is equal to 1/2
tone downward shift. Conversely, the script interprets CC values above 63 and moving toward 127 (step 385) to command positive (upward bending) pitch excursion scaled such that a value of 127 is equal to 1/2 tone upward shift.
The appropriately pitch-bent note for each path is thereby played by the sampler 395.
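The sampler-side interpretation described above (steps 370 through 395) can be illustrated with a small function. An actual implementation would be written in the sampler's own scripting language; Python is used here only as a stand-in, and the function name and the use of cents as the output unit are assumptions of this sketch.

```python
def cc_to_cents(cc, path, center=63, max_cents=100.0):
    """Interpret a pitch spread CC value as a pitch excursion in cents.

    CC 0 maps to a half tone (100 cents) down, CC 127 to a half tone up,
    and the center value to no shift, as described in the text.
    """
    if path == 1:
        return 0.0  # optional bypass: path #1 never has desk-spread pitch bend
    if cc < center:
        # Below center: negative (downward) excursion, full scale at CC 0.
        return -(center - cc) / center * max_cents
    if cc > center:
        # Above center: positive (upward) excursion, full scale at CC 127.
        return (cc - center) / (127 - center) * max_cents
    return 0.0
```

Note that the downward side spans 63 steps and the upward side 64, so each side is scaled separately to reach exactly a half tone at its extreme.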
TIME SPREAD: This function introduces progressively greater delays in note onset when multiple desks are playing in response to "simultaneous" note-on events; the meaning of "simultaneous" may be defined to include note-on events that fall within a specified time window such as 2 milliseconds, rather than truly
simultaneous, and since MIDI is a serial protocol and performers are not robots, it's beneficial to define a reasonably short time window to be treated as "simultaneously played notes." Instead of all desks immediately playing any designated note(s), the Time Spread method creates a progressive note-on lag with higher numbered desks (higher path numbers). This behavior simulates symphonic reality with respect to
how musicians are seated; as previously stated, the more skilled musicians (those able to play faster with good accuracy) are seated in the front (lower numbered desks).
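The notion of a "simultaneous" time window can be made concrete with a short grouping routine. The sketch below assumes note-on times are available in milliseconds and measures the window from the first note of each group; both the function name and that anchoring choice are illustrative assumptions.

```python
def group_simultaneous(onsets_ms, window_ms=2.0):
    """Group note-on times that fall within window_ms of a group's first note."""
    groups = []
    for onset in sorted(onsets_ms):
        if groups and onset - groups[-1][0] <= window_ms:
            groups[-1].append(onset)   # still inside the current window
        else:
            groups.append([onset])     # start a new "simultaneous" group
    return groups
```

For example, note-ons at 0, 1.5 and 2 ms would be treated as one chord, while a note arriving at 10 ms would start a new group.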
Referring to Fig. 4, with the Time Spread control all the way down (zero milliseconds) 405 there is no time lag for notes sounded by any of the paths to the
sampler 410. The more you increase the Time Spread setting (415, then 425), the greater the time lag before the onset of the note(s) played by each successive desk (420, 430). The maximum time lag (or time spread) occurs at the last desk and equals the set time value of the Time Spread control. The time lag on desks between the first and last selected desk follows a curve such that there is little lag for
the lower numbered desks and increased lag for higher numbered desks. As described in the prior discussion of the Pitch Spread Subsystem, the actual curve or curves used must be empirically determined and it may be desirable for the end user of the system to be able to select among different curves for particular applications. Time Spread has no direct relationship to the pitch or tempo of the notes being
played but it should be related to the rapidity with which notes are played in succession, and is therefore subject to scaling by the Hysteresis subsystem.
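A minimal sketch of the per-desk delay calculation follows, assuming a power curve (exponent 2) so that lower numbered desks lag only slightly, and assuming hysteresis is a 0-to-1 multiplier applied to the set value. The function name, curve and parameter names are hypothetical.

```python
def time_spread_delays_ms(num_desks, max_delay_ms, exponent=2.0, hysteresis=1.0):
    """Return the note-on delay in milliseconds for desks 1..num_desks.

    Desk 1 has no delay; the last desk receives the full Time Spread setting
    (scaled by the assumed 0..1 hysteresis multiplier).
    """
    if num_desks < 1:
        raise ValueError("need at least one desk")
    delays = []
    for desk in range(1, num_desks + 1):
        frac = 0.0 if num_desks == 1 else (desk - 1) / (num_desks - 1)
        delays.append(max_delay_ms * hysteresis * frac ** exponent)
    return delays
```

With 8 desks and a 50 ms setting, desk 2 is delayed by about 1 ms while desk 8 is delayed by the full 50 ms.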
Time Spread itself may be implemented using a straightforward MIDI time delay function. The delay technique presented here is unique in that it sets an automatically generated and appropriately scaled timing value for each of the
pertinent instrument paths (i.e., those paths controlling instrument samples for a given ensemble or section); the delay can be visualized to be established by overlaying the path number on the chosen curve whose maximum value equals the set value of the Time Spread control at the last path (Fig. 5 path #16). In reality, this time spread mapping will be accomplished either by lookup tables or by algorithmic
calculation. The note-on commands plus a MIDI CC value (or other control parameter) that commands the sampler to wait for the designated time interval before sounding the note-on data may be sent from the Desk Spread Processor (Fig. 1, 110) immediately to the sampler (Fig. 1, 120); in the sampler itself the delay is actually implemented (except as otherwise noted below).
Regardless of the Time Spread setting, the first player (path #1) never exhibits any time spread as this represents the lead player who is our "perfect" reference point and is always presumed to be playing precisely; this is why all three example curves shown for various Time Spread Control settings (530, 540, 550) are at the zero reference (CC value 0) for path #1 (532, 542, 552). Time spread will increase as the path numbers increase (from 2 through 16 in Fig. 5), as can be seen by the graphed dots (540, 550) on the curves when the Time Spread control is moved up to 50 ms 535 and then 100 ms values 545.
Sliding the Time Spread control down to the bottom (0 ms) position per 525 essentially flattens the curves per 530 so that all paths receive the same zero reference CC value (i.e., CC value 0), meaning there is no time delay. Sliding the Time Spread control all the way up to the top (100 ms) position per 545 causes the extreme end of the curve to reach CC value 127, and intermediate control settings merely scale the highest numbered path's maximum curve deviation and all intermediate paths accordingly.
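The relationship between the control setting and the per-path CC values can be illustrated with a short sketch. The function name `time_spread_cc_values` and the power-law curve (`exponent`) are assumptions made for illustration; only the 0 to 127 CC range and the 100 ms control maximum come from the text.

```python
MAX_CC = 127                 # largest legal MIDI CC value
MAX_TIME_SPREAD_MS = 100.0   # top of the Time Spread control in the Fig. 5 example


def time_spread_cc_values(control_ms, n_paths, exponent=2.0):
    """Return one time-delay CC value per path for a given control setting."""
    # Keep the control inside its 0..100 ms travel.
    control_ms = min(max(control_ms, 0.0), MAX_TIME_SPREAD_MS)
    # The control setting scales the CC value reached by the last path.
    peak_cc = MAX_CC * control_ms / MAX_TIME_SPREAD_MS
    values = []
    for path in range(1, n_paths + 1):
        # Path 1 is the reference player and always maps to CC value 0.
        position = 0.0 if n_paths == 1 else (path - 1) / (n_paths - 1)
        values.append(round(peak_cc * position ** exponent))
    return values
```

At the 0 ms setting every path receives CC value 0; at the 100 ms setting the last path receives 127, and intermediate settings scale the whole curve down.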
Any active instrument paths to which Time Spread is applied receive the correspondingly mapped CC values, as indicated by the gray dots in Fig. 5, except these values are also affected by Keyboard Hysteresis as explained below.
The method for accomplishing this incremental time delay is shown in Fig. 5 by the block diagram at the left side of the illustration. Time spread CC values are generated for all notes on each of the pertinent instrument paths as commanded by the Time Spread control 505. The values are derived in 510 either (a) by consulting a set of lookup tables based on the Time Spread control setting, the number of active instrument paths, and the desired curve, or (b) by an algorithm which performs essentially the same function by means of calculation.
The time delay CC values are not sent out to the sampler 520 until they receive one additional bit of processing; they are in fact scaled according to the keyboard hysteresis value in step 515. We discuss this in greater detail subsequently, but basically with maximum hysteresis the time spread value of the highest numbered path equals that which is set with the Time Spread control, and at all lower hysteresis values, the time delay CC values are proportionately reduced to less than the set Time Spread control value. At this point, in 520, the time delay CC values are sent to the sampler(s) along with the note-on and other information so the sampler "knows" what to play.
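The hysteresis scaling step can be sketched as a simple proportional reduction. The function name `scale_by_hysteresis` and the rounding to the nearest integer are assumptions; the text specifies only that the CC values are reduced in proportion to the hysteresis value.

```python
def scale_by_hysteresis(cc_values, hysteresis):
    """Scale per-path time-delay CC values by a hysteresis coefficient (0.0 to 1.0)."""
    # Keep the coefficient inside its documented 0..1 range.
    h = min(max(hysteresis, 0.0), 1.0)
    # Round to whole CC steps and keep every result inside the legal 0..127 range.
    return [min(127, max(0, round(cc * h))) for cc in cc_values]
```

With a coefficient of 1.0 the values pass through unchanged, and with 0.0 every path collapses to CC value 0 (no delay).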
This particular implementation relies upon a sampler per 30 in Fig. 5 which can be scripted to produce time delay for a given sample based on MIDI CC values; such samplers are commercially available. However, it is equally viable to delay the output of notes issued to the sampler or synthesizer from the Time Spread controlling module; the authors opted at this time for sampler scripting because it balances the workload of the various system modules and provides (based on current technology) a more flexible data communication scheme. The optimum implementation may well change as sampler and computer technology evolve, but the basic concept remains the same. At the sampler, a suitably written script interprets incoming CC values for each note 570. The script may be bypassed for Path #1 as shown in 565 because Path #1 never has any time delay based on desk spread; this step is optional but may be useful to reduce unnecessary time delay calculations at the sampler. For all other paths, the script scales the time delay such that a value of zero is equal to no delay and a value of 127 is equal to 100 milliseconds time delay (or whatever value one wishes to establish as the maximum; such high values yield unnatural sounds that are more of a special effect than a natural one). The appropriately time-delayed note for each path is thereby played by the sampler 580.
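The sampler-side interpretation of the CC value might look like the following sketch. It is written in Python for illustration rather than in any particular sampler's scripting language, and the function name `delay_for_note` is hypothetical.

```python
def delay_for_note(path, cc_value, max_delay_ms=100.0):
    """Convert a time-delay CC value into a delay in milliseconds for one path."""
    # Optional bypass: path 1 is the reference player and is never delayed.
    if path == 1:
        return 0.0
    # Guard against out-of-range input before scaling.
    cc_value = min(127, max(0, cc_value))
    # CC 0 means no delay; CC 127 means the full maximum delay.
    return max_delay_ms * cc_value / 127
```

The `max_delay_ms` parameter stands in for "whatever value one wishes to establish as the maximum"; 100 ms is the figure used in the example.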
DECAY-SUSTAIN-RELEASE ENVELOPE: The concept of applying a volume envelope to a sound dates back to very early synthesizers. Traditionally an envelope generator has four controls: Attack, Decay, Sustain and Release (A-D-S-R). This can be altered to a smaller or larger set of envelope controlling parameters such as the D-S-R envelope proposed herein. Envelopes have been used on samples as well as synthesized sounds, but they are applied somewhat differently. That's because samples include their own envelope - the level changes that were present in the original recorded sound. Raw sample recordings are subsequently "cut" into several segments for suitable playback control, and it is still possible to apply an envelope to the sound. Many commercially available samplers allow for scripting or otherwise commanding an envelope to be applied to the sample playback. In the context of improving the realism of an ensemble and increasing the "desk spread" the author of this document developed a method of applying D-S-R envelope control to the sampled notes, again scaling this with Hysteresis such that the faster the succession of notes played, the longer the envelope; this extends the notes played by higher numbered desks, particularly at higher playing speeds, causing them to "linger." This method is similar to that used for Time Spread except that instead of applying increasing CC values to create greater delays before the onset of playing notes with higher numbered paths, it applies increasing CC values to create longer note releases.
Each sampled note for each instrument in a given sample library will have its own unique sustain (if the note is looped), decay and release characteristics, which means that empirical testing is necessary to "fine tune" the actual envelope for each sampled note to work correctly with the particular sample set; it is not worthwhile to publish general values or an algorithmic rule for this function as ideally it should be scripted per library. We can say, however, that a CC scale of 0 through 127 can be used internally within the system's ENVELOPE subsystem, subject to overall hysteresis scaling and to sensitivity scaling per sample set (or per sampled or synthesized instrument).
HYSTERESIS CONTROL SUBSYSTEM: Hysteresis affects multiple subsystems and itself is scaled according to the time difference between notes being played (the delta of note-on times). Referring to the block diagram of Fig. 6, a hysteresis generator 605 examines the incoming stream of note-on commands and times the successive intervals from note to note to derive a smoothed moving average value for this inter-note time. The shorter the time between notes, the greater the hysteresis value.
Empirical testing of skilled keyboard players, playing very rapid runs and successions of notes, enabled the author to establish a "very fast" hysteresis value of approximately 16 to 17 notes per second and so he standardized on the value of 16 notes per second as being equal to one for the purpose of calculating a hysteresis coefficient. He then devised a simple means to determine whether a very rapid succession of notes was in fact just that, or whether it was merely a non-simultaneously played chord (based on inter-note timing), and based on that, the number of notes played per second can be measured. This is done on a continuous basis, with that value divided by 16. The resulting value will be a hysteresis coefficient, a multiplier which varies from zero (no notes or 1 note played per second) up to 1 (16 or more notes played per second). Thus, the output of the hysteresis generator will be a coefficient stream clamped to 0 minimum and 1 maximum, and will flow through a Hysteresis control 610 that scales the generated hysteresis value from 100 percent (1) down to zero percent (0). The hysteresis coefficient value coming out of control 610 is then distributed to the various subsystems it affects via a further distribution and scaling function 615. The additional scaling function is included because a single hysteresis value may not be optimal for altering the various subsystems which are affected. In general, hysteresis serves to increase the effectiveness of the Pitch Spread, Time Spread and D-S-R Envelope Modification subsystems as notes are played in more rapid succession (or perhaps better stated, it serves to decrease the effectiveness when notes are played less rapidly).
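A hysteresis generator along these lines can be sketched as below. The class name, the 30 ms chord window and the eight-interval averaging window are assumptions chosen for illustration; the text specifies only a smoothed moving average of inter-note time, a chord test based on inter-note timing, division by 16 notes per second, and clamping to the range 0 to 1. The sketch applies the divide-by-16 rule literally.

```python
from collections import deque


class HysteresisGenerator:
    """Derive a 0..1 hysteresis coefficient from note-on times (in seconds)."""

    def __init__(self, full_scale_nps=16.0, chord_window_s=0.03, window=8):
        self.full_scale_nps = full_scale_nps   # notes per second that map to 1.0
        self.chord_window_s = chord_window_s   # assumed "simultaneous" window
        self.intervals = deque(maxlen=window)  # recent inter-note times
        self.last_note_time = None

    def note_on(self, time_s):
        """Register a note-on and return the updated coefficient."""
        if self.last_note_time is not None:
            interval = time_s - self.last_note_time
            # Notes closer together than the chord window count as one chord,
            # so their spacing does not enter the moving average.
            if interval >= self.chord_window_s:
                self.intervals.append(interval)
        self.last_note_time = time_s
        return self.coefficient()

    def coefficient(self):
        """Return the clamped coefficient: notes per second divided by full scale."""
        if not self.intervals:
            return 0.0
        mean_interval = sum(self.intervals) / len(self.intervals)
        if mean_interval <= 0.0:
            return 1.0
        notes_per_second = 1.0 / mean_interval
        return min(1.0, max(0.0, notes_per_second / self.full_scale_nps))
```

Feeding the generator a run at 16 notes per second drives the coefficient to 1.0, a run at 8 notes per second gives 0.5, and a rolled chord whose notes fall inside the chord window leaves it unchanged.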
The pointers on the controls for the Pitch Spread 620, Time Spread 630 and Envelope 640 subsystems do not alone control their respective MIDI Pitch Shift 625, MIDI Delay 635 and Envelope D-S-R 645 functions. Instead, the effective output of each of these controls is altered by the Hysteresis value which is applied to them. Fig. 6 depicts this output value graphically as a vertical bar in the representation of the various controls.
The control slider "knobs" in 610, 620, 630 and 640 represent the setting of the control, but not the actual output of each control; the gray bars to the right of each slider knob represent the actual output of the controls. Thus the Hysteresis control 610 is set at maximum but is not putting out the maximum scaling multiplier of 1.0 for hysteresis in the illustration, presumably because the rate at which notes are incoming is less than the 16 note-per-second speed required to drive the hysteresis to full value. (If the control of 610 were pulled down to 50% instead of the 100% shown, then the gray bar would drop to half its displayed height, indicating the hysteresis output is further scaled back below 50%.)
This hysteresis value is distributed to the Pitch Spread, Time Spread and Release Envelope controls (620, 630, 640) through distribution system 615.
The Pitch Spread control 620 is set slightly above UNISON, but because the hysteresis system is generating about 75% of maximum value, the actual Pitch Spread control output is 75% of the set value as well.
The Time Spread control 630 is set at maximum, or 100 milliseconds in this example, but because it's receiving that 75% hysteresis value (e.g., a scalar multiplier of about 0.75), the time spread output is 75% of maximum, or a CC that would produce about 75 milliseconds maximum time delay.
Envelope also works a bit differently. Notice that the Envelope Control 640 has a zero value at mid-point in the control scale. That's because the default envelope applied to the actual samples is constructed to partially truncate the sampled notes; otherwise notes would "ring out" too much in ordinary usage. Thus the incoming 75% hysteresis value has a different effect on the Envelope Control and its subsystem. Here any Hysteresis values above 0.5 push the envelope out to a longer decay-sustain-release characteristic, with a maximum occurring at 1.0, whereas Hysteresis values below 0.5 pull the envelope in to a shorter decay-sustain-release characteristic; a Hysteresis value of 0.5 is "neutral" having no effect. For this reason, although the Release Envelope 640 slider is set at mid-point (0 value), the effective envelope is longer due to the hysteresis input.
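The bipolar behavior of the Envelope control can be sketched as follows. The additive combination of slider setting and hysteresis "push", the normalized range of -1 to +1 and the function name are all assumptions; the text establishes only that 0.5 is neutral, that higher values lengthen the envelope and that lower values shorten it.

```python
def effective_envelope_offset(control_setting, hysteresis):
    """Combine a bipolar envelope slider (-1..+1, 0 at mid-point) with hysteresis.

    Returns a value in -1..+1: positive lengthens the decay-sustain-release
    characteristic, negative shortens it, zero leaves the default envelope.
    """
    control_setting = min(1.0, max(-1.0, control_setting))
    hysteresis = min(1.0, max(0.0, hysteresis))
    # A hysteresis of 0.5 is neutral; 1.0 pushes fully outward, 0.0 fully inward.
    push = 2.0 * (hysteresis - 0.5)
    return min(1.0, max(-1.0, control_setting + push))
```

With the slider at mid-point and a hysteresis of 0.75, the sketch returns a positive offset, matching the longer effective envelope described for Fig. 6.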
We have not illustrated a control for Glide Speed as it is an optional effect; it is covered generally by Fig. 1 165 as an "Articulation Modifier," since one may wish to alter some other performance aspect for a given type of musical instrument. In general, the approach to modifying Glide Speed is similar to that for the controls we have presented in detail; a nominal glide speed for a set portamento or glissando 'pitch change between notes' is set, and the actual glide speed for each player is modified by a combination of a Glide Speed control and a hysteresis scaling of that control setting, as applied differentially to progressively higher-numbered players.
As anyone skilled in the art will understand, MIDI CC values are only legal or usable between zero and 127. Therefore the combining of CC values in these controls is done in such a way that the MIDI CC value is "clamped;" any combination of subsystem control setting and hysteresis value that would tend to push the control output above 127 or below zero is restricted to these limits. The scaling notation in the Hysteresis Distribution block of the illustration 615 indicates that it may be desirable to scale the hysteresis effect before it's applied to one or another subsystem; again empirical testing will be necessary here, and it is anticipated that scaling the applied hysteresis values between subsystems may benefit from user-adjustability based on preferences or style of music being played. The scaling is thus optional as it also may not be required or it may be accomplished by other means such as altering the effective sensitivity of each subsystem by means of sampler scripting.
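The per-subsystem scaling and clamping can be illustrated with a short sketch. The scale factors in `DEFAULT_SCALES` are placeholders, not values from the text (which leaves them to empirical testing), and the function names are hypothetical.

```python
# Placeholder per-subsystem hysteresis scale factors (to be tuned empirically).
DEFAULT_SCALES = {"pitch_spread": 1.0, "time_spread": 1.0, "envelope": 1.0}


def clamp_cc(value):
    """Restrict a computed value to the legal MIDI CC range of 0 to 127."""
    return int(min(127, max(0, round(value))))


def control_output_cc(subsystem, control_cc, hysteresis, scales=DEFAULT_SCALES):
    """Apply the distributed, per-subsystem scaled hysteresis to a control setting."""
    scaled_hysteresis = hysteresis * scales[subsystem]
    # A scale factor above 1.0 could push the product past 127, hence the clamp.
    return clamp_cc(control_cc * scaled_hysteresis)
```

With the placeholder scale of 1.0, a Time Spread control at CC 127 and a hysteresis of 0.75 yields CC 95; a larger scale factor is simply held at 127 by the clamp.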

Claims

We claim:
1. A system for controlling the playing of a run of successive notes by a plurality of virtual musical instruments representing a section of virtual players which system scales certain specified performance parameters by assigning a seating location and a skill level to said virtual players in order to emulate the naturally occurring deviations that occur with an ensemble of live musicians such that,
a processor is inserted between a musical input device such as a keyboard or sequencer and a sample player or synthesizer to generate actual musical sounds, which processor is devised to analyze and react to the incoming stream of musical notes commanded to be played at its input, relies upon stored values and calculations to alter the nature of these note commands, and outputs modified note commands to the connected sample player or synthesizer;
the processor assigns virtual players to progressively higher numbered desks, which progressively higher numbers represent a progressively reduced player skill level;
the processor causes players designated to have progressively lower skill levels to play notes that have progressively greater deviations from the intended note's pitch;
the processor causes players designated to have progressively lower skill levels to initiate the playing of notes they are commanded to play after delay times that become longer with each player in proportion to their progressively lower skill levels;
the processor causes players designated to have progressively lower skill to hold notes, sounding them for longer times before releasing them in proportion to their progressively lower skill levels.
2. A method for producing pitch deviations among a plurality of virtual musical instruments representing a section of virtual players whereby players presumed to have progressively lower skill levels are caused to play notes that deviate to a greater degree from the nominal pitch of the note or notes they are supposed to be playing such that,
each of the available players is assigned a number from 1 through n, where n is the last of the available players;
player 1 is presumed to have the highest skill level, and player n the lowest skill level, with all player numbers in between assumed to have skill levels distributed evenly between the skill level of player 1 and player n;
a scale and control are devised to establish negative and positive pitch deviations whereby the minimum and maximum deviation from the nominal pitch can be set to a desired value;
player 1 is protected from any pitch deviation such that the notes played by player 1 are always at precisely the pitch commanded by the input note;
players 2 through n are subject to increasingly greater pitch deviations whenever the maximum set pitch deviation is greater than zero value such that player n always is subject to the greatest set pitch deviation;
pitch deviation alternates with each successively higher numbered player such that if one player is forced higher in pitch, the next higher number player is forced lower in pitch, and so forth;
the scaling of the pitch deviation may be altered for sonic effect such that the characteristic of the increasing pitch deviation tracks a linear, logarithmic, exponential or other set curve;
pitch deviation is executed by conveying pitch shift commands from the pitch controlling system that implements this method to the sampler or synthesizer which is producing the actual musical notes.
3. A method for producing note start time deviations among a plurality of virtual musical instruments representing a section of virtual players whereby players presumed to have progressively lower skill levels are caused to play notes that start at a progressively longer time after the initial note start command is issued such that,
each of the available players is assigned a number from 1 through n, where n is the last of the available players;
player 1 is presumed to have the highest skill level, and player n the lowest skill level, with all player numbers in between assumed to have skill levels distributed evenly between the skill level of player 1 and player n;
a scale and control are devised to establish maximum time delay between the time a note is commanded to be played at the input and the time a note playing command is actually issued a player at the output;
player 1 is protected from any time delay such that the notes played by player 1 are always at precisely the time commanded by the input note;
players 2 through n are subject to increasingly greater time delays whenever the maximum time delay is greater than zero value such that player n always is subject to the greatest set time delay;
the scaling of the time delay may be altered for sonic effect such that the characteristic of the increasing delay times tracks a linear, logarithmic, exponential or other set curve;
time delay is executed by delaying note-on commands from the time controlling system that implements this method to the sampler or synthesizer which is producing the actual musical notes.
4. A method for producing note envelope deviations among a plurality of virtual musical instruments representing a section of virtual players whereby players presumed to have progressively lower skill levels are caused to hold notes for a progressively longer time than the nominally played note length such that,
each of the available players is assigned a number from 1 through n, where n is the last of the available players;
player 1 is presumed to have the highest skill level, and player n the lowest skill level, with all player numbers in between assumed to have skill levels distributed evenly between the skill level of player 1 and player n;
a scale and control are devised to establish maximum extension of the note duration commanded to a player at the output;
player 1 is protected from any note duration change such that the notes played by player 1 always follow the note duration commanded by the input note;
players 2 through n are subject to increasingly greater note duration whenever the maximum envelope change setting is greater than zero value such that player n always is subject to the greatest increase in note duration;
the scaling of the note envelope duration may be altered for sonic effect such that the characteristic of the increasing note duration tracks a linear, logarithmic, exponential or other set curve;
note duration is executed by sending volume envelope Decay-Sustain-Release commands from the duration controlling system that implements this method to the sampler or synthesizer which is producing the actual musical notes.
5. A method to scale the player skill-level based variations of Claim 2 such that,
as the speed of playing increases, as measured by decreasing time between successively played notes, a playing-speed generated scalar value is calculated;
the playing-speed generated scalar value can be applied to alter the effect of any commanded changes in note pitch.
6. A method to scale the player skill-level based variations of Claim 3 such that,
as the speed of playing increases, as measured by decreasing time between successively played notes, a playing-speed generated scalar value is calculated;
the playing-speed generated scalar value can be applied to alter the effect of any commanded delay in note playing start times.
7. A method to scale the player skill-level based variations of Claim 4 such that,
as the speed of playing increases, as measured by decreasing time between successively played notes, a playing-speed generated scalar value is calculated;
the playing-speed generated scalar value can be applied to alter the effect of any commanded changes in the volume envelope of notes played.
PCT/US2007/015882 2006-07-12 2007-07-11 Musical performance desk spread simulator WO2008008425A2 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US80716906P 2006-07-12 2006-07-12
US60/807,169 2006-07-12

Publications (2)

Publication Number Publication Date
WO2008008425A2 true WO2008008425A2 (en) 2008-01-17
WO2008008425A3 WO2008008425A3 (en) 2008-04-10

Family

ID=38923888

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2007/015882 WO2008008425A2 (en) 2006-07-12 2007-07-11 Musical performance desk spread simulator

Country Status (1)

Country Link
WO (1) WO2008008425A2 (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6316710B1 (en) * 1999-09-27 2001-11-13 Eric Lindemann Musical synthesizer capable of expressive phrasing
US20040035284A1 (en) * 2002-08-08 2004-02-26 Yamaha Corporation Performance data processing and tone signal synthesing methods and apparatus
US6703549B1 (en) * 1999-08-09 2004-03-09 Yamaha Corporation Performance data generating apparatus and method and storage medium

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
GB2582995A (en) * 2019-04-10 2020-10-14 Sony Interactive Entertainment Inc Audio generation system and method
US11653167B2 (en) 2019-04-10 2023-05-16 Sony Interactive Entertainment Inc. Audio generation system and method

Also Published As

Publication number Publication date
WO2008008425A3 (en) 2008-04-10

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 07810376

Country of ref document: EP

Kind code of ref document: A2

NENP Non-entry into the national phase in:

Ref country code: DE

NENP Non-entry into the national phase in:

Ref country code: RU

122 Ep: pct application non-entry in european phase

Ref document number: 07810376

Country of ref document: EP

Kind code of ref document: A2