EP1265221A1 - Automatic music improvisation method and device - Google Patents


Info

Publication number
EP1265221A1
Authority
EP
European Patent Office
Prior art keywords
music
data
improvisation
source
musical instrument
Prior art date
Legal status
Withdrawn
Application number
EP01401485A
Other languages
German (de)
French (fr)
Inventor
Francois Pachet
Current Assignee
Sony France SA
Original Assignee
Sony France SA
Priority date
Filing date
Publication date
Application filed by Sony France SA filed Critical Sony France SA
Priority to EP01401485A (published as EP1265221A1)
Priority to EP02290851A (published as EP1274069B1)
Priority to US10/165,538 (published as US7034217B2)
Publication of EP1265221A1

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10H ELECTROPHONIC MUSICAL INSTRUMENTS; INSTRUMENTS IN WHICH THE TONES ARE GENERATED BY ELECTROMECHANICAL MEANS OR ELECTRONIC GENERATORS, OR IN WHICH THE TONES ARE SYNTHESISED FROM A DATA STORE
    • G10H1/00 Details of electrophonic musical instruments
    • G10H1/0008 Associated control or indicating means
    • G10H1/0025 Automatic or semi-automatic music composition, e.g. producing random music, applying rules from music theory or modifying a musical piece
    • G10H2210/00 Aspects or methods of musical processing having intrinsic musical character, i.e. involving musical theory or musical parameters or relying on musical knowledge, as applied in electrophonic musical tools or instruments
    • G10H2210/031 Musical analysis, i.e. isolation, extraction or identification of musical elements or musical parameters from a raw acoustic signal or from an encoded audio signal
    • G10H2210/061 Musical analysis for extraction of musical phrases, isolation of musically relevant segments, e.g. musical thumbnail generation, or for temporal structure analysis of a musical piece, e.g. determination of the movement sequence of a musical work
    • G10H2250/00 Aspects of algorithms or signal processing methods without intrinsic musical character, yet specifically adapted for or used in electrophonic musical processing
    • G10H2250/005 Algorithms for electrophonic musical instruments or musical processing, e.g. for automatic composition or resource allocation
    • G10H2250/015 Markov chains, e.g. hidden Markov models [HMM], for musical processing, e.g. musical analysis or musical composition


Abstract

The invention provides a method for automatically generating music that constitutes an improvisation, characterised in that it comprises the steps of:
  • establishing a data base of music patterns (42),
  • detecting current music data (12),
  • generating with reference to said music patterns an improvisation as a continuation in real time of said current music data (4).
In this way, an improvisation is performed in a seamless manner as a continuation of a musical piece played by an instrument.
The invention also provides the possibility of controlling the direction of improvisation through control data, e.g. in the form of notes.

Description

  • The invention relates to a device and process for automatically improvising music such that it follows on seamlessly and in real time from music produced from an external source, e.g. a musical instrument being played live.
  • It can serve to simulate an improvising performing musician, capable for instance of completing a musical phrase started by a human musician, taking up instantly with an improvisation that takes into account the immediate musical context, style and other characteristics.
  • In this respect, the invention contrasts with prior computerised music composing systems, which can be classed into two types:
  • i) systems which compose on demand autonomous musical pieces in the manner of a certain composer, performing artist or musical genre, but which do not adapt coherently to a live musical environment; and
  • ii) "question-answer" type systems, i.e. in which a player inputs a music sequence and the systems replies with a complementary music sequence. The latter is influenced by the player input, but forms an independent musical item with a clear break point marking the switch from the player input (question) to the artificially generated response (answer).
  • Musical improvisation, especially in Jazz, is both a fascinating activity and a very frustrating one. Improvisation by a human musician requires an intimate relationship between musical thought and sensory-motor processes: the musician must listen, think, develop ideas and move his/her fingers very quickly. Speed and lack of time are crucial ingredients of improvisation; it is what makes it exciting. It is also what makes it frustrating: beginners as well as experienced improvisers are by definition limited by their technical abilities, and by the morphology of the instrument.
  • The invention can overcome this hurdle by creating meta instruments which address this issue explicitly: providing fast, efficient and enhanced means of generating interesting improvisation, in a real-world, real-time context.
  • Music improvisation has long been an object of study for computer scientists. Various systems and models of improvisation have been designed, using a wide variety of techniques. Most of the approaches have consisted in either building partial or fully-fledged improvisation systems (Biles, "Interactive GenJam: Integrating Real-Time Performance with a Genetic Algorithm", Proc. ICMC 98, Ann Arbor, Michigan; Ramalho et al., "Simulating Creativity in Jazz Performance", Proc. of the National Conference on Artificial Intelligence, pp. 108-113, AAAI-94, Seattle, AAAI Press) or in proposing cognitive models of improvisation (Johnson-Laird, "Jazz Improvisation: A Theory at the Computational Level", in Representing Musical Structures, P. Howell, R. West and I. Cross, eds., Academic Press, pp. 291-325).
  • In all cases, no system has so far succeeded in creating fully convincing improvisations: the feeling of automation is never avoided. Moreover, most of the existing systems require a considerable amount of explicit, symbolic information to be given by a human to the system, such as human input for supervising learning, the underlying chord sequence, the tempo, the song structure, etc.
  • In view of the foregoing, the invention departs from these known approaches in several fundamental ways.
  • First, it provides for seamless integration in the playing mode of the musician, as opposed to traditional question/answer or fully automatic systems.
  • Second, it is based on agnostic learning, which does not require any a priori knowledge of the musician's style, harmonic grid, or tempo.
  • Third, it adapts quickly and without human intervention to unexpected changes in rhythm, harmony or style.
  • Finally, it can be made easily and intimately controllable by the musician - an extension of a musical instrument - rather than as an actual intelligent and autonomous musical agent. The resulting system achieves very good performance by basically replacing explicit, symbolic knowledge and autonomy by intimate control.
  • More particularly, a first object of the invention is to provide a method of automatically generating music that constitutes an improvisation, characterised in that it comprises the steps of:
    • establishing a data base of music patterns,
    • detecting current music data,
    • generating with reference to the music patterns an improvisation as a continuation in real time of the current music data.
  • The music patterns can be musical phrases extracted from a stream of music data, each phrase constituting a respective music pattern.
  • In the preferred embodiment, the continuation is determined from potential root nodes of a tree structure expressing music patterns.
  • To this end, the embodiment can comprise the steps of:
    • mapping in a hash table predetermined data, such as pitch, against a set of tree nodes containing the data,
    • using the hash table to obtain a list of potential root nodes to start from for generating the continuation, and
    • updating the hash table as a function of the creation of new nodes in the tree structure.
  • The tree structure can be constructed in accordance with the Lempel-Ziv algorithm using a data compression scheme.
  • The generating step may comprise the steps of:
    • extracting at least one traversal of the tree structure to obtain a set of possible continuations following on from the current music data, and
    • choosing a music sequence continuation from the set by performing a random draw.
  • The above procedure can preferably also include the step of applying a weighting to nodes of the tree.
  • Advantageously, the invention can provide an improvisation control mode, comprising the steps of:
    • receiving music control data, typically (although not necessarily) during the improvisation construction step, and
    • selecting at least one pattern for the improvisation on the basis of the music control data.
  • The music control data can comprise a sequence of n notes, where n is an arbitrarily chosen number.
  • The continuation can be determined from potential root nodes of a tree structure expressing music patterns, wherein the selecting step comprises the step of attributing, to each of one or more nodes of the tree structure, a weight as a function of how a sequence associated to a given node matches with the control data.
  • The weight attributing step may comprise the steps of:
    • determining a harmonic weighting function (Harmo_prob) expressed as: Harmo_prob(X) ≡ |notes(X) ∩ Ctrl| / |notes(X)|,    where notes(X) designates the set of pitches represented by a node X, and Ctrl designates the last n notes forming the control data, whereby:
      • if X is a note, then |notes(X)| = 1 and Harmo_prob(X) = 0 or 1, and
      • if X is a chord, then Harmo_prob(X) belongs to [0,1], and is maximal (1) when all the notes of X are in the set of external notes,
    • using the harmonic weighting as at least a component of an overall weighting function (Weight(X)) to apply to nodes of the tree.
  • The overall weighting function, for a possible node X, can be defined as: Weight(X) ≡ (1 - S)*LZ_prob(X) + S*Harmo_prob(X),    where LZ_prob(X) is the probability of X in the Tree, and S is a tuning parameter.
  • The method can further comprise the step of providing a jumping procedure comprising the steps of:
    • determining whether, for a given input sub-sequence seq, none of the possible continuations have a non-zero Harmo_prob value,
    • if none of the possible continuations have a non-zero Harmo_prob value, making a random draw weighted by S, as follows:
    • If Weight(X) ≤ S, and
    • If the value of the random draw is less than S, then
    • effecting the jump by restarting the computation of the next node by taking the whole set of notes of the tree.
  • The step of establishing a data base can comprise the step of discriminating between chords and non-chordal note successions, chords being identified as notes separated from each other by a time interval shorter than a predetermined threshold.
  • The improvisation generating step is preferably halted upon detecting current music data. Conversely, the improvisation generating step is preferably started upon detecting an interruption in the current music data.
  • The music patterns forming the data base can originate from a source, e.g. music files, different from the source producing the current music data, e.g. a musical instrument.
  • The music patterns can also be extracted from the source producing the current music data, e.g. a musical instrument.
  • The music control data can be produced from a source, e.g. a musical instrument, different from the source, e.g. another musical instrument, producing the current music data.
  • According to a second aspect, the invention provides a device for automatically generating music that constitutes an improvisation, characterised in that it comprises:
    • means for establishing a data base of music patterns,
    • means for detecting current music data, and
    • means for generating with reference to said music patterns an improvisation as a continuation in real time of the current music data.
  • The optional features presented supra in the context of the method according to the invention are applicable mutatis mutandis to the above device.
  • Namely, the above device may further comprise means for extracting musical phrases from a stream of music data, each phrase constituting a respective music pattern.
  • The improvisation generating means may comprise selection means for selecting music patterns from potential root nodes of a tree structure expressing music patterns.
  • Preferably, the device further comprises:
    • hash table means for mapping predetermined data, such as pitch, in a hash table against a set of tree nodes containing the data,
    • means for obtaining from the hash table a list of potential root nodes to start from for generating the continuation, and
    • means for updating the hash table as a function of the creation of new nodes in the tree structure.
  • Further, the generating means may comprise:
    • means for extracting at least one traversal of the tree structure to obtain a set of possible continuations following on from the current music data, and
    • means for choosing a music sequence continuation from the set by performing a random draw.
  • The above device may further comprise means for applying a weighting to nodes of the tree.
  • The device may further be equipped with an improvisation control mode, comprising:
    • means for receiving music control data (Ctrl), typically (although not necessarily) during the improvisation construction step, and
    • means for selecting at least one pattern for the improvisation on the basis of the music control data.
  • The continuation may be determined from potential root nodes of a tree structure expressing music patterns, the selecting means preferably comprising means for attributing, to each of one or more nodes of the tree structure, a weight as a function of how a sequence associated to a given node matches with the control data (Ctrl).
  • Suitably, the device further comprises at least one of:
  • i) means for halting the improvisation generation upon detecting current music data; and
  • ii) means for starting the improvisation generating upon detecting an interruption in the current music data.
  • According to a third aspect, the invention provides a music improvisation system, characterised in that it comprises:
    • a device as defined above,
    • a first source of music data operatively connected to supply data to the data base,
    • a second source of music data producing the current music data, e.g. a musical instrument.
  • The first source of audio data can be one of:
  • i) music file data, and ii) an output from a musical instrument; and the second source of audio data can be a musical instrument.
  • The first source and second source of audio data can equally be a same musical instrument or respective musical instruments.
  • According to a fourth aspect, the invention provides a music improvisation system, characterised in that it comprises:
    • a device as above, equipped with an input for handling control data, and
    • an improvisation control source operatively connected to the device for delivering the music control data during the improvisation.
  • The improvisation control source can be a musical instrument different from a musical instrument forming the second source of audio data.
  • According to a fifth aspect, the invention provides a system comprising:
    • at least first and second devices as defined above,
    • a first musical instrument and a second musical instrument different from the first musical instrument,
       wherein
    • the first musical instrument is operatively connected as a source of data for the data base of music patterns of the first device and as a source of current music data for the second device, whereby the second device generates an improvisation with a sound of the first musical instrument referring to a data base produced from the second instrument, and
    • the second musical instrument is operatively connected as a source of data for the data base of music patterns of the second device and as a source of current music data for the first device, whereby the first device generates an improvisation with a sound of the second musical instrument referring to a data base produced from the first instrument.
  • According to a sixth aspect, the invention provides a software package comprising functional units for implementing the method according to the first object with a computer.
  • The invention and its advantages shall become more apparent from reading the following description of the preferred embodiments, given purely as non-limiting examples, with reference to the appended drawings in which:
    • figure 1 is a general block diagram showing the functional elements of an improvisation system in accordance with a preferred embodiment of the invention,
    • figure 2 is a simplified example of an LZ Tree structure used in a traversal Tree algorithmic structure within the system of figure 1,
    • figure 3 is an example of different user controls offered by the embodiment through its on-screen graphic computer interface, and
    • figure 4 is a diagram showing how several systems of figure 1 can be interconnected to implement a sharing mode.
  • A music improvisation system 1 according to a preferred embodiment of the invention is based on a combination of two modules: a learning module 2 and a generator/continuation module 4, both working in real time. The input 6 and the output 8 of the system are streams of Midi information. The system 1 is able to analyse and produce pitch, amplitude and rhythm information (onsets and duration).
  • The system accommodates several playing modes; it can adopt an arbitrary role and cooperate with any number of musicians.
  • In the standard playing mode, the system 1 is used by one musician, whose musical instrument, e.g. an electric guitar 10, has a Midi-compatible output connected to a Midi input interface 12 of the learning module 2 via a Midi connector box 14.
  • The output 8 of the system 1 is taken from a Midi output interface 16 to a Midi synthesiser 18 and then to a sound reproduction system 20. The latter plays through loudspeakers 22 either the audio output of the system 1 or the direct output from the instrument 10, depending whether the system or the instrument is playing.
  • The learning module 2 and the generator/continuation module 4 are under the overall control of a central management and software interface unit 24 for the system. This unit is functionally integrated with a personal computer (PC) comprising a main processing unit (base station) 26 equipped with a mother board, memory, support boards, CDrom and/or DVDrom drive 28, a diskette drive 30, as well as a hard disk, drivers and interfaces. The software interface 24 is user accessible via the PC's monitor 32, keyboard 34 and mouse 36. Optionally, further control inputs to the system 1 can be accessed from pedal switches and control buttons on the Midi connector box 14, or Midi gloves.
  • In the standard situation, the system acts as a "sequence continuator": the note stream of the musician's instrument 10 is systematically segmented into phrases by a phrase extractor 38, using a temporal threshold (200 milliseconds). Each phrase resulting from that segmentation is sent asynchronously from the phrase extractor 38 to a phrase analyser 40, which builds up a model of recurring patterns. In reaction to the played phrase, the system also generates a new phrase, which is built as a continuation of the input phrase, and not as an "answer" as in the case of the Biles reference supra.
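  • A minimal sketch of this segmentation step follows, in Java (the document mentions a Java implementation). It assumes notes reduced to (pitch, velocity, onset in milliseconds); all names are illustrative, and the gap is measured onset-to-onset as a simplification of the silence-based threshold described above.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch of the phrase extractor's thresholding logic.
final class PhraseExtractor {
    private static final long PHRASE_GAP_MS = 200; // temporal threshold from the text

    static List<List<int[]>> segment(List<int[]> notes) {
        List<List<int[]>> phrases = new ArrayList<>();
        List<int[]> current = new ArrayList<>();
        long lastOnset = Long.MIN_VALUE;
        for (int[] note : notes) {                 // note = {pitch, velocity, onsetMs}
            long onset = note[2];
            // A gap longer than the threshold closes the current phrase.
            if (!current.isEmpty() && onset - lastOnset > PHRASE_GAP_MS) {
                phrases.add(current);
                current = new ArrayList<>();
            }
            current.add(note);
            lastOnset = onset;
        }
        if (!current.isEmpty()) phrases.add(current);
        return phrases;
    }
}
```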
  • To build the continuation, the learning module 2 systematically learns all melodic phrases played by the musician. The technique consists in progressively building a database of recurring patterns 42 detected in the input sequences produced by the phrase analyser 40. For this purpose, the learning module 2 uses a data compression scheme (implemented by module 44) adapted from the scheme described by Lempel and Ziv in the paper "Compression of individual sequences via variable-rate coding", IEEE Transactions on Information Theory, vol. 24, no. 5, pp. 530-536, 1978. This scheme, referred to hereafter as the LZ scheme, has been shown to be well suited for capturing melodic patterns efficiently, as explained by Assayag et al. in the paper "Guessing the composer's mind: applying universal prediction to musical style", Proc. ICMC 99, Beijing, China, I.C.M.A., San Francisco, USA.
  • The technique consists in building a prefix Tree by a simple, linear analysis of each input sequence. Each time a sequence is input to the system 1, it is parsed from left to right and new prefixes encountered are systematically added to the Tree. The principle of the LZ Tree is described in detail in the Assayag et al paper.
  • The LZ scheme is itself well known in data compression, the algorithm itself dating back to the 1970s. The idea of applying LZ to musical modelling has been proposed by various authors (Assayag, Dubnov, Delerue and others), but only in the context of off-line musical composition or musical style classification.
  • By contrast, the present invention adapts this scheme to the context of real-time music improvisation. To this end, the embodiment provides an efficient data structure based on 1) a standard Lempel-Ziv Tree structure and 2) a hash table (dictionary) 46 that allows direct access to the nodes in the structure containing a given data item. This serves to ensure that learning can be done in real time (in less than a few milliseconds). Only information related to pitch and velocity (Midi) is recorded in the Tree. Rhythmic information is discarded in the present embodiment, although it can also be accommodated if required.
  • Learning can be done on the fly with the user playing from scratch. It can also be done from a library of predefined musical styles (for instance in the style of well known Jazz artists such as Pat Martino, John McLaughlin, John Coltrane, Charlie Parker, etc.).
  • Note that in the embodiment, the handling of the LZ algorithm is modified in several ways, covered in sections 1 to 3 infra.
  • 1. Implementing an efficient Tree structure.
  • Whereas most usage of LZ for music is for batch processing (e.g. composition or classification), the present application calls for an efficient representation of the LZ Tree, both for Tree updating (learning) and traversal (generation). The Assayag et al. paper proposes a dual representation of the LZ Tree which is supposed to speed up the traversal, but which takes up much more space, basically doubling the size of the Tree.
  • Very good results have been obtained by exploiting the basic Tree structure, augmented with a Hash table 46 mapping each data item (here a Midi pitch) to the set of Tree nodes containing this data. The Hash table 46 is updated each time a new node is created. When a Tree traversal must be effected from an input sequence starting with item i, the Hash table directly yields the list of potential root nodes to start from.
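  • A minimal sketch of this Tree-plus-index structure, in Java, follows. All names are illustrative, not taken from the patent; the node's visit count stands in for the sub-Tree size used later as its weight.

```java
import java.util.*;

// Illustrative sketch of an LZ Tree node with the shared pitch index ("hash table 46").
final class LZNode {
    final int pitch;
    int count = 0;                                   // visit count, used as the node's weight
    final Map<Integer, LZNode> children = new HashMap<>();

    // pitch -> all nodes in the Tree holding that pitch; updated on node creation
    static final Map<Integer, List<LZNode>> INDEX = new HashMap<>();

    LZNode(int pitch) {
        this.pitch = pitch;
        INDEX.computeIfAbsent(pitch, k -> new ArrayList<>()).add(this);
    }

    LZNode childFor(int p) {                         // fetch or create a child
        LZNode c = children.computeIfAbsent(p, LZNode::new);
        c.count++;                                   // weight grows with each visit
        return c;
    }
}
```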
  • The generator/continuation module 4 of the system 1 is the real-time continuation mechanism, which generates the music in reaction to an input sequence. The generation is performed using a traversal of the Tree, through a traversal Tree module 48, to obtain the set of all possible continuations of the sequence. Then, the following item is chosen by a random draw, weighted by the probabilities of each possible continuation. This function is provided by a random draw and weighting module 50.
  • Music input to the generator/continuation module 4 is taken from the real time Midi source input interface 12 via an internal connection L1. In this way, the module 4 receives, at least, pitch and velocity information from the instrument 10.
  • In operation, the system 1 automatically detects musical phrases, by means of a given time threshold applied to the phrase extractor 38. When a musical phrase is detected, it is sent to the phrase analyser 40 of the learning module 2 and, through internal connection L1, to the generator/continuation module 4. The latter computes a possible continuation for the input phrase, of a given length (a parameter), and outputs it to the sound reproduction system 20 (e.g. through a Midi scheduler). Usually, prefixes are only looked up from the root node. In order to ensure completeness, the embodiment introduces a modification in the access method of the Tree. Since a given prefix can be located anywhere in the Tree structure, a check is made for all its possible occurrences in any part of the Tree. The selection of the "next" node to take is based on the union of all possible continuations thus found, associated with the corresponding weights.
  • The Tree traversal is extended to ensure a complete search. By definition, an LZ Tree usually contains only partial information about patterns present in the input sequence, and is not complete (as opposed to classical Markov models). Consequently, a given pattern can be present at several locations in the Tree. For instance, in the Tree shown in figure 2, the pattern ABC is present twice (at the nodes marked with an asterisk).
  • To detect these two occurrences, the usual Tree traversal mechanism is augmented slightly by computing all possible continuations for a given input sequence. In the present case, with an input sequence ABC, the hash table 46 yields three nodes having data "A", so three "standard" traversals are performed, eventually yielding only two sets of possible continuations (here, D, D and E). These continuations are aggregated and the next item is drawn according to the respective weights of each node. The system can handle over 35 000 LZ Tree nodes in real time (i.e. with a response time of less than 200 milliseconds), with a Java implementation on a personal computer (PC) running on a Pentium III microprocessor and using "MidiShare" (Pentium III is a registered trademark of Intel Inc.).
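  • The augmented traversal can be sketched as follows, under the same assumptions as the previous sketch: every node matching the first item is fetched from the index, the input pattern is followed downwards from each, and the surviving nodes' children are pooled for a weighted random draw.

```java
import java.util.*;

// Illustrative sketch of the complete-search traversal and weighted draw.
final class Continuator {
    static Integer next(int[] pattern, Random rng) {
        List<LZNode> pool = new ArrayList<>();
        for (LZNode start : LZNode.INDEX.getOrDefault(pattern[0], List.of())) {
            LZNode n = start;
            for (int i = 1; n != null && i < pattern.length; i++)
                n = n.children.get(pattern[i]);      // follow the pattern downwards
            if (n != null) pool.addAll(n.children.values()); // collect its continuations
        }
        if (pool.isEmpty()) return null;             // no continuation found anywhere
        int total = pool.stream().mapToInt(c -> c.count).sum();
        int draw = rng.nextInt(total);               // weighted random draw
        for (LZNode c : pool) {
            draw -= c.count;
            if (draw < 0) return c.pitch;
        }
        return pool.get(pool.size() - 1).pitch;
    }
}
```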
  • The preferred embodiment makes use of the LZ (Lempel-Ziv) algorithm in its capacity to build a Tree of recurring patterns in an efficient way, but not in its compressive capacity, as in usual implementations. More particularly, there is provided a more efficient use of this method, so that real-time learning can be performed.
  • In data compression, the above-mentioned Tree is built and eventually used to build a compressed representation of a given sequence. The embodiment, however, does not utilise this compression, but merely uses the Tree of patterns for generating other sequences "in the style of" the previously inputted sequences.
  • The real-time learning can be performed by any method, as long as it can be done quickly (as in the case of the Lempel-Ziv algorithm, which involves only one traversal of the input sequence), and as long as it can quickly produce a structure usable to complete the input sequence.
  • It is recalled that Lempel-Ziv parsing consists in building a prefix Tree by scanning the sequence from its beginning (the root is on the left) to the right, adding each newly encountered prefix to the Tree.
  • For instance, the sequence A B C A B C D A B C C D E produces the successive prefixes
    A, B, C, AB, CD, ABC, CDE
    which yield the following Tree:
    (root)
    ├── A ── B ── C
    ├── B
    └── C ── D ── E
  • The Tree does not contain all the patterns in the sequence. However, it does converge, for infinite sequences, to the entropy of the sequence, i.e. it has good properties in the long term.
  • Now, the Tree can be used for generating a sequence by choosing each time a node among possible continuations, according to its "weight", cf. unit 50. The weight is in fact the size of its sub-Tree, and corresponds, by construction, to its probability of occurrence.
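  • A sketch of this parse, reusing the LZNode class above: follow matching children from the root until an unseen symbol appears, attach it as a new leaf (a new prefix), then restart from the root. With pitches standing in for the letters, the sequence above yields exactly the prefixes A, B, C, AB, CD, ABC, CDE.

```java
// Illustrative sketch of incremental Lempel-Ziv parsing into the prefix Tree.
final class LZLearner {
    static void learn(LZNode root, int[] sequence) {
        LZNode current = root;
        for (int symbol : sequence) {
            LZNode child = current.children.get(symbol);
            if (child != null) {
                child.count++;                       // known prefix: walk deeper
                current = child;
            } else {
                current.childFor(symbol);            // new prefix: add a leaf...
                current = root;                      // ...and restart at the root
            }
        }
    }
}
```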
  • 2. Handling of chords.
  • The embodiment introduces a representation of chords so that the LZ Tree can handle single notes or chords indifferently. Chords are abstract entities: a Midi stream is by definition a stream of single notes, so chords as such do not exist in reality for a Midi stream. What shall be termed a chord in the present context is any stream of n notes whose total duration is less than a given threshold, e.g. 40 milliseconds. When such a stream is detected, a single chord object aggregating all the notes in the stream is created, instead of n single-note objects.
  • To allow comparison of this chord object with other chord objects, each chord has a canonical representation that takes into account basic facts about tonal harmony:
    • pitches are considered only modulo 12 to allow octave equivalence,
    • pitches are sorted, so that streams of the same pitches in different orders are considered similar (see the sketch below).
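  • A minimal sketch of this canonical chord form in Java, assuming raw Midi pitch numbers; names are illustrative.

```java
import java.util.TreeSet;

// Illustrative sketch: pitches reduced modulo 12 (octave equivalence), sorted and
// de-duplicated, so the same notes in any order and any octave compare equal.
final class Chord {
    static int[] canonical(int[] midiPitches) {
        TreeSet<Integer> classes = new TreeSet<>();  // sorted, duplicates removed
        for (int p : midiPitches) classes.add(p % 12);
        return classes.stream().mapToInt(Integer::intValue).toArray();
    }
}
// canonical({64, 48, 52, 67}) and canonical({60, 64, 67}) both give {0, 4, 7}.
```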
  • 3. Implicit management of harmony.
  • Harmony is a fundamental notion in most forms of music, Jazz being a particularly good example in this respect. Chord changes play an important role in deciding whether notes are "right" or not. It is important to note that while harmony detection is extremely simple to perform for a normally trained musician, it is extremely difficult for a system to express and represent harmony information explicitly, especially in real time. Accordingly, a design choice of the system according to the present embodiment is not to manage harmony explicitly. This choice is justified on three considerations:
  • i) because of the intimate control of the system, it is easy for the musician to correct the system in case it really goes too far out of tune, by simply playing a few notes (e.g. the third and fifth) and relaunching the system 1 in a new, correct, direction. To this end, the generator/continuation module 4 has a module 52 for external harmonic control, which cooperates with the random draw/weighting module 50. The external harmonic control module 52 presents an external input connector 54 for receiving the above mentioned notes. These "steering notes" effectively redirect the improvisation towards a chosen direction;
  • ii) because the system continuously learns, it eventually also learns the chord changes in the pattern base. For instance, playing tunes such as "So What" by Miles Davis (alternation of D minor and D# minor) creates in the long run patterns with this chord change;
  • iii) finally, and perhaps most importantly, the embodiment is designed to have a control mode through module 52 supra that actually allows the system to take into account external harmonic information without unduly complicating the data representation scheme.
  • The idea is to introduce a constraint facility in the generation phase. External information may be sent as additional input to the system via the harmonic mode control input 54. This information can typically be the last n notes (pitches) played by any external source (e.g. a piano with Midi interface 56) in a piano-guitar ensemble, where the guitar 10 is connected to Midi input interface 12, for instance. The value of n can be set arbitrarily to 8, to provide steering on the basis of the last eight notes. External input 54 is thus used to influence the generation process as follows. When a set of possible continuation nodes is computed with the LZ Tree, as described above, instead of choosing a node according to its weight (probability), the random draw and weighting module 50 is set to weight the nodes according to how they match the notes presented at the external input 54. For instance, it can be decided to give preference to nodes whose pitch is included in the set of external pitches, so as to favour branches of the Tree having common notes with the piano comping. In this case, the harmonic information is provided in real time by one of the musicians (here the pianist), without intervention of the user, and without having to explicitly enter the harmonic grid in the system. The system then effectively matches its improvisation to the thus-entered steering notes.
  • This matching is achieved by a harmonic weighting function designated "Harmo_prob" and defined as follows.
  • Consider a set of external notes, designated Ctrl, entered into the harmonic control module 52 through input 54. These notes Ctrl are taken to correspond to the last n notes entered at input 54, coming e.g. from a piano 56, while Midi input interface 12 is connected to a guitar 10 and the synthesiser 18 that is connected to the Midi output interface 16 is a guitar synthesiser.
  • Consider now the set of pitches represented by node X, designated notes(X). The harmonic weighting function for node X can then be expressed as: Harmo_prob(X) ≡ |notes(X) ∩ Ctrl| / |notes(X)|
  • If X is a note (and not a chord), then |notes(X)| = 1 and Harmo_prob(X) = 0 or 1.
  • If X is a chord, then Harmo_prob(X) belongs to [0,1], and is maximal (1) when all the notes of X are in the set of external notes.
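  • A direct transcription of this function in Python might read as follows; the set-valued arguments are an assumption about how notes(X) and Ctrl are stored:

```python
def harmo_prob(notes_x: set, ctrl: set) -> float:
    """Fraction of the node's pitches found among the steering notes:
    |notes(X) ∩ Ctrl| / |notes(X)|, hence a value in [0, 1]."""
    if not notes_x:
        return 0.0
    return len(notes_x & ctrl) / len(notes_x)
```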
  • There is then defined a new function for choosing the next node in the LZ Tree. Consider 1) LZ_prob(X), the probability of X in the LZ Tree, and 2) Harmo_prob(X), the harmonic weighting function, which assigns to node X in the Tree a weight representing how closely the node matches the external input. Both LZ_prob and Harmo_prob assign values in [0,1]. The aim is to achieve a compromise between these two weighting schemes. To introduce some flexibility, the system 1 adds a parameter S that allows the overall weighting scheme to be tuned, so that the weight can take on a range of intermediate values between two extremes. When S = 0, the weighting scheme is equal to the standard probability-based weighting scheme. When S = 1, the weighting scheme is equivalent to the harmonic function.
  • The weight function is therefore defined as follows, where X is a possible node: Weight(X) ≡ (1 - S)*LZ_prob(X) + S*Harmo_prob(X).
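  • In code, this compromise is a simple linear interpolation; S = 0 recovers the plain LZ-tree probability and S = 1 the purely harmonic weighting:

```python
def weight(lz_prob_x: float, harmo_prob_x: float, s: float) -> float:
    """Weight(X) = (1 - S) * LZ_prob(X) + S * Harmo_prob(X)."""
    return (1.0 - s) * lz_prob_x + s * harmo_prob_x
```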
  • Finally, the system 1 introduces a "jumping procedure", which makes it possible to avoid a drawback of the general approach. Indeed, it may be the case that, for a given input sub-sequence seq, none of the possible continuations has a non-zero Harmo_prob value. In such a case, the system 1 introduces the possibility of "jumping" back to the root of the LZ Tree, to allow the generated sequence to be closer to the external input. Of course, this jump should not be made too often, because the stylistic consistency represented by the LZ Tree would otherwise be broken. The system 1 therefore performs this jump by making a random draw weighted by S, as follows:
  • If Weight(X) ≤ S, and
  • If the value of the random draw is less than S
  • Then make a jump, that is, restart the computation of the next node by taking the whole set of notes of the LZ Tree, rather than the natural continuation of seq.
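  • One possible reading of this selection-with-jump step is sketched below, reusing harmo_prob and weight from the sketches above. The candidate objects with notes and lz_prob attributes, and the collapsing of the two tests above into a single draw weighted by S, are our simplifying assumptions, not the patent's exact procedure:

```python
import random

def choose_next(candidates, root_candidates, ctrl, s):
    """Pick the next node, jumping back to the root of the LZ Tree when
    no candidate continuation matches the steering notes."""
    if all(harmo_prob(c.notes, ctrl) == 0.0 for c in candidates):
        if random.random() < s:           # jump sparingly, weighted by S
            candidates = root_candidates  # restart from the whole Tree
    weights = [weight(c.lz_prob, harmo_prob(c.notes, ctrl), s)
               for c in candidates]
    if sum(weights) == 0.0:  # degenerate case: fall back to a uniform draw
        return random.choice(candidates)
    return random.choices(candidates, weights=weights, k=1)[0]
```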
  • The system 1 neither learns nor produces rhythmic information. In the context of Jazz improvisation, for instance, it was found that letting the continuation system produce rhythm by combining patterns is very awkward, and in fact limits its usability. Instead, the preferred embodiment generates phrases of eighth notes. Velocity is enough to produce phrases that sound human to the point of being indistinguishable from human playing. The system 1 can actually also generate non-linear streams of notes by simply using the rhythmic pattern of the input sequence and mapping it onto the output sequence, as sketched below. In any case, no rhythmic information is learned.
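  • As an illustration, reusing the input rhythm can be as simple as pairing the generated pitches with the input onsets; map_rhythm is an illustrative name and assumes the NoteEvent objects sketched earlier:

```python
def map_rhythm(input_events, output_pitches):
    """Give the generated pitches the timing of the input sequence;
    no rhythmic information is learned or generated."""
    return [NoteEvent(onset_ms=ev.onset_ms, pitch=p)
            for ev, p in zip(input_events, output_pitches)]
```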
  • In the embodiment, Midi files of improvisations are used as input sequences. To this purpose, Midi files of known improvisers can be used, provided they are long enough to let the system actually capture the recurring patterns.
  • Further, the learning scheme can be carried out several times to accelerate the learning process.
  • The embodiment described is intentionally limited in its musical knowledge. The limitations are of three kinds: 1) rhythmic, 2) harmonic, and 3) polyphonic. Interestingly, it has turned out that these limitations are actually strengths, because of the control mode. In some way, these limitations are compensated for by control.
  • The system raises several control issues. First, with a view to providing intimate control by the musician, the embodiment implements a set of basic controllers that are easy to trigger in real time. These are accessible through the software interface 24 in the form of on-screen pushbuttons and pull-down menus on the PC monitor 32, which can be activated through the computer keyboard 34 and/or mouse 36.
  • An example of a typical screen page graphic of the computer interface displayed on the monitor 32 is shown in figure 3.
  • Among the different controllable parameters are the following basic controls:
    • "learn on/off" (tick box 60), to set the learning process on or off
    • "continuation on/off" (tick box 62) to tell the system to produce continuations of input sequences or not, and
    • "superposition on/off" (tick box 64), to tell the system whether it should stop its generation when a new phrase is detected, or not.
  • The last control is particularly useful. By default, the system stops playing when the user starts or resumes playing, to avoid superposition of improvisations. With a little training, this mode can be used to produce a unified stream of notes, thereby producing an impression of seamlessness. In other words, the system 1 takes over with its improvisation immediately from the point where the musician (guitar 10) stops playing, and ceases instantly when the musician starts to play again. These controls are implemented with a foot controller of the Midi connector box 14 when enabled by the basic controls on screen (tick boxes). They can also be implemented with "Midi" gloves.
  • When the system 1 is silent owing to the presence of music output from the instrument 10, it continues to analyse that output as part of its continuing learning process, as explained above. An internal link L2 is active in this case to also send the music output of the instrument from the Midi input interface 12 to the Midi output interface 16, so as to allow the instrument to be heard through the Midi synthesiser 18, sound reproduction system 20 and speakers 22.
  • Additionally, the software interface allows a set of parameters to be adjusted from the screen 32 (see the sketch after this list), such as:
    • the number of notes to be generated by the system (as a multiplicative factor of the number of notes in the input sequence) (box 66), and
    • the tempo of the generated sequence (as a multiplicative factor of the tempo of the incoming sequence) (box 68).
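  • For illustration, these controls and parameters can be grouped into a single settings object; the field names are ours, not part of the embodiment:

```python
from dataclasses import dataclass

@dataclass
class ImproviserSettings:
    learn: bool = True              # tick box 60: learning on/off
    continuation: bool = True       # tick box 62: produce continuations or not
    superposition: bool = False     # tick box 64: keep playing over the musician
    note_count_factor: float = 1.0  # box 66: output length vs input length
    tempo_factor: float = 1.0       # box 68: output tempo vs input tempo
```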
  • An interesting consequence of the design of the system 1 is that it leads to several remarkable new playing modes, including:
    • Single autarcy: one musician plays with the system 1 after having fed the system with a database of improvisations by a famous musician, as Midi files. Successful results have been achieved in this way with a database of Midi choruses from Pat Martino;
    • Multiple autarcy: each musician has his/her own system 1, with its own database;
    • Master/Slave: one musician (e.g. guitarist) uses the system in its basic form, another (e.g. pianist) provides the external data (steering notes) at the harmonic control mode input 54 to influence the generation of the music improvisation;
    • Cumulative: all musicians share the same pattern database;
    • Sharing: each musician plays with the pattern database of the other (e.g. piano with guitar, etc.). This creates exciting new possibilities, as a musician can experience playing with unusual patterns.
  • Figure 4 shows an example of a set-up for the sharing mode in the case of a guitar and piano duo (of course, other instruments outside this sharing mode can be present in the music ensemble). Here, each instrument in the sharing mode is non-acoustic and composed of two functional parts: the played portion and a respective synthesiser. For the guitar, these portions are respectively the main guitar body 10 with its Midi output and a guitar synthesiser 18b. For the piano, they are respectively the main keyboard unit with its Midi output 56 and a piano synthesiser 18a.
  • Two improvisation systems 1a and 1b as described above are used. The elements shown in figure 4 in connection with these systems are designated with the same reference numerals as in figure 1, followed by an "a" or "b" depending on whether they depend from improvisation system 1a or 1b respectively.
  • One of the improvisation systems 1a has its Midi input interface 12a connected to the Midi output of the main guitar body 10 and its Midi output interface 16a connected to the input of the piano synthesiser 18a. The latter thus plays the improvisation of system 1a, through the sound reproduction system 20a and speakers 22a, based on the phrases taken from the guitar input.
  • The other improvisation system 1b has its Midi input interface 12b connected to the Midi output of the main keyboard unit 56 and its Midi output interface 16b connected to the Midi input of the guitar synthesiser 18b. The latter thus plays the improvisation of system 1b, through the sound reproduction system 20b and speakers 22b, based on the phrases taken from the piano input.
  • This inversion of synthesisers 18a and 18b remains operative all the while the improvisation is active. When a musician starts playing, the improvisation is automatically interrupted so that his/her instrument 10 or 56 takes over through its normally attributed synthesiser 18b or 18a respectively. This taking over is accomplished by adapting the link L2 mentioned supra, so that a first link L2a is established between Midi input interface 12a and Midi output interface 16b when the guitar 10 starts to play, and a second link L2b is established between Midi input interface 12b and Midi output interface 16a when the piano 56 starts playing. A minimal sketch of this routing logic follows.
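  • The sketch below is only an interpretive summary of the figure 4 routing, under the assumption that each source reports whether its musician is currently playing; all names are illustrative:

```python
def active_synth(source: str, musician_playing: bool) -> str:
    """Sharing mode of figure 4: while the systems improvise, guitar
    phrases drive the piano synthesiser 18a and piano phrases the guitar
    synthesiser 18b; links L2a/L2b restore the instrument's own
    synthesiser as soon as its musician plays."""
    if source == "guitar":
        return "guitar_synth_18b" if musician_playing else "piano_synth_18a"
    if source == "piano":
        return "piano_synth_18a" if musician_playing else "guitar_synth_18b"
    raise ValueError(f"unknown source: {source}")
```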
  • Naturally, this concept of connecting the inputs 6 and outputs 8 of the system to different instruments can be extrapolated to any number n of improvisation systems, the choice of instruments involved being arbitrary.
  • From the foregoing, it is apparent that the embodiment allows musical melodies in a given style to be generated automatically. It automatically learns the style from various sorts of musical inputs (real time, Midi files). The invention allows musical melodies to be generated in a reactive way, according to a musical stimulus coming, e.g., from real-time Midi signals. Several modes of control are proposed, to make the invention actually usable in a real live performance context.
  • The invention can be embodied in a wide variety of forms with a large range of optional features. The implementation described is based largely on existing hardware elements (computer, Midi interfaces, etc.), with the main aspects contained in software-based modules. These can be integrated in a complete or partial software package in the form of a suitable data carrier, such as DVD or CD disks, or diskettes that can be loaded through the appropriate drives 28, 30 of the PC.
  • Alternatively, the invention can be implemented as a complete stand-alone unit integrating all the necessary hardware and software to implement a complete system connectable to one or several instruments and having its own audio outputs, interfaces, controls etc.
  • Between these two extremes, a large number of software, firmware and hardware embodiments can be envisaged.
  • Finally, it is clear that music data protocols other than Midi can be envisaged. Likewise, the teachings of the invention accommodate all sorts of music styles and categories, and all sorts of musical instruments, those mentioned with reference to the figures being mere examples.

Claims (29)

  1. A method of automatically generating music that constitutes an improvisation, characterised in that it comprises the steps of:
    establishing a data base of music patterns (42),
    detecting current music data (12),
    generating with reference to said music patterns an improvisation as a continuation in real time of said current music data (4).
  2. Method according to claim 1, further comprising the step of extracting musical phrases from a stream of music data (38, 40), each phrase constituting a respective music pattern.
  3. Method according to claim 1 or 2, wherein said continuation is determined from potential root nodes of a Tree structure expressing music patterns.
  4. Method according to claim 3, further comprising the steps of:
    mapping in a hash table (46) predetermined data, such as pitch, against a set of tree nodes containing said data,
    using said hash table to obtain a list of potential root nodes to start from for generating said continuation, and
    updating said hash table as a function of the creation of new nodes in said tree structure.
  5. Method according to claim 3 or 4, wherein said tree structure is constructed in accordance with the Lempel-Ziv algorithm using a data compression scheme (44).
  6. Method according to any one of claims 3 to 5, wherein said generating step comprises the steps of:
    extracting at least one traversal of said tree structure (48) to obtain a set of possible continuations following on from said current music data, and
    choosing a music sequence continuation from said set by performing a random draw (50).
  7. Method according to claim 6, further comprising the step of applying a weighting (52) to nodes of said tree.
  8. Method according to any one of claims 1 to 7, further comprising the provision of an improvisation control mode (54, 52), comprising the steps of:
    receiving music control data during said improvisation construction step, and
    selecting at least one pattern for said improvisation on the basis of said music control data (Ctrl).
  9. Method according to claim 8, wherein said music control data (Ctrl) comprise a sequence of n notes, where n is an arbitrarily chosen number.
  10. Method according to any one of claims 8 or 9, with said continuation determined from potential root nodes of a Tree structure expressing music patterns, wherein said selecting step comprises the step (50) of attributing, to each of one or more nodes of said tree structure, a weight as a function of how a sequence associated to a given node matches with said control data (Ctrl).
  11. Method according to claim 10, wherein said weight attributing step comprises the steps of:
    determining a harmonic weighting function (Harmo_prob) expressed as: Harmo_prob(X) ≡ |notes(X) ∩ Ctrl| / |notes(X)|, where notes(X) designates a set of pitches represented by a node X, and Ctrl designates the last n notes forming said control data, whereby:
    if X is a note, then |notes(X)| = 1 and Harmo_prob(X) = 0 or 1, and
    if X is a chord, then Harmo_prob(X) belongs to [0,1], and is maximal (1) when all the notes of X are in the set of external notes,
    using said harmonic weighting as at least a component of an overall weighting function (Weight(X)) to apply to nodes of said Tree.
  12. Method according to claim 11, wherein said overall weighting function, for a possible node X, is defined as: Weight(X) ≡ (1 - S)*LZ_prob(X) + S*Harmo_prob(X), where LZ_prob(X) is the probability of X in said Tree, and S is a tuning parameter.
  13. Method according to claim 11 or 12, further comprising the step of providing a jumping procedure comprising the steps of :
    determining whether, for a given input sub-sequence seq, none of the possible continuations have a non-zero Harmo_prob value,
    if none of the possible continuations have a non-zero Harmo_prob value, making a random draw weighted by S, as follows:
    If Weight(X) ≤ S, and
    If the value of the random draw is less than S, then
    effecting said jump by restarting the computation of the next node by taking the whole set of notes of said Tree.
  14. Method according to any one of claims 1 to 13, wherein said step of establishing a data base comprises the step of discriminating between chords and non-chordal note successions, chords being identified as notes separated from each other by a time interval shorter than a predetermined threshold.
  15. Method according to any one of claims 1 to 14, wherein said improvisation generating step is halted upon detecting current music data.
  16. Method according to any one of claims 1 to 15, wherein said improvisation generating step is started upon detecting an interruption in said current music data.
  17. Method according to any one of claims 1 to 16, wherein said music patterns forming said data base originate from a source, e.g. music files, different from the source producing said current music data, e.g. a musical instrument (10).
  18. Method according to any one of claims 1 to 17, wherein said music patterns are extracted from the source producing said current music data, e.g. a musical instrument.
  19. Method according to any one of claims 8 to 18, wherein said music control data is produced from a source, e.g. a musical instrument (56), different from the source, e.g. another musical instrument, producing said current music data.
  20. A device for automatically generating music that constitutes an improvisation, characterised in that it comprises:
    means (2) for establishing a data base of music patterns,
    means (12) for detecting current music data, and
    means (4) for generating with reference to said music patterns an improvisation as a continuation in real time of said current music data.
  21. Device according to claim 20, further equipped with an improvisation control mode, comprising:
    means (54) for receiving music control data (Ctrl) during said improvisation construction step, and
    means (50) for selecting at least one pattern for said improvisation on the basis of said music control data.
  22. A music improvisation system, characterised in that it comprises:
    a device according to claim 20 or 21,
    a first source of music data operatively connected to supply data to said data base, and
    a second source of music data (10) producing said current music data, e.g. a musical instrument.
  23. System according to claim 22, wherein said first source of audio data is one of:
    i) music file data, and ii) an output from a musical instrument (10); and wherein said second source of audio data is a musical instrument (10; 56).
  24. System according to claim 23, wherein said first source and second source of audio data are a same musical instrument (10).
  25. System according to claim 23, wherein said first source and said second source are respective musical instruments (10 and 56).
  26. A music improvisation system, characterised in that it comprises:
    a device according to claim 21,
    an improvisation control source (56) operatively connected to said device for delivering said music control data (Ctrl) during said improvisation.
  27. System according to claim 26, wherein said improvisation control source is a musical instrument (56) different from a musical instrument (10) forming said second source of audio data.
  28. A system comprising:
    at least first and second devices (1a, 1b) according to claim 20 or 21,
    a first musical instrument (10) and a second musical instrument (56) different from said first musical instrument,
       wherein
    said first musical instrument is operatively connected as a source of data for said data base of music patterns of said first device and as a source of current music data for said second device, whereby said second device generates an improvisation with a sound of said first musical instrument referring to a data base produced from said second instrument, and
    said second musical instrument is operatively connected as a source of data for said data base of music patterns of said second device and as a source of current music data for said first device, whereby said first device generates an improvisation with a sound of said second musical instrument referring to a data base produced from said first instrument.
  29. A software package comprising functional units for implementing the method according to any one of claims 1 to 19 with a computer.
EP01401485A 2001-06-08 2001-06-08 Automatic music improvisation method and device Withdrawn EP1265221A1 (en)

Priority Applications (3)

Application Number Priority Date Filing Date Title
EP01401485A EP1265221A1 (en) 2001-06-08 2001-06-08 Automatic music improvisation method and device
EP02290851A EP1274069B1 (en) 2001-06-08 2002-04-05 Automatic music continuation method and device
US10/165,538 US7034217B2 (en) 2001-06-08 2002-06-07 Automatic music continuation method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
EP01401485A EP1265221A1 (en) 2001-06-08 2001-06-08 Automatic music improvisation method and device

Publications (1)

Publication Number Publication Date
EP1265221A1 true EP1265221A1 (en) 2002-12-11

Family

ID=8182762

Family Applications (1)

Application Number Title Priority Date Filing Date
EP01401485A Withdrawn EP1265221A1 (en) 2001-06-08 2001-06-08 Automatic music improvisation method and device

Country Status (1)

Country Link
EP (1) EP1265221A1 (en)

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5736666A (en) * 1996-03-20 1998-04-07 California Institute Of Technology Music composition
US5990407A (en) * 1996-07-11 1999-11-23 Pg Music, Inc. Automatic improvisation system and method
WO1999046758A1 (en) * 1998-03-13 1999-09-16 Adriaans Adza Beheer B.V. Method for automatically controlling electronic musical devices by means of real-time construction and search of a multi-level data structure
WO2001009874A1 (en) * 1999-07-30 2001-02-08 Mester Sandor Jr Method and apparatus for producing improvised music

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9110817B2 (en) 2011-03-24 2015-08-18 Sony Corporation Method for creating a markov process that generates sequences
US8847054B2 (en) * 2013-01-31 2014-09-30 Dhroova Aiylam Generating a synthesized melody
US20210210057A1 (en) * 2018-03-15 2021-07-08 Score Music Productions Limited Method and system for generating an audio or midi output file using a harmonic chord map
US11837207B2 (en) * 2018-03-15 2023-12-05 Xhail Iph Limited Method and system for generating an audio or MIDI output file using a harmonic chord map
CN109637509A (en) * 2018-11-12 2019-04-16 平安科技(深圳)有限公司 A kind of music automatic generation method, device and computer readable storage medium
CN109637509B (en) * 2018-11-12 2023-10-03 平安科技(深圳)有限公司 Music automatic generation method and device and computer readable storage medium

Legal Events

Date Code Title Description
PUAI Public reference made under article 153(3) epc to a published international application that has entered the european phase

Free format text: ORIGINAL CODE: 0009012

STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: THE APPLICATION HAS BEEN WITHDRAWN

AK Designated contracting states

Kind code of ref document: A1

Designated state(s): AT BE CH CY DE DK ES FI FR GB GR IE IT LI LU MC NL PT SE TR

AX Request for extension of the european patent

Free format text: AL;LT;LV;MK;RO;SI

18W Application withdrawn

Withdrawal date: 20021126