WO2007105180A2 - Automatic play list generation

Info

Publication number: WO2007105180A2
Application number: PCT/IB2007/050892
Authority: WO
Inventors: Alexander Petrus Paulus Vrijsen
Original assignee: Pace Plc
Priority date: 2006-03-16
Filing date: 2007-03-15
Publication date: 2007-09-20
Also published as: WO2007105180A3

Abstract

A method of generating a play list comprising a sequence of audio files. Currently available methods of generating play lists provide limited capability for the user to specify what type of music is played. Thus, an improved method of generating a specified sequence of media content is desirable. Accordingly, a method of sequencing media content is disclosed herein, wherein a set of at least one attribute variable is defined, the attribute variable or variables being characteristic of the media content. From the defined set of attribute variables, at least one attribute variable is selected. A pattern of variation over time is defined for each of the selected attribute variables. At least a portion of the media content is selected based on the attribute variables selected from the set of attribute variables. A play list of the selected portions of the media content is generated based on the defined patterns of variation of the selected attribute variables.

Description

Automatic Play List Generation

This invention relates to selecting multimedia content, and more particularly to generating a play list.

A method of automatically sequencing audio and music content is described in an article by Hugues Vinet et al., titled "The CUIDADO Project", pp. 197-203, International Conference on Music Information Retrieval, Paris, October 2002. However, their method provides only a limited capability for the user to specify the type of music that is to be played. It is therefore desirable to have an improved method of generating a play list of media content. It is also desirable to have a media player capable of playing media content according to a play list generated by an improved method of sequencing media content. It is also desirable to have a computer program capable of generating a play list of media content according to an improved method of sequencing the media content, when the computer program is run on a computer.

Accordingly, an improved method of sequencing media content to generate a play list, a media player capable of playing media content according to a play list generated by an improved method of sequencing media content, and a computer program containing instructions for implementing an improved method of sequencing media content are disclosed herein. A set of at least one attribute variable is defined, the attribute variable or variables being characteristic of the media content. From the defined set of attribute variables, at least one attribute variable is selected. A pattern of variation over time is defined for each of the selected attribute variables. At least a portion of the media content is selected based on the selected attribute variables. A play list of the selected portion of the media content is generated based on the defined patterns of variation of the selected attribute variables. The generated play list may be used by the media player to play media content according to the sequence defined in the play list. The computer program contains instructions to instruct a computer or media player device to implement the steps of the method disclosed herein, when the computer program is run on a computer. These and other aspects will be described in detail hereinafter, by way of example, on the basis of the following embodiments, with reference to the accompanying drawings, wherein:

Fig. 1 illustrates a method of generating a play list of media content;

Fig. 2 illustrates a method of specifying patterns of variation for selected attribute variables;

Fig. 3 illustrates a method of defining a set of attribute variables, based on a pre-determined set of attribute variables;

Fig. 4 schematically shows examples of media players capable of playing media content; and

Fig. 5 schematically shows a medium containing a computer program for generating a play list of media content.

Corresponding reference numerals used in the various figures represent corresponding elements in the figures.

Fig. 1 illustrates a possible implementation of the disclosed method. A set of attribute variables is defined in step 101, the attribute variables being characteristic of some media content. From the defined set of attribute variables, at least one attribute variable is selected in step 102. A pattern of variation over time is defined, in step 103, for each of the selected attribute variables. Based on the attribute variables selected in step 102, at least a portion of the media content is selected in step 104. A play list of the selected portion of media content is generated in step 105, based on the patterns of variation defined for the selected attribute variables, in step 103.

Wikipedia (http://en.wikipedia.org), the internet-based free encyclopedia, defines a disk jockey (DJ) as "an individual who selects and plays pre-recorded music for the enjoyment of others". Recently, there have been attempts to automate the task of a DJ, using programs that take a play list of media content, for example music, as input and automatically mix these for the user. Such functionality is henceforth referred to as "AutoDJ", and programs that provide such functionality are called "AutoDJ programs".

In its simplest form, a play list is a list of media content, for example songs. A play list could also refer to audio content other than songs, or even to video content. The media content, for instance songs, is typically mixed in the order provided by the user. Alternatively, the user selects a first song, based on which subsequent "matching" songs are automatically looked up and compiled into the play list. The matching may be done based on a simple parameter like the genre of the song, or name of the artist or album, or length of the song, etc. Slightly more complicated matching operations may involve considering the tempo of the song, or a particular combination of keywords from the title or the lyrics, etc. Even more advanced algorithms may find matching songs based on low-level audio features or characteristics of the songs, which are in turn based on the actual audio content of the song. Often, the play lists that are generated by such systems are only "play sets", since the song order is either not taken into account or is considered a matter of minor importance.

A better way to select matching content is to let the user influence the generated order by selecting or specifying a play list profile. For example, when the user selects an "AutoDJ" for a play list (either existing or to be generated), a list of choices may be presented to the user, which may include the following:

Fixed - The user specifies the songs, and the exact order in which the songs have to be played, which then forms the play list. This process is, of course, not automatic, but the play list so created could be used for automatically generating future play lists for a user with similar tastes.

Climax position - Some useful choices may be "Start with climax", "Climax in the middle", "End with climax", and "Climax at both ends". Other variations could also be possible, for example, having multiple climaxes, etc. The "Start with climax" option may start with songs that are energetic, fast, dynamic, aggressive, etc., and end with slow, soft songs, while the "End with climax" option would reverse the order. In both these options, the attribute variable, for example tempo, varies only in one direction, if visualized as a graph as shown in Fig. 2. In other words, the tempo would either rise or fall monotonically. For example, in the "End with climax" option, the tempo would get faster with time, while in the "Start with climax" option, the tempo would get slower with time. In contrast, in both the "Climax in the middle" and the "Climax at both ends" options, the tempo would change direction approximately midway during the playing time. The various characteristics mentioned above, e.g., fast, slow, etc., may be based on a single parameter like the tempo, or on a combination of parameters like the tempo, genre, etc.

Specify first and last song - Intermediate songs are automatically chosen so that the smoothest transitions result. The idea is that if a user likes a particular song, the likelihood that he/she may like other similar songs is quite high. Some intermediate songs could also be specified by the user.

User defined - The user selects one or more characteristics like tempo, mood, year, genre, and then specifies the pattern of variation of each of these characteristics. These characteristics are also referred to as attribute variables in this document.
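By way of illustration only, the four "Climax position" choices listed above can be sketched as target curves over normalised playing time. The profile names, the normalisation of time and intensity to the range 0 to 1, and the piecewise-linear shapes are assumptions made for this sketch; the disclosure itself only requires that the attribute rises or falls monotonically, or changes direction approximately midway.

```python
from typing import Callable, Dict

# Hypothetical profile names mirroring the "Climax position" choices.
# Each function maps normalised time t in [0, 1] to a target intensity in
# [0, 1], where 1 stands for the climax (fast, energetic material) and 0
# for slow, soft material.
CLIMAX_PROFILES: Dict[str, Callable[[float], float]] = {
    "start_with_climax": lambda t: 1.0 - t,                   # falls monotonically
    "end_with_climax": lambda t: t,                           # rises monotonically
    "climax_in_middle": lambda t: 1.0 - abs(2.0 * t - 1.0),   # rises, then falls
    "climax_at_both_ends": lambda t: abs(2.0 * t - 1.0),      # falls, then rises
}


def target_intensity(profile: str, t: float) -> float:
    """Return the target intensity of a profile at normalised time t."""
    if not 0.0 <= t <= 1.0:
        raise ValueError("t must lie in [0, 1]")
    return CLIMAX_PROFILES[profile](t)
```

A profile chosen by the user could then serve as the pattern of variation for a single attribute variable such as tempo.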

A user may select, and thereby define, a set of attribute variables (101) that are of interest. For example, the user may choose genre and mood as the set of defined attribute variables (101). In case only one attribute variable is defined in the set of attribute variables (101), that attribute variable is automatically selected. If more than one attribute variable is defined in the set, for example, genre, mood and tempo, then the user has the option of selectively choosing one or more attribute variables at a time, and defining a pattern of variation for each of the selected variables. This operation is explained in detail below, with reference to Fig. 2. Once the desired attribute variables have been selected, the list of media content, for example, songs that would form the play list, is selected automatically.
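A minimal sketch of steps 101 to 105 of Fig. 1 is given below, purely as one possible reading of the method. The `Song` record, the representation of a pattern as a function of elapsed minutes, the normalised attribute values and the greedy nearest-match rule are all assumptions made for this sketch and are not prescribed by the disclosure.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Song:  # hypothetical record for one item of media content
    title: str
    duration_min: float
    attributes: Dict[str, float]  # attribute values normalised to [0, 1]


# A pattern of variation maps elapsed minutes to a target attribute value.
Pattern = Callable[[float], float]


def generate_play_list(library: List[Song],
                       patterns: Dict[str, Pattern],
                       playing_time_min: float) -> List[Song]:
    # Step 104: keep only songs that carry every selected attribute variable.
    candidates = [s for s in library
                  if all(name in s.attributes for name in patterns)]
    play_list: List[Song] = []
    elapsed = 0.0
    # Step 105: greedily append the unused song closest to the targets
    # that the patterns prescribe at the current elapsed time.
    while candidates:
        fitting = [s for s in candidates
                   if elapsed + s.duration_min <= playing_time_min]
        if not fitting:
            break

        def distance(song: Song) -> float:
            return sum(abs(song.attributes[name] - pattern(elapsed))
                       for name, pattern in patterns.items())

        best = min(fitting, key=distance)
        play_list.append(best)
        candidates.remove(best)
        elapsed += best.duration_min
    return play_list


# Steps 101-103: the defined set is {"tempo", "mood"}; "tempo" is selected
# and given a rising pattern (illustrative data only).
library = [
    Song("A", 4.0, {"tempo": 0.9}),
    Song("B", 4.0, {"tempo": 0.1}),
    Song("C", 4.0, {"tempo": 0.5}),
    Song("D", 4.0, {"mood": 0.5}),   # lacks "tempo", so it is never chosen
]
patterns = {"tempo": lambda t: t / 12.0}
play_list = generate_play_list(library, patterns, playing_time_min=12.0)
```

With this illustrative data the tempo rises over the twelve minutes, so the slowest song is placed first and the fastest last.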

One possible implementation of the disclosed method also utilizes a multidimensional approach to selection and sequencing of media content. Thus, for every point in time, at least one attribute variable needs to be defined based on which the songs are selected. In this case, a two-dimensional selection space bounded by time on one axis, e.g., the X-axis, and the selected attribute variable on the other axis, e.g., the Y-axis, forms the two-dimensional space within which the pattern of variation of the selected attribute variable is defined. If exactly two attribute variables are selected, a three-dimensional selection space, bounded by time on one axis and each of the selected attribute variables on the other two axes, forms the basic volume within which the patterns of variation of the selected attribute variables are defined. If more than two attribute variables are selected, the dimensionality will increase proportionally. As it may be difficult to visualize more than three dimensions, such a multi-dimensional space with more than three dimensions may need to be broken up into groups of three-dimensional or two-dimensional subspaces. For example, if the user chooses mood, genre, tempo, and year as the selected attribute variables, this would give us a five-dimensional volume, with time on one axis, and mood, genre, tempo and year forming the other four axes. It is possible to represent this five-dimensional space as multiple subspaces, for example as two three-dimensional subspaces, or four two-dimensional subspaces. If three-dimensional subspaces are considered, the first three-dimensional subspace could be bounded by axes representing mood, genre and time, while the second three-dimensional subspace could be bounded by axes representing tempo, year and time. As may be expected, the time axis will be duplicated in the two three-dimensional subspaces. Alternatively, each of the four selected attribute variables may be mapped against the same time axis to yield four two-dimensional subspaces.
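The decomposition into subspaces described above can be sketched as follows. The function name and the convention of placing a shared "time" axis first in every subspace are assumptions made for illustration only.

```python
from typing import List, Sequence, Tuple


def split_into_subspaces(attributes: Sequence[str],
                         per_subspace: int) -> List[Tuple[str, ...]]:
    """Group attribute variables into subspaces that each repeat the time axis.

    per_subspace=1 yields two-dimensional subspaces (time plus one attribute);
    per_subspace=2 yields three-dimensional subspaces (time plus two attributes).
    With an odd number of attributes, the last three-dimensional group
    degenerates to a two-dimensional one.
    """
    if per_subspace not in (1, 2):
        raise ValueError("per_subspace must be 1 or 2")
    return [("time", *attributes[i:i + per_subspace])
            for i in range(0, len(attributes), per_subspace)]


selected = ["mood", "genre", "tempo", "year"]
three_d = split_into_subspaces(selected, 2)
two_d = split_into_subspaces(selected, 1)
```

For the four attribute variables of the example, `three_d` holds the two three-dimensional subspaces and `two_d` the four two-dimensional ones, each duplicating the time axis.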

A pattern of variation is defined (103) for each of the selected attribute variables. The pattern of variation would depend on the nature of the attribute variable. For example, it may be continuously variable, as in the case of tempo, or it may have binary values as in the choice of male or female voice, or discrete values as in the choice of the year of the song, etc. A method of defining patterns of variation is described in detail in Fig. 2.
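As a hedged illustration of how the nature of an attribute variable may affect its pattern, the sketch below evaluates a continuously variable attribute by linear interpolation and a binary or discrete attribute by holding its most recent value. The representation of a pattern as a list of (time, value) points sorted by time, and both function names, are assumptions of this sketch.

```python
from bisect import bisect_right
from typing import Sequence, Tuple, TypeVar

T = TypeVar("T")


def continuous_value(points: Sequence[Tuple[float, float]], t: float) -> float:
    """Linearly interpolate between (time, value) points with increasing times."""
    times = [p[0] for p in points]
    if t <= times[0]:
        return points[0][1]
    if t >= times[-1]:
        return points[-1][1]
    i = bisect_right(times, t)          # index of the first point after t
    (t0, v0), (t1, v1) = points[i - 1], points[i]
    return v0 + (v1 - v0) * (t - t0) / (t1 - t0)


def stepped_value(points: Sequence[Tuple[float, T]], t: float) -> T:
    """Hold the value of the latest point at or before t (binary or discrete)."""
    times = [p[0] for p in points]
    i = bisect_right(times, t)
    return points[max(i - 1, 0)][1]


# Illustrative data: tempo in beats per minute, voice as a binary choice.
tempo_pattern = [(0.0, 60.0), (30.0, 120.0), (60.0, 90.0)]
voice_pattern = [(0.0, "female"), (20.0, "male")]
```

A discrete attribute such as the year of the song could be handled by `stepped_value` in the same way as the binary voice attribute.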

A portion of the media content is selected based on the attribute variables selected by the user. An additional input that may be used in deciding the selection of songs is the playing time. This may be necessary as some songs that satisfy all other criteria as decided by the attribute variables, may be too long to fit into the desired playing time. Once the songs are selected, their patterns of variation are considered while sequencing them, in order to compile the play list. It is, of course, possible to select only portions of a song or songs. Alternatively, portions of other audio content like jingles, news or sports commentaries may be selected. Alternatively, portions of video content may be selected. For example, to generate a play list representing "highlights" from a video recording of a soccer match, a user could select "sound volume" and "duration of increased sound volume" as the attribute variables. Patterns of variation for these two attribute variables could be defined as "a sudden increase followed by a decrease", and "lasting/steady for 5 seconds", respectively. Applying these criteria to the video recording, it might be possible to isolate highlights where goals were actually scored in the game, while avoiding failed attempts.
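The soccer highlight example can be sketched under some simplifying assumptions: the sound volume is available as one sample per second, a fixed threshold separates normal from increased volume, and a highlight is a run that starts with a rise above the threshold, stays there for at least five samples, and ends with a drop below it. The function name and the threshold are hypothetical.

```python
from typing import List, Optional, Tuple


def find_highlights(volume: List[float],
                    threshold: float,
                    min_duration: int = 5) -> List[Tuple[int, int]]:
    """Return (start, end) index pairs, end exclusive, of loud runs.

    A run is reported only if it begins with a rise across the threshold,
    lasts at least min_duration samples, and is followed by a drop below
    the threshold. Runs still in progress at the end of the recording are
    not reported, because no decrease has been observed for them.
    """
    highlights: List[Tuple[int, int]] = []
    start: Optional[int] = None
    for i in range(1, len(volume)):
        previous, level = volume[i - 1], volume[i]
        if start is None and previous < threshold <= level:
            start = i                      # sudden increase
        elif start is not None and level < threshold:
            if i - start >= min_duration:  # steady for long enough
                highlights.append((start, i))
            start = None                   # decrease ends the run
    return highlights


# Illustrative per-second volume levels: one long and one short loud run.
samples = [1, 1, 9, 9, 9, 9, 9, 1, 1, 9, 9, 1]
found = find_highlights(samples, threshold=5)
```

Here only the five-second run is kept; the two-second run, which might correspond to a failed attempt, is discarded.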

The media content in a play list may be mixed during playback or rendering, using special effects like fade in, fade out, etc. In the current context, it is possible to give the user a "Best Mix" option, wherein in addition to automatically selecting the songs based on the selected attribute variables, the order of songs is automatically chosen so that the transitions between the songs are as smooth as possible.

Fig. 2 shows a method of defining patterns of variation for attribute variables selected from a set of attribute variables. The X-axes of all the selection graphs 209, 210, 211 shown in the figure represent time. The selection graphs 209, 210, 211 show the variations in tempo, mood and voice, respectively, as set by the user. The squares 201, 204, 207 on the selection graphs 209, 210, 211 denote control points that the user may use to change the pattern of variation of each of the attribute variables in the selection graphs 209, 210, 211, respectively. The lines 202, 205 connect the various control points in the selection graphs 209, 210, respectively, and serve to show the actual pattern of variation of the attribute variables, especially in cases where the change is preferably smooth. The arrows 212, 213, 214 denote the start times, and the arrows 203, 206, 208 denote the end times, associated with each of the attribute variables represented by the respective selection graphs. Graph 209 shows an exemplary variation of tempo with time, where the Y-axis shows the variation in the tempo with Fs denoting Fast and Sl denoting Slow. Graph 210 shows an exemplary variation of "mood" with time, with H denoting "Happy mood", and S denoting "Sad mood", on the Y-axis. Graph 211 shows an exemplary variation of the voice of the singer with time, with F standing for Female and M standing for Male.

The selection graphs help a user to visualize how each of the attribute variables changes over time. The selection graphs may also be used as a visual input interface, as the user can specify more accurately the preferred pattern of variation for each of the attribute variables. For example, in the tempo selection graph 209, which shows the variation of the tempo of the media content over time, the initial tempo is slow, then speeds up gradually, reaches a peak or climax and finally slows a little towards the end. The mood selection graph 210 shows that the user prefers to start with "happy" songs, gradually transition to "sad" songs, and gradually transition back to "happy" songs towards the end. The "happy" or "sad" nature of songs is often decided by the musical key on which the composition is based, with major keys typically conveying a happy mood, and minor keys typically conveying a sad mood. However, the definition of "happy" and "sad" need not be based on a single characteristic like the musical key; rather it could be based on a combination of characteristics like the musical key, the tempo, the genre, etc.

The start times and the end times associated with the various attribute variables need not coincide. For example, to create a play list of songs that would play for one hour, a user might specify that for the first 15 minutes, the songs should be of slow tempo, and in a major key (which often conveys a happy mood). For the next 30 minutes, the songs should be of medium tempo, in a minor key, and in a female voice. And finally, for the last 15 minutes, the songs should be duets in a major key. As may be noticed from the above specifications, there is no mention of the voice, i.e., male or female, for the first 15 minutes. Therefore, "voice" will not be an attribute variable considered for selecting songs to fill this duration. In other words, either voice may be chosen, or alternatively duets or even pieces of music that are fully instrumental may be chosen. For the last 15 minutes, the tempo is not specified. Therefore, the tempo will not be considered at all while selecting songs to create the last 15 minutes of the play list. Alternatively, if a particular attribute variable is not specified, a predetermined default value could be assigned to it, which value may then be used in selecting the media content. For example, a default value of "instrumental" could be used as no "voice" attribute variable has been defined for the first 15 minutes of playing time. In some cases, the last specified value may be assigned to a particular attribute variable, which value may then be maintained till the end of the play list. For example, as no tempo has been defined for the last 15 minutes, the tempo could simply be maintained at the last specified value, which is "medium".
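The three ways of treating an unspecified attribute variable in the one-hour example (not considering it, assigning a predetermined default, or maintaining the last specified value) can be sketched as follows. The segment representation, the policy names and the function name are assumptions made for this illustration.

```python
from typing import Dict, List, Optional, Tuple

# (start minute, end minute, values specified for that interval)
Segment = Tuple[float, float, Dict[str, str]]

# The one-hour example from the text above.
SEGMENTS: List[Segment] = [
    (0, 15, {"tempo": "slow", "key": "major"}),
    (15, 45, {"tempo": "medium", "key": "minor", "voice": "female"}),
    (45, 60, {"voice": "duet", "key": "major"}),
]


def constraint_at(segments: List[Segment], attribute: str, t: float,
                  policy: str = "ignore",
                  default: Optional[str] = None) -> Optional[str]:
    """Return the value required of an attribute at minute t.

    None means the attribute is not considered when selecting content.
    Segments are assumed to be sorted by start time and not to overlap.
    """
    if policy not in ("ignore", "default", "hold_last"):
        raise ValueError("unknown policy: " + policy)
    last_specified: Optional[str] = None
    for start, end, values in segments:
        if start > t:
            break
        if attribute in values:
            if start <= t < end:
                return values[attribute]       # explicitly specified at t
            last_specified = values[attribute]  # specified earlier
    if policy == "default":
        return default
    if policy == "hold_last":
        return last_specified
    return None
```

For instance, `constraint_at(SEGMENTS, "tempo", 50, policy="hold_last")` yields "medium", matching the last case described in the text.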

If, as shown in Fig. 3, a set of pre-determined characteristics or attribute variables is given, or a set of attribute variables is generated based on an example play list, the control points may be used to fine-tune the selection graphs further. If no example play list is provided, the selection graphs may be used by the user to fully describe the dynamics of the play list, starting from scratch.

The squares 201, 204, 207 shown on the selection graphs are only representative symbols. They could be any other shape, or could be represented by icons, pictures, numbers, etc. It is also possible to have no visible symbols for control points; rather, the position of the line itself may be adjusted directly. The control points may be adjusted on a computer monitor or other display screen using a cursor controlled by a user interface device like a mouse, keyboard, joystick, touch pad, laser pointer, etc. The control points may also be adjusted by using a touch screen display interface. Alternatively, rotatable knobs, slider controls, rocker switches, etc., may be used to set the control points.

It may be noted that continuous graphs are not applicable for all characteristics. For example, genres could be visualized as a few selection boxes over time or perhaps by changing the color of the graph, where each color represents a different genre. A single graph may thus be used to represent multiple attribute variables. For example, if different colors are used to represent the genre, and the Y-axis depicts the tempo, then it is possible to specify a combination of tempo and genre using a single graph. The specific attribute variables shown in the selection graphs in the figure are only examples; the selection graphs 209, 210, 211 may be used to represent other attribute variables as well. Any number of selection graphs may be used to display additional attribute variables as well. Other forms of selection graphs or charts, for example bar charts, may be used alternatively, to serve the same purpose as the selection graphs 209, 210, 211.

Fig. 3 illustrates one possible implementation of the disclosed method. A set of at least one attribute variable is defined in step 101, the attribute variable or variables being selected from a pre-determined superset of attribute variables 301. Out of the defined set of attribute variables, at least one attribute variable is selected in step 102. A pattern of variation over time is defined for each of the selected attribute variables in step 103, based on pre-determined patterns of variations defined in step 302 for respective attribute variables. Based on the attribute variables selected in step 102, portions of media content are selected in step 104. A play list of the selected portion of media content is generated in step 105, based on the patterns of variation defined for the selected attribute variables in step 103.

A play list created by a DJ is often characteristic of the particular DJ. In other words, a particular DJ often tends to be partial to certain specific attribute variables and certain patterns of their variation, while compiling a play list. If a user likes a certain play list created by a particular DJ, he/she may desire to duplicate the selection at a later stage, or may want to apply the same set of attribute variables and their patterns of variation to a different set of songs, in order to compile a new but similar play list. We term such a duplication "taking over" the characteristics of a play list. A similar "taking over" operation could also be applied to a play list generated by the end user, or created by a computer or other media player like the iPod™. The various attribute variables and their patterns of variation may be displayed to the user as graphs, as described in Fig. 2.
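The "taking over" operation can be sketched, under stated assumptions, as two steps: deriving a pattern of variation from an example play list, and applying that pattern to a different set of songs. In this sketch a song is a plain dictionary, each song of the example contributes one point at an evenly spaced normalised time (song durations are ignored), and the closest unused song is chosen for each point. None of these choices is prescribed by the disclosure.

```python
from typing import Any, Dict, List, Sequence, Tuple

SongDict = Dict[str, Any]  # hypothetical form, e.g. {"title": "x1", "tempo": 90}


def take_over_pattern(example: Sequence[SongDict],
                      attribute: str) -> List[Tuple[float, float]]:
    """Derive (normalised time, value) points from an example play list."""
    n = len(example)
    if n == 0:
        raise ValueError("example play list is empty")
    if n == 1:
        return [(0.0, example[0][attribute])]
    return [(i / (n - 1), song[attribute]) for i, song in enumerate(example)]


def apply_pattern(points: Sequence[Tuple[float, float]],
                  library: Sequence[SongDict],
                  attribute: str) -> List[SongDict]:
    """Pick, for each point in order, the closest not-yet-used library song."""
    remaining = list(library)
    result: List[SongDict] = []
    for _, target in points:
        if not remaining:
            break
        best = min(remaining, key=lambda s: abs(s[attribute] - target))
        result.append(best)
        remaining.remove(best)
    return result


# Illustrative data only: tempo in beats per minute.
dj_play_list = [{"title": "x1", "tempo": 90},
                {"title": "x2", "tempo": 130},
                {"title": "x3", "tempo": 100}]
own_songs = [{"title": "n1", "tempo": 128}, {"title": "n2", "tempo": 95},
             {"title": "n3", "tempo": 102}, {"title": "n4", "tempo": 70}]
points = take_over_pattern(dj_play_list, "tempo")
similar = apply_pattern(points, own_songs, "tempo")
```

The resulting list follows the same slow, fast, medium contour as the example play list while drawing on a different set of songs.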

Instead of taking over the play list characteristics, i.e., attribute variables and their patterns of variation, via an example play list, it is also possible to take over pre-defined play list characteristics directly. For example, if it is determined that most play lists of a particular DJ (called X) have approximately the same characteristics, the average of these play list characteristics could be provided as a choice for selection by the user as "Create a DJ X play list". By thus taking over the pre-defined play list characteristics directly, the set of attribute variables is automatically defined identical to the list of attribute variables contained in the pre-defined play list characteristics. The patterns of variation of the attribute variables are also automatically defined identical to the patterns of variation contained in the pre-defined play list characteristics. Once the play list characteristics have been taken over, a user may use them without modification, or may choose to fine-tune the selection using a graphical user interface, as explained in Fig. 2.

After the user has created his/her play list characteristics, either by taking over an example without modification, or by taking over an example and fine-tuning, or by specifying these characteristics manually, one or more play lists may be generated by looking up a database of songs, and selecting those songs that fulfill the criteria set by the attribute variables and their respective patterns of variation.

Fig. 4 shows possible embodiments of media players 402, 403 capable of playing media content according to a play list generated by the disclosed method 401.

A media player may be implemented in either hardware or software, or a combination of the two. Possible embodiments of a media player implemented in software include the Windows Media Player™ by Microsoft® Corporation, Real Player™ by Real Networks, Inc., etc. Possible embodiments of a consumer electronics device capable of playing media content include the Apple iPod™, the Creative NOMAD® MuVo², the Philips GoGear® HDD 1630, etc.

Fig. 5 shows a possible embodiment of a medium 502 containing a computer program for enabling a media player 503 to play media content according to a play list generated according to the disclosed method 501.

The media player 503 is capable of loading and running a computer program comprising instructions that, when executed on the media player, enable the media player to execute the various aspects of the method disclosed herein. The computer program may reside on a computer-readable or a media-player medium 501, for example a CD-ROM, a DVD, a floppy disk, a memory stick, a magnetic tape, or any other tangible medium that is readable by the media player 503. The computer program may also be a downloadable program that is downloaded, or otherwise transferred to the computer, for example via the Internet. The transfer means 502 may be an optical drive, a magnetic tape drive, a floppy drive, a USB or other computer port, an Ethernet port, etc.

The order in the described embodiments of the disclosed methods is not mandatory. A person skilled in the art may change the order of steps or perform steps concurrently using threading models, multi-processor systems or multiple processes without departing from the disclosed concepts.

It should be noted that the above-mentioned embodiments illustrate rather than limit the invention, and that those skilled in the art will be able to design many alternative embodiments without departing from the scope of the appended claims. In the claims, any reference signs placed between parentheses shall not be construed as limiting the claim. The word "comprising" does not exclude the presence of elements or steps other than those listed in a claim. The word "a" or "an" preceding an element does not exclude the presence of a plurality of such elements. The disclosed method can be implemented by means of hardware comprising several distinct elements, and by means of a suitably programmed computer. In the system claims enumerating several means, several of these means can be embodied by one and the same item of computer readable software or hardware. The mere fact that certain measures are recited in mutually different dependent claims does not indicate that a combination of these measures cannot be used to advantage.

Claims

CLAIMS:

1. A method of sequencing media content, comprising: defining a set of at least one attribute variable that is characteristic of the media content; selecting at least one attribute variable from the defined set of attribute variables; defining a pattern of variation over time, for each of the selected attribute variables; selecting at least a portion of the media content on the basis of the selected attribute variables; and generating a play list of the selected media content based on the defined patterns of variation of the selected attribute variables.

2. The method of claim 1, wherein selecting the at least one attribute variable is effected using one or more graphical user interfaces.

3. The method of claim 1, wherein defining the pattern of variation of the at least one attribute variable is effected using one or more graphical user interfaces.

4. The method of claim 1, wherein generating the play list involves mixing the selected media content.

5. The method of claim 1, wherein defining the set of at least one attribute variable is based on a predetermined set of attribute variables, and defining the pattern of variation over time for each of the selected attribute variables is based on pre-determined respective patterns of variation for the selected attribute variables.

6. The method of claim 5, wherein the pre-determined set of attribute variables, as well as the patterns of variation of the attribute variables in the predetermined set, is characteristic of songs selected by a particular disc jockey.

7. The method of claim 5, wherein the pre-determined set of attribute variables, as well as the patterns of variation of the attribute variables in the predetermined set, is based on an existing play list.

8. A media player for playing media content corresponding to a play list generated according to claim 1.

9. A computer program for sequencing media content, comprising: instructions for defining a set of at least one attribute variable that is characteristic of the media content; instructions for selecting at least one attribute variable from the defined set of attribute variables; instructions for defining a pattern of variation over time, for each of the selected attribute variables; instructions for selecting at least a portion of the media content on the basis of the selected attribute variables; and instructions for generating a play list of the selected media content based on the defined patterns of variation of the selected attribute variables.

10. A medium containing a computer program for sequencing media content, the computer program comprising: instructions for defining a set of at least one attribute variable that is characteristic of the media content; instructions for selecting at least one attribute variable from the defined set of attribute variables; instructions for defining a pattern of variation over time, for each of the selected attribute variables; instructions for selecting at least a portion of the media content on the basis of the selected attribute variables; and instructions for generating a play list of the selected media content based on the defined patterns of variation of the selected attribute variables.