EP2304726A1 - Audio mix instruction file with timing information referring to unique patterns within audio tracks

Audio mix instruction file with timing information referring to unique patterns within audio tracks

Info

Publication number
EP2304726A1
EP2304726A1 EP09745757A EP09745757A EP2304726A1 EP 2304726 A1 EP2304726 A1 EP 2304726A1 EP 09745757 A EP09745757 A EP 09745757A EP 09745757 A EP09745757 A EP 09745757A EP 2304726 A1 EP2304726 A1 EP 2304726A1
Authority
EP
European Patent Office
Prior art keywords
audio
file
mix
segment
track
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Ceased
Application number
EP09745757A
Other languages
German (de)
French (fr)
Inventor
Jonas Norberg
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
PRODIGIUM LTD.
Original Assignee
Tonium AB
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tonium AB

Classifications

    • G PHYSICS
    • G11 INFORMATION STORAGE
    • G11B INFORMATION STORAGE BASED ON RELATIVE MOVEMENT BETWEEN RECORD CARRIER AND TRANSDUCER
    • G11B27/00 Editing; Indexing; Addressing; Timing or synchronising; Monitoring; Measuring tape travel
    • G11B27/10 Indexing; Addressing; Timing or synchronising; Measuring tape travel
    • G PHYSICS
    • G11 INFORMATION STORAGE
    • G11B INFORMATION STORAGE BASED ON RELATIVE MOVEMENT BETWEEN RECORD CARRIER AND TRANSDUCER
    • G11B27/00 Editing; Indexing; Addressing; Timing or synchronising; Monitoring; Measuring tape travel
    • G11B27/10 Indexing; Addressing; Timing or synchronising; Measuring tape travel
    • G11B27/19 Indexing; Addressing; Timing or synchronising; Measuring tape travel by using information detectable on the record carrier
    • G11B27/28 Indexing; Addressing; Timing or synchronising; Measuring tape travel by using information detectable on the record carrier by using information signals recorded by the same method as the main recording
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10H ELECTROPHONIC MUSICAL INSTRUMENTS; INSTRUMENTS IN WHICH THE TONES ARE GENERATED BY ELECTROMECHANICAL MEANS OR ELECTRONIC GENERATORS, OR IN WHICH THE TONES ARE SYNTHESISED FROM A DATA STORE
    • G10H2210/00 Aspects or methods of musical processing having intrinsic musical character, i.e. involving musical theory or musical parameters or relying on musical knowledge, as applied in electrophonic musical tools or instruments
    • G10H2210/031 Musical analysis, i.e. isolation, extraction or identification of musical elements or musical parameters from a raw acoustic signal or from an encoded audio signal
    • G10H2210/061 Musical analysis, i.e. isolation, extraction or identification of musical elements or musical parameters from a raw acoustic signal or from an encoded audio signal for extraction of musical phrases, isolation of musically relevant segments, e.g. musical thumbnail generation, or for temporal structure analysis of a musical piece, e.g. determination of the movement sequence of a musical work
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10H ELECTROPHONIC MUSICAL INSTRUMENTS; INSTRUMENTS IN WHICH THE TONES ARE GENERATED BY ELECTROMECHANICAL MEANS OR ELECTRONIC GENERATORS, OR IN WHICH THE TONES ARE SYNTHESISED FROM A DATA STORE
    • G10H2240/00 Data organisation or data communication aspects, specifically adapted for electrophonic musical tools or instruments
    • G10H2240/011 Files or data streams containing coded musical information, e.g. for transmission
    • G10H2240/041 File watermark, i.e. embedding a hidden code in an electrophonic musical instrument file or stream for identification or authentification purposes
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10H ELECTROPHONIC MUSICAL INSTRUMENTS; INSTRUMENTS IN WHICH THE TONES ARE GENERATED BY ELECTROMECHANICAL MEANS OR ELECTRONIC GENERATORS, OR IN WHICH THE TONES ARE SYNTHESISED FROM A DATA STORE
    • G10H2240/00 Data organisation or data communication aspects, specifically adapted for electrophonic musical tools or instruments
    • G10H2240/325 Synchronizing two or more audio tracks or files according to musical features or musical timings
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10H ELECTROPHONIC MUSICAL INSTRUMENTS; INSTRUMENTS IN WHICH THE TONES ARE GENERATED BY ELECTROMECHANICAL MEANS OR ELECTRONIC GENERATORS, OR IN WHICH THE TONES ARE SYNTHESISED FROM A DATA STORE
    • G10H2250/00 Aspects of algorithms or signal processing methods without intrinsic musical character, yet specifically adapted for or used in electrophonic musical processing
    • G10H2250/131 Mathematical functions for musical analysis, processing, synthesis or composition
    • G10H2250/135 Autocorrelation

Definitions

  • the present invention relates to methods concerning the creation of mix instructions files, or mix instructions files for audio files, to computer program products and to an apparatus for playing audio files.
  • Since it has become possible to store music and other audio information in data files on hard disks or other memory, DJ systems have been developed in which the songs that are to be played are stored in a memory, for example, a data base from which they can be retrieved and played when desired. Typically these systems comprise similar functions to the traditional DJ systems in that they enable the mixing of two tracks and manipulation of each of the tracks to achieve a good mix, for example, a smooth transition between two songs.
  • the mix instructions file is sometimes referred to as a mix recipe or a mix recipe file.
  • Such a system is the TRAKTOR DJ Studio 3 available from Native Instruments Software Synthesis GmbH.
  • the content of the mix instructions file can be created, for example, by registering the actions performed during a DJ session, that is, which music tracks are used and how the DJ manipulates them. This may be assisted by software that, for example, adjusts the playback speed of two tracks.
  • the mix instructions file can comprise data such as the music files' identification, the point in each music file where the playback should start from, and if the playback speed should be modified, how much and at which point and/or in which interval.
  • the system may apply the data in the mix instructions file to recreate the same mix again based on the same music files, which are stored in the system. This is a great improvement over the method previously known, which was to record the entire mix. Also, this system makes it easier to edit the mix since the parameters set in the mix instruction file can be altered instead of recording the entire mix again.
  • Co-pending patent application PCT/SE2007/050491 discloses the possibility to create a so-called "mix instructions file" identifying each music track that is used in the mix, and setting parameters for the tracks and how to mix them, but without including the actual music tracks themselves.
  • the mix instructions file can then be shared with other users without having to include all the music tracks. This means that a smaller file can be shared, and that copyright issues can be avoided in that each person using the mix instructions file must obtain the music files.
  • a recipe file normally has a single creator, whereas it can be shared by one or more readers. Of course, any given user may be both a reader and a creator.
  • each reader is expected to (1) identify the audio files that are needed in the mix, and (2) follow the included instructions that use these files in order to reproduce the expected mix result.
  • the timing of the instructions plays a critical role in producing the same results as expected by the creator of the mix. This is especially critical since the reader does not have access to the exact same files used by the creator. Instead, another audio file that is identified as the same as the creator's is used.
  • the starting point of the music within the file may vary.
  • the files may have been created using different encoding and/or different sampling frequencies.
  • the files may not be identical when the time-scale is considered and one file cannot necessarily be replaced for another without some adjustments.
  • the mix experienced by a reader may differ from that expected of the mix creator.
  • a method of processing an audio file representing an audio track comprising the steps of analyzing the audio file to define at least one reference point or reference segment in the audio file, said reference point or segment indicating an identifiable pattern within the audio file, said reference point or reference segment identifying a unique position within the audio track, independent of the audio file representing the audio track, storing information related to the reference point or reference segment, using this pattern to align files for use in a mix.
  • This first method of the invention enables the analysis of an audio track to determine its timing within the file containing it, to enable the use of the audio track with an existing mix instructions file.
  • the object is also achieved by a method of creating a mix instructions file specifying a playback mix of at least a first and a second audio file, said method comprising the steps of:
  • the audio file - analyzing the audio file to define at least one reference point or reference segment in the audio file, said reference point or segment indicating an identifiable pattern within the audio file independent of the audio file representation, - storing information related to the reference point or reference segment in the mix instructions file,
  • This second method enables the creation of mix instructions files using audio files having a determined timing of the audio information within the file.
  • the identifiable pattern should define a position within the audio track as uniquely as possible.
  • the object is also achieved by a method of playing a mix of at least a first and a second audio track using a mix instructions file, said method comprising the following steps:
  • an audio file comprising an audio track used by the mix instructions file, - identifying at least one reference point or reference segment in the mix instructions file, associated with the audio track, said reference point or segment indicating an identifiable pattern within the identified audio file,
  • This method enables the playing of a mix instructions file generated according to the second method above, using audio files generated according to the first method above.
  • any reader can use a mix instructions file created and shared by another creator, even if the audio files to be used by the mix instructions file do not match the exact digital representations used by the creator.
  • Unique reference points or segments are defined in each audio file and are used to determine the timing information related to the respective audio file within the mix instructions file.
  • the reference time points or segments are expected to be purely based on the audio properties of the corresponding track, and are independent of the digital representation of the audio tracks.
  • the reference time points or segments require a sufficiently high resolution
  • the reader of a mix instructions file is enabled to use a different set of tracks that are identified as similar to those used by the Mix Creator, yet the tracks used by the Reader may differ in quality and digital representation, causing a difference in the timing properties of the tracks. So, unless this difference in timing between the tracks is taken care of, the timing of the mix instructions during playback may no longer match the Creator's timing.
  • the reference point or segments may be defined, for example, as the point in the audio file having the highest amplitude. This will constitute an easily identifiable reference point in each audio file. Of course a defined number of amplitude maxima may be used to obtain several reference points.
  • a beat analysis may be performed to define the reference point or segment.
  • Each of the methods may further comprise the following steps: Dividing the second file into frames,
  • the method may further comprise the steps of determining the offset between the first and the second audio track using cross-correlation, by identifying the maximum value of the cross-correlation of a first and a second signal, each of said first and second signal being a scalar or a vector signal, the first signal representing a characteristic segment of the first track and the second signal representing at least a part of the second track, then determining the time scale to determine the offset.
  • the method further comprises the steps of obtaining a first and a second cross-correlation value and interpolating between the first and second cross-correlation values.
  • the method may also further comprise the step of converting the spectra corresponding to segment(s) to a representation that uses a perception-based frequency scale before comparing the characterizing sequences. This will result in a lower dimensionality of the vector signal, while keeping the most significant part of the audio signal, that is, the part that is in the audible range.
  • the characterizing sequence of a track may be selected as the one having the highest entropy of a number of such sequences. This will optimize the reliability of the matching.
  • the invention also relates to a computer program product characterized in that it comprises computer readable code means which, when run in an apparatus for playing audio tracks, will cause the apparatus to perform any of the above methods, and to an apparatus for playing audio tracks, comprising such a computer program product.
  • Fig. 1 illustrates a DJ system that may be used according to the invention
  • Fig. 2 is a flow chart of a method for creating a mix instructions file according to an embodiment of the invention
  • Fig. 3 is a flow chart of a method for using a mix instructions file with new audio files
  • Fig. 4 is a graphical representation of the audio content of an example audio file
  • Fig. 5 is a flow chart of a method for determining reference points or segments according to a second embodiment of the invention
  • Fig. 6 is a flow chart of a method of determining the offset between two files.
  • FIG. 1 is a simplified version of Fig. 2 of co-pending application PCT/SE2007/050491. A more detailed description is given in this co-pending application.
  • Fig. 1 illustrates a DJ system according to the invention.
  • a computer 23 comprises a first data base 25 for holding audio files and a second data base 27 for holding mix instructions files similar to the ones known in the art.
  • the computer has user input/output means represented by a keyboard 6 and a screen 8.
  • Mix instructions files may be created in the computer and/or retrieved from another source, for example, through the Internet, and stored in the second data base 27.
  • the computer comprises a playback software program 33 arranged to retrieve a mix instructions file from the second data base 27 and, as prescribed in the mix instructions file, to retrieve at least one audio file comprising a music track at a time from the first data base 25 and manipulate them according to the mix instructions file to create a mix of the music tracks.
  • the computer preferably also has retrieval software 35 for retrieving, through a data network 36, mix instructions files and/or audio files from external sources. These may be used directly upon retrieval or may be stored in the second data base 27.
  • the computer also preferably has retrieval software 35 for retrieving, through a data network 36, mix instructions files and/or music tracks from sources such as a music track data base 37 or a mix instructions file data base 39 in the network.
  • the portable DJ system also comprises a mix creation unit 40 to enable the operator to create mix instructions files in the portable DJ system and/or the direct retrieval of mix instructions files and/or music tracks from the network to the portable DJ system. From the creator's point of view, the mix instructions files are created by user input to the computer in a way known per se.
  • the portable DJ system is arranged to analyze each audio file used in a mix instructions file and define one or more reference time points or reference segments in the file to be included in the mix instructions file. The system is also arranged to define all timing properties of the mix instructions file in relation to the reference points or reference segments.
  • the same reference time points, or segments can be identified in these other audio files and used to align the audio file as required for use in the mix.
  • this is achieved by means of software arranged in the portable DJ system. How this can be achieved will be discussed in more detail in the following.
  • each of the units 33, 35 and 40 in the computer 23 and the playback unit 43 of the portable unit 41 are implemented as software modules stored in the computer, or portable unit, respectively.
  • the illustration shown in Figure 2 is only a logical diagram, and the actual functions can be implemented in software in a number of different ways.
  • different audio files that use different digital representations, yet have the same audio content, may differ in several respects. For example, they may have different delay times before the audio signal actually starts, or after the audio signal ends. Also, they may have a different file format, or have been created using a different encoding scheme and/or sampling frequency. Further, they may be affected by noise in different ways. Hence, for example, if a reference time point has been set a number of seconds into a music track based on one particular file representation of the track, and a different representation of the track, obtained from another source, is then used, simply placing the reference time point at the same number of seconds into the music track may put it in the wrong position relative to the actual content of the file.
  • the mix creation unit 40 of the portable DJ system is arranged, when creating a mix, to identify one or more reference points, or reference segments, which are well defined points or segments in each music track, related to the audio content of the music track.
  • any time properties in the mix instructions file (such as the point in time of a cue point) are defined in relation to the reference points or reference segments of the appropriate track, making the mix instructions file independent of any digital representation of the included audio tracks.
  • the mix playback unit 33 of the portable DJ system is also arranged, when reading a mix, to identify these reference points, or reference segments in order to properly align the reader's audio tracks. These reference time points or segments may be determined in a number of different ways, as will be discussed below.
  • each of the audio files used in the mix instructions file should have one or more reference time points, or reference segments defined, and the definitions should be included in the mix instructions file.
  • the reference time points or segments may be specified in the audio file, or stored in a separate data base, associated to the audio file. Examples of how to do this will be given in the following. An overall procedure for creating the mix instructions file is shown in Figure 2.
  • step S21 a reference to an audio file is included in the mix instructions file.
  • the audio file can be manipulated in any way that is common in the art, to mix it with one or more other audio files, increase its speed etc.
  • step S22 the audio file is analyzed to see if it already has reference time points or segments defined for it.
  • the reference time points or segments can be either directly embedded in the audio file, or stored in an independent data base that refers to the audio file.
  • Step S23 is a decision step. If reference points or segments exist for the file, go to step S25, if not, go to step S24.
  • step S24 at least one reference point or segment is defined for the file. This can be done in a number of different ways, some of which will be discussed below.
  • step S25 information about the reference time points, or segments is stored in connection with the audio file itself, and with the mix instructions file.
  • step S26 the mix instructions file is redefined so that any time properties in the mix instructions file are defined in relation to the reference points or reference segments for the corresponding audio track.
  • Figure 3 illustrates a method of using a mix instructions file retrieved from another source, with the reader's own audio files. As discussed above, these may not be identical to the ones used when creating the mix instructions file.
  • Co-pending application PCT/SE2007/050491 discloses a method for retrieving audio tracks to be used with the mix instructions file, if necessary.
  • step S31 the mix instructions file is searched to identify the audio files used by it. Then for each audio file, the following steps are performed: In step S32 the audio file is analyzed to see if it has reference time points, or segments, that match the ones defined for the audio file in the mix instructions file. This means that the references should be created in the same way, so that they represent the same characteristic of the audio file.
  • Step S33 is a decision step: if matching references are found, go to step S35; if not, go to step S34.
  • step S34 reference time points, or segments, are defined for the audio file in the same way as was done in S24 of Figure 2. Information about the references is stored in association with the audio file.
  • step S35 the references created in step S34, or found in step S33, are compared to the reference time points for the audio file in the mix instructions file, and the result of the comparison is used to align the audio file with the one used when creating the mix instructions file.
  • the alignment of the reference time points or reference segments can be performed by determining the time difference between the reference points or reference segments of the track to be used and the reference points or reference segments of the same track as defined in the mix instructions file. This difference can then be offset for each time property in the mix instructions file, which is defined in relation to the creator's reference points or reference segments for that track. If the differences are not identical for all reference time points or reference segments, then the difference can advantageously be interpolated between such points (before the first or after the last time point or segment the difference can be held constant). Linear interpolation of the differences can be used.
  • the mix instructions file can be used together with the reader's audio files to play the mix. It would of course be possible to align the audio files as they become needed while playing the mix as well, if this could be handled fast enough. It would of course also be possible to first redefine the mix instructions file so that any time properties in the mix instructions file are defined in relation to the reader's audio files, before using the mix instructions file.
  • Figure 4 shows a graphical representation in the time domain of the audio content of an audio file.
  • the audio content has clearly distinguishable features, such as amplitude peaks in a certain pattern.
  • a simple way of determining a reference time point would be to find the maximum amplitudes in the track and determine the distance to this maximum peak from the start of the file.
  • a number of the highest peaks for example three or five of the highest peaks could be used.
  • the reference points are determined based on an analysis of certain time characteristics of the audio track such as the time positions of the audio beats in the track.
  • An example of such an analysis is the beats analysis produced by the software algorithm aufTAKT (http://www.zplane.de/).
  • Such an analysis produces a vector of time positions.
  • different time vectors will be produced.
  • the analysis focuses on the audio properties of the tracks, and is independent of the digital representations, the different time vectors can be aligned to provide a time alignment of the different track representations.
  • the reference points, or segments are determined based on a sequence of periodograms, which form estimates of short-time Fourier transforms, as will be discussed in more detail below.
  • WO 02/065782 discloses a method of generating a hash signal, also referred to as a signature.
  • the hash signal may be seen as a summary of the file, and can be matched with hash signals stored in a database to identify the information signal. This can be used, for example, to verify correct receipt of a large file by sending only the hash value of the file.
  • Fig. 5 is a flow chart of the steps performed to generate one or more sequences of periodograms, that can be used according to the invention to determine the offset between two files comprising the same music track.
  • At least a part of the music track is divided into frames.
  • a frame length of 20-30 ms has been found to be suitable.
  • the frames overlap.
  • step S52 a Fast Fourier Transform (FFT) is performed on each frame, to transform the frames to the frequency domain.
  • step S53 for each frame, the signal is then converted to a periodogram by computing the energy for each frequency bin and discarding the phase.
  • the square-root of the periodogram spectrum values can be taken to obtain an estimate of the short-term magnitude spectrum.
  • Step S54 is an optional step in which the spectrum is converted to a perceptual scale, that is, the spectrum values within each of a set of pre-defined frequency bands are summed and scaled, where the pre-defined frequency bands and scaling factors are selected in a manner consistent with the audible frequency range.
  • the summations can be weighted for increased accuracy.
  • the summations result in a "perceptual" spectrum with fewer bins.
  • the well-known mel or ERB (equivalent rectangular bandwidth) scales can be used to construct the pre-defined frequency bands.
  • a processing step S55 can be included, that removes the undesired sensitivity to fixed offsets in the spectrum.
  • Such fixed offsets can be caused by stationary noise
  • step S55 has as input a spectrum corresponding to that time frame.
  • the signal is now represented by a sequence of spectra.
  • For each frequency bin a scalar time sequence of spectrum values exists.
  • Each channel is a time signal that has one time sample per time frame of the original signal. These channels are significantly down-sampled compared to the original audio signal.
  • the sequence of spectra forms a vector signal with each of the channels being a component of the vector.
  • step S55 each frame is compared to a previous frame to determine the difference between them. That is, for each channel, an output sample is obtained by subtracting the previous sample from the current sample.
  • the resulting vector signal of time differences of the spectra describes the changes in the spectra corresponding to successive time segments. Thus, such vector signals are less sensitive to stationary additive noise than the spectra themselves.
  • the difference signals are sensitive to the power level of the audio file.
  • a simple method to remove information about the power level of a signal is to consider only the sign of the signal. That is, positive signal samples are represented by +1 and negative signal samples are represented by -1.
  • the sign of the difference signal is determined.
  • the end result is a binary representation of the time differences of the sequence of spectra.
  • step S57 one or more processed frames are selected for use in the method depicted in Fig. 6, for determining the offset between the file used in Fig. 5 and another file comprising the same music track.
  • the result, as illustrated by step S57, is a characteristic representation of a part of the audio file.
  • Some characteristic vector sequences will be more efficient in the sense that they can be shorter while still obtaining a certain alignment performance.
  • a suitable selection criterion for selecting such a part or parts of the signal to be used for computing the characteristic vector sequence would be the entropy of the characteristic vector sequence, as a sequence with high entropy is likely to have features facilitating alignment when matched in different files.
  • the parts can also be selected manually.
  • the parts can be selected as the part of 500 ms centered around the middle of the file used for the original recipe.
  • Fig. 6 is a flow chart of how the characteristic vector sequence obtained according to Fig. 5 can be used to find the offset between two files.
  • each characteristic vector sequence is a vector signal segment or a set of vector signal segments.
  • the characteristic vector sequence is cross-correlated with the entire vector sequence representation of the other file. This can be described as sliding the characteristic vector sequence across the vector sequence of the other file and cross-correlating at different positions to find the best match. By finding the maximum in the cross-correlation, the location of the characteristic vector sequence in the other file is found, which means that the time scales of the two files can be synchronized. It is noted that the cross-correlation is performed on a vector signal that is down-sampled significantly compared to the audio signal.
  • step S62 an interpolation is made between the distinct cross-correlation sample values obtained, to increase the resolution of the cross-correlation curve.
  • step S63 the maximum value of the cross-correlation curve is found, to identify more precisely the part of the other file that matches the characteristic vector sequence.
  • step S64 the time scale is determined to determine the offset between the two files.
  • the offset value obtained in step S64 may be used to correct the parameters in the mix instructions file related to the processed audio file.
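
The procedures of Fig. 5 and Fig. 6 described above can be sketched roughly as follows. This is only an illustrative sketch under stated assumptions, not the applicant's implementation: it uses NumPy, the function names (`characteristic_sequence`, `best_lag`, `offset_seconds`) are hypothetical, the 25 ms frame and 50 % overlap are assumed values within what the text allows, the optional perceptual-scale step S54 is omitted, and the interpolation of step S62 is approximated by a parabolic fit around the discrete cross-correlation maximum rather than by interpolating the whole curve first.

```python
import numpy as np

FRAME_MS = 25.0   # assumed; the text calls 20-30 ms suitable
HOP_MS = 12.5     # assumed 50 % overlap; the text only says the frames overlap


def characteristic_sequence(signal, rate):
    """Return one row of +1/-1 values per pair of successive frames."""
    frame = int(rate * FRAME_MS / 1000)
    hop = int(rate * HOP_MS / 1000)
    n_frames = 1 + (len(signal) - frame) // hop
    # Divide the signal into overlapping frames.
    frames = np.stack([signal[i * hop:i * hop + frame] for i in range(n_frames)])
    # S52/S53: FFT per frame, then energy per frequency bin (phase discarded).
    periodogram = np.abs(np.fft.rfft(frames, axis=1)) ** 2
    # S55: subtract the previous frame's spectrum from the current one.
    diff = np.diff(periodogram, axis=0)
    # Keep only the sign, which removes the dependence on power level.
    return np.where(diff >= 0, 1.0, -1.0)


def best_lag(segment, other):
    """Slide `segment` across `other`; return the fractional row lag of the best match."""
    n = len(segment)
    lags = len(other) - n + 1
    # Cross-correlation of the vector signals at every candidate position.
    scores = np.array([np.sum(segment * other[k:k + n]) for k in range(lags)])
    k = int(np.argmax(scores))
    # Assumed refinement: parabola through the peak and its two neighbours.
    if 0 < k < lags - 1:
        a, b, c = scores[k - 1], scores[k], scores[k + 1]
        denom = a - 2.0 * b + c
        if denom != 0:
            return k + 0.5 * (a - c) / denom
    return float(k)


def offset_seconds(ref_signal, other_signal, rate, start_row, n_rows):
    """S64: convert the matched position into a time offset between the two files."""
    hop = int(rate * HOP_MS / 1000)
    ref_seq = characteristic_sequence(ref_signal, rate)
    other_seq = characteristic_sequence(other_signal, rate)
    segment = ref_seq[start_row:start_row + n_rows]
    return (best_lag(segment, other_seq) - start_row) * hop / rate
```

Because the cross-correlation runs on one vector per frame rather than on individual audio samples, the search is far cheaper than correlating the raw waveforms, which is the point the text makes about the down-sampled vector signal. A positive return value means the content starts later in the second file than in the first.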

Landscapes

  • Signal Processing For Digital Recording And Reproducing (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Management Or Editing Of Information On Record Carriers (AREA)

Abstract

The present invention facilitates the sharing of mix instructions files by taking into account timing differences between different files holding the same audio information. Unique reference points or segments are defined in each audio file and are used to determine the timing information related to the respective audio file within the mix instructions file.

Description

AUDIO MIX INSTRUCTION FILE WITH TIMING INFORMATION REFERRING TO UNIQUE PATTERNS WITHIN AUDIO TRACKS
Technical Field
The present invention relates to methods concerning the creation of mix instructions files, or mix instructions files for audio files, to computer program products and to an apparatus for playing audio files.
Background and Related Art
Since it has become possible to store music and other audio information in data files on hard disks or other memory, DJ systems have been developed in which the songs that are to be played are stored in a memory, for example, a data base from which they can be retrieved and played when desired. Typically these systems comprise similar functions to the traditional DJ systems in that they enable the mixing of two tracks and manipulation of each of the tracks to achieve a good mix, for example, a smooth transition between two songs.
In applicant's co-pending application PCT/SE2006/050030 a hand-held DJ system comprising a data base for music tracks is proposed in which one set of controls is used to control both tracks.
Recently systems have become available in which the parameters used to achieve a certain mix can be stored in a mix instructions file and applied to the music files at a later time. The mix instructions file is sometimes referred to as a mix recipe or a mix recipe file. Such a system is the TRAKTOR DJ Studio 3 available from Native Instruments Software Synthesis GmbH. The content of the mix instructions file can be created, for example, by registering the actions performed during a DJ session, that is, which music tracks are used and how the DJ manipulates them. This may be assisted by software that, for example, adjusts the playback speed of two tracks. For example, the mix instructions file can comprise data such as the music files' identification, the point in each music file where the playback should start from, and if the playback speed should be modified, how much and at which point and/or in which interval. The system may apply the data in the mix instructions file to recreate the same mix again based on the same music files, which are stored in the system. This is a great improvement over the method previously known, which was to record the entire mix. Also, this system makes it easier to edit the mix since the parameters set in the mix instruction file can be altered instead of recording the entire mix again.
As disc jockey systems become more and more computer based, the possibility to make mixes of music tracks and share them with other users arises. Co-pending patent application PCT/SE2007/050491 discloses the possibility to create a so-called "mix instructions file" identifying each music track that is used in the mix, and setting parameters for the tracks and how to mix them, but without including the actual music tracks themselves. The mix instructions file can then be shared with other users without having to include all the music tracks. This means that a smaller file can be shared, and that copyright issues can be avoided in that each person using the mix instructions file must obtain the music files.
In this document the creator of such a recipe file will be referred to as a "Recipe Creator", and users of such a file will be referred to as "Recipe Reader". Note that a recipe file normally has a single Creator, whereas it can be shared by one or more Readers. Of course, any given user may be both a reader and a creator.
When the creator shares a certain mix instructions file with many readers, each reader is expected to (1) identify the audio files that are needed in the mix, and (2) follow the included instructions that use these files in order to reproduce the expected mix result. When executing the included instructions, the timing of the instructions plays a critical role in producing the same results as expected by the creator of the mix. This is especially critical since the reader does not have access to the exact same files used by the creator. Instead, another audio file that is identified as the same as the creator's is used.
Given that various digital representations of an audio track exist, a problem may arise when a reader uses one audio file representation that differs from that used by the creator, even though they are identified as representing the same audio track.
For example, the starting point of the music within the file may vary. Also, the files may have been created using different encoding and/or different sampling frequencies. Hence, although the audio contents of the files, as experienced by the user, are similar, the files may not be identical when the time-scale is considered, and one file cannot necessarily be substituted for another without some adjustments. Unless the differences in the timing characteristics of the different representations of the same audio track are handled by the reader of the recipe file, the mix experienced by a reader may differ from that expected by the mix creator.
Object of the Invention
It is an object of the invention to facilitate the sharing of mix instructions files.
Summary of the Invention
This object is achieved according to the present invention by a method of processing an audio file representing an audio track, comprising the steps of analyzing the audio file to define at least one reference point or reference segment in the audio file, said reference point or segment indicating an identifiable pattern within the audio file, said reference point or reference segment identifying a unique position within the audio track, independent of the audio file representing the audio track, storing information related to the reference point or reference segment, using this pattern to align files for use in a mix. This first method of the invention enables the analysis of an audio track to determine its timing within the file containing it, to enable the use of the audio track with an existing mix instructions file.
The object is also achieved by a method of creating a mix instructions file specifying a playback mix of at least a first and a second audio file, said method comprising the steps of:
- identifying an audio track to be used in the mix instructions file,
- storing in the mix instructions file information related to the playback of the audio track in the mix, including timing information,
- analyzing the audio file to define at least one reference point or reference segment in the audio file, said reference point or segment indicating an identifiable pattern within the audio file independent of the audio file representation,
- storing information related to the reference point or reference segment in the mix instructions file,
- defining the timing information in relation to the at least one reference point or reference segment.
This second method enables the creation of mix instructions files using audio files having a determined timing of the audio information within the file.
The identifiable pattern should define a position within the audio track as uniquely as possible.
The object is also achieved by a method of playing a mix of at least a first and a second audio track using a mix instructions file, said method comprising the following steps:
- identifying an audio file comprising an audio track used by the mix instructions file,
- identifying at least one reference point or reference segment in the mix instructions file, associated with the audio track, said reference point or segment indicating an identifiable pattern within the identified audio file,
- analyzing the audio file to define at least one indication point or indication segment in the audio file matching the at least one reference point or reference segment,
- comparing the indication point or segment to at least one reference point or segment stored in the mix instructions file,
- aligning the audio file on the basis of the comparison.
This method enables the playing of a mix instructions file generated according to the second method above, using audio files generated according to the first method above.
By introducing in the mix instructions file reference time points or segments for each audio file that is used in the mix, any reader can use a mix instructions file created and shared by another creator, even if the audio files to be used by the mix instructions file do not match the exact digital representations used by the creator. Unique reference points or segments are defined in each audio file and are used to determine the timing information related to the respective audio file within the mix instructions file. The reference time points or segments are expected to be purely based on the audio properties of the corresponding track, and are independent of the digital representation of the audio tracks.
To align the two audio files sufficiently precisely to be able to replace one with the other in a mix instructions file, the reference time points or segments require a sufficiently high resolution.
According to the invention, therefore, the reader of a mix instructions file is enabled to use a different set of tracks that are identified as similar to those used by the Mix Creator, yet the tracks used by the Reader may differ in quality and digital representation, causing a difference in the timing properties of the tracks. So, unless this difference in timing between the tracks is taken care of, the timing of the mix instructions during playback may no longer match the Creator's timing.
In any of the three methods above, the reference point or segments may be defined, for example, as the point in the audio file having the highest amplitude. This will constitute an easily identifiable reference point in each audio file. Of course a defined number of amplitude maxima may be used to obtain several reference points.
Alternatively a beat analysis may be performed to define the reference point or segment.
Each of the methods may further comprise the following steps: Dividing the second file into frames,
Performing a Fourier Transform on at least two frames,
Converting the transformed signal to periodograms to obtain corresponding spectra,
Determining the difference between the at least two frames,
Selecting at least one processed frame for use in determining the offset between a first and a second audio track comprising the same audio information.
In the latter case, the method may further comprise the steps of determining the offset between the first and the second audio track using cross-correlation, by identifying the maximum value of the cross-correlation of a first and a second signal, each of said first and second signal being a scalar or a vector signal, the first signal representing a characteristic segment of the first track and the second signal representing at least a part of the second track, then determining the time scale to determine the offset. Preferably, to increase the resolution, the method further comprises the steps of obtaining a first and a second cross-correlation value and interpolating between the first and second cross-correlation values.
The method may also further comprise the step of converting the spectra corresponding to segment(s) to a representation that uses a perception-based frequency scale before comparing the characterizing sequences. This will result in a lower dimensionality of the vector signal, while keeping the most significant part of the audio signal, that is, the part that is in the audible range.
The characterizing sequence of a track may be selected as the one having the highest entropy of a number of such sequences. This will optimize the reliability of the matching.
The invention also relates to a computer program product characterized in that it comprises computer readable code means which, when run in an apparatus for playing audio tracks, will cause the apparatus to perform any of the above methods, and to an apparatus for playing audio tracks, comprising such a computer program product.
Brief Description of the Drawings
The invention will be described in more detail in the following, by way of example and with reference to the appended drawings in which:
Fig. 1 illustrates a DJ system that may be used according to the invention,
Fig. 2 is a flow chart of a method for creating a mix instructions file according to an embodiment of the invention,
Fig. 3 is a flow chart of a method for using a mix instructions file with new audio files,
Fig. 4 is a graphical representation of the audio content of an example audio file,
Fig. 5 is a flow chart of a method for determining reference points or segments according to a second embodiment of the invention,
Fig. 6 is a flow chart of a method of determining the offset between two files.
Detailed Description of Embodiments
Fig. 1 is a simplified version of Fig. 2 of co-pending application PCT/SE2007/050491. A more detailed description is given in this co-pending application. Fig. 1 illustrates a DJ system according to the invention. A computer 23 comprises a first data base 25 for holding audio files and a second data base 27 for holding mix instructions files similar to the ones known in the art. The computer has user input/output means represented by a keyboard 6 and a screen 8.
Mix instructions files may be created in the computer and/or retrieved from another source, for example, through the Internet, and stored in the second data base 27.
According to the invention the computer comprises a playback software program 33 arranged to retrieve a mix instructions file from the second data base 27 and, as prescribed in the mix instructions file, to retrieve at least one audio file comprising a music track at a time from the first data base 25 and manipulate them according to the mix instructions file to create a mix of the music tracks. The computer preferably also has retrieval software 35 for retrieving, through a data network 36, mix instructions files and/or audio files from external sources. These may be used directly upon retrieval or may be stored in the second data base 27.
According to the invention, even if the mix instructions file is based on music tracks that are found in the first data base 25, it may be that the copies of the music tracks do not have the exact same digital representation of the tracks as the files used when the mix instructions file is created. The computer also preferably has retrieval software 35 for retrieving, through a data network 36, mix instructions files and/or music tracks from sources such as a music track data base 37 or a mix instructions file data base 39 in the network.
The portable DJ system also comprises a mix creation unit 40 to enable the operator to create mix instructions files in the portable DJ system and/or the direct retrieval of mix instructions files and/or music tracks from the network to the portable DJ system. From the creator's point of view, the mix instructions files are created by user input to the computer in a way known per se. According to the invention, the portable DJ system is arranged to analyze each audio file used in a mix instructions file and define one or more reference time points or reference segments in the file to be included in the mix instructions file. The system is also arranged to define all timing properties of the mix instructions file in relation to the reference points or reference segments. In this way, when the mix instructions file is used with other audio files, the same reference time points, or segments, can be identified in these other audio files and used to align the audio file as required for use in the mix. As will be understood, this is achieved by means of software arranged in the portable DJ system. How this can be achieved will be discussed in more detail in the following.
It will be understood that each of the units 33, 35 and 40 in the computer 23 and the playback unit 43 of the portable unit 41 are implemented as software modules stored in the computer, or portable unit, respectively. As the skilled person will understand, the illustration shown in Figure 2 is only a logical diagram, and the actual functions can be implemented in software in a number of different ways.
As will be understood, different audio files that use different digital representations, yet have the same audio content may differ in several respects. For example they may have different delay times before the audio signal actually starts, or after the audio signal ends. Also, they may have a different file format, have been created using a different encoding scheme and/or sampling frequency. Further, they may be affected by noise in different ways. Hence, for example, if a reference time point has been set a number of seconds into a music track based on a file of a particular representation of the track, if a different representation of the track, obtained from another source, is used, the reference time point may be placed in the wrong position relative to the actual content of the file, if the reference time point is simply placed at the same number of seconds into the music track. According to the invention, therefore, the mix creation unit 40 of the portable DJ system is arranged, when creating a mix, to identify one or more reference points, or reference segments, which are well defined points or segments in each music track, related to the audio content of the music track. According to the invention, any time properties in the mix instructions file (such as the point in time of a cue point) are defined in relation to the reference points or reference segments of the appropriate track, making the mix instructions file independent of any digital representation of the included audio tracks. According to the invention, the mix playback unit 33 of the portable DJ system is also arranged, when reading a mix, to identify these reference points, or reference segments in order to properly align the reader's audio tracks. These reference time points or segments may be determined in a number of different ways, as will be discussed below.
According to the invention, therefore, when creating a mix instructions file each of the audio files used in the mix instructions file should have one or more reference time points, or reference segments defined, and the definitions should be included in the mix instructions file. The reference time points or segments may be specified in the audio file, or stored in a separate data base, associated to the audio file. Examples of how to do this will be given in the following. An overall procedure for creating the mix instructions file is shown in Figure 2.
In step S21 a reference to an audio file is included in the mix instructions file. The audio file can be manipulated in any way that is common in the art, to mix it with one or more other audio files, increase its speed etc. In step S22 the audio file is analyzed to see if it already has reference time points or segments defined for it. The reference time points or segments can be either directly embedded in the audio file, or stored in an independent data base that refers to the audio file. Step S23 is a decision step. If reference points or segments exist for the file, go to step S25, if not, go to step S24.
In step S24 at least one reference point or segment is defined for the file. This can be done in a number of different ways, some of which will be discussed below. In step S25 information about the reference time points, or segments is stored in connection with the audio file itself, and with the mix instructions file.
In step S26, the mix instructions file is redefined so that any time properties in the mix instructions file are defined in relation to the reference points or reference segments for the corresponding audio track.
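Step S26 can be illustrated with a minimal Python sketch. It assumes, purely for illustration, that time properties are held as a dictionary of named positions in seconds and that a single reference point per track is used; the function names `to_relative` and `to_absolute` are hypothetical and do not appear in the application.

```python
def to_relative(absolute_times, reference_time_s):
    """Express each time property relative to the track's reference point.

    absolute_times: hypothetical mapping of property name -> seconds from
    the start of the creator's audio file.
    """
    return {name: t - reference_time_s for name, t in absolute_times.items()}


def to_absolute(relative_times, reference_time_s):
    """Map reference-relative times back onto a concrete audio file,
    given where the same reference point lies in that file."""
    return {name: t + reference_time_s for name, t in relative_times.items()}


# Creator's file: reference point found 2.5 s into the file.
relative = to_relative({"cue": 12.5}, reference_time_s=2.5)
# Reader's file: the same reference point lies 3.0 s into the file.
reader_times = to_absolute(relative, reference_time_s=3.0)
```

Because only the reference-relative values are stored, the same stored cue lands on the same musical position in either file.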
Figure 3 illustrates a method of using a mix instructions file retrieved from another source, with the reader's own audio files. As discussed above, these may not be identical to the ones used when creating the mix instructions file. Co-pending application PCT/SE2007/050491 discloses a method for retrieving audio tracks to be used with the mix instructions file, if necessary.
In step S31 the mix instructions file is searched to identify the audio files used by it. Then for each audio file, the following steps are performed: In step S32 the audio file is analyzed to see if it has reference time points, or segments, that match the ones defined for the audio file in the mix instructions file. This means that the references should be created in the same way, so that they represent the same characteristic of the audio file.
Step S33 is a decision step: if matching references are found, go to step S35; if not, go to step S34. In step S34 reference time points, or segments, are defined for the audio file in the same way as was done in S24 of Figure 2. Information about the references is stored in association with the audio file.
In step S35 the references created in step S34, or found in step S33, are compared to the reference time points for the audio file in the mix instructions file, and the result of the comparison is used to align the audio file with the one used when creating the mix instructions file.
The alignment of the reference time points or reference segments can be performed by determining the time difference between the reference points or reference segments of the track to be used and the reference points or reference segments of the same track as defined in the mix instructions file. This difference can then be offset for each time property in the mix instructions file, which is defined in relation to the creator's reference points or reference segments for that track. If the differences are not identical for all reference time points or reference segments, then the difference can advantageously be interpolated between such points (before the first or after the last time point or segment the difference can be held constant). Linear interpolation of the differences can be used.
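The interpolation of timing differences described above might be sketched as follows in Python. The sketch assumes that the reference points are given as lists of times in seconds, sorted in increasing order and paired one-to-one between the creator's and the reader's file; the function names are illustrative only.

```python
from bisect import bisect_right


def offset_at(t, ref_creator, ref_reader):
    """Timing difference to apply at creator time t.

    The difference is interpolated linearly between reference points and
    held constant before the first and after the last reference point.
    """
    diffs = [r - c for c, r in zip(ref_creator, ref_reader)]
    if t <= ref_creator[0]:
        return diffs[0]
    if t >= ref_creator[-1]:
        return diffs[-1]
    i = bisect_right(ref_creator, t)  # ref_creator[i-1] <= t < ref_creator[i]
    t0, t1 = ref_creator[i - 1], ref_creator[i]
    weight = (t - t0) / (t1 - t0)
    return diffs[i - 1] + weight * (diffs[i] - diffs[i - 1])


def to_reader_time(t, ref_creator, ref_reader):
    """Translate a time property from the creator's file to the reader's."""
    return t + offset_at(t, ref_creator, ref_reader)
```

With reference points at 10 s and 20 s in the creator's file and at 10.5 s and 21 s in the reader's file, a cue at 15 s would be moved to 15.75 s.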
When all the audio files to be used with the mix instructions file have been aligned properly, the mix instructions file can be used together with the reader's audio files to play the mix. It would of course be possible to align the audio files as they become needed while playing the mix as well, if this could be handled fast enough. It would of course also be possible to first redefine the mix instructions file so that any time properties in the mix instructions file are defined in relation to the reader's audio files, before using the mix instructions file.
With reference to Figure 4, some methods of determining the reference points, or segments, in the time domain will be discussed. Figure 4 shows a graphical representation in the time domain of the audio content of an audio file. As can be seen, the audio content has clearly distinguishable features, such as amplitude peaks in a certain pattern. A simple way of determining a reference time point would be to find the maximum amplitude in the track and determine the distance to this maximum peak from the start of the file. To create a more reliable reference, a number of the highest peaks, for example three or five of the highest peaks could be used.
In another embodiment of the invention, the reference points are determined based on an analysis of certain time characteristics of the audio track such as the time positions of the audio beats in the track. An example of such an analysis is the beats analysis produced by the software algorithm aufTAKT (http://www.zplane.de/).
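A minimal Python sketch of the peak-based reference is given below. It assumes the audio is already decoded into a list of sample values and simply ranks individual samples by absolute amplitude; a practical implementation would probably also keep the selected peaks a minimum distance apart, which is omitted here. The function name is hypothetical.

```python
def peak_reference_times(samples, sample_rate, num_peaks=1):
    """Return the times (in seconds from the start of the file) of the
    num_peaks samples with the largest absolute amplitude, in time order."""
    ranked = sorted(range(len(samples)),
                    key=lambda i: abs(samples[i]),
                    reverse=True)
    return sorted(i / sample_rate for i in ranked[:num_peaks])


# The same content with two extra samples of leading silence: the peak
# moves by exactly the extra delay, which gives the offset between files.
original = [0.0, 0.1, -0.9, 0.2, 0.5, 0.0]
delayed = [0.0, 0.0] + original
offset_s = (peak_reference_times(delayed, 2)[0]
            - peak_reference_times(original, 2)[0])
```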
Such an analysis produces a vector of time positions. With the same analysis applied on different digital representation of the same audio track, different time vectors will be produced. However, given that the analysis focuses on the audio properties of the tracks, and is independent of the digital representations, the different time vectors can be aligned to provide a time alignment of the different track representations.
In another embodiment of the invention, the reference points, or segments, are determined based on a sequence of periodograms, which form estimates of short-time Fourier transforms, as will be discussed in more detail below.
Several methods are known for generating a hash signal. For example, WO 02/065782 discloses a method of generating a hash signal, also referred to as a signature. The hash signal may be seen as a summary of the file, and can be matched with hash signals stored in a database to identify the information signal. This can be used, for example, to verify correct receipt of a large file by sending only the hash value of the file. Fig. 5 is a flow chart of the steps performed to generate one or more sequences of periodograms, that can be used according to the invention to determine the offset between two files comprising the same music track.
In the first step S51, at least a part of the music track is divided into frames. A frame length of 20-30 ms has been found to be suitable. To increase the resolution, preferably, the frames overlap.
In step S52 a Fast Fourier Transform (FFT) is performed on each frame, to transform the frames to the frequency domain. The resulting frequency domain representation is discrete and each frequency point is referred to as a frequency "bin".
In step S53, for each frame, the signal is then converted to a periodogram by computing the energy for each frequency bin and discarding the phase. Advantageously the square-root of the periodogram spectrum values can be taken to obtain an estimate of the short-term magnitude spectrum. Below we refer to the estimate of the magnitude spectrum or the periodogram, whichever is used, as the "spectrum".
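Steps S51 to S53 can be sketched in Python as follows. To keep the sketch free of dependencies, a direct discrete Fourier transform is used in place of an FFT; it yields the same values but is far slower, so it is only suitable for illustration. The normalization by the frame length and all function names are assumptions of this sketch. At a sampling frequency of 44.1 kHz, a frame of 1024 samples corresponds to roughly 23 ms.

```python
import cmath
import math


def split_into_frames(samples, frame_len, hop):
    """Step S51: divide the signal into frames; hop < frame_len gives overlap."""
    return [samples[s:s + frame_len]
            for s in range(0, len(samples) - frame_len + 1, hop)]


def periodogram(frame):
    """Steps S52-S53: energy per frequency bin, phase discarded.

    A naive DFT stands in for the FFT here. Only bins 0..n/2 are kept,
    since the input is real-valued.
    """
    n = len(frame)
    energies = []
    for k in range(n // 2 + 1):
        x_k = sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n)
                  for t in range(n))
        energies.append(abs(x_k) ** 2 / n)
    return energies


def magnitude_spectrum(frame):
    """Optional square root of the periodogram values."""
    return [math.sqrt(e) for e in periodogram(frame)]
```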
Step S54 is an optional step in which the spectrum is converted to a perceptual scale, that is, the spectrum values within each of a set of pre-defined frequency bands are summed and scaled, where the pre-defined frequency bands and scaling factors are selected in a manner consistent with the audible frequency range. The summations can be weighted for increased accuracy. The summations result in a "perceptual" spectrum with fewer bins. The well-known mel or ERB (equivalent rectangular band) scales can be used to construct the pre-defined frequency bands.
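One possible reading of step S54 is sketched below in Python. It assumes bands that are equally wide on the mel scale, uses the commonly cited conversion mel = 2595 log10(1 + f/700), and sums the bins in each band without weighting or scaling; none of these choices is prescribed by the description, and the function names are hypothetical.

```python
import math


def hz_to_mel(f_hz):
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)


def mel_to_hz(mel):
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def to_perceptual(spectrum, sample_rate, num_bands):
    """Sum the spectrum values falling into each of num_bands mel-spaced
    bands between 0 Hz and half the sampling frequency.

    spectrum is assumed to hold bins 0..n/2 of one frame.
    """
    f_max = sample_rate / 2.0
    top_mel = hz_to_mel(f_max)
    edges = [mel_to_hz(top_mel * i / num_bands) for i in range(num_bands + 1)]
    bands = [0.0] * num_bands
    for k, value in enumerate(spectrum):
        f_hz = k * f_max / (len(spectrum) - 1)
        b = 0
        while b < num_bands - 1 and f_hz >= edges[b + 1]:
            b += 1
        bands[b] += value
    return bands
```

The low bands cover few bins and the high bands cover many, so the result has far fewer components than the original spectrum while keeping fine resolution at low frequencies.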
Advantageously, a processing step S55 can be included, that removes the undesired sensitivity to fixed offsets in the spectrum. Such fixed offsets can be caused by stationary noise. For each time frame, step S55 has as input a spectrum corresponding to that time frame. The signal is now represented by a sequence of spectra. For each frequency bin a scalar time sequence of spectrum values exists. Thus, we can distinguish a set of scalar frequency "channels", each channel corresponding to one frequency bin. Each channel is a time signal that has one time sample per time frame of the original signal. These channels are significantly down-sampled compared to the original audio signal. The sequence of spectra forms a vector signal with each of the channels being a component of the vector. In step S55 each frame is compared to a previous frame to determine the difference between them. That is, for each channel, an output sample is obtained by subtracting the previous sample from the current sample. The resulting vector signal of time differences of the spectra describes the changes in the spectra corresponding to successive time segments. Thus, such vector signals are less sensitive to stationary additive noise than the spectra themselves.
The difference signals are sensitive to the power level of the audio file. A simple method to remove information about the power level of a signal is to consider only the sign of the signal. That is, positive signal samples are represented by +1 and negative signal samples are represented by -1. In step S56 the sign of the difference signal is determined. The end result is a binary representation of the time differences of the sequence of spectra.
In step S57 one or more processed frames are selected for use in the method depicted in Fig. 6, for determining the offset between the file used in Fig. 5 and another file comprising the same music track. The result, as illustrated by step S57, is a characteristic representation of a part of the audio file. We refer to this characteristic representation of the part as the "characteristic vector sequence". Some characteristic sequences will be more efficient in the sense that they can be shorter to obtain a certain alignment performance. A suitable selection criterion for selecting such a part or parts of the signal to be used for computing the characteristic vector sequence would be the entropy of the characteristic vector sequence, as a characteristic vector sequence with high entropy is likely to have features facilitating alignment when it is matched in different files. However, the parts can also be selected manually. As a third option, the parts can be selected as the part of 500 ms centered around the middle of the file used for the original recipe.
Fig. 6 is a flow chart of how the characteristic vector sequence obtained according to Fig. 5 can be used to find the offset between two files. As explained above, each characteristic vector sequence is a vector signal segment or a set of vector signal segments. In step S61 the characteristic vector sequence is cross-correlated with the entire vector sequence representation of the other file. This can be described as sliding the characteristic vector sequence across the vector sequence of the other file and cross-correlating at different positions to find the best match. By finding the maximum in the cross-correlation, the location of the characteristic vector sequence in the other file is found, which means that the time scales of the two files can be synchronized. It is noted that the cross-correlation is performed on a vector signal that is down-sampled significantly compared to the audio signal. Thus, in step S62 an interpolation is made between the distinct cross-correlation sample values obtained, to increase the resolution of the cross-correlation curve. In step S63 the maximum value of the cross-correlation curve is found, to identify more precisely the part of the other file that matches the characteristic vector sequence. In step S64 the time scale is determined to determine the offset between the two files.
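Steps S55 and S56 can be sketched together in Python. The spectra are assumed to be given as a list of equally long lists, one per time frame. The description does not say how a difference of exactly zero is to be represented; mapping it to -1 is an assumption of this sketch.

```python
def sign_of_differences(spectra):
    """Steps S55-S56: for each channel, subtract the previous frame's value
    from the current one and keep only the sign (+1 or -1).

    Zero differences are mapped to -1 (an assumption). The output has one
    frame fewer than the input.
    """
    result = []
    for previous, current in zip(spectra, spectra[1:]):
        result.append([1 if c - p > 0 else -1
                       for p, c in zip(previous, current)])
    return result


spectra = [[1.0, 5.0], [2.0, 4.0], [2.0, 6.0]]
binary = sign_of_differences(spectra)
```

Adding a constant to every frame (a fixed offset) or multiplying every frame by a positive gain (a different power level) leaves the binary representation unchanged, which is the robustness the two steps aim for.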
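The description does not specify how the entropy of a candidate part is to be estimated. The Python sketch below uses one simple possibility as an assumption: for each channel, the fraction of +1 values within the candidate part is treated as a probability, the binary entropy of that probability is computed, and the entropies are summed over the channels. The function names are hypothetical.

```python
import math


def binary_entropy(p):
    """Entropy in bits of a two-valued source with probability p."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -(p * math.log2(p) + (1.0 - p) * math.log2(1.0 - p))


def segment_entropy(segment):
    """Sum over channels of the entropy of the +1/-1 values in the segment."""
    total = 0.0
    for channel in range(len(segment[0])):
        p_plus = sum(1 for frame in segment if frame[channel] > 0) / len(segment)
        total += binary_entropy(p_plus)
    return total


def most_informative_start(sequence, segment_len):
    """Index of the first frame of the candidate part with highest entropy."""
    starts = range(len(sequence) - segment_len + 1)
    return max(starts,
               key=lambda s: segment_entropy(sequence[s:s + segment_len]))
```

A part in which every channel is constant scores zero under this estimate, reflecting that it offers nothing to lock onto during matching.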
The offset value obtained in step S64 may be used to correct the parameters in the mix instructions file related to the processed audio file.
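The procedure of Fig. 6 might be sketched in Python as follows. The sketch assumes binary (+1/-1) vector sequences as input. Instead of interpolating the whole cross-correlation curve and then searching it, it locates the discrete maximum and refines it by fitting a parabola through the maximum and its two neighbours; this is a swapped-in simplification, not the interpolation method of the description. The parameters `pattern_start_frame` and `hop_s` (position of the characteristic vector sequence in the first file, and time between successive frames) are names introduced for illustration.

```python
def cross_correlate(pattern, sequence):
    """Step S61: slide the pattern across the sequence and, at each lag,
    sum the products of corresponding vector components."""
    scores = []
    for lag in range(len(sequence) - len(pattern) + 1):
        score = 0
        for p_frame, s_frame in zip(pattern, sequence[lag:lag + len(pattern)]):
            score += sum(a * b for a, b in zip(p_frame, s_frame))
        scores.append(score)
    return scores


def refined_peak(scores):
    """Steps S62-S63 (simplified): position of the maximum, refined to a
    fraction of a frame with a three-point parabolic fit."""
    k = max(range(len(scores)), key=scores.__getitem__)
    if 0 < k < len(scores) - 1:
        a, b, c = scores[k - 1], scores[k], scores[k + 1]
        curvature = a - 2 * b + c
        if curvature != 0:
            return k + 0.5 * (a - c) / curvature
    return float(k)


def offset_seconds(pattern, pattern_start_frame, sequence, hop_s):
    """Step S64: convert the frame lag into a time offset between the files."""
    lag = refined_peak(cross_correlate(pattern, sequence))
    return (lag - pattern_start_frame) * hop_s
```

If the characteristic vector sequence started at frame 1 in the first file and is found at frame 3 in the other file, with 10 ms between frames, the other file is offset by 20 ms.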

Claims

1. A method of analyzing an audio file representing an audio track, comprising the steps of analyzing the audio file to define at least one reference point or reference segment in the audio file, said reference point or segment indicating an identifiable pattern within the audio file, said reference point or reference segment identifying a unique position within the audio track, independent of the audio file representing the audio track, storing information related to the reference point or reference segment, using this pattern to align files for use in a mix
2. A method of creating a mix instructions file specifying a playback mix of at least a first and a second audio file, said method comprising the steps of:
- identifying an audio track to be used in the mix instructions file,
- storing in the mix instructions file information related to the playback of the audio track in the mix, including timing information,
- analyzing the audio file to define at least one reference point or reference segment in the audio file, said reference point or segment indicating an identifiable pattern within the audio file, independent of the audio file representation,
- storing information related to the reference point or reference segment in the mix instructions file,
- defining the timing information in relation to the at least one reference point or reference segment.
3. A method of playing a mix of at least a first and a second audio track using a mix instructions file, said method comprising the following steps:
- identifying an audio file comprising an audio track used by the mix instructions file,
- identifying at least one reference point or reference segment in the mix instructions file, associated with the audio track, said reference point or segment indicating an identifiable pattern within the identified audio file, independent of the audio file representation,
- analyzing the audio file to define at least one indication point or indication segment in the audio file matching the at least one reference point or reference segment,
- comparing the indication point or segment to at least one reference point or segment stored in the mix instructions file,
- aligning the audio file on the basis of the comparison.
4. A method according to any one of the previous claims, wherein the reference point is defined as the point in the audio file having the highest amplitude.
5. A method according to any one of the claims 1-3, wherein a beat analysis is used to define the reference point or segment.
6. A method according to any one of the preceding claims, further comprising the following steps:
- dividing the second file into frames,
- performing a Fourier transform on at least two frames,
- converting the transformed signal to periodograms to obtain corresponding spectra,
- determining the difference between the at least two frames,
- selecting at least one processed frame for use in determining the offset between a first and a second audio track comprising the same audio information.
7. A method according to claim 6, comprising the step of determining the offset between the first and the second audio track using cross-correlation, by identifying the maximum value of the cross-correlation of a first and a second signal, each of said first and second signal being a scalar or a vector signal, the first signal representing a characteristic segment of the first track and the second signal representing at least a part of the second track, then determining the time scale to determine the offset.
8. A method according to claim 7, comprising the steps of obtaining a first and a second cross-correlation value and interpolating between the first and second cross-correlation values to increase the resolution.
9. A method according to any one of the claims 6-8, further comprising the step of converting the spectra corresponding to segment(s) to a representation that uses a perception-based frequency scale before comparing the characterizing sequences.
10. A method according to any one of the preceding claims wherein the characterizing sequence of a track is selected as the one having the highest entropy of a number of such sequences.
11. A computer program product characterized in that it comprises computer readable code means which, when run in an apparatus for playing audio tracks, will cause the apparatus to perform the method according to any one of the claims 1-9.
12. An apparatus for playing audio tracks, characterized in that it comprises a computer program product according to claim 11.
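The offset determination recited in claims 6 to 8 can be illustrated with a minimal Python sketch. It is not the applicant's implementation. All function names, the frame length, the sample rate and the synthetic test signal are hypothetical. The sketch also assumes non-overlapping frames, an offset that is a whole number of frames, and a normalized (cosine) cross-correlation in place of the plain cross-correlation named in claim 7, so that the peak does not depend on signal level. The parabolic fit is one possible reading of the interpolation in claim 8.

```python
import cmath
import math
import random


def periodogram(frame):
    """Power spectrum |X[k]|^2 / N of one frame, computed with a naive DFT."""
    n = len(frame)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * t / n)
                for t, x in enumerate(frame))) ** 2 / n
        for k in range(n // 2 + 1)
    ]


def frame_differences(samples, frame_len):
    """Spectral difference between each pair of neighbouring frames."""
    frames = [samples[i:i + frame_len]
              for i in range(0, len(samples) - frame_len + 1, frame_len)]
    spectra = [periodogram(f) for f in frames]
    return [sum(abs(a - b) for a, b in zip(cur, prev))
            for prev, cur in zip(spectra, spectra[1:])]


def normalized_xcorr(segment, sequence):
    """Cosine similarity of `segment` against every window of `sequence`."""
    seg_norm = math.sqrt(sum(s * s for s in segment))
    scores = []
    for lag in range(len(sequence) - len(segment) + 1):
        window = sequence[lag:lag + len(segment)]
        win_norm = math.sqrt(sum(w * w for w in window))
        dot = sum(s * w for s, w in zip(segment, window))
        scores.append(dot / (seg_norm * win_norm)
                      if seg_norm and win_norm else 0.0)
    return scores


def interpolated_peak(scores):
    """Lag of the maximum, refined by a parabola through its two neighbours."""
    k = max(range(len(scores)), key=scores.__getitem__)
    if 0 < k < len(scores) - 1:
        left, mid, right = scores[k - 1], scores[k], scores[k + 1]
        denom = left - 2 * mid + right
        if denom:
            return k + 0.5 * (left - right) / denom
    return float(k)


# Hypothetical test data: the second track is the first one delayed by
# three frames of silence, i.e. the same audio information at an offset.
FRAME_LEN = 16       # samples per frame (assumed)
SAMPLE_RATE = 8000   # samples per second (assumed)
rng = random.Random(0)
first_track = [rng.uniform(-1.0, 1.0) for _ in range(20 * FRAME_LEN)]
second_track = [0.0] * (3 * FRAME_LEN) + first_track

d1 = frame_differences(first_track, FRAME_LEN)
d2 = frame_differences(second_track, FRAME_LEN)

# Characteristic segment of the first track, located within the second track.
SEG_START, SEG_LEN = 5, 8
segment = d1[SEG_START:SEG_START + SEG_LEN]
peak = interpolated_peak(normalized_xcorr(segment, d2))

offset_frames = round(peak) - SEG_START
offset_seconds = offset_frames * FRAME_LEN / SAMPLE_RATE
print(offset_frames, offset_seconds)
```

In this sketch the whole-frame lag is converted to seconds through the frame length and sample rate, while `peak` carries the sub-frame estimate from the interpolation. A practical implementation would more likely use overlapping, windowed frames and an FFT library in place of the naive DFT.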
EP09745757A 2008-05-16 2009-05-13 Audio mix instruction file with timing information referring to unique patterns within audio tracks Ceased EP2304726A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
SE0801130 2008-05-16
PCT/EP2009/055765 WO2009138425A1 (en) 2008-05-16 2009-05-13 Audio mix instruction file with timing information referring to unique patterns within audio tracks

Publications (1)

Publication Number Publication Date
EP2304726A1 (en) 2011-04-06

Family

ID=40957938

Family Applications (1)

Application Number Title Priority Date Filing Date
EP09745757A Ceased EP2304726A1 (en) 2008-05-16 2009-05-13 Audio mix instruction file with timing information referring to unique patterns within audio tracks

Country Status (2)

Country Link
EP (1) EP2304726A1 (en)
WO (1) WO2009138425A1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2019115333A1 (en) 2017-12-11 2019-06-20 100 Milligrams Holding Ab System and method for creation and recreation of a music mix, computer program product and computer system

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106486128B (en) * 2016-09-27 2021-10-22 腾讯科技(深圳)有限公司 Method and device for processing double-sound-source audio data

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP4646099B2 (en) * 2001-09-28 2011-03-09 パイオニア株式会社 Audio information reproducing apparatus and audio information reproducing system
US7027124B2 (en) * 2002-02-28 2006-04-11 Fuji Xerox Co., Ltd. Method for automatically producing music videos
WO2005104088A1 (en) * 2004-04-19 2005-11-03 Sony Computer Entertainment Inc. Music composition reproduction device and composite device including the same
JP4626376B2 (en) * 2005-04-25 2011-02-09 ソニー株式会社 Music content playback apparatus and music content playback method
EP1959427A4 (en) * 2005-12-09 2011-11-30 Sony Corp Music edit device, music edit information creating method, and recording medium where music edit information is recorded

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
C. BACHMANN, H. BISCHOFF, M. BRÖER, S. PFEIFER: "CUBASE 4 Advanced Music Production System", 19 October 2007 (2007-10-19), Retrieved from the Internet <URL:ftp://ftp.steinberg.net/Download/Cubase_4/Docs_English/Operation_Manual.pdf> [retrieved on 20120522] *
See also references of WO2009138425A1 *

Also Published As

Publication number Publication date
WO2009138425A1 (en) 2009-11-19

Similar Documents

Publication Publication Date Title
CN1941071B (en) Beat extraction and detection apparatus and method, music-synchronized image display apparatus and method
US8680388B2 (en) Automatic recognition and matching of tempo and phase of pieces of music, and an interactive music player
US9251796B2 (en) Methods and systems for disambiguation of an identification of a sample of a media stream
KR101292698B1 (en) Method and apparatus for attaching metadata
US8076566B2 (en) Beat extraction device and beat extraction method
US7041892B2 (en) Automatic generation of musical scratching effects
Scott et al. Automatic multi-track mixing using linear dynamical systems
US6534700B2 (en) Automated compilation of music
JP2005518594A (en) A system that sells products using audio content identification
US20160196812A1 (en) Music information retrieval
JP3886372B2 (en) Acoustic inflection point extraction apparatus and method, acoustic reproduction apparatus and method, acoustic signal editing apparatus, acoustic inflection point extraction method program recording medium, acoustic reproduction method program recording medium, acoustic signal editing method program recording medium, acoustic inflection point extraction method Program, sound reproduction method program, sound signal editing method program
US20020172379A1 (en) Automated compilation of music
KR20050085765A (en) Audio signal analysing method and apparatus
KR100754294B1 (en) Feature-based audio content identification
Schwarz et al. Methods and datasets for DJ-mix reverse engineering
WO2007072394A2 (en) Audio structure analysis
WO2009138425A1 (en) Audio mix instruction file with timing information referring to unique patterns within audio tracks
JP2009063714A (en) Audio playback device and audio fast forward method
EP3391372B1 (en) Improved method, apparatus and system for embedding data within a data stream
EP3724873B1 (en) System and method for creation and recreation of a music mix, computer program product and computer system
JP5338312B2 (en) Automatic performance synchronization device, automatic performance keyboard instrument and program
JP2009294671A (en) Audio reproduction system and audio fast-forward reproduction method
US6448484B1 (en) Method and apparatus for processing data representing a time history
KR101352758B1 (en) Apparatus for generating karaoke contents and method thereof
JP4336362B2 (en) Sound reproduction apparatus and method, sound reproduction program and recording medium therefor

Legal Events

Date Code Title Description
PUAI Public reference made under article 153(3) epc to a published international application that has entered the european phase

Free format text: ORIGINAL CODE: 0009012

17P Request for examination filed

Effective date: 20101216

AK Designated contracting states

Kind code of ref document: A1

Designated state(s): AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HR HU IE IS IT LI LT LU LV MC MK MT NL NO PL PT RO SE SI SK TR

AX Request for extension of the european patent

Extension state: AL BA RS

RAP1 Party data changed (applicant data changed or rights of an application transferred)

Owner name: PRODIGIUM LTD.

DAX Request for extension of the european patent (deleted)

17Q First examination report despatched

Effective date: 20120126

REG Reference to a national code

Ref country code: DE

Ref legal event code: R003

STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: THE APPLICATION HAS BEEN REFUSED

18R Application refused

Effective date: 20130117