WO2007086417A1 - Beat extraction device and beat extraction method - Google Patents
- Publication number
- WO2007086417A1 (application PCT/JP2007/051073)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- beat
- music
- extraction
- position information
- beats
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Ceased
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10H—ELECTROPHONIC MUSICAL INSTRUMENTS; INSTRUMENTS IN WHICH THE TONES ARE GENERATED BY ELECTROMECHANICAL MEANS OR ELECTRONIC GENERATORS, OR IN WHICH THE TONES ARE SYNTHESISED FROM A DATA STORE
- G10H1/00—Details of electrophonic musical instruments
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10G—REPRESENTATION OF MUSIC; RECORDING MUSIC IN NOTATION FORM; ACCESSORIES FOR MUSIC OR MUSICAL INSTRUMENTS NOT OTHERWISE PROVIDED FOR, e.g. SUPPORTS
- G10G3/00—Recording music in notation form, e.g. recording the mechanical operation of a musical instrument
- G10G3/04—Recording music in notation form, e.g. recording the mechanical operation of a musical instrument using electrical means
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10H—ELECTROPHONIC MUSICAL INSTRUMENTS; INSTRUMENTS IN WHICH THE TONES ARE GENERATED BY ELECTROMECHANICAL MEANS OR ELECTRONIC GENERATORS, OR IN WHICH THE TONES ARE SYNTHESISED FROM A DATA STORE
- G10H1/00—Details of electrophonic musical instruments
- G10H1/36—Accompaniment arrangements
- G10H1/40—Rhythm
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
- G10L25/48—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10H—ELECTROPHONIC MUSICAL INSTRUMENTS; INSTRUMENTS IN WHICH THE TONES ARE GENERATED BY ELECTROMECHANICAL MEANS OR ELECTRONIC GENERATORS, OR IN WHICH THE TONES ARE SYNTHESISED FROM A DATA STORE
- G10H2210/00—Aspects or methods of musical processing having intrinsic musical character, i.e. involving musical theory or musical parameters or relying on musical knowledge, as applied in electrophonic musical tools or instruments
- G10H2210/031—Musical analysis, i.e. isolation, extraction or identification of musical elements or musical parameters from a raw acoustic signal or from an encoded audio signal
- G10H2210/076—Musical analysis, i.e. isolation, extraction or identification of musical elements or musical parameters from a raw acoustic signal or from an encoded audio signal for extraction of timing, tempo; Beat detection
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10H—ELECTROPHONIC MUSICAL INSTRUMENTS; INSTRUMENTS IN WHICH THE TONES ARE GENERATED BY ELECTROMECHANICAL MEANS OR ELECTRONIC GENERATORS, OR IN WHICH THE TONES ARE SYNTHESISED FROM A DATA STORE
- G10H2240/00—Data organisation or data communication aspects, specifically adapted for electrophonic musical tools or instruments
- G10H2240/011—Files or data streams containing coded musical information, e.g. for transmission
- G10H2240/046—File format, i.e. specific or non-standard musical file format used in or adapted for electrophonic musical instruments, e.g. in wavetables
- G10H2240/071—Wave, i.e. Waveform Audio File Format, coding, e.g. uncompressed PCM audio according to the RIFF bitstream format method
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10H—ELECTROPHONIC MUSICAL INSTRUMENTS; INSTRUMENTS IN WHICH THE TONES ARE GENERATED BY ELECTROMECHANICAL MEANS OR ELECTRONIC GENERATORS, OR IN WHICH THE TONES ARE SYNTHESISED FROM A DATA STORE
- G10H2240/00—Data organisation or data communication aspects, specifically adapted for electrophonic musical tools or instruments
- G10H2240/325—Synchronizing two or more audio tracks or files according to musical features or musical timings
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10H—ELECTROPHONIC MUSICAL INSTRUMENTS; INSTRUMENTS IN WHICH THE TONES ARE GENERATED BY ELECTROMECHANICAL MEANS OR ELECTRONIC GENERATORS, OR IN WHICH THE TONES ARE SYNTHESISED FROM A DATA STORE
- G10H2250/00—Aspects of algorithms or signal processing methods without intrinsic musical character, yet specifically adapted for or used in electrophonic musical processing
- G10H2250/131—Mathematical functions for musical analysis, processing, synthesis or composition
- G10H2250/215—Transforms, i.e. mathematical transforms into domains appropriate for musical signal processing, coding or compression
- G10H2250/235—Fourier transform; Discrete Fourier Transform [DFT]; Fast Fourier Transform [FFT]
Definitions
- the present invention relates to a beat extraction device and a beat extraction method for extracting beats of a rhythm of music.
- Performances by musicians are ultimately delivered to users as music content. Specifically, the individual performances are mixed down, for example into two stereo channels, to form a single finished package.
- the completed package reaches the user as a music CD (Compact Disc) using, for example, a PCM (Pulse Code Modulation) method.
- The sound source on such a music CD is a so-called sampled sound source.
- MIDI: Musical Instrument Digital Interface
- In the MIDI format, performance information, lyrics information, and time code information (time stamps) describing the sound generation timing (event times) needed for synchronization control are described as MIDI data.
- MIDI data is created in advance by content creators, and a karaoke player merely produces each sound at the time specified by the MIDI data. In other words, the device generates (plays) the music on the spot. Such content can be enjoyed only in the limited environment of MIDI data and its dedicated devices.
- SMIL: Synchronized Multimedia Integration Language
- However, most music content distributed in the world is not in MIDI or SMIL form; it consists of sampled audio such as PCM data, represented by CDs, and MP3 (MPEG (Moving Picture Experts Group) Audio Layer 3), which is compressed audio.
- MP3: MPEG (Moving Picture Experts Group) Audio Layer 3
- The music playback device provides music content to the user by performing D/A conversion on the sampled audio waveform, such as PCM, and outputting it.
- FM radio broadcasting. Furthermore, there are cases where people perform on the spot, such as at concerts and live performances, and provide the music to users directly.
- If a machine could automatically recognize timing such as measures and beats from the raw waveform of the music, music could be synchronized with other media, as in karaoke and dance, even without information such as MIDI or SMIL event times prepared in advance. This would also apply to the huge amount of existing content such as CDs, expanding the possibilities for new entertainment.
- Japanese Patent No. 3066528 describes a method that creates sound pressure data for each frequency band from the music data, identifies the frequency band in which the rhythm is most prominent, and estimates the rhythm component from the period of change in the sound pressure data of the identified band.
- Techniques for calculating rhythm, beat, tempo, and the like can be broadly classified into those that analyze the music signal in the time domain, as disclosed in Japanese Patent Laid-Open No. 2002-116754, and those that analyze it in the frequency domain, as described in Japanese Patent No. 3066528.
- With the time-domain analysis of Japanese Patent Laid-Open No. 2002-116754, the beats do not necessarily coincide with the time waveform, so high extraction accuracy essentially cannot be obtained.
- The frequency analysis of Japanese Patent No. 3066528 can improve the extraction accuracy relative to Japanese Patent Laid-Open No. 2002-116754, but the data obtained by frequency analysis contains many beats in addition to the beats of a specific note value.
- It is extremely difficult to separate the beats of a specific note value from all of these beats.
- In addition, because the music tempo (time period) itself fluctuates greatly, it is extremely difficult to extract only the beats of a specific note value while following those fluctuations.
- The present invention has been proposed in view of this conventional situation. Its object is to provide a beat extraction device and a beat extraction method that can accurately extract only the beats of a specific note value over the entire song, even for a song whose tempo fluctuates.
- A beat extraction device according to the present invention comprises beat extraction processing means for extracting beat position information of a rhythm in a music piece, and beat alignment processing means for generating beat period information using the beat position information extracted by the beat extraction processing means and for aligning the beats of that beat position information based on the beat period information.
- A beat extraction method according to the present invention comprises a beat extraction processing step of extracting beat position information of a rhythm in a music piece, and a beat alignment processing step of generating beat period information using the beat position information extracted in the beat extraction processing step and aligning the beats of that beat position information based on the beat period information.
- FIG. 1 is a functional block diagram showing an internal configuration of a music playback device including an embodiment of a beat extraction device according to the present invention.
- FIG. 2 is a functional block diagram showing an internal configuration of a beat extraction unit.
- FIG. 3 (A) is a diagram showing an example of a time waveform of a digital audio signal.
- FIG. 3 (B) is a diagram showing a spectrogram of the digital audio signal.
- FIG. 4 is a functional block diagram showing an internal configuration of a beat extraction processing unit.
- FIG. 5 (A) is a diagram showing an example of a time waveform of a digital audio signal.
- FIG. 5 (C) is a diagram showing a beat extraction waveform of the digital audio signal.
- FIG. 6 (A) is a diagram showing the beat intervals of beat position information extracted by the beat extraction processing unit, and FIG. 6 (B) is a diagram showing the beat intervals of beat position information after alignment processing by the beat alignment processing unit.
- FIG. 7 is a diagram showing a window width for determining whether or not a specific beat is an in-beat.
- FIG. 8 is a diagram showing beat intervals of beat position information.
- FIG. 9 is a diagram showing total beats calculated based on beat position information extracted by the beat extraction unit.
- FIG. 10 is a diagram showing total beats and instantaneous beat periods.
- FIG. 11 is a graph showing instantaneous BPM with respect to the number of beats in a live-recorded music.
- FIG. 12 is a graph showing an instantaneous BPM with respect to the number of beats in a song recorded by a so-called computer.
- FIG. 13 is a flowchart showing a processing procedure in an example of correcting beat position information according to the reliability index value.
- FIG. 14 is a flowchart showing an example of a processing procedure for automatically optimizing beat extraction conditions.
- FIG. 1 is a block diagram showing an internal configuration of a music playback device 10 including an embodiment of a beat extraction device according to the present invention.
- the music playback device 10 is composed of, for example, a personal computer.
- CPU: Central Processing Unit
- ROM: Read Only Memory
- RAM: Random Access Memory
- An audio data decoding unit 104, a media drive 105, a communication network interface 107 (an interface is written as I/F in the figures; the same applies hereinafter), an operation input unit interface 109, a display interface 111, an I/O port 113, an I/O port 114, an input unit interface 115, and an HDD (Hard Disc Drive) 121 are connected to the system bus 100. Data processed in each functional block is supplied to the other functional blocks via the system bus 100.
- I/F: interface (for example, the communication network interface 107)
- HDD: Hard Disc Drive
- the media drive 105 takes in music data of music content stored on the medium 106 such as a CD (Compact Disc) or a DVD (Digital Versatile Disc) to the system bus 100.
- CD: Compact Disc
- DVD: Digital Versatile Disc
- the operation input unit interface 109 is connected to an operation input unit 110 such as a keyboard and a mouse.
- The display 112 is assumed to show, for example, an image synchronized with the extracted beats, such as a doll or robot that dances in time with them.
- An audio playback unit 117 and a beat extraction unit 11 are connected to the I/O port 113.
- The beat extraction unit 11 is also connected to the I/O port 114.
- The input unit interface 115 is connected to an input unit 116 that includes an A/D (analog-to-digital) converter 116A, a microphone terminal 116B, and a microphone 116C.
- The audio signal or music signal picked up by the microphone 116C is converted into a digital audio signal by the A/D converter 116A and supplied to the input unit interface 115.
- the input unit interface 115 captures this digital audio signal into the system bus 100.
- a digital audio signal (corresponding to a time waveform signal) taken into the system bus 100 is recorded on the HDD 121 in the form of a .wav file or the like.
- the digital audio signal captured via the input unit interface 115 is not directly supplied to the audio playback unit 117.
- the audio data decoding unit 104 decodes the music data and restores the digital audio signal.
- the audio data decoding unit 104 transfers the restored digital audio signal to the IZO port 113 via the system bus 100.
- The I/O port 113 supplies the digital audio signal transferred via the system bus 100 to the beat extraction unit 11 and the audio playback unit 117.
- Music data on an existing medium 106 such as a CD is taken into the system bus 100 through the media drive 105.
- Uncompressed audio content that the listener has downloaded and stored on the HDD 121 is taken directly into the system bus 100.
- Compressed audio content is decoded by the audio data decoding unit 104 and returned to the system bus 100.
- Digital audio signals captured from the input unit 116 into the system bus 100 via the input unit interface 115 are not limited to music signals; they may include, for example, human voice signals and other audio-band signals.
- A digital audio signal (corresponding to a time waveform signal) taken into the system bus 100 is transferred to the I/O port 113 and supplied to the beat extraction unit 11.
- The beat extraction unit 11, which is an embodiment of the beat extraction device according to the present invention, comprises a beat extraction processing unit 12 that extracts beat position information of the rhythm in the music, and a beat alignment processing unit 13 that generates beat period information using the beat position information extracted by the beat extraction processing unit 12 and aligns the beats of that beat position information based on the beat period information.
- the beat extraction processing unit 12 extracts rough beat position information from the digital audio signal.
- the result is output as metadata recorded in the .mty file.
- The beat alignment processing unit 13 uses either all of the metadata recorded in the .mty file or the metadata corresponding to a portion of the music assumed to have the same tempo to align the beat position information extracted by the beat extraction processing unit 12, and outputs the result as metadata recorded in a .may file. In this way, the accuracy of the extracted beat position information is raised step by step. Details of the beat extraction unit 11 will be described later.
- The audio playback unit 117 includes a D/A converter 117A, an output amplifier 117B, and a speaker 117C.
- The I/O port 113 supplies the digital audio signal transferred via the system bus 100 to the D/A converter 117A provided in the audio playback unit 117.
- The D/A converter 117A converts the digital audio signal supplied from the I/O port 113 into an analog audio signal and supplies it to the speaker 117C through the output amplifier 117B.
- The speaker 117C reproduces the analog audio signal supplied from the D/A converter 117A through the output amplifier 117B.
- the display interface 111 is connected to a display 112 such as an LCD (Liquid Crystal Display).
- The display 112 shows, for example, the beat components and the tempo value extracted from the music data of the music content.
- an animation image and lyrics are displayed in synchronization with the music.
- the communication network interface 107 is connected to the Internet 108.
- The music playback device 10 connects via the Internet 108 to a server that stores attribute information of music content, sends an attribute information acquisition request using the identification information of the music content as a search key, and stores the attribute information returned by the server in response on, for example, the hard disk of the HDD 121.
- the attribute information of the music content applied to the music playback device 10 includes information constituting the music.
- The information constituting the music includes information on the segmentation of the music, the chords in the music, the tempo, key, volume, and time signature in units of chords, the score, the chord progression, the lyrics, and so on. It is the information that serves as the basis for determining the so-called tune.
- Here, a chord unit is a unit of the music to which a chord is attached, such as a beat or a measure.
- the information on the segmentation of music includes, for example, relative position information from the start position of the music and a time stamp.
- the beat extraction unit 11 included in the music playback device 10 according to an embodiment to which the present invention is applied extracts the beat position information of the rhythm of the music based on the characteristics of the digital audio signal described below.
- FIG. 3 (A) shows an example of a time waveform of a digital audio signal.
- The time waveform shown in FIG. 3 (A) has portions that momentarily exhibit a large peak value.
- the portion exhibiting the large peak value is, for example, a portion corresponding to a part of the drum beat.
- FIG. 3 (B) shows a spectrogram of the digital audio signal having the time waveform shown in FIG. 3 (A).
- In the spectrogram shown in FIG. 3 (B), the beat components hidden in the time waveform shown in FIG. 3 (A) can be seen as portions where the power spectrum changes greatly in an instant. Actually listening to the sound confirms that the portions where the power spectrum in this spectrogram changes greatly in an instant correspond to the beat components.
- the beat extraction unit 11 regards the part of the spectrogram where the power spectrum changes instantaneously as the beat component of the rhythm.
- The beat extraction processing unit 12 includes a power spectrum calculation unit 12A, a change rate calculation unit 12B, an envelope follower unit 12C, a comparator unit 12D, and a binarization unit 12E.
- A digital audio signal having a time waveform as shown in FIG. 5 (A) is input to the power spectrum calculation unit 12A.
- the digital audio signal supplied from the audio data decoding unit 104 is supplied to the power spectrum calculation unit 12A included in the beat extraction processing unit 12.
- Because beat components cannot be extracted with high accuracy from the time waveform itself, the power spectrum calculation unit 12A calculates a spectrogram from this time waveform using, for example, an FFT (Fast Fourier Transform).
- FFT Fast Fourier Transform
- As for the resolution of this FFT calculation, when the sampling frequency of the digital audio signal input to the beat extraction processing unit 12 is 48 kHz, the number of samples is set to 512 or 1024, which corresponds to 5 to 30 msec in real time.
- various numerical values set in the FFT calculation are not limited to these.
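The relation between the FFT sample count and the real-time window length can be checked with a short calculation. The following is a minimal sketch; the function name `fft_window_ms` is hypothetical and only the 512/1024 sample counts and the 48 kHz rate come from the text.

```python
def fft_window_ms(num_samples: int, sampling_rate_hz: int) -> float:
    """Duration of one FFT analysis window in milliseconds."""
    return num_samples / sampling_rate_hz * 1000.0


# Window lengths for the two sample counts mentioned in the text at 48 kHz.
for n in (512, 1024):
    print(n, "samples ->", round(fft_window_ms(n, 48_000), 2), "ms")
```

Both values fall inside the 5 to 30 msec range stated above.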
- the power spectrum calculation unit 12A supplies the calculated power spectrum to the change rate calculation unit 12B.
- The change rate calculation unit 12B calculates the rate of change of the power spectrum supplied from the power spectrum calculation unit 12A. That is, it performs a differentiation operation on that power spectrum. By repeatedly differentiating the power spectrum as it changes from moment to moment, the change rate calculation unit 12B outputs a detection signal representing a beat extraction waveform.
- the peak that rises in the positive direction in the beat extraction waveform shown in Fig. 5 (C) is regarded as the beat component.
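The power spectrum and change rate stages might be sketched as follows. This is an illustration under stated assumptions, not the patented implementation: a naive DFT stands in for the FFT, frames do not overlap, and the per-frame change is summed over frequency bins keeping only positive differences. The function names and the toy signal are hypothetical.

```python
import cmath
import math


def power_spectrum(frame):
    """Naive DFT power spectrum of one frame (bins 0..N/2)."""
    n = len(frame)
    spectrum = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                  for i, x in enumerate(frame))
        spectrum.append(abs(acc) ** 2)
    return spectrum


def beat_extraction_waveform(signal, frame_size):
    """Frame-to-frame power change, keeping only increases (assumption)."""
    frames = [signal[i:i + frame_size]
              for i in range(0, len(signal) - frame_size + 1, frame_size)]
    spectra = [power_spectrum(f) for f in frames]
    waveform = [0.0]  # no previous frame to compare the first frame with
    for prev, cur in zip(spectra, spectra[1:]):
        waveform.append(sum(max(c - p, 0.0) for p, c in zip(prev, cur)))
    return waveform


# Toy signal: three silent frames, one sine burst, three silent frames.
frame_size = 16
quiet = [0.0] * (frame_size * 3)
burst = [math.sin(2 * math.pi * 3 * i / frame_size) for i in range(frame_size)]
wave = beat_extraction_waveform(quiet + burst + quiet, frame_size)
peak_frame = max(range(len(wave)), key=wave.__getitem__)
print("largest positive change at frame", peak_frame)
```

The waveform peaks at the frame where the burst begins and stays at zero where the power falls, which matches the idea that only peaks rising in the positive direction are treated as beat components.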
- the envelope follower unit 12C removes chattering of the detection signal by adding a hysteresis characteristic with an appropriate time constant to the detection signal.
- the detection signal from which chattering has been removed is supplied to the comparator unit 12D.
- The comparator unit 12D has an appropriate threshold, cuts low-level noise from the detection signal supplied from the envelope follower unit 12C, and supplies the detection signal from which the low-level noise has been cut to the binarization unit 12E.
- The binarization unit 12E performs binarization processing that keeps only the portions of the detection signal supplied from the comparator unit 12D whose level is at or above the threshold, and outputs beat position information indicating the time positions of the beat components P1, P2, and P3 as metadata recorded in the .mty file.
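The envelope follower, comparator, and binarization stages can be illustrated with a small pipeline. This is a sketch under assumptions: the hysteresis is modeled as a peak hold with exponential release, and a beat position is taken as the rising edge of the binary signal. The `decay` and `threshold` values and all function names are hypothetical.

```python
def envelope_follower(detection, decay=0.5):
    """Peak hold with exponential release: rises at once, falls slowly."""
    env, out = 0.0, []
    for x in detection:
        env = max(x, env * decay)
        out.append(env)
    return out


def comparator(envelope, threshold):
    """Cut everything below the threshold to zero."""
    return [v if v >= threshold else 0.0 for v in envelope]


def binarize(gated):
    """1 where the gated signal is non-zero, else 0."""
    return [1 if v > 0.0 else 0 for v in gated]


def beat_positions(binary):
    """Frame indices of rising edges (0 -> 1) in the binary signal."""
    return [i for i, b in enumerate(binary)
            if b and (i == 0 or not binary[i - 1])]


# Detection signal with a chattering double peak (9.0, 0.2, 6.0) and a clean one.
detection = [0.0, 0.1, 9.0, 0.2, 6.0, 0.1, 0.0, 0.0, 8.0, 0.3, 0.0]
env = envelope_follower(detection, decay=0.5)
positions = beat_positions(binarize(comparator(env, threshold=2.0)))
print(positions)
```

Thresholding the raw detection signal directly would report the double peak as two beats; holding the envelope merges it into one.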
- the beat extraction processing unit 12 extracts the beat position information from the time waveform of the digital audio signal, and outputs it as metadata recorded in the .mty file.
- Each component included in the beat extraction processing unit 12 has an internal parameter, and the effect of the operation of each component is changed by changing each internal parameter.
- These internal parameters are set by automatic optimization or, for example, manually by the user through the operation input unit 110.
- The beat intervals of the beat position information extracted from the music by the beat extraction processing unit 12 and recorded as metadata in the .mty file are often uneven, as shown for example in FIG. 6 (A).
- the beat alignment processing unit 13 performs beat position information alignment processing on the music pieces or music portions assumed to have the same tempo among the beat position information extracted by the beat extraction processing unit 12.
- From the metadata of the beat position information extracted by the beat extraction processing unit 12 and recorded in the .mty file, the beat alignment processing unit 13 extracts equally spaced beats, which are evenly spaced in time, as indicated by A1 to A11, and does not extract irregular beats such as those indicated by B1 to B4. The equally spaced beats in this embodiment are spaced at quarter-note intervals.
- The beat alignment processing unit 13 calculates the average period T from the beat position information extracted by the beat extraction processing unit 12 and recorded in the .mty file, and extracts beats whose time interval equals the average period T as equally spaced beats.
- The beat alignment processing unit 13 newly adds interpolation beats, as indicated by C1 to C3, at the positions where equally spaced beats should exist but are missing. This makes it possible to obtain beat position information in which all beat intervals are equal.
- the beat alignment processing unit 13 defines and extracts beats as in-beats that have substantially the same phase as the equally-spaced beats.
- the in-beat is a beat synchronized with an actual music beat, and includes an equidistant beat.
- the beat alignment processing unit 13 defines beats having completely different phases from the equally spaced beats as outbeats, and excludes them.
- Outbeats are beats that are not synchronized with the actual music beat (quarter note beat). For this reason, the beat alignment processing unit 13 needs to discriminate between inbeats and outbeats.
- The beat alignment processing unit 13 defines a constant window width W centered on each equally spaced beat.
- the beat alignment processing unit 13 determines beats included in the window width W as in beats, and determines beats not included in the window width W as out beats.
- The beat alignment processing unit 13 adds an interpolated beat, which is a beat that fills in for a missing equally spaced beat, when the window width W contains no extracted beat.
- The beat alignment processing unit 13 extracts, as in-beats, the equally spaced beats indicated by A11 to A20 and the beat D11, which has substantially the same phase as the equally spaced beat A11.
- It also extracts the interpolated beats indicated by C11 to C13.
- the beat alignment processing unit 13 does not extract the beats as indicated by B11 to B13 as quarter note beats.
- the number of inbeats to be extracted can be increased and the extraction errors can be reduced by setting the window width W to a larger value.
- The window width W may normally be a constant value, but it can be adjusted as a parameter, for example by increasing it for music whose tempo fluctuates extremely widely.
- the beat alignment processing unit 13 gives, as metadata, beat attributes such as an in beat included in the window width W and an out beat not included in the window width W. In addition, when there is no extracted beat within the window width W, the beat alignment processing unit 13 automatically adds an interpolation beat and gives a beat attribute called this interpolation beat as metadata.
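The in-beat, out-beat, and interpolated-beat attributes described above can be sketched as a labeling pass over an equally spaced grid. This is an illustration under assumptions: the grid is given by a known first beat and period, and the window is symmetric around each grid position. The function `align_beats` and its parameters are hypothetical.

```python
def align_beats(extracted, first_beat, period, count, window):
    """Label extracted beats against an equally spaced grid.

    Beats within window/2 of a grid position are "in", all others are "out".
    Grid positions with no extracted beat nearby get an "interpolated" beat.
    """
    half = window / 2.0
    grid = [first_beat + k * period for k in range(count)]
    labelled = []
    for b in extracted:
        inside = any(abs(b - g) <= half for g in grid)
        labelled.append((b, "in" if inside else "out"))
    for g in grid:
        if not any(abs(b - g) <= half for b in extracted):
            labelled.append((g, "interpolated"))
    return sorted(labelled)


# Positions in samples; the beat near 2000 is missing and 1500 is off the grid.
extracted = [0, 1010, 1500, 2990, 4020]
result = align_beats(extracted, first_beat=0, period=1000, count=5, window=100)
for position, attribute in result:
    print(position, attribute)
```

Widening `window` turns more extracted beats into in-beats, which is the trade-off the text describes for music with large fluctuation.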
- The beat position information and the beat attributes described above together constitute the beat information, which is recorded as metadata in the metadata file (.may).
- Each component provided in the beat alignment processing unit 13 has internal parameters such as the basic window width W, and the effect of the operation is changed by changing each internal parameter.
- Through the two-stage data processing in the beat extraction processing unit 12 and the beat alignment processing unit 13, the beat extraction unit 11 can automatically extract very high-precision beat information from the digital audio signal.
- beat information of equal intervals of quarter notes can be obtained over the entire song.
- The music playback device 10 can calculate the total number of beats using the following formula (1), based on the beat position information from the first beat X1 to the last beat Xn extracted by the beat extraction unit 11.
- $\text{Total number of beats} = \text{Total number of in-beats} + \text{Total number of interpolation beats} \qquad (1)$
- The music playback device 10 can calculate the music tempo (average BPM) from the beat position information extracted by the beat extraction unit 11 using the following formulas (2) and (3).
- $\text{Average beat period} = \dfrac{\text{Last beat position} - \text{First beat position}}{\text{Total number of beats} - 1} \qquad (2)$
- $\text{Average BPM [bpm]} = \dfrac{\text{Sampling frequency}}{\text{Average beat period}} \times 60 \qquad (3)$
- The music playback device 10 can thus obtain the total number of beats and the average BPM using only the four basic arithmetic operations.
- the music playback device 10 can calculate the tempo of the music at high speed and with a low load using the calculated result. Note that the method for obtaining the tempo of a song is not limited to this.
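Formulas (1) to (3) translate directly into a few lines of arithmetic. The sketch below assumes that beat positions are expressed in samples, which is what dividing the sampling frequency by the beat period in formula (3) implies. The function names and the example values are hypothetical.

```python
def total_beats(num_in_beats: int, num_interpolated: int) -> int:
    """Formula (1): in-beats plus interpolation beats."""
    return num_in_beats + num_interpolated


def average_beat_period(first_pos: int, last_pos: int, total: int) -> float:
    """Formula (2): average beat period, in samples (assumed unit)."""
    return (last_pos - first_pos) / (total - 1)


def average_bpm(sampling_rate_hz: int, avg_period_samples: float) -> float:
    """Formula (3): average tempo in beats per minute."""
    return sampling_rate_hz / avg_period_samples * 60


# Example: 7 in-beats and 2 interpolation beats spanning 192000 samples at 48 kHz.
n = total_beats(7, 2)
period = average_beat_period(first_pos=0, last_pos=192_000, total=n)
print(n, period, average_bpm(48_000, period))
```

With 9 beats there are 8 intervals, so the period is 24000 samples (0.5 s) and the average tempo is 120 BPM.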
- Based on the beat position information extracted by the beat extraction unit 11, the music playback device 10 can also calculate an instantaneous BPM indicating the momentary tempo fluctuation of the music, which has not been possible until now. As shown in FIG. 10, the music playback device 10 calculates the instantaneous BPM according to formula (4), taking the time interval between equally spaced beats as the instantaneous beat period Ts.
- the music playback device 10 graphs this instantaneous BPM for each beat and displays it on the display 112 via the display interface 111.
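A per-beat instantaneous BPM series might be computed as follows. Formula (4) is not reproduced in this text, so the sketch assumes it has the same form as formula (3) with the instantaneous beat period Ts in place of the average beat period. Beat positions are again assumed to be in samples, and the function name is hypothetical.

```python
def instantaneous_bpm(beat_positions, sampling_rate_hz):
    """One BPM value per beat interval, assuming BPM = fs / Ts * 60."""
    return [sampling_rate_hz / (b - a) * 60
            for a, b in zip(beat_positions, beat_positions[1:])]


# Example: a steady 120 BPM that speeds up and then slows down slightly.
positions = [0, 24_000, 48_000, 71_000, 96_000]
series = instantaneous_bpm(positions, 48_000)
print([round(v, 1) for v in series])
```

Plotting such a series against the beat number gives the kind of graph described for FIG. 11 and FIG. 12.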
- The user can grasp this instantaneous BPM distribution as the distribution of tempo fluctuation in the music he or she is actually listening to, and can use it, for example, for rhythm training or for finding performance mistakes made during recording.
- FIG. 11 is a graph showing instantaneous BPM with respect to the number of beats in a live-recorded music piece.
- FIG. 12 is a graph showing the instantaneous BPM with respect to the number of beats in a song recorded by a so-called computer.
- Comparing the two graphs, the computer-recorded music shows a narrower range of fluctuation than the live-recorded music. This is because the tempo variation in computer-recorded music is quite small.
- a method for making beat position information extraction processing more accurate will be described.
- Because the metadata indicating the beat position information extracted by the beat extraction unit 11 is generally obtained by automatic computer recognition, this beat position information contains some extraction errors. In particular, depending on the music, the beats may fluctuate unevenly or the sense of beat may be extremely weak.
- The beat alignment processing unit 13 therefore assigns a reliability index value, which indicates the reliability of the metadata, to the metadata supplied from the beat extraction processing unit 12, and automatically determines the reliability of the metadata.
- This reliability index value is, for example, instantaneous B as shown in the following formula (5).
- beat position information extraction error can be corrected manually by the user. If the extraction error can be easily found and the error part can be corrected, the correction work becomes more efficient.
- FIG. 13 is a flowchart illustrating an example of a processing procedure for manually correcting beat position information based on the reliability index value.
- In step S1, a digital audio signal is supplied from the I/O port 113 to the beat extraction processing unit 12 included in the beat extraction unit 11.
- In step S2, the beat extraction processing unit 12 extracts beat position information from the digital audio signal supplied from the I/O port 113 and supplies it to the beat alignment processing unit 13 as metadata recorded in a .mty file.
- In step S3, the beat alignment processing unit 13 performs alignment processing on the beats constituting the beat position information supplied from the beat extraction processing unit 12.
- In step S4, the beat alignment processing unit 13 determines whether the reliability index value assigned to the metadata on which the alignment processing has been performed is equal to or greater than a certain threshold value N (%). If the reliability index value is N (%) or more, the process proceeds to step S6; if it is less than N (%), the process proceeds to step S5.
- In step S5, the user manually corrects the result of the beat alignment processing using an authoring tool (not shown) provided in the music playback device 10.
- In step S6, the beat alignment processing unit 13 supplies the beat position information that has undergone the beat alignment processing to the I/O port 114 as metadata recorded in a .may file.
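The flow of FIG. 13 can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: `extract_beats`, `align_beats`, `reliability`, and `manual_correct` are hypothetical callables standing in for the beat extraction processing unit 12, the beat alignment processing unit 13, its reliability index, and the authoring tool. The sketch also assumes that step S5 is followed by step S6.

```python
def process_with_manual_fallback(audio, extract_beats, align_beats,
                                 reliability, manual_correct, threshold_n):
    """Sketch of steps S1 to S6; every callable is a hypothetical stand-in."""
    raw_beats = extract_beats(audio)        # S1-S2: extract beat positions
    aligned = align_beats(raw_beats)        # S3: beat alignment
    score = reliability(aligned)            # reliability index value (%)
    if score < threshold_n:                 # S4: compare with threshold N
        aligned = manual_correct(aligned)   # S5: user corrects by hand
    return aligned                          # S6: output as metadata (assumed)


# Toy stand-ins: beats are times in seconds, "alignment" snaps to a 0.5 s grid.
result = process_with_manual_fallback(
    audio=None,
    extract_beats=lambda _audio: [0.0, 0.49, 1.02, 1.5],
    align_beats=lambda beats: [round(b * 2) / 2 for b in beats],
    reliability=lambda beats: 40.0,               # pretend the index is low
    manual_correct=lambda beats: beats + [2.0],   # pretend the user adds a beat
    threshold_n=80.0,
)
print(result)
```

Passing the stages in as callables keeps the control flow of the flowchart visible without committing to any particular extraction or alignment algorithm.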
- Beat position information can also be extracted with higher accuracy by changing the extraction conditions for the beat position information based on the reliability index value.
- FIG. 14 is a flowchart showing an example of the processing procedure for specifying the beat extraction condition.
- The beat extraction unit 11 and the beat alignment processing unit 13 prepare a plurality of internal parameter sets in advance and perform beat extraction processing for each parameter set.
- The reliability index value is then calculated.
- In step S11, a digital audio signal is supplied from the I/O port 113 to the beat extraction processing unit 12 included in the beat extraction unit 11.
- In step S12, the beat extraction processing unit 12 extracts beat position information from the digital audio signal supplied from the I/O port 113 and supplies it to the beat alignment processing unit 13 as metadata recorded in a .mty file.
- In step S13, the beat alignment processing unit 13 performs beat alignment processing on the metadata supplied from the beat extraction processing unit 12.
- In step S14, the beat alignment processing unit 13 determines whether the reliability index value assigned to the metadata for which the alignment processing has been completed is equal to or greater than a certain threshold N (%). If the reliability index value is N (%) or more, the process proceeds to step S16; if it is less than N (%), the process proceeds to step S15.
- In step S15, the beat extraction processing unit 12 and the beat alignment processing unit 13 each change the parameters of the parameter set described above, and the process returns to step S12. After steps S12 and S13, the reliability index value is evaluated again in step S14.
- Steps S12 to S15 are repeated until the reliability index value reaches N (%) or more in step S14.
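The retry loop of FIG. 14 can be sketched as below. The callables, the `jitter` parameter, and the toy reliability function are all hypothetical. The source does not say what happens if no parameter set reaches the threshold, so the fallback to the best attempt seen so far is an assumption of this sketch.

```python
def extract_with_parameter_search(audio, parameter_sets, extract_beats,
                                  align_beats, reliability, threshold_n):
    """Sketch of steps S11 to S16; callables and parameter sets are hypothetical.

    Returns (score, beats, params), or None if parameter_sets is empty.
    """
    best = None
    for params in parameter_sets:                   # S15: switch parameter set
        raw_beats = extract_beats(audio, params)    # S12: extraction
        aligned = align_beats(raw_beats, params)    # S13: alignment
        score = reliability(aligned)                # reliability index (%)
        if best is None or score > best[0]:
            best = (score, aligned, params)
        if score >= threshold_n:                    # S14: threshold test
            break                                   # proceed to S16
    # Assumption: if no set reaches N, fall back to the best attempt seen.
    return best


# Toy stand-ins: "jitter" is a made-up parameter that displaces the middle beat.
sets = [{"jitter": 0.1}, {"jitter": 0.05}, {"jitter": 0.0}]
score, beats, params = extract_with_parameter_search(
    audio=None,
    parameter_sets=sets,
    extract_beats=lambda _audio, p: [0.0, 0.5 + p["jitter"], 1.0],
    align_beats=lambda b, _p: b,
    reliability=lambda b: 100.0 if b[1] - b[0] == b[2] - b[1] else 50.0,
    threshold_n=90.0,
)
print(score, params)
```

Iterating over a finite list of prepared parameter sets guarantees termination, which the textual description of repeating steps S12 to S15 leaves implicit.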
- With the music playback device 10 equipped with the beat extraction device according to the present invention, even music that has no beat position information or time stamp information can be musically synchronized with other media.
- The data size of the beat position information and time stamp information is very small, from several kilobytes to several tens of kilobytes, about one thousandth of the data size of the audio waveform. This reduces the amount of memory and the number of processing steps required, so the user can handle the data very easily.
- Beats can be extracted accurately over the entire song, even for music whose tempo or rhythm changes.
- New entertainment can be created by synchronizing music with other media.
- The beat extraction device according to the present invention can be applied not only to the above-described personal computer and portable music player, but also to any type of device or electronic device.
- Beat position information of the rhythm in the music is extracted, beat period information is generated using the extracted beat position information, and the extracted beat position information is aligned based on the beat period information.
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Acoustics & Sound (AREA)
- Multimedia (AREA)
- Computational Linguistics (AREA)
- Signal Processing (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Human Computer Interaction (AREA)
- Auxiliary Devices For Music (AREA)
- Electrophonic Musical Instruments (AREA)
Priority Applications (4)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| US12/161,882 US8076566B2 (en) | 2006-01-25 | 2007-01-24 | Beat extraction device and beat extraction method |
| EP07707320A EP1978508A1 (en) | 2006-01-25 | 2007-01-24 | Beat extraction device and beat extraction method |
| KR1020087016468A KR101363534B1 (ko) | 2006-01-25 | 2007-01-24 | 비트 추출 장치 및 비트 추출 방법 |
| CN2007800035136A CN101375327B (zh) | 2006-01-25 | 2007-01-24 | 节拍提取设备和节拍提取方法 |
Applications Claiming Priority (2)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| JP2006-016801 | 2006-01-25 | ||
| JP2006016801A JP4949687B2 (ja) | 2006-01-25 | 2006-01-25 | ビート抽出装置及びビート抽出方法 |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| WO2007086417A1 (ja) | 2007-08-02 |
Family
ID=38309206
Family Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| PCT/JP2007/051073 Ceased WO2007086417A1 (ja) | 2006-01-25 | 2007-01-24 | ビート抽出装置及びビート抽出方法 |
Country Status (6)
| Country | Link |
|---|---|
| US (1) | US8076566B2 |
| EP (1) | EP1978508A1 |
| JP (1) | JP4949687B2 |
| KR (1) | KR101363534B1 |
| CN (1) | CN101375327B |
| WO (1) | WO2007086417A1 |
Cited By (3)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| JP2008283305A (ja) * | 2007-05-08 | 2008-11-20 | Sony Corp | ビート強調装置、音声出力装置、電子機器、およびビート出力方法 |
| JP2009294671A (ja) * | 2009-09-07 | 2009-12-17 | Sony Computer Entertainment Inc | オーディオ再生装置およびオーディオ早送り再生方法 |
| US9411882B2 (en) | 2013-07-22 | 2016-08-09 | Dolby Laboratories Licensing Corporation | Interactive audio content generation, delivery, playback and sharing |
Families Citing this family (38)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| JP4465626B2 (ja) * | 2005-11-08 | 2010-05-19 | ソニー株式会社 | 情報処理装置および方法、並びにプログラム |
| US7956274B2 (en) * | 2007-03-28 | 2011-06-07 | Yamaha Corporation | Performance apparatus and storage medium therefor |
| JP4311466B2 (ja) * | 2007-03-28 | 2009-08-12 | ヤマハ株式会社 | 演奏装置およびその制御方法を実現するプログラム |
| JP5266754B2 (ja) | 2007-12-28 | 2013-08-21 | ヤマハ株式会社 | 磁気データ処理装置、磁気データ処理方法および磁気データ処理プログラム |
| KR101230479B1 (ko) * | 2008-03-10 | 2013-02-06 | 프라운호퍼 게젤샤프트 쭈르 푀르데룽 데어 안겐반텐 포르슝 에. 베. | 트랜지언트 이벤트를 갖는 오디오 신호를 조작하기 위한 장치 및 방법 |
| US8344234B2 (en) * | 2008-04-11 | 2013-01-01 | Pioneer Corporation | Tempo detecting device and tempo detecting program |
| JP5337608B2 (ja) * | 2008-07-16 | 2013-11-06 | 本田技研工業株式会社 | ビートトラッキング装置、ビートトラッキング方法、記録媒体、ビートトラッキング用プログラム、及びロボット |
| JP2010054530A (ja) * | 2008-08-26 | 2010-03-11 | Sony Corp | 情報処理装置、発光制御方法およびコンピュータプログラム |
| US7915512B2 (en) * | 2008-10-15 | 2011-03-29 | Agere Systems, Inc. | Method and apparatus for adjusting the cadence of music on a personal audio device |
| JP2010114737A (ja) * | 2008-11-07 | 2010-05-20 | Kddi Corp | 携帯端末、拍位置修正方法および拍位置修正プログラム |
| JP5282548B2 (ja) * | 2008-12-05 | 2013-09-04 | ソニー株式会社 | 情報処理装置、音素材の切り出し方法、及びプログラム |
| US8889976B2 (en) * | 2009-08-14 | 2014-11-18 | Honda Motor Co., Ltd. | Musical score position estimating device, musical score position estimating method, and musical score position estimating robot |
| TWI484473B (zh) | 2009-10-30 | 2015-05-11 | Dolby Int Ab | 用於從編碼位元串流擷取音訊訊號之節奏資訊、及估算音訊訊號之知覺顯著節奏的方法及系統 |
| EP2328142A1 (en) | 2009-11-27 | 2011-06-01 | Nederlandse Organisatie voor toegepast -natuurwetenschappelijk onderzoek TNO | Method for detecting audio ticks in a noisy environment |
| US9159338B2 (en) * | 2010-05-04 | 2015-10-13 | Shazam Entertainment Ltd. | Systems and methods of rendering a textual animation |
| JP5569228B2 (ja) * | 2010-08-02 | 2014-08-13 | ソニー株式会社 | テンポ検出装置、テンポ検出方法およびプログラム |
| JP5594052B2 (ja) * | 2010-10-22 | 2014-09-24 | ソニー株式会社 | 情報処理装置、楽曲再構成方法及びプログラム |
| US9324377B2 (en) | 2012-03-30 | 2016-04-26 | Google Inc. | Systems and methods for facilitating rendering visualizations related to audio data |
| CN103971685B (zh) * | 2013-01-30 | 2015-06-10 | 腾讯科技(深圳)有限公司 | 语音命令识别方法和系统 |
| US9756281B2 (en) | 2016-02-05 | 2017-09-05 | Gopro, Inc. | Apparatus and method for audio based video synchronization |
| US9697849B1 (en) | 2016-07-25 | 2017-07-04 | Gopro, Inc. | Systems and methods for audio based synchronization using energy vectors |
| US9640159B1 (en) | 2016-08-25 | 2017-05-02 | Gopro, Inc. | Systems and methods for audio based synchronization using sound harmonics |
| US9653095B1 (en) | 2016-08-30 | 2017-05-16 | Gopro, Inc. | Systems and methods for determining a repeatogram in a music composition using audio features |
| JP6500869B2 (ja) * | 2016-09-28 | 2019-04-17 | カシオ計算機株式会社 | コード解析装置、方法、及びプログラム |
| US9916822B1 (en) | 2016-10-07 | 2018-03-13 | Gopro, Inc. | Systems and methods for audio remixing using repeated segments |
| JP6705422B2 (ja) * | 2017-04-21 | 2020-06-03 | ヤマハ株式会社 | 演奏支援装置、及びプログラム |
| CN108108457B (zh) * | 2017-12-28 | 2020-11-03 | 广州市百果园信息技术有限公司 | 从音乐节拍点中提取大节拍信息的方法、存储介质和终端 |
| JP7343268B2 (ja) * | 2018-04-24 | 2023-09-12 | 培雄 唐沢 | 任意信号挿入方法及び任意信号挿入システム |
| WO2019224990A1 (ja) * | 2018-05-24 | 2019-11-28 | ローランド株式会社 | ビート音発生タイミング生成装置 |
| CN109256146B (zh) * | 2018-10-30 | 2021-07-06 | 腾讯音乐娱乐科技(深圳)有限公司 | 音频检测方法、装置及存储介质 |
| CN113302679B (zh) * | 2019-01-23 | 2025-02-11 | 索尼集团公司 | 信息处理系统、信息处理方法和程序 |
| CN111669497A (zh) * | 2020-06-12 | 2020-09-15 | 杭州趣维科技有限公司 | 一种移动端自拍时音量驱动贴纸效果的方法 |
| CN113411663B (zh) * | 2021-04-30 | 2023-02-21 | 成都东方盛行电子有限责任公司 | 一种用于非编工程中的音乐节拍提取方法 |
| CN113590872B (zh) * | 2021-07-28 | 2023-11-28 | 广州艾美网络科技有限公司 | 跳舞谱面生成的方法、装置以及设备 |
| JP7786153B2 (ja) * | 2021-11-24 | 2025-12-16 | ヤマハ株式会社 | 楽曲推論装置、楽曲推論方法、楽曲推論プログラム、モデル生成装置、モデル生成方法、及びモデル生成プログラム |
| WO2025041587A1 (ja) * | 2023-08-23 | 2025-02-27 | ソニーグループ株式会社 | 情報処理装置及び情報処理方法 |
| CN119961484B (zh) * | 2025-04-10 | 2025-07-04 | 四川师范大学 | 一种民族舞蹈数字化展示系统 |
| CN120748450B (zh) * | 2025-09-03 | 2025-11-21 | 港湾之星健康生物(深圳)有限公司 | VEM-Token节拍捕捉和对齐模型建构的方法 |
Citations (6)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| JPH0366528B2 | 1984-10-19 | 1991-10-17 | Fuji Valve | |
| JPH06290574A (ja) * | 1993-03-31 | 1994-10-18 | Victor Co Of Japan Ltd | 楽曲検索装置 |
| JP2002116754A (ja) | 2000-07-31 | 2002-04-19 | Matsushita Electric Ind Co Ltd | テンポ抽出装置、テンポ抽出方法、テンポ抽出プログラム及び記録媒体 |
| JP2002278547A (ja) * | 2001-03-22 | 2002-09-27 | Matsushita Electric Ind Co Ltd | 楽曲検索方法、楽曲検索用データ登録方法、楽曲検索装置及び楽曲検索用データ登録装置 |
| JP2003108132A (ja) * | 2001-09-28 | 2003-04-11 | Pioneer Electronic Corp | オーディオ情報再生装置及びオーディオ情報再生システム |
| JP2003263162A (ja) * | 2002-03-07 | 2003-09-19 | Yamaha Corp | 音楽データのテンポ推定方法および装置 |
Family Cites Families (10)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| JPH0366528A (ja) | 1989-08-02 | 1991-03-22 | Fujitsu Ltd | ロボットハンド |
| JP3066528B1 (ja) | 1999-02-26 | 2000-07-17 | コナミ株式会社 | 楽曲再生システム、リズム解析方法及び記録媒体 |
| JP4186298B2 (ja) | 1999-03-17 | 2008-11-26 | ソニー株式会社 | リズムの同期方法及び音響装置 |
| KR100365989B1 (ko) * | 2000-02-02 | 2002-12-26 | 최광진 | 가상 음악 영상 시스템 및 그 시스템의 영상 표시 방법 |
| US7035873B2 (en) | 2001-08-20 | 2006-04-25 | Microsoft Corporation | System and methods for providing adaptive media property classification |
| DE60237860D1 (de) * | 2001-03-22 | 2010-11-18 | Panasonic Corp | Schallmerkmalermittlungsgerät, Schalldatenregistrierungsgerät, Schalldatenwiederauffindungsgerät und Verfahren und Programme zum Einsatz derselben |
| US6518492B2 (en) * | 2001-04-13 | 2003-02-11 | Magix Entertainment Products, Gmbh | System and method of BPM determination |
| DE10123366C1 (de) | 2001-05-14 | 2002-08-08 | Fraunhofer Ges Forschung | Vorrichtung zum Analysieren eines Audiosignals hinsichtlich von Rhythmusinformationen |
| CN1206603C (zh) * | 2001-08-30 | 2005-06-15 | 无敌科技股份有限公司 | 音乐音频产生方法与播放系统 |
| JP4243682B2 (ja) | 2002-10-24 | 2009-03-25 | 独立行政法人産業技術総合研究所 | 音楽音響データ中のサビ区間を検出する方法及び装置並びに該方法を実行するためのプログラム |
2006
- 2006-01-25: JP application JP2006016801A, patent JP4949687B2 (ja), not active (Expired - Fee Related)

2007
- 2007-01-24: EP application EP07707320A, patent EP1978508A1 (en), not active (Withdrawn)
- 2007-01-24: WO application PCT/JP2007/051073, patent WO2007086417A1 (ja), not active (Ceased)
- 2007-01-24: CN application CN2007800035136A, patent CN101375327B (zh), not active (Expired - Fee Related)
- 2007-01-24: US application US12/161,882, patent US8076566B2 (en), not active (Expired - Fee Related)
- 2007-01-24: KR application KR1020087016468A, patent KR101363534B1 (ko), not active (Expired - Fee Related)
Also Published As
| Publication number | Publication date |
|---|---|
| US8076566B2 (en) | 2011-12-13 |
| US20090056526A1 (en) | 2009-03-05 |
| KR101363534B1 (ko) | 2014-02-14 |
| JP4949687B2 (ja) | 2012-06-13 |
| CN101375327A (zh) | 2009-02-25 |
| KR20080087112A (ko) | 2008-09-30 |
| EP1978508A1 (en) | 2008-10-08 |
| JP2007199306A (ja) | 2007-08-09 |
| CN101375327B (zh) | 2012-12-05 |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| JP4949687B2 (ja) | ビート抽出装置及びビート抽出方法 | |
| US7534951B2 (en) | Beat extraction apparatus and method, music-synchronized image display apparatus and method, tempo value detection apparatus, rhythm tracking apparatus and method, and music-synchronized display apparatus and method | |
| KR101292698B1 (ko) | 메타데이터 부여 방법 및 장치 | |
| WO2004029927A2 (en) | System and method for generating an audio thumbnail of an audio track | |
| WO2009038316A2 (en) | The karaoke system which has a song studying function | |
| US20170047094A1 (en) | Audio information processing | |
| JP2003177784A (ja) | 音響変節点抽出装置及びその方法、音響再生装置及びその方法、音響再生システム、音響配信システム、情報提供装置、音響信号編集装置、音響変節点抽出方法プログラム記録媒体、音響再生方法プログラム記録媒体、音響信号編集方法プログラム記録媒体、音響変節点抽出方法プログラム、音響再生方法プログラム、音響信号編集方法プログラム | |
| JP2002215195A (ja) | 音楽信号処理装置 | |
| JPH07295560A (ja) | Midiデータ編集装置 | |
| Monti et al. | Monophonic transcription with autocorrelation | |
| JP2009063714A (ja) | オーディオ再生装置およびオーディオ早送り再生方法 | |
| JP2005107329A (ja) | カラオケ装置 | |
| JP5338312B2 (ja) | 自動演奏同期装置、自動演奏鍵盤楽器およびプログラム | |
| JP4048249B2 (ja) | カラオケ装置 | |
| JP4537490B2 (ja) | オーディオ再生装置およびオーディオ早送り再生方法 | |
| JP2005107332A (ja) | カラオケ装置 | |
| JP2007121917A (ja) | マルチスタンダード採点を行うカラオケ採点装置 | |
| JP2002215163A (ja) | 波形データ解析方法、波形データ解析装置および記録媒体 | |
| US20070051228A1 (en) | Method and Apparatus for Playing in Synchronism with a DVD an Automated Musical Instrument | |
| Driedger | Time-scale modification algorithms for music audio signals | |
| JP2002358078A (ja) | 音楽ソース同期回路および音楽ソース同期方法 | |
| JP2000305600A (ja) | 音声信号処理装置及び方法、情報媒体 | |
| JP2008225111A (ja) | カラオケ装置及びプログラム | |
| KR20080051896A (ko) | 가라오케 시스템에서의 노래 점수 계산장치 및 방법 | |
| JPS61162097A (ja) | 伴奏音楽再生装置 |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | 121 | Ep: the epo has been informed by wipo that ep was designated in this application | |
| | WWE | Wipo information: entry into national phase | Ref document number: 5594/DELNP/2008; Country of ref document: IN |
| | WWE | Wipo information: entry into national phase | Ref document number: 2007707320; Country of ref document: EP |
| | WWE | Wipo information: entry into national phase | Ref document number: 1020087016468; Country of ref document: KR |
| | WWE | Wipo information: entry into national phase | Ref document number: 200780003513.6; Country of ref document: CN |
| | NENP | Non-entry into the national phase | Ref country code: DE |
| | WWE | Wipo information: entry into national phase | Ref document number: 12161882; Country of ref document: US |