WO2009104269A1 - Music discrimination apparatus, music discrimination method, music discrimination program, and recording medium - Google Patents
- Publication number
- WO2009104269A1 (PCT/JP2008/053031)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- music
- harmony
- intelligibility
- pitch
- level
- Prior art date
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10H—ELECTROPHONIC MUSICAL INSTRUMENTS; INSTRUMENTS IN WHICH THE TONES ARE GENERATED BY ELECTROMECHANICAL MEANS OR ELECTRONIC GENERATORS, OR IN WHICH THE TONES ARE SYNTHESISED FROM A DATA STORE
- G10H1/00—Details of electrophonic musical instruments
- G10H1/0008—Associated control or indicating means
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
- G10L25/48—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10H—ELECTROPHONIC MUSICAL INSTRUMENTS; INSTRUMENTS IN WHICH THE TONES ARE GENERATED BY ELECTROMECHANICAL MEANS OR ELECTRONIC GENERATORS, OR IN WHICH THE TONES ARE SYNTHESISED FROM A DATA STORE
- G10H2210/00—Aspects or methods of musical processing having intrinsic musical character, i.e. involving musical theory or musical parameters or relying on musical knowledge, as applied in electrophonic musical tools or instruments
- G10H2210/031—Musical analysis, i.e. isolation, extraction or identification of musical elements or musical parameters from a raw acoustic signal or from an encoded audio signal
- G10H2210/066—Musical analysis, i.e. isolation, extraction or identification of musical elements or musical parameters from a raw acoustic signal or from an encoded audio signal for pitch analysis as part of wider processing for musical purposes, e.g. transcription, musical performance evaluation; Pitch recognition, e.g. in polyphonic sounds; Estimation or use of missing fundamental
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10H—ELECTROPHONIC MUSICAL INSTRUMENTS; INSTRUMENTS IN WHICH THE TONES ARE GENERATED BY ELECTROMECHANICAL MEANS OR ELECTRONIC GENERATORS, OR IN WHICH THE TONES ARE SYNTHESISED FROM A DATA STORE
- G10H2240/00—Data organisation or data communication aspects, specifically adapted for electrophonic musical tools or instruments
- G10H2240/075—Musical metadata derived from musical analysis or for use in electrophonic musical instruments
- G10H2240/085—Mood, i.e. generation, detection or selection of a particular emotional content or atmosphere in a musical piece
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10H—ELECTROPHONIC MUSICAL INSTRUMENTS; INSTRUMENTS IN WHICH THE TONES ARE GENERATED BY ELECTROMECHANICAL MEANS OR ELECTRONIC GENERATORS, OR IN WHICH THE TONES ARE SYNTHESISED FROM A DATA STORE
- G10H2240/00—Data organisation or data communication aspects, specifically adapted for electrophonic musical tools or instruments
- G10H2240/171—Transmission of musical instrument data, control or status information; Transmission, remote access or control of music data for electrophonic musical instruments
- G10H2240/201—Physical layer or hardware aspects of transmission to or from an electrophonic musical instrument, e.g. voltage levels, bit streams, code words or symbols over a physical link connecting network nodes or instruments
- G10H2240/241—Telephone transmission, i.e. using twisted pair telephone lines or any type of telephone network
- G10H2240/251—Mobile telephone transmission, i.e. transmitting, accessing or controlling music data wirelessly via a wireless or mobile telephone receiver, analog or digital, e.g. DECT GSM, UMTS
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10H—ELECTROPHONIC MUSICAL INSTRUMENTS; INSTRUMENTS IN WHICH THE TONES ARE GENERATED BY ELECTROMECHANICAL MEANS OR ELECTRONIC GENERATORS, OR IN WHICH THE TONES ARE SYNTHESISED FROM A DATA STORE
- G10H2250/00—Aspects of algorithms or signal processing methods without intrinsic musical character, yet specifically adapted for or used in electrophonic musical processing
- G10H2250/131—Mathematical functions for musical analysis, processing, synthesis or composition
- G10H2250/215—Transforms, i.e. mathematical transforms into domains appropriate for musical signal processing, coding or compression
- G10H2250/235—Fourier transform; Discrete Fourier Transform [DFT]; Fast Fourier Transform [FFT]
Definitions
- the present application relates to the technical field of music discrimination apparatuses and the like capable of discriminating the tune of a piece of music.
- as one means for searching for music, the tune of each piece is determined and music is searched for by its tune.
- the tune of a piece of music refers to the impression the music gives to the person who listens to it.
- for example, a piece with a high speed (tempo) composed of high sounds shows a bright and lively tune.
- a piece of music is thus characterized by its tune, which is used when searching for music.
- Patent Document 1 (JP-A-6-290574) discloses an invention that classifies the types of chords in a piece of music and analyzes the composition of the music.
- even if chords are used in a musical composition, they may not be clearly audible because other heavily distorted sounds or rhythm sounds (such as a distorted electric guitar, bass, or drums) are used at the same time.
- the melody of such music is rich in change and gives an intense, exciting impression. Therefore, even if the tune of music is searched for using the chord composition of the music as an index, music that matches the user's request cannot be selected.
- with Patent Document 1, even if the similarity of the chord compositions of pieces of music can be extracted, the tune of the music cannot be known. Therefore, even when attempting to search, based on chord structure, for music that gives comfort and calm, music that meets the user's request could not be selected.
- the present application aims to provide a music discrimination apparatus, music discrimination method, music discrimination program, recording medium, and the like that provide the music and music information desired by the user, taking the elimination of such problems as one of its objects.
- the music discrimination apparatus includes: pitch power addition level calculation means for calculating a pitch power addition level from input music data; harmony intelligibility calculation means for calculating, based on the calculated pitch power addition level, a harmony intelligibility indicating the degree to which harmony can be heard clearly by the sense of hearing; and tune discrimination means for discriminating the tune of the music using the harmony intelligibility.
- the music discrimination method includes a pitch power addition level calculation step of calculating a pitch power addition level from input music data, and a step of calculating, based on the calculated pitch power addition level, a harmony intelligibility indicating the degree to which harmony can be heard clearly.
- the music discrimination program causes a computer included in a music discrimination apparatus to function as: pitch power addition level calculation means for calculating a pitch power addition level from input music data; harmony intelligibility calculation means for calculating, based on the calculated pitch power addition level, a harmony intelligibility indicating the degree to which harmony can be heard clearly; and tune discrimination means for discriminating the tune of the music using the harmony intelligibility.
- FIG. 1 shows the music data (signal) after FFT conversion and the FFT power addition level ANP (p): (A) shows the music data (signal) after FFT conversion, and (B) shows the pitch power addition level.
- FIG. 2 (A)-t1 shows the pitch power addition level for each pitch in the minute time t1, (A)-t2 shows it for the minute time t2, and (A)-t shows it for the minute time t.
- FIG. 3 is a flowchart showing an operation of the information reproducing apparatus S. A further flowchart shows the music tune discrimination.
- FIG. 6 shows the harmony intelligibility and low-frequency beat level values when music A is reproduced with a signal power (SignalPower) of about 100, and the values when the signal power is halved to about 50.
- SignalPower Signal Power
- FIG. 10 shows the harmony intelligibility and low-frequency beat level values when music A is reproduced with a signal power (SignalPower) of about 20, and the values when the signal power is raised about 40 times.
- "harmony intelligibility" is defined as an index indicating the degree to which a chord can be heard clearly by the sense of hearing.
- a song whose harmony can be heard clearly has a clear and beautiful sound and shows a healing tune that gives the listener comfort and calm, while a song whose harmony cannot be heard clearly shows an intense and powerful tune. The inventors of the present invention therefore focused on the fact that the tune of music varies depending on whether the harmony can be heard clearly, defined "harmony intelligibility" as a new index indicating the tune of a song, and discriminated the tune of a song by the value of the harmony intelligibility.
- the power distribution for each pitch is calculated from the music data, the harmony intelligibility and the like are calculated from that power distribution, and the tune of the music is discriminated based on the calculated harmony intelligibility.
- a signal level (amplitude) power spectrum F (n) of a predetermined bandwidth (Hz) is calculated by FFT conversion or the like from the input music data, and the pitch power of each pitch (Hz) is calculated from that result. A "pitch power addition level" is then calculated by weighting the pitch power of each pitch and adding them within the range of one octave.
- first, the harmony intelligibility is calculated for a piece of music in which chords can be heard clearly.
- this music makes heavy use of instrumental sounds with harmonic components, such as a piano, synthesizer, or stringed instrument. It gives an impression of clarity, beauty, or calm to the ear, and its harmonies (for example, chords) resonate clearly.
- FFT Fast Fourier transform
- ⁇ t minute time
- N-point FFT: the FFT conversion result in this embodiment.
- the FFT is the fast Fourier transform, a process that extracts which frequency components are contained in a signal. Since the FFT is a known technique, a detailed description is omitted.
- a point refers to a representative of each of the ranges into which all the frequency components contained in the music data are divided by a predetermined bandwidth (Hz); N points means that there are N such ranges, each represented by one point.
- FIG. 1A shows music data (signal) after FFT conversion.
- the horizontal axis 10 represents the frequency range to which the signal extracted by the FFT belongs, and the vertical axis 11 represents the power spectrum F (n), indicating the energy the signal contains at each frequency.
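The power spectrum described above can be sketched as follows. This is only an illustrative Python sketch using a naive DFT for clarity; the patent uses an FFT, and the signal and bin layout here are assumptions, not the patent's implementation.

```python
import math
import cmath

def power_spectrum(signal):
    # Power spectrum |F(n)|^2 for each frequency point n, computed with
    # a naive DFT; a real implementation would use an FFT library.
    N = len(signal)
    spectrum = []
    for n in range(N // 2):  # non-redundant half for a real signal
        F_n = sum(signal[t] * cmath.exp(-2j * math.pi * n * t / N)
                  for t in range(N))
        spectrum.append(abs(F_n) ** 2)
    return spectrum

# A pure tone concentrates its power in a single frequency point.
N = 64
tone = [math.cos(2 * math.pi * 4 * t / N) for t in range(N)]
ps = power_spectrum(tone)
peak_bin = max(range(len(ps)), key=lambda n: ps[n])
```

Here `peak_bin` comes out as point 4, the bin of the tone, illustrating how each FFT point represents the energy in one frequency range.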
- the pitch power of each pitch is calculated within the M octave range.
- the frequency range (Hz) of the N-point FFT result is divided into octave groups, each covering an arbitrary frequency range (Hz) of one octave (M octaves in this embodiment). Each octave group is further divided at arbitrary points (Hz); the divided points are set as pitches (Hz), and the pitch power at each pitch is calculated.
- specifically, the frequency band 220 Hz to 420 Hz of the N-point FFT is first taken as one octave group, and this octave is divided into 12 equal parts on a log scale (of the frequency characteristic graph), each divided point being defined as a pitch. Letting k be the 12th root of 2, the frequency of a pitch multiplied by k gives the frequency of the next pitch.
- pitch A is 220 Hz
- pitch A # is 233 Hz (220 * k)
- pitch B is 247 Hz (220 * k^2)
- pitch C is 261 Hz (220 * k^3)
- pitch C#, pitch D, pitch D#, pitch E, pitch F, pitch F#, and pitch G follow in turn, and the last pitch G# is 415 Hz (220 * k^11).
- in the next octave group, pitch A# is 466 Hz (440 * k)
- pitch B is 494 Hz (440 * k^2)
- pitch C is 523 Hz (440 * k^3)
- and the final pitch G# is 830 Hz (440 * k^11); the frequency points are classified in this way.
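The pitch classification above can be sketched in Python. The base frequencies and the ratio k follow the text; the dictionary layout and function name are only illustrative.

```python
K = 2 ** (1 / 12)  # k, the 12th root of 2: ratio between adjacent pitches

def pitch_frequencies(base_hz):
    # The 12 pitches of one octave group, A through G#, each obtained
    # by multiplying the previous pitch's frequency by k.
    names = ["A", "A#", "B", "C", "C#", "D", "D#",
             "E", "F", "F#", "G", "G#"]
    return {name: base_hz * K ** i for i, name in enumerate(names)}

octave_220 = pitch_frequencies(220.0)  # 220 Hz (A) .. about 415 Hz (G#)
octave_440 = pitch_frequencies(440.0)  # 440 Hz (A) .. about 830 Hz (G#)
```

For example, `octave_220["A#"]` is about 233 Hz and `octave_440["G#"]` about 830 Hz, matching the values listed above.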
- the pitch power is expressed by, for example, the formula (1).
- F (p) represents power at the FFT point p
- fpos (m) represents an FFT point corresponding to a frequency of an arbitrary pitch m
- NP (m) represents the pitch power of an arbitrary pitch m.
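A minimal sketch of the pitch power calculation follows, built only from the definitions of F (p), fpos (m), and NP (m) above. Equation (1) itself is not reproduced in this text, so the exact window of FFT points summed around fpos (m) is an assumption.

```python
def pitch_power(F, fpos_m, half_width=1):
    # NP(m): the FFT power F(p) summed over the points p around the
    # point fpos(m) that corresponds to pitch m.  The half_width of the
    # summation window is an assumption, not taken from Equation (1).
    lo = max(0, fpos_m - half_width)
    hi = min(len(F) - 1, fpos_m + half_width)
    return sum(F[p] for p in range(lo, hi + 1))

# Toy spectrum with power concentrated at FFT point 2.
F = [0.0, 1.0, 5.0, 1.0, 0.0, 0.0]
np_m = pitch_power(F, fpos_m=2)
```

With this toy spectrum, `np_m` gathers the power of the bins around the pitch's point.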
- next, the pitch power addition level is calculated. Specifically, the pitch powers calculated for each octave as described above are weighted and added within the range of one octave (the result is hereinafter referred to as the "pitch power addition level"). In this way, the pitch power is concentrated within the range of one octave.
- W (i) represents a weighting, which has the effect of suppressing the adverse influence of noise components in the high frequency band. For example, since high-frequency noise is likely to be included in the high frequency band, the weighting there is reduced (the value of W (i) is made small). The weighting may be defined for each arbitrary octave, or the addition may be performed for each integer octave, for example.
- in Equation (2), each pitch power is multiplied by its weighting and summed, so that the pitch power over an arbitrary number of octaves is aggregated within the range of one octave. In this way, the pitch power addition level ANP (p) is calculated for each pitch; ANP (p) is the power for each pitch.
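The folding of weighted octaves into one octave can be sketched as follows; this follows the verbal description of Equation (2) (the equation itself is not reproduced here), and the toy pitch powers and weights are illustrative.

```python
def pitch_power_addition_level(NP, W):
    # ANP(p): for each of the 12 pitches p, the pitch power NP[i][p] of
    # octave i is weighted by W[i] and the weighted powers are added,
    # concentrating all octaves into the range of one octave.
    num_octaves = len(NP)
    return [sum(W[i] * NP[i][p] for i in range(num_octaves))
            for p in range(12)]

# Two octaves of toy pitch powers; the higher octave is down-weighted,
# as the text suggests, to suppress high-frequency noise.
NP = [[1.0] * 12, [2.0] * 12]
W = [1.0, 0.5]
ANP = pitch_power_addition_level(NP, W)
```

Each entry of `ANP` is the weighted sum of the corresponding pitch across both octaves.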
- FIG. 1B shows the pitch power addition level ANP (p).
- the horizontal axis 12 represents the pitches A to G # as the pitches, and the vertical axis 13 represents the pitch power addition level ANP (p).
- the harmony intelligibility is calculated.
- the deviation of the pitch power aggregated within one octave range is calculated.
- the deviation of the pitch power aggregated within the one-octave range is defined as the harmony intelligibility CCV.
- that is, as the harmony intelligibility, the deviation within the one-octave range of the calculated pitch power addition levels ANP (p) is calculated: the sum of the squares of the differences between each ANP (p) and the average value of the ANP (p) of all pitches.
- harmony intelligibility is an index indicating whether harmony can be heard clearly. It is not limited to the deviation of the pitch power addition levels; any index may be used that indicates the magnitude of the variation, such as the variance, the differences between pitch power addition levels, or whether there is a pitch power addition level with conspicuously large power.
- Expression (4) generalizes the constant term and omits the average value calculation of the pitch power addition level ANP (p).
- Equation (5) is obtained by omitting the square operation.
- CCV indicates harmony intelligibility
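The basic CCV can be sketched directly from the verbal description above (sum of squared differences from the average of the ANP (p)); the chord and noise examples are illustrative, not taken from the patent.

```python
def harmony_intelligibility(ANP):
    # CCV: the sum of squared deviations of the pitch power addition
    # levels from their mean within one octave, following the verbal
    # description of Equation (3) in the text.
    mean = sum(ANP) / len(ANP)
    return sum((a - mean) ** 2 for a in ANP)

# A few protruding pitches (a clearly audible chord) give a large CCV ...
chord = [9.0, 0.5, 0.5, 9.0, 0.5, 0.5, 0.5, 9.0, 0.5, 0.5, 0.5, 0.5]
# ... while power spread evenly over all pitches gives a CCV of zero.
flat = [3.0] * 12
```

A flat distribution, as in music where distorted or rhythm sounds mask the chord, yields a small CCV, while a protruding chord yields a large one.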
- the number of pitches that make up a harmony is not always constant, and the CCV fluctuates according to that number. Therefore, as in Equation (6), the difference between the average value of the protruding pitch powers and the average value of the other pitch powers may be used as the harmony intelligibility.
- UpAvr is an average value of protruding pitch power
- DnAvr is an average value of other pitch powers.
- this CCV has a large value when a beautiful chord can be heard clearly, but it also has a large value when an ugly chord can be heard clearly. Since an ugly chord is not a harmony, in such a case the harmony intelligibility must be reduced using the coefficient X shown in Equation (7).
- X is in the range 0 to 1 and is determined according to the set of protruding pitches: X is set to a large value if the set is a chord that can be regarded as harmony, such as a consonance, and to a small value if it is a chord that cannot, such as a dissonance. Alternatively, by comparing the protruding pitch set with all music-theoretical consonances and identifying the most likely consonance, the remaining pitches can be regarded as noise with respect to the harmony. X is then set to a small value when the sum of the pitch powers outside the consonance is large, and to a large value when it is small.
- the harmony intelligibility calculated using Expression (3) is a certain instantaneous value in the music data.
- the harmony intelligibility of the music data is calculated at each time, and its change is measured (measurement of the harmony intelligibility in the time direction).
- FIG. 2A shows an image of calculating the time direction transition of the pitch power addition level for each pitch in a certain time of the music data.
- the pitch power addition level for each pitch at the minute time t1 is calculated using, for example, the equation (3), and this is calculated until the minute time t.
- the time direction transition of the pitch power addition level for each pitch from the fixed period t1 to t is calculated.
- FIG. 2 (B) shows the pitch power addition level for each pitch over a fixed time of the music data using four parameters.
- the horizontal axis 14 represents time, and represents the time from the minute time t1 to t in FIG.
- the vertical axis 15 indicates each pitch.
- the pitch power addition level 16 is shown in shades, the density of the shading indicating the magnitude of the level.
- the time-direction transition of the harmony intelligibility is calculated by applying, for example, Expression (3) to the pitch power addition levels calculated for each pitch at each minute time of the music data, giving the harmony intelligibility of each minute time.
- the harmony intelligibility is calculated from the minute time t1 to t, which is a fixed period.
- in this way, the time-direction transition of the harmony intelligibility is obtained.
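The frame-by-frame procedure can be sketched as follows, assuming a pitch power addition level vector is already available for each minute time t1 .. t; the frame contents are illustrative.

```python
def ccv_transition(anp_frames):
    # Time-direction transition of the harmony intelligibility: the CCV
    # (sum of squared deviations from the mean, as described for
    # Expression (3)) computed for the pitch power addition levels of
    # each minute time t1 .. t.
    transition = []
    for anp in anp_frames:
        mean = sum(anp) / len(anp)
        transition.append(sum((a - mean) ** 2 for a in anp))
    return transition

# A clear chord frame followed by an evenly spread frame: the CCV drops.
frames = [
    [9.0, 0.5, 0.5, 9.0, 0.5, 0.5, 0.5, 9.0, 0.5, 0.5, 0.5, 0.5],
    [3.0] * 12,
]
trace = ccv_transition(frames)
```

Plotting `trace` against time gives the kind of temporal transition shown in FIG. 2 (C).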
- FIG. 2C shows the temporal transition of harmony intelligibility (CCV).
- the vertical axis 15 represents the degree of harmony intelligibility.
- the pitch power addition levels ANP (p) of the pitches forming a chord protrude and are large, while the ANP (p) of the other pitches are low.
- the standard deviation of the pitch power addition level is expected to increase.
- the calculation result of the harmony intelligibility, which is the standard deviation of the pitch power addition level for each pitch of the music data, is also large, as predicted.
- as a result of calculating the harmony intelligibility for music in which chords are audibly clear and which has a healing tune giving comfort and calm, the harmony intelligibility showed a high value.
- since a certain chord can be heard in music with high harmony intelligibility, the calculation result matches the auditory impression of the chords actually obtained when listening to the music.
- the harmony intelligibility is calculated for music in which chords cannot be heard clearly.
- this piece makes heavy use of instrumental sounds containing many inharmonic or noise components, such as percussion instruments or effected electronic instruments (such as a distorted electric guitar); its tune is intense, noisy, or powerful, its sense of harmony is low (no chords are perceived by the ear), and it emphasizes groove and rhythm.
- the calculation method is the same as for the time-direction transition in the case where the harmony intelligibility is large (FIGS. 1A to 1C).
- using the result of the time-direction transition of the harmony intelligibility, the tune of the music can be discriminated as "healing" or "intense".
- FIG. 4 is a diagram illustrating a schematic configuration example of the information reproducing / recording apparatus according to the present embodiment.
- the information reproduction / recording apparatus S includes a reproduction processing unit 1, an external output unit 2, a recording unit 3, a system control unit 4, a communication unit 5, and the like.
- under the control of the system control unit 4, the playback processing unit 1 reproduces music data recorded on a recording medium such as a CD (Compact Disc), MD (Mini Disc), DVD (Digital Versatile Disc), or card-type recording medium (for example, a Memory Stick or SD card), and outputs the music data to the external output unit 2.
- the external output unit 2 includes a DSP (Digital Signal Processor), an amplifier, a speaker, and the like.
- the external output unit 2 performs known acoustic processing on the music data reproduced by the reproduction processing unit 1 and outputs the audio to the outside through the amplifier and the speaker.
- DSP Digital Signal Processor
- the recording unit 3 includes a recording device such as a hard disk drive. Under the control of the system control unit 4, the music data output from the reproduction processing unit 1 is, for example, compressed and recorded in a predetermined file format, and at the same time accompanying information (for example, a music ID (music identification information), the music title, and the name of the album on which the music is recorded) is recorded on the recording medium.
- the music data can be downloaded from the music distribution server connected to the Internet via the communication unit 7 together with the accompanying information.
- the accompanying information can be downloaded from a server having a CDDB (CD Data Base) connected to the Internet using, for example, TOC (Table Of Contents) information corresponding to each piece of music data as a key.
- CDDB CD Data Base
- TOC Table Of Contents
- the system control unit 4 includes a CPU having a calculation function, a working RAM, and a ROM that stores various processing programs (including the display control program of the present application) and data.
- the CPU executes the programs stored in the ROM and the like.
- the system control unit 4 functions as the pitch power addition level calculation means, harmony intelligibility calculation means, bass beat level detection means, and tune discrimination means of the present application.
- the system control unit 4 calculates the pitch power addition level from the music data input from the reproduction processing unit 1 or the recording unit 3, calculates the harmony intelligibility from the calculated pitch power addition level, and then discriminates the tune of the music based on the calculated harmony intelligibility.
- the system control unit 4 calculates a signal level (amplitude) power spectrum F (n) of a predetermined bandwidth (Hz) by FFT conversion or the like from the input music data, and calculates the pitch power of each pitch (Hz) from the result. Then, a "pitch power addition level" is calculated in which the pitch power of each pitch is weighted and added within the range of one octave.
- the system control unit 4 calculates, from the "pitch power addition level" calculated by the pitch power addition level calculation means, the deviation within the range of one octave (the sum of the squared differences from the average value of the calculated pitch power addition levels ANP (p) of the pitches).
- system control unit 4 determines the tune of the music using the harmony intelligibility and the like.
- FIG. 2 is a flowchart showing the operation of the information reproducing / recording apparatus S.
- step S1 When music data is input from the reproduction processing unit 1 or the like (step S1), the system control unit 4 performs FFT conversion on the music data at N points (step S2). Next, the pitch power of each pitch is calculated within the M octave range (step S3), and the pitch power addition level is calculated (step S4). Next, the harmony intelligibility is calculated (step S5), and finally, the temporal direction transition of the harmony intelligibility is calculated (step S6).
- the tune of the music is discriminated according to the calculated harmony intelligibility.
- the tune of the music discriminated in this way is stored in the recording unit 3 or the like in association with the music, for example in a music table. Then, when searching for music, the tune is displayed by referring to the music table so that the user can identify it.
- FIG. 6 is a flowchart showing tune determination based on harmony intelligibility and low-frequency beat level.
- the low-frequency beat level is added to the flowchart showing the operation of the information reproducing / recording apparatus S shown in FIG.
- step S11 the harmony intelligibility is calculated.
- the calculation of harmony intelligibility is as described in the flowchart showing the operation of the information reproducing / recording apparatus S shown in FIG.
- a low-frequency beat level is calculated.
- the low-frequency beat level indicates a volume level that constitutes a rhythm part of a musical piece such as a drum or a bass.
- a sound constituting a rhythm part of a musical piece such as a drum or a bass is in a low frequency range compared to other sounds. Therefore, here, these volume levels are collectively referred to as a low-frequency beat level.
- specifically, the low-frequency beat level is the level of the low-frequency signal components of the music.
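The low-frequency beat level can be sketched as the power in the bins below a cutoff. The 250 Hz cutoff and bin layout are assumptions; the text only says that drum and bass rhythm parts lie in a low frequency range compared with other sounds.

```python
def low_frequency_beat_level(power_spectrum, bin_width_hz,
                             cutoff_hz=250.0):
    # Volume level of the low frequency range, where the rhythm parts
    # (drums, bass) concentrate.  cutoff_hz is an assumed parameter.
    n_bins = int(cutoff_hz / bin_width_hz)
    return sum(power_spectrum[:n_bins])

# Toy spectrum: strong power in the two lowest 100 Hz bins, as for a
# piece with prominent drum and bass parts.
spectrum = [10.0, 10.0, 1.0, 1.0, 1.0, 1.0]
level = low_frequency_beat_level(spectrum, bin_width_hz=100.0)
```

Here `level` aggregates only the low bins, so a rhythm-heavy piece scores high while a vocal a cappella piece like music D would score low.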
- in step S13, the time-direction transitions of the calculated harmony intelligibility and low-frequency beat level are calculated.
- the time direction transition may be calculated for the entire music piece or a part thereof.
- FIGS. 7A to 7D show time-series changes in harmony intelligibility and low-frequency beat level of four types of music.
- the horizontal axis 17 of the graph represents the time direction
- the vertical axis 18 represents the temporal direction transition of the harmony intelligibility and the low-frequency beat level.
- the values on the vertical axis are normalized by an arbitrary value, in the same manner for each piece of music shown in FIGS. 7A to 7D, so their relative magnitudes can be compared.
- a solid line portion 19 in the graph indicates harmony intelligibility
- a broken line portion 20 indicates a low-frequency beat level.
- the music A shown in FIG. 7A is recognized as a lively and rock-like music tone in terms of human hearing.
- when the tune of the music A is discriminated using the harmony intelligibility and the low-frequency beat level, the harmony intelligibility of the music A is calculated to fluctuate around 30 and the low-frequency beat level around 80. Since the harmony intelligibility is low and the low-frequency beat level is high, the tune is discriminated as giving an intense impression.
- the tune that is recognized from the sense of hearing matches the tune that is discriminated by calculating the harmony intelligibility and the low-frequency beat level.
- Music B shown in FIG. 7 (B) is composed of only piano and vocals until the middle of the music, and is a music in which chords can be heard clearly.
- the rhythm parts such as drums are played after the middle of the song.
- the harmony intelligibility shows a very high value of about 80 until the middle of the music.
- the low-frequency beat level is as low as about 20 until the middle of the song.
- the music B has a beautiful sound of chords and shows a quiet impression as a music tone.
- Music C shown in FIG. 7 (C) is composed of a band performance including many sound components such as vocals, keyboard, drums, bass, and guitars, and is heard as having a rhythmical tune.
- the harmony intelligibility of the music piece C is as high as about 60, and the low-frequency beat level is also high as about 60, so that the music tone is easy to hear and shows a rhythmical impression. Therefore, the tune that is recognized from the sense of hearing coincides with the tune that is determined by calculating the harmony intelligibility and the low-frequency beat level.
- Music D shown in FIG. 7 (D) is composed of vocal a cappella, and is a music that allows you to hear chords prominently.
- the music D has the same degree of harmony intelligibility as the music A, but its low-frequency beat level is much lower than that of the music A. Even if the harmony intelligibility has the same value, the tune can be discriminated as different from that of the music A by the value of the low-frequency beat level. Therefore, the tune recognized by ear matches the tune discriminated using the harmony intelligibility and the low-frequency beat level.
- FIG. 8 shows a tune classification using the harmony intelligibility and the low-frequency beat level.
- The horizontal axis 30 indicates the harmony intelligibility, and the vertical axis 31 indicates the low-frequency beat level.
- The tune of a music piece can thus be classified by the values of the harmony intelligibility and the low-frequency beat level.
- Music with a high harmony intelligibility and a high low-frequency beat level exhibits a tune that is easy to listen to and rhythmical.
- Music with a high harmony intelligibility and a low low-frequency beat level exhibits a quiet tune.
- Music with a low harmony intelligibility and a high low-frequency beat level exhibits an intense tune.
- Music with a low harmony intelligibility and a low low-frequency beat level has a thin sound.
- By analyzing the music with other feature quantities in addition to the harmony intelligibility, for example the low-frequency beat level, the tune of the music can be determined in detail.
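The four quadrants of FIG. 8 can be sketched as a simple classifier. This is an illustrative sketch only: the threshold of 50 separating "high" from "low", and the function name, are assumptions for illustration, not values given in the patent.

```python
def classify_tune(harmony_intelligibility: float, low_beat_level: float,
                  threshold: float = 50.0) -> str:
    """Map the two feature values onto the four tune quadrants of FIG. 8.

    The threshold is an assumed value; the patent does not specify one.
    """
    high_harmony = harmony_intelligibility >= threshold
    high_beat = low_beat_level >= threshold
    if high_harmony and high_beat:
        return "easy to listen to and rhythmical"   # e.g. music C
    if high_harmony and not high_beat:
        return "quiet"                              # e.g. music B, first half
    if not high_harmony and high_beat:
        return "intense"                            # e.g. music A
    return "thin sound"                             # low on both axes

# Feature values quoted in the text for music A (~30, ~80) and music C (~60, ~60):
print(classify_tune(30, 80))  # intense
print(classify_tune(60, 60))  # easy to listen to and rhythmical
```

With two features the classifier separates pieces such as music A and music D even when their harmony intelligibility is the same, which a single feature could not do.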
- FIG. 9 shows an example of the temporal transition of the harmony intelligibility and the low-frequency beat level in music A when the signal power (SignalPower) is lowered.
- The horizontal axis 21 of the graph represents time, and the vertical axis 22 represents the values of the harmony intelligibility and the low-frequency beat level.
- The values on the vertical axis are normalized by an arbitrary value, in the same manner for each graph shown in FIGS. 9(A) and 9(B), so their relative magnitudes can be compared.
- A solid line 19 in the graph indicates the harmony intelligibility, a broken line 20 indicates the low-frequency beat level, and a one-dot chain line 21 indicates the signal power (SignalPower).
- Music A is a piece whose tune was determined, by calculating the harmony intelligibility and the low-frequency beat level, to give an intense impression. According to the hypothesis above, its tune should not change with the magnitude of the signal power.
- FIG. 9(A) shows the values of the harmony intelligibility and the low-frequency beat level when music A is reproduced with a signal power (SignalPower) of about 100.
- FIG. 9(B) shows the values when the signal power (SignalPower) is lowered to about 50, i.e., half.
- Comparing FIGS. 9(A) and 9(B), the values of the harmony intelligibility and the low-frequency beat level are almost unchanged. Therefore, lowering the signal power (SignalPower) does not affect these values, and they can be said to reflect the tune of the music.
- This demonstrates the hypothesis above.
- FIG. 10 shows an example of the temporal transition of the harmony intelligibility and the low-frequency beat level in music B when the signal power (SignalPower) is increased.
- Music B is a piece whose tune was determined, by calculating the harmony intelligibility and the low-frequency beat level, to give a quiet impression up to the middle of the piece and an intense impression thereafter. As above, its tune should not change with the magnitude of the signal power.
- FIG. 10(A) shows the values of the harmony intelligibility and the low-frequency beat level when music B is reproduced with a signal power (SignalPower) of about 20.
- FIG. 10(B) shows the values when the signal power (SignalPower) is increased to about 40, i.e., doubled. Comparing FIGS. 10(A) and 10(B), the values of the harmony intelligibility and the low-frequency beat level are almost unchanged. Therefore, even when the signal power (SignalPower) is increased, these values are not affected and can be said to reflect the tune of the music. This, too, demonstrates the hypothesis above.
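Why a feature derived from the distribution of pitch power is insensitive to the overall signal power can be illustrated with a short sketch. This is not the patented algorithm itself: `harmony_feature` and the example pitch powers are hypothetical, and the point is only that a uniform gain cancels out once the profile is normalized.

```python
def harmony_feature(pitch_powers):
    """Standard deviation of the gain-normalized pitch power profile.

    `pitch_powers` is a hypothetical list of summed power per pitch class.
    Because each entry is divided by the total, multiplying every entry by
    the same gain factor leaves the profile, and hence this statistic,
    unchanged.
    """
    total = sum(pitch_powers)
    profile = [p / total for p in pitch_powers]        # gain cancels here
    mean = sum(profile) / len(profile)
    var = sum((x - mean) ** 2 for x in profile) / len(profile)
    return var ** 0.5

powers = [5.0, 1.0, 4.5, 0.8, 4.0, 1.2, 0.9, 4.8, 1.1, 0.7, 4.2, 1.0]
halved = [0.5 * p for p in powers]   # signal power lowered to half, as in FIG. 9
doubled = [2.0 * p for p in powers]  # signal power doubled, as in FIG. 10

assert abs(harmony_feature(powers) - harmony_feature(halved)) < 1e-12
assert abs(harmony_feature(powers) - harmony_feature(doubled)) < 1e-12
```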
- As described above, chords are calculated from the input music data, the harmony intelligibility and other quantities are calculated from the calculated chords, and the tune of the music is determined based on them; the tune can therefore be determined more accurately, and music can be selected on the basis of its tune.
- Since the harmony intelligibility itself (numerical data) can be classified by level and stored as music metadata, music can also be searched for by specifying the harmony intelligibility itself.
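Storing the harmony intelligibility as level-classified metadata and searching by it can be sketched as follows. The level boundaries, the record layout, and the function name are assumptions for illustration; the patent does not prescribe them.

```python
def intelligibility_level(value: float) -> str:
    """Classify the numerical harmony intelligibility into assumed levels."""
    if value >= 70:
        return "high"
    if value >= 40:
        return "medium"
    return "low"

# A hypothetical metadata store, using the feature values quoted in the text.
library = [
    {"title": "music A", "harmony_intelligibility": 30},
    {"title": "music B", "harmony_intelligibility": 80},
    {"title": "music C", "harmony_intelligibility": 60},
]
for track in library:
    track["level"] = intelligibility_level(track["harmony_intelligibility"])

# Search either by stored level or by a range of the numerical value itself.
high_clarity = [t["title"] for t in library if t["level"] == "high"]
print(high_clarity)  # ['music B']
```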
- The determination of the tune based on the harmony intelligibility can be performed by various methods. For example, it can be based on subjective evaluation by a large number of subjects, or on an arbitrary operation and decision by the user. The tune may also be determined automatically in accordance with reproduction.
- In the embodiment above, the present application is applied to the reproduction/recording apparatus S.
- The present invention can also be applied to, for example, mobile phones, personal computers, and other in-vehicle and household electronic devices.
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Acoustics & Sound (AREA)
- Multimedia (AREA)
- Computational Linguistics (AREA)
- Signal Processing (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Human Computer Interaction (AREA)
- Auxiliary Devices For Music (AREA)
- Electrophonic Musical Instruments (AREA)
Abstract
Description
2 External output unit
3 Storage unit
4 System control unit
5 Communication unit
S Information reproduction/recording apparatus
The present application proposes determining the tune of a music piece using the "harmony intelligibility".
The best embodiment of the present application is described below with reference to the accompanying drawings. The embodiment described below applies the present application to an information reproduction/recording apparatus.
Next, the operation of the information reproduction/recording apparatus S in this embodiment is described with reference to FIG. 2. FIG. 2 is a flowchart showing the operation of the information reproduction/recording apparatus S.
By analyzing the music with other feature quantities in addition to the harmony intelligibility, for example the low-frequency beat level, the tune of the music can be determined in detail.
As shown above, the harmony intelligibility and the low-frequency beat level serve as indices for determining the tune of a music piece. To serve as such indices, they must not depend on the signal power (SignalPower) (dB) of the music, that is, on the volume at playback. This hypothesis is demonstrated below.
Claims (7)
- A musical composition discrimination apparatus comprising:
pitch power addition level calculation means for calculating a pitch power addition level from input music data;
harmony intelligibility calculation means for calculating, based on the calculated pitch power, a harmony intelligibility indicating the degree to which the harmony is clearly audible; and
tune discrimination means for discriminating the tune of the music using the harmony intelligibility.
- The musical composition discrimination apparatus according to claim 1, wherein the harmony intelligibility calculation means calculates the harmony intelligibility from the deviation of the pitch power addition levels.
- The musical composition discrimination apparatus according to claim 1 or 2, wherein the pitch power calculation means calculates the pitch power addition level of a part of the music, and the harmony intelligibility detection means calculates the harmony intelligibility based on the pitch power addition level of that part of the music.
- The musical composition discrimination apparatus according to any one of claims 1 to 3, further comprising low-frequency beat level detection means for detecting a low-frequency beat level of the music data, wherein the tune discrimination means discriminates the tune using at least one of the value calculated by the harmony intelligibility calculation means and the value detected by the low-frequency beat level detection means.
- A musical composition discrimination method comprising:
a pitch power addition level calculation step of calculating a pitch power addition level from input music data;
a harmony intelligibility calculation step of calculating, based on the calculated pitch power addition level, a harmony intelligibility indicating the degree to which the harmony is clearly audible; and
a tune discrimination step of discriminating the tune of the music using the harmony intelligibility.
- A musical composition discrimination program causing a computer included in a musical composition discrimination apparatus to function as:
pitch power addition level calculation means for calculating a pitch power addition level from input music data;
harmony intelligibility calculation means for calculating, based on the calculated pitch power addition level, a harmony intelligibility indicating the degree to which the harmony is clearly audible; and
tune discrimination means for discriminating the tune of the music using the harmony intelligibility.
- A recording medium on which the musical composition discrimination program according to claim 6 is recorded in a computer-readable manner.
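The calculation recited in claims 1 and 2 can be sketched in outline. This is a minimal illustration under assumptions: the spectral analysis is stubbed out, `frames` is a hypothetical list of per-frame (pitch class, power) pairs, and a standard deviation is only one possible reading of "deviation of the pitch power addition levels".

```python
def pitch_power_addition_levels(frames, n_pitches=12):
    """Accumulate a pitch power addition level per pitch class (claim 1)."""
    levels = [0.0] * n_pitches
    for pitch_class, power in frames:
        levels[pitch_class % n_pitches] += power   # add power per pitch
    return levels

def harmony_intelligibility(levels):
    """Deviation of the pitch power addition levels (claim 2): a piece whose
    power concentrates on chord tones deviates strongly from a flat profile
    and is scored as having clearly audible harmony."""
    mean = sum(levels) / len(levels)
    var = sum((l - mean) ** 2 for l in levels) / len(levels)
    return var ** 0.5

# A C-major-like signal concentrates power on pitch classes 0, 4, 7 ...
chordal = [(0, 1.0), (4, 1.0), (7, 1.0)] * 10
# ... while a noise-like signal spreads power evenly over all pitch classes.
noisy = [(p, 0.25) for p in range(12)] * 10

assert harmony_intelligibility(pitch_power_addition_levels(chordal)) > \
       harmony_intelligibility(pitch_power_addition_levels(noisy))
```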
Priority Applications (3)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP2009554175A JPWO2009104269A1 (ja) | 2008-02-22 | 2008-02-22 | 楽曲判別装置、楽曲判別方法、楽曲判別プログラム及び記録媒体 |
PCT/JP2008/053031 WO2009104269A1 (ja) | 2008-02-22 | 2008-02-22 | 楽曲判別装置、楽曲判別方法、楽曲判別プログラム及び記録媒体 |
US12/918,962 US20110011247A1 (en) | 2008-02-22 | 2008-02-22 | Musical composition discrimination apparatus, musical composition discrimination method, musical composition discrimination program and recording medium |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/JP2008/053031 WO2009104269A1 (ja) | 2008-02-22 | 2008-02-22 | 楽曲判別装置、楽曲判別方法、楽曲判別プログラム及び記録媒体 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2009104269A1 true WO2009104269A1 (ja) | 2009-08-27 |
Family
ID=40985164
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/JP2008/053031 WO2009104269A1 (ja) | 2008-02-22 | 2008-02-22 | 楽曲判別装置、楽曲判別方法、楽曲判別プログラム及び記録媒体 |
Country Status (3)
Country | Link |
---|---|
US (1) | US20110011247A1 (ja) |
JP (1) | JPWO2009104269A1 (ja) |
WO (1) | WO2009104269A1 (ja) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2017116964A (ja) * | 2017-03-29 | 2017-06-29 | カシオ計算機株式会社 | コード抽出装置、および方法 |
Families Citing this family (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CA2829497A1 (en) * | 2011-03-24 | 2012-09-27 | Sanofi-Aventis Deutschland Gmbh | Device and method for detecting an actuation action performable with a medical device |
AU345903S (en) * | 2012-03-05 | 2012-12-05 | Apple Inc | Display screen for an electronic device |
US10143830B2 (en) * | 2013-03-13 | 2018-12-04 | Crisi Medical Systems, Inc. | Injection site information cap |
USD748671S1 (en) * | 2014-03-17 | 2016-02-02 | Lg Electronics Inc. | Display panel with transitional graphical user interface |
USD757093S1 (en) * | 2014-03-17 | 2016-05-24 | Lg Electronics Inc. | Display panel with transitional graphical user interface |
USD748669S1 (en) * | 2014-03-17 | 2016-02-02 | Lg Electronics Inc. | Display panel with transitional graphical user interface |
USD748670S1 (en) * | 2014-03-17 | 2016-02-02 | Lg Electronics Inc. | Display panel with transitional graphical user interface |
USD748134S1 (en) * | 2014-03-17 | 2016-01-26 | Lg Electronics Inc. | Display panel with transitional graphical user interface |
US10610651B2 (en) | 2014-06-09 | 2020-04-07 | Aerami Therapeutics, Inc. | Self-puncturing liquid drug cartridges and associated dispenser |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JPH0837700A (ja) * | 1994-07-21 | 1996-02-06 | Kenwood Corp | 音場補正回路 |
JPH08298418A (ja) * | 1995-04-25 | 1996-11-12 | Matsushita Electric Ind Co Ltd | 音質調整装置 |
Family Cites Families (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US4331058A (en) * | 1980-11-24 | 1982-05-25 | Kawai Musical Instrument Mfg. Co., Ltd. | Adaptive accompaniment level in an electronic musical instrument |
DE69616139T2 (de) * | 1995-04-25 | 2002-03-14 | Matsushita Electric Ind Co Ltd | System zum Einstellen der Tonqualität |
JP4244133B2 (ja) * | 2002-11-29 | 2009-03-25 | パイオニア株式会社 | 楽曲データ作成装置及び方法 |
JP2006195384A (ja) * | 2005-01-17 | 2006-07-27 | Matsushita Electric Ind Co Ltd | 楽曲調性算出装置および選曲装置 |
JP5507997B2 (ja) * | 2006-04-14 | 2014-05-28 | コーニンクレッカ フィリップス エヌ ヴェ | 調音およびキー分析のためのオーディオスペクトル中の音成分の選択 |
- 2008-02-22 US US12/918,962 patent/US20110011247A1/en not_active Abandoned
- 2008-02-22 WO PCT/JP2008/053031 patent/WO2009104269A1/ja active Application Filing
- 2008-02-22 JP JP2009554175A patent/JPWO2009104269A1/ja not_active Ceased
Also Published As
Publication number | Publication date |
---|---|
JPWO2009104269A1 (ja) | 2011-06-16 |
US20110011247A1 (en) | 2011-01-20 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
WO2009104269A1 (ja) | 楽曲判別装置、楽曲判別方法、楽曲判別プログラム及び記録媒体 | |
US8471135B2 (en) | Music transcription | |
CN112382257B (zh) | 一种音频处理方法、装置、设备及介质 | |
EP2661743B1 (en) | Input interface for generating control signals by acoustic gestures | |
Eggink et al. | Instrument recognition in accompanied sonatas and concertos | |
JP5229998B2 (ja) | コード名検出装置及びコード名検出用プログラム | |
Lerch | Software-based extraction of objective parameters from music performances | |
JP6288197B2 (ja) | 評価装置及びプログラム | |
JP6102076B2 (ja) | 評価装置 | |
JP5292702B2 (ja) | 楽音信号生成装置及びカラオケ装置 | |
JP6056799B2 (ja) | プログラム、情報処理装置、及びデータ生成方法 | |
JP5618743B2 (ja) | 歌唱音声評価装置 | |
JP2016071188A (ja) | 採譜装置、及び採譜システム | |
JP6036800B2 (ja) | 音信号生成装置及びプログラム | |
JP5776205B2 (ja) | 音信号生成装置及びプログラム | |
JP5659501B2 (ja) | 電子音楽装置及びプログラム | |
CN113270081A (zh) | 调整歌伴奏音的方法及调整歌伴奏音的电子装置 | |
Chaisri | Extraction of sound by instrument type and voice from music files | |
Cuesta et al. | Audio Melody Extraction | |
Nunn | Analysis and resynthesis of polyphonic music | |
JP2007041488A (ja) | 音声信号の音階的特性分析方法ならびに、その装置 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| DPE2 | Request for preliminary examination filed before expiration of 19th month from priority date (pct application filed from 20040101) | |
| 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 08711805; Country of ref document: EP; Kind code of ref document: A1 |
| WWE | Wipo information: entry into national phase | Ref document number: 2009554175; Country of ref document: JP |
| WWE | Wipo information: entry into national phase | Ref document number: 12918962; Country of ref document: US |
| NENP | Non-entry into the national phase | Ref country code: DE |
| 122 | Ep: pct application non-entry in european phase | Ref document number: 08711805; Country of ref document: EP; Kind code of ref document: A1 |