WO2017195292A1 - Music structure analysis device, method for analyzing music structure, and music structure analysis program - Google Patents

Music structure analysis device, method for analyzing music structure, and music structure analysis program

Info

Publication number
WO2017195292A1
Authority
WO
WIPO (PCT)
Prior art keywords
section
music
development
structure analysis
feature
Prior art date
Application number
PCT/JP2016/063981
Other languages
French (fr)
Japanese (ja)
Inventor
Shiro Suzuki (鈴木 四郎)
Original Assignee
Pioneer DJ Corporation (Pioneer DJ株式会社)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Pioneer DJ Corporation
Priority to JP2018516262A priority Critical patent/JPWO2017195292A1/en
Priority to EP16901640.9A priority patent/EP3457395A4/en
Priority to PCT/JP2016/063981 priority patent/WO2017195292A1/en
Publication of WO2017195292A1 publication Critical patent/WO2017195292A1/en

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10H ELECTROPHONIC MUSICAL INSTRUMENTS; INSTRUMENTS IN WHICH THE TONES ARE GENERATED BY ELECTROMECHANICAL MEANS OR ELECTRONIC GENERATORS, OR IN WHICH THE TONES ARE SYNTHESISED FROM A DATA STORE
    • G10H1/00 Details of electrophonic musical instruments
    • G10H1/0008 Associated control or indicating means
    • G10H2210/00 Aspects or methods of musical processing having intrinsic musical character, i.e. involving musical theory or musical parameters or relying on musical knowledge, as applied in electrophonic musical tools or instruments
    • G10H2210/031 Musical analysis, i.e. isolation, extraction or identification of musical elements or musical parameters from a raw acoustic signal or from an encoded audio signal
    • G10H2210/061 Musical analysis for extraction of musical phrases, isolation of musically relevant segments, e.g. musical thumbnail generation, or for temporal structure analysis of a musical piece, e.g. determination of the movement sequence of a musical work
    • G10H2210/076 Musical analysis for extraction of timing, tempo; Beat detection
    • G10H2210/081 Musical analysis for automatic key or tonality recognition, e.g. using musical rules or a knowledge base
    • G10H2210/571 Chords; Chord sequences
    • G10H2240/00 Data organisation or data communication aspects, specifically adapted for electrophonic musical tools or instruments
    • G10H2240/121 Musical libraries, i.e. musical databases indexed by musical parameters, wavetables, indexing schemes using musical parameters, musical rule bases or knowledge bases, e.g. for automatic composing methods

Definitions

  • The present invention relates to a music structure analysis device, a music structure analysis method, and a music structure analysis program.
  • Techniques are known that automatically analyze the structure of a piece of music by assigning to music data characteristic sections that characterize the music structure, such as the so-called intro, A melody (Verse1), B melody (Verse2), hook (chorus), and outro.
  • For example, Patent Literature 1 discloses a technique that assigns feature sections such as stanzas and refrains to music data by determining the similarity between segments (feature sections) assigned to the music data.
  • An object of the present invention is to provide a music structure analysis device, a music structure analysis method, and a music structure analysis program capable of easily assigning the feature sections that characterize music data.
  • The music structure analysis device of the present invention is a music structure analysis device that assigns feature sections to music data in which development points of the feature sections characterizing the structure of the music data have been set, the device comprising: a position information acquisition unit that acquires position information of the development points; a sound number analysis unit that analyzes, based on the position information of the development points acquired by the position information acquisition unit, the number of sounds with different frequencies for each section between development points; and a feature section allocation unit that assigns feature sections to the sections between the other development points based on the section between development points at which the analyzed number of sounds takes a local maximum.
  • The music structure analysis method of the present invention is a method for assigning feature sections to music data in which development points of the feature sections characterizing the structure of the music data have been set, the method comprising: a step of acquiring position information of the development points; a step of analyzing, based on the acquired position information of the development points, the number of sounds with different frequencies for each section between development points; and a step of assigning feature sections to the sections between the other development points based on the section between development points at which the analyzed number of sounds takes a local maximum.
  • The music structure analysis program of the present invention causes a computer to function as the music structure analysis device described above.
  • FIG. 1 is a schematic diagram showing the structure of the acoustic system according to an embodiment of the present invention.
  • FIG. 1 shows an acoustic control system 1 according to an embodiment of the present invention.
  • The acoustic control system 1 includes two digital players 2, a digital mixer 3, a computer 4, and a speaker 5.
  • The digital player 2 includes a jog dial 2A, a plurality of operation buttons (not shown), and a display 2B; by operating the jog dial 2A or the operation buttons, the operator of the digital player 2 can output acoustic control information corresponding to the operation.
  • The acoustic control information is output to the computer 4 via a USB (Universal Serial Bus) cable 6 capable of bidirectional communication.
  • The digital mixer 3 includes an operation switch 3A, a volume adjustment lever 3B, and a left/right switching lever 3C; by operating the switch 3A and the levers 3B and 3C, acoustic control information can be output.
  • the acoustic control information is output to the computer 4 via the USB cable 7.
  • The music information processed by the computer 4 is input to the digital mixer 3; the input music information, consisting of a digital signal, is converted into an analog signal and output as sound from the speaker 5 via the analog cable 8.
  • The digital player 2 and the digital mixer 3 are also connected to each other via a LAN (Local Area Network) cable 9 compliant with the IEEE 1394 standard, so that acoustic control information generated by operating the digital player 2 can be output directly to the digital mixer 3 for a DJ performance, without using the computer 4.
  • FIG. 2 shows a functional block diagram of the computer 4 as a music structure analysis device.
  • The computer 4 includes, as a music structure analysis program executed on the arithmetic processing device 10, a position information acquisition unit 11, a sound number analysis unit 12, a bass level analysis unit 13, a ratio calculation unit 14, a feature section allocation unit 15, and a display information generation unit 16.
  • The position information acquisition unit 11 acquires the bar numbers of the development points set in the music data M1 as the position information of the development points. Specifically, as shown in FIG. 3, the position information acquisition unit 11 acquires the position information of the development points P1, P2, ..., Pn, Pn+3, ..., Pe-4 set between bars in the music data M1 (in this embodiment, their bar numbers).
  • The development points are set as follows: the sounds with different frequencies in each bar are frequency-analyzed by FFT or the like, the number of sound pressure level peaks is counted, the ratio of each bar's sound count to the bar with the maximum sound count in the music data M1 is calculated, and the positions between bars where this ratio changes greatly are set as development points.
  • The development points can also be set within the computer 4 using the sound number analysis unit 12, but, as in this embodiment, music data M1 in which the development points P1, P2, ..., Pn, Pn+3, ..., Pe-4 have been set by analyzing the number of sounds in advance may be used.
  • The setting of the development points P1, P2, ..., Pn, Pn+3, ..., Pe-4 is not limited to the method described above; it may also be performed based on, for example, the similarity of phrases in the music data M1.
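As a rough illustration (not the patented implementation), the ratio-based development-point setting described above can be sketched in Python. The function name `development_points` and the 0.25 change threshold are assumptions for the example:

```python
def development_points(bar_sound_counts, change_ratio=0.25):
    """Sketch of development-point setting: each bar's sound count is
    expressed as a ratio to the busiest bar, and the boundary between two
    bars is marked as a development point wherever that ratio jumps."""
    busiest = max(bar_sound_counts)
    ratios = [count / busiest for count in bar_sound_counts]
    points = []
    for i in range(len(ratios) - 1):
        if abs(ratios[i + 1] - ratios[i]) >= change_ratio:
            points.append(i + 1)  # 0-based boundary after bar i
    return points
```

For per-bar counts `[4, 4, 12, 12, 5]`, the ratio jumps at the boundaries entering and leaving the busy bars, so the sketch returns `[2, 4]`.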
  • The sound number analysis unit 12 detects, for the input music data M1, the signal level of each frequency band in each section between the development points P1, P2, ..., Pn, Pn+3, ..., Pe-4 and analyzes the number of sounds.
  • Here, the "number of sounds" may count sounds with different frequencies as different sounds, or may count a fundamental and its overtones as a single sound of the same pitch.
  • The input music data M1 may be music data stored on a hard disk in the computer 4, music data recorded on a CD, Blu-ray disc, or the like inserted in a slot of the digital player 2, or music data downloaded from a network via a communication line.
  • Specifically, as shown in FIG. 4, the sound number analysis unit 12 counts the number of sounds by counting the number of frequency peaks in each section between the development points P1, P2, ..., Pn, Pn+3, ..., Pe-4 of the music data M1.
  • In this embodiment, the analysis of sounds with different frequencies is performed using the FFT, but the present invention is not limited to this; the frequency transform may also be performed using, for example, the discrete cosine transform or the discrete Fourier transform.
  • The sound number analysis unit 12 outputs the analysis result to the ratio calculation unit 14.
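A minimal sketch of counting sounds as spectral peaks, assuming a mono sample array for one section; the normalization and the 0.1 relative level threshold are illustrative choices, not values from the patent:

```python
import numpy as np

def count_sounds(section_samples, level_threshold=0.1):
    """Count 'sounds' in one section between development points as the
    number of peaks in the FFT magnitude spectrum that rise above a
    relative level threshold (a crude stand-in for distinct frequencies)."""
    spectrum = np.abs(np.fft.rfft(section_samples))
    spectrum = spectrum / (spectrum.max() + 1e-12)  # relative magnitudes
    inner = spectrum[1:-1]
    # a bin is a peak if it exceeds both neighbours and the threshold
    is_peak = (inner > spectrum[:-2]) & (inner > spectrum[2:]) & (inner > level_threshold)
    return int(is_peak.sum())
```

A section consisting of two sine tones at different frequencies, for instance, yields two spectral peaks and hence a sound count of 2.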
  • The bass level analysis unit 13 is provided to allocate the hook (chorus) section among the sections between the development points P1, P2, ..., Pn, Pn+3, ..., Pe-4, taking into account the per-bar average of the low-frequency sound pressure peak level.
  • The bass level analysis unit 13 analyzes the sound pressure peak level of frequencies lower than a predetermined frequency in each section between the development points P1, P2, ..., Pn, Pn+3, ..., Pe-4. Specifically, it acquires bass sound levels such as those of the bass drum and the bass, and calculates the average of the bass sound pressure peak levels of the bar sections within each section between the development points P1, ..., Pn, Pn+3, ..., Pe-4. The bass level analysis unit 13 outputs the analysis result to the feature section allocation unit 15.
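The per-bar bass averaging could look like the following sketch, where the 120 Hz cutoff is an assumed stand-in for the patent's "predetermined frequency":

```python
import numpy as np

def bass_peak_average(bars, rate, cutoff_hz=120.0):
    """Average, over the bar sections of one inter-development-point
    section, of the peak spectral magnitude found below cutoff_hz
    (a proxy for the kick-drum / bass level of that section)."""
    peaks = []
    for bar in bars:
        spectrum = np.abs(np.fft.rfft(bar))
        freqs = np.fft.rfftfreq(len(bar), d=1.0 / rate)
        low = spectrum[freqs < cutoff_hz]
        peaks.append(float(low.max()))
    return sum(peaks) / len(peaks)
```

A section whose bars carry a 60 Hz bass tone will score far higher than one containing only high-frequency content, which is the property the hook check relies on.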
  • The feature section allocation unit 15 assigns, to the sections between the development points P1, P2, ..., Pn, Pn+3, ..., Pe-4, characteristic sections such as an intro section, an A melody (Verse1) section, a B melody (Verse2) section, a hook (chorus) section, a C melody (Verse3) section, and an outro section.
  • Specifically, based on the analysis result of the sound number analysis unit 12, the feature section allocation unit 15 searches for local maxima 1 and 2, whose sound counts are larger than those of the preceding and following sections, and sets the sections with local maxima 1 and 2 as candidates for the hook (chorus) section.
  • Next, based on the analysis result of the bass level analysis unit 13, the feature section allocation unit 15 obtains the average low-frequency sound pressure peak level of each candidate section, determines whether the averages in the sections with local maxima 1 and 2 exceed a predetermined threshold, and allocates the hook sections accordingly.
  • Further, the feature section allocation unit 15 sets the section before each of local maxima 1 and 2 as an A melody (Verse1) section and the following section as a B melody (Verse2) section; sections after that are allocated as a C melody (Verse3) section and so on.
  • Which feature section is assigned is determined by whether the number of sounds exceeds a predetermined threshold.
  • The predetermined threshold may be a fixed threshold smaller than the local maximum, or a threshold set as a predetermined ratio of the local maximum that varies with the local maximum.
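Combining the two analyses, the allocation logic described in these paragraphs might be sketched as follows; the thresholds, label names, and function signature are illustrative assumptions:

```python
def allocate_sections(sound_counts, bass_averages, bass_threshold, verse_threshold):
    """Label the sections between development points.  A section whose
    sound count is a local maximum and whose bass average clears the
    bass threshold becomes a Hook (chorus); the first and last sections
    are Intro and Outro; every remaining section becomes Verse1 or
    Verse2 depending on whether its own sound count clears the verse
    threshold."""
    labels = [None] * len(sound_counts)
    labels[0], labels[-1] = "Intro", "Outro"
    for i in range(1, len(sound_counts) - 1):
        local_max = (sound_counts[i] > sound_counts[i - 1]
                     and sound_counts[i] > sound_counts[i + 1])
        if local_max and bass_averages[i] > bass_threshold:
            labels[i] = "Hook"
    for i, label in enumerate(labels):
        if label is None:
            labels[i] = "Verse1" if sound_counts[i] > verse_threshold else "Verse2"
    return labels
```

With sound counts `[3, 8, 14, 8, 12, 4, 2]` and bass averages `[0, 1, 9, 1, 8, 0, 0]` (thresholds 5 and 7), the two busy, bass-heavy sections become hooks and their neighbours become verses.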
  • The names of the feature sections to be assigned are arbitrary; as in FIG. 5, names such as A-Verse and B-Verse may be used.
  • The feature section allocation unit 15 assigns the intro section and the outro section in advance.
  • The display information generation unit 16 generates display information presenting the feature sections assigned by the feature section allocation unit 15 together with the music data M1. Specifically, as shown in FIG. 6, it generates display information in which the feature sections are displayed and their colors change as the playback of the music data M1 progresses.
  • The display information generated by the display information generation unit 16 is output to the display 2B serving as the display device of the digital player 2, so that the DJ performer can confirm which feature section is being played as the performance of the music data M1 progresses.
  • First, the position information acquisition unit 11 acquires the position information of the development points P1, P2, ..., Pn, Pn+3, ..., Pe-4 in the music data M1 (step S1).
  • Next, the sound number analysis unit 12 analyzes the number of sounds in each section between the development points P1, P2, ..., Pn, Pn+3, ..., Pe-4 (step S2).
  • Then, using the section between the development points Pn and Pn+3 having the maximum number of sounds as a reference, the ratio calculation unit 14 calculates the ratio of the number of sounds in the sections between the other development points P1, P2, ..., Pe-4 (step S3).
  • The feature section allocation unit 15 assigns the intro section to the section from the start point of the music data M1 to the first development point P1 (step S4). Subsequently, the feature section allocation unit 15 assigns the outro section to the section from the last development point Pe-4 of the music data M1 to the end point of the music data M1 (step S5).
  • Next, the feature section allocation unit 15 searches for a local maximum among the sections between development points other than the intro and outro sections (step S6). The search may start from the section next to the intro section or from the section before the outro section. When a section with a local maximum is found, the feature section allocation unit 15 obtains the average of the low-frequency sound pressure peak levels in that section between the development points P1, P2, ..., Pn, Pn+3, ..., Pe-4 (step S7).
  • The feature section allocation unit 15 then determines whether the average low-frequency sound pressure peak level of the section with the local maximum exceeds a predetermined threshold (step S8). When it is equal to or less than the predetermined threshold, the feature section allocation unit 15 searches for the next local maximum. When it exceeds the predetermined threshold, the feature section allocation unit 15 allocates a hook section to that section (step S9). The feature section allocation unit 15 repeats steps S6 to S9 for all sections with a local maximum in the music data M1 (step S10). Steps S8 and S9 are performed to improve the detection accuracy of the hook section; alternatively, a hook section may be allocated solely by searching for the sections between development points at which the number of sounds takes a local maximum.
  • Next, the feature section allocation unit 15 acquires the number of sounds in the sections between the other development points before and after each section set as a hook section (step S11).
  • The feature section allocation unit 15 determines whether the number of sounds in such a section exceeds a predetermined threshold (step S12).
  • When it exceeds the predetermined threshold, the feature section allocation unit 15 assigns an A melody (Verse1) section to that section (step S13); when it is equal to or less than the predetermined threshold, the feature section allocation unit 15 assigns a B melody (Verse2) section to that section (step S14).
  • The feature section allocation unit 15 repeats this until feature sections have been assigned to the sections between all the development points P1, P2, ..., Pn, Pn+3, ..., Pe-4 (step S15).
  • When the assignment is complete, the feature section allocation unit 15 outputs the result to the display information generation unit 16; the display information generation unit 16 generates display information based on the assignment result and outputs the generated display information to the display 2B of the digital player 2 (step S16).
  • According to the present embodiment described above, feature sections can be assigned to the music data M1 easily and quickly.
  • Moreover, a user giving a DJ performance can visually recognize which feature section is currently being played, and can therefore give a more sophisticated DJ performance.

Landscapes

  • Physics & Mathematics (AREA)
  • Engineering & Computer Science (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Auxiliary Devices For Music (AREA)

Abstract

Provided is a music structure analysis device (4) that allocates feature segments characterizing the structure of music data (M1) to music data (M1) in which development points of the feature segments have been established. The music structure analysis device (4) includes: a position information acquisition unit (11) that acquires position information for the development points; a sound number analyzing unit (12) that analyzes the number of sounds with different frequencies for each segment between development points on the basis of the position information acquired by the position information acquisition unit (11); and a feature segment allocation unit (15) that allocates feature segments to the segments between other development points on the basis of the segment between development points determined to contain a local maximum of the number of sounds analyzed by the sound number analyzing unit (12).

Description

Music structure analysis device, music structure analysis method, and music structure analysis program
Japanese Patent No. 4775380
However, in the technique described in Patent Literature 1, the feature sections are assigned based on the similarity between feature sections, so some feature sections, such as the hook and the B melody, cannot be assigned.
Moreover, even when they can be assigned, the intro, A melody, B melody, hook, C melody, and outro must each be assigned through pairwise similarity comparison, which makes the assignment of feature sections cumbersome.
Brief description of the drawings:
FIG. 1 is a schematic diagram showing the structure of the acoustic system according to an embodiment of the present invention.
FIG. 2 is a block diagram showing the structure of the music structure analysis device in the embodiment.
FIG. 3 is a schematic diagram for explaining the development points in the embodiment.
FIG. 4 is a schematic diagram for explaining the sections between development points of the music data in the embodiment.
FIG. 5 is a graph for explaining the feature section allocation unit in the embodiment.
FIG. 6 is a schematic diagram for explaining the display information generated by the display information generation unit in the embodiment.
FIG. 7 is a flowchart for explaining the operation of the embodiment.
 [1]音響制御システムの全体構成
 図1には、本発明の実施形態に係る音響制御システム1が示されており、音響制御システム1は、2台のデジタルプレーヤー2と、デジタルミキサー3と、コンピュータ4と、スピーカー5とを備える。
 デジタルプレーヤー2は、ジョグダイヤル2Aと、図示しない複数の操作ボタンと、ディスプレイ2Bとを備え、デジタルプレーヤー2の操作者は、ジョグダイヤル2Aや、操作ボタンを操作することにより、操作に応じた音響制御情報を出力することができる。音響制御情報は、双方向通信可能なUSB(Universal Serial Bus)ケーブル6を介してコンピュータ4に出力される。
[1] Overall Configuration of Acoustic Control System FIG. 1 shows an acoustic control system 1 according to an embodiment of the present invention. The acoustic control system 1 includes two digital players 2, a digital mixer 3, A computer 4 and a speaker 5 are provided.
The digital player 2 includes a jog dial 2A, a plurality of operation buttons (not shown), and a display 2B, and the operator of the digital player 2 operates the jog dial 2A or the operation buttons to perform acoustic control information corresponding to the operation. Can be output. The acoustic control information is output to the computer 4 via a USB (Universal Serial Bus) cable 6 capable of bidirectional communication.
 デジタルミキサー3は、操作スイッチ3Aと、音量調整レバー3Bと、左右切替レバー3Cとを備え、これらのスイッチ3A、レバー3B、3Cを操作することにより、音響制御情報を出力することができる。音響制御情報は、USBケーブル7を介してコンピュータ4に出力される。また、デジタルミキサー3には、コンピュータ4で処理された楽曲情報が入力され、入力されたデジタル信号からなる楽曲情報は、アナログ信号に変換され、アナログケーブル8を介して、スピーカー5から音声出力される。
 また、デジタルプレーヤー2およびデジタルミキサー3は、IEEE1394規格に準拠したLAN(Local Area Network)ケーブル9を介して互いに接続され、コンピュータ4を使わなくとも、デジタルプレーヤー2を操作して生成された音響制御情報を、直接デジタルミキサー3に出力して、DJパフォーマンスを行うこともできる。
The digital mixer 3 includes an operation switch 3A, a volume adjustment lever 3B, and a left / right switching lever 3C. By operating these switches 3A, levers 3B, 3C, acoustic control information can be output. The acoustic control information is output to the computer 4 via the USB cable 7. The music information processed by the computer 4 is input to the digital mixer 3, and the music information including the input digital signal is converted into an analog signal and output from the speaker 5 through the analog cable 8. The
The digital player 2 and the digital mixer 3 are connected to each other via a LAN (Local Area Network) cable 9 compliant with the IEEE 1394 standard, and the sound control generated by operating the digital player 2 without using the computer 4. Information can also be output directly to the digital mixer 3 for DJ performance.
 [2]コンピュータ4の機能ブロック構成
 図2には、楽曲構造解析装置としてのコンピュータ4の機能ブロック図が示されている。コンピュータ4は、演算処理装置10上で実行される楽曲構造解析プログラムとしての
位置情報取得部11、音数解析部12、低音レベル解析部13、比率演算部14、特徴区間割付部15、および表示情報生成部16を備える。
[2] Functional Block Configuration of Computer 4 FIG. 2 shows a functional block diagram of the computer 4 as a music structure analyzing apparatus. The computer 4 includes a position information acquisition unit 11, a sound number analysis unit 12, a bass level analysis unit 13, a ratio calculation unit 14, a feature section allocation unit 15, and a display as a music structure analysis program executed on the arithmetic processing device 10. An information generation unit 16 is provided.
The position information acquisition unit 11 acquires the bar number of each development point set in the music data M1 as that development point's position information. Specifically, as shown in FIG. 3, the position information acquisition unit 11 acquires the position information (in this embodiment, the bar number) of the development points P1, P2, ..., Pn, Pn+3, ..., Pe-4 set between bars in the music data M1.

A development point is set as follows: for each bar, the number of sounds with different frequencies is obtained by frequency analysis such as an FFT, counting the peaks of the sound pressure level; then, taking the bar with the largest number of sounds in the music data M1 as a reference, the ratio of the number of sounds in every other bar is calculated, and a development point is set at each position between bars where this ratio changes greatly. Development points can be set inside the computer 4 using the sound number analysis unit 12, but, as in this embodiment, it is also possible to use music data M1 in which the number of sounds has been analyzed in advance and the development points P1, P2, ..., Pn, Pn+3, ..., Pe-4 have already been set.

The setting of the development points P1, P2, ..., Pn, Pn+3, ..., Pe-4 is not limited to the method described above; it may be performed, for example, on the basis of the similarity of phrases in the music data M1.
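The development-point criterion described above (count spectral peaks per bar, normalize by the bar with the most sounds, and mark inter-bar positions where the ratio jumps) can be sketched as follows. This is an illustrative reading of the specification, not the patented implementation; the function names and the `level_db` and `jump` thresholds are assumptions chosen for the sketch.

```python
import numpy as np

def count_sounds(bar_samples, level_db=-40.0):
    """Count sound-pressure peaks in one bar's spectrum, a stand-in for
    'the number of sounds with different frequencies' in the text."""
    windowed = bar_samples * np.hanning(len(bar_samples))
    spectrum = np.abs(np.fft.rfft(windowed))
    db = 20 * np.log10(spectrum / (spectrum.max() + 1e-12) + 1e-12)
    # a bin counts as a peak if it exceeds both neighbours and the level floor
    peaks = (db[1:-1] > db[:-2]) & (db[1:-1] > db[2:]) & (db[1:-1] > level_db)
    return int(peaks.sum())

def development_points(bars, jump=0.25):
    """Bar-boundary indices where the sound count, normalized by the
    busiest bar, changes by more than `jump`."""
    counts = np.array([count_sounds(b) for b in bars], dtype=float)
    ratios = counts / counts.max()
    return [i + 1 for i in range(len(ratios) - 1)
            if abs(ratios[i + 1] - ratios[i]) > jump]
```

With six one-second bars in which the fourth carries eight tones and the others two, the boundaries on either side of the busy bar are reported as development points.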
The sound number analysis unit 12 detects the signal level of each frequency band of the input music data M1 for each section between the development points P1, P2, ..., Pn, Pn+3, ..., Pe-4, and analyzes the number of sounds. Here, the "number of sounds" may count sounds of different frequencies as distinct sounds, or may count a fundamental and its overtones together as one note of the same pitch. The input music data M1 may be music data stored on a hard disk in the computer 4, music data recorded on a CD, Blu-ray disc, or the like inserted into a slot of the digital player 2, or music data downloaded from a network via a communication line.
Specifically, as shown in FIG. 4, the sound number analysis unit 12 counts the number of sounds in each section between the development points P1, P2, ..., Pn, Pn+3, ..., Pe-4 of the music data M1 by counting the number of frequency peaks. Although this embodiment analyzes sounds of different frequencies using an FFT, the present invention is not limited to this; the frequency transform may instead be performed using, for example, a discrete cosine transform or a discrete Fourier transform.

The sound number analysis unit 12 outputs the analysis result to the ratio calculation unit 14.
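The two counting conventions mentioned for the "number of sounds" (distinct frequencies as distinct sounds, or a fundamental and its overtones folded into one pitch) differ only in a merging step. A hypothetical helper for the folding variant, with an assumed ratio tolerance `tol`:

```python
def fold_harmonics(peak_freqs_hz, tol=0.03):
    """Count detected spectral peaks, merging any peak lying near an
    integer multiple of an already-counted fundamental into that note."""
    fundamentals = []
    for f in sorted(peak_freqs_hz):
        is_overtone = any(abs(f / g - round(f / g)) < tol for g in fundamentals)
        if not is_overtone:
            fundamentals.append(f)
    return len(fundamentals)
```

Counting the peaks 110, 220, 330, and 440 Hz this way yields one note, whereas treating each frequency as a distinct sound would yield four.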
The bass level analysis unit 13 is provided in order to allocate a hook (chorus) section to one of the sections between the development points P1, P2, ..., Pn, Pn+3, ..., Pe-4, taking into account the per-bar average of the low-frequency sound pressure peak level, which serves as the low-frequency signal level.

The bass level analysis unit 13 analyzes the low-frequency sound pressure peak level at frequencies below a predetermined frequency, for example bass sounds of 100 Hz or lower, in each section between the development points P1, P2, ..., Pn, Pn+3, ..., Pe-4. Specifically, the bass level analysis unit 13 acquires the low-frequency sound pressure levels of the bass drum, bass guitar, and the like, and uses the average of the low-frequency sound pressure peak levels over the bars of a section as the signal level of that section between development points.

The bass level analysis unit 13 outputs the analysis result to the feature section allocation unit 15.
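The per-bar low-band averaging performed by the bass level analysis unit can be sketched as below, assuming a section given as raw samples, a uniform bar length in samples, and the 100 Hz example cutoff; the function name and parameters are hypothetical.

```python
import numpy as np

def bass_peak_average(section, rate, bar_len, cutoff_hz=100.0):
    """Average, over the section's bars, of the peak spectral magnitude
    at or below `cutoff_hz` (bass drum, bass guitar, and similar)."""
    bar_peaks = []
    for start in range(0, len(section) - bar_len + 1, bar_len):
        bar = section[start:start + bar_len]
        spectrum = np.abs(np.fft.rfft(bar))
        freqs = np.fft.rfftfreq(len(bar), d=1.0 / rate)
        bar_peaks.append(spectrum[freqs <= cutoff_hz].max())
    return float(np.mean(bar_peaks))
```

A section whose bars all carry a strong 50 Hz component scores higher than one where the low band appears in only some bars, which is what lets the unit separate hook candidates from quieter passages.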
Based on the analysis result of the sound number analysis unit 12, the ratio calculation unit 14 calculates the ratio of the number of sounds in every other section, taking as the reference the section between development points P1, P2, ..., Pn, Pn+3, ..., Pe-4 with the largest number of sounds. Specifically, as shown in FIG. 4, the ratio calculation unit 14 identifies the n-th section, which yields the spectrum FFT2 with the largest number of sounds, acquires the number of sounds Nmax in that n-th bar, and calculates the ratio Rn of each other bar from the following formula (1):

Rn = Nn / Nmax ... (1)

The ratio of each other section is calculated relative to the maximal n-th section between development points, as indicated by the numbers inside the rectangles representing the bars. The ratio calculation unit 14 outputs the calculation result to the feature section allocation unit 15.
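Formula (1) reduces to a one-line normalization over the per-section counts, with the reference section found implicitly as the largest count; a minimal sketch:

```python
def sound_count_ratios(counts):
    """Rn = Nn / Nmax for each section, per formula (1); the section
    with the largest sound count gets ratio 1.0."""
    n_max = max(counts)
    return [n / n_max for n in counts]
```

For counts of 4, 8, and 2 sounds the ratios are 0.5, 1.0, and 0.25; these are the per-section values the feature section allocation unit consumes.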
Based on the calculation result of the ratio calculation unit 14 and the analysis result of the bass level analysis unit 13, the feature section allocation unit 15 allocates feature sections such as an intro (Intro) section, an A melody (Verse1) section, a B melody (Verse2) section, a chorus (Hook) section, a C melody (Verse3) section, and an outro (Outro) section to the sections between the development points P1, P2, ..., Pn, Pn+3, ..., Pe-4.

Specifically, as shown in graph G1 of FIG. 5, the feature section allocation unit 15 searches, based on the analysis result of the sound number analysis unit 12, for local maxima (maximum value 1 and maximum value 2) that are larger than the preceding and following sections, and takes the sections giving maximum value 1 and maximum value 2 as hook (chorus) section candidates.

As shown in graph G2 of FIG. 5, the feature section allocation unit 15 then acquires, based on the analysis result of the bass level analysis unit 13, the average low-frequency sound pressure peak level of each feature section, determines whether the average in the sections giving maximum value 1 and maximum value 2 exceeds a predetermined threshold, and allocates the hook sections accordingly.
Next, as shown in graph G3 of FIG. 5, the feature section allocation unit 15 allocates the section before maximum value 1 or maximum value 2 as an A melody section (Verse1), the section after it as a B melody section (Verse2), the section after that as a C melody section (Verse3), and so on. Which feature section is allocated is decided by whether the number of sounds exceeds a predetermined threshold. The predetermined threshold may be a fixed threshold smaller than the local maximum, or a threshold that varies with the local maximum, set as a predetermined proportion of it.

The naming of the allocated feature sections is arbitrary; in FIG. 5, names such as A-Verse and B-Verse may be used instead. Also, since the intro section is obviously the first section of the music data M1 and the outro section is obviously the last, the feature section allocation unit 15 allocates the intro and outro sections in advance.
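The allocation pass over graphs G1 to G3 can be sketched as follows: intro and outro first, hooks at bass-confirmed local maxima of the sound-count ratio, and the remaining sections labeled Verse1 or Verse2 by a threshold on their ratio. The threshold values and the simple labeling here are illustrative assumptions, not the claimed implementation.

```python
def allocate_sections(ratios, bass_avgs, bass_floor, verse_split=0.5):
    """Label each inter-development-point section of one piece."""
    labels = [None] * len(ratios)
    labels[0], labels[-1] = "Intro", "Outro"  # first and last sections
    for i in range(1, len(ratios) - 1):
        is_local_max = ratios[i] > ratios[i - 1] and ratios[i] > ratios[i + 1]
        # a hook candidate is confirmed only if its bass average is high enough
        if is_local_max and bass_avgs[i] > bass_floor:
            labels[i] = "Hook"
    for i in range(1, len(ratios) - 1):
        if labels[i] is None:
            labels[i] = "Verse1" if ratios[i] > verse_split else "Verse2"
    return labels
```

With ratios [0.2, 0.6, 1.0, 0.4, 0.9, 0.3] and bass averages confirming both local maxima, the labels come out as Intro, Verse1, Hook, Verse2, Hook, Outro.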
The display information generation unit 16 generates the feature sections allocated by the feature section allocation unit 15, together with the music data M1, as display information. Specifically, as shown in FIG. 6, it generates display information that displays the feature sections as the music data M1 plays and changes the color of each feature section as playback progresses.

The display information generated by the display information generation unit 16 is output to the display 2B serving as the display device of the digital player 2, so the DJ performer can confirm which feature section is being played as the performance of the music data M1 progresses.
[3] Operation and Effect of the Embodiment

Next, the music structure analysis method that constitutes the operation of this embodiment will be described with reference to the flowchart shown in FIG. 7.

The position information acquisition unit 11 acquires the position information of the development points P1, P2, ..., Pn, Pn+3, ..., Pe-4 in the music data M1 (step S1).

Next, the sound number analysis unit 12 analyzes the number of sounds in each section between the development points P1, P2, ..., Pn, Pn+3, ..., Pe-4 (step S2).

Based on the analysis result of the sound number analysis unit 12, the ratio calculation unit 14 calculates the ratio of each other section between the development points P1, P2, ..., Pe-4, taking as the reference the section between development points Pn and Pn+3 with the maximum number of sounds (step S3).
The feature section allocation unit 15 allocates an intro section to the span from the start of the music data M1 to the first development point P1 (step S4).

Next, the feature section allocation unit 15 allocates an outro section to the span from the last development point Pe-4 to the end of the music data M1 (step S5).

The feature section allocation unit 15 then searches for local maxima among the sections between development points other than the intro and outro sections (step S6). The search may start from the section following the intro section or from the section preceding the outro section.

When a section giving a local maximum is found, the feature section allocation unit 15 acquires the average low-frequency sound pressure peak level of the sections between the development points P1, P2, ..., Pn, Pn+3, ..., Pe-4 (step S7).
The feature section allocation unit 15 determines whether the average low-frequency sound pressure peak level of the section giving the local maximum exceeds a predetermined threshold (step S8).

If it is at or below the predetermined threshold, the feature section allocation unit 15 searches for the next local maximum.

If it exceeds the predetermined threshold, the feature section allocation unit 15 allocates a hook (chorus) section to that section (step S9).

The feature section allocation unit 15 repeats steps S6 to S9 for every section in the music data M1 that gives a local maximum (step S10).

Steps S8 and S9 are performed in order to improve the detection accuracy of the hook section; alternatively, a hook section may be allocated to a section between development points solely by searching for local maxima of the number of sounds, without the bass level check.
When all hook sections have been allocated, the feature section allocation unit 15 acquires the number of sounds in the other sections between development points before and after the sections set as hook sections (step S11).

The feature section allocation unit 15 determines whether the number of sounds in such a section between development points exceeds a predetermined threshold (step S12).
If the ratio of the number of sounds exceeds the predetermined threshold, the feature section allocation unit 15 allocates an A melody (Verse1) section to that section (step S13); if it is at or below the predetermined threshold, it allocates a B melody (Verse2) section (step S14).

The feature section allocation unit 15 repeats this until feature sections have been allocated to all sections between the development points P1, P2, ..., Pn, Pn+3, ..., Pe-4 (step S15).

When feature sections have been allocated to all sections, the feature section allocation unit 15 outputs the allocation result to the display information generation unit 16, which generates display information based on it and outputs the generated display information to the display 2B of the digital player 2 (step S16).
According to this embodiment, all feature sections can be allocated merely by having the sound number analysis unit 12 analyze the number of sounds, so feature sections can be allocated to the music data M1 simply and quickly.

In addition, by outputting display information from the display information generation unit 16 to the display 2B, a user giving a DJ performance can see which feature section is currently being played, enabling a more sophisticated DJ performance.
DESCRIPTION OF REFERENCE NUMERALS: 1... acoustic control system, 2... digital player, 2A... jog dial, 2B... display, 3... digital mixer, 3A... operation switch, 3B... volume adjustment lever, 3C... left/right switching lever, 4... computer, 5... speaker, 6... cable, 7... USB cable, 8... analog cable, 9... LAN cable, 10... arithmetic processing device, 11... position information acquisition unit, 12... sound number analysis unit, 13... bass level analysis unit, 14... ratio calculation unit, 15... feature section allocation unit, 16... display information generation unit, G1, G2, G3... graphs, M1... music data, P1, P2, Pn, Pn+3, Pe-4... development points.

Claims (9)

1.  A music structure analysis apparatus for allocating feature sections, which characterize the structure of music data, to music data in which development points of the feature sections are set, the apparatus comprising:
    a position information acquisition unit that acquires position information of the development points;
    a sound number analysis unit that analyzes, based on the position information of the development points acquired by the position information acquisition unit, the number of sounds with different frequencies for each section between development points; and
    a feature section allocation unit that allocates the feature sections to other sections between development points based on the section between development points that gives the maximum value of the number of sounds analyzed by the sound number analysis unit.
2.  The music structure analysis apparatus according to claim 1, further comprising a bass level analysis unit that analyzes, based on the position information of the development points acquired by the position information acquisition unit, the signal level of sounds at frequencies below a predetermined frequency for each section between development points,
    wherein the feature section allocation unit allocates the feature sections taking into account the section between development points that gives the maximum value of the low-frequency signal level analyzed by the bass level analysis unit.
3.  The music structure analysis apparatus according to claim 1 or claim 2, wherein the feature section allocation unit allocates a hook section between the development points enclosing the bar that gives the maximum value of the number of sounds.
4.  The music structure analysis apparatus according to claim 3, wherein the feature section allocation unit allocates feature sections to the other sections between development points based on whether the number of sounds in those sections exceeds a fixed threshold.
5.  The music structure analysis apparatus according to claim 3, wherein the feature section allocation unit allocates feature sections to the other sections between development points based on whether the number of sounds in those sections exceeds a threshold that varies according to the maximum value.
6.  The music structure analysis apparatus according to any one of claims 1 to 5, wherein the sound number analysis unit performs a frequency transform and analyzes the number of sounds from the signal level of each frequency band.
7.  The music structure analysis apparatus according to any one of claims 1 to 6, further comprising a display information generation unit that generates display information for displaying, on a display device, the feature sections allocated by the feature section allocation unit together with the music data.
8.  A music structure analysis method for allocating feature sections, which characterize the structure of music data, to music data in which development points of the feature sections are set, the method comprising:
    acquiring position information of the development points;
    analyzing, based on the acquired position information of the development points, the number of sounds with different frequencies for each section between development points; and
    allocating feature sections to other sections between development points based on the section between development points that gives the maximum value of the analyzed number of sounds.
9.  A music structure analysis program for causing a computer to function as the music structure analysis apparatus according to any one of claims 1 to 7.
PCT/JP2016/063981 2016-05-11 2016-05-11 Music structure analysis device, method for analyzing music structure, and music structure analysis program WO2017195292A1 (en)

Priority Applications (3)

Application Number Priority Date Filing Date Title
JP2018516262A JPWO2017195292A1 (en) 2016-05-11 2016-05-11 Music structure analysis apparatus, music structure analysis method, and music structure analysis program
EP16901640.9A EP3457395A4 (en) 2016-05-11 2016-05-11 Music structure analysis device, method for analyzing music structure, and music structure analysis program
PCT/JP2016/063981 WO2017195292A1 (en) 2016-05-11 2016-05-11 Music structure analysis device, method for analyzing music structure, and music structure analysis program

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/JP2016/063981 WO2017195292A1 (en) 2016-05-11 2016-05-11 Music structure analysis device, method for analyzing music structure, and music structure analysis program

Publications (1)

Publication Number Publication Date
WO2017195292A1

Family

ID=60266426

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2016/063981 WO2017195292A1 (en) 2016-05-11 2016-05-11 Music structure analysis device, method for analyzing music structure, and music structure analysis program

Country Status (3)

Country Link
EP (1) EP3457395A4 (en)
JP (1) JPWO2017195292A1 (en)
WO (1) WO2017195292A1 (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2022070396A1 (en) * 2020-10-02 2022-04-07 AlphaTheta株式会社 Music analysis device, music analysis method, and program
WO2023054237A1 (en) * 2021-09-30 2023-04-06 パイオニア株式会社 Sound effect output device

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP4775380B2 (en) 2004-09-28 2011-09-21 ソニー株式会社 Apparatus and method for grouping time segments of music
JP2012194387A (en) * 2011-03-16 2012-10-11 Yamaha Corp Intonation determination device

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US4926737A (en) * 1987-04-08 1990-05-22 Casio Computer Co., Ltd. Automatic composer using input motif information
EP2088518A1 (en) * 2007-12-17 2009-08-12 Sony Corporation Method for music structure analysis


Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
See also references of EP3457395A4 *


Also Published As

Publication number Publication date
EP3457395A4 (en) 2019-10-30
EP3457395A1 (en) 2019-03-20
JPWO2017195292A1 (en) 2019-03-07

Similar Documents

Publication Publication Date Title
US11915725B2 (en) Post-processing of audio recordings
US9117429B2 (en) Input interface for generating control signals by acoustic gestures
JP4650662B2 (en) Signal processing apparatus, signal processing method, program, and recording medium
JP4645241B2 (en) Voice processing apparatus and program
KR20090130833A (en) System and method for automatically producing haptic events from a digital audio file
WO2017057530A1 (en) Audio processing device and audio processing method
JPH04195196A (en) Midi chord forming device
WO2017195292A1 (en) Music structure analysis device, method for analyzing music structure, and music structure analysis program
US9087503B2 (en) Sampling device and sampling method
JP6281211B2 (en) Acoustic signal alignment apparatus, alignment method, and computer program
CN108369800B (en) Sound processing device
JP2015018112A (en) Musical sound generating apparatus, musical sound generating method, and program
WO2017135350A1 (en) Recording medium, acoustic processing device, and acoustic processing method
JP6625202B2 (en) Music structure analysis device, music structure analysis method, and music structure analysis program
Stöter et al. Unison Source Separation.
CN114724583A (en) Music fragment positioning method, device, equipment and storage medium
US10424279B2 (en) Performance apparatus, performance method, recording medium, and electronic musical instrument
JP2015087436A (en) Voice sound processing device, control method and program for voice sound processing device
JP6357772B2 (en) Electronic musical instrument, program and pronunciation pitch selection method
WO2024034115A1 (en) Audio signal processing device, audio signal processing method, and program
JP7176114B2 (en) MUSIC ANALYSIS DEVICE, PROGRAM AND MUSIC ANALYSIS METHOD
JP5151603B2 (en) Electronic musical instruments
WO2024034116A1 (en) Audio data processing device, audio data processing method, and program
JP5375869B2 (en) Music playback device, music playback method and program
JP4238807B2 (en) Sound source waveform data determination device

Legal Events

ENP (Entry into the national phase): Ref document number: 2018516262; Country of ref document: JP; Kind code of ref document: A
NENP (Non-entry into the national phase): Ref country code: DE
121 (Ep: the epo has been informed by wipo that ep was designated in this application): Ref document number: 16901640; Country of ref document: EP; Kind code of ref document: A1
ENP (Entry into the national phase): Ref document number: 2016901640; Country of ref document: EP; Effective date: 20181211