CN109493879B - Music rhythm analysis and extraction method and device - Google Patents


Info

Publication number
CN109493879B
CN109493879B (granted from application CN201811578654.7A)
Authority
CN
China
Prior art keywords
music
rhythm
phrase
extracting
time value
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201811578654.7A
Other languages
Chinese (zh)
Other versions
CN109493879A (en)
Inventor
尹学渊 (Yin Xueyuan)
孟祥函 (Meng Xianghan)
陈超 (Chen Chao)
Current Assignee (the listed assignees may be inaccurate)
Chengdu Potential Artificial Intelligence Technology Co ltd
Original Assignee
Chengdu Hifive Technology Co ltd
Priority date (the priority date is an assumption and is not a legal conclusion)
Application CN201811578654.7A filed by Chengdu Hifive Technology Co ltd
Publication of application CN109493879A
Application granted; publication of CN109493879B
Legal status: Active


Classifications

    • G — PHYSICS
    • G10 — MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L — SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 — Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/03 — Speech or voice analysis techniques characterised by the type of extracted parameters
    • G10H — ELECTROPHONIC MUSICAL INSTRUMENTS; INSTRUMENTS IN WHICH THE TONES ARE GENERATED BY ELECTROMECHANICAL MEANS OR ELECTRONIC GENERATORS, OR IN WHICH THE TONES ARE SYNTHESISED FROM A DATA STORE
    • G10H2210/00 — Aspects or methods of musical processing having intrinsic musical character, i.e. involving musical theory or musical parameters or relying on musical knowledge, as applied in electrophonic musical tools or instruments
    • G10H2210/031 — Musical analysis, i.e. isolation, extraction or identification of musical elements or musical parameters from a raw acoustic signal or from an encoded audio signal
    • G10H2210/051 — Musical analysis for extraction or detection of onsets of musical sounds or notes, i.e. note attack timings
    • G10H2210/056 — Musical analysis for extraction or identification of individual instrumental parts, e.g. melody, chords, bass; identification or separation of instrumental parts by their characteristic voices or timbres
    • G10H2210/071 — Musical analysis for rhythm pattern analysis or rhythm style recognition
    • G10H2210/076 — Musical analysis for extraction of timing, tempo; beat detection

Landscapes

  • Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Signal Processing (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Auxiliary Devices For Music (AREA)

Abstract

The invention discloses a music melody rhythm analysis and extraction method and device, relating to the field of computer technology. The method comprises the following steps: preprocessing a MIDI music file to obtain a target track; traversing all notes in the target track to obtain the start time value and end time value of each note; identifying the root of each chord in the target track; splitting the polyphones in the target track into multiple monophones; splicing the roots, the monophones and the remaining notes according to the start and end time values of the notes to obtain the complete melody rhythm, and identifying each music piece in the melody rhythm and the phrases in each piece; and performing similarity analysis on all phrases to obtain the time value similarity of each phrase and mark it with a corresponding label. The method and device can extract and analyze music rhythm, better mine the information hidden in the rhythm structure, and better guide research and development in computer music composition.

Description

Music rhythm analysis and extraction method and device
Technical Field
The invention relates to the technical field of computers, in particular to a music melody rhythm analyzing and extracting method and device.
Background
With the development of science and technology, computer music composition is gradually coming into use. Composing music by computer can greatly reduce the workload of human composers and can break away from traditional patterns to produce novel music. Although computer-generated music cannot yet rival that of human musicians, machine works can provide candidate or draft pieces that make composition easier for humans, so computer music composition has very broad application scenarios.
In computer music composition and analysis, extracting and analyzing the melody rhythm is an indispensable step. At present, conventional techniques extract the rhythm's time value structure manually or with visualization tools; this is difficult to apply at computational scale and does not allow scientific analysis with big-data techniques.
Disclosure of Invention
In view of the above, the present invention provides a method and an apparatus for analyzing and extracting music melody rhythm, so as to address the above problems.
In order to achieve the purpose, the invention adopts the following technical scheme:
in a first aspect, an embodiment of the present invention provides a music melody rhythm analysis and extraction method, where the method includes:
preprocessing an MIDI music file to obtain a target audio track;
traversing all notes in the target audio track to obtain a starting time value and an ending time value of each note in the target audio track;
identifying the root of each chord in the target audio track;
processing polyphones in the target audio track into a plurality of monophones;
and splicing the roots, the plurality of monophones and the remaining notes according to the start time value and end time value of the notes to obtain the complete melody rhythm, wherein the remaining notes are the notes in the target audio track other than the chord tones and the polyphones.
The music melody rhythm analyzing and extracting method as described above, optionally, the method further includes:
identifying each music piece in the melody rhythm;
identifying the phrases in each music piece;
carrying out similarity analysis on all the phrases to obtain the time value similarity of each phrase;
and marking corresponding labels for each phrase, wherein the labels corresponding to phrases with the similarity exceeding a preset threshold are the same or similar.
The music melody rhythm analyzing and extracting method as described above, optionally, the method further includes:
extracting a time value structure of each music segment;
the identifying of phrases in each music piece comprises:
identifying the phrases in each music piece according to the time value structure of that music piece.
The music melody rhythm analyzing and extracting method as described above, optionally, the method further includes:
extracting a time value structure of each phrase;
the performing of similarity analysis on all phrases comprises:
performing similarity analysis on all phrases according to the time value structure of each phrase.
Optionally, the method for analyzing and extracting music melody rhythm as described above, where the preprocessing is performed on the MIDI music file to obtain the target track, includes:
and extracting the audio track of the MIDI file according to predefined extraction parameters to obtain the target audio track, wherein the extraction parameters comprise a start time and an end time.
In a second aspect, an embodiment of the present invention provides a music melody rhythm analyzing and extracting device, including:
the pre-processing module is used for pre-processing the MIDI music file to obtain a target audio track;
the acquisition module is used for traversing all the notes in the target audio track to obtain a starting time value and an ending time value of each note in the target audio track;
the identification module is used for identifying the root of each chord in the target audio track;
a processing module for processing polyphones in the target audio track into a plurality of monophones;
and the splicing module is used for splicing the roots, the plurality of monophones and the remaining notes according to the start time value and end time value of the notes to obtain the complete melody rhythm, wherein the remaining notes are the monophones in the target audio track other than the chord tones and the polyphones.
The device for analyzing and extracting music melody rhythm as described above optionally further includes a similarity analyzing module and a marking module, and the identifying module is further configured to identify each music piece in the melody rhythm; and
identifying the phrases in each music piece;
the similarity analysis module is used for carrying out similarity analysis on all the phrases to obtain the time value similarity of each phrase;
the marking module is used for marking corresponding labels for each phrase, wherein the labels corresponding to the phrases with the similarity exceeding a preset threshold are the same or similar.
Optionally, the device for analyzing and extracting music melody rhythm as described above further includes an extraction module, where the extraction module is configured to extract a duration structure of each musical piece;
the recognition module is used for recognizing the phrases in each music piece according to the time value structure of each music piece.
Optionally, the extracting module is further configured to extract a time value structure of each phrase;
and the similarity analysis module is used for performing similarity analysis on all phrases according to the time value structure of each phrase.
Optionally, the pre-processing module is configured to extract tracks of a MIDI file according to predefined extraction parameters, so as to obtain the target track, where the extraction parameters include a start time and an end time.
Compared with the prior art, the invention has the beneficial effects that:
the music melody rhythm analyzing and extracting method and the device can realize extraction and analysis of music rhythm, can better mine information hidden in a rhythm structure, and better guide research and development of computer music creation. In the prior art, bands and phrases in rhythm cannot be subdivided, the starting time value and the ending time value of notes cannot be accurately positioned through a program, and meanwhile, the operations of time value structure rhythm analysis based on the analysis, rhythm representation and visual representation and the like cannot be accurately analyzed.
Drawings
Fig. 1 is a block diagram of a terminal device according to a preferred embodiment of the present invention.
Fig. 2 is a flowchart illustrating a music melody rhythm analyzing and extracting method according to a preferred embodiment of the invention.
Fig. 3 is a functional block diagram of an apparatus for analyzing and extracting music melody rhythm according to a preferred embodiment of the invention.
Description of reference numerals: 100-a terminal device; 110-music melody rhythm analyzing and extracting device; 111-a pre-processing module; 112-an acquisition module; 113-an identification module; 114-a processing module; 115-a splicing module; 116-similarity analysis module; 117-marking module; 118-an extraction module; 120-a memory; 130-a memory controller; 140-a processor; 150-peripheral interface; 160-input-output unit; 170-display unit.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, but not all, embodiments of the present invention. The components of embodiments of the present invention generally described and illustrated in the figures herein may be arranged and designed in a wide variety of different configurations.
Thus, the following detailed description of the embodiments of the present invention, presented in the figures, is not intended to limit the scope of the invention, as claimed, but is merely representative of selected embodiments of the invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
It should be noted that: like reference numbers and letters refer to like items in the following figures, and thus, once an item is defined in one figure, it need not be further defined and explained in subsequent figures.
The terms "first," "second," "third," and the like are used solely to distinguish one from another and are not to be construed as indicating or implying relative importance.
The method and the apparatus for analyzing and extracting music melody rhythm provided by the embodiment of the present invention can be applied to the terminal device 100 shown in fig. 1, and in the embodiment of the present invention, the terminal device 100 can be a Personal Computer (PC), a tablet PC, a smart phone, a Personal Digital Assistant (PDA), and the like. It is understood that in other embodiments, the terminal device 100 may also be a network server or a database server.
Referring to fig. 1, the terminal device 100 includes a music melody rhythm analyzing and extracting device 110, a memory 120, a memory controller 130, a processor 140, a peripheral interface 150, an input/output unit 160, and a display unit 170.
The memory 120, the memory controller 130, the processor 140, the peripheral interface 150, the input/output unit 160, and the display unit 170 are electrically connected to each other directly or indirectly to implement data transmission or interaction. For example, the components may be electrically connected to each other via one or more communication buses or signal lines. The music melody rhythm analyzing and extracting device 110 includes at least one software function module which can be stored in the memory 120 in the form of software or firmware or solidified in an Operating System (OS) of the terminal device 100. The processor 140 is configured to execute executable modules stored in the memory 120, such as the software functional modules or computer programs included in the music melody rhythm analyzing and extracting device 110.
The memory 120 may be, but is not limited to, a Random Access Memory (RAM), a Read-Only Memory (ROM), a Programmable Read-Only Memory (PROM), an Erasable Programmable Read-Only Memory (EPROM), an Electrically Erasable Programmable Read-Only Memory (EEPROM), and the like. The memory 120 is configured to store a program, and the processor 140 executes the program after receiving an execution instruction; the method executed by the terminal device 100 defined by the flow disclosed in any of the foregoing embodiments of the present invention may be applied to, or implemented by, the processor 140.
The processor 140 may be an integrated circuit chip having signal processing capabilities. The processor 140 may be a general-purpose processor, including a Central Processing Unit (CPU), a Network Processor (NP), and the like, and may also be a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field-Programmable Gate Array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component. The various methods, steps and logic blocks disclosed in the embodiments of the present invention may be implemented or performed by it. A general-purpose processor may be a microprocessor, or the processor may be any conventional processor or the like.
The peripheral interface 150 couples various input/output devices to the processor 140 as well as to the memory 120. In some embodiments, the peripheral interface 150, the processor 140, and the memory controller 130 may be implemented in a single chip. In other examples, they may be implemented by separate individual chips.
The input/output unit 160 is used for providing input data for a user to realize the interaction of the user with the terminal device 100. The input/output unit 160 may be, but is not limited to, a mouse, a keyboard, and the like.
The display unit 170 provides an interactive interface (e.g., a user operation interface) between the terminal device 100 and a user, or is used to display image data for the user's reference. In this embodiment, the display unit 170 may be a liquid crystal display or a touch display. In the case of a touch display, it can be a capacitive or resistive touch screen supporting single-point and multi-point touch operations, meaning that the touch display can sense touch operations from one or more locations at the same time and send the sensed touch operations to the processor 140 for calculation and processing.
Please refer to fig. 2, which is a flowchart illustrating a music melody rhythm analyzing and extracting method applied to the terminal device 100 shown in fig. 1 according to a preferred embodiment of the present invention. The specific process shown in fig. 2 will be described in detail below.
Step S101, the MIDI music file is preprocessed to obtain the target audio track.
In the embodiment of the present invention, the terminal device 100 stores in advance a MIDI music file whose melody rhythm is to be analyzed and extracted. When analyzing and extracting the melody rhythm, the terminal device 100 loads the stored MIDI music file and extracts its tracks, obtaining one or more target tracks; for convenience of description, the following uses a single track as an example.
When extracting from the MIDI music file, the terminal device 100 may also let the user define extraction parameters, including a start time and an end time, and extract a segment of a track from the MIDI music file according to those parameters to obtain the target track.
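A minimal sketch of the segment extraction described above, assuming notes have already been parsed from the MIDI file into `(start_sec, end_sec, pitch)` tuples (this representation, and the function name `extract_segment`, are illustrative assumptions, not the patent's implementation):

```python
# Hypothetical sketch: extract a target-track segment from a parsed note list
# using user-defined extraction parameters (start time, end time).
# Notes are modeled as (start_sec, end_sec, pitch) tuples; actual MIDI parsing
# is assumed to have happened upstream.

def extract_segment(notes, start_time, end_time):
    """Keep only the notes that overlap the interval [start_time, end_time)."""
    return [(s, e, p) for (s, e, p) in notes
            if e > start_time and s < end_time]

track = [(0.0, 0.5, 60), (0.5, 1.0, 62), (2.0, 2.5, 64)]
segment = extract_segment(track, start_time=0.0, end_time=1.0)
```

The overlap test (`e > start_time and s < end_time`) keeps notes that are only partially inside the window, which matches how a track excerpt normally sounds.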
Further, after the MIDI music file is preprocessed to obtain the target track, the terminal device 100 may calculate the start time value of the track rhythm in relative time according to the tempo (BPM, beats per minute) data and beat changes of the target track.
Step S102, traverse all the notes in the target track to obtain the start duration and the end duration of each note in the target track.
After preprocessing the MIDI music file to obtain the target track, the terminal device 100 traverses all the notes in the target track to obtain the start duration and the end duration of each note in the target track, wherein the start duration refers to the start time of the note in the track, and the end duration refers to the end time of the note in the track.
In this embodiment, the time value calculation may use the sixty-fourth note as the unit duration and the quarter note as one beat. Let the start or end time of a note be note_time, its corresponding time value be note_time_value, the start time of the first bar of the music segment containing the note be basic_start_time, and the tempo be bpm. The calculation proceeds as follows:
Duration unit_second (s) of a sixty-fourth note:
unit_second = 60/(bpm*16).
Relative start or end time of a note:
note_relative_time = note_time - basic_start_time.
Time value of a note:
note_time_value = note_relative_time/unit_second.
Considering that the beat and tempo may change, let the last value before the change be last_note_time_value, recalculate basic_start_time, unit_second, note_relative_time, etc., and compute the value after the change:
note_time_value = note_relative_time/unit_second + last_note_time_value.
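The formulas above translate directly into code. A sketch (function names are illustrative; the formulas are the ones stated in the text):

```python
# Time value calculation: one sixty-fourth note is the unit duration,
# so a quarter note (one beat) equals 16 units.

def unit_second(bpm):
    # duration in seconds of one sixty-fourth note at the given tempo
    return 60.0 / (bpm * 16)

def note_time_value(note_time, basic_start_time, bpm, last_note_time_value=0.0):
    # time value of a note boundary, in sixty-fourth-note units, relative to
    # the start of the current bar/segment; last_note_time_value carries the
    # accumulated value forward across a tempo or beat change
    note_relative_time = note_time - basic_start_time
    return note_relative_time / unit_second(bpm) + last_note_time_value

# at 120 BPM one beat lasts 0.5 s, i.e. 16 sixty-fourth units
tv = note_time_value(note_time=0.5, basic_start_time=0.0, bpm=120)
```

After a tempo change, `basic_start_time` is reset to the change point and the value accumulated so far is passed in as `last_note_time_value`, exactly as the final formula describes.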
step S103, the sound root of the chord tone in the target track is identified.
In general, a track contains chord tones and polyphones, where a polyphone refers to multiple notes sounding at the same time. After the target track is obtained, the terminal device 100 identifies the root of each chord in the target track according to predefined music theory (consistent with standard music theory): a chord must be built on a particular tone, and that tone is the root, i.e. the fundamental tone of the chord.
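The patent does not spell out its root-finding rules, so the following is only a simplified sketch under standard music theory: try each chord tone as a candidate root and check whether the remaining pitch classes form a common triad built on it (the triad templates and the lowest-tone fallback are assumptions):

```python
# Simplified chord-root identification sketch (illustrative, not the patent's
# exact predefined music theory). Pitches are MIDI note numbers.

TRIADS = [frozenset({0, 4, 7}), frozenset({0, 3, 7}),   # major, minor
          frozenset({0, 3, 6}), frozenset({0, 4, 8})]   # diminished, augmented

def find_root(pitches):
    pcs = {p % 12 for p in pitches}
    for root in sorted(pcs):
        # transpose the chord so the candidate root is 0 and match templates
        if frozenset((pc - root) % 12 for pc in pcs) in TRIADS:
            return root
    return min(pitches) % 12  # fallback: treat the lowest tone as the root

# C major triad in first inversion (E4, G4, C5) still yields root C (pc 0)
root_pc = find_root([64, 67, 72])
```

Working on pitch classes makes the check inversion-independent, which is why the first-inversion example still resolves to C.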
Step S104, processing the multiple sound in the target audio track into a plurality of single sounds.
Meanwhile, the terminal device 100 splits the overlapping notes of each polyphone into separate monophones, and keeps the monophones already present in the target track.
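One way to sketch this splitting step is greedy voice separation: assign each note to the first monophonic stream it fits into without overlap. The note representation and the greedy strategy are assumptions for illustration; the patent does not fix an algorithm:

```python
# Sketch: split simultaneous (overlapping) notes into separate monophonic
# streams, one per chord voice. Notes are (start, end, pitch) tuples.

def split_polyphony(notes):
    """Greedy voice separation: place each note in the first voice it fits."""
    voices = []
    for note in sorted(notes, key=lambda n: (n[0], n[2])):
        for voice in voices:
            if voice[-1][1] <= note[0]:     # no overlap with voice's last note
                voice.append(note)
                break
        else:
            voices.append([note])           # open a new monophonic voice
    return voices

# a two-note chord followed by a single note
chord = [(0.0, 1.0, 60), (0.0, 1.0, 64), (1.0, 2.0, 62)]
voices = split_polyphony(chord)
```

Each returned voice is monophonic by construction, so the later splicing step can treat every element as a single tone.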
Step S105, splice the roots, the monophones and the remaining notes according to the start time value and end time value of the notes to obtain the complete melody rhythm.
The terminal device 100 has traversed all notes in the target track and obtained the start time value and end time value of each note. It splices the identified chord roots, the monophones obtained from the polyphones, and the remaining notes to obtain the complete melody rhythm, where the remaining notes are the monophones in the target track other than the chord tones and the polyphones.
When splicing, the terminal device 100 orders the roots, the monophones and the remaining notes by the start and end time values of each note, so that a complete rhythm progression is formed and the melody remains pleasing to hear.
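Since every group now consists of single tones with known time values, the splice reduces to a merge-and-sort, sketched below (the `(start, end, pitch)` representation is the same illustrative assumption as above):

```python
# Sketch: splice chord roots, split monophones and remaining notes back into
# one melody-rhythm sequence ordered by start time value, then end time value.

def splice(*note_groups):
    merged = [n for group in note_groups for n in group]
    return sorted(merged, key=lambda n: (n[0], n[1]))

roots      = [(0.0, 1.0, 48)]
monophones = [(0.0, 1.0, 64), (1.0, 2.0, 62)]
others     = [(2.0, 2.5, 60)]
rhythm = splice(roots, monophones, others)
```

Sorting by the start and end time values reproduces the temporal order of the original track, which is what makes the spliced result a complete rhythm rather than an unordered note set.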
In step S106, each music piece in the rhythm of the melody is identified.
Further, after the complete melody rhythm is obtained by splicing, the terminal device 100 may further identify each music piece in the melody rhythm according to a duration structure of the melody rhythm.
In step S107, phrases in each musical piece are identified.
Then, the terminal device 100 extracts the time value structure of each music piece and identifies the phrases in each piece according to the extracted structure.
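The patent does not state the exact phrase-boundary rule, but a common heuristic consistent with the time value structure is to break at rests longer than a threshold; the sketch below assumes that rule and the `(start, end)` pair representation:

```python
# Heuristic sketch (an assumption, not the patent's exact rule): split a
# music piece's duration structure into phrases wherever the gap between one
# note's end time value and the next note's start time value reaches `gap`.

def split_phrases(durations, gap=1.0):
    """durations: list of (start_time_value, end_time_value) pairs."""
    phrases, current = [], []
    for pair in durations:
        if current and pair[0] - current[-1][1] >= gap:
            phrases.append(current)      # close the phrase at the rest
            current = []
        current.append(pair)
    if current:
        phrases.append(current)
    return phrases

seg = [(0, 4), (4, 8), (12, 16), (16, 20)]
phrases = split_phrases(seg, gap=2)
```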
In the embodiment of the invention, the duration structure of a note is as follows:
Start time value: the time value of the note's onset relative to the reference start time.
End time value: the time value of the note's offset relative to the reference start time.
Since the notes in a track form a continuous sequence, the duration structure of the track is a sequence of note time values, as follows:
[ (start time value 1, end time value 1), (start time value 2, end time value 2) … ]
Tag information may also be added to the track duration structure, and may include:
Phrase similarity structure: the label this patent uses to describe the analysis flow.
Track instrument: the instrument used in the track, as a MIDI instrument number.
Average pitch: the average of all note pitches.
Average velocity: the average of all note velocities.
Number of sections: the number of musical sections contained.
Emotion classification: the emotion class of the MIDI file, e.g. obtained with a decision tree algorithm.
Tempo: the BPM of the MIDI file.
Maximum pitch: the maximum pitch occurring among the notes.
Minimum pitch: the minimum pitch occurring among the notes.
Whether it is the main melody: used to identify the main melody.
Whether it is an instrument solo: used to identify an instrumental solo.
Further, the resulting time value structure and its label information can easily be organized into a storable structure and saved in a SQL database.
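A minimal sketch of such storage using SQLite; the table name, column layout and tag keys are illustrative assumptions (the patent only says the structure is stored in a SQL database):

```python
# Sketch: store a track's duration structure and tag information in SQLite.
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE track_rhythm (
    id INTEGER PRIMARY KEY,
    duration_structure TEXT,    -- JSON list of (start, end) time value pairs
    tags TEXT                   -- JSON object of label information
)""")

structure = [(0, 4), (4, 8), (8, 16)]
tags = {"instrument": 0, "avg_pitch": 62.5, "bpm": 120, "is_main_melody": True}
conn.execute("INSERT INTO track_rhythm (duration_structure, tags) VALUES (?, ?)",
             (json.dumps(structure), json.dumps(tags)))

row = conn.execute("SELECT tags FROM track_rhythm").fetchone()
```

Serializing the variable-length structure and the open-ended tag set as JSON keeps the schema stable while new labels are added.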
And step S108, performing similarity analysis on all the phrases to obtain the time value similarity of each phrase.
After the phrases in each music piece are identified, the terminal device 100 extracts the time value structure of each phrase and performs similarity analysis on all phrases according to those structures: the closer the time value structures of two phrases are, the higher their similarity. By analyzing the phrases' time value structures, the similarity of every phrase in the melody rhythm can be obtained.
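The patent does not fix a similarity formula, so here is one simple candidate consistent with "closer time value structures mean higher similarity": the fraction of position-wise matching time values, normalized by the longer phrase (the measure itself is an assumption):

```python
# Sketch of one possible time value similarity measure between two phrases,
# each given as a list of note time values.

def duration_similarity(a, b):
    """Return a similarity score in [0, 1]; 1.0 means identical structures."""
    if not a or not b:
        return 0.0
    matched = sum(1 for x, y in zip(a, b) if x == y)
    return matched / max(len(a), len(b))

s = duration_similarity([4, 4, 8, 16], [4, 4, 8, 8])
```

Normalizing by the longer phrase penalizes length mismatches as well as value mismatches, so two phrases score 1.0 only when their structures coincide exactly.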
Step S109 marks a corresponding label for each phrase.
To further support big-data analysis of the music melody rhythm, after analyzing the time value similarity of each phrase, the terminal device 100 marks each phrase with a corresponding label and stores it, where phrases whose similarity exceeds a preset threshold receive the same or similar labels. Phrases of the same type with high similarity thus map to the same or similar labels, so that information hidden in the rhythm structure can be better mined in future computer music composition and analysis, providing large-scale data support and better guiding research and development in computer music composition.
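The threshold-based labeling step can be sketched as a greedy grouping: each phrase either joins the first existing label group it is similar enough to, or starts a new one (the grouping strategy is an assumption; the similarity function is supplied by the previous step):

```python
# Sketch: assign the same label to phrases whose similarity to a group's
# representative phrase meets or exceeds the preset threshold.

def label_phrases(phrases, similarity, threshold=0.7):
    labels, groups = [], []          # groups holds one representative per label
    for phrase in phrases:
        for idx, rep in enumerate(groups):
            if similarity(rep, phrase) >= threshold:
                labels.append(idx)   # reuse the existing label
                break
        else:
            groups.append(phrase)    # open a new label group
            labels.append(len(groups) - 1)
    return labels

# trivial similarity for demonstration: identical phrases score 1.0
sim = lambda a, b: 1.0 if a == b else 0.0
labels = label_phrases([[4, 4], [4, 4], [8, 8]], sim)
```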
Referring to fig. 3, a functional block diagram of a music melody rhythm analyzing and extracting device 110 according to a preferred embodiment of the invention is shown, where the music melody rhythm analyzing and extracting device 110 includes a preprocessing module 111, an obtaining module 112, a recognition module 113, a processing module 114, a splicing module 115, a similarity analyzing module 116, a labeling module 117, and an extracting module 118.
The preprocessing module 111 is configured to preprocess the MIDI music file to obtain a target audio track.
It is understood that the preprocessing module 111 can be used to execute the above step S101.
The obtaining module 112 is configured to traverse all the notes in the target track to obtain a start duration and an end duration of each note in the target track.
It is understood that the obtaining module 112 may be configured to perform the step S102.
The identification module 113 is configured to identify a root of the chord tone in the target track.
It is understood that the identification module 113 may be configured to perform the step S103.
The processing module 114 is configured to process the multiple tones in the target audio track into a plurality of single tones.
It is understood that the processing module 114 can be used for executing the step S104.
The splicing module 115 is configured to splice the roots, the monophones and the remaining notes according to the start time value and end time value of the notes, so as to obtain the complete melody rhythm.
It is understood that the splicing module 115 can be used to perform the step S105.
The extraction module 118 is configured to extract a duration structure of each musical piece and a duration structure of each phrase.
The recognition module 113 is further configured to identify each music piece in the melody rhythm and the phrases in each piece.
It is understood that the identification module 113 may also be used to execute the above steps S106 and S107.
The similarity analysis module 116 is configured to perform similarity analysis on all phrases to obtain a time value similarity of each phrase.
It is understood that the similarity analysis module 116 may be configured to perform the step S108.
The marking module 117 is configured to mark a corresponding label for each phrase.
It is understood that the marking module 117 may be configured to perform the step S109.
In summary, the music melody rhythm analysis and extraction method and device provided by the present invention traverse all notes in the preprocessed target track to obtain each note's start and end time values, identify the roots of the chords in the target track, split the polyphones into multiple monophones, and then splice the roots, the monophones and the remaining notes according to the notes' start and end time values to obtain the complete rhythm. The music rhythm can thus be extracted and analyzed, the information hidden in the rhythm structure better mined, and research and development in computer music composition better guided. Meanwhile, each music piece in the rhythm and the phrases in each piece can be identified; after similarity analysis yields the time value similarity of each phrase, phrases whose similarity exceeds the preset threshold are marked with the same or similar labels. Phrases of the same type with high similarity thus correspond to the same or similar labels, so that information hidden in rhythm structures can be better mined in future computer music composition and analysis, providing large-scale data support and better guidance for research and development.
In the embodiments provided in the present application, it should be understood that the disclosed apparatus and method can be implemented in other ways. The apparatus embodiments described above are merely illustrative, and for example, the flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of apparatus, methods and computer program products according to various embodiments of the present invention. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
In addition, the functional modules in the embodiments of the present invention may be integrated together to form an independent part, or each module may exist separately, or two or more modules may be integrated to form an independent part.
The functions, if implemented in the form of software functional modules and sold or used as a stand-alone product, may be stored in a computer readable storage medium. Based on such understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present invention. The aforementioned storage medium includes: a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, an optical disk, or any other medium capable of storing program code. It is noted that, herein, relational terms such as first and second, and the like may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a … …" does not exclude the presence of other identical elements in a process, method, article, or apparatus that comprises the element.
The above description is only a preferred embodiment of the present invention and is not intended to limit the present invention, and various modifications and changes may be made by those skilled in the art. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention should be included in the protection scope of the present invention. It should be noted that: like reference numbers and letters refer to like items in the following figures, and thus, once an item is defined in one figure, it need not be further defined and explained in subsequent figures.
The above description is only for the specific embodiments of the present invention, but the scope of the present invention is not limited thereto, and any person skilled in the art can easily conceive of the changes or substitutions within the technical scope of the present invention, and all the changes or substitutions should be covered within the scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the appended claims.

Claims (10)

1. A music melody rhythm analyzing and extracting method is characterized by comprising the following steps:
preprocessing a MIDI music file to obtain a target audio track;
traversing all notes in the target audio track to obtain a starting time value and an ending time value of each note in the target audio track;
identifying a root of a chord tone in the target audio track;
processing polyphones in the target audio track into a plurality of monophones;
and splicing the root, the multiple monophones and other notes according to the starting time value and the ending time value of the notes to obtain a complete melody rhythm, wherein the other notes are the monophones in the target audio track other than the chord tones and the polyphones.
2. The method for analyzing and extracting rhythm of music melody according to claim 1, wherein said method further comprises:
identifying each music section in the melody rhythm;
identifying the phrases in each music section;
carrying out similarity analysis on all the phrases to obtain the time value similarity of each phrase;
and marking corresponding labels for each phrase, wherein the labels corresponding to phrases with the similarity exceeding a preset threshold are the same or similar.
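As an illustrative sketch of the similarity analysis and labeling of claim 2 (not part of the claimed invention itself), phrases can be compared by their time value structures and grouped under shared labels. The particular similarity measure, the zero-padding of shorter phrases, the 0.8 threshold, and the letter labels are all assumptions made for the example:

```python
def duration_similarity(a, b):
    """Compare two phrase time value structures (lists of note durations).
    Returns a score in [0, 1]; 1.0 means identical rhythmic structure."""
    n = max(len(a), len(b))
    if n == 0:
        return 1.0
    # Pad the shorter sequence with zeros, then measure per-note agreement
    # as 1 minus the normalized total duration difference.
    a = list(a) + [0.0] * (n - len(a))
    b = list(b) + [0.0] * (n - len(b))
    diff = sum(abs(x - y) for x, y in zip(a, b))
    total = sum(a) + sum(b)
    return 1.0 - diff / total if total else 1.0

def label_phrases(phrases, threshold=0.8):
    """Mark each phrase with a label; phrases whose similarity to an
    earlier phrase exceeds the threshold reuse that phrase's label."""
    labels = []
    for i, phrase in enumerate(phrases):
        for j in range(i):
            if duration_similarity(phrase, phrases[j]) > threshold:
                labels.append(labels[j])
                break
        else:
            # No sufficiently similar earlier phrase: assign a new label.
            labels.append(chr(ord("A") + len(set(labels))))
    return labels
```

For example, two phrases with the duration pattern 1-1-2 would receive the same label, while a 2-2-4 phrase (the same contour at double note values) would fall below this particular threshold and receive a new one.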
3. The method for analyzing and extracting rhythm of music melody according to claim 2, wherein said method further comprises:
extracting a time value structure of each music section;
the identifying of the phrases in each music section comprises:
identifying the phrases in each music section according to the time value structure of that music section.
4. The method for analyzing and extracting rhythm of music melody according to claim 3, wherein said method further comprises:
extracting a time value structure of each phrase;
the performing similarity analysis on all the phrases comprises:
performing similarity analysis on all the phrases according to the time value structure of each phrase.
5. The method as claimed in claim 1, wherein the pre-processing the MIDI music file to obtain the target track comprises:
and extracting the audio track of the MIDI music file according to predefined extraction parameters to obtain the target audio track, wherein the extraction parameters include a start time and an end time, among others.
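As an illustrative sketch of the pre-processing of claim 5 (not the claimed implementation), extraction by predefined start and end time parameters amounts to keeping the notes of a track that fall inside the window. The tuple representation and the inclusive window boundaries are assumptions made for the example:

```python
def extract_target_track(notes, start_time, end_time):
    """Pre-processing sketch: keep the notes of a MIDI track that fall
    inside the predefined [start_time, end_time] extraction window.

    notes: list of (pitch, start, end) tuples for one track.
    """
    return [(pitch, start, end) for pitch, start, end in notes
            if start >= start_time and end <= end_time]
```

Notes that begin before the window or end after it are dropped entirely here; a fuller implementation might instead clip such notes to the window boundaries.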
6. A device for analyzing and extracting rhythm of music melody, comprising:
the pre-processing module is used for pre-processing the MIDI music file to obtain a target audio track;
the acquisition module is used for traversing all the notes in the target audio track to obtain a starting time value and an ending time value of each note in the target audio track;
the identification module is used for identifying the root of the chord tones in the target audio track;
a processing module for processing polyphones in the target audio track into a plurality of monophones;
and the splicing module is used for splicing the root, the multiple monophones and other notes according to the starting time value and the ending time value of the notes to obtain a complete melody rhythm, wherein the other notes are the monophones in the target audio track other than the chord tones and the polyphones.
7. The apparatus for analyzing and extracting music melody rhythm of claim 6, wherein the apparatus further comprises a similarity analysis module and a marking module, and the identification module is further configured to identify each music section in the melody rhythm; and
identify the phrases in each music section;
the similarity analysis module is used for carrying out similarity analysis on all the phrases to obtain the time value similarity of each phrase;
the marking module is used for marking corresponding labels for each phrase, wherein the labels corresponding to the phrases with the similarity exceeding a preset threshold are the same or similar.
8. The apparatus for analyzing and extracting music melody rhythm according to claim 7, wherein the apparatus further comprises an extracting module for extracting a time value structure of each music section;
the identification module is used for identifying the phrases in each music section according to the time value structure of each music section.
9. The apparatus for analyzing and extracting music melody rhythm according to claim 8, wherein the extracting module is further configured to extract a time value structure of each phrase;
and the similarity analysis module is used for performing similarity analysis on all the phrases according to the time value structure of each phrase.
10. The apparatus as claimed in claim 6, wherein the pre-processing module is configured to extract the tracks of the MIDI file according to predefined extraction parameters, such as start time and end time, to obtain the target track.
CN201811578654.7A 2018-12-24 2018-12-24 Music rhythm analysis and extraction method and device Active CN109493879B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811578654.7A CN109493879B (en) 2018-12-24 2018-12-24 Music rhythm analysis and extraction method and device


Publications (2)

Publication Number Publication Date
CN109493879A CN109493879A (en) 2019-03-19
CN109493879B true CN109493879B (en) 2021-12-17

Family

ID=65711519

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811578654.7A Active CN109493879B (en) 2018-12-24 2018-12-24 Music rhythm analysis and extraction method and device

Country Status (1)

Country Link
CN (1) CN109493879B (en)

Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110164470A (en) * 2019-06-12 2019-08-23 成都嗨翻屋科技有限公司 Voice separation method, device, user terminal and storage medium
CN110517655B (en) * 2019-08-28 2023-03-14 广州艾颂智能科技有限公司 Melody generation method and system
CN111696500B (en) * 2020-06-17 2023-06-23 不亦乐乎科技(杭州)有限责任公司 MIDI sequence chord identification method and device
CN112331170B (en) * 2020-10-28 2023-09-15 平安科技(深圳)有限公司 Method, device, equipment and storage medium for analyzing Buddha music melody similarity
CN112691371A (en) * 2020-12-28 2021-04-23 完美世界(重庆)互动科技有限公司 Game audio output method and device, storage medium and electronic device
CN113192471B (en) * 2021-04-16 2024-01-02 南京航空航天大学 Musical main melody track recognition method based on neural network

Citations (5)

Publication number Priority date Publication date Assignee Title
CN1737798A (en) * 2005-09-08 2006-02-22 上海交通大学 Music rhythm sectionalized automatic marking method based on eigen-note
CN105118490A (en) * 2015-07-20 2015-12-02 科大讯飞股份有限公司 Polyphonic musical instrument note positioning method and polyphonic musical instrument note positioning device
CN105205047A (en) * 2015-09-30 2015-12-30 北京金山安全软件有限公司 Playing method, converting method and device of musical instrument music score file and electronic equipment
CN106205571A (en) * 2016-06-24 2016-12-07 腾讯科技(深圳)有限公司 A kind for the treatment of method and apparatus of singing voice
CN109068439A (en) * 2018-07-30 2018-12-21 上海应用技术大学 A kind of light coloring control method and its control device based on MIDI theme

Family Cites Families (1)

Publication number Priority date Publication date Assignee Title
US8927846B2 (en) * 2013-03-15 2015-01-06 Exomens System and method for analysis and creation of music




Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
TR01 Transfer of patent right

Effective date of registration: 20230621

Address after: Room 1210, 12 / F, unit 1, building 1, No. 722, middle section of Yizhou Avenue, high tech Zone, Chengdu, Sichuan 610000

Patentee after: Chengdu potential Artificial Intelligence Technology Co.,Ltd.

Address before: 610000 Huayang Avenue Section 117 and 119, Huayang Street, Tianfu New District, Chengdu City, Sichuan Province

Patentee before: CHENGDU HIFIVE TECHNOLOGY Co.,Ltd.