WO2022202297A1 - Text providing method, program, and text providing device - Google Patents

Text providing method, program, and text providing device

Info

Publication number
WO2022202297A1
Authority
WO
WIPO (PCT)
Prior art keywords
chord
data
code
text
music
Prior art date
Application number
PCT/JP2022/010084
Other languages
English (en)
Japanese (ja)
Inventor
和久 秋元
Original Assignee
ヤマハ株式会社
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by ヤマハ株式会社
Priority to CN202280022223.0A priority Critical patent/CN116997958A/zh
Priority to JP2023508950A priority patent/JPWO2022202297A1/ja
Publication of WO2022202297A1 publication Critical patent/WO2022202297A1/fr
Priority to US18/471,376 priority patent/US20240013760A1/en

Classifications

    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10HELECTROPHONIC MUSICAL INSTRUMENTS; INSTRUMENTS IN WHICH THE TONES ARE GENERATED BY ELECTROMECHANICAL MEANS OR ELECTRONIC GENERATORS, OR IN WHICH THE TONES ARE SYNTHESISED FROM A DATA STORE
    • G10H1/00Details of electrophonic musical instruments
    • G10H1/36Accompaniment arrangements
    • G10H1/38Chord
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10GREPRESENTATION OF MUSIC; RECORDING MUSIC IN NOTATION FORM; ACCESSORIES FOR MUSIC OR MUSICAL INSTRUMENTS NOT OTHERWISE PROVIDED FOR, e.g. SUPPORTS
    • G10G1/00Means for the representation of music
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10HELECTROPHONIC MUSICAL INSTRUMENTS; INSTRUMENTS IN WHICH THE TONES ARE GENERATED BY ELECTROMECHANICAL MEANS OR ELECTRONIC GENERATORS, OR IN WHICH THE TONES ARE SYNTHESISED FROM A DATA STORE
    • G10H1/00Details of electrophonic musical instruments
    • G10H1/0008Associated control or indicating means
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10HELECTROPHONIC MUSICAL INSTRUMENTS; INSTRUMENTS IN WHICH THE TONES ARE GENERATED BY ELECTROMECHANICAL MEANS OR ELECTRONIC GENERATORS, OR IN WHICH THE TONES ARE SYNTHESISED FROM A DATA STORE
    • G10H2210/00Aspects or methods of musical processing having intrinsic musical character, i.e. involving musical theory or musical parameters or relying on musical knowledge, as applied in electrophonic musical tools or instruments
    • G10H2210/395Special musical scales, i.e. other than the 12-interval equally tempered scale; Special input devices therefor
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10HELECTROPHONIC MUSICAL INSTRUMENTS; INSTRUMENTS IN WHICH THE TONES ARE GENERATED BY ELECTROMECHANICAL MEANS OR ELECTRONIC GENERATORS, OR IN WHICH THE TONES ARE SYNTHESISED FROM A DATA STORE
    • G10H2210/00Aspects or methods of musical processing having intrinsic musical character, i.e. involving musical theory or musical parameters or relying on musical knowledge, as applied in electrophonic musical tools or instruments
    • G10H2210/571Chords; Chord sequences
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10HELECTROPHONIC MUSICAL INSTRUMENTS; INSTRUMENTS IN WHICH THE TONES ARE GENERATED BY ELECTROMECHANICAL MEANS OR ELECTRONIC GENERATORS, OR IN WHICH THE TONES ARE SYNTHESISED FROM A DATA STORE
    • G10H2210/00Aspects or methods of musical processing having intrinsic musical character, i.e. involving musical theory or musical parameters or relying on musical knowledge, as applied in electrophonic musical tools or instruments
    • G10H2210/571Chords; Chord sequences
    • G10H2210/576Chord progression
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10HELECTROPHONIC MUSICAL INSTRUMENTS; INSTRUMENTS IN WHICH THE TONES ARE GENERATED BY ELECTROMECHANICAL MEANS OR ELECTRONIC GENERATORS, OR IN WHICH THE TONES ARE SYNTHESISED FROM A DATA STORE
    • G10H2240/00Data organisation or data communication aspects, specifically adapted for electrophonic musical tools or instruments
    • G10H2240/075Musical metadata derived from musical analysis or for use in electrophonic musical instruments
    • G10H2240/081Genre classification, i.e. descriptive metadata for classification or selection of musical pieces according to style
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10HELECTROPHONIC MUSICAL INSTRUMENTS; INSTRUMENTS IN WHICH THE TONES ARE GENERATED BY ELECTROMECHANICAL MEANS OR ELECTRONIC GENERATORS, OR IN WHICH THE TONES ARE SYNTHESISED FROM A DATA STORE
    • G10H2250/00Aspects of algorithms or signal processing methods without intrinsic musical character, yet specifically adapted for or used in electrophonic musical processing
    • G10H2250/311Neural networks for electrophonic musical instruments or musical processing, e.g. for musical recognition or control, automatic composition or improvisation

Definitions

  • This disclosure relates to a text providing method.
  • Patent Document 1 discloses a technique of detecting cadences in a musical score indicating the chord progression of a piece of music, displaying arrow symbols at the cadences, and changing their color according to the type of cadence. From the arrow symbols and their colors, the user can recognize which of the chords included in the music correspond to a cadence and the type of each cadence.
  • One purpose of this disclosure is to provide commentary on chords from a plurality of chords arranged in chronological order.
  • One aspect of this disclosure provides a text providing method that includes providing chord input data, in which chords are arranged in chronological order, to a trained model that has learned the relationship between chord string data, in which chords are arranged in chronological order, and commentary text related to the chords included in the chord string data, and obtaining text corresponding to the chord input data from the trained model.
  • FIG. 1 is a diagram showing a text providing system in one embodiment.
  • FIG. 2 is a flowchart showing the text providing process in one embodiment.
  • FIGS. 3 and 4 are diagrams for explaining chroma vectors representing chords in one embodiment.
  • FIG. 5 is a diagram for explaining an example of commentary text obtained from chord input data.
  • FIG. 6 is a flowchart showing the model generation process in one embodiment.
  • FIGS. 7 to 9 are diagrams for explaining examples of a teacher data set.
  • FIG. 10 is a diagram for explaining chord progressions detected as two-five-one.
  • FIGS. 11 and 12 are diagrams for explaining examples of commentary text obtained from chord input data.
  • FIGS. 13 and 14 are diagrams for explaining modifications of chord progressions detected as two-five-one.
  • FIG. 15 is a diagram for explaining a music database in one embodiment.
  • FIG. 16 is a diagram for explaining a method of calculating chord progression importance.
  • FIG. 17 is a flowchart showing the process for generating chord input data in one embodiment.
  • As shown in FIG. 1, the text providing system 1000 includes a text providing server 1 (text providing device) and a model generation server 3 connected to a network NW such as the Internet.
  • The communication terminal 9 is a smartphone, a tablet computer, a laptop computer, a desktop computer, or the like, and connects to the network NW to perform data communication with other devices.
  • The text providing server 1 receives data related to a piece of music from the communication terminal 9 via the network NW, and transmits commentary text corresponding to the chord progression included in the music to the communication terminal 9.
  • The communication terminal 9 can display the commentary text on its display.
  • The text providing server 1 generates commentary text using a trained model obtained by machine learning.
  • When the trained model 155 receives chord input data in which the chords constituting a piece of music are arranged in chronological order, it outputs commentary text regarding the chord progressions through arithmetic processing using a neural network.
  • The model generation server 3 executes machine learning processing using a teacher data set to generate the trained model used in the text providing server 1.
  • The text providing server 1 and the model generation server 3 will be described below.
  • The text providing server 1 includes a control unit 11, a communication unit 13, and a storage unit 15.
  • The control unit 11 includes a CPU (processor), RAM, and ROM.
  • The control unit 11 executes a program stored in the storage unit 15 by the CPU, thereby performing processing according to the instructions described in the program.
  • This program includes a program 151 for performing the text providing process, which will be described later.
  • The communication unit 13 includes a communication module, connects to the network NW, and transmits and receives various data to and from other devices.
  • The storage unit 15 includes a storage device such as a non-volatile memory, and stores the program 151 and the trained model 155, as well as various other data used in the text providing server 1.
  • The storage unit 15 may store a music database 159.
  • The music database 159 is described in another embodiment.
  • The program 151 may be provided to the text providing server 1 stored in a computer-readable recording medium such as a magnetic recording medium, an optical recording medium, a magneto-optical recording medium, or a semiconductor memory, as long as it can be executed by a computer. In this case, the text providing server 1 may be provided with a device for reading the recording medium.
  • The program 151 may also be provided by downloading via the communication unit 13.
  • The trained model 155 is generated by machine learning in the model generation server 3 and provided to the text providing server 1.
  • When the trained model 155 is provided with chord input data, it outputs commentary text about the chords through arithmetic processing using a neural network.
  • The trained model 155 is a model using an RNN (Recurrent Neural Network).
  • The trained model 155 uses Seq2Seq (Sequence To Sequence) and includes an encoder and a decoder, which will be described later.
  • The chord input data and the commentary text are examples of data described in chronological order, as detailed later. Therefore, the trained model 155 is preferably a model that is advantageous in handling time-series data.
  • The trained model 155 may be a model using an LSTM (Long Short-Term Memory) or a GRU (Gated Recurrent Unit).
  • The trained model 155 may be a model using a CNN (Convolutional Neural Network), Attention (Self-Attention, Source-Target Attention), or the like.
  • The trained model 155 may be a model combining multiple models.
  • The trained model 155 may be stored in another device connected via the network NW. In this case, the text providing server 1 may connect to the trained model 155 via the network NW.
  • The model generation server 3 includes a control unit 31, a communication unit 33, and a storage unit 35.
  • The control unit 31 includes a CPU (processor), RAM, and ROM.
  • The control unit 31 executes a program stored in the storage unit 35 by the CPU, thereby performing processing according to the instructions described in the program.
  • This program includes a program 351 for performing the model generation process, which will be described later.
  • The model generation process is a process for generating the trained model 155 using a teacher data set.
  • The communication unit 33 includes a communication module, connects to the network NW, and transmits and receives various data to and from other devices.
  • The storage unit 35 includes a storage device such as a non-volatile memory, and stores the program 351 and the teacher data set 355, as well as various other data used in the model generation server 3.
  • The program 351 may be provided to the model generation server 3 stored in a computer-readable recording medium such as a magnetic recording medium, an optical recording medium, a magneto-optical recording medium, or a semiconductor memory, as long as it can be executed by a computer.
  • In this case, the model generation server 3 may be provided with a device for reading the recording medium.
  • The program 351 may also be provided by downloading via the communication unit 33.
  • A plurality of teacher data sets 355 may be stored in the storage unit 35.
  • The teacher data set 355 is data in which chord string data 357 and commentary text data 359 are associated with each other, and is used when the trained model 155 is generated. Details of the teacher data set 355 will be described later.
  • FIG. 2 is a flowchart showing text provision processing in one embodiment.
  • The control unit 11 waits until music chord data is received from the communication terminal 9 (step S101; No).
  • The music chord data is data in which a plurality of chords constituting a piece of music are described arranged in chronological order.
  • For example, the music chord data is described as "CM7-Dm7-Em7-...".
  • Each chord may be arranged in units of a predetermined unit period (for example, one bar or one beat), or may be arranged in order without considering unit periods.
  • In the above example, each chord is arranged in units of one bar.
  • If the same chord continues over two bars, the music chord data is described as, for example, "CM7-CM7-Dm7-...".
  • If unit periods are not considered, the music chord data is described as "CM7-Dm7-..." as in the first example.
  • The communication terminal 9 transmits the music chord data to the text providing server 1.
  • When the text providing server 1 receives the music chord data, the control unit 11 generates chord input data from the music chord data (step S103).
  • The chord input data is written by converting each chord included in the music chord data into a predetermined format.
  • In this example, the chord input data is data in which each chord is described by chroma vectors.
  • FIGS. 3 and 4 are diagrams for explaining chroma vectors representing chords in one embodiment.
  • A chroma vector is described by "1" (note present) or "0" (note absent) for each note name (C, C#, D, ...).
  • Each chord is converted into data (hereinafter referred to as conversion data) in which a chroma vector corresponding to the component tones, a chroma vector corresponding to the bass tone, and a chroma vector corresponding to the tension tones are combined.
  • In this example, the conversion data is data describing the three chroma vectors as matrix data (3 × 12).
  • The conversion data may instead be described as vector data in which the three chroma vectors are connected in series.
  • FIG. 3 is an example of the chord "CM7" expressed as conversion data.
  • FIG. 4 is an example of the chord "C/B" expressed as conversion data.
  • "CM7" and "C/B" have the same component tones but different bass tones and tension tones. Therefore, according to the conversion data, "CM7" and "C/B" can be distinguished. That is, the conversion data can unambiguously represent the function of a chord.
  • The conversion data should contain at least the chroma vector of the component tones, and may omit the bass tone, the tension tones, or both.
  • The structure of the conversion data should be set appropriately according to the required result.
  • The chord input data is data in which the conversion data are arranged in chronological order.
  • For example, for the music chord data "CM7-Dm7-...", the chord input data is described as data in which the conversion data corresponding to "CM7", the conversion data corresponding to "Dm7", and so on are arranged in that order.
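  • As a concrete illustration of the conversion data described above, the following is a minimal Python sketch, not part of the disclosed embodiment; the tension tone assumed for "CM7" is an illustrative choice.

```python
# Minimal sketch of the 3x12 "conversion data" for one chord:
# one chroma vector each for component tones, bass tone, and tension tones.

NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]
INDEX = {name: i for i, name in enumerate(NOTE_NAMES)}

def chroma(notes):
    """Return a 12-dimensional 0/1 vector marking the given note names."""
    vec = [0] * 12
    for n in notes:
        vec[INDEX[n]] = 1
    return vec

def conversion_data(component, bass, tension):
    """Stack three chroma vectors into the 3x12 matrix described above."""
    return [chroma(component), chroma(bass), chroma(tension)]

# "CM7": component tones C-E-G-B, bass C; the 9th (D) as a tension tone
# is an assumed spelling for illustration.
cm7 = conversion_data(["C", "E", "G", "B"], ["C"], ["D"])

# "C/B": the same component tones but bass B, so the matrices differ.
c_on_b = conversion_data(["C", "E", "G", "B"], ["B"], [])

assert cm7 != c_on_b  # distinguishable despite identical component tones
```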
  • The control unit 11 provides the chord input data to the trained model 155 (step S105).
  • The control unit 11 executes arithmetic processing by the trained model 155 (step S107) and acquires sentence output data from the trained model 155 (step S109).
  • The control unit 11 transmits the acquired sentence output data to the communication terminal 9 (step S111).
  • The sentence output data corresponds to the commentary text described above and includes a character group explaining the chords defined by the chord input data.
  • The commentary text includes at least one of a first character group explaining chord progressions, a second character group explaining the functions of the chords, and a third character group explaining connection techniques between chords.
  • In this example, the commentary text includes the first character group, the second character group, and the third character group.
  • FIG. 5 is a diagram for explaining an example of commentary text obtained from chord input data.
  • The trained model 155 includes an encoder (also called an input layer) that generates intermediate state data by operating on the supplied chord input data with the RNN, and a decoder (also called an output layer) that outputs sentence output data by operating on the intermediate state data with the RNN. More specifically, the encoder is provided with the plurality of conversion data included in the chord input data in chronological order, and the decoder outputs a plurality of characters (a character group) arranged in chronological order as the commentary text. A character here may mean one word (morpheme) classified by morphological analysis. Intermediate states are sometimes referred to as hidden states or hidden layers.
  • The chord input data shown in FIG. 5 is depicted as the music chord data "CM7-Dm7-Em7-...", but each chord is actually described as conversion data as explained above.
  • An end marker (EOS: End Of Sequence) is attached at the position where the chords end.
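  • A minimal sketch of such an encoder/decoder (Seq2Seq) arrangement is shown below in PyTorch; the hidden size, vocabulary size, and choice of GRU cells are illustrative assumptions, not values from the disclosure.

```python
# Sketch of the Seq2Seq arrangement described above: an RNN encoder over
# conversion-data vectors and an RNN decoder over a word (morpheme) vocabulary.
import torch
import torch.nn as nn

CHORD_DIM = 36   # 3 chroma vectors x 12 pitch classes, flattened
HIDDEN = 128     # assumed hidden size
VOCAB = 5000     # assumed number of morphemes in the commentary vocabulary

class ChordEncoder(nn.Module):
    def __init__(self):
        super().__init__()
        self.rnn = nn.GRU(CHORD_DIM, HIDDEN, batch_first=True)

    def forward(self, chords):            # chords: (batch, time, 36)
        _, hidden = self.rnn(chords)
        return hidden                     # the intermediate state data

class CommentaryDecoder(nn.Module):
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(VOCAB, HIDDEN)
        self.rnn = nn.GRU(HIDDEN, HIDDEN, batch_first=True)
        self.out = nn.Linear(HIDDEN, VOCAB)

    def forward(self, tokens, hidden):    # tokens: (batch, time)
        x = self.embed(tokens)
        y, hidden = self.rnn(x, hidden)
        return self.out(y), hidden        # per-step word logits

encoder, decoder = ChordEncoder(), CommentaryDecoder()
chords = torch.zeros(1, 8, CHORD_DIM)     # e.g. 8 chords up to the EOS marker
state = encoder(chords)
logits, _ = decoder(torch.zeros(1, 1, dtype=torch.long), state)
```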
  • The sentence output data, that is, the commentary text, is composed of combinations of character groups such as the following.
  • "The diatonic chords ascend sequentially to form a two-five between Fm7 and Bb7. Bb7 functions as a substitute chord for the subdominant minor chord Fm7 and, at the same time, as the back chord of E7, the dominant 7th chord of the following Am7. After that, the substitute chord Am7 of the tonic chord CM7 and the substitute chord AbM7 of the subdominant minor chord Fm7 are repeated, producing a progression in which the root rises and falls by a semitone while the 3rd and 7th notes are held in common."
  • The first character group (explaining the chord progression) in the commentary text shown in FIG. 5 corresponds to "form a two-five between Fm7 and Bb7".
  • The second character group (explaining the functions of the chords) corresponds to "Bb7 functions as a substitute chord for the subdominant minor chord Fm7", "as the back chord of E7, the dominant 7th chord of the following Am7", "the substitute chord Am7 of the tonic chord CM7", and "the substitute chord AbM7 of the subdominant minor chord Fm7".
  • The descriptions of the two functions of Bb7 are put together and expressed as "Bb7 functions as ..., and at the same time, as ...".
  • The third character group (explaining the connection technique between chords) corresponds to "the root rises and falls by a semitone while the 3rd and 7th notes are held in common".
  • The ascending diatonic chords are expressed as "The diatonic chords ascend sequentially to ..." so as to connect to the next clause.
  • The sentence output data obtained in this way is transmitted to the communication terminal 9 that transmitted the music chord data.
  • The user of the communication terminal 9 is thus provided with commentary text corresponding to the music chord data.
  • The above is the description of the text providing process.
  • Next, the model generation process (model generation method) executed by the control unit 31 in the model generation server 3 will be described.
  • The model generation process is started in response to a request from a terminal or the like used by the administrator of the model generation server 3.
  • The model generation process may also be started in response to a user's request, that is, a request from the communication terminal 9.
  • FIG. 6 is a flowchart showing model generation processing in one embodiment.
  • The control unit 31 acquires a teacher data set 355 from the storage unit 35 (step S301).
  • The teacher data set 355 includes chord string data 357 and commentary text data 359 that are associated with each other.
  • The chord string data 357 is described in the same format as the chord input data. That is, the chord string data 357 is described as data in which chords represented by conversion data are arranged in time series.
  • The commentary text data 359 is data containing commentary text as shown in FIG. 5.
  • This commentary text explains the chords defined by the chord string data 357.
  • As described above, the commentary text includes at least one of a first character group explaining chord progressions, a second character group explaining the functions of the chords, and a third character group explaining connection techniques between chords.
  • The commentary text data 359 is provided with identifiers that specify the words obtained by dividing the commentary text by morphological analysis. Each word is described as a one-hot vector.
  • The commentary text may instead be described using word embeddings such as "word2vec" or "GloVe".
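  • For illustration, the following sketch encodes the words of a commentary text as one-hot vectors; real morphological analysis of Japanese text would use a dedicated tokenizer, so the whitespace split here is a stand-in assumption.

```python
# Sketch: assign each word an identifier and describe it as a one-hot vector.

def one_hot(index, size):
    vec = [0] * size
    vec[index] = 1
    return vec

commentary = "diatonic chords ascend sequentially to form a two-five"
words = commentary.split()                 # stand-in for morphological analysis
vocab = {w: i for i, w in enumerate(dict.fromkeys(words))}  # word -> identifier

encoded = [one_hot(vocab[w], len(vocab)) for w in words]
```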
  • The chord string data 357 included in the teacher data set 355 includes a sequence of chords corresponding to one piece of music in this example, and at least one end marker EOS is attached.
  • The teacher data set 355 can take various forms. Several possible examples of the teacher data set 355 will be described with reference to FIGS. 7 to 9.
  • FIGS. 7 to 9 are diagrams for explaining examples of a teacher data set.
  • In FIGS. 7 to 9, the chords of the chord string data 357 are indicated by a plurality of sections (music sections CL(A) to CL(E)).
  • The music sections CL(A) to CL(E) each correspond to a segmented range of a phrase constituting the piece of music, for example a range of 8-bar units, and each includes a plurality of chords arranged in chronological order.
  • Each music section need not be the same length as the other music sections.
  • The chord string data 357 shown in FIG. 7 has a format in which the chords corresponding to the music sections CL(A) to CL(E) are described in series, and includes an end marker EOS only at the end of the data.
  • The chord string data 357 in FIG. 8 has a format in which the chords corresponding to the music sections CL(A) to CL(E) are divided for each music section and described.
  • An end marker EOS is written at each division position.
  • Hereinafter, a section divided by end markers EOS is called a divided area.
  • A plurality of music sections may be included in one divided area.
  • In FIG. 8, no music section is included in a plurality of divided areas.
  • The chord string data 357 in FIG. 9 divides the chords corresponding to the music sections CL(A) to CL(E) for each music section, and then additionally describes, in each divided area, the chords of the music sections before and after that music section. That is, in the chord string data 357 in FIG. 9, a plurality of continuous music sections are arranged in one divided area, and at least one music section is included in a plurality of divided areas. In this example, three consecutive music sections are arranged in each divided area, and two consecutive music sections are arranged only in the first and last divided areas. The number of continuous music sections is not limited to this example.
  • The commentary text data 359 includes commentary texts ED(A) to ED(E) respectively corresponding to the music sections CL(A) to CL(E).
  • For example, the commentary text ED(A) includes a character group explaining the chords corresponding to the music section CL(A).
  • The commentary text data 359 shown in FIGS. 8 and 9 is divided by end markers EOS, similarly to the chord string data 357.
  • The control unit 31 inputs the chord string data 357 to a model for machine learning (here called a training model) (step S303).
  • The training model is a model that performs arithmetic processing using the same neural network (an RNN in this example) as the trained model 155.
  • The training model may be the trained model 155 stored in the text providing server 1.
  • The control unit 31 uses the values output from the training model in response to the input of the chord string data, together with the commentary text data 359, to execute machine learning by error backpropagation (step S305). Specifically, the machine learning updates the weighting coefficients in the neural network of the training model. If there is another teacher data set 355 to be learned (step S307; Yes), machine learning is performed using the remaining teacher data sets 355 (steps S301, S303, S305). If there is no other teacher data set 355 to be learned (step S307; No), the control unit 31 terminates the machine learning.
  • The control unit 31 outputs the training model that has undergone machine learning as the trained model (step S309), and ends the model generation process.
  • The generated trained model is provided to the text providing server 1 and used as the trained model 155.
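  • The loop of steps S301 to S309 could be sketched as follows, reusing the encoder and decoder from the earlier sketch; the cross-entropy loss and Adam optimizer are illustrative assumptions, not taken from the disclosure.

```python
# Training-loop sketch: one call per teacher data set 355.
# Assumes `encoder`, `decoder`, and `VOCAB` from the Seq2Seq sketch above.
import torch
import torch.nn as nn

criterion = nn.CrossEntropyLoss()
params = list(encoder.parameters()) + list(decoder.parameters())
optimizer = torch.optim.Adam(params)

def train_step(chord_seq, target_tokens):
    """chord_seq: (1, T, 36) chord string data; target_tokens: (1, U) words."""
    optimizer.zero_grad()
    state = encoder(chord_seq)                     # S303: input chord string data
    logits, _ = decoder(target_tokens[:, :-1], state)
    loss = criterion(logits.reshape(-1, VOCAB),    # S305: backpropagation updates
                     target_tokens[:, 1:].reshape(-1))  # the weighting coefficients
    loss.backward()
    optimizer.step()
    return loss.item()
```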
  • The trained model 155 is thus a model that has learned the correlation between the chords defined in the chord string data 357 and the commentary text for those chords.
  • During machine learning, the control unit 31 resets the intermediate state at each end marker EOS. That is, in machine learning, the chords in a specific divided area and the chords in the other divided areas separated from it are not treated as continuous time-series data.
  • In the teacher data set of FIG. 8, therefore, a chord in a specific music section and a chord in a different music section are treated as mutually independent time-series data.
  • The chords included in one music section are treated as a single piece of time-series data.
  • Music sections separated from each other are not included in one divided area and are treated as independent time-series data.
  • In the teacher data set of FIG. 9, the music section CL(B) and the music section CL(C) may be included in one divided area or may be included in different divided areas. Therefore, depending on the divided area, the chord of the music section CL(B) and the chord of the music section CL(C) may be treated as a series of time-series data or as independent time-series data.
  • A first example is a teacher data set in which no divided areas are set, as shown in FIG. 7.
  • A second example is a teacher data set in which a plurality of divided areas are set, as shown in FIG. 8, and no music section is included in a plurality of divided areas.
  • A third example is a teacher data set in which a plurality of divided areas are set, as shown in FIG. 9, and at least one music section is included in a plurality of divided areas.
  • FIG. 10 is a diagram for explaining chord progressions detected as two-five-one.
  • FIG. 10 shows examples of a two-five-one chord progression within the scale of Cmaj or Amin (the basic form and derivative forms) and examples outside the scale (back chords, postponement of resolution).
  • "Back chord" means that a back chord (tritone substitution) is used in part of the chord progression. The derivative forms and back chords corresponding to the basic form are enclosed in dashed lines. In the derivative forms and the back chords, the parts that differ from the basic form are underlined.
  • In "postponement of resolution", the two-five-one form is varied by inserting the chord shown in parentheses.
  • The trained model 155 generated by the machine learning described above can output, as commentary text, that a two-five-one exists even if the chord progression is expressed in a form other than the basic form.
  • Furthermore, the trained model 155, generated by machine learning that includes context, can output commentary text that takes into account whether or not a given chord progression corresponds to a two-five-one.
  • FIGS. 11 and 12 are diagrams for explaining examples of commentary text obtained from chord input data. FIGS. 11 and 12 both contain the chord sequence "Em7-A7-GbM7-Ab7", but in FIG. 12 DbM7 is added at the end. That is, the chord positioned at the end of the time series by the end marker EOS is Ab7 in FIG. 11, whereas it is DbM7 in FIG. 12.
  • The trained model 155 presumes that the element "Em7-A7-Ab7" in the chord input data shown in FIG. 11 is related to a two-five-one, and outputs the following commentary text as sentence output data.
  • "Em7-A7-Ab7 is a derivative of Em7-A7-Dm7, a two-five-one within the diatonic chords of the Cmaj scale, with Dm7 changed to a back chord. GbM7 forms a two-five toward Ab7 in Dbmaj and is inserted to temporarily delay the resolution (cadence) to Ab7."
  • The trained model 155 presumes that the element "GbM7-Ab7-DbM7" in the chord input data shown in FIG. 12 is related to a two-five-one, and outputs the following commentary text, which also refers to the preceding "Em7-A7", as sentence output data: "There is a temporary modulation from Cmaj to Dbmaj. GbM7-Ab7-DbM7 is a two-five-one in which II is changed to the subdominant IV of the same function (Ebm7 → GbM7). To make the modulation smooth, a kind of two-five-one using the back chord, Em7-A7-Ab7, is adopted."
  • In this way, even if the chord sequences contained in the chord input data have similar parts, the trained model 155 can output sentence output data including appropriate commentary text that takes the context into account.
  • FIGS. 13 and 14 are diagrams for explaining modifications of chord progressions detected as two-five-one.
  • In FIG. 13, the connection technique of lowering the bass line is applied to the basic form of the two-five-one chord progression "Bm7(-5)-E7-Am7", giving, for example, "Bm7(-5)-Bm7(-5)/F-E7-E7/G#-Am7".
  • The chord input data may specify the sequence of all chords contained in the music chord data, or it may specify the sequence of some chords extracted from it.
  • Hereinafter, the section of the music corresponding to the chords included in the chord input data will be referred to as a specific section.
  • The specific section may be set by the user, or may be set by a predetermined method exemplified below.
  • The chord input data provided to the trained model 155 need not cover all of the music chord data; if the characteristic parts of the music can be used, commentary text characteristic of the music can be obtained. Therefore, it is preferable to set such a characteristic part of the music as the specific section.
  • A characteristic part of a piece of music can be set by various methods; one example will be described here.
  • The control unit 11 divides the piece of music into a plurality of predetermined determination sections (for example, the music sections described above), and sets the determination sections that satisfy a predetermined condition as specific sections.
  • In this example, a determination section whose chord progression importance exceeds a predetermined threshold is set as a specific section.
  • The chord progression importance is calculated based on the various data registered in the music database 159 and the chord progression in the determination section. An example of this calculation method will be described.
  • FIG. 15 is a diagram for explaining the music database in one embodiment.
  • The music database 159 is stored, for example, in the storage unit 15 of the text providing server 1.
  • Information about a plurality of songs is registered in the music database 159. For example, genre information, scale information, chord appearance rate data, and chord progression appearance rate data are registered in association with each other.
  • The genre information is information indicating the genre of a song, for example "rock", "pops", "jazz", and so on.
  • The scale information is information indicating scales (including keys in this example) such as "C major scale", "C minor scale", "C# major scale", and so on.
  • Each scale has a set of tones that compose it (hereinafter referred to as scale constituent tones).
  • The chord appearance rate data indicates the ratio of each type of chord to the total number of chords in all songs registered in the music database. For example, if the total number of chords is "10000" and the number of occurrences of the chord "Cm" is "100", the appearance rate of that chord is "0.01".
  • Any of the following criteria may be used for the identity of chords that are similar to each other. Chords whose chord names differ may be treated as different chords ("CM7" and "C/B" are different). Chords whose component tones are the same may be treated as the same chord ("CM7" and "C/B" are the same). Chords whose component tones and bass tone are the same may be treated as the same chord ("CM7" and "G/C" are the same). Chords whose component tones differ may still be treated as the same chord if they are the same except for the tension tones ("CM7" and "C" are the same).
  • The chord progression appearance rate data indicates the ratio of each type of chord progression to the total number of chord progressions in all songs registered in the music database.
  • The chord progressions referred to here are set in advance by a user or the like. For example, if the total number of chord progressions is "20000" and the number of occurrences of the chord progression "Dm-G7-CM7" is "400", the appearance rate of that chord progression is "0.02".
  • The criteria for determining chord identity may be the same as in the chord appearance rate determination described above. Any of the following example criteria for determining the identity of chord progressions may be used. Chord progressions that are similar to each other may be treated as the same chord progression. For example, the derivative forms of the basic form shown in FIG. 10 and the forms using back chords may be treated as the same chord progression.
  • Chord progressions in which at least two of the chords match may be treated as the same chord progression.
  • For example, the chord progressions "Dm-G7-CM7", "*-G7-CM7", "Dm-*-CM7" and "Dm-G7-*" may be treated as the same chord progression.
  • Here, "*" indicates an unspecified chord (any chord).
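  • The wildcard rule above can be illustrated with a short sketch; the three-chord length of the examples and the two-match threshold follow the text, while the function itself is an illustrative assumption.

```python
# Sketch: two progressions of equal length are treated as the same when at
# least `min_matches` positions carry exactly the same chord; "*" positions
# are unspecified and neither count as matches nor disqualify.

def same_progression(a, b, min_matches=2):
    if len(a) != len(b):
        return False
    exact = sum(1 for x, y in zip(a, b) if x == y and x != "*")
    return exact >= min_matches

assert same_progression(["Dm", "G7", "CM7"], ["*", "G7", "CM7"])
assert same_progression(["Dm", "G7", "CM7"], ["Dm", "*", "CM7"])
assert not same_progression(["Dm", "G7", "CM7"], ["Em", "A7", "DM7"])
```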
  • The chord appearance rate data and the chord progression appearance rate data include data for all songs.
  • The chord appearance rate data and the chord progression appearance rate data may further include data determined for each genre defined in the genre information.
  • For example, the chord appearance rate data and chord progression appearance rate data corresponding to the genre "rock" may include the chord appearance rate and the chord progression appearance rate obtained only from songs of the genre "rock".
  • In this case, the parameters of the appearance rate (the total number of chords and the total number of chord progressions) may use the data for all songs.
  • The appearance rate of a chord or chord progression in the genre "rock" differs from its appearance rate in the genre "jazz". Therefore, by having chord appearance rates and chord progression appearance rates for each genre, the characteristic part of a piece of music can be determined more accurately. Genre information need not necessarily be used; in that case, there need not be chord appearance rate data and chord progression appearance rate data for each genre.
  • FIG. 16 is a diagram for explaining the method of calculating the importance of chord progression.
  • The example shown in FIG. 16 shows each index value and importance when the chord progression in the determination section is "C-Cm-CM7-Cm7".
  • The index values include a chord progression rarity (CP) determined for the chord progression, and a scale factor (S) and a chord rarity (C) determined for each chord constituting the chord progression. Based on these index values, the chord importance (CS) for each chord and the chord progression importance (CPS) for the chord progression are calculated. Both the index values and the importances take values ranging from "0" to "1", and a higher value indicates a more characteristic element.
  • In this example, the key of the song is C, the scale is the major scale, and the genre is pops.
  • The scale factor (S) is set to "0" when all of the chord constituent tones are included in the scale constituent tones, and to "1" when any of the chord constituent tones is not included in the scale constituent tones. This is because a chord that includes tones not included in the scale constituent tones can be said to be a characteristic part of a piece of music.
  • The chord rarity (C) is obtained by a predetermined calculation formula.
  • The calculation formula is determined so that the higher the chord appearance rate, the lower the chord rarity (C).
  • In the C major scale, the chords C and CM7 have relatively high appearance rates, so their chord rarity (C) is set to a relatively small value.
  • The chord progression rarity (CP) is obtained by a predetermined calculation formula.
  • The calculation formula is determined so that the higher the chord progression appearance rate, the lower the chord progression rarity (CP).
  • In this example, the appearance rate of the chord progression "C-Cm-CM7-Cm7" is extremely low, so the chord progression rarity (CP) is set to the large value "1".
  • The chord importance (CS) is calculated using the scale factor (S), the chord rarity (C), and the chord progression rarity (CP).
  • The chord progression importance (CPS) indicates that the larger the value (the closer to "1"), the more unusual the chord progression is compared with other songs.
  • A determination section with a large chord progression importance is a characteristic part of the piece of music.
  • The above index values and importance calculation methods are examples; various calculation methods can be used as long as the importance of the chord progression as a whole (the characteristic part of the song) can be obtained.
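  • The disclosure states only the direction of these formulas (rarity falls as the appearance rate rises), so the sketch below fills in assumed concrete expressions for S, C, CP, CS, and CPS purely for illustration.

```python
# Importance-calculation sketch for "C-Cm-CM7-Cm7" in C major (cf. FIG. 16).
# The linear rarity, the averaged CS, and the mean CPS are all assumptions.

SCALE_TONES = {0, 2, 4, 5, 7, 9, 11}     # C major scale pitch classes

def scale_factor(chord_tones):
    """S: 0 if every chord tone is a scale constituent tone, else 1."""
    return 0 if set(chord_tones) <= SCALE_TONES else 1

def rarity(appearance_rate, max_rate=0.05):
    """C or CP: higher appearance rate -> lower rarity, clipped to [0, 1]."""
    return max(0.0, min(1.0, 1.0 - appearance_rate / max_rate))

def chord_importance(s, c, cp):
    """CS: an assumed combination of the three index values."""
    return (s + c + cp) / 3

def progression_importance(chord_scores):
    """CPS: an assumed aggregate (mean) of the per-chord importances."""
    return sum(chord_scores) / len(chord_scores)

cp = rarity(0.0001)                        # rare progression -> CP close to 1
chords = [({0, 4, 7}, 0.04),               # C    (assumed appearance rates)
          ({0, 3, 7}, 0.002),              # Cm   (contains Eb, outside scale)
          ({0, 4, 7, 11}, 0.03),           # CM7
          ({0, 3, 7, 10}, 0.003)]          # Cm7
cs = [chord_importance(scale_factor(t), rarity(r), cp) for t, r in chords]
print(progression_importance(cs))          # CPS for the determination section
```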
  • The following chord input data generation method replaces, for example, the processing in step S103 shown in FIG. 2.
  • FIG. 17 is a flowchart showing the process for generating chord input data in one embodiment.
  • The control unit 11 sets the key, scale, and genre (step S1031). As described above, the key, scale, and genre may be obtained by receiving them from the communication terminal 9 according to settings made by the user, or by analyzing the music chord data.
  • The control unit 11 divides the piece of music into a plurality of determination sections (step S1033) and calculates the chord progression importance (CPS) in each determination section (step S1035).
  • The control unit 11 sets at least one determination section as a specific section based on the chord progression importance (CPS) calculated for each determination section (step S1037).
  • For example, a determination section whose chord progression importance (CPS) exceeds a predetermined threshold is set as a specific section.
  • Alternatively, a predetermined number of determination sections may be set as specific sections in descending order of chord progression importance (CPS).
  • The control unit 11 generates chord input data corresponding to the specific sections (step S1039); a sketch of this flow follows below.
  • An end marker EOS may be arranged for each specific section so that one specific section is placed in one divided area, or, when a plurality of consecutive determination sections form a plurality of specific sections, those specific sections may be arranged so as to be included in one divided area.
  • By providing the chord input data generated in this way to the trained model 155, the trained model 155 can generate commentary text for the chord progressions representing the characteristic parts of the music and output sentence output data.
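  • A sketch of steps S1033 to S1039 under the threshold condition is given below; the section length, threshold value, and externally supplied scoring function are illustrative assumptions.

```python
# Sketch: split the chords of a piece into determination sections, score each
# with a CPS function (e.g. the one sketched above), and keep those above the
# threshold as specific sections.

def generate_specific_sections(song_chords, score_fn, section_len=8, threshold=0.5):
    """Return (section, CPS) pairs whose importance exceeds the threshold."""
    sections = [song_chords[i:i + section_len]
                for i in range(0, len(song_chords), section_len)]      # S1033
    scored = [(sec, score_fn(sec)) for sec in sections]                # S1035
    specific = [(sec, cps) for sec, cps in scored if cps > threshold]  # S1037
    # S1039: each specific section would then be converted to conversion data
    # and terminated with an EOS marker before being given to the model.
    return specific
```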
  • In the embodiment described above, the text providing server 1 uses the trained model 155 to generate commentary text from chord input data, but a model that does not use a neural network (for example, a rule-based model) may be used instead. With the trained model 155, the accuracy of the commentary text can be improved by using many teacher data sets 355 for machine learning.
  • When a rule-based model is used, rules for generating commentary text from chord input data, that is, correspondence relationships between information corresponding to the chord string data 357 and information corresponding to the commentary text data 359 described above, are set in advance.
  • These rules require a large amount of information. For example, as described above, chord sequences of various forms are expected to be determined as a two-five-one chord progression, so to improve the accuracy of the commentary text, commentary text must be set for each of the many possible forms. To reduce the amount of information, the commentary text may need to be simplified compared with the case where the trained model 155 is used. Although it may be less efficient than using the trained model 155, it is thus possible to generate commentary text from chord input data using a rule-based model.
  • The chord appearance rate data and the chord progression appearance rate data may be defined to be equivalent regardless of the key of the music.
  • For example, the chord appearance rate data may interpret the chord "CM7" when the key of the song is "C" and the chord "EM7" when the key of the song is "E" as the same chord.
  • Similarly, the chord progression appearance rate data may interpret the chord progression "Dm-G7-CM7" when the key of the song is "C" and the chord progression "F#m-B7-EM7" when the key is "E" as the same chord progression.
  • The chord appearance rate data and the chord progression appearance rate data may also be defined by chords expressed relative to the key of the music.
  • A relative expression may be, for example, the result of converting each chord as if the key were "C", or a description using degrees such as "I" and "II". For example, the chord "Em7" in the key "C" is expressed as "IIIm7".
  • In this case, the control unit 11 converts the chord appearance rate data and the chord progression appearance rate data defined by relatively expressed chords into absolutely expressed chords based on the set key of the piece of music.
  • The control unit 11 then calculates the chord importance (CS) and the chord progression importance (CPS) based on the appearance rates of the converted chords.
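  • The relative-to-absolute conversion can be sketched as follows; the degree table and the suffix handling are simplified assumptions.

```python
# Sketch: convert a degree-notation chord name to an absolute name for a key.

NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]
DEGREES = {"I": 0, "II": 2, "III": 4, "IV": 5, "V": 7, "VI": 9, "VII": 11}

def to_absolute(relative_chord, key):
    """E.g. to_absolute('IIIm7', 'C') -> 'Em7'."""
    for degree in sorted(DEGREES, key=len, reverse=True):  # match 'III' before 'I'
        if relative_chord.startswith(degree):
            root = NOTE_NAMES[(NOTE_NAMES.index(key) + DEGREES[degree]) % 12]
            return root + relative_chord[len(degree):]
    raise ValueError(f"unknown degree in {relative_chord!r}")

assert to_absolute("IIIm7", "C") == "Em7"
assert to_absolute("IIm7", "E") == "F#m7"
```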
  • The text providing server 1 may use an arithmetic model such as an SVM (Support Vector Machine) or an HMM (Hidden Markov Model) instead of the trained model 155.
  • In this case, the control unit 11 obtains a specific chord progression such as a two-five-one from the chord input data using this arithmetic model.
  • The control unit 11 then combines the acquired chord progression with a predetermined template to generate the commentary text.
  • The predetermined template is, for example, "XXXX is used in this chord progression."
  • In this case, commentary text saying "Two-five-one is used in this chord progression." is generated.
  • With an HMM, the chords included in the chord input data may be input sequentially.
  • With an SVM, a predetermined number of chords included in the chord input data may be input collectively.
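  • The template approach can be sketched as follows; the rule-based detector here is a trivial stand-in for the SVM or HMM described above.

```python
# Sketch: a detector names a progression; the name is spliced into a template.

TEMPLATE = "{name} is used in this chord progression."

def detect_progression(chords):
    """Stand-in detector: recognizes only the basic two-five-one in C major."""
    if chords[-3:] == ["Dm7", "G7", "CM7"]:
        return "Two-five-one"
    return None

def commentary(chords):
    name = detect_progression(chords)
    return TEMPLATE.format(name=name) if name else ""

print(commentary(["Em7", "Dm7", "G7", "CM7"]))
# -> "Two-five-one is used in this chord progression."
```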
  • A server storing a plurality of trained models 155 may be connected to the network NW.
  • This server may be the model generation server 3.
  • The text providing server 1 may select one of the plurality of trained models 155 stored in this server and execute the text providing process described above.
  • The text providing server 1 may download the trained model 155 used in the text providing process from the server and store it in the storage unit 15, or, without downloading it, may transmit chord input data to and receive sentence output data from the server storing the trained model 155.
  • The plurality of trained models 155 differ from each other in at least part of the teacher data sets 355 used for machine learning. For example, when machine learning is performed using a plurality of teacher data sets 355 classified by genre (jazz, classical, etc.), a plurality of trained models 155 corresponding to the genres are generated.
  • The teacher data sets 355 may be classified by genre type or by musical instrument type. With this classification, the chord string data and commentary text data are specialized for that classification.
  • The teacher data sets 355 may also be classified according to the authors of the commentary texts included in their commentary text data 359.
  • For example, by providing chord input data corresponding to music classified as jazz to the trained model 155 corresponding to jazz, highly accurate commentary text can be obtained.
  • The classification of the music corresponding to the chord input data may be set by the user or may be set by analyzing the music.
  • When a plurality of trained models 155 are used, multiple types of commentary text may be obtained. For example, if a plurality of trained models 155 corresponding to a plurality of authors are used, the obtained types of commentary text can be compared and the one suited to the user selected. A new commentary text may also be generated based on the points common to the commentary texts obtained from the plurality of trained models 155.
  • The chord input data and the chord string data 357 are not limited to being described by chroma vectors. For example, they may be represented by other methods as long as the component tones of the chords are represented by data including vectors. Chords may also be described using embeddings such as "word2vec" or "GloVe".
  • As described above, a text providing method is provided that includes providing chord input data, in which chords are arranged in chronological order, to a trained model that has learned the relationship between chord string data, in which chords are arranged in chronological order, and commentary text related to the chords included in the chord string data, and obtaining text corresponding to the chord input data.
  • Obtaining the text may include obtaining the text from the trained model by providing the chord input data to the trained model that has learned the relationship.
  • The chords included in the chord string data may include at least the component tones of the chords and the bass tones.
  • The chords included in the chord string data may include at least the component tones of the chords and the tension tones of the chords.
  • A chord may be represented by data including vectors.
  • A chord may be represented by data including a first chroma vector corresponding to the component tones of the chord.
  • A chord may be represented by data including a second chroma vector corresponding to the bass tone of the chord.
  • A chord may be represented by data including a third chroma vector corresponding to the tension tones of the chord.
  • The commentary text may include a first character group explaining chord progressions.
  • The commentary text may include a second character group explaining the functions of the chords.
  • The commentary text may include a third character group explaining connection techniques between chords.
  • The predetermined condition may include a condition using the chords included in the music chord data and the importance of the chords determined according to the key of the music.
  • The predetermined condition may include a condition using the chords included in the music chord data and the importance of the chords determined according to the genre of the music.
  • A program may be provided for causing a computer to execute the text providing method.
  • A text providing device may be provided that includes a storage unit storing instructions of this program and a processor that executes the instructions.

Landscapes

  • Physics & Mathematics (AREA)
  • Engineering & Computer Science (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Auxiliary Devices For Music (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

One embodiment of the invention relates to a text providing method comprising the following steps: providing chord input data, in which chords are arranged in chronological order, to a trained model that has learned a relationship between chord string data, in which chords are arranged in chronological order, and commentary text about the chords included in the chord string data; and acquiring text corresponding to the chord input data from the trained model.
PCT/JP2022/010084 2021-03-23 2022-03-08 Text providing method, program, and text providing device WO2022202297A1 (fr)

Priority Applications (3)

Application Number Priority Date Filing Date Title
CN202280022223.0A CN116997958A (zh) 2021-03-23 2022-03-08 文章提供方法、程序及文章提供装置
JP2023508950A JPWO2022202297A1 (fr) 2021-03-23 2022-03-08
US18/471,376 US20240013760A1 (en) 2021-03-23 2023-09-21 Text providing method and text providing device

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
JP2021-049200 2021-03-23
JP2021049200 2021-03-23

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US18/471,376 Continuation US20240013760A1 (en) 2021-03-23 2023-09-21 Text providing method and text providing device

Publications (1)

Publication Number Publication Date
WO2022202297A1 (fr)

Family

ID=83395600

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2022/010084 WO2022202297A1 (fr) Text providing method, program, and text providing device

Country Status (4)

Country Link
US (1) US20240013760A1 (fr)
JP (1) JPWO2022202297A1 (fr)
CN (1) CN116997958A (fr)
WO (1) WO2022202297A1 (fr)

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2011215181A * 2010-03-31 2011-10-27 Yamaha Corp Musical score display device and program for implementing a musical score display method
JP2012168538A * 2011-02-14 2012-09-06 Honda Motor Co Ltd Musical score position estimation device and musical score position estimation method

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
KAGO YUINA, KOJIRI TOMOKO: "Piano Accompaniment Support System Focusing on Accompaniment Pattern Image", IEICE TECHNICAL REPORT, ET, IEICE, JP, vol. 116, no. 517 (ET2016-104), 1 March 2017 (2017-03-01), JP, pages 61 - 66, XP009539998 *
RYOTA NISHIMURA, MIHO HIGAKI, NORIHIDE KITAOKA: "RNN-based mapping of acoustic vector time series to document vectors", IEICE TECHNICAL REPORT, SP, IEICE, JP, vol. 118, no. 112 (SP2018-12), 1 June 2018 (2018-06-01), JP, pages 59 - 64, XP009539997 *
RYOUE TAKAHASHI ET AL.: "3-7-4 Consideration of Association between Song Reviews and Acoustic Features", SPRING AND AUTUMN MEETING OF THE ACOUSTICAL SOCIETY OF JAPAN, ACOUSTICAL SOCIETY OF JAPAN, JP, 1 March 2007 (2007-03-01) - 15 March 2007 (2007-03-15), JP, pages 743-744, XP009539970, ISSN: 1880-7658 *

Also Published As

Publication number Publication date
CN116997958A (zh) 2023-11-03
JPWO2022202297A1 (fr) 2022-09-29
US20240013760A1 (en) 2024-01-11

Similar Documents

Publication Publication Date Title
Chen et al. Functional Harmony Recognition of Symbolic Music Data with Multi-task Recurrent Neural Networks.
CN111630590B (zh) 生成音乐数据的方法
Collins et al. Developing and evaluating computational models of musical style
JP2020003535A (ja) プログラム、情報処理方法、電子機器、及び学習済みモデル
Suzuki et al. Four-part harmonization using Bayesian networks: Pros and cons of introducing chord nodes
Cambouropoulos The harmonic musical surface and two novel chord representation schemes
Marxer et al. Unsupervised incremental online learning and prediction of musical audio signals
Banar et al. A systematic evaluation of GPT-2-based music generation
Bigo et al. Sketching sonata form structure in selected classical string quartets
CN113707112A (zh) 基于层标准化的递归跳跃连接深度学习音乐自动生成方法
Colombo et al. Learning to generate music with BachProp
Wu et al. The power of fragmentation: a hierarchical transformer model for structural segmentation in symbolic music generation
Glickman et al. (A) Data in the Life: Authorship Attribution of Lennon-McCartney Songs
WO2022202297A1 (fr) Text providing method, program, and text providing device
US10431191B2 (en) Method and apparatus for analyzing characteristics of music information
Marsden Representing melodic patterns as networks of elaborations
Savage et al. Measuring the cultural evolution of music: With case studies of British-American and Japanese folk, art, and popular music
Syarif et al. Gamelan Melody Generation Using LSTM Networks Controlled by Composition Meter Rules and Special Notes
Kan et al. Generation of irregular music patterns with deep learning
Llorens et al. musif: a Python package for symbolic music feature extraction
Müller et al. Chord Recognition
KR102490769B1 (ko) 음악적 요소를 이용한 인공지능 기반의 발레동작 평가 방법 및 장치
JP2007101780A (ja) 楽曲のタイムスパン木の自動分析方法、自動分析装置、プログラムおよび記録媒体
Hadimlioglu et al. Automated musical transitions through rule-based synthesis using musical properties
Uehara Unsupervised Learning of Harmonic Analysis Based on Neural HSMM with Code Quality Templates

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22775089

Country of ref document: EP

Kind code of ref document: A1

WWE Wipo information: entry into national phase

Ref document number: 2023508950

Country of ref document: JP

WWE Wipo information: entry into national phase

Ref document number: 202280022223.0

Country of ref document: CN

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 22775089

Country of ref document: EP

Kind code of ref document: A1