WO2015115666A1 - Musical composition analysis device and singing evaluation device - Google Patents

Musical composition analysis device and singing evaluation device

Info

Publication number
WO2015115666A1
Authority
WO
WIPO (PCT)
Prior art keywords
attribute
music
lyrics
analysis
evaluation
Prior art date
Application number
PCT/JP2015/053016
Other languages
English (en)
Japanese (ja)
Inventor
松本 秀一
Original Assignee
ヤマハ株式会社
Priority date
Filing date
Publication date
Application filed by ヤマハ株式会社
Publication of WO2015115666A1


Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10H ELECTROPHONIC MUSICAL INSTRUMENTS; INSTRUMENTS IN WHICH THE TONES ARE GENERATED BY ELECTROMECHANICAL MEANS OR ELECTRONIC GENERATORS, OR IN WHICH THE TONES ARE SYNTHESISED FROM A DATA STORE
    • G10H 1/00 Details of electrophonic musical instruments
    • G10H 1/36 Accompaniment arrangements
    • G10H 1/361 Recording/reproducing of accompaniment for use with an external source, e.g. karaoke systems
    • G10H 1/368 Recording/reproducing of accompaniment for use with an external source, e.g. karaoke systems displaying animated or moving pictures synchronized with the music or audio part
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10H ELECTROPHONIC MUSICAL INSTRUMENTS; INSTRUMENTS IN WHICH THE TONES ARE GENERATED BY ELECTROMECHANICAL MEANS OR ELECTRONIC GENERATORS, OR IN WHICH THE TONES ARE SYNTHESISED FROM A DATA STORE
    • G10H 2210/00 Aspects or methods of musical processing having intrinsic musical character, i.e. involving musical theory or musical parameters or relying on musical knowledge, as applied in electrophonic musical tools or instruments
    • G10H 2210/031 Musical analysis, i.e. isolation, extraction or identification of musical elements or musical parameters from a raw acoustic signal or from an encoded audio signal
    • G10H 2210/091 Musical analysis, i.e. isolation, extraction or identification of musical elements or musical parameters from a raw acoustic signal or from an encoded audio signal for performance evaluation, i.e. judging, grading or scoring the musical qualities or faithfulness of a performance, e.g. with respect to pitch, tempo or other timings of a reference performance
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10H ELECTROPHONIC MUSICAL INSTRUMENTS; INSTRUMENTS IN WHICH THE TONES ARE GENERATED BY ELECTROMECHANICAL MEANS OR ELECTRONIC GENERATORS, OR IN WHICH THE TONES ARE SYNTHESISED FROM A DATA STORE
    • G10H 2220/00 Input/output interfacing specifically adapted for electrophonic musical tools or instruments
    • G10H 2220/005 Non-interactive screen display of musical or status data
    • G10H 2220/011 Lyrics displays, e.g. for karaoke applications
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10H ELECTROPHONIC MUSICAL INSTRUMENTS; INSTRUMENTS IN WHICH THE TONES ARE GENERATED BY ELECTROMECHANICAL MEANS OR ELECTRONIC GENERATORS, OR IN WHICH THE TONES ARE SYNTHESISED FROM A DATA STORE
    • G10H 2240/00 Data organisation or data communication aspects, specifically adapted for electrophonic musical tools or instruments
    • G10H 2240/075 Musical metadata derived from musical analysis or for use in electrophonic musical instruments
    • G10H 2240/081 Genre classification, i.e. descriptive metadata for classification or selection of musical pieces according to style

Definitions

  • The present invention relates to a technique for estimating the attributes of music.
  • In Patent Document 1, a classification code that specifies a musical genre such as enka or pop is set in advance for each song, and the BGV (background video) displayed while the song is sung is selected according to the classification code of the song.
  • Patent Document 2 discloses a configuration in which the singing scoring standard is changed according to the genre of the music.
  • An object of the present invention is to estimate the attributes of music without requiring information that designates the attributes for each piece of music.
  • The music analysis apparatus of the present invention comprises a lyrics analysis unit that specifies the words and phrases included in the lyrics of music, and an attribute estimation unit that estimates the attributes of the music from the words and phrases specified by the lyrics analysis unit.
  • In this configuration, the attributes of the music are estimated from the words and phrases that the lyrics analysis unit specifies in the lyrics of the music, so the attributes can be estimated without requiring information that designates the attributes for each piece of music.
  • In one aspect, the attribute estimation unit estimates the attributes of the music from the phrases specified by the lyrics analysis unit, using reference information that specifies, for each phrase, an affinity for each of a plurality of attributes.
  • In another aspect, the attribute estimation unit estimates the attributes of the music from the phrases specified by the lyrics analysis unit, using a recognition model for each attribute that represents the tendency of the phrases used in the lyrics of music having that attribute.
  • Since the attribute of a song is estimated using a per-attribute recognition model representing the tendency of the phrases used in the lyrics of songs of each attribute, an appropriate estimation is realized that reflects the tendency of the phrases actually used in the lyrics of many songs of each attribute.
  • In a preferred aspect, the attribute estimation unit estimates an attribute for each of a plurality of analysis sections into which the music is divided, according to the phrases that the lyrics analysis unit specifies in the lyrics of each analysis section.
  • Since an attribute is estimated for each analysis section into which the music is divided, it is possible to estimate attributes that appropriately reflect temporal changes of the tune and the subject within the music.
  • In a preferred aspect, the attribute estimation unit also takes music information of the music (for example, performance tempo and tone) into account when estimating its attributes.
  • The present invention also provides a singing evaluation apparatus.
  • A singing evaluation apparatus according to one aspect comprises a lyrics analysis unit that specifies the phrases included in the lyrics of a song, an attribute estimation unit that estimates the attributes of the song from the phrases specified by the lyrics analysis unit, and a singing evaluation unit that evaluates a singing voice by an evaluation method corresponding to the attribute estimated by the attribute estimation unit.
  • A singing evaluation apparatus according to another aspect comprises the lyrics analysis unit and the attribute estimation unit described above, and a singing evaluation unit that causes a display device to display a comment corresponding to the evaluation result of the singing voice and the attribute estimated by the attribute estimation unit.
  • A singing evaluation apparatus according to yet another aspect comprises the lyrics analysis unit and the attribute estimation unit described above, and a control unit that controls the operation of a lighting device, or an image displayed by a display device, according to the attribute estimated by the attribute estimation unit.
  • FIG. 1 is a configuration diagram of a singing evaluation apparatus 100 according to the first embodiment of the present invention.
  • The singing evaluation apparatus 100 is an information processing apparatus that evaluates (scores) the skill of a user's singing, and is realized by a computer system including an arithmetic processing device 12, a storage device 14, a sound collection device 16, and a display device 18.
  • The singing evaluation apparatus 100 is suitably used, for example, as a karaoke apparatus that reproduces the accompaniment sounds of music.
  • The sound collection device 16 is a device (microphone) that collects ambient sound.
  • The sound collection device 16 of the first embodiment collects the singing voice V of a user singing a specific piece of music (hereinafter referred to as the "target music").
  • The target music is, for example, a piece selected by the user from a plurality of pieces of music.
  • The display device 18 (for example, a liquid crystal display panel) displays images instructed by the arithmetic processing device 12; for example, the evaluation result of the singing voice V is displayed on the display device 18.
  • A sound emitting device (for example, a speaker) is also provided, although its illustration is omitted here.
  • The arithmetic processing device 12 integrally controls the elements of the singing evaluation apparatus 100 by executing a program stored in the storage device 14. Specifically, as illustrated in FIG. 1, the arithmetic processing device 12 realizes a music analysis unit 22 that estimates the attribute of the target music, and a singing evaluation unit 24 that evaluates the skill of the singing of the target music according to the estimation result of the music analysis unit 22.
  • The attribute estimated by the music analysis unit 22 is, for example, information that directly or indirectly represents the tone (atmosphere) or the subject of the target music.
  • For example, information such as "broken heart", "sad", "inseparable", "happy", "fun", "hope", "city", "old", "future", "parting", and "meeting" is estimated as an attribute of the target music.
  • Musical genres such as "enka", "pops", and "love songs" may also be estimated as attributes.
  • A configuration in which the functions of the arithmetic processing device 12 are distributed over a plurality of devices, or a configuration in which a dedicated electronic circuit realizes part of the functions of the arithmetic processing device 12, may also be employed.
  • The storage device 14 stores the program executed by the arithmetic processing device 12 and the various data used by the arithmetic processing device 12.
  • A known recording medium such as a semiconductor recording medium or a magnetic recording medium, or a combination of a plurality of types of recording media, may be arbitrarily employed as the storage device 14.
  • The storage device 14 of the first embodiment stores music information D for each of a plurality of pieces of music.
  • The music information D of an arbitrary piece of music includes melody information DA that designates the note sequence of the main melody of the piece (a time series of the notes constituting the singing part of the music) and lyrics information DB that designates the lyrics of the piece (a time series of phrases).
  • The storage device 14 of the first embodiment also stores reference information R that the music analysis unit 22 uses for estimating the attributes of the target music.
  • FIG. 2 is a schematic diagram of the reference information R.
  • The affinity H of a phrase W for an attribute a[k] is an index of the degree to which the phrase W is compatible (harmonizes) with the lyrics of music having the attribute a[k]. Specifically, the affinity H takes a larger value as the phrase W is more likely to appear in the lyrics of music with the attribute a[k] (that is, as the phrase W fits the musical atmosphere of such music more closely).
  • The affinity H of each phrase W is statistically set for each of the K attributes a[1] to a[K].
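  • As a concrete illustration, the reference information R can be held as a mapping from each phrase W to its affinities H for the attributes a[1] to a[K]. The sketch below is a minimal Python rendering; the phrase entries, attribute names, and affinity values are hypothetical placeholders, not taken from this publication.

```python
# Hypothetical reference information R: each phrase W maps to its affinity H
# for each attribute a[k]. All entries below are illustrative placeholders.
REFERENCE_INFO_R: dict[str, dict[str, float]] = {
    "tears":    {"sad": 0.9, "happy": 0.1, "parting": 0.7},
    "sunshine": {"sad": 0.1, "happy": 0.8, "parting": 0.2},
    "goodbye":  {"sad": 0.6, "happy": 0.2, "parting": 0.9},
}
```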
  • FIG. 3 is a configuration diagram of the music analysis unit 22.
  • The music analysis unit 22 of the first embodiment includes a lyrics analysis unit 32 and an attribute estimation unit 34.
  • The lyrics analysis unit 32 analyzes the lyrics designated by the lyrics information DB of the target music. Specifically, the lyrics analysis unit 32 specifies the plurality of phrases W included in the lyrics of the target music. Known natural language processing such as morphological analysis is arbitrarily employed for the specification of each phrase W by the lyrics analysis unit 32.
  • For example, the lyrics analysis unit 32 extracts, as the phrases W, predetermined words that can be used in lyrics from among the morphemes (the smallest units having linguistic meaning) obtained by segmenting the lyrics designated by the lyrics information DB.
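  • A minimal stand-in for this step is sketched below. For Japanese lyrics an actual morphological analyzer (MeCab is one widely used example) would segment the text into morphemes; here a simple regular-expression split over English-like lyrics, with a hypothetical stopword filter, illustrates the idea of keeping only phrases W usable for estimation.

```python
import re

# Hypothetical stopword list standing in for "morphemes that are not
# predetermined phrases W" (function words, particles, and the like).
STOPWORDS = {"the", "a", "an", "and", "or", "of", "to", "in", "is"}

def extract_phrases(lyrics: str) -> list[str]:
    """Rough stand-in for the lyrics analysis unit 32: return the phrases W."""
    tokens = re.findall(r"[a-z']+", lyrics.lower())
    return [t for t in tokens if t not in STOPWORDS]
```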
  • The attribute estimation unit 34 estimates the attribute A of the target music from the plurality of phrases W that the lyrics analysis unit 32 specifies in the lyrics of the target music.
  • The reference information R stored in the storage device 14 is used for the attribute estimation by the attribute estimation unit 34.
  • FIG. 4 is a flowchart of the process by which the attribute estimation unit 34 estimates the attribute A of the target music (hereinafter referred to as the "attribute estimation process").
  • In the attribute estimation process, the attribute estimation unit 34 first counts the appearance count C of each of the plurality of phrases W specified by the lyrics analysis unit 32 from the lyrics of the target music (SA1). That is, the appearance count C of a phrase W corresponds to the number of times the phrase W occurs in the lyrics of the target music.
  • Next, the attribute estimation unit 34 calculates an estimated index X[k] (X[1] to X[K]) for each of the K attributes a[1] to a[K] registered in the reference information R (SA2).
  • Specifically, the estimated index X[k] of an attribute a[k] is set to a larger value as the lyrics contain larger appearance counts C of phrases W whose affinity H for the attribute a[k], as specified by the reference information R, is high.
  • In other words, the estimated index X[k] of the attribute a[k] tends to be set to a larger value as the lyrics of the target music contain more appearances C of phrases W that fit the musical atmosphere of the attribute a[k] (phrases W with a high affinity H). Each of the estimated indexes X[1] to X[K] can therefore be used for estimating the attribute A of the target music.
  • Finally, the attribute estimation unit 34 estimates the attribute A of the target music according to the K estimated indexes X[1] to X[K] calculated by the above procedure (SA3). Specifically, the attribute estimation unit 34 of the first embodiment confirms, as the attribute A of the target music, the one attribute a[k] whose estimated index X[k] is maximal among the K attributes a[1] to a[K].
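  • Steps SA1 to SA3 can thus be read as: count each phrase, accumulate a per-attribute score that grows with appearance count and affinity, and take the maximum. The exact formula for X[k] is not spelled out in this text; the sketch below assumes the natural reading of X[k] as the affinity-weighted sum of the appearance counts C.

```python
from collections import Counter

def estimate_attribute(phrases: list[str],
                       reference_info: dict[str, dict[str, float]]) -> str:
    """Sketch of the attribute estimation process (SA1-SA3)."""
    counts = Counter(phrases)                      # SA1: appearance counts C
    x: dict[str, float] = {}                       # SA2: estimated indexes X[k]
    for phrase, c in counts.items():
        for attr, affinity_h in reference_info.get(phrase, {}).items():
            x[attr] = x.get(attr, 0.0) + c * affinity_h
    return max(x, key=x.get)                       # SA3: attribute A = argmax X[k]
```

  • With the placeholder table above, estimate_attribute(extract_phrases("goodbye my love, tears in the rain"), REFERENCE_INFO_R) would return "parting".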
  • The singing evaluation unit 24 evaluates the skill of the user's singing by analyzing the singing voice V according to the attribute A of the target music estimated by the music analysis unit 22.
  • Specifically, the singing evaluation unit 24 compares the note sequence designated by the melody information DA of the music information D of the target music with the singing voice V collected by the sound collection device 16, and calculates an evaluation index (score) S corresponding to the degree of matching (difference) between the two.
  • FIG. 5 is a flowchart of the operation by which the singing evaluation unit 24 evaluates the singing voice V (hereinafter referred to as the "singing evaluation process").
  • First, the singing evaluation unit 24 calculates basic values s[1] to s[N] for each of a plurality (N) of different musical evaluation items (SB1).
  • Each basic value is calculated from the singing of the target music without regard to the attribute of the target music.
  • The N evaluation items are musical viewpoints for evaluating singing skill, and include, for example, the accuracy of pitch, the appropriateness of inflection, and the presence or absence of singing techniques (fall, vibrato, kobushi, shakuri, long tone).
  • A known technique is arbitrarily adopted for the detection of each evaluation item.
  • For example, Japanese Unexamined Patent Application Publication No. 2008-268370 discloses a technique for detecting a section sung with the kobushi technique.
  • Japanese Unexamined Patent Application Publication No. 2004-102146 discloses a technique for detecting vibrato.
  • Japanese Unexamined Patent Application Publication No. 2012-8596 discloses a technique for detecting a long tone.
  • Japanese Unexamined Patent Application Publication No. 2007-334364 discloses techniques for detecting vibrato, inflection, voice quality, timing, and shakuri.
  • A known technique is also arbitrarily employed for the singing evaluation of each evaluation item (the calculation of the basic values s[1] to s[N]).
  • For example, Japanese Patent No. 5585320 discloses singing evaluation based on a singing voice and music information (music data).
  • Next, the singing evaluation unit 24 specifies N weight values w[1] to w[N] corresponding to the different evaluation items (SB2).
  • The weight information E is a data table that designates N weight values w[1] to w[N] for each of the K attributes a[1] to a[K].
  • Each of the N weight values w[1] to w[N] is set individually for each attribute a[k] according to the importance of the corresponding evaluation item in the singing of music with that attribute a[k].
  • For example, for one attribute a[k], the weight value w[n] of each singing-technique evaluation item is set to a relatively large value.
  • For another attribute a[k], the weight value w[n] of the pitch-accuracy evaluation item is set to a relatively large value.
  • In the illustrated example, the weight value w[n] of each evaluation item is a numerical value from 1 to 10.
  • For one attribute, the values for shakuri, kobushi, and vibrato are set large, while the value for rhythm is set small.
  • For another attribute, the values for shakuri, vibrato, and pitch are set large, while the value for kobushi is set small.
  • For yet another attribute, the values for pitch and long tone are set large, while the values for shakuri, kobushi, and inflection are set small.
  • The singing evaluation unit 24 specifies, from the weight information E in the storage device 14, the N weight values w[1] to w[N] that correspond to the attribute A estimated by the attribute estimation unit 34 among the K attributes a[1] to a[K].
  • Then, the singing evaluation unit 24 calculates the evaluation index S of the singing voice V according to the N basic values s[1] to s[N] calculated in step SB1 and the N weight values w[1] to w[N] specified in step SB2 (SB3); as described later, the evaluation index S is a weighted sum of the basic values s[n].
  • A specific example of the basic values s[n] calculated in step SB1 is shown in FIG. 8; each basic value s[n] in FIG. 8 is a numerical value from 1 to 5.
  • In this way, the singing evaluation unit 24 evaluates the singing voice V by a variable evaluation method (evaluation standard) corresponding to the attribute A estimated for the target music.
  • Finally, the singing evaluation unit 24 displays the evaluation index S calculated by the above procedure on the display device 18 (SB4).
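  • Taken together, steps SB2 and SB3 amount to looking up the weight vector for the estimated attribute A and forming the weighted sum of the basic values. A minimal sketch follows; the attribute names, evaluation items, and weight values in the table are hypothetical (the text only states that weights lie in 1 to 10 and basic values in 1 to 5), and no normalization is applied since none is specified.

```python
# Hypothetical weight information E: attribute a[k] -> weight w[n] per item.
WEIGHT_INFO_E: dict[str, dict[str, int]] = {
    "enka": {"pitch": 3, "kobushi": 9, "vibrato": 8, "rhythm": 2},
    "pops": {"pitch": 8, "kobushi": 2, "vibrato": 5, "rhythm": 7},
}

def evaluation_index(basic_values: dict[str, float], attribute_a: str) -> float:
    """SB2: select the weights w[n] for attribute A; SB3: weighted sum of s[n]."""
    weights = WEIGHT_INFO_E[attribute_a]
    return sum(basic_values[item] * w for item, w in weights.items())
```

  • For example, evaluation_index({"pitch": 4, "kobushi": 2, "vibrato": 3, "rhythm": 5}, "enka") yields 64 with the placeholder table above.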
  • As described above, in the first embodiment, the attribute A of the target music is estimated according to the plurality of phrases W that the lyrics analysis unit 32 specifies in the lyrics of the target music, so the attribute can be estimated without requiring information that designates the attribute for each piece of music.
  • There is also an advantage that the attribute A of the target music can be estimated simply, by using the reference information R that specifies the affinity H of each phrase W for each of the K attributes a[1] to a[K].
  • Second Embodiment: A second embodiment of the present invention will be described below.
  • In each of the configurations illustrated below, elements whose operations and functions are the same as those of the first embodiment are given the reference signs used in the description of the first embodiment, and their detailed descriptions are omitted as appropriate.
  • FIG. 9 is a configuration diagram of the music analysis unit 22 in the second embodiment.
  • The attribute estimation unit 34 of the music analysis unit 22 of the second embodiment estimates the attribute A of the target music using K recognition models M[1] to M[K] corresponding to the different attributes a[k].
  • The recognition model M[k] corresponding to an arbitrary attribute a[k] is a statistical model representing the tendency of the phrases W used in the lyrics of music having the attribute a[k], for example a GMM (Gaussian mixture model).
  • The K recognition models M[1] to M[K] are generated in advance by machine learning using a large amount of learning information L corresponding to different pieces of music.
  • Each item of learning information L includes attribute information (a label) LA that designates the attribute a[k] of a piece of music, and lyrics information LB that designates the lyrics of that piece.
  • A plurality of attributes a[k] may be designated for an arbitrary piece of music.
  • The large amount of learning information L is classified by the attribute a[k] designated by the attribute information LA, and the tendency of the phrases W extracted from the lyrics information LB of the learning information L classified into each attribute a[k] is analyzed statistically, whereby a recognition model M[k] is generated for each attribute a[k].
  • The K recognition models M[1] to M[K] prepared in advance by the machine learning exemplified above (in practice, the variables that define each recognition model M[k]) are stored in the storage device 14 and used by the attribute estimation unit 34 for estimating the attribute A of the target music.
  • A known machine learning technique is arbitrarily employed for generating the recognition models M[k]; for example, Japanese Unexamined Patent Application Publication No. 2006-139185 (paragraph [0011]) discloses a technique related to the generation of a recognition model.
  • Each recognition model M[k] of the second embodiment is a statistical model used to calculate the estimated index Y[k] of the attribute a[k] according to the plurality of phrases W that the lyrics analysis unit 32 specifies in the lyrics of the target music.
  • The estimated index Y[k] calculated with any one recognition model M[k] is an index of the probability (likelihood) that the target music corresponds to the attribute a[k].
  • The more closely the plurality of phrases W of the target music match the tendency represented by the recognition model M[k] (the tendency of the phrases W used in the lyrics of music with the attribute a[k]), the larger the value to which the estimated index Y[k] is set. The estimated index (likelihood) Y[k] can therefore also be regarded as the affinity of the phrases W for the attribute a[k].
  • FIG. 10 is a flowchart of the attribute estimation process by which the attribute estimation unit 34 of the second embodiment estimates the attribute A of the target music.
  • First, the attribute estimation unit 34 applies the plurality of phrases W specified by the lyrics analysis unit 32 to each of the K recognition models M[1] to M[K], thereby calculating K estimated indexes Y[1] to Y[K] corresponding to the different attributes a[k] (SC1).
  • The estimated index Y[k] is set to a larger value as the degree to which the plurality of phrases W used in the lyrics of the target music match the tendency of the recognition model M[k] increases (that is, as the likelihood that the target music corresponds to the attribute a[k], judged from the tendency of the words used in its lyrics, increases).
  • Then, the attribute estimation unit 34 estimates the attribute A of the target music according to the K estimated indexes Y[1] to Y[K] (SC2). Specifically, the attribute estimation unit 34 of the second embodiment confirms, as the attribute A of the target music, the one attribute a[k] whose estimated index Y[k] is maximal among the K attributes a[1] to a[K].
  • The above is a specific example of the attribute estimation process in the second embodiment.
  • The content of the singing evaluation process by the singing evaluation unit 24 is the same as in the first embodiment.
  • In the second embodiment, the attribute A of the target music is estimated using the per-attribute recognition models M[k] representing the tendency of the phrases used in the lyrics of music with each attribute a[k], so an appropriate estimation of the attribute A is realized that reflects the actual tendency of the phrases W used in the lyrics of a large number of pieces of music with each attribute a[k].
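  • The publication leaves the form of the recognition model open (a GMM is given as one example). As a minimal stand-in that still captures "the tendency of the phrases used in the lyrics of each attribute", the sketch below trains a smoothed unigram model per attribute from labeled learning information L and scores a lyric by a log-likelihood Y[k] (SC1), taking the argmax as the attribute A (SC2).

```python
import math
from collections import Counter

def train_models(learning_info: list[tuple[str, list[str]]]) -> dict[str, Counter]:
    """learning_info: (attribute label LA, phrases W from lyrics LB) pairs."""
    models: dict[str, Counter] = {}
    for attribute, phrases in learning_info:
        models.setdefault(attribute, Counter()).update(phrases)
    return models

def estimate_attribute_ml(phrases: list[str], models: dict[str, Counter]) -> str:
    vocabulary = {w for counts in models.values() for w in counts}
    y: dict[str, float] = {}
    for attribute, counts in models.items():
        total = sum(counts.values()) + len(vocabulary) + 1
        # SC1: estimated index Y[k] as an add-one-smoothed log-likelihood.
        y[attribute] = sum(math.log((counts[w] + 1) / total) for w in phrases)
    return max(y, key=y.get)  # SC2: attribute A = argmax Y[k]
```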
  • FIG. 11 is an explanatory diagram of the operation of the singing evaluation apparatus 100 according to the third embodiment.
  • In the third embodiment, the attribute estimation unit 34 estimates the attribute A individually for each of a plurality of analysis sections P into which the target music is divided (a total of four analysis sections P1 to P4 in the example of FIG. 11). Specifically, the attribute estimation unit 34 estimates the attribute A of an arbitrary analysis section P by applying, to the attribute estimation process, each phrase W that the lyrics analysis unit 32 identifies in the lyrics of that analysis section.
  • A method similar to that of the first embodiment or the second embodiment may be adopted for the estimation of each attribute A.
  • The analysis sections P are, for example, the sections into which the target music is divided on the time axis with interludes as boundaries (the first verse, the second verse, and so on).
  • FIG. 11 illustrates a case where the attribute A "exciting" is estimated for the analysis section P1 of the target music, the attribute A "parting" for the analysis section P2, the attribute A "toned" for the analysis section P3, and the attribute A "sad" for the analysis section P4.
  • The evaluation method for the singing voice V can thus be changed for each analysis section P according to the estimation result of the attribute A.
  • The third embodiment realizes the same effects as the first embodiment. Furthermore, since the attribute A is estimated for each analysis section P, it is possible to estimate attributes A that appropriately reflect temporal changes of the tune and the subject within the music.
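  • Reusing the first-embodiment sketch above, the per-section variant simply runs the same estimation once per analysis section P; the helper below assumes the hypothetical extract_phrases and estimate_attribute functions and the REFERENCE_INFO_R table defined earlier.

```python
def estimate_section_attributes(section_lyrics: list[str]) -> list[str]:
    """Estimate an attribute A for each analysis section P (e.g. per verse)."""
    return [estimate_attribute(extract_phrases(text), REFERENCE_INFO_R)
            for text in section_lyrics]
```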
  • The attribute estimation unit 34 of the fourth embodiment estimates the attribute A of the target music in accordance with the performance tempo of the target music. Specifically, the attribute estimation unit 34 variably controls, according to the performance tempo, the range of attributes a[k] that can be selected as the attribute A of the target music (the candidate range of the attribute A).
  • For example, the attribute estimation unit 34 determines whether the performance tempo of the target music is fast or slow. As shown in FIG. 12, when the performance tempo is slow, the attribute A is selected from all K attributes a[1] to a[K], including negative attributes a[k], whereas when the performance tempo is fast, the negative attributes a[k] are excluded from the selection targets. That is, a negative attribute a[k] is not selected as the attribute A of a target music with a fast performance tempo.
  • The fourth embodiment realizes the same effects as the first embodiment. Furthermore, since the performance tempo of the target music is taken into account in addition to the phrases W of its lyrics, the attribute A of the target music can be estimated with higher accuracy than in a configuration that uses only the phrases W for the estimation.
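  • One way to realize this control is to filter the candidate attributes before taking the argmax. The threshold and the set of negative attributes below are hypothetical; the text only distinguishes "fast" from "slow" tempo.

```python
NEGATIVE_ATTRIBUTES = {"sad", "parting"}   # hypothetical "negative" attributes
FAST_TEMPO_BPM = 120.0                     # hypothetical fast/slow boundary

def candidate_attributes(all_attributes: set[str], tempo_bpm: float) -> set[str]:
    """Exclude negative attributes from the candidates when the tempo is fast."""
    if tempo_bpm >= FAST_TEMPO_BPM:
        return all_attributes - NEGATIVE_ATTRIBUTES
    return all_attributes
```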
  • In the embodiments above, the one attribute a[k] whose estimated index (X[k] or Y[k]) is maximal among the K attributes a[1] to a[K] is confirmed as the attribute A of the target music.
  • However, a combination of a plurality of attributes a[k] can also be estimated as the attribute A of the target music.
  • For example, a configuration may be adopted in which one or more attributes a[k] whose estimated index exceeds a predetermined threshold are selected as the attribute A of the target music.
  • When a plurality of attributes a[k] are estimated, the singing evaluation process by the singing evaluation unit 24 is executed according to those attributes. For example, assuming that a combination of three attributes a[k1], a[k2], and a[k3] is estimated as the attribute A of the target music, the singing evaluation unit 24 combines (for example, averages), for each evaluation item, the weight value w[n] corresponding to the attribute a[k1], the weight value w[n] corresponding to the attribute a[k2], and the weight value w[n] corresponding to the attribute a[k3] into the definitive weight value w[n] of the basic value s[n] used for calculating the evaluation index S.
  • As understood from the above, each weight value w[n] applied to the calculation of the evaluation index S (the weighted sum of the basic values s[n]) is changed according to the attribute A of the target music.
  • The configuration for changing the singing evaluation method according to the attribute A of the target music is not limited to the above examples.
  • For example, any of the following may be employed: a configuration in which the evaluation index S is calculated by an evaluation method selected, according to the attribute A of the target music, from a plurality of different evaluation methods (singing evaluation processes); a configuration in which a variable applied to the calculation of the basic value s[n] of each evaluation item is set variably according to the attribute A of the target music; and a configuration in which the combination of basic values s[n] applied to the calculation of the evaluation index S among the N basic values s[n] (in other words, the set of evaluation items included in the singing evaluation) is changed according to the attribute A of the target music.
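  • When a plurality of attributes are estimated, their weight vectors must be merged into one definitive weight per evaluation item. The sketch below assumes averaging as the combining rule (the text does not fix one) and reuses the hypothetical WEIGHT_INFO_E table from the earlier singing evaluation sketch.

```python
def combined_weights(attributes: list[str]) -> dict[str, float]:
    """Average the weight vectors w[n] of all estimated attributes a[k1], a[k2], ..."""
    items = WEIGHT_INFO_E[attributes[0]].keys()
    return {item: sum(WEIGHT_INFO_E[a][item] for a in attributes) / len(attributes)
            for item in items}
```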
  • The arithmetic processing device 12 may select the image (BGV) displayed on the display device 18 while the target music is sung according to the attribute A of the target music. For example, when the attribute A "sad" is estimated for the target music, a BGV with a sad impression is selected and displayed. In a configuration in which the BGV (moving image) displayed on the display device 18 is sequentially selected from a plurality of candidates at each point in the music, the selection target at each point can also be controlled according to the attribute A of the target music. A configuration in which an image displayed on a terminal device owned by each user, such as a mobile phone or a smartphone, is controlled according to the attribute A may also be employed.
  • The arithmetic processing device 12 may also cause the display device 18 to display a comment (an evaluation comment or a guidance comment) corresponding to the attribute A of the target music.
  • For example, the singing evaluation unit 24 presents to the user, together with the evaluation index S, an evaluation comment selected according to the attribute A from a plurality of candidates prepared in advance. For example, when the evaluation index S exceeds a predetermined value and the attribute A "satisfied" is estimated for the target music, an evaluation comment such as "Nice songs are also good!" is displayed on the display device 18 together with the evaluation index S.
  • Similarly to the presentation of the evaluation comment described above, the singing evaluation unit 24 may present to the user, together with the evaluation index S, a guidance comment selected according to the attribute A from a plurality of candidates prepared in advance.
  • For example, when the evaluation index S is lower than a predetermined value and the target music has an attribute A such as "not cut" or "sad" (an emotional attribute that should be sung with particular attention to inflection rather than to other elements such as pitch), a guidance comment such as "Let's sing with emotion while paying attention to inflection" is displayed on the display device 18 together with the evaluation index S.
  • The arithmetic processing device 12 (singing evaluation unit 24) can also control, according to the attribute A, the operation of various lighting devices installed in the acoustic space together with the singing evaluation apparatus 100 (for example, devices such as mirror balls or lasers that realize various visual effects). For example, when the attribute A "not cut" is estimated, warm-colored (for example, red) illumination light is emitted; when the attribute A "fresh" is estimated, cold-colored (for example, blue) illumination light is emitted; and when the attribute A "violent" is estimated, the brightness or hue of the illumination light is varied.
  • In the fourth embodiment, the attribute A of the target music is estimated in accordance with the performance tempo, but the music information (musical elements) taken into account in the estimation of the attribute A is not limited to the performance tempo.
  • For example, music information such as the performance part configuration (combination of musical instruments), the rhythm pattern, the tone, the length, and the structure of the target music can also be taken into account in the estimation of the attribute A.
  • Although the embodiments above illustrate the singing evaluation apparatus 100, which applies the attribute A estimated by the music analysis unit 22 to the singing evaluation process, the present invention can also be realized as a music analysis apparatus that estimates the attribute A of the target music.
  • The music analysis apparatus is an information processing apparatus that includes the music analysis unit 22 (the lyrics analysis unit 32 and the attribute estimation unit 34) exemplified in the embodiments described above.
  • The singing evaluation apparatus 100 of each of the embodiments described above corresponds to a configuration in which the singing evaluation unit 24 is added to the music analysis apparatus.
  • The music analysis apparatus is realized by hardware (an electronic circuit) such as a DSP (Digital Signal Processor) dedicated to the analysis of music attributes, or by the cooperation of a general-purpose arithmetic processing device such as a CPU (Central Processing Unit) and a program.
  • The program of the present invention causes a computer to function as a lyrics analysis unit that specifies the phrases included in the lyrics of music, and as an attribute estimation unit that estimates the attributes of the music from the phrases specified by the lyrics analysis unit.
  • The program of the present invention can be provided in a form stored in a computer-readable recording medium and installed in a computer.
  • The recording medium is, for example, a non-transitory recording medium; an optical recording medium (optical disc) such as a CD-ROM is a good example, but any known recording medium such as a semiconductor recording medium or a magnetic recording medium can also be included.
  • The program of the present invention can also be provided in the form of distribution via a communication network and installed in a computer.
  • The present invention is also specified as an operation method (a music analysis method) of the music analysis apparatus according to each aspect exemplified above.
  • A music analysis method includes a lyrics analysis process of specifying the phrases included in the lyrics of music, and an attribute estimation process of estimating the attributes of the music from the phrases specified in the lyrics analysis process.

Landscapes

  • Physics & Mathematics (AREA)
  • Engineering & Computer Science (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Reverberation, Karaoke And Other Acoustics (AREA)

Abstract

The present invention relates to a lyrics analysis unit (32) that identifies multiple words (W) included in the lyrics of a musical composition. An attribute estimation unit (34) estimates the attributes (A) of the musical composition under study from the multiple words (W) identified by the lyrics analysis unit (32). Specifically, the attribute estimation unit (34) uses reference data (R), which specify the degree of affinity of each word with respect to multiple attributes, in order to estimate the attributes (A) of the musical composition under study from the multiple words (W) identified by the lyrics analysis unit (32).
PCT/JP2015/053016 2014-02-03 2015-02-03 Musical composition analysis device and singing evaluation device WO2015115666A1 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
JP2014018518A 2014-02-03 2014-02-03 Music analysis device and singing evaluation device
JP2014-018518 2014-02-03

Publications (1)

Publication Number Publication Date
WO2015115666A1 true WO2015115666A1 (fr) 2015-08-06

Family

ID=53757231

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2015/053016 WO2015115666A1 (fr) Musical composition analysis device and singing evaluation device

Country Status (2)

Country Link
JP (1) JP2015145955A (fr)
WO (1) WO2015115666A1 (fr)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114630472A (zh) * 2020-12-10 2022-06-14 逸驾智能科技有限公司 Light control method and device

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH08190394A * (ja) Method and device for displaying graphics linked to song content on a karaoke device
JPH09212480A * (ja) Atmosphere information generation device and karaoke device
JPH11327551A * (ja) Method and device for determining the arrangement style of music, and storage medium storing a music arrangement style determination program
JP2007200495A * (ja) Music playback device, music playback method, and music playback program
JP2009204870A * (ja) Karaoke device
JP2011095437A * (ja) Karaoke scoring system
JP2012058278A * (ja) Voice evaluation device

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2008026622A * (ja) Evaluation device
JP5772054B2 * (ja) Singing evaluation device


Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
GO KIKUCHI: "A Study on Scene Presumption of Music based on Lyrics using PMM", Proceedings of the 2011 IEICE General Conference, Joho System 1, 28 February 2011 (2011-02-28), page 26 *
NAOKI NISHIKAWA: "Musical Mood Trajectory Estimation using Lyrics and Audio Features", Dai 73 Kai (Heisei 23 Nen) Zenkoku Taikai Koen Ronbunshu (2) Jinko Chino to Ninchi Kagaku, 2 March 2011 (2011-03-02), pages 2-297 to 2-298, XP055217654 *
SHUHEI HIROTA: "Kashi karano Ongakuteki Yoso no Chushutsu" [Extraction of Musical Elements from Lyrics], Proceedings of the Eleventh Annual Meeting of the Association for Natural Language Processing, 15 March 2005 (2005-03-15), pages 799-802 *

Also Published As

Publication number Publication date
JP2015145955A (ja) 2015-08-13

Similar Documents

Publication Publication Date Title
Rink et al. Motive, gesture and the analysis of performance
US20180268792A1 (en) System and method for automatically generating musical output
KR100895009B1 (ko) Music recommendation system and method thereof
Davis et al. Generating music from literature
JP2008026622A (ja) Evaluation device
CA3064738A1 (fr) System and method for automatically generating musical output
CN108986843A (zh) Audio data processing method and device, medium, and computing device
Ramirez et al. Automatic performer identification in commercial monophonic jazz performances
JP6350325B2 (ja) Voice analysis device and program
JP2015191194A (ja) Performance evaluation system, server device, terminal device, performance evaluation method, and computer program
WO2015115666A1 (fr) Musical composition analysis device and singing evaluation device
JP2014178620A (ja) Voice processing device
JP5830840B2 (ja) Voice evaluation device
JP2015194767A (ja) Voice evaluation device
JP6954780B2 (ja) Karaoke device
JP5618743B2 (ja) Singing voice evaluation device
Liu et al. Emotion Recognition of Violin Music based on Strings Music Theory for Mascot Robot System.
WO2016039463A1 (fr) Acoustic analysis device
JP2007225916A (ja) Authoring device, authoring method, and program
Yang Structure analysis of beijing opera arias
JP6135229B2 (ja) Singing evaluation device
JP2015191183A (ja) Performance evaluation system, server device, terminal device, performance evaluation method, and computer program
JP2015191188A (ja) Performance evaluation system, server device, terminal device, performance evaluation method, and computer program
Murali Last Rites: Self-Representation and Counter-Canon Practices in Classical Music through Radhe Radhe
Chowdhury Modelling Emotional Expression in Music Using Interpretable and Transferable Perceptual Features/submitted by Shreyan Chowdhury

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 15742795

Country of ref document: EP

Kind code of ref document: A1

DPE1 Request for preliminary examination filed after expiration of 19th month from priority date (pct application filed from 20040101)
NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 15742795

Country of ref document: EP

Kind code of ref document: A1