WO2015111671A1 - Singing evaluation device, singing evaluation method, and singing evaluation program - Google Patents

Singing evaluation device, singing evaluation method, and singing evaluation program

Info

Publication number
WO2015111671A1
Authority
WO
WIPO (PCT)
Prior art keywords
singing
consonant
distribution
consonant length
evaluation
Prior art date
Application number
PCT/JP2015/051731
Other languages
French (fr)
Japanese (ja)
Inventor
片寄 晴弘
達矢 的場
隆一 成山
松本 秀一
Original Assignee
ヤマハ株式会社
Application filed by ヤマハ株式会社

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/48 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use
    • G10L25/51 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for comparison or discrimination
    • G10L25/60 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for comparison or discrimination for measuring the quality of voice signals
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/78 Detection of presence or absence of voice signals
    • G10L25/87 Detection of discrete points within a voice signal
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/93 Discriminating between voiced and unvoiced parts of speech signals

Definitions

  • The present invention relates to a technique for evaluating the singing of a singer.
  • Singing data is voice data obtained by recording the singing of a singer.
  • The musical score data is data that sets the pitch, volume, and sounding timing of each note in a song sung by a singer.
  • A conventional singing evaluation apparatus evaluates singing according to how closely the pitch, volume, and sounding timing of each note in the singing data match the pitch, volume, and sounding timing of each note in the score data. For example, the closer the match, the higher the score the conventional singing evaluation apparatus gives.
  • Alternatively, the singing is evaluated by detecting the rhythm of the singing data and checking whether it matches the rhythm contained in the music data of the song being sung.
  • An object of the present invention is to provide a technique for evaluating the dynamic feeling of singing, in particular the groove ("nori"), which cannot be evaluated merely by comparison with musical score data or music data.
  • the singing evaluation apparatus includes a consonant length distribution creation unit and an evaluation unit.
  • the consonant length distribution creating unit creates a consonant length distribution of each consonant included in the singing sound based on the singing data indicating the singing sound.
  • the evaluation unit detects a consonant length variation that is a degree of spread of the consonant length distribution using the consonant length distribution, and evaluates the singing sound using the consonant length variation.
  • This configuration uses the fact that the consonant length variation differs between singing with a sense of dynamism and singing without it, in particular between singing with good groove and singing with poor groove. Therefore, by detecting and using the variation of the consonant length, the skill of the singing can be judged accurately.
  • the singing evaluation apparatus of the present invention includes a consonant length measuring unit that measures the consonant length of the singing data obtained from the singing sound of the singer and outputs the measured consonant length to the consonant length distribution creating unit.
  • the consonant length of the singing data can be measured from the singing sound sung by the user.
  • the singing evaluation apparatus includes a consonant length measuring unit, a consonant length distribution creating unit, and an evaluating unit.
  • the consonant length measurement unit measures the consonant length of the singing data obtained from the singing sound of the singer.
  • the consonant length distribution creating unit creates a consonant length distribution.
  • the evaluation unit detects a consonant length variation that is a degree of spread of the consonant length distribution using the consonant length distribution, and evaluates the singing sound using the consonant length variation.
  • This configuration uses the fact that the consonant length variation differs between singing with a sense of dynamism and singing without it, in particular between singing with good groove and singing with poor groove. Therefore, by detecting and using the variation of the consonant length, the skill of the singing can be judged accurately.
  • the singing evaluation apparatus of the present invention includes a determination target detection unit that detects a consonant to be determined from at least one of music score data and music data of a song to be sung.
  • the consonant length distribution creating unit creates a consonant length distribution for the consonant to be determined.
  • This configuration uses the fact that, for a specific sound type or rhythm, the degree of variation in consonant length differs between singing with good groove and singing with poor groove. Therefore, by detecting and using the variation in consonant length for a specific consonant to be determined, the singing can be judged more accurately.
  • the determination target detection unit of the singing evaluation apparatus of the present invention includes at least one of a sound type analysis unit, a rhythm analysis unit, and a specific section extraction unit.
  • The sound type analysis unit analyzes, from the musical score data, the sound type to be sung.
  • The rhythm analysis unit analyzes, from the musical score data, the rhythm to be sung.
  • the specific section extraction unit extracts a specific section of a song to be sung from the music data.
  • the determination target detection unit determines a determination target from at least one of a sound type, a rhythm, and a specific section.
  • This configuration shows a more specific preferable example of the determination target detection unit. As described above, by determining the sound type, rhythm, and specific section as determination targets, more accurate determination can be performed.
  • the singing evaluation apparatus of the present invention includes a vowel pronunciation timing acquisition unit and a vowel pronunciation timing distribution creation unit.
  • the vowel pronunciation timing acquisition unit detects vowel pronunciation timing using song data.
  • the vowel pronunciation timing distribution creation unit detects a timing difference between the vowel pronunciation timing and the beat timing of the music, and creates a distribution of the vowel pronunciation timing difference.
  • the evaluation unit uses the distribution of the vowel pronunciation timing difference for the evaluation of the song.
  • This configuration uses the fact that the vowel pronunciation timing is almost the same as the beat timing for both good and bad singing, and that the singing sounds poor if the vowel pronunciation timing deviates many times. Therefore, by detecting and using the difference between the vowel pronunciation timing and the beat timing, the skill of the singing can be evaluated more accurately.
  • The evaluation unit determines that the larger the degree of spread of the consonant length distribution (the consonant length variation), the better the singing, and gives the singing a higher evaluation.
  • Another aspect of the present invention provides a singing evaluation method that creates a distribution of the consonant lengths of the consonants included in a singing sound based on singing data indicating the singing sound, detects the degree of spread of the consonant length distribution, and evaluates the singing sound using that degree of spread.
  • Still another aspect of the present invention provides a singing evaluation program for causing a computer to execute a function of creating a consonant length distribution of the consonants included in a singing sound based on singing data indicating the singing sound, a function of detecting the degree of spread of the consonant length distribution, and a function of evaluating the singing sound using that degree of spread.
  • A frequency distribution diagram showing the relationship between the skill of singing and the distribution of the time difference between the vowel pronunciation start timing and the beat timing according to the second embodiment of the present invention.
  • A block diagram showing the main configuration of the singing evaluation apparatus according to the third embodiment of the present invention.
  • A frequency distribution diagram showing the consonant length distribution for each beat according to the third embodiment of the present invention.
  • A flowchart of the singing evaluation method according to the third embodiment of the present invention.
  • A block diagram showing the main configuration of the singing evaluation apparatus according to the fourth embodiment of the present invention.
  • FIG. 1 is a block diagram showing the main configuration of a singing evaluation apparatus according to the first embodiment of the present invention.
  • FIG. 2 is a diagram illustrating a reference concept of singing evaluation of the singing evaluation apparatus according to the first embodiment of the present invention.
  • In FIG. 2, the horizontal axis is the consonant length and the vertical axis is the frequency. That is, FIG. 2 shows the frequency distribution of the consonant lengths included in a predetermined time section of a sung song.
  • The singing evaluation apparatus makes it possible to evaluate the dynamic feeling of the singing, in particular the "nori" of the singing.
  • In the present invention, the "nori" of singing is an advanced technique that musically controls deviations from the positions (timings) obtained mainly by dividing a measure in time. It is a singing expression that produces effects such as the singer's own feel for the song and a sense of uplift; a rhythm that makes listeners want to sing and dance along just by listening; and a human, free, or lively dynamism. It is sometimes called "groove" or "groove feeling".
  • The singing with poor groove in FIG. 2 is singing performed mechanically, following the pitch, volume, and sounding timing given by the score.
  • The singing with good groove in FIG. 2 is singing in which the groove described above can be felt, arising in part from the singer's emotion at the time of singing.
  • In singing with good groove, the consonant length variation is large and the average consonant length is large (the consonants are long on average).
  • In singing with poor groove, the consonant length variation is small and the average consonant length is small (the consonants are short on average).
  • Therefore, by calculating an index based on the variation of the consonant length, it is possible to determine whether the groove is good or bad. Specifically, for example, the variance or the standard deviation of the consonant length is calculated. When the variance or standard deviation is large, the groove is judged to be good and the singing skilled; when it is small, the groove is judged to be poor and the singing unskilled. In addition, the score for the singing is set higher as the variance or standard deviation increases and lower as it decreases.
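The variation index described above can be illustrated with a minimal sketch. The function name `consonant_spread`, the use of population (rather than sample) variance, and the two sets of consonant lengths are assumptions made for illustration only; the source does not specify them.

```python
from statistics import pstdev, pvariance

def consonant_spread(consonant_lengths_ms):
    """Return (variance, standard deviation) of measured consonant lengths."""
    if len(consonant_lengths_ms) < 2:
        raise ValueError("need at least two consonant lengths")
    return pvariance(consonant_lengths_ms), pstdev(consonant_lengths_ms)

# Hypothetical consonant lengths in milliseconds (not taken from the source).
mechanical = [40, 42, 41, 39, 40, 41]    # sung strictly to the score
expressive = [35, 80, 50, 120, 60, 95]   # sung with more timing freedom

var_mech, sd_mech = consonant_spread(mechanical)
var_expr, sd_expr = consonant_spread(expressive)

# Under the criterion in the text, the larger spread indicates better groove.
better_groove = "expressive" if var_expr > var_mech else "mechanical"
```

Either the variance or the standard deviation can serve as the index, since one is a monotonic function of the other.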
  • For the index based on the average value, the average value of the consonant length is calculated.
  • When the average value of the consonant length is large, the groove is judged to be good and the singing skilled.
  • When the average value of the consonant length is small, the groove is judged to be poor and the singing unskilled.
  • The score for the singing is set higher as the average value is larger and lower as the average value is smaller.
  • In singing with good groove, the consonant length at which the frequency is maximum (the mode) is long.
  • In singing with poor groove, the consonant length at which the frequency is maximum (the mode) is short. Therefore, by calculating an index based on the consonant length that is the mode, it is possible to determine whether the groove is good or bad.
  • Specifically, the consonant length value that is the mode is calculated.
  • When the consonant length value that is the mode is large (the consonant length is long), the groove is judged to be good and the singing skilled; when it is small (the consonant length is short), the groove is judged to be poor and the singing unskilled. Further, the score for the singing is set higher as the mode is larger and lower as the mode is smaller.
  • The singing may be evaluated by judging the groove from at least two of the index based on the variation, the index based on the average value, and the index based on the mode.
  • For example, the index based on the variation and the index based on the average value may each be converted to a numerical value, and the result of applying the four arithmetic operations to these values may be used to judge the groove and evaluate the singing.
  • The numerical values representing these indices may be set to increase as the variation and the average value increase, and an average or a weighted average of these indices may be used.
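One way to combine the indices is a weighted average, sketched below. The function names, the 10 ms bin used to find the mode, and the weights are hypothetical; in practice the indices would probably need to be normalized to comparable scales first, which the source does not describe.

```python
from collections import Counter
from statistics import mean, pstdev

def groove_indices(lengths_ms, bin_ms=10):
    """Return spread, mean and mode indices of the consonant lengths."""
    bins = Counter((int(x) // bin_ms) * bin_ms for x in lengths_ms)
    # Most frequent bin; ties are resolved toward the shorter length.
    mode_bin = max(bins, key=lambda b: (bins[b], -b))
    return {"spread": pstdev(lengths_ms), "mean": mean(lengths_ms), "mode": mode_bin}

def combined_index(indices, weights):
    """Weighted average over the indices named in `weights`."""
    total = sum(weights.values())
    return sum(indices[name] * w for name, w in weights.items()) / total

idx = groove_indices([40, 40, 45, 60, 80])          # hypothetical lengths (ms)
score = combined_index(idx, {"mean": 2, "mode": 1})  # hypothetical weights
```

Each index grows with better groove under the criteria in the text, so the weighted average preserves that ordering.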
  • the singing evaluation apparatus has a configuration shown in FIG.
  • the singing evaluation device 10 includes a consonant length measuring unit 11, a consonant length distribution creating unit 12, and an evaluating unit 13.
  • the consonant length measuring unit 11 receives singing data obtained by collecting the singing of the singer.
  • the consonant length measuring unit 11 detects a consonant from the song data using a known method.
  • the consonant length measuring unit 11 measures the consonant length that is the length of each consonant.
  • the consonant length measuring unit 11 outputs the measured consonant length to the consonant length distribution creating unit 12.
  • An example of a known method for detecting consonants is disclosed in Japanese Patent Application Laid-Open No. 2008-32933.
  • This document discloses a method of determining a section with periodicity as a vowel section and determining the other sections as consonant sections.
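As a rough sketch of how consonant lengths could be measured once each analysis frame has been labeled periodic or not, the following counts runs of non-periodic frames. It is not the method of the cited publication: the per-frame flags are assumed to come from some periodicity detector that is not shown, the 10 ms frame length is hypothetical, and silent frames are not separated from consonants here.

```python
def consonant_lengths(periodic_flags, frame_ms=10.0):
    """Lengths (ms) of maximal runs of non-periodic frames.

    periodic_flags: one boolean per frame, True where the frame is periodic
    (treated as vowel) and False otherwise (treated as consonant).
    """
    lengths, run = [], 0
    for is_periodic in periodic_flags:
        if is_periodic:
            if run:
                lengths.append(run * frame_ms)
            run = 0
        else:
            run += 1
    if run:  # a consonant run that reaches the end of the data
        lengths.append(run * frame_ms)
    return lengths

# Hypothetical frame labels: consonant, vowel, consonant, vowel, consonant.
flags = [False, False, True, True, True, False, True, False, False, False]
measured = consonant_lengths(flags)
```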
  • GUI Graphic User Interface
  • It is also conceivable to identify consonant sections manually while listening to the singing data, without using a GUI, and to measure the length of each identified consonant section.
  • the consonant length distribution creating unit 12 stores the input consonant lengths over a preset time interval.
  • The time interval to be stored may be, for example, the whole piece of music or a predetermined phrase. When a predetermined phrase is used, the score data is acquired separately and the phrase is identified by referring to the score data.
  • the consonant length distribution creating unit 12 creates a frequency distribution of consonant lengths.
  • the consonant length distribution creation unit 12 outputs the frequency distribution of the consonant length to the evaluation unit 13.
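A frequency distribution of this kind can be sketched as a simple histogram. The 20 ms bin width and the function name are assumptions; the source does not state how the consonant lengths are binned.

```python
from collections import Counter

def length_histogram(lengths_ms, bin_ms=20):
    """Map the lower edge of each bin (ms) to the number of consonants in it."""
    counts = Counter(int(x // bin_ms) * bin_ms for x in lengths_ms)
    return dict(sorted(counts.items()))

# Hypothetical consonant lengths stored over one phrase.
histogram = length_histogram([15, 22, 38, 41, 59, 60])
```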
  • The evaluation unit 13 determines whether the groove is good or bad from the frequency distribution of the consonant length, and performs singing evaluation based on the groove.
  • The determination criteria for the groove are as described above. Specifically, the evaluation unit 13 calculates an index (for example, variance or standard deviation) based on the variation of the consonant length from the frequency distribution of the consonant length, and determines from this index whether the groove is good or bad. As also described above, the evaluation unit 13 may instead determine whether the groove is good or bad by using the average value of the consonant length or the consonant length at which the frequency is maximum (the mode).
  • With the configuration of the present embodiment, it is possible to accurately determine whether the groove is good or bad, and to perform singing evaluation that takes the quality of the groove into account. That is, it is possible to realize a singing evaluation that is closer to how a singer or listener perceives skilled and unskilled singing.
  • The following method can be considered for the singing evaluation by the evaluation unit 13. For example, singing data for the same song, at various levels ranging from skilled singers to unskilled singers, are collected and stored in a storage device.
  • a computer reads each song data from a memory
  • The stage evaluation of the variance value is not limited to 10 stages; a finer stage evaluation (20 stages, 50 stages, etc.) or a coarser stage evaluation (8 stages, 5 stages, etc.) may be adopted.
  • The intervals between the stages used in such a stage evaluation may be equal. Alternatively, weighted intervals (for example, the interval at 5 points differs from the interval at 10 points) may be used.
  • Alternatively, when the intervals between the stages are equal, the points awarded at each stage may be weighted (for example, in a three-stage evaluation, the lowest evaluation is 5 points, the middle evaluation is 10 points, and the highest evaluation is 30 points).
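The stage evaluation can be sketched as a threshold lookup. This assumes that the stage boundaries are placed at equal intervals between the smallest and largest variance found in the collected reference singing data; the function names and the numbers are hypothetical.

```python
from bisect import bisect_right

def equal_interval_edges(reference_variances, stages=10):
    """Boundaries that split the reference variance range into equal stages."""
    lo, hi = min(reference_variances), max(reference_variances)
    step = (hi - lo) / stages
    return [lo + step * i for i in range(1, stages)]

def stage_of(variance, edges):
    """Stage number, from 1 (smallest variance) to len(edges) + 1."""
    return bisect_right(edges, variance) + 1

# Ten equal stages over a hypothetical reference range of variances.
edges10 = equal_interval_edges([0.0, 100.0], stages=10)
stage = stage_of(95.0, edges10)

# Three equal stages with weighted points, as in the 5 / 10 / 30 example.
edges3 = equal_interval_edges([0.0, 90.0], stages=3)
points_table = {1: 5, 2: 10, 3: 30}
points = points_table[stage_of(75.0, edges3)]
```

Unequal stage intervals could be expressed the same way by passing a hand-written list of boundaries instead of `equal_interval_edges`.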
  • FIG. 3 is a flowchart of the singing evaluation method according to the first embodiment of the present invention.
  • the song evaluation device 10 acquires song data and measures the consonant length of each consonant included in the song data (S101).
  • the singing evaluation apparatus 10 creates a distribution of the acquired consonant lengths (S102).
  • The singing evaluation apparatus 10 calculates an index based on the variation of the consonant length from the distribution of the consonant length, determines whether the groove is good or bad using the index, and performs singing evaluation (S103).
  • In this flow, the quality of the groove is determined based on the variation of the consonant length, but as described above it can also be determined using the average value or the mode of the consonant length.
  • FIG. 4 is a block diagram showing the main configuration of the singing evaluation apparatus according to the second embodiment of the present invention.
  • the same singing data as the consonant length measurement unit 11 is input to the vowel pronunciation timing acquisition unit 21.
  • The vowel pronunciation timing acquisition unit 21 detects vowels by a known method and detects the pronunciation start timing of each vowel. Specifically, for a vowel preceded by a consonant, the vowel pronunciation timing acquisition unit 21 detects the timing at which the consonant switches to the vowel. When vowels are continuous, it detects the timing at which one vowel switches to the next. When a vowel is produced from silence, it detects the timing at which the vowel begins. The vowel pronunciation timing acquisition unit 21 outputs the detected pronunciation start timing of each vowel to the vowel pronunciation timing distribution creation unit 22.
  • The vowel pronunciation timing distribution creation unit 22 receives the pronunciation start timing of each vowel and the beat timing of the song being sung.
  • The vowel pronunciation timing distribution creation unit 22 compares the beat timing with the vowel pronunciation start timing. At this time, it associates each vowel's pronunciation start timing with the timing of the beat closest to it.
  • FIG. 5 is a diagram showing the concept of associating the vowel pronunciation start timing with the beat timing. In the case of FIG. 5, for example, the sounding start timing of the vowel V01 is closest to the timing of the first beat.
  • The vowel pronunciation timing distribution creation unit 22 therefore sets the beat corresponding to the pronunciation start timing of the vowel V01 to the first beat. Similarly, in the case of FIG. 5, it sets the beat corresponding to the pronunciation start timing of the vowel V02 to the third beat, that of the vowel V03 to the fourth beat, that of the vowel V04 to the fifth beat, that of the vowel V05 to the seventh beat, and that of the vowel V06 to the eighth beat.
  • The vowel pronunciation timing distribution creation unit 22 calculates the time difference between each vowel pronunciation start timing and the corresponding beat timing, creates the distribution of these time differences, and outputs the time difference distribution to the evaluation unit 13.
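The nearest-beat association and the time difference calculation can be sketched as follows. The beat grid and onset times are hypothetical values chosen so that the six vowels fall on beats 1, 3, 4, 5, 7, and 8, as in the description of FIG. 5; the function name is also an assumption.

```python
def nearest_beat(onset, beat_times):
    """Return (1-based beat number, onset minus beat time) for the closest beat."""
    idx = min(range(len(beat_times)), key=lambda i: abs(onset - beat_times[i]))
    return idx + 1, onset - beat_times[idx]

# Hypothetical data: eight beats half a second apart, six vowel onsets (s).
beat_times = [0.5 * i for i in range(8)]
vowel_onsets = [0.02, 1.03, 1.48, 2.05, 2.97, 3.52]   # V01 .. V06

pairs = [nearest_beat(t, beat_times) for t in vowel_onsets]
beat_numbers = [beat for beat, _ in pairs]
time_diffs = [diff for _, diff in pairs]   # positive means later than the beat
```

The list `time_diffs` is the raw material for the time difference distribution passed to the evaluation unit.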
  • FIG. 6 is a frequency distribution diagram showing the relationship between the skill of singing and the distribution of the time difference between the vowel pronunciation start timing and the beat timing. As shown in FIG. 6, when the singing is good, the variation in the time difference between the vowel pronunciation start timing and the beat timing is small, and when the singing is poor, the variation is large.
  • When the singing is good, the mode of the time difference between the vowel pronunciation start timing and the beat timing is substantially 0, and when the singing is poor, the mode deviates greatly from 0.
  • the evaluation unit 13 detects the variation in the time difference, and determines that the singing is better as the variation in the time difference becomes smaller.
  • the evaluation unit 13 detects the mode value of the time difference, and determines that the singing is better as the mode value is closer to zero.
  • The evaluation unit 13 may determine the skill level of the singing using both the variation of the time difference and the mode of the time difference.
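The two timing indices can be computed as in the sketch below. The 10 ms bin used to find the mode, the function name, and the sample time differences are assumptions made for illustration.

```python
from collections import Counter
from statistics import pstdev

def timing_indices(time_diffs_ms, bin_ms=10):
    """Return (spread, mode) of the vowel-onset-to-beat time differences."""
    spread = pstdev(time_diffs_ms)
    bins = Counter(round(d / bin_ms) * bin_ms for d in time_diffs_ms)
    # Most frequent bin; ties are resolved toward the bin nearest zero.
    mode = max(bins, key=lambda b: (bins[b], -abs(b)))
    return spread, mode

good_diffs = [-4, 3, 1, -2, 2, 0]        # hypothetical, tightly around 0 ms
poor_diffs = [60, 72, 58, -40, 90, 61]   # hypothetical, scattered and late

good_spread, good_mode = timing_indices(good_diffs)
poor_spread, poor_mode = timing_indices(poor_diffs)
```

Under the criteria in the text, the singing is judged better when the spread is smaller and the mode is closer to zero.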
  • The evaluation unit 13 reflects the singing evaluation result based on the time difference between the vowel pronunciation start timing and the beat timing in the singing evaluation result based on the consonant length described above, and evaluates the singing in an integrated manner. As a result, the skill of the singing can be judged more accurately.
  • The evaluation unit 13 may perform the singing evaluation based on the consonant length only when the singing is judged to be good by the evaluation based on the time difference between the vowel pronunciation start timing and the beat timing. In this case, the evaluation unit 13 skips the singing evaluation based on the consonant length when that evaluation is considered unnecessary, so the processing load of the singing evaluation can be reduced.
  • FIG. 7 is a flowchart of the singing evaluation method according to the second embodiment of the present invention.
  • FIG. 7 shows a case where whether or not to perform the singing evaluation based on the consonant length is switched depending on the singing evaluation result based on the time difference between the vowel sound generation timing and the beat timing.
  • the song evaluation device 10A acquires song data and measures the consonant length of each consonant included in the song data (S201). Apart from the measurement of the consonant length, the singing evaluation device 10A detects the pronunciation start timing of each vowel included in the singing data (S202). Next, the singing evaluation device 10A creates a time difference distribution between the sounding start timing of each vowel and the timing of the beat corresponding to the timing (S203).
  • If the singing evaluation apparatus 10A detects that the time differences are largely concentrated in the vicinity of 0 (S204: YES), it calculates an index based on the variation of the consonant length from the distribution of the consonant length, determines whether the groove is good or bad using this index, and performs singing evaluation (S205).
  • If the singing evaluation apparatus 10A detects that the time differences are not largely concentrated in the vicinity of 0 (S204: NO), the singing evaluation based on the consonant length is not performed, and the singing is evaluated as poor.
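The branch at S204 can be sketched as a gate in front of the consonant length evaluation. The source does not say how "largely distributed in the vicinity of 0" is tested; the tolerance of 30 ms and the required fraction of 0.8 below are hypothetical, as are the function name and the returned fields.

```python
from statistics import pstdev

def evaluate(consonant_lengths_ms, time_diffs_ms, tol_ms=30, min_fraction=0.8):
    """Run the consonant length evaluation only if vowel timing is near the beat."""
    near_zero = sum(abs(d) <= tol_ms for d in time_diffs_ms) / len(time_diffs_ms)
    if near_zero < min_fraction:
        # S204: NO. The consonant length evaluation is skipped.
        return {"timing_ok": False, "groove_index": None}
    # S204: YES, then S205. Use the spread of the consonant lengths.
    return {"timing_ok": True, "groove_index": pstdev(consonant_lengths_ms)}

lengths = [35, 80, 50, 120, 60, 95]                      # hypothetical (ms)
on_beat = evaluate(lengths, [-4, 3, 1, -2, 2, 0])
off_beat = evaluate(lengths, [60, 72, 58, -40, 90, 61])
```

Skipping the second stage when the timing check fails is what reduces the processing load mentioned in the text.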
  • FIG. 8 is a block diagram showing the main configuration of the singing evaluation apparatus according to the third embodiment of the present invention.
  • The singing evaluation device 10B of the present embodiment differs from the singing evaluation device 10A shown in the second embodiment in the connection configuration of the vowel pronunciation timing acquisition unit 21 and the vowel pronunciation timing distribution creation unit 22; the other configuration is the same as that of the singing evaluation device 10A. Therefore, only the parts that differ from the singing evaluation device 10A according to the second embodiment are described specifically.
  • The vowel pronunciation timing acquisition unit 21 outputs the detected vowel pronunciation start timing to the consonant length distribution creating unit 12 as well as to the vowel pronunciation timing distribution creation unit 22.
  • The vowel pronunciation timing distribution creation unit 22 outputs the beat corresponding to each vowel pronunciation start timing to the consonant length distribution creating unit 12.
  • The consonant length distribution creating unit 12 detects the beat corresponding to each consonant from the input vowel pronunciation timing and the corresponding beat. Specifically, in the case of FIG. 5, the consonant length distribution creating unit 12 detects that the beat corresponding to the consonant C01 attached to the vowel V01 is the first beat. Similarly, it detects that the beat corresponding to the consonant C02 attached to the vowel V02 is the third beat, that the beat corresponding to the consonant C04 attached to the vowel V04 is the fifth beat, that the beat corresponding to the consonant C05 attached to the vowel V05 is the seventh beat, and that the beat corresponding to the consonant C06 attached to the vowel V06 is the eighth beat.
  • the consonant length distribution creating unit 12 creates a consonant length distribution for each beat.
  • FIG. 9 is a frequency distribution diagram showing the consonant length distribution for each beat according to the third embodiment of the present invention.
  • the consonant length distribution creation unit 12 outputs the consonant length distribution for each beat to the evaluation unit 13.
  • the evaluation unit 13 uses a consonant length distribution for each beat to determine whether the groove is good or bad and performs singing evaluation.
  • The evaluation unit 13 judges the quality of the groove from the difference between the variance or standard deviation of the consonant length in the beat with the smallest consonant length variation and the variance or standard deviation of the consonant length in the beat with the largest consonant length variation. In the specific example shown in FIG. 9, the quality of the groove is determined based on the difference between the variance or standard deviation of the first beat and that of the sixth beat.
  • The difference used by the evaluation unit 13 is not limited to the arithmetic difference between the variances or standard deviations; the arithmetic ratio between the variance or standard deviation of the consonant length in the beat with the smallest consonant length variation and that in the beat with the largest consonant length variation may also be used. The evaluation unit 13 determines that the larger the difference, the better the groove, and the smaller the difference, the worse the groove.
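The per-beat comparison can be sketched by grouping consonant lengths by beat and comparing the largest and smallest spreads. The function names, the use of the standard deviation, and the sample data are assumptions; beats with fewer than two consonants are simply left out here.

```python
from collections import defaultdict
from statistics import pstdev

def per_beat_spread(consonants):
    """consonants: iterable of (beat number, consonant length in ms)."""
    by_beat = defaultdict(list)
    for beat, length in consonants:
        by_beat[beat].append(length)
    return {beat: pstdev(v) for beat, v in by_beat.items() if len(v) >= 2}

def spread_contrast(spreads, use_ratio=False):
    """Difference (or ratio) between the largest and smallest per-beat spread."""
    lo, hi = min(spreads.values()), max(spreads.values())
    if not use_ratio:
        return hi - lo
    if lo == 0:
        raise ValueError("ratio is undefined when the smallest spread is 0")
    return hi / lo

# Hypothetical data: beat 1 is sung uniformly, beat 6 varies widely.
data = [(1, 40), (1, 42), (6, 30), (6, 90)]
spreads = per_beat_spread(data)
difference = spread_contrast(spreads)
ratio = spread_contrast(spreads, use_ratio=True)
```

A larger `difference` or `ratio` corresponds to better groove under the criterion in the text.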
  • FIG. 10 is a flowchart of the singing evaluation method according to the third embodiment of the present invention.
  • FIG. 10 shows a case where singing evaluation based on the time difference between the vowel sound generation timing and the beat timing is not performed.
  • singing evaluation based on the time difference between the vowel sound generation timing and the beat timing may or may not be performed.
  • The singing evaluation apparatus 10B acquires singing data and measures the consonant length of each consonant included in the singing data (S301). Apart from the measurement of the consonant length, the singing evaluation apparatus 10B detects the pronunciation start timing of each vowel included in the singing data (S302). Next, the singing evaluation apparatus 10B associates the pronunciation start timing of each vowel with the timing of the corresponding beat (S303). The singing evaluation apparatus 10B creates a distribution of the consonant length for each beat (S304). The singing evaluation apparatus 10B determines whether the groove is good or bad by using the variation in the distribution of the consonant length for each beat, and performs singing evaluation (S305).
  • FIG. 11 is a block diagram showing the main configuration of the singing evaluation apparatus according to the fourth embodiment of the present invention.
  • The singing evaluation apparatus 10C of the present embodiment adds a determination target detection unit 23 to the singing evaluation apparatus 10B shown in the third embodiment, and the processing of the consonant length distribution creating unit 12 is different. Therefore, only the parts that differ from the singing evaluation apparatus 10B according to the third embodiment are described specifically.
  • The determination target detection unit 23 includes a sound type analysis unit, a rhythm analysis unit, and a specific section extraction unit.
  • the determination target detection unit 23 receives at least one of score data and music data.
  • the musical score data includes the pitch, volume, and sounding timing of the song to be sung.
  • the music data includes the composition of music such as a chorus section, the genre of music, and the like.
  • the determination target detection unit 23 analyzes the score data and sets a consonant to be used for determination of the consonant length.
  • FIG. 12 is a diagram showing a specific example of a consonant setting concept used for determination of consonant length.
  • FIG. 12A is a diagram showing a consonant setting concept based on sound type, and
  • FIG. 12B is a diagram showing a consonant setting concept based on rhythm.
  • When the sound type analysis unit of the determination target detection unit 23 detects, from the score data, a sound type to be used for determination of the consonant length, it sets the section of that sound type and the timing of the consonant to be used for the determination. For example, as shown in FIG. 12A, when a rising sound type is detected, the consonant of the third sound among three sounds whose pitches rise consecutively is set as the consonant used for determination of the consonant length. The timing of this consonant is given to the consonant length distribution creation unit 12.
  • When the rhythm analysis unit of the determination target detection unit 23 detects, from the score data, a rhythm to be used for determination of the consonant length, the section of that rhythm and the timing of the consonant to be used for the determination are set.
  • the consonant of the third sound the sound having a short sound length
  • the consonant length of the consonant is given to the consonant length distribution creating unit 12.
  • the sound type and rhythm of a section of three consecutive sounds are specified, but a section composed of two or more sounds may be used.
  • the smaller the number of sounds, the more often the same sound type and the same rhythm occur in the song being sung. This makes it easier to create an accurate consonant length distribution, which is more useful.
  • the consonant of the head or middle sound may be set as the consonant used for the determination of the consonant length, not the last sound of the section. Thereby, more appropriate evaluation can be performed according to a sound type and a rhythm.
  • the determination target detection unit 23 can analyze music data and set a consonant to be used for determination of the consonant length. For example, the determination target detection unit 23 detects a chorus section from the music data, and gives the time of the chorus section to the consonant length distribution creation unit 12.
  • The consonant length distribution creation unit 12 refers to the timing or section given by the determination target detection unit 23 and to the timing of each consonant obtained from the vowel pronunciation timing acquisition unit 21 and the vowel pronunciation timing distribution creation unit 22, and uses the consonant lengths from the consonant length measurement unit 11 to create a consonant length distribution for the given timing or section.
  • FIG. 13 is a flowchart of the singing evaluation method according to the fourth embodiment of the present invention. Note that FIG. 13 shows a case in which whether or not to perform singing evaluation based on the consonant length is switched depending on the result of the singing evaluation based on the time difference between the vowel pronunciation timing and the beat timing. The singing evaluation based on the time difference from the beat timing may be omitted.
  • The singing evaluation device 10C acquires singing data and measures the consonant length of each consonant included in the singing data (S401). Separately from the measurement of the consonant length, the singing evaluation device 10C analyzes the score data or the music data, and sets the timing or section for which the consonant length distribution is to be created, based on the sound type, rhythm, or specific section (S402). Separately from the measurement of the consonant length and the setting of the target of the consonant length distribution, the singing evaluation device 10C detects the pronunciation start timing of each vowel included in the singing data (S403). Next, the singing evaluation device 10C creates a distribution of the time differences between the pronunciation start timing of each vowel and the timing of the corresponding beat (S404).
  • When the singing evaluation device 10C detects that the time differences are concentrated in the vicinity of 0 (S405: YES), it creates a consonant length distribution for the timing or section to be determined (S406).
  • the singing evaluation apparatus 10C calculates an index based on the variation of the consonant length from the distribution of the consonant length, determines whether the groove is good or bad using the index, and performs the singing evaluation (S407).
  • When the singing evaluation device 10C detects that the time differences are not concentrated in the vicinity of 0 (S405: NO), it does not perform singing evaluation based on the consonant length and evaluates the singing as poor. Alternatively, the singing evaluation based on the consonant length may still be performed, with the evaluation of the groove greatly reduced.
  • a consonant length distribution may be created per consonant; for example, each consonant may be identified and a consonant length distribution created for it.
  • a consonant category may be identified from the music data, and a consonant length distribution may be created for each consonant category.
  • consonants may be classified into categories of voiced and unvoiced sounds, or categories such as plosives and sibilants, and a consonant length distribution may be created for each category.
  • the mode of measuring the consonant length and determining its variation based on singing sound collected by a microphone or the like has been shown, but the above-described configuration can also be applied to singing sound created by artificially imitating a human voice, and similar effects can be obtained.
  • the mode in which the consonant length is automatically measured has been shown.
  • the user may manually identify the consonants and vowels while viewing the waveform, measure the consonant lengths, and input them to the device (that is, the device acquires them).
  • the vowel pronunciation start timing is automatically measured.
  • the user may manually detect the vowel pronunciation start timing while viewing the waveform, and data representing the detected timing may be input to the device (acquired by the device). Alternatively, the user may listen to the singing sound, manually detect the vowel pronunciation start timing, and input data indicating the detected timing to the device (acquired by the device).
  • the vowel pronunciation timing acquisition unit may use an operation input unit (not shown).
  • Singing evaluation device; 11: consonant length measurement unit; 12: consonant length distribution creation unit; 13: evaluation unit; 21: vowel pronunciation timing acquisition unit; 22: vowel pronunciation timing distribution creation unit; 23: determination target detection unit
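
The rising sound-type rule described for FIG. 12A (the consonant of the third of three consecutively rising sounds becomes a determination target) can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the function name `rising_three_note_targets`, the use of MIDI note numbers for pitch, and the decision to count overlapping three-note windows are all choices made here for the example.

```python
from typing import List, Sequence


def rising_three_note_targets(pitches: Sequence[int]) -> List[int]:
    """Return the index of every note that ends a run of three strictly rising pitches.

    `pitches` holds one pitch per sung note, in score order (assumed to be
    MIDI note numbers). Overlapping windows are counted, which is an assumption.
    """
    targets = []
    for i in range(2, len(pitches)):
        if pitches[i - 2] < pitches[i - 1] < pitches[i]:
            targets.append(i)
    return targets


# Hypothetical score fragment, one pitch per note.
score_pitches = [60, 62, 64, 62, 60, 64, 65, 67, 69]
print(rising_three_note_targets(score_pitches))
```

The returned indices identify the notes whose consonant timings would be passed on to the consonant length distribution creation unit 12; a rhythm-based rule could be written the same way over note durations.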

Abstract

A singing evaluation device (10) equipped with a consonant length measurement unit (11), a consonant length distribution creation unit (12), and an evaluation unit (13). The consonant length measurement unit (11) detects consonants in singing data and measures the length of each consonant. The consonant length distribution creation unit (12) stores, over a prescribed time period, the consonant lengths that have been input, and creates a frequency distribution for the consonant length. The evaluation unit (13) utilizes the fact that the singing is more in the groove when the variation in the consonant length is greater and the singing is less in the groove when the variation in the consonant length is less to determine whether the singing is in the groove. Specifically, the evaluation unit (13) calculates an index that is based on the variation in the consonant length, from the frequency distribution for the consonant length, and determines whether the singing is in the groove from the size of the index that is based on the variation in the consonant length.

Description

Singing evaluation device, singing evaluation method, and singing evaluation program
The present invention relates to a technique for evaluating the singing of a singer.
Singing evaluation devices used in the karaoke apparatuses now in widespread use employ various singing evaluation methods. As a basic evaluation item, these conventional singing evaluation devices compare singing data with score data. Singing data is audio data obtained by picking up the singer's singing. Score data is data that specifies the pitch, volume, sounding timing, and the like of each note in the song being sung. A conventional singing evaluation device evaluates singing according to how closely the pitch, volume, and sounding timing of each note in the singing data match those of the corresponding note in the score data. For example, the closer the match, the higher the score the device awards.
The karaoke apparatus described in Patent Document 1 evaluates singing by detecting the rhythm of the singing data and detecting whether it matches the rhythm contained in the music data of the song being sung.
Japanese Unexamined Patent Publication No. 2007-114492
However, conventional singing evaluation devices, including the karaoke apparatus described in Patent Document 1, cannot evaluate the singer's sense of dynamism, in particular the groove. Although such dynamism in singing is pleasing to both singers and listeners, it cannot be evaluated by a simple comparison with score data, as described above. That is, the evaluation may come out low (for example, a low score) even though the singer and the listeners feel that "the singer is singing well, with a sense of dynamism".
An object of the present invention is to provide a technique for evaluating singing in a way that reflects its sense of dynamism, in particular its groove, which cannot be evaluated by a simple comparison with score data or music data.
The singing evaluation device of the present invention includes a consonant length distribution creation unit and an evaluation unit. The consonant length distribution creation unit creates, based on singing data representing a singing sound, a distribution of the consonant lengths of the consonants contained in the singing sound. The evaluation unit uses the distribution to detect the variation in consonant length, which is the degree of spread of the distribution, and evaluates the singing sound using that variation.
This configuration exploits the fact that the variation in consonant length differs between singing with and without a sense of dynamism, in particular between singing with good groove and singing with poor groove. Detecting and using the variation in consonant length therefore allows the groove of the singing to be determined accurately.
The singing evaluation device of the present invention also includes a consonant length measurement unit that measures the consonant lengths in the singing data obtained from the singer's singing sound and outputs the measured consonant lengths to the consonant length distribution creation unit.
With this configuration, the consonant lengths in the singing data can be measured from the singing sound sung by the user.
The singing evaluation device of the present invention includes a consonant length measurement unit, a consonant length distribution creation unit, and an evaluation unit. The consonant length measurement unit measures the consonant lengths in the singing data obtained from the singer's singing sound. The consonant length distribution creation unit creates a distribution of the consonant lengths. The evaluation unit uses the distribution to detect the variation in consonant length, which is the degree of spread of the distribution, and evaluates the singing sound using that variation.
This configuration exploits the fact that the variation in consonant length differs between singing with and without a sense of dynamism, in particular between singing with good groove and singing with poor groove. Detecting and using the variation in consonant length therefore allows the groove of the singing to be determined accurately.
The singing evaluation device of the present invention also includes a determination target detection unit that detects the consonants to be determined from at least one of the score data and the music data of the song being sung. The consonant length distribution creation unit creates the consonant length distribution for the consonants selected as determination targets.
This configuration exploits the fact that, for specific sound types and rhythms, the degree of variation in consonant length differs particularly between singing with good groove and singing with poor groove. Detecting and using the variation in consonant length for specific target consonants therefore allows the groove of the singing to be determined more accurately.
The determination target detection unit of the singing evaluation device of the present invention includes at least one of a sound type analysis unit, a rhythm analysis unit, and a specific section extraction unit. The sound type analysis unit analyzes, from the score data, the sound types to be sung. The rhythm analysis unit analyzes, from the score data, the rhythms to be sung. The specific section extraction unit extracts, from the music data, specific sections of the song to be sung. The determination target detection unit decides the determination targets from at least one of the sound type, the rhythm, and the specific section.
This configuration is a more specific, preferred example of the determination target detection unit. Using sound types, rhythms, and specific sections as determination targets in this way enables more accurate determination.
The singing evaluation device of the present invention also includes a vowel pronunciation timing acquisition unit and a vowel pronunciation timing distribution creation unit. The vowel pronunciation timing acquisition unit detects vowel pronunciation timings from the singing data. The vowel pronunciation timing distribution creation unit detects the timing difference between each vowel pronunciation timing and the beat timing of the song, and creates a distribution of these timing differences. The evaluation unit uses the distribution of vowel pronunciation timing differences in evaluating the singing.
This configuration exploits the fact that vowel pronunciation timings substantially coincide with the beat timings in singing with good groove and in singing with poor groove alike, and that singing sounds unskilled when the vowel pronunciation timing often deviates from the beat. Detecting and using the difference between the vowel pronunciation timing and the beat therefore allows the skill of the singing to be evaluated still more accurately.
In the singing evaluation device of the present invention, the evaluation unit judges that the groove is better the greater the degree of spread of the consonant length distribution (the variation in consonant length), and rates the singing more highly.
This configuration is one example of singing evaluation. As shown in FIG. 2, described later, singing with good groove has a large variation in consonant length, whereas singing with poor groove, for example singing performed mechanically exactly as written in the score, has little variation. Judging that the groove is better the larger the variation in consonant length therefore yields an evaluation closer to how singers and listeners perceive the skill of the singing.
Another aspect of the present invention provides a singing evaluation method that creates, based on singing data representing a singing sound, a distribution of the consonant lengths of the consonants contained in the singing sound, detects the degree of spread of the consonant length distribution, and evaluates the singing sound using that degree of spread.
Yet another aspect of the present invention provides a singing evaluation program that causes a computer to execute the functions of: a consonant length distribution creation unit that creates, based on singing data representing a singing sound, a distribution of the consonant lengths of the consonants contained in the singing sound; and an evaluation unit that detects the degree of spread of the consonant length distribution and evaluates the singing sound using that degree of spread.
According to the present invention, singing evaluation closer to how singers and listeners perceive the skill of singing can be realized.
Block diagram showing the main configuration of the singing evaluation device according to the first embodiment of the present invention.
Diagram showing the reference concept of singing evaluation of the singing evaluation device according to the first embodiment of the present invention.
Flowchart of the singing evaluation method according to the first embodiment of the present invention.
Block diagram showing the main configuration of the singing evaluation device according to the second embodiment of the present invention.
Diagram showing the concept of associating vowel pronunciation start timings with beat timings according to the second embodiment of the present invention.
Frequency distribution diagram showing the relationship between the distribution of time differences between vowel pronunciation start timings and beat timings and the skill of the singing, according to the second embodiment of the present invention.
Flowchart of the singing evaluation method according to the second embodiment of the present invention.
Block diagram showing the main configuration of the singing evaluation device according to the third embodiment of the present invention.
Frequency distribution diagram showing the consonant length distribution for each beat according to the third embodiment of the present invention.
Flowchart of the singing evaluation method according to the third embodiment of the present invention.
Block diagram showing the main configuration of the singing evaluation device according to the fourth embodiment of the present invention.
Diagram showing a specific example of the concept for setting the consonants used in determining the consonant length, according to the fourth embodiment of the present invention.
Flowchart of the singing evaluation method according to the fourth embodiment of the present invention.
A singing evaluation device according to a first embodiment of the present invention will be described with reference to the drawings. FIG. 1 is a block diagram showing the main configuration of the singing evaluation device according to the first embodiment of the present invention. FIG. 2 is a diagram showing the reference concept of singing evaluation of the singing evaluation device according to the first embodiment of the present invention. In FIG. 2, the horizontal axis is the consonant length and the vertical axis is the frequency. That is, FIG. 2 shows the frequency distribution of the consonant lengths contained in a predetermined time section of the sung song.
The singing evaluation device according to the present invention makes it possible to evaluate the sense of dynamism of singing, in particular its "nori" (groove).
The "nori" of singing in the present invention refers to an advanced technique of musically controlling deviations, mainly from the positions (timings) that divide a measure into equal parts in time. It is a singing expression that produces effects such as:
 ・the singer's absorption in, and exhilaration with, the song being sung,
 ・a sense of rhythm that makes listeners want to sing or dance along just by listening,
 ・a human, free, or lively sense of dynamism.
It is sometimes called "groove" or "groove feeling".
First, the reference concept of singing evaluation of the singing evaluation device according to the first embodiment of the present invention will be described with reference to FIG. 2. The singing with poor groove in FIG. 2 is singing performed mechanically according to the pitch, volume, and sounding timing written in the score. The singing with good groove is singing in which the qualities described above, which depend on the singer's emotion while singing, can be perceived.
As shown in FIG. 2, singing with good groove has a large variation in consonant length and a large mean consonant length (the consonants are long on average). Singing with poor groove has a small variation in consonant length and a small mean (the consonants are short on average).
Therefore, the quality of the groove can be determined by calculating an index based on the variation in consonant length. Specifically, for example, the variance or standard deviation of the consonant length is calculated. When the variance or standard deviation is large, the groove is judged good and the singing skilled; when it is small, the groove is judged poor and the singing unskilled. The score for the singing is set higher the larger the variance or standard deviation, and lower the smaller it is.
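
As a minimal sketch of the variation-based index, the following computes the variance and standard deviation of a list of consonant lengths using Python's standard `statistics` module. The function name, the millisecond unit, the sample values, and the choice of population variance (rather than sample variance) are assumptions for illustration; the source does not specify them.

```python
import statistics


def variation_indices(consonant_lengths_ms):
    """Return (variance, standard deviation) of the given consonant lengths.

    Population variance is used here; that choice is an assumption.
    """
    variance = statistics.pvariance(consonant_lengths_ms)
    return variance, variance ** 0.5


# Hypothetical measurements in milliseconds.
mechanical = [40, 42, 41, 39, 40, 41]   # sung exactly as written in the score
expressive = [35, 80, 50, 120, 60, 95]  # sung with more freedom

_, sd_mechanical = variation_indices(mechanical)
_, sd_expressive = variation_indices(expressive)

# Under the criterion in the text, the larger spread is judged the better groove.
print("expressive judged better:", sd_expressive > sd_mechanical)
```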
The quality of the groove can also be determined by calculating an index based on the mean consonant length. Specifically, for example, the mean of the consonant lengths is calculated. When the mean is large, the groove is judged good and the singing skilled; when it is small, the groove is judged poor and the singing unskilled. The score for the singing is set higher the larger the mean, and lower the smaller it is.
Also, as shown in FIG. 2, in singing with good groove the consonant length at which the frequency is highest (the mode) is long, whereas in singing with poor groove it is short. The quality of the groove can therefore be determined by calculating an index based on the modal consonant length.
Specifically, for example, the modal consonant length is calculated. When the modal consonant length is large (the consonants are long), the groove is judged good and the singing skilled; when it is small (the consonants are short), the groove is judged poor and the singing unskilled. The score for the singing is set higher the larger the modal consonant length, and lower the smaller it is.
Furthermore, the groove may be determined and the singing evaluated by taking into account at least two of the variation-based index, the mean-based index, and the mode-based index. For example, the variation-based index and the mean-based index may each be converted to a numerical value, and the result of arithmetic operations on these values used to determine the groove and evaluate the singing. As one concrete example, the numerical values may be defined so that they increase as the variation and the mean increase, and the average or weighted average of these indices may be used.
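
The weighted-average combination mentioned above can be sketched as follows. This is only an illustration: the function name, the default weights, and the assumption that both indices have already been scaled to a common range (here 0 to 1, larger meaning better groove) are not taken from the source.

```python
def combined_groove_index(variation_index, mean_index,
                          w_variation=0.5, w_mean=0.5):
    """Weighted average of two groove indices.

    Both inputs are assumed to be pre-scaled to a common range so that
    they can be meaningfully averaged; the weights are hypothetical.
    """
    total_weight = w_variation + w_mean
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    return (w_variation * variation_index + w_mean * mean_index) / total_weight


# Equal weights give the plain average of the two indices.
print(combined_groove_index(0.8, 0.4))
# Weighting variation three times as heavily as the mean.
print(combined_groove_index(0.8, 0.4, w_variation=3, w_mean=1))
```

A mode-based index could be added as a third term in the same way.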
To realize such singing evaluation, the singing evaluation device according to the embodiment of the present invention has the configuration shown in FIG. 1. The singing evaluation device 10 includes a consonant length measurement unit 11, a consonant length distribution creation unit 12, and an evaluation unit 13.
Singing data obtained by picking up the singer's singing is input to the consonant length measurement unit 11. The consonant length measurement unit 11 detects consonants in the singing data using a known method, measures the consonant length, that is, the duration of each consonant, and outputs the measured consonant lengths to the consonant length distribution creation unit 12.
One example of a known method for detecting consonants is Japanese Patent Application Laid-Open No. 2008-32933, which discloses a technique that judges sections with periodicity to be vowel sections and all other sections to be consonant sections. Alternatively, the user may manually specify the consonant sections of the singing data on a GUI (graphical user interface), and the length of each specified section is then measured. The user could also specify the consonant sections manually while listening to the singing data, without using a GUI, and the length of each specified section is then measured.
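
Assuming some earlier stage has already labeled each short analysis frame as periodic (vowel) or not, the consonant lengths follow from the lengths of the non-periodic runs. The sketch below shows only that run-length step; the function name, the 10 ms frame length, and the boolean-flag input are assumptions, and it does not reproduce the method of the cited publication. Note that silence is also non-periodic, so a practical system would have to exclude silent frames first.

```python
from itertools import groupby


def consonant_lengths_ms(is_periodic, frame_ms=10.0):
    """Return the duration of each run of non-periodic frames, in milliseconds.

    `is_periodic` holds one boolean per analysis frame (True = periodic,
    treated as a vowel frame). How the flags are obtained is outside this sketch.
    """
    lengths = []
    for periodic, run in groupby(is_periodic):
        frame_count = sum(1 for _ in run)
        if not periodic:
            lengths.append(frame_count * frame_ms)
    return lengths


# Hypothetical frame labels: consonant, vowel, consonant, vowel.
flags = [False] * 3 + [True] * 10 + [False] * 5 + [True] * 8
print(consonant_lengths_ms(flags))
```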
The consonant length distribution creation unit 12 stores the input consonant lengths over a preset time section. This time section may be, for example, the entire song or a predetermined phrase. When a predetermined phrase is used, score data is acquired separately and referred to. The consonant length distribution creation unit 12 creates a frequency distribution of the consonant lengths and outputs it to the evaluation unit 13.
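
A frequency distribution of this kind can be sketched as a simple histogram. The bin width of 10 ms, the function names, and the sample values are assumptions for illustration; the source does not state how the distribution is binned.

```python
from collections import Counter


def consonant_length_histogram(lengths_ms, bin_ms=10):
    """Count consonant lengths per bin; keys are the lower edge of each bin."""
    counts = Counter(int(length // bin_ms) * bin_ms for length in lengths_ms)
    return dict(sorted(counts.items()))


def modal_bin(histogram):
    """Return the bin with the highest count (the lowest such bin on ties)."""
    return max(histogram, key=histogram.get)


# Hypothetical consonant lengths collected over one phrase, in milliseconds.
lengths = [32, 38, 41, 45, 47, 63, 88]
histogram = consonant_length_histogram(lengths)
print(histogram)
print(modal_bin(histogram))
```

The modal bin gives the consonant length with the highest frequency, which the text also allows as a basis for judging the groove.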
The evaluation unit 13 judges from the frequency distribution of consonant lengths whether the groove is good or poor, and evaluates the singing based on that judgment. The criteria for judging the groove are those described above. Specifically, the evaluation unit 13 calculates from the frequency distribution an index of the variation in consonant length (for example, the variance or standard deviation) and judges the groove from that index as described above. As also noted above, the evaluation unit 13 may instead judge the groove using the mean consonant length or the consonant length with the highest frequency (the mode).
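The indices named in this paragraph can be sketched as follows. This is only an illustration, not the implementation of the embodiment: the function name `consonant_length_indices`, the millisecond unit, and the 10 ms bin width are assumptions, and the thresholds that turn an index into a good or poor groove judgment are deliberately left out because they depend on the criteria described earlier.

```python
from collections import Counter
from statistics import mean, pstdev, pvariance


def consonant_length_indices(lengths_ms, bin_ms=10):
    """Summarise consonant lengths (assumed to be in milliseconds).

    Returns the frequency distribution (histogram) together with the
    variance, standard deviation, mean, and binned mode.
    """
    if not lengths_ms:
        raise ValueError("at least one consonant length is required")
    # Frequency distribution: count the lengths falling in each bin.
    histogram = Counter(int(x // bin_ms) * bin_ms for x in lengths_ms)
    # Mode: lower edge of the most populated bin (ties go to the shorter bin).
    mode_bin = min(histogram, key=lambda edge: (-histogram[edge], edge))
    return {
        "histogram": dict(sorted(histogram.items())),
        "variance": pvariance(lengths_ms),
        "std": pstdev(lengths_ms),
        "mean": mean(lengths_ms),
        "mode_bin": mode_bin,
    }
```

The mode is taken over bins rather than raw values because measured durations rarely repeat exactly; the bin width is a tuning choice.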
As described above, the configuration of this embodiment makes it possible to judge accurately whether the groove of the singing is good or poor and to evaluate the singing with that judgment taken into account. In other words, it yields a singing evaluation that is closer to how singers and listeners actually perceive skill.
The evaluation unit 13 may evaluate singing as follows, for example. For a single song, singing data at various skill levels, from skilled singers to unskilled ones, is collected and stored in a storage device. A computer reads each item of singing data from the storage device and calculates the variance of the consonant lengths obtained from it. The computer then grades the variance obtained from each item of singing data on a 10-stage scale and assigns the corresponding score (1 to 10). In this way, the groove of each item of singing data can be rated with a score on a 10-stage scale.
The grading of the variance is not limited to 10 stages; a finer scale (20 stages, 50 stages, and so on) or a coarser scale (8 stages, 5 stages, and so on) may be used. The width of each stage (the difference between the smallest and largest variance that map to a given score) may be the same for all stages, or the widths may be weighted so that they differ (for example, the width for a score of 5 differs from the width for a score of 10). Alternatively, the stage widths may be equal while the points awarded at each stage are weighted (for example, in a three-stage scale, 5 points for the lowest stage, 10 points for the middle stage, and 30 points for the highest stage).
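The staged grading described above can be sketched as a lookup against a list of stage boundaries. The names `stage_score` and `equal_boundaries` are hypothetical, and the sketch assumes that a larger variance falls in a higher stage; if the criteria described earlier call for the opposite direction, the points list can simply be reversed.

```python
from bisect import bisect_right


def equal_boundaries(low, high, stages):
    """Equally spaced boundaries dividing [low, high] into `stages` stages."""
    step = (high - low) / stages
    return [low + step * i for i in range(1, stages)]


def stage_score(variance, boundaries, points=None):
    """Map a variance to a score.

    boundaries: ascending values separating the stages (stages - 1 entries);
                unequal spacing gives weighted stage widths.
    points:     optional points per stage; defaults to 1..stages.
    """
    stage = bisect_right(boundaries, variance)  # 0-based stage index
    if points is None:
        return stage + 1
    if len(points) != len(boundaries) + 1:
        raise ValueError("points must have one entry per stage")
    return points[stage]
```

For instance, `stage_score(v, equal_boundaries(0, 100, 10))` gives a 10-stage score with equal widths, while `stage_score(v, [30, 60], points=[5, 10, 30])` reproduces the weighted three-stage example; the numeric ranges here are illustrative only.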
The description above assumes that each functional unit is provided as a separate component. Alternatively, the functions may be implemented as a stored program that is executed by an arithmetic processing element such as a CPU. In that case, the following processing flow may be used. FIG. 3 is a flowchart of the singing evaluation method according to the first embodiment of the present invention.
First, the singing evaluation device 10 acquires singing data and measures the consonant length of each consonant in the data (S101). Next, it creates a distribution of the measured consonant lengths (S102). It then calculates an index of the variation in consonant length from the distribution, judges the groove using that index, and evaluates the singing (S103). Although FIG. 3 judges the groove from the variation in consonant length, the mean or mode of the consonant length may be used instead, as described above.
Next, a singing evaluation device according to a second embodiment of the present invention is described with reference to the drawings. FIG. 4 is a block diagram showing the main configuration of the singing evaluation device according to the second embodiment of the present invention.
The singing evaluation device 10A of this embodiment adds a vowel pronunciation timing acquisition unit 21 and a vowel pronunciation timing distribution creation unit 22 to the singing evaluation device 10 of the first embodiment; the rest of the configuration is the same as that of the singing evaluation device 10. Only the differences from the singing evaluation device 10 of the first embodiment are therefore described in detail below.
The vowel pronunciation timing acquisition unit 21 receives the same singing data as the consonant length measurement unit 11. It detects vowels using a known method and detects the pronunciation start timing of each vowel. Specifically, for a vowel preceded by a consonant, the unit detects the timing at which the consonant changes to the vowel. When vowels follow one another, it detects the timing at which one vowel changes to the next. When a vowel is sung after silence, it detects the timing at which the vowel begins from the silent state. The vowel pronunciation timing acquisition unit 21 outputs the detected pronunciation start timing of each vowel to the vowel pronunciation timing distribution creation unit 22.
The vowel pronunciation timing distribution creation unit 22 receives the pronunciation start timing of each vowel and the beat timings of the song being sung. It compares each vowel's pronunciation start timing with the beat timings and associates each vowel with the beat closest to its pronunciation start timing. FIG. 5 illustrates the concept of associating vowel pronunciation start timings with beat timings. In FIG. 5, for example, the pronunciation start timing of vowel V01 is closest to the first beat, so the unit assigns vowel V01 to the first beat. In the same way, it assigns vowel V02 to the third beat, vowel V03 to the fourth beat, vowel V04 to the fifth beat, vowel V05 to the seventh beat, and vowel V06 to the eighth beat.
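The nearest-beat association can be sketched as below. The function names `nearest_beat` and `associate`, the use of seconds, and the tie-breaking rule (an onset exactly midway between two beats goes to the earlier beat) are assumptions made for illustration; the description only requires that each vowel be linked to its closest beat.

```python
from bisect import bisect_left


def nearest_beat(onset, beat_times):
    """Return (beat_number, time_difference) for one vowel onset.

    beat_times must be sorted in ascending order; beat numbers are 1-based.
    time_difference is onset minus beat time (positive means late).
    """
    if not beat_times:
        raise ValueError("beat_times must not be empty")
    i = bisect_left(beat_times, onset)
    # Only the beats immediately before and after the onset can be closest.
    candidates = [j for j in (i - 1, i) if 0 <= j < len(beat_times)]
    best = min(candidates, key=lambda j: abs(beat_times[j] - onset))
    return best + 1, onset - beat_times[best]


def associate(onsets, beat_times):
    """Associate every vowel onset with its closest beat."""
    return [nearest_beat(t, beat_times) for t in onsets]
```

The signed time differences returned here are the values from which the time difference distribution is built.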
The vowel pronunciation timing distribution creation unit 22 calculates the time difference between each vowel's pronunciation start timing and the timing of its associated beat, creates a distribution of these time differences, and outputs the distribution to the evaluation unit 13.
The evaluation unit 13 judges how skilled the singing is from the distribution of the time differences between vowel pronunciation start timings and beat timings. FIG. 6 is a frequency distribution diagram showing how this distribution relates to singing skill. As FIG. 6 shows, the variation in the time difference is small for skilled singing and large for unskilled singing. In addition, the mode of the time difference is approximately 0 for skilled singing but deviates substantially from 0 for unskilled singing.
Using this property, the evaluation unit 13 measures the variation in the time difference and judges the singing to be more skilled the smaller the variation is. It also finds the mode of the time difference and judges the singing to be more skilled the closer the mode is to 0. The evaluation unit 13 may use both the variation and the mode of the time difference to judge singing skill.
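The two timing indices used here, the variation and the mode of the time differences, can be sketched as follows. The function name `timing_indices`, the millisecond unit, and the 20 ms bin width are assumptions; bins are centred on multiples of the bin width so that offsets close to 0 fall into the bin labelled 0.

```python
from collections import Counter
from statistics import pstdev


def timing_indices(offsets_ms, bin_ms=20):
    """Return (spread, mode) of onset-minus-beat offsets in milliseconds.

    spread: population standard deviation of the offsets.
    mode:   centre of the most populated bin (ties go to the bin nearer 0).
    """
    if not offsets_ms:
        raise ValueError("at least one offset is required")
    bins = Counter(round(x / bin_ms) * bin_ms for x in offsets_ms)
    mode = min(bins, key=lambda centre: (-bins[centre], abs(centre)))
    return pstdev(offsets_ms), mode
```

A smaller spread and a mode nearer 0 both point toward more skilled singing; how the two are weighted against each other is left open, as in the description.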
The evaluation unit 13 combines the evaluation result based on the time difference between vowel pronunciation start timings and beat timings with the evaluation result based on consonant length described above to produce an integrated singing evaluation. This allows singing skill to be judged more accurately.
The evaluation unit 13 may also perform the consonant-length evaluation only when the evaluation based on the time difference between vowel pronunciation start timings and beat timings has judged the singing to be skilled. In that case, the consonant-length evaluation is skipped whenever it is considered unnecessary, which reduces the processing load of the singing evaluation.
The processing of this embodiment may likewise be implemented as a stored program that is executed by an arithmetic processing element such as a CPU. In that case, the following processing flow may be used. FIG. 7 is a flowchart of the singing evaluation method according to the second embodiment of the present invention. FIG. 7 shows the case in which the evaluation result based on the time difference between vowel pronunciation timings and beat timings determines whether or not the consonant-length evaluation is performed.
First, the singing evaluation device 10A acquires singing data and measures the consonant length of each consonant in the data (S201). Separately from this measurement, it detects the pronunciation start timing of each vowel in the singing data (S202). Next, it creates a distribution of the time differences between each vowel's pronunciation start timing and the timing of the corresponding beat (S203). If the time differences are found to be concentrated near 0 (S204: YES), the device calculates an index of the variation in consonant length from the consonant length distribution, judges the groove using that index, and evaluates the singing (S205). If the time differences are not concentrated near 0 (S204: NO), the device skips the consonant-length evaluation and rates the singing as unskilled.
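The branch at S204 can be sketched as a simple gate in front of the consonant-length evaluation. The description does not define "concentrated near 0" numerically, so the test used here (at least 70% of offsets within 50 ms of the beat) is purely an assumption, as are the function names and the choice of variance as the groove index.

```python
from statistics import pvariance


def concentrated_near_zero(offsets_ms, tolerance_ms=50, min_fraction=0.7):
    """Assumed S204 test: are enough offsets within the tolerance of 0?"""
    if not offsets_ms:
        return False
    inside = sum(1 for x in offsets_ms if abs(x) <= tolerance_ms)
    return inside / len(offsets_ms) >= min_fraction


def evaluate_with_timing_gate(consonant_lengths_ms, offsets_ms):
    """Sketch of S204 and S205: evaluate groove only when timing is good."""
    if not concentrated_near_zero(offsets_ms):
        # S204: NO -> skip the consonant-length evaluation.
        return {"timing_ok": False, "groove_index": None}
    # S204: YES -> S205, index of variation in consonant length.
    return {"timing_ok": True, "groove_index": pvariance(consonant_lengths_ms)}
```

Skipping the variance calculation on the NO branch is what provides the reduction in processing load mentioned above.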
Next, a singing evaluation device according to a third embodiment is described with reference to the drawings. FIG. 8 is a block diagram showing the main configuration of the singing evaluation device according to the third embodiment of the present invention.
The singing evaluation device 10B of this embodiment differs from the singing evaluation device 10A of the second embodiment in how the vowel pronunciation timing acquisition unit 21 and the vowel pronunciation timing distribution creation unit 22 are connected; the rest of the configuration is the same as that of the singing evaluation device 10A. Only the differences from the singing evaluation device 10A of the second embodiment are therefore described in detail.
The vowel pronunciation timing acquisition unit 21 outputs the detected vowel pronunciation start timings to the consonant length distribution creation unit 12 as well as to the vowel pronunciation timing distribution creation unit 22.
The vowel pronunciation timing distribution creation unit 22 outputs the beat associated with each vowel pronunciation start timing to the consonant length distribution creation unit 12.
The consonant length distribution creation unit 12 determines the beat corresponding to each consonant from the input vowel pronunciation timings and their associated beats. Specifically, in the case of FIG. 5, the unit determines that consonant C01, which precedes vowel V01, corresponds to the first beat. Likewise, it determines that consonant C02 (with vowel V02) corresponds to the third beat, consonant C04 (with vowel V04) to the fifth beat, consonant C05 (with vowel V05) to the seventh beat, and consonant C06 (with vowel V06) to the eighth beat.
The consonant length distribution creation unit 12 creates a consonant length distribution for each beat. FIG. 9 is a frequency distribution diagram showing the per-beat consonant length distributions according to the third embodiment of the present invention. The unit outputs the per-beat consonant length distributions to the evaluation unit 13.
The evaluation unit 13 judges the groove from the per-beat consonant length distributions and evaluates the singing. As one specific example, it judges the groove from the difference between the variance (or standard deviation) of the consonant length at the beat where the variation is smallest and the variance (or standard deviation) at the beat where the variation is largest. In the example of FIG. 9, the groove is judged from the difference between the variance or standard deviation of the first beat and that of the sixth beat. The difference used here is not limited to an arithmetic difference; the arithmetic ratio between the two variances or standard deviations may be used instead. The evaluation unit 13 judges the groove to be better the larger this difference is and poorer the smaller it is.
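The per-beat comparison can be sketched as follows. The function name `per_beat_spread_contrast`, the input format of `(beat_number, consonant_length_ms)` pairs, and the rule of ignoring beats with fewer than two samples are assumptions for illustration; variance is used, although the standard deviation would serve equally.

```python
from collections import defaultdict
from statistics import pvariance


def per_beat_spread_contrast(samples, use_ratio=False):
    """Contrast between the most and least variable beats.

    samples:   iterable of (beat_number, consonant_length_ms) pairs.
    use_ratio: return max/min variance instead of max - min.
    """
    by_beat = defaultdict(list)
    for beat, length in samples:
        by_beat[beat].append(length)
    # A variance needs at least two samples to be meaningful.
    variances = [pvariance(v) for v in by_beat.values() if len(v) >= 2]
    if len(variances) < 2:
        raise ValueError("need at least two beats with two or more samples")
    lowest, highest = min(variances), max(variances)
    if use_ratio:
        return highest / lowest if lowest > 0 else float("inf")
    return highest - lowest
```

A larger return value corresponds to a better groove under the rule stated above; the cut-off between good and poor is not specified here.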
Judging the groove from the per-beat variation in this way allows the groove to be judged more accurately and yields a singing evaluation that is even closer to how singers and listeners perceive skill.
The processing of this embodiment may likewise be implemented as a stored program that is executed by an arithmetic processing element such as a CPU. In that case, the following processing flow may be used. FIG. 10 is a flowchart of the singing evaluation method according to the third embodiment of the present invention. FIG. 10 shows the case in which no evaluation based on the time difference between vowel pronunciation timings and beat timings is performed. In this embodiment, that evaluation may be performed or omitted.
First, the singing evaluation device 10B acquires singing data and measures the consonant length of each consonant in the data (S301). Separately from this measurement, it detects the pronunciation start timing of each vowel in the singing data (S302). Next, it determines which beat each vowel pronunciation start timing is associated with (S303). It then creates a consonant length distribution for each beat (S304). Finally, it judges the groove using the variation of the per-beat consonant length distributions and evaluates the singing (S305).
Next, a singing evaluation device according to a fourth embodiment of the present invention is described with reference to the drawings. FIG. 11 is a block diagram showing the main configuration of the singing evaluation device according to the fourth embodiment of the present invention. The singing evaluation device 10C of this embodiment adds a determination target detection unit 23 to the singing evaluation device 10B of the third embodiment, and the processing of the consonant length distribution creation unit 12 differs accordingly. Only the differences from the singing evaluation device 10B of the third embodiment are therefore described in detail.
The determination target detection unit 23 includes a melodic figure analysis unit, a rhythm analysis unit, and a specific section extraction unit. It receives score data, song data, or both. The score data contains the pitch, volume, pronunciation timing, and so on of the notes of the song to be sung. The song data contains the structure of the song, such as its chorus sections, as well as its genre and similar information.
The determination target detection unit 23 analyzes the score data and selects the consonants to be used for the consonant length judgment. FIG. 12 shows specific examples of how these consonants are selected. FIG. 12(A) illustrates selection based on a melodic figure, and FIG. 12(B) illustrates selection based on rhythm.
When a melodic figure is used, the melodic figure analysis unit of the determination target detection unit 23 searches the score data for the figure to be used for the consonant length judgment and, on finding it, sets the section of that figure and the timing of the consonant to be used for the judgment. For example, as shown in FIG. 12(A), when a rising figure is detected, the consonant of the third of three notes whose pitch rises successively is selected for the consonant length judgment, and the timing of that consonant is passed to the consonant length distribution creation unit 12.
When rhythm is used, the rhythm analysis unit of the determination target detection unit 23 searches the score data for the rhythm to be used for the consonant length judgment and, on finding it, sets the section of that rhythm and the timing of the consonant to be used for the judgment. For example, as shown in FIG. 12(B), when syncopation is detected, the consonant of the third note (the note with the short duration) of three notes in a section where differing note lengths repeat is selected for the consonant length judgment, and the consonant length of that consonant is passed to the consonant length distribution creation unit 12.
In this embodiment the melodic figure or rhythm is specified over a section of three consecutive notes, but a section of two notes, or of four or more notes, may be used. Note, however, that the fewer notes the section contains, the more often the same figure or rhythm tends to occur in the song being sung, which makes it easier to build an accurate consonant length distribution and is therefore more useful. Also, the consonant of the first or a middle note of the section, rather than the last note, may be selected for the consonant length judgment. This allows a more appropriate evaluation to suit the figure or rhythm.
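The rising-figure case can be sketched as a scan over the note pitches in the score data. The function name `rising_figure_targets`, the representation of pitches as a list of MIDI-style note numbers, and the handling of overlapping runs (every qualifying window is reported) are assumptions; the `run` parameter reflects the remark that sections of two notes or of four or more notes are also possible.

```python
def rising_figure_targets(pitches, run=3):
    """Indices of the last note of every run of `run` strictly rising pitches.

    pitches: note pitches in score order (for example MIDI note numbers).
    The consonant of each returned note is a candidate for the
    consonant length judgment.
    """
    if run < 2:
        raise ValueError("run must be at least 2")
    targets = []
    for end in range(run - 1, len(pitches)):
        window = pitches[end - run + 1 : end + 1]
        if all(a < b for a, b in zip(window, window[1:])):
            targets.append(end)
    return targets
```

To select the first or a middle note of the figure instead of the last, the returned indices can be shifted by the appropriate offset.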
The determination target detection unit 23 can also analyze the song data to select the consonants to be used for the consonant length judgment. For example, it detects a chorus section from the song data and passes the time span of that chorus section to the consonant length distribution creation unit 12.
The consonant length distribution creation unit 12 refers to the timing or section supplied by the determination target detection unit 23 and to the timing of each consonant obtained from the vowel pronunciation timing acquisition unit 21 and the vowel pronunciation timing distribution creation unit 22, and uses the consonant lengths input from the consonant length measurement unit 11 to create a consonant length distribution for the supplied timing or section.
It is known that groove in singing tends to appear at specific timings within particular melodic figures, particular rhythms, and particular sections of a song. With the configuration of this embodiment, the consonant-length judgment of the groove can therefore be made at the timings where groove is most likely to appear, so the groove can be judged more accurately.
The processing of this embodiment may likewise be implemented as a stored program that is executed by an arithmetic processing element such as a CPU. In that case, the following processing flow may be used. FIG. 13 is a flowchart of the singing evaluation method according to the fourth embodiment of the present invention. FIG. 13 shows the case in which the evaluation result based on the time difference between vowel pronunciation timings and beat timings determines whether or not the consonant-length evaluation is performed, but this time-difference evaluation may be omitted.
 First, the singing evaluation device 10C acquires singing data and measures the consonant length of each consonant included in the singing data (S401). Separately from this measurement, the singing evaluation device 10C analyzes the score data or the music data and sets, according to the melodic figure, the rhythm, or a specific section, the timings or sections for which a consonant length distribution is to be created (S402). Also separately from the consonant length measurement and the setting of these targets, the singing evaluation device 10C detects the sounding start timing of each vowel included in the singing data (S403). Next, the singing evaluation device 10C creates a distribution of the time differences between the sounding start timing of each vowel and the timing of the corresponding beat (S404). When the singing evaluation device 10C detects that the time differences are heavily concentrated near 0 (S405: YES), it creates a consonant length distribution for the target timings or sections (S406). The singing evaluation device 10C then calculates, from the consonant length distribution, an index based on the variation in consonant length, uses the index to judge whether the groove is good or poor, and evaluates the singing (S407). Conversely, when the device detects that the time differences are not heavily concentrated near 0 (S405: NO), it does not perform the singing evaluation based on consonant length and evaluates the singing as poor. Alternatively, the consonant-based evaluation may still be performed, with the groove rating greatly lowered.
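Steps S404 to S407 can be sketched in Python as follows. This is only a minimal illustration under stated assumptions, not the implementation of the device: the function name `evaluate_singing`, the thresholds `near_zero` and `min_ratio`, the pairing of each vowel onset with its nearest beat, and the use of the population standard deviation as the spread index are all hypothetical choices that the description does not specify.

```python
from statistics import pstdev


def evaluate_singing(consonant_lengths, vowel_onsets, beat_times,
                     near_zero=0.05, min_ratio=0.7):
    """Hypothetical sketch of steps S404 to S407 (all times in seconds).

    near_zero and min_ratio are assumed thresholds; the description only
    says the time differences must be concentrated near 0.
    """
    if not consonant_lengths or not vowel_onsets or not beat_times:
        raise ValueError("consonant lengths, vowel onsets and beats are required")

    # S404: signed time difference between each vowel onset and a beat.
    # Assumption: the "corresponding" beat is the nearest one.
    diffs = [min((onset - beat for beat in beat_times), key=abs)
             for onset in vowel_onsets]

    # S405: fraction of the time differences that lie near 0.
    ratio = sum(abs(d) <= near_zero for d in diffs) / len(diffs)
    if ratio < min_ratio:
        # S405: NO -> skip the consonant-length evaluation, rate as poor.
        return {"on_beat": False, "groove_index": None}

    # S406/S407: spread of the consonant length distribution as the index.
    # Assumption: population standard deviation stands in for the spread.
    return {"on_beat": True, "groove_index": pstdev(consonant_lengths)}


result = evaluate_singing(
    consonant_lengths=[0.05, 0.12, 0.08, 0.20],
    vowel_onsets=[0.01, 0.49, 1.02, 1.50],
    beat_times=[0.0, 0.5, 1.0, 1.5],
)
print(result)
```

In this sketch a larger `groove_index` would be read as a better groove, in line with the claims that associate a wider consonant length distribution with a higher rating.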
 The description above gives an example of creating consonant length distributions per beat, but distributions may also be created per consonant. For example, the consonants may be identified and a consonant length distribution created for each consonant, or consonant categories may be identified from the music data and a distribution created for each category. For instance, consonants may be classified as voiced or unvoiced, or into categories such as plosives and sibilants, with a consonant length distribution created for each category.
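The per-category variant can be sketched as follows. The category table, the function name `spread_by_category`, and the choice of the population standard deviation as the spread measure are illustrative assumptions; the description names plosives and sibilants as example categories but does not specify a mapping or a statistic.

```python
from collections import defaultdict
from statistics import pstdev

# Hypothetical category table for illustration only (not from the source).
CATEGORY = {
    "p": "plosive", "t": "plosive", "k": "plosive",
    "s": "sibilant", "sh": "sibilant", "z": "sibilant",
}


def spread_by_category(measurements, category=CATEGORY):
    """Group (consonant, length_in_seconds) pairs by category and return
    the spread of the consonant lengths within each category."""
    groups = defaultdict(list)
    for consonant, length in measurements:
        # Consonants missing from the table fall into an "other" bucket.
        groups[category.get(consonant, "other")].append(length)
    return {name: pstdev(lengths) for name, lengths in groups.items()}


measured = [("k", 0.04), ("t", 0.06), ("s", 0.10), ("s", 0.10), ("m", 0.03)]
print(spread_by_category(measured))
```

Each per-category spread could then be fed to the evaluation in place of a single overall spread.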
 The description above covers measuring consonant lengths and determining their variation from a singing sound picked up by a microphone or the like. The same configuration can also be applied to artificially generated singing that imitates a human voice, with the same effects.
 The description above covers automatic measurement of the consonant length. Alternatively, the user may manually distinguish consonants from vowels while viewing the waveform, measure the consonant lengths, and input them to the device (that is, the device acquires them).
 The description above also covers automatic measurement of the vowel sounding start timing. Alternatively, the user may manually detect the vowel sounding start timing while viewing the waveform and input data representing the detected timing to the device (that is, the device acquires it), or the user may listen to the emitted singing sound, manually detect the vowel sounding start timing, and input data representing it to the device. In this case, an operation input unit (not shown) may serve as the vowel sounding timing acquisition unit.
 This application is based on Japanese Patent Application No. 2014-010223, filed on January 23, 2014, the contents of which are incorporated herein by reference.
10, 10A, 10B, 10C: singing evaluation device
11: consonant length measurement unit
12: consonant length distribution creation unit
13: evaluation unit
21: vowel sounding timing acquisition unit
22: vowel sounding timing distribution creation unit
23: determination target detection unit

Claims (13)

  1.  A singing evaluation device comprising:
     a consonant length distribution creation unit that creates, based on singing data representing a singing sound, a distribution of the consonant lengths of the consonants included in the singing sound; and
     an evaluation unit that detects the degree of spread of the consonant length distribution and evaluates the singing sound using that degree of spread.
  2.  The singing evaluation device according to claim 1, further comprising a consonant length measurement unit that measures the consonant lengths in the singing data obtained from the singing sound and outputs the measured consonant lengths to the consonant length distribution creation unit.
  3.  The singing evaluation device according to claim 1 or claim 2, further comprising a determination target detection unit that detects consonants to be determined from at least one of score data and music data of the song to be sung,
     wherein the consonant length distribution creation unit creates the consonant length distribution for the consonants designated as determination targets.
  4.  The singing evaluation device according to claim 3, wherein the determination target detection unit comprises at least one of:
     a melodic figure analysis unit that analyzes, from the score data, the melodic figure to be sung;
     a rhythm analysis unit that analyzes, from the score data, the rhythm to be sung; and
     a specific section extraction unit that extracts, from the music data, a specific section of the song to be sung,
     and determines the determination target from at least one of the melodic figure, the rhythm, and the specific section.
  5.  The singing evaluation device according to any one of claims 1 to 4, further comprising:
     a vowel sounding timing acquisition unit that acquires vowel sounding timings of the singing sound; and
     a vowel sounding timing distribution creation unit that detects timing differences between the vowel sounding timings and the beat timings of the song and creates a distribution of the vowel sounding timing differences,
     wherein the evaluation unit uses the distribution of the vowel sounding timing differences in evaluating the singing sound.
  6.  The singing evaluation device according to any one of claims 1 to 5, wherein the evaluation unit judges that the groove is better as the degree of spread of the consonant length distribution is larger, and rates the singing sound more highly.
  7.  A singing evaluation method comprising:
     creating, based on singing data representing a singing sound, a distribution of the consonant lengths of the consonants included in the singing sound; and
     detecting the degree of spread of the consonant length distribution and evaluating the singing sound using that degree of spread.
  8.  The singing evaluation method according to claim 7, wherein the consonant lengths in the singing data obtained from the singing sound are measured, and the consonant length distribution is created from the measured consonant lengths.
  9.  The singing evaluation method according to claim 7 or claim 8, wherein consonants to be determined are detected from at least one of score data and music data of the song to be sung, and the consonant length distribution is created for the consonants designated as determination targets.
  10.  The singing evaluation method according to claim 9, wherein the determination target is determined from at least one of a melodic figure to be sung that is analyzed from the score data, a rhythm to be sung that is analyzed from the score data, and a specific section of the song that is extracted from the music data.
  11.  The singing evaluation method according to any one of claims 7 to 10, further comprising:
     acquiring vowel sounding timings of the singing sound;
     detecting timing differences between the vowel sounding timings and the beat timings of the song and creating a distribution of the vowel sounding timing differences; and
     using the distribution of the vowel sounding timing differences in evaluating the singing sound.
  12.  The singing evaluation method according to any one of claims 7 to 11, wherein the groove is judged to be better as the degree of spread of the consonant length distribution is larger, and the singing sound is rated more highly.
  13.  A singing evaluation program for causing a computer to execute the functions of:
     a consonant length distribution creation unit that creates, based on singing data representing a singing sound, a distribution of the consonant lengths of the consonants included in the singing sound; and
     an evaluation unit that detects the degree of spread of the consonant length distribution and evaluates the singing sound using that degree of spread.
PCT/JP2015/051731 2014-01-23 2015-01-22 Singing evaluation device, singing evaluation method, and singing evaluation program WO2015111671A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
JP2014010223A JP6304650B2 (en) 2014-01-23 2014-01-23 Singing evaluation device
JP2014-010223 2014-01-23

Publications (1)

Publication Number Publication Date
WO2015111671A1 (en) 2015-07-30

Family

ID=53681472

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2015/051731 WO2015111671A1 (en) 2014-01-23 2015-01-22 Singing evaluation device, singing evaluation method, and singing evaluation program

Country Status (2)

Country Link
JP (1) JP6304650B2 (en)
WO (1) WO2015111671A1 (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2008145940A (en) * 2006-12-13 2008-06-26 Yamaha Corp Voice evaluation device and voice evaluation method
JP2009258366A (en) * 2008-04-16 2009-11-05 Arcadia:Kk Speech control device
JP2013195738A (en) * 2012-03-21 2013-09-30 Yamaha Corp Singing evaluation device

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20100192753A1 (en) * 2007-06-29 2010-08-05 Multak Technology Development Co., Ltd Karaoke apparatus

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2017068990A1 (en) * 2015-10-22 2017-04-27 ヤマハ株式会社 Musical sound evaluation device, evaluation criteria generation device, and recording medium
US10453435B2 (en) 2015-10-22 2019-10-22 Yamaha Corporation Musical sound evaluation device, evaluation criteria generating device, method for evaluating the musical sound and method for generating the evaluation criteria
CN113096689A (en) * 2021-04-02 2021-07-09 腾讯音乐娱乐科技(深圳)有限公司 Song singing evaluation method, equipment and medium
CN114678039A (en) * 2022-04-13 2022-06-28 厦门大学 Singing evaluation method based on deep learning

Also Published As

Publication number Publication date
JP6304650B2 (en) 2018-04-04
JP2015138177A (en) 2015-07-30

Similar Documents

Publication Publication Date Title
EP2980786B1 (en) Voice analysis method and device, voice synthesis method and device and medium storing voice analysis program
Sundberg et al. Acoustical study of classical Peking Opera singing
JP6759545B2 (en) Evaluation device and program
JP5196550B2 (en) Code detection apparatus and code detection program
WO2015111671A1 (en) Singing evaluation device, singing evaluation method, and singing evaluation program
CN108292499A (en) Skill determining device and recording medium
JP4479701B2 (en) Music practice support device, dynamic time alignment module and program
JP2007334364A (en) Karaoke machine
JP2007156330A (en) Karaoke device with compatibility determination function
JP2008065153A (en) Musical piece structure analyzing method, program and device
JP6098422B2 (en) Information processing apparatus and program
JP2008225115A (en) Karaoke device, singing evaluation method, and program
Delviniotis Acoustic characteristics of modern Greek Orthodox Church music
JP4218066B2 (en) Karaoke device and program for karaoke device
JP2008015388A (en) Singing skill evaluation method and karaoke machine
JP2016180965A (en) Evaluation device and program
Bonjyotsna et al. Analytical study of vocal vibrato and mordent of Indian popular singers
JP5447624B2 (en) Karaoke equipment
TWI394141B (en) Karaoke song accompaniment automatic scoring method
JP2008040258A (en) Musical piece practice assisting device, dynamic time warping module, and program
JP5034642B2 (en) Karaoke equipment
JP5416396B2 (en) Singing evaluation device and program
JP2008015212A (en) Musical interval change amount extraction method, reliability calculation method of pitch, vibrato detection method, singing training program and karaoke device
JP5186793B2 (en) Karaoke equipment
KR20230102973A (en) Methods and Apparatus for calculating song scores

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 15740439

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 15740439

Country of ref document: EP

Kind code of ref document: A1