WO2015111671A1 - Singing evaluation device, method, and program

Info

Publication number
WO2015111671A1
Authority
WO
WIPO (PCT)
Prior art keywords
singing
consonant
distribution
consonant length
evaluation
Prior art date
Application number
PCT/JP2015/051731
Other languages
English (en)
Japanese (ja)
Inventor
片寄 晴弘
達矢 的場
隆一 成山
松本 秀一
Original Assignee
ヤマハ株式会社
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by ヤマハ株式会社
Publication of WO2015111671A1

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/48 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use
    • G10L25/51 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for comparison or discrimination
    • G10L25/60 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for comparison or discrimination for measuring the quality of voice signals
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/78 Detection of presence or absence of voice signals
    • G10L25/87 Detection of discrete points within a voice signal
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/93 Discriminating between voiced and unvoiced parts of speech signals

Definitions

  • The present invention relates to a technique for evaluating a singer's singing.
  • Singing data is voice data obtained by recording a singer's singing.
  • The musical score data specifies the pitch, volume, and sounding timing of each note in the song sung by the singer.
  • A conventional singing evaluation apparatus evaluates singing according to how closely the pitch, volume, and sounding timing of each note in the singing data match those of each note in the score data. For example, the closer the match, the higher the score the conventional apparatus assigns.
  • The singing is evaluated by detecting the rhythm of the singing data and determining whether it matches the rhythm contained in the music data of the song.
  • An object of the present invention is to provide a technique for evaluating the dynamic feel of singing, in particular its groove ("nori"), which cannot be evaluated merely by comparison with musical score data or music data.
  • The singing evaluation apparatus includes a consonant length distribution creation unit and an evaluation unit.
  • The consonant length distribution creation unit creates a distribution of the lengths of the consonants included in the singing sound, based on singing data representing the singing sound.
  • The evaluation unit uses the consonant length distribution to detect the consonant length variation, that is, the degree of spread of the distribution, and evaluates the singing sound using this variation.
  • This configuration exploits the fact that the consonant length variation differs between singing with and without a dynamic feel, in particular between singing with good groove and singing with poor groove. Detecting and using the consonant length variation therefore makes it possible to score the singing accurately.
  • The singing evaluation apparatus of the present invention includes a consonant length measurement unit that measures the consonant lengths in the singing data obtained from the singer's singing sound and outputs the measured consonant lengths to the consonant length distribution creation unit.
  • With this configuration, the consonant lengths in the singing data can be measured from the singing sound sung by the user.
  • The singing evaluation apparatus includes a consonant length measurement unit, a consonant length distribution creation unit, and an evaluation unit.
  • The consonant length measurement unit measures the consonant lengths in the singing data obtained from the singer's singing sound.
  • The consonant length distribution creation unit creates a consonant length distribution.
  • The evaluation unit uses the consonant length distribution to detect the consonant length variation, that is, the degree of spread of the distribution, and evaluates the singing sound using this variation.
  • This configuration exploits the fact that the consonant length variation differs between singing with and without a dynamic feel, in particular between singing with good groove and singing with poor groove. Detecting and using the consonant length variation therefore makes it possible to score the singing accurately.
  • The singing evaluation apparatus of the present invention includes a determination target detection unit that detects the consonants to be evaluated from at least one of the musical score data and the music data of the song to be sung.
  • The consonant length distribution creation unit creates a consonant length distribution for the consonants to be evaluated.
  • This configuration exploits the fact that, for specific sound types and rhythms, the degree of consonant length variation differs between singing with good groove and singing with poor groove. Detecting and using the consonant length variation for the specific target consonants therefore allows the singing to be judged more accurately.
  • The determination target detection unit of the singing evaluation apparatus of the present invention includes at least one of a sound type analysis unit, a rhythm analysis unit, and a specific section extraction unit.
  • The sound type analysis unit analyzes, from the musical score data, the sound types to be sung.
  • The rhythm analysis unit analyzes, from the musical score data, the rhythm to be sung.
  • The specific section extraction unit extracts a specific section of the song to be sung from the music data.
  • The determination target detection unit decides the determination target from at least one of the sound type, the rhythm, and the specific section.
  • This configuration is a more specific preferred example of the determination target detection unit. Using the sound type, rhythm, and specific section as determination targets allows a more accurate determination.
  • The singing evaluation apparatus of the present invention includes a vowel pronunciation timing acquisition unit and a vowel pronunciation timing distribution creation unit.
  • The vowel pronunciation timing acquisition unit detects the vowel pronunciation timings from the singing data.
  • The vowel pronunciation timing distribution creation unit detects the timing difference between each vowel pronunciation timing and the beat timing of the music, and creates a distribution of these timing differences.
  • The evaluation unit uses the distribution of the vowel pronunciation timing differences to evaluate the singing.
  • This configuration exploits the fact that the vowel pronunciation timing nearly coincides with the beat timing whether the groove is good or poor, and that singing sounds unskilled when the vowel pronunciation timing deviates frequently. Detecting and using the difference between the vowel pronunciation timing and the beat therefore allows the skill of the singing to be evaluated more accurately.
  • The evaluation unit judges that the larger the degree of spread of the consonant length distribution (the consonant length variation), the better the singing, and scores the singing more highly.
  • Another aspect of the present invention provides a singing evaluation method that creates a distribution of the lengths of the consonants included in the singing sound based on singing data representing the singing sound, detects the degree of spread of the consonant length distribution, and evaluates the singing sound using that degree of spread.
  • Yet another aspect provides a singing evaluation program that causes a computer to execute the functions of creating a distribution of the lengths of the consonants included in the singing sound based on singing data representing the singing sound, detecting the degree of spread of the consonant length distribution, and evaluating the singing sound using that degree of spread.
  • FIG. 6 is a frequency distribution diagram showing the relationship between singing skill and the distribution of the time difference between the vowel pronunciation start timing and the beat timing according to the second embodiment of the present invention.
  • FIG. 8 is a block diagram showing the main configuration of the singing evaluation apparatus according to the third embodiment of the present invention.
  • FIG. 9 is a frequency distribution diagram showing the consonant length distribution for each beat according to the third embodiment of the present invention.
  • FIG. 10 is a flowchart of the singing evaluation method according to the third embodiment of the present invention.
  • FIG. 11 is a block diagram showing the main configuration of the singing evaluation apparatus according to the fourth embodiment of the present invention.
  • FIG. 1 is a block diagram showing the main configuration of a singing evaluation apparatus according to the first embodiment of the present invention.
  • FIG. 2 is a diagram illustrating the concept of the evaluation criterion used by the singing evaluation apparatus according to the first embodiment of the present invention.
  • In FIG. 2, the horizontal axis is the consonant length and the vertical axis is the frequency. That is, FIG. 2 shows the frequency distribution of the consonant lengths contained in a predetermined time section of a sung song.
  • The singing evaluation apparatus makes it possible to evaluate the dynamic feel of singing, in particular its groove ("nori").
  • The "nori" of singing in the present invention is an advanced technique of musically controlling deviations from the positions (timings) obtained by dividing time, mainly into measures. It is a singing expression that produces effects such as the singer's sense of excitement and uplift, a rhythm that makes listeners want to sing and dance along just by hearing it, and human, free, or lively dynamics. It is sometimes called "groove" or "groove feeling".
  • The singing with poor groove in FIG. 2 is singing performed mechanically, following the pitch, volume, and sounding timing given in the score.
  • The singing with good groove is singing in which the groove described above, which depends on the singer's emotion at the time of singing, can be felt.
  • In singing with good groove, the consonant length variation is large and the average consonant length is large (consonants are long on average).
  • In singing with poor groove, the consonant length variation is small and the average consonant length is small (consonants are short on average).
  • By calculating an index based on the consonant length variation, it is possible to determine whether the groove is good or poor. Specifically, for example, the variance or standard deviation of the consonant length is calculated. When the variance or standard deviation is large, the groove is judged good and the singing skilled; when it is small, the groove is judged poor and the singing unskilled. The singing score is raised as the variance or standard deviation increases and lowered as it decreases.
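  • The variation-based index can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function name `consonant_length_spread`, the use of population variance, and the sample lengths in milliseconds are all assumptions introduced here.

```python
from statistics import pstdev, pvariance

def consonant_length_spread(lengths_ms):
    """Return (variance, standard deviation) of a list of consonant lengths."""
    if len(lengths_ms) < 2:
        raise ValueError("need at least two consonant lengths")
    return pvariance(lengths_ms), pstdev(lengths_ms)

# Hypothetical consonant lengths in milliseconds (illustrative, not measured data).
mechanical = [40, 42, 41, 39, 40, 41]
expressive = [35, 80, 50, 120, 45, 95]

var_mech, sd_mech = consonant_length_spread(mechanical)
var_expr, sd_expr = consonant_length_spread(expressive)

# A larger spread is read as better groove, so it would map to a higher score.
better_groove = "expressive" if sd_expr > sd_mech else "mechanical"
```

  • How the spread is converted into a score is left open here; any mapping that increases with the variance or standard deviation is consistent with the description.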
  • The average value of the consonant length is calculated.
  • When the average value of the consonant length is large, the groove is judged good and the singing skilled.
  • When the average value of the consonant length is small, the groove is judged poor and the singing unskilled.
  • The singing score is raised as the average value increases and lowered as it decreases.
  • In singing with good groove, the consonant length at which the frequency is maximal (the mode) is long.
  • In singing with poor groove, the consonant length at which the frequency is maximal (the mode) is short. Therefore, calculating an index based on the consonant length at the mode also makes it possible to determine whether the groove is good or poor.
  • The consonant length at the mode is calculated.
  • When the consonant length at the mode is large (consonants are long), the groove is judged good and the singing skilled; when it is small (consonants are short), the groove is judged poor and the singing unskilled. The singing score is raised as the consonant length at the mode increases and lowered as it decreases.
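  • The frequency distribution and its mode can be sketched as below. The 10 ms bin width, the tie-breaking rule, and the function names are assumptions made for illustration; the source does not specify how the distribution is binned.

```python
from collections import Counter

def length_histogram(lengths_ms, bin_width_ms=10):
    """Frequency distribution: bin start (ms) -> number of consonants."""
    return Counter(int(x // bin_width_ms) * bin_width_ms for x in lengths_ms)

def modal_length(lengths_ms, bin_width_ms=10):
    """Start of the most frequent bin; ties resolve to the shortest bin."""
    hist = length_histogram(lengths_ms, bin_width_ms)
    top = max(hist.values())
    return min(start for start, count in hist.items() if count == top)

# Hypothetical consonant lengths in milliseconds.
lengths = [41, 43, 47, 62, 95]
histogram = length_histogram(lengths)
mode_bin = modal_length(lengths)
```

  • The mode-based index is then the value of `mode_bin`: a longer modal consonant length is read as better groove.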
  • The singing may also be evaluated by judging the groove from at least two of the variation-based index, the average-based index, and the mode-based index.
  • For example, the variation-based index and the average-based index may each be expressed as a numerical value, and the result of applying arithmetic operations (addition, subtraction, multiplication, or division) to these values may be used to judge the groove and evaluate the singing.
  • The numerical values representing these indices may be set so that they increase as the variation and the average value increase, and their average or weighted average may be used.
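  • A weighted average of several indices might look like the following sketch. The index names, the assumption that each index is already scaled to a common range, and the weights are hypothetical; the source only states that an average or weighted average may be used.

```python
def combined_index(indices, weights=None):
    """Weighted average of index values that all grow with better groove."""
    if weights is None:
        weights = {name: 1.0 for name in indices}  # plain average
    total_weight = sum(weights[name] for name in indices)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(indices[name] * weights[name] for name in indices) / total_weight

# Hypothetical index values already scaled to the range 0..1.
indices = {"variation": 0.8, "average": 0.4}
plain = combined_index(indices)
weighted = combined_index(indices, {"variation": 3.0, "average": 1.0})
```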
  • The singing evaluation apparatus has the configuration shown in FIG. 1.
  • The singing evaluation device 10 includes a consonant length measurement unit 11, a consonant length distribution creation unit 12, and an evaluation unit 13.
  • The consonant length measurement unit 11 receives singing data obtained by recording the singer's singing.
  • The consonant length measurement unit 11 detects consonants in the singing data using a known method.
  • The consonant length measurement unit 11 measures the consonant length, that is, the duration of each consonant.
  • The consonant length measurement unit 11 outputs the measured consonant lengths to the consonant length distribution creation unit 12.
  • An example of a known method for detecting consonants is disclosed in Japanese Patent Application Laid-Open No. 2008-32933.
  • This document discloses a method that determines sections with periodicity to be vowel sections and the other sections to be consonant sections.
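  • As a rough illustration of separating periodic from non-periodic sections, the sketch below labels a frame by the peak of its normalized autocorrelation. This is not the method of the cited publication, whose details are not given here; the autocorrelation measure, the lag range, the 0.5 threshold, and the synthetic test frames are all assumptions.

```python
import math
import random

def periodicity(frame, min_lag, max_lag):
    """Peak of the energy-normalized autocorrelation over a lag range."""
    energy = sum(x * x for x in frame)
    if energy == 0:
        return 0.0  # silent frame: no periodicity to measure
    best = 0.0
    for lag in range(min_lag, max_lag + 1):
        acf = sum(frame[i] * frame[i + lag] for i in range(len(frame) - lag))
        best = max(best, acf / energy)
    return best

def label_frame(frame, min_lag=10, max_lag=40, threshold=0.5):
    """Label a frame 'vowel' if periodic, otherwise 'consonant'."""
    return "vowel" if periodicity(frame, min_lag, max_lag) >= threshold else "consonant"

# Synthetic frames: a sine wave (periodic) and seeded noise (aperiodic).
sine = [math.sin(2 * math.pi * n / 20) for n in range(200)]
rng = random.Random(0)
noise = [rng.uniform(-1.0, 1.0) for _ in range(200)]
```

  • In this sketch a silent frame would also be labeled as a consonant, so a practical detector would need to treat silence separately before measuring consonant lengths.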
  • GUI (Graphical User Interface)
  • It is also conceivable to identify consonant sections manually while listening to the singing data, without using a GUI, and to measure the length of each identified consonant section.
  • The consonant length distribution creation unit 12 stores the input consonant lengths over a preset time interval.
  • The stored time interval may be, for example, the whole piece of music or a predetermined single phrase. When a predetermined phrase is used, score data need only be acquired separately and referred to.
  • The consonant length distribution creation unit 12 creates a frequency distribution of the consonant lengths.
  • The consonant length distribution creation unit 12 outputs the frequency distribution of the consonant lengths to the evaluation unit 13.
  • The evaluation unit 13 determines whether the groove is good or poor from the frequency distribution of the consonant lengths and evaluates the singing based on the groove.
  • The criteria for judging the groove are as described above. Specifically, the evaluation unit 13 calculates an index based on the consonant length variation (for example, the variance or standard deviation) from the frequency distribution of the consonant lengths and, as described above, determines from this index whether the groove is good or poor. As also described above, the evaluation unit 13 may judge the groove using the average consonant length or the consonant length at which the frequency is maximal (the mode).
  • With the configuration of the present embodiment, the quality of the groove can be determined accurately, and the singing can be evaluated with the groove taken into account. That is, the evaluation comes closer to how a singer or listener perceives skilled and unskilled singing.
  • The following is one conceivable method of singing evaluation by the evaluation unit 13. For example, singing data for the same song at various levels, from skilled singers to unskilled singers, is collected and stored in a storage device.
  • A computer reads each piece of singing data from the storage device.
  • The staged evaluation of the variance value is not limited to 10 stages; a finer scale (20 stages, 50 stages, etc.) or a coarser scale (8 stages, 5 stages, etc.) may be adopted.
  • The intervals between the stages used in such a staged evaluation may be equal. Alternatively, weighted intervals may be used (for example, the interval at 5 points differs from the interval at 10 points).
  • Alternatively, the intervals between the stages may be set equal and the points awarded at each stage weighted (for example, in a three-stage evaluation, 5 points for the lowest stage, 10 points for the middle stage, and 30 points for the highest stage).
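  • A staged evaluation with equal intervals and weighted points can be sketched as follows. The variance range of 0 to 300, the rule for values that fall exactly on a threshold, and the function names are assumptions; the 5/10/30 point weights follow the three-stage example in the text.

```python
from bisect import bisect_right

def equal_boundaries(low, high, stages):
    """Interior thresholds that split [low, high] into equal-width stages."""
    step = (high - low) / stages
    return [low + step * i for i in range(1, stages)]

def stage_of(value, boundaries):
    """1-based stage of a value; values on a threshold fall in the upper stage."""
    return bisect_right(boundaries, value) + 1

# Hypothetical three-stage setup over a variance range of 0..300.
boundaries = equal_boundaries(0, 300, 3)
points_per_stage = {1: 5, 2: 10, 3: 30}  # weighted points per stage

def points_for(variance):
    """Points awarded for a variance value under the setup above."""
    return points_per_stage[stage_of(variance, boundaries)]
```

  • Weighted intervals, the other option mentioned, would correspond to passing an unevenly spaced `boundaries` list to `stage_of` instead of the output of `equal_boundaries`.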
  • FIG. 3 is a flowchart of the singing evaluation method according to the first embodiment of the present invention.
  • The singing evaluation device 10 acquires singing data and measures the consonant length of each consonant included in the singing data (S101).
  • The singing evaluation device 10 creates a distribution of the acquired consonant lengths (S102).
  • The singing evaluation device 10 calculates an index based on the consonant length variation from the consonant length distribution, determines whether the groove is good or poor using this index, and evaluates the singing (S103).
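  • The three steps S101 to S103 can be sketched end to end as below, assuming the singing data has already been segmented into labeled consonant and vowel sections. The segment format `(label, start_ms, end_ms)`, the 20 ms bin width, and the choice of standard deviation as the index are assumptions for illustration.

```python
from collections import Counter
from statistics import pstdev

def measure_consonant_lengths(segments):
    """S101: lengths of segments labeled 'C'; segments are (label, start_ms, end_ms)."""
    return [end - start for label, start, end in segments if label == "C"]

def create_distribution(lengths_ms, bin_width_ms=20):
    """S102: frequency distribution keyed by bin index."""
    return Counter(length // bin_width_ms for length in lengths_ms)

def variation_index(lengths_ms):
    """S103: standard deviation of the lengths as the variation-based index."""
    return pstdev(lengths_ms)

# Hypothetical labeled segments: 'C' consonant, 'V' vowel (times in ms).
segments = [("C", 0, 50), ("V", 50, 400), ("C", 500, 620), ("V", 620, 1000)]
lengths = measure_consonant_lengths(segments)
distribution = create_distribution(lengths)
index = variation_index(lengths)
```

  • For simplicity the index here is computed from the raw lengths rather than from the binned distribution; both carry the same spread information up to the bin resolution.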
  • In the flow described above, the quality of the groove is determined from the consonant length variation. As described above, the quality of the groove can also be determined using the average value or the mode of the consonant length.
  • FIG. 4 is a block diagram showing the main configuration of the singing evaluation apparatus according to the second embodiment of the present invention.
  • The vowel pronunciation timing acquisition unit 21 receives the same singing data as the consonant length measurement unit 11.
  • The vowel pronunciation timing acquisition unit 21 detects vowels by a known method and detects the pronunciation start timing of each vowel. Specifically, for a vowel preceded by a consonant, the vowel pronunciation timing acquisition unit 21 detects the timing at which the consonant switches to the vowel. When vowels are consecutive, it detects the timing at which one vowel switches to the next. When a vowel is sounded from silence, it detects the timing at which the vowel begins. The vowel pronunciation timing acquisition unit 21 outputs the detected pronunciation start timing of each vowel to the vowel pronunciation timing distribution creation unit 22.
  • The vowel pronunciation timing distribution creation unit 22 receives the pronunciation start timing of each vowel and the beat timing of the song being sung.
  • The vowel pronunciation timing distribution creation unit 22 compares the beat timings with the vowel pronunciation start timings. In doing so, it associates the pronunciation start timing of each vowel with the timing of the nearest beat.
  • FIG. 5 is a diagram showing the concept of associating the vowel pronunciation start timings with the beat timings. In the case of FIG. 5, for example, the pronunciation start timing of the vowel V01 is closest to the timing of the first beat.
  • The vowel pronunciation timing distribution creation unit 22 therefore sets the beat corresponding to the pronunciation start timing of the vowel V01 to the first beat. Similarly, in the case of FIG. 5, it sets the beat corresponding to the vowel V02 to the third beat, the vowel V03 to the fourth beat, the vowel V04 to the fifth beat, the vowel V05 to the seventh beat, and the vowel V06 to the eighth beat.
  • The vowel pronunciation timing distribution creation unit 22 calculates the time difference between each vowel pronunciation start timing and its corresponding beat timing, creates a distribution of these time differences, and outputs the distribution to the evaluation unit 13.
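  • The nearest-beat association and the resulting time differences can be sketched as follows. The beat grid, the onset times, and the function names are hypothetical; the onsets were chosen only so that the assignment mirrors the beat numbers described for FIG. 5, and they are not values read from the figure.

```python
def nearest_beat(onset_s, beat_times_s):
    """Return (1-based beat number, onset minus beat time) for the closest beat."""
    idx = min(range(len(beat_times_s)), key=lambda i: abs(beat_times_s[i] - onset_s))
    return idx + 1, onset_s - beat_times_s[idx]

def timing_differences(onsets_s, beat_times_s):
    """Signed time difference of each vowel onset from its nearest beat."""
    return [nearest_beat(t, beat_times_s)[1] for t in onsets_s]

# Hypothetical beat grid (0.5 s apart) and vowel onsets in seconds.
beats = [0.5 * k for k in range(8)]
onsets = [0.02, 1.04, 1.48, 2.01, 2.97, 3.52]
assigned = [nearest_beat(t, beats)[0] for t in onsets]
differences = timing_differences(onsets, beats)
```

  • A positive difference means the vowel starts after the beat and a negative one means it starts before; the distribution of these values is what the evaluation unit receives.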
  • FIG. 6 is a frequency distribution diagram showing the relationship between singing skill and the distribution of the time difference between the vowel pronunciation start timing and the beat timing. As shown in FIG. 6, when the singing is skilled, the variation in the time difference between the vowel pronunciation start timing and the beat timing is small; when the singing is unskilled, the variation is large.
  • When the singing is skilled, the mode of the time difference between the vowel pronunciation start timing and the beat timing is approximately 0; when the singing is unskilled, the mode deviates greatly from 0.
  • The evaluation unit 13 detects the variation in the time difference and judges the singing to be more skilled the smaller the variation.
  • The evaluation unit 13 detects the mode of the time difference and judges the singing to be more skilled the closer the mode is to 0.
  • The evaluation unit 13 may determine the skill of the singing using both the variation and the mode of the time difference.
  • The evaluation unit 13 combines the evaluation result based on the time difference between the vowel pronunciation start timing and the beat timing with the evaluation result based on the consonant length described above, and evaluates the singing comprehensively. This allows the skill of the singing to be determined more accurately.
  • The evaluation unit 13 may perform the evaluation based on the consonant length only when the singing is judged skilled by the evaluation based on the time difference between the vowel pronunciation start timing and the beat timing. In this case, the evaluation unit 13 skips the evaluation based on the consonant length when it is considered unnecessary, which reduces the processing load of the singing evaluation.
  • FIG. 7 is a flowchart of the singing evaluation method according to the second embodiment of the present invention.
  • FIG. 7 shows the case in which whether to perform the evaluation based on the consonant length is switched according to the result of the evaluation based on the time difference between the vowel pronunciation timing and the beat timing.
  • The singing evaluation device 10A acquires singing data and measures the consonant length of each consonant included in the singing data (S201). Separately from the consonant length measurement, the singing evaluation device 10A detects the pronunciation start timing of each vowel included in the singing data (S202). Next, the singing evaluation device 10A creates a distribution of the time difference between the pronunciation start timing of each vowel and the timing of the corresponding beat (S203).
  • When the singing evaluation device 10A detects that the time differences are concentrated near 0 (S204: YES), it calculates an index based on the consonant length variation from the consonant length distribution, uses this index to judge whether the groove is good or poor, and evaluates the singing (S205).
  • When the singing evaluation device 10A detects that the time differences are not concentrated near 0 (S204: NO), it does not perform the evaluation based on the consonant length and evaluates the singing as unskilled.
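  • The branch at S204 can be sketched as a simple gate. What counts as "concentrated near 0" is not quantified in the text, so the 50 ms tolerance and the 80% fraction below are assumptions, as are the function names and the returned tuple.

```python
from statistics import pstdev

def concentrated_near_zero(diffs_s, tolerance_s=0.05, min_fraction=0.8):
    """S204: True when most time differences lie within the tolerance of 0."""
    near = sum(1 for d in diffs_s if abs(d) <= tolerance_s)
    return near / len(diffs_s) >= min_fraction

def evaluate_singing(diffs_s, consonant_lengths_ms):
    """Return (timing_ok, variation index or None)."""
    if not concentrated_near_zero(diffs_s):
        return False, None  # S204: NO, judged unskilled and S205 skipped
    return True, pstdev(consonant_lengths_ms)  # S205: consonant-length index

# Hypothetical inputs: time differences in seconds, consonant lengths in ms.
tight = evaluate_singing([0.01, -0.02, 0.03, 0.0, 0.2], [40, 80])
loose = evaluate_singing([0.2, 0.3, -0.25, 0.01], [40, 80])
```

  • Skipping the consonant-length computation in the second case is what gives the reduced processing load mentioned for this embodiment.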
  • FIG. 8 is a block diagram showing the main configuration of the singing evaluation apparatus according to the third embodiment of the present invention.
  • The singing evaluation device 10B of the present embodiment differs from the singing evaluation device 10A of the second embodiment in the connections of the vowel pronunciation timing acquisition unit 21 and the vowel pronunciation timing distribution creation unit 22; the rest of the configuration is the same. Therefore, only the differences from the singing evaluation device 10A according to the second embodiment are described specifically.
  • The vowel pronunciation timing acquisition unit 21 outputs the detected vowel pronunciation start timings to the consonant length distribution creation unit 12 as well as to the vowel pronunciation timing distribution creation unit 22.
  • The vowel pronunciation timing distribution creation unit 22 outputs the beat corresponding to each vowel pronunciation start timing to the consonant length distribution creation unit 12.
  • The consonant length distribution creation unit 12 detects the beat corresponding to each consonant from the input vowel pronunciation timings and their corresponding beats. Specifically, in the case of FIG. 5, the consonant length distribution creation unit 12 detects that the beat corresponding to the consonant C01 attached to the vowel V01 is the first beat. Similarly, it detects that the beat corresponding to the consonant C02 attached to the vowel V02 is the third beat, that the beat corresponding to the consonant C04 attached to the vowel V04 is the fifth beat, that the beat corresponding to the consonant C05 attached to the vowel V05 is the seventh beat, and that the beat corresponding to the consonant C06 attached to the vowel V06 is the eighth beat.
  • The consonant length distribution creation unit 12 creates a consonant length distribution for each beat.
  • FIG. 9 is a frequency distribution diagram showing the consonant length distribution for each beat according to the third embodiment of the present invention.
  • The consonant length distribution creation unit 12 outputs the consonant length distribution for each beat to the evaluation unit 13.
  • The evaluation unit 13 uses the consonant length distribution for each beat to determine whether the groove is good or poor and evaluates the singing.
  • The evaluation unit 13 judges the quality of the groove from the difference between the variance or standard deviation of the consonant length in the beat with the smallest consonant length variation and that in the beat with the largest consonant length variation. In the specific example shown in FIG. 9, the quality of the groove is determined from the difference between the variance or standard deviation of the first beat and that of the sixth beat.
  • As this difference, the evaluation unit 13 may use not only the arithmetic difference between the variances or standard deviations but also the arithmetic ratio between the variance or standard deviation of the consonant length in the beat with the smallest variation and that in the beat with the largest variation. The evaluation unit 13 judges the groove to be better the larger the difference and poorer the smaller the difference.
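  • The per-beat comparison can be sketched as below. The input format of `(beat_number, length_ms)` pairs, the use of the standard deviation, the rule of skipping beats with fewer than two consonants, and the sample values are assumptions introduced for illustration.

```python
from collections import defaultdict
from statistics import pstdev

def per_beat_spread(consonants):
    """Standard deviation of consonant length per beat; input is (beat, length_ms)."""
    by_beat = defaultdict(list)
    for beat, length in consonants:
        by_beat[beat].append(length)
    # Beats with a single consonant carry no spread information and are skipped.
    return {beat: pstdev(vals) for beat, vals in by_beat.items() if len(vals) >= 2}

def spread_contrast(spreads):
    """Return (difference, ratio) between the largest and smallest per-beat spread."""
    low, high = min(spreads.values()), max(spreads.values())
    ratio = high / low if low > 0 else float("inf")
    return high - low, ratio

# Hypothetical consonant lengths (ms) tagged with their beat numbers.
consonants = [(1, 40), (1, 40), (1, 42), (6, 30), (6, 90), (6, 60), (3, 55)]
spreads = per_beat_spread(consonants)
difference, ratio = spread_contrast(spreads)
```

  • Either `difference` or `ratio` can serve as the index; in both cases a larger value is read as better groove.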
  • FIG. 10 is a flowchart of the singing evaluation method according to the third embodiment of the present invention.
  • FIG. 10 shows the case in which singing evaluation based on the time difference between the vowel pronunciation timing and the beat timing is not performed.
  • Singing evaluation based on the time difference between the vowel pronunciation timing and the beat timing may or may not be performed.
  • The singing evaluation device 10B acquires singing data and measures the consonant length of each consonant included in the singing data (S301). Separately from the consonant length measurement, the singing evaluation device 10B detects the pronunciation start timing of each vowel included in the singing data (S302). Next, the singing evaluation device 10B associates the pronunciation start timing of each vowel with the timing of the corresponding beat (S303). The singing evaluation device 10B creates a consonant length distribution for each beat (S304). The singing evaluation device 10B determines whether the groove is good or poor using the variation of the consonant length distribution for each beat, and evaluates the singing (S305).
  • FIG. 11 is a block diagram showing the main configuration of the singing evaluation apparatus according to the fourth embodiment of the present invention.
  • The singing evaluation apparatus 10C of the present embodiment adds a determination target detection unit 23 to the singing evaluation apparatus 10B of the third embodiment, and the processing of the consonant length distribution creation unit 12 differs. Therefore, only the differences from the singing evaluation apparatus 10B according to the third embodiment are described specifically.
  • the determination target detection unit 23 includes a sound form analysis unit, a rhythm analysis unit, and a specific section extraction unit.
  • the determination target detection unit 23 receives at least one of score data and music data.
  • the musical score data includes the pitch, volume, and sounding timing of the song to be sung.
  • the music data includes the composition of music such as a chorus section, the genre of music, and the like.
  • the determination target detection unit 23 analyzes the score data and sets a consonant to be used for determination of the consonant length.
  • FIG. 12 is a diagram showing a specific example of a consonant setting concept used for determination of consonant length.
  • FIG. 12A is a diagram showing a consonant setting concept based on sound type, and FIG. 12B is a diagram showing a consonant setting concept based on rhythm.
  • When the sound type analysis unit of the determination target detection unit 23 detects, from the score data, a sound type to be used for the determination of the consonant length, it sets the section of that sound type and the timing of the consonant to be used for the determination. For example, as shown in FIG. 12A, when a rising sound type is detected, the consonant of the third sound among three sounds whose pitches rise continuously is set as the consonant used for the determination of the consonant length. The timing of that consonant is given to the consonant length distribution creation unit 12.
  • When the rhythm analysis unit of the determination target detection unit 23 detects, from the score data, a rhythm to be used for determining the consonant length, it sets the rhythm section and the timing of the consonant to be used for the determination.
  • the consonant of the third sound the sound having a short sound length
  • the consonant length of the consonant is given to the consonant length distribution creation unit 12.
  • The sound type and rhythm of a section of three consecutive sounds are specified here, but any section composed of two or more sounds may be used.
  • The smaller the number of sounds in the section, the more often the same sound type and the same rhythm occur in the song being sung. A more accurate consonant length distribution is therefore easier to create, which is more useful.
  • The consonant of the head or middle sound of the section, rather than the last sound, may be set as the consonant used for the determination of the consonant length. This allows a more appropriate evaluation according to the sound type and rhythm.
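The selection of target consonants by sound type can be sketched as below. This is only an illustration under assumptions: the function name `find_target_sounds`, the representation of the score as a list of MIDI note numbers, and the parameters `section_length` and `target_offset` are hypothetical, and only the "rising" sound type is handled.

```python
def find_target_sounds(pitches, section_length=3, target_offset=-1):
    """Return indices of the sounds whose consonants would be evaluated.

    A section is `section_length` consecutive sounds with strictly rising
    pitch. `target_offset` selects one sound inside each section
    (-1 = last sound, 0 = head sound). All names are hypothetical.
    """
    if section_length < 2:
        raise ValueError("a section needs at least two sounds")
    targets = []
    for start in range(len(pitches) - section_length + 1):
        section = pitches[start:start + section_length]
        if all(a < b for a, b in zip(section, section[1:])):
            # The modulo maps a negative offset to a position in the section.
            targets.append(start + (target_offset % section_length))
    return targets


melody = [60, 62, 64, 62, 64, 65, 67, 60]  # MIDI note numbers (assumed)
three_sound = find_target_sounds(melody)                   # third sound of each rising triple
two_sound = find_target_sounds(melody, section_length=2)   # second sound of each rising pair
head_sound = find_target_sounds(melody, target_offset=0)   # head sound of each rising triple
```

For this melody the two-sound sections yield more target consonants than the three-sound sections, which matches the observation that shorter sections occur more often and so give more samples for the distribution.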
  • the determination target detection unit 23 can analyze music data and set a consonant to be used for determination of the consonant length. For example, the determination target detection unit 23 detects a chorus section from the music data, and gives the time of the chorus section to the consonant length distribution creation unit 12.
  • The consonant length distribution creation unit 12 refers to the timing or section given from the determination target detection unit 23 and to the timing of each consonant obtained from the vowel sounding timing detection unit 21 and the vowel sounding timing distribution creation unit 22, and uses the consonant lengths measured by the consonant length measurement unit 11 to create a consonant length distribution for the given timing or section.
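Restricting the distribution to a given section, such as a chorus, might look like the following sketch. The function name `consonant_lengths_in_sections`, the `(onset_time, length)` and `(start, end)` layouts, and the half-open interval convention are assumptions made for the example rather than details taken from the description.

```python
def consonant_lengths_in_sections(consonants, sections):
    """Keep the lengths of consonants whose onset lies in any target section.

    `consonants`: list of (onset_time, length) pairs in seconds.
    `sections`: list of (start, end) times, treated as half-open [start, end).
    Both layouts are assumptions for this sketch.
    """
    return [
        length
        for onset, length in consonants
        if any(start <= onset < end for start, end in sections)
    ]


# Illustrative data only: one chorus section from 12 s to 20 s.
consonants = [(1.0, 0.05), (12.5, 0.08), (13.2, 0.11), (40.0, 0.06)]
chorus = [(12.0, 20.0)]
selected = consonant_lengths_in_sections(consonants, chorus)
```

The resulting list contains only the consonant lengths that fall inside the chorus, and it is this subset that would feed the consonant length distribution.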
  • FIG. 13 is a flowchart of the singing evaluation method according to the fourth embodiment of the present invention. Note that FIG. 13 shows a case in which whether or not to perform singing evaluation based on the consonant length is switched depending on the result of singing evaluation based on the time difference between the vowel sounding timing and the beat timing. The singing evaluation based on the time difference from the beat timing need not be performed.
  • The singing evaluation apparatus 10C acquires singing data and measures the consonant length of each consonant included in the singing data (S401). Separately from the measurement of the consonant length, the singing evaluation apparatus 10C analyzes the score data or the music data, and sets the timing or section that is the creation target of the consonant length distribution based on the sound type, rhythm, or specific section (S402). In addition to the measurement of the consonant length and the setting of the creation target of the consonant length distribution, the singing evaluation apparatus 10C detects the pronunciation start timing of each vowel included in the singing data (S403). Next, the singing evaluation apparatus 10C creates a distribution of the time difference between the sounding start timing of each vowel and the timing of the corresponding beat (S404).
  • When the singing evaluation apparatus 10C detects that the time differences are concentrated in the vicinity of 0 (S405: YES), it creates a consonant length distribution for the timing or section to be determined (S406).
  • the singing evaluation apparatus 10C calculates an index based on the variation of the consonant length from the distribution of the consonant length, determines whether the groove is good or bad using the index, and performs the singing evaluation (S407).
  • When the singing evaluation apparatus 10C detects that the time differences are not concentrated in the vicinity of 0 (S405: NO), singing evaluation by consonant length is not performed, and the singing is evaluated as having a poor groove. Alternatively, singing evaluation by consonant length may still be performed, with the groove evaluation greatly reduced.
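The branch at S405 can be sketched as a simple gate in front of the consonant length evaluation. Everything specific here is an assumption: the function name `evaluate_groove`, the 30 ms window `near_zero`, the 70 % threshold `min_fraction`, and the use of the population standard deviation as the variation index are illustrative values, not figures from the description.

```python
from statistics import pstdev


def evaluate_groove(time_diffs, consonant_lengths,
                    near_zero=0.03, min_fraction=0.7):
    """Sketch of S405 to S407 with assumed thresholds.

    time_diffs: vowel onset time minus beat time, in seconds.
    Returns None when the onsets are not concentrated near 0 (S405: NO),
    otherwise an index based on the variation of the consonant lengths.
    """
    if not time_diffs or not consonant_lengths:
        raise ValueError("both inputs must be non-empty")
    near = sum(1 for d in time_diffs if abs(d) <= near_zero)
    if near / len(time_diffs) < min_fraction:
        return None  # S405: NO; the consonant length evaluation is skipped
    return pstdev(consonant_lengths)  # S406 to S407: variation index


lengths = [0.04, 0.08, 0.12]
on_beat = evaluate_groove([0.0, 0.01, -0.02, 0.02, 0.1], lengths)
off_beat = evaluate_groove([0.1, -0.2, 0.15, 0.0], lengths)
```

Returning `None` models the variant in which the consonant length evaluation is skipped; the alternative mentioned in the text, lowering the groove score instead, would replace that branch with a scaled-down index.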
  • A consonant length distribution may also be created for each consonant; for example, each consonant may be identified and a consonant length distribution created for it.
  • a consonant category may be identified from the music data, and a consonant length distribution may be created for each consonant category.
  • consonants may be classified into categories of voiced and unvoiced sounds, or categories such as plosives and sibilants, and a consonant length distribution may be created for each category.
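A per-category distribution could be built along the following lines. The mapping `CATEGORY`, the label strings, the fallback category `"other"`, and the function name `spread_by_category` are hypothetical, and the population standard deviation is again only one possible measure of variation.

```python
from collections import defaultdict
from statistics import pstdev

# Hypothetical mapping from consonant label to category.
CATEGORY = {
    "p": "plosive", "t": "plosive", "k": "plosive",
    "s": "sibilant", "z": "sibilant", "sh": "sibilant",
}


def spread_by_category(consonants, category=CATEGORY):
    """Group consonant lengths by category and return the spread of each group.

    `consonants` is an assumed list of (label, length_in_seconds) pairs.
    Labels missing from the mapping fall into the "other" group.
    """
    groups = defaultdict(list)
    for label, length in consonants:
        groups[category.get(label, "other")].append(length)
    return {name: pstdev(values) for name, values in groups.items()}


# Illustrative data only.
result = spread_by_category(
    [("k", 0.04), ("t", 0.08), ("s", 0.10), ("s", 0.10), ("m", 0.05)]
)
```

Keeping the categories separate avoids mixing consonants whose typical lengths differ, so that the measured variation reflects the singer rather than the mix of consonant types.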
  • The mode of measuring the consonant length and determining the variation of the consonant length based on the singing sound collected by a microphone or the like has been shown, but the above-described configuration can also be applied to singing sound created by artificially imitating a human voice, and similar effects can be obtained.
  • The mode in which the consonant length is automatically measured has been shown, but the user may instead manually identify the consonant and the vowel while viewing the waveform, measure the consonant length, and input it to the device (to be acquired by the device).
  • The mode in which the vowel pronunciation start timing is automatically measured has been shown, but the user may instead manually detect the vowel pronunciation start timing while viewing the waveform and input data representing the detected timing to the device (to be acquired by the device). Alternatively, the user may listen to the singing sound, manually detect the vowel pronunciation start timing, and input data indicating the detected timing to the device (to be acquired by the device).
  • the vowel sound generation timing acquisition unit may use an operation input unit (not shown).
  • Singing evaluation device; 11: consonant length measurement unit; 12: consonant length distribution creation unit; 13: evaluation unit; 21: vowel pronunciation timing acquisition unit; 22: vowel pronunciation timing distribution creation unit; 23: determination target detection unit

Landscapes

  • Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Signal Processing (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Quality & Reliability (AREA)
  • Reverberation, Karaoke And Other Acoustics (AREA)
  • Auxiliary Devices For Music (AREA)

Abstract

The present invention relates to a singing evaluation device (10) equipped with a consonant length measurement unit (11), a consonant length distribution creation unit (12), and an evaluation unit (13). The consonant length measurement unit (11) detects consonants in singing data and measures the length of each consonant. The consonant length distribution creation unit (12) stores the input consonant lengths over a prescribed period of time and then creates a frequency distribution of the consonant length. The evaluation unit (13) determines whether the singing has a good groove by using the fact that the groove is better when the variation in consonant length is larger and worse when the variation in consonant length is smaller. Specifically, the evaluation unit (13) calculates an index based on the variation in consonant length from the frequency distribution of the consonant length, and then determines whether the singing has a good groove from the magnitude of that index.
PCT/JP2015/051731 2014-01-23 2015-01-22 Singing evaluation device, method, and program WO2015111671A1 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
JP2014-010223 2014-01-23
JP2014010223A JP6304650B2 (ja) 2014-01-23 2014-01-23 Singing evaluation device

Publications (1)

Publication Number Publication Date
WO2015111671A1 (fr) 2015-07-30

Family

ID=53681472

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2015/051731 WO2015111671A1 (fr) 2014-01-23 2015-01-22 Singing evaluation device, method, and program

Country Status (2)

Country Link
JP (1) JP6304650B2 (fr)
WO (1) WO2015111671A1 (fr)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2008145940A (ja) * 2006-12-13 2008-06-26 Yamaha Corp Voice evaluation device and voice evaluation method
JP2009258366A (ja) * 2008-04-16 2009-11-05 Arcadia:Kk Voice control device
JP2013195738A (ja) * 2012-03-21 2013-09-30 Yamaha Corp Singing evaluation device

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2009003347A1 (fr) * 2007-06-29 2009-01-08 Multak Technology Development Co., Ltd Karaoke apparatus

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2017068990A1 (fr) * 2015-10-22 2017-04-27 ヤマハ株式会社 Musical sound evaluation device, evaluation criteria generating device, and recording medium
US10453435B2 (en) 2015-10-22 2019-10-22 Yamaha Corporation Musical sound evaluation device, evaluation criteria generating device, method for evaluating the musical sound and method for generating the evaluation criteria
CN113096689A (zh) * 2021-04-02 2021-07-09 腾讯音乐娱乐科技(深圳)有限公司 Song singing evaluation method, device, and medium
CN114678039A (zh) * 2022-04-13 2022-06-28 厦门大学 Singing evaluation method based on deep learning

Also Published As

Publication number Publication date
JP6304650B2 (ja) 2018-04-04
JP2015138177A (ja) 2015-07-30

Similar Documents

Publication Publication Date Title
Sundberg et al. Acoustical study of classical Peking Opera singing
EP2838082A1 (fr) Voice analysis method and device, voice synthesis method and device, and medium storing a voice analysis program
JP6759545B2 (ja) Evaluation device and program
JP4479701B2 (ja) Music practice support device, dynamic time alignment module, and program
JP5196550B2 (ja) Chord detection device and chord detection program
WO2015111671A1 (fr) Singing evaluation device, method, and program
CN108292499A (zh) Skill determination device and recording medium
JP2007334364A (ja) Karaoke device
JP2007156330A (ja) Karaoke device with compatibility judgment function
JP2008065153A (ja) Music structure analysis method, program, and device
JP6098422B2 (ja) Information processing device and program
JP2008015388A (ja) Singing ability evaluation method and karaoke device
Molina et al. Automatic scoring of singing voice based on melodic similarity measures
JP5447624B2 (ja) Karaoke device
TWI394141B (zh) Karaoke song accompaniment automatic scoring method
JP2008225115A (ja) Karaoke device, singing evaluation method, and program
Delviniotis Acoustic characteristics of modern Greek Orthodox Church music
JP4218066B2 (ja) Karaoke device and program for karaoke device
JP2016180965A (ja) Evaluation device and program
Bonjyotsna et al. Analytical study of vocal vibrato and mordent of Indian popular singers
JP2008040258A (ja) Music practice support device, dynamic time alignment module, and program
JP5034642B2 (ja) Karaoke device
JP5416396B2 (ja) Singing evaluation device and program
JP2008015212A (ja) Pitch change amount extraction method, pitch reliability calculation method, vibrato detection method, singing training program, and karaoke device
JP2007225916A (ja) Authoring device, authoring method, and program

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 15740439

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 15740439

Country of ref document: EP

Kind code of ref document: A1