WO2015137360A1 - Singing analyzer - Google Patents

Singing analyzer

Info

Publication number
WO2015137360A1
Authority
WO
WIPO (PCT)
Prior art keywords
singing
advice
processing unit
voice
music
Prior art date
Application number
PCT/JP2015/057063
Other languages
French (fr)
Japanese (ja)
Inventor
Shuichi Matsumoto (松本 秀一)
Original Assignee
Yamaha Corporation (ヤマハ株式会社)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Yamaha Corporation
Publication of WO2015137360A1

Classifications

    • G - PHYSICS
    • G10 - MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L - SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 - Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/48 - Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use
    • G10L25/51 - Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for comparison or discrimination
    • G - PHYSICS
    • G10 - MUSICAL INSTRUMENTS; ACOUSTICS
    • G10H - ELECTROPHONIC MUSICAL INSTRUMENTS; INSTRUMENTS IN WHICH THE TONES ARE GENERATED BY ELECTROMECHANICAL MEANS OR ELECTRONIC GENERATORS, OR IN WHICH THE TONES ARE SYNTHESISED FROM A DATA STORE
    • G10H1/00 - Details of electrophonic musical instruments
    • G10H1/36 - Accompaniment arrangements
    • G10H1/361 - Recording/reproducing of accompaniment for use with an external source, e.g. karaoke systems
    • G10H1/366 - Recording/reproducing of accompaniment for use with an external source, e.g. karaoke systems with means for modifying or correcting the external signal, e.g. pitch correction, reverberation, changing a singer's voice
    • G - PHYSICS
    • G10 - MUSICAL INSTRUMENTS; ACOUSTICS
    • G10H - ELECTROPHONIC MUSICAL INSTRUMENTS; INSTRUMENTS IN WHICH THE TONES ARE GENERATED BY ELECTROMECHANICAL MEANS OR ELECTRONIC GENERATORS, OR IN WHICH THE TONES ARE SYNTHESISED FROM A DATA STORE
    • G10H2210/00 - Aspects or methods of musical processing having intrinsic musical character, i.e. involving musical theory or musical parameters or relying on musical knowledge, as applied in electrophonic musical tools or instruments
    • G10H2210/031 - Musical analysis, i.e. isolation, extraction or identification of musical elements or musical parameters from a raw acoustic signal or from an encoded audio signal
    • G10H2210/066 - Musical analysis for pitch analysis as part of wider processing for musical purposes, e.g. transcription, musical performance evaluation; Pitch recognition, e.g. in polyphonic sounds; Estimation or use of missing fundamental
    • G - PHYSICS
    • G10 - MUSICAL INSTRUMENTS; ACOUSTICS
    • G10H - ELECTROPHONIC MUSICAL INSTRUMENTS; INSTRUMENTS IN WHICH THE TONES ARE GENERATED BY ELECTROMECHANICAL MEANS OR ELECTRONIC GENERATORS, OR IN WHICH THE TONES ARE SYNTHESISED FROM A DATA STORE
    • G10H2210/00 - Aspects or methods of musical processing having intrinsic musical character, i.e. involving musical theory or musical parameters or relying on musical knowledge, as applied in electrophonic musical tools or instruments
    • G10H2210/031 - Musical analysis, i.e. isolation, extraction or identification of musical elements or musical parameters from a raw acoustic signal or from an encoded audio signal
    • G10H2210/091 - Musical analysis for performance evaluation, i.e. judging, grading or scoring the musical qualities or faithfulness of a performance, e.g. with respect to pitch, tempo or other timings of a reference performance

Definitions

  • the present invention relates to a technique for analyzing a singing voice.
  • Patent Document 1 discloses a technique for proposing, to a singer, music that matches the singer's individual preferences by registering music in advance for each of a plurality of groups into which singers are classified according to their past song-selection tendencies (that is, preferences).
  • Patent Document 1 uses each singer's song-selection tendency to propose music. However, if singing comments (pointers or advice) could be presented that take into account the tendency observed when many singers sing a piece of music (for example, a passage at which many singers fail) or the tendency of an individual singer (for example, a singer prone to pitch errors in the high range), or if comments such as evaluation results reflecting those tendencies could be presented to the singer, effective improvement of the singing could be expected. In view of the above circumstances, an object of the present invention is to present to a singer an appropriate comment according to the singer's singing tendency.
  • The singing analysis device of the present invention includes an analysis processing unit that specifies a comment corresponding to the tendency of the reference voices of the group, among a plurality of prerecorded reference voices, that corresponds to the singing voice of the target singer, and a presentation processing unit that presents the comment specified by the analysis processing unit to the target singer.
  • a comment appropriate for the singing voice of the target singer can be presented to the target singer. Therefore, there is an advantage that the singing of the target singer can be effectively improved.
  • the analysis processing unit specifies, as the comment, singing advice corresponding to a tendency of each reference voice of a group corresponding to the singing voice of the target singer.
  • Since the singing advice is specified according to the tendency of the reference voices of the group corresponding to the singing voice of the target singer, it is possible to present to the target singer singing advice appropriate for that singing voice.
  • The analysis processing unit refers to reference information that specifies singing advice for each of a plurality of groups into which a plurality of reference voices sharing the same music as the singing voice of the target singer are classified, and specifies the singing advice of the group to which the singing voice of the target singer belongs.
  • Since the singing advice of the group to which the target singer's singing voice belongs is specified by referring to reference information that specifies singing advice for each of a plurality of groups into which reference voices of the same music are classified, there is an advantage that suitable singing advice can be presented for each piece of music.
  • The analysis processing unit refers to reference information that specifies singing advice for each music attribute according to the tendency of the reference voices of the group consisting of the target singer's own voices among the plurality of reference voices, and specifies the singing advice for the portions of the music sung by the target singer that match the music attribute.
  • Since the singing advice for the locations corresponding to the music attribute within the music is specified by referring to reference information that specifies singing advice for each music attribute according to the tendency of the target singer's own reference voices, there is an advantage that suitable singing advice (for example, advice on each singer's weak points) can be presented for each target singer.
  • The reference information specifies singing advice with a specific interval between successive pitches as the music attribute, and the analysis processing unit searches the music sung by the target singer for locations where that specific interval occurs and specifies the singing advice designated for that interval in the reference information. With this configuration, effective singing advice for overcoming the weakness can be presented to a singer who is not good at singing pitch changes of the specific interval.
  • A specific example of the above aspect is described later as, for example, the second embodiment.
  • The analysis processing unit specifies, as the comment, an evaluation result obtained by evaluating the singing voice of the target singer according to the tendency of the reference voices of the group corresponding to that singing voice.
  • Since the evaluation result obtained by evaluating the singing voice of the target singer according to the tendency of the reference voices of the corresponding group is presented, an evaluation result appropriate for the singing voice of the target singer can be presented to the target singer.
  • A singing analysis device according to another aspect includes an analysis processing unit that evaluates the singing voice of the target singer according to the tendency of the reference voices of the group, among a plurality of prerecorded reference voices, that corresponds to that singing voice. According to the above configuration, since the singing voice is evaluated (typically scored) according to the singing tendency of the singer, there is an advantage that an evaluation that can effectively contribute to improving the singing can be realized.
  • FIG. 1 is a configuration diagram of a singing analysis apparatus 100 according to the first embodiment of the present invention.
  • the singing analysis device 100 is an information processing device for presenting advice (hereinafter referred to as “singing advice”) regarding the singing of music to a singer of the music (hereinafter referred to as “target singer”),
  • It is realized by a computer system including an arithmetic processing device 12, a storage device 14, a sound collection device 16, and a display device 18.
  • the singing analysis apparatus 100 is suitably used as a karaoke apparatus that reproduces accompaniment sounds of music, for example.
  • the sound collection device 16 is a device (microphone) that collects ambient sounds.
  • The sound collection device 16 of the first embodiment collects the singing voice V produced when the target singer sings a specific piece of music (hereinafter, the "target music").
  • A synthesized voice generated by a voice synthesis technique can also be used as the singing voice V.
  • The display device 18 (for example, a liquid crystal display panel) displays images as instructed by the arithmetic processing device 12.
  • the singing advice A of the target music is displayed on the display device 18. Specifically, at each time point during the song singing by the target singer, singing advice A suitable for the time point is sequentially displayed on the display device 18.
  • A sound emitting device (for example, a speaker) can also be provided.
  • the storage device 14 stores programs executed by the arithmetic processing device 12 and various data used by the arithmetic processing device 12.
  • a known recording medium such as a semiconductor recording medium or a magnetic recording medium or a combination of a plurality of types of recording media is arbitrarily employed as the storage device 14.
  • the storage device 14 of the first embodiment stores reference information DA for each of a plurality of music pieces. Each reference information DA is used to specify the singing advice A for the music.
  • FIG. 2 is an explanatory diagram of reference information DA for any one piece of music.
  • the reference voice group Q is used to generate the reference information DA.
  • the reference voice group Q is a set of a plurality of singing voices (hereinafter referred to as “reference voices”) R recorded in advance.
  • the plurality of reference sounds R included in the reference sound group Q are sounds in which an unspecified number of singers sang arbitrary music.
  • A plurality of reference voices R of an arbitrary piece of music (that is, reference voices sharing the same sung music) are classified into N groups G[1] to G[N] (N is a natural number of 2 or more).
  • The method of classifying the plurality of reference voices R into the N groups G[1] to G[N] is arbitrary, but classification from a musical viewpoint is preferable.
  • In the first embodiment, the reference voices R are classified into the N groups for each range of an evaluation index (singing scoring result) E, which indicates the difference between the reference voice R and the melody of the singing part of the music (for example, in 5-point increments on a 100-point scale).
  • The reference information DA of the first embodiment includes N pieces of unit information U[1] to U[N] corresponding to the different groups G[n] of the reference voices R.
  • Any one piece of unit information U[n] designates a plurality of time points within the music (hereinafter, "indicated time points") T (T1, T2, ...) and designates singing advice A (A1, A2, ...) for each indicated time point T. The content of the singing advice A is set individually for each indicated time point T.
  • The unit information U[n] of an arbitrary piece of music is generated in consideration of the musical tendency of the reference voices R classified into the group G[n] among the plurality of reference voices R of that music.
  • Each time point in the music at which many of the reference voices R of the group G[n] indicate that the singing should be improved is designated as an indicated time point T, and a character string expressing the content to be improved or advice (a suggestion) for the improvement is designated as the singing advice A.
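The reference information DA described above can be pictured as a small data structure: for each group G[n], a list of (indicated time point T, singing advice A) pairs. The following Python sketch is illustrative only; all names, times, and messages are hypothetical and not taken from the patent.

```python
from dataclasses import dataclass

@dataclass
class AdvicePoint:
    time_sec: float   # indicated time point T within the music
    advice: str       # singing advice A displayed at that time

# Reference information DA for one piece of music: unit information
# U[1]..U[N], one entry per group G[n] of reference voices.
reference_info_da = {
    1: [AdvicePoint(12.0, "Watch the pitch!"),
        AdvicePoint(45.5, "Watch the rhythm!")],
    2: [AdvicePoint(45.5, "Watch the rhythm!")],
}

def unit_info(group_index: int) -> list[AdvicePoint]:
    """Return the unit information U[n] for group G[n] (empty if absent)."""
    return reference_info_da.get(group_index, [])
```

A group index then yields the time series of advice for singers whose voices fall into that group.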
  • FIG. 3 shows a time series of average pitches P [n] over a plurality of reference sounds R of group G [n] and a time series of exemplary pitches P0 of music.
  • The exemplary pitch P0 is the time series of the pitches of the notes specified in the musical score of the music, or the time series of the average pitch of the reference voices R of the group G having the maximum evaluation index E.
  • the point in time when the difference (pitch error) between the average pitch P [n] of the group G [n] and the exemplary pitch P0 is maximized is designated as the indication time T.
  • A character string for improving the pitch error at that time point (for example, a message such as "Watch the pitch!") is designated as the singing advice A for each indicated time point T.
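The selection of the indicated time point T sketched above (the time at which the average pitch P[n] of group G[n] deviates most from the exemplary pitch P0) amounts to an argmax over the pitch error. A minimal illustrative sketch, assuming both pitch curves are sampled on the same time grid; the function name is hypothetical.

```python
def pick_indicated_time(avg_pitch, exemplary_pitch, times):
    """Return the time at which |P[n] - P0| is largest.

    avg_pitch: average pitch P[n] over the reference voices of group G[n]
    exemplary_pitch: exemplary pitch P0 (e.g. from the musical score)
    times: time stamp (seconds) of each sample
    """
    errors = [abs(p - p0) for p, p0 in zip(avg_pitch, exemplary_pitch)]
    max_index = errors.index(max(errors))
    return times[max_index]
```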
  • the arithmetic processing device 12 performs overall control of each element of the singing analysis device 100 by executing a program stored in the storage device 14.
  • the arithmetic processing device 12 has a plurality of functions (analysis processing unit 22 and presentation processing unit 24) for presenting singing advice A to a target singer who sings the target music.
  • a configuration in which each function of the arithmetic processing device 12 is distributed to a plurality of devices, or a configuration in which a dedicated electronic circuit realizes a part of the function of the arithmetic processing device 12 may be employed.
  • FIG. 4 is a flowchart of a process for the analysis processing unit 22 to specify the singing advice A (hereinafter referred to as “singing advice specifying process”).
  • the singing advice specifying process of FIG. 4 is started with the start of the singing of the target music (reproduction start of the accompaniment sound of the target music).
  • The analysis processing unit 22 determines whether or not the target music has ended (SA1). When the target music has not ended (SA1: NO), the analysis processing unit 22 selects one of a plurality of sections (of fixed length or variable length) into which the music is divided on the time axis (hereinafter, the "selected section"), and acquires the singing voice V of the selected section from the sound collection device 16 (SA2). Each time step SA2 is executed, the analysis processing unit 22 selects the sections of the music in order from beginning to end and acquires the singing voice V of the selected section.
  • The analysis processing unit 22 specifies the group (hereinafter, the "belonging group") G to which the singing voice V of the selected section belongs among the N groups G[1] to G[N] into which the plurality of reference voices R of the target music are classified (SA3). Specifically, the analysis processing unit 22 calculates the evaluation index E for the singing voice V of the selected section and, among the N groups G[1] to G[N] corresponding to different ranges of the evaluation index E, specifies as the belonging group G the one group G[n] whose range contains the evaluation index E of the singing voice V of the selected section.
  • the analysis processing unit 22 selects the unit information U corresponding to the group G identified in step SA3 from the N unit information U [1] to U [N] of the reference information DA stored in the storage device 14. (SA4). That is, the analysis processing unit 22 identifies each singing advice A of the group G to which the singing voice V of the target singer belongs among the N groups G [1] to G [N].
  • the analysis processing unit 22 moves the process to step SA1. Therefore, until the target music ends (SA1: YES), the belonging group G is sequentially updated for each section of the target music, and the unit information U (the time series of the singing advice A) corresponding to the updated belonging group G is obtained. It is specified sequentially.
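The flow of steps SA1 to SA4 can be sketched as a loop over sections: score each section, map the score to a group by its range, and look up that group's unit information. A hypothetical sketch; the 5-point group ranges and all names are assumptions for illustration.

```python
def group_for_score(score_e: float, band_width: float = 5.0,
                    n_groups: int = 20) -> int:
    """Map an evaluation index E (0-100) to a group index G[1]..G[N],
    assuming groups correspond to consecutive 5-point score ranges."""
    index = int(score_e // band_width) + 1
    return min(index, n_groups)

def advice_for_sections(section_scores, reference_info_da):
    """For each section's evaluation index E, update the belonging group G
    and collect that group's unit information (steps SA2-SA4 repeated)."""
    advice_series = []
    for score_e in section_scores:               # SA2: next selected section
        group = group_for_score(score_e)         # SA3: specify belonging group G
        unit = reference_info_da.get(group, [])  # SA4: select unit information U
        advice_series.append((group, unit))
    return advice_series
```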
  • In addition to the singing advice A of each piece of unit information U[n] of the reference information DA, singing advice A prepared in advance regardless of the group (for example, a general message such as "Sing with emotion") can also be presented to the target singer.
  • The presentation processing unit 24 displays, on the display device 18, the singing advice A that the unit information U specified by the analysis processing unit 22 designates for each indicated time point T, at a time that precedes that indicated time point T by a predetermined time. That is, the points to be improved according to the reference voices R of the group G to which the singing voice V belongs (that is, the locations at which the singing voice V of the target singer is presumed to similarly need improvement) are sequentially indicated to the target singer.
  • Therefore, the target singer can sing with particular attention to the portions of the target music at which failure is likely.
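The presentation timing (display each piece of advice a fixed lead time before its indicated time point T) can be sketched as computing a display schedule from the unit information. The lead-time value and names below are assumptions for illustration.

```python
LEAD_TIME_SEC = 3.0  # assumed lead time before each indicated time point T

def display_schedule(indicated_points):
    """Given (time T, advice A) pairs, return (display time, advice) pairs
    so each advice appears LEAD_TIME_SEC before its indicated time point T,
    clamped so nothing is scheduled before the start of the music."""
    return [(max(0.0, t - LEAD_TIME_SEC), advice)
            for t, advice in indicated_points]
```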
  • the singing advice A corresponding to the tendency of the plurality of reference sounds R of the group G to which the singing voice V of the target singer belongs is presented to the target singer. That is, singing advice A suitable for the singing voice V of each target singer is presented to the target singer. Therefore, there is an advantage that the singing of the target singer can be effectively improved.
  • the singing advice A is specified by referring to the reference information DA that specifies the singing advice A for each group G [n] that classifies the plurality of reference sounds R that share the singing voice V and the music. Therefore, the singing advice A suitable for each piece of music is specified. Therefore, the above-described effect that the singing advice A appropriate for the singing voice V of the target music can be presented is particularly remarkable.
  • Moreover, the belonging group G of the singing voice V is sequentially updated for each section of the target music while the target singer sings it. Therefore, there is an advantage that suitable singing advice A can be presented for each section of the target music.
  • Second Embodiment A second embodiment of the present invention will be described below.
  • Elements whose operations and functions are the same as in the first embodiment are denoted by the reference signs used in the description of the first embodiment, and detailed description of each is omitted as appropriate.
  • the storage device 14 of the second embodiment stores the reference information DB of FIG. 5 instead of the reference information DA of the first embodiment.
  • To generate the reference information DB, a set (group) consisting of the plurality of reference voices R of the target singer within a reference voice group Q similar to that of the first embodiment is used.
  • To present singing advice A to the target singer, the music attribute X at which the target singer tends to fail (typically, the attribute at which singing mistakes are likely) is identified, and reference information DB that designates singing advice A (A1, A2, ...) for each such music attribute X is generated.
  • The reference information DB may be generated in advance for each of a plurality of singers, but it is also possible to generate the reference information DB of the target singer immediately before the target singer sings (that is, for each singing).
  • "Music attribute" means a musical attribute (aspect) of a piece of music. Specifically, range (high/low), section marks (the "A melody" (verse), the chorus, etc.), position within a specific section such as a phrase (the opening, etc.), note type (ascending, descending, repetition of the same note, kobushi, ornamental notes), note value (long tone, short passage), rhythm type, legato/staccato, tempo, beat position (the back of the second beat, etc.), chord function (root, non-harmonic tone), and the like are included in the concept of "music attribute".
  • the music attribute X means a musical attribute (mode) of the song part of the song.
  • For example, if the reference voices R uttered by the target singer show a tendency to be poor at singing in the high range, singing advice A1 such as "Careful of the high notes!" is designated for the music attribute X1 "high range".
  • If the reference voices show a tendency to be poor at singing successive pitches separated by a specific interval (for example, a fifth), singing advice A2 such as "Caution for the pitch change!" is designated for the music attribute X2 "specific interval".
  • Similarly, singing advice A3 such as "Note the rhythm!" is designated for the music attribute X3 "specific rhythm", and singing advice A4 is designated for the music attribute X4 "immediately after the start".
  • FIG. 6 is a flowchart of the singing advice specifying process for the analysis processing unit 22 of the second embodiment to specify the singing advice A. Similar to the first embodiment, the singing advice specifying process in FIG. 6 is started when the singing of the target music starts.
  • The analysis processing unit 22 refers to the reference information DB of the target singer and searches the target music for the sections corresponding to the music attributes X specified by the reference information DB (hereinafter, the "indicated sections") (SB1).
  • For example, if a specific range (for example, the high range) is specified as the music attribute X in the reference information DB, sections of the target music in that range are searched for as indicated sections; if a specific interval (for example, a fifth) is specified, sections of the target music containing that interval are searched for; if a specific rhythm is specified, sections of the target music with that rhythm are searched for; and if a specific section (for example, immediately after the start) is specified as the music attribute X, that section of the target music is searched for as an indicated section. Note that it is also possible to combine a plurality of types of music attributes X in the search for indicated sections; for example, a section that is both a "specific rhythm" and a "specific interval" may be searched for as an indicated section.
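The search for indicated sections (SB1) can be pictured as filtering the sections of the target music with one predicate per music attribute X; combining attributes is then an intersection of predicates. An illustrative sketch with hypothetical section features, attribute tests, and thresholds.

```python
# Each section of the target music, with a few hypothetical features.
sections = [
    {"start": 0.0,  "max_pitch": 72, "has_fifth_leap": False},
    {"start": 15.0, "max_pitch": 81, "has_fifth_leap": True},
    {"start": 30.0, "max_pitch": 79, "has_fifth_leap": False},
]

# One predicate per music attribute X (assumed thresholds, for illustration).
attribute_tests = {
    "high range":        lambda s: s["max_pitch"] >= 79,
    "specific interval": lambda s: s["has_fifth_leap"],
}

def find_indicated_sections(sections, attributes):
    """Return the sections matching ALL the given music attributes X."""
    return [s for s in sections
            if all(attribute_tests[a](s) for a in attributes)]
```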
  • the analysis processing unit 22 specifies the singing advice A for each indicated section searched from the target music by the above procedure (SB2). Specifically, the analysis processing unit 22 specifies the singing advice A corresponding to the music attribute X of the indicated section from the reference information DB for each of the plurality of indicated sections searched from the target music.
  • the above is a specific example of the singing advice specifying process in the second embodiment.
  • the presentation processing unit 24 of the second embodiment presents the singing advice A specified by the analysis processing unit 22 in the singing advice specifying process described above to the target singer for each indicated section of the target music. Specifically, the presentation processing unit 24 displays the singing advice A specified by the analysis processing unit 22 for the indicated section on the display device 18 at a time point preceding the starting point of each indicated section in the target music by a predetermined time.
  • singing advice A for improving the singing of the section is sequentially presented to the target singer in advance of the singing of the indicated section estimated to be weak for the target singer.
  • In the second embodiment as well, since the singing advice A corresponding to the tendency of the group corresponding to the target singer's singing voice in the reference voice group Q is presented to the target singer, the same effects as in the first embodiment are realized.
  • reference information DB that designates the singing advice A for each music attribute X is referred to according to the tendency of the group of the plurality of reference sounds R uttered by the target singer in the past in the reference sound group Q. Therefore, the effect that the appropriate singing advice A can be presented for each target singer is particularly remarkable.
  • Since reference information DB that designates the singing advice A with a specific interval as the music attribute X is referred to, effective singing advice A can be presented to a singer who is not good at singing successive pitches separated by that specific interval.
  • In the first and second embodiments, the analysis processing unit 22 specifies the singing advice A according to the tendency of the plurality of reference voices R of the belonging group G to which the target singer's singing voice V belongs.
  • The analysis processing unit 22 of the third embodiment specifies, as the comment, an evaluation result obtained by evaluating (scoring) the singing voice V according to the tendency of the plurality of reference voices R of the group G to which the target singer's singing voice V belongs.
  • Specifically, an evaluation result is specified in which the singing voice V is evaluated with emphasis on evaluation items corresponding to the tendency of the reference voices R of the group G to which the singing voice V belongs.
  • For example, for a group G of reference voices R that tend to have large volume variation and large pitch error in the chorus section, the analysis processing unit 22 calculates the evaluation result with the evaluation weight of the chorus section of the music set to a larger numerical value than that of the other sections.
  • Further, for a group G of reference voices R whose pitch evaluation result tends to be higher than their inflection evaluation result, the analysis processing unit 22 calculates the evaluation result with the evaluation weight of inflection and singing techniques set to a larger value than that of other elements such as pitch; that is, the weight value for evaluating the inflection is set larger than for the other elements.
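The group-dependent weighting described above is essentially a weighted average of per-item scores in which the weights depend on the belonging group's tendency. A hedged sketch; the item names and weight values are illustrative, not taken from the patent.

```python
def weighted_evaluation(item_scores, weights):
    """Weighted average of evaluation items (pitch, inflection, ...),
    with weights chosen according to the tendency of the group G."""
    total_weight = sum(weights[item] for item in item_scores)
    return sum(item_scores[item] * weights[item]
               for item in item_scores) / total_weight

scores = {"pitch": 90.0, "inflection": 60.0}

# For a group whose pitch evaluation tends to exceed its inflection
# evaluation, weight inflection more heavily than pitch.
weights = {"pitch": 1.0, "inflection": 3.0}
```

With the weights above, the weaker item (inflection) dominates the overall score, so singers in that group are evaluated mainly on what they tend to do poorly.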
  • the presentation processing unit 24 causes the display device 18 to display a comment on the evaluation result specified by the analysis processing unit 22.
  • comments of evaluation results according to the tendency of the plurality of reference sounds R of the group G to which the singing voice V of the target singer belongs are presented to the target singer. That is, a comment appropriate for the singing voice V of each target singer is presented to the target singer. Therefore, there is an advantage that the singing of the target singer can be effectively improved.
  • the present invention can also be realized as an apparatus (a configuration in which the presentation processing unit 24 is omitted) that evaluates the singing voice V according to the tendency of the plurality of reference voices R of the group G to which the singing voice V of the target singer belongs. .
  • In the above embodiments, a configuration in which the reference information DA for each piece of music is created in advance has been exemplified.
  • However, the reference information DA can also be generated in real time each time a piece of music is sung.
  • Specifically, a configuration is preferable in which a group G of reference voices R whose musical tendency is similar to the singing voice V of the target singer is extracted from the reference voice group Q, and the analysis processing unit 22 generates the reference information DA using the reference voices R of that group G.
  • a plurality of reference sounds R corresponding to a specific music piece in the reference sound group Q are classified into N groups G [1] to G [N].
  • This method is arbitrary as described in the first embodiment.
  • For example, it is also possible to classify into one group G[n] a predetermined number (for example, the top 5%) of the reference voices R of a specific piece of music, taken in descending order of the evaluation index E.
  • In the first embodiment, the time point at which the difference between the average pitch P[n] of the group G[n] and the exemplary pitch P0 is maximized is selected as the indicated time point T, but the method of selecting the indicated time point T is not limited to the above example.
  • For example, it is also possible to select as the indicated time point T the time point at which the degree of dispersion (for example, the variance or distribution width) of the evaluation indices E or of the pitches of the plurality of reference voices R included in the group G[n] is maximized, or the time point at which the average of the evaluation indices E of the plurality of reference voices R is minimized. It is also possible to select as the indicated time point T a time point at which the difference between the average pitch P[n] and the exemplary pitch P0 exceeds a predetermined threshold.
  • the affiliation group G is updated for each section of the target music.
  • However, it is also possible to specify the belonging group G of the selected section according to the evaluation indices E of the singing voice V over a plurality of sections including the selected section. Specifically, among the N groups G[1] to G[N] corresponding to different ranges of the evaluation index E, the group G[n] whose range contains the weighted sum of the evaluation indices E over a plurality of sections ending with the selected section is specified as the belonging group G. For example, the weight value applied to the evaluation index E of each section is set to a larger numerical value the closer that section is to the selected section.
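The modified group update described above (weighting recent sections more heavily) can be sketched as a normalized weighted sum of the last few evaluation indices E, followed by the same score-to-group mapping. The weight values and the 5-point band width are assumptions for illustration.

```python
def smoothed_group(score_history, weights, band_width=5.0, n_groups=20):
    """Belonging group G from a weighted sum of the evaluation indices E
    of the most recent sections (the selected section last, weighted most).

    score_history: evaluation indices E, oldest first, selected section last
    weights: one weight per section, larger for sections nearer the end
    """
    weighted = sum(e * w for e, w in zip(score_history, weights))
    smoothed_e = weighted / sum(weights)
    # Map the smoothed index to a group, assuming 5-point score bands.
    return min(int(smoothed_e // band_width) + 1, n_groups)
```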
  • A plurality of pieces of reference information D (the reference information DA of the first embodiment and the reference information DB of the second embodiment) can also be used selectively.
  • For example, a set of reference information DA1 with many indicated time points T and reference information DA2 with few indicated time points T is prepared for each piece of music, and a configuration is employed in which the analysis processing unit 22 selectively uses the reference information DA1 and the reference information DA2 according to an instruction from the user.
  • When the reference information DA1 is applied, the singing advice A is presented at many indicated time points T in the target music (that is, strict advice), and when the reference information DA2 is applied, the number of indicated time points T at which the singing advice A is presented decreases (that is, lenient advice).
  • In the first and second embodiments the presentation of the singing advice A is exemplified, and in the third embodiment the presentation of the evaluation result is exemplified; however, the content presented to the target singer is not limited to the above illustrations.
  • The singing advice A and the evaluation results are comprehensively expressed as "comments" that the analysis processing unit 22 specifies according to the tendency of the reference voices R of the group G to which the target singer's singing voice V belongs and that are presented to the target singer.
  • The singing analysis device 100 can also be realized by a server device (for example, a web server) that communicates with a communication terminal such as a networked karaoke device. In this case, the analysis processing unit 22 identifies the singing advice A corresponding to the group G of the singing voice V received from the communication terminal via the communication network (the singing advice specifying process), and the presentation processing unit 24 transmits to the communication terminal a command that causes the singing advice A to be presented to the target singer.
  • The singing analysis device is realized by hardware (an electronic circuit) such as a DSP (Digital Signal Processor) dedicated to the presentation of singing advice, or by the cooperation of a general-purpose arithmetic processing device such as a CPU (Central Processing Unit) with a program.
  • A program according to a preferred aspect of the present invention causes a computer to function as an analysis processing unit that identifies a comment according to the tendency of each reference voice of the group, among a plurality of reference voices recorded in advance, corresponding to the singing voice of the target singer, and as a presentation processing unit that presents the comment identified by the analysis processing unit to the target singer. A program according to another aspect causes a computer to function as an analysis processing unit that evaluates the singing voice according to the tendency of each reference voice of the group, among a plurality of reference voices recorded in advance, corresponding to the singing voice of the target singer.
  • The program of the present invention can be provided in a form stored in a computer-readable recording medium and installed in a computer. The recording medium is, for example, a non-transitory recording medium; an optical recording medium (optical disc) such as a CD-ROM is a good example, but any known type of recording medium, such as a semiconductor recording medium or a magnetic recording medium, can be included. The program of the present invention can also be provided in the form of distribution via a communication network and installed in a computer.
  • The present invention is also specified as an operation method (singing analysis method) of the singing analysis device according to each of the above aspects. A singing analysis method according to a preferred aspect of the present invention includes an analysis process of identifying a comment corresponding to the tendency of each reference voice of the group, among a plurality of reference voices recorded in advance, corresponding to the singing voice of the target singer, and a presentation process of presenting the comment identified in the analysis process to the target singer. A singing analysis method according to another aspect includes an analysis process of evaluating the singing voice according to the tendency of each reference voice of the group, among a plurality of reference voices recorded in advance, corresponding to the singing voice of the target singer.
  • DESCRIPTION OF SYMBOLS: 100 ... singing analysis device, 12 ... arithmetic processing device, 14 ... storage device
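One of the modifications listed above determines the belonging group G from a weighted sum of evaluation indices E over several sections, weighting sections closer to the selected section more heavily. A minimal sketch of that selection rule follows; the linearly increasing weights and the equal-width score bins are illustrative assumptions, not values prescribed by the specification.

```python
def select_group(section_scores, n_groups=4, max_score=100.0):
    """Pick a group index from the evaluation indices E of the sections
    sung so far, weighting sections nearer the current one more heavily.

    section_scores: E values, oldest first; the last entry is the
    currently selected section.  Weights 1, 2, ..., len(scores) are an
    illustrative choice (the text only requires larger weights for
    sections closer to the selected section).
    """
    weights = range(1, len(section_scores) + 1)
    total_w = sum(weights)
    weighted_e = sum(w * e for w, e in zip(weights, section_scores)) / total_w
    # Map the weighted index onto N equal score ranges (assumed binning).
    width = max_score / n_groups
    return min(int(weighted_e // width), n_groups - 1)
```

With three sections scored 50, 60, and 90, the recent high score dominates and a higher group is chosen than the plain average would give.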


Abstract

A singing analyzer (100) is provided with an analysis processing unit (22) for specifying a comment (e.g., singing advice (A)) corresponding to a trend of the reference voices in a group corresponding to the singing voice (V) of a subject singer, from among a plurality of reference voices recorded beforehand, and a presentation processing unit (24) for presenting the comment specified by the analysis processing unit (22) to the subject singer.

Description

Singing analysis device
The present invention relates to a technique for analyzing a singing voice.
Technologies that use a singer's past singing tendencies have been proposed. For example, Patent Document 1 discloses a technique of proposing, to a singer, music that matches that singer's individual preferences by registering music in advance for each of a plurality of groups into which singers are classified according to their past song-selection tendencies (that is, their preferences).
Japanese Unexamined Patent Application Publication No. 2012-078387
The technique of Patent Document 1 uses each singer's song-selection tendency to propose music. However, if comments could be presented to a singer that take into account the tendencies observed when many singers sing a piece of music (for example, passages at which many singers tend to fail) or the tendencies of an individual singer (for example, pitch errors occurring easily in the high register), such as singing advice (indications or suggestions) reflecting those tendencies, or singing evaluation results reflecting those tendencies, an effective improvement of the singing could be expected.
In view of the above circumstances, an object of the present invention is to present to a singer an appropriate comment according to the singer's singing tendencies.
In order to solve the above problems, the singing analysis device of the present invention comprises an analysis processing unit that identifies a comment according to the tendency of each reference voice of the group, among a plurality of reference voices recorded in advance, corresponding to the singing voice of the target singer, and a presentation processing unit that presents the comment identified by the analysis processing unit to the target singer. In this configuration, since the tendency of each reference voice of the group corresponding to the singing voice of the target singer is identified, a comment appropriate for the singing voice of the target singer can be presented to the target singer. There is therefore the advantage that the singing of the target singer can be effectively improved.
In one aspect of the present invention, the analysis processing unit identifies, as the comment, singing advice according to the tendency of each reference voice of the group corresponding to the singing voice of the target singer. In this aspect, since singing advice according to the tendency of each reference voice of the group corresponding to the singing voice of the target singer is identified, singing advice appropriate for the singing voice of the target singer can be presented to the target singer.
In one aspect of the present invention, the analysis processing unit refers to reference information that designates singing advice for each of a plurality of groups into which a plurality of reference voices sharing the same music as the singing voice of the target singer are classified, and identifies the singing advice of the group, among the plurality of groups, to which the singing voice of the target singer belongs. In this aspect, since the singing advice of the group to which the singing voice of the target singer belongs is identified by referring to reference information that designates singing advice for each of the plurality of groups, there is the advantage that suitable singing advice can be presented for each piece of music. Further, with a configuration in which the analysis processing unit sequentially updates the group of the singing voice while the target singer sings the music, there is the advantage that suitable singing advice can be presented for each section of the target music. A specific example of each of the above aspects is described later as, for example, the first embodiment.
In one aspect of the present invention, the analysis processing unit refers to reference information that designates singing advice for each music attribute according to the tendency of each reference voice of the group, among the plurality of reference voices, collected from the target singer, and identifies the singing advice for the portions of the music sung by the target singer that correspond to the music attribute. In this aspect, since the singing advice for the portions of the music corresponding to a music attribute is identified by referring to reference information that designates singing advice for each music attribute according to the tendency of the target singer's plurality of reference voices, there is the advantage that singing advice suitable for each target singer (for example, advice on the kinds of singing that singer is poor at) can be presented. For example, with a configuration in which the reference information designates singing advice for a specific interval between successive pitches as a music attribute, and the analysis processing unit identifies, for portions of the music sung by the target singer where the specific interval occurs, the singing advice designated for that interval in the reference information, effective singing advice for overcoming the weakness can be presented to a singer who is poor at changing pitch by that specific interval. A specific example of this aspect is described later as, for example, the second embodiment.
In one aspect of the present invention, the analysis processing unit identifies, as the comment, an evaluation result obtained by evaluating the singing voice of the target singer according to the tendency identified by the analysis processing unit. In this aspect, since an evaluation result obtained by evaluating the singing voice of the target singer according to the tendency of each reference voice of the group corresponding to that singing voice is presented, an evaluation result appropriate for the singing voice of the target singer can be presented to the target singer.
Incidentally, if a singer's singing is evaluated taking into account the tendencies observed when many singers sing a piece of music and the singing tendencies of individual singers, an effective improvement of the singing can be expected. In view of the above circumstances, a singing analysis device according to another aspect of the present invention comprises an analysis processing unit that evaluates the singing voice according to the tendency of each reference voice of the group, among a plurality of reference voices recorded in advance, corresponding to the singing voice of the target singer. With this configuration, since the singing voice is evaluated (typically scored) according to the singer's singing tendencies, there is the advantage that an evaluation that can effectively contribute to the improvement of the singing can be realized.
FIG. 1 is a configuration diagram of a singing analysis device according to a first embodiment of the present invention.
FIG. 2 is an explanatory diagram of reference information.
FIG. 3 is an explanatory diagram of indication times.
FIG. 4 is a flowchart of a singing advice specifying process.
FIG. 5 is an explanatory diagram of reference information in a second embodiment.
FIG. 6 is a flowchart of the singing advice specifying process in the second embodiment.
<First Embodiment>
FIG. 1 is a configuration diagram of a singing analysis device 100 according to the first embodiment of the present invention. The singing analysis device 100 is an information processing device for presenting advice such as indications and suggestions concerning the singing of a piece of music (hereinafter "singing advice") to a singer of that music (hereinafter the "target singer"), and is realized by a computer system comprising an arithmetic processing device 12, a storage device 14, a sound collection device 16, and a display device 18. The singing analysis device 100 is suitably used, for example, as a karaoke device that reproduces the accompaniment of a piece of music.
The sound collection device 16 is a device (microphone) that collects surrounding sound. The sound collection device 16 of the first embodiment collects the singing voice V of the target singer singing a specific piece of music (hereinafter the "target music"). A synthesized voice produced by voice synthesis technology may also be used as the singing voice V. The display device 18 (for example, a liquid crystal display panel) displays images as instructed by the arithmetic processing device 12. In the first embodiment, the singing advice A for the target music is displayed on the display device 18. Specifically, at each point during the target singer's singing of the music, singing advice A suitable for that point is displayed in turn on the display device 18. The singing advice A may also be output as sound from a sound emitting device (for example, a loudspeaker).
The storage device 14 stores a program executed by the arithmetic processing device 12 and various data used by the arithmetic processing device 12. A known recording medium such as a semiconductor recording medium or a magnetic recording medium, or a combination of plural types of recording media, may be employed as the storage device 14. The storage device 14 of the first embodiment stores reference information DA for each of a plurality of pieces of music. Each piece of reference information DA is used to identify the singing advice A for the corresponding music.
FIG. 2 is an explanatory diagram of the reference information DA for one arbitrary piece of music. As illustrated in FIG. 2, a reference voice group Q is used to generate the reference information DA. The reference voice group Q is a set of a plurality of singing voices (hereinafter "reference voices") R recorded in advance. The plurality of reference voices R included in the reference voice group Q are voices of an unspecified large number of singers singing arbitrary pieces of music. As illustrated in FIG. 2, the plurality of reference voices R of any one piece of music (the plurality of reference voices sharing the same sung music) are classified into N groups G[1] to G[N] (N is a natural number of 2 or more). One group G[n] (n = 1 to N) corresponding to a given piece of music contains a plurality of reference voices R of different singers singing that music.
The method of classifying (clustering) the reference voices R is arbitrary, but a method of classifying the plurality of reference voices R into the N groups G[1] to G[N] from a musical point of view is preferable. Specifically, a method of classifying the reference voices R into the N groups G[1] to G[N] for each range of an evaluation index (singing scoring result) E, which indicates the difference between the reference voice R and the melody of the singing part of the music (for example, in 5-point increments on a 100-point scale), or a method of classifying them according to the trend of the evaluation index E computed in time series within the music (for example, a tendency for the evaluation index E to increase in the second half of the music), may be adopted. Besides classification by singing level as above, classification by gender (male and female), by singing level within each gender, or into solo singing, mixed-voice singing, group singing, and so on may also be adopted.
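As a minimal sketch of the score-range clustering described above (binning reference voices by their evaluation index E in 5-point increments of a 100-point scale), the following Python fragment is illustrative only; the scoring of each voice and the data layout are assumptions, since the text does not prescribe an implementation.

```python
from collections import defaultdict

def classify_by_score(reference_voices, bin_width=5, max_score=100):
    """Cluster reference voices into groups G[0], G[1], ... according
    to the range their evaluation index E falls into (5-point bins of
    a 100-point scale, as in the text).

    reference_voices: list of (singer_id, E) pairs; computing E itself
    (melody comparison / scoring) is outside this sketch.
    """
    groups = defaultdict(list)
    for singer_id, e in reference_voices:
        # Clamp a perfect score into the top bin instead of a new one.
        idx = min(int(e) // bin_width, max_score // bin_width - 1)
        groups[idx].append(singer_id)
    return dict(groups)
```

Voices scored 72 and 74 land in the same group, while 88 and 100 fall into higher groups, mirroring the 5-point binning in the text.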
As illustrated in FIG. 2, the reference information DA of the first embodiment contains N pieces of unit information U[1] to U[N] corresponding to the different groups G[n] of the reference voices R. As representatively illustrated in FIG. 2 for the unit information U[N], each piece of unit information U[n] designates singing advice A (A1, A2, ...) for each of a plurality of time points of the music (hereinafter "indication times") T (T1, T2, ...). The content of the singing advice A is set individually for each indication time T.
The unit information U[n] of a given piece of music is generated taking into account the musical tendencies of the reference voices R classified into group G[n] among the plurality of reference voices R of that music. Specifically, each time point at which singing should be improved across many reference voices R of group G[n] is designated as an indication time T, and a character string expressing the content of the improvement, or advice or an indication (suggestion) for achieving it, is designated as the singing advice A.
FIG. 3 shows the time series of the average pitch P[n] over the plurality of reference voices R of group G[n] together with the time series of the exemplary pitch P0 of the music. The exemplary pitch P0 is the time series of the pitches of the notes specified in the musical score of the music, or the time series of the average pitch of the reference voices R of the group G with the highest evaluation index E. As understood from FIG. 3, the time points at which the difference (pitch error) between the average pitch P[n] of group G[n] and the exemplary pitch P0 is locally maximal are designated as indication times T, and a character string for improving the pitch error at each such point (for example, a message such as "Watch your pitch!") is designated as the singing advice A for each indication time T.
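The selection of indication times T as local maxima of the pitch error can be sketched as follows; representing the contours as parallel lists is an assumption for illustration (a real system would work on frame-wise pitch data with smoothing and thresholds, which are omitted here).

```python
def find_indication_times(avg_pitch, model_pitch, times):
    """Return the time points T at which the error between the group's
    average pitch P[n] and the exemplary pitch P0 is locally maximal.
    All three sequences are parallel lists of equal length."""
    err = [abs(p - p0) for p, p0 in zip(avg_pitch, model_pitch)]
    peaks = []
    for i in range(1, len(err) - 1):
        # A strict rise followed by a non-rise marks a local maximum.
        if err[i] > err[i - 1] and err[i] >= err[i + 1]:
            peaks.append(times[i])
    return peaks
```

Each returned time point would then be paired with an advice string such as "Watch your pitch!" in the unit information.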
The arithmetic processing device 12 of FIG. 1 comprehensively controls each element of the singing analysis device 100 by executing the program stored in the storage device 14. As illustrated in FIG. 1, the arithmetic processing device 12 of the first embodiment realizes a plurality of functions (an analysis processing unit 22 and a presentation processing unit 24) for presenting singing advice A to the target singer singing the target music. A configuration in which the functions of the arithmetic processing device 12 are distributed over a plurality of devices, or in which dedicated electronic circuitry realizes part of those functions, may also be adopted.
The analysis processing unit 22 of FIG. 1 identifies the singing advice A to be presented to the target singer. The analysis processing unit 22 of the first embodiment sequentially identifies singing advice A suitable for the singing voice V of the target singer during the singing of the target music. FIG. 4 is a flowchart of the process by which the analysis processing unit 22 identifies the singing advice A (hereinafter the "singing advice specifying process"). The singing advice specifying process of FIG. 4 starts when the singing of the target music starts (when reproduction of the accompaniment of the target music starts).
When the singing advice specifying process starts, the analysis processing unit 22 determines whether the target music has ended (SA1). If the target music has not ended (SA1: NO), the analysis processing unit 22 acquires from the sound collection device 16 the singing voice V of one section (hereinafter the "selected section") among the plurality of sections into which the music is divided on the time axis at a predetermined length (fixed or variable) (SA2). Each time step SA2 is executed, the analysis processing unit 22 selects the sections of the music in order from beginning to end and acquires the singing voice V of the selected section.
The analysis processing unit 22 identifies the group (hereinafter the "belonging group") G to which the singing voice V of the selected section belongs, among the N groups G[1] to G[N] into which the plurality of reference voices R of the target music are classified (SA3). Specifically, the analysis processing unit 22 computes the evaluation index E of the singing voice V of the selected section and identifies as the belonging group G the one group G[n], among the N groups G[1] to G[N] corresponding to different ranges of the evaluation index E, whose range contains the evaluation index E of the singing voice V of the selected section.
The analysis processing unit 22 selects the unit information U corresponding to the belonging group G identified in step SA3, among the N pieces of unit information U[1] to U[N] of the reference information DA stored in the storage device 14 (SA4). That is, the analysis processing unit 22 identifies the singing advice A of the belonging group G, among the N groups G[1] to G[N], to which the singing voice V of the target singer belongs. When the unit information U of the selected section has been identified by the above procedure, the analysis processing unit 22 returns to step SA1. Accordingly, until the target music ends (SA1: YES), the belonging group G is sequentially updated for each section of the target music, and the unit information U (the time series of singing advice A) corresponding to the updated belonging group G is sequentially identified. While the first section of the target music is being sung (before any singing voice V has been acquired), no belonging group G corresponding to the singing voice V has been identified, so singing advice A prepared in advance independently of the unit information U[n] of the reference information DA (for example, a general message such as "Sing with feeling") is presented to the target singer.
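As a rough illustration, the per-section loop of FIG. 4 (steps SA1 to SA4) can be sketched in Python. The 20-point group binning, the scoring function, and the advice strings are assumptions for the sketch; only the control flow (default advice for the first section, then advice chosen from the group of the previously sung section) follows the text.

```python
def singing_advice_loop(sections, score_fn, unit_info, default_advice):
    """Sketch of the singing advice specifying process: for each
    section of the song, score the captured voice (SA2-SA3), pick the
    group whose score range contains it, and look up that group's unit
    information (SA4).

    sections:       per-section voice data, in singing order
    score_fn:       maps a section's voice to an evaluation index E
    unit_info:      list mapping group index -> advice text
    default_advice: shown before any voice has been captured
    """
    shown = [default_advice]          # first section: no voice yet
    for voice in sections[:-1]:       # advice for section k+1 uses voice of k
        e = score_fn(voice)
        group = min(int(e // 20), len(unit_info) - 1)  # assumed 20-pt bins
        shown.append(unit_info[group])
    return shown
```

For example, a singer who scores 95 in the first section is shown the top group's advice during the second section, then drops to a middle group's advice after scoring 40.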
The presentation processing unit 24 of FIG. 1 presents to the target singer the singing advice A identified by the analysis processing unit 22 in the singing advice specifying process exemplified above. Specifically, at a point preceding each indication time T designated by the unit information U identified by the analysis processing unit 22 by a predetermined time, the presentation processing unit 24 causes the display device 18 to display the singing advice A that the unit information U designates for that indication time T. That is, points to be improved in the reference voices R of the belonging group G (that is, points at which the singing voice V of the target singer is likewise presumed to need improvement) are presented to the target singer in turn, and the target singer can sing while paying particular attention to the portions of the target music at which he or she is likely to fail.
As described above, in the first embodiment, singing advice A according to the tendency of the plurality of reference voices R of the belonging group G to which the singing voice V of the target singer belongs is presented to the target singer. That is, singing advice A suitable for the singing voice V of each individual target singer is presented to the target singer. There is therefore the advantage that the singing of the target singer can be effectively improved.
In the first embodiment in particular, singing advice A is identified by referring to the reference information DA, which designates singing advice A for each group G[n] into which the plurality of reference voices R sharing the same music as the singing voice V are classified, so singing advice A suitable for each individual piece of music is identified. The aforementioned effect that singing advice A appropriate for the singing voice V of the target music can be presented is thus especially pronounced. Further, in the first embodiment, the belonging group G of the singing voice V is sequentially updated for each section of the target music while the target singer sings the target music. There is therefore the advantage that suitable singing advice A can be presented for each section of the target music.
<Second Embodiment>
A second embodiment of the present invention is described below. For elements of each form exemplified below whose operation and function are the same as in the first embodiment, the reference signs used in the description of the first embodiment are reused, and detailed description of each is omitted as appropriate.
The storage device 14 of the second embodiment stores the reference information DB of FIG. 5 instead of the reference information DA of the first embodiment. To generate the reference information DB, the set (group) of the plurality of reference voices R of the target singer within the same reference voice group Q as in the first embodiment is used. Specifically, by analyzing the plurality of reference voices R of the target singer in the reference voice group Q, the music attributes X for which singing advice A should be presented to the target singer (typically music attributes X at which the target singer tends to fail) are identified, and reference information DB designating singing advice A (A1, A2, ...) for each of a plurality of music attributes X (X1, X2, ...) is stored in the storage device 14 for each target singer. The reference information DB may be generated in advance for each of a plurality of singers, but it is also possible to generate the reference information DB of the target singer immediately before that singer sings (that is, for each singing).
"Music attribute" means a musical attribute (aspect) of a piece of music. Specifically, the concept of "music attribute" encompasses, for example, range (high/low), performance marks (verse, chorus, etc.), position within a particular section such as a phrase (the opening, etc.), melodic figure (ascending, descending, repeated notes, kobushi ornaments, embellishments), note value (long tones, short passages), type of rhythm, legato/staccato, tempo, beat position (the off-beat of the second beat, etc.), and harmonic function (root, non-chord tone).
In the present invention, the music attribute X means a musical attribute (aspect) of the vocal part of a piece of music. For example, in the reference information DB of a target singer for whom analysis of the plurality of reference voices R has revealed a tendency to sing poorly in a high register, singing advice A1 such as "Watch out for the high notes!" is designated for the music attribute X1 "high register". In the reference information DB of a target singer who tends to have trouble singing successive pitches separated by a particular interval (for example, a fifth), singing advice A2 such as "Watch the pitch change!" is designated for the music attribute X2 "a fifth" (the particular interval). Likewise, in the reference information DB of a target singer who tends to have trouble with a particular rhythm, singing advice A3 such as "Watch the rhythm!" is designated for the music attribute X3 "particular rhythm", and in the reference information DB of a target singer who tends to have trouble with a particular section of a piece, such as immediately after the start (the beginning of the singing), singing advice A4 such as "Careful at the start!" is designated for the music attribute X4 "immediately after the start".
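The per-singer mapping from music attributes X to singing advice A described above can be pictured as a simple lookup table. The sketch below is illustrative only; the attribute keys and advice strings are hypothetical stand-ins for the X1-X4 / A1-A4 examples in the text, not an implementation taken from the specification:

```python
# Hypothetical per-singer reference information DB: each entry pairs a
# music attribute X with the singing advice A to present for it.
reference_info_db = {
    "high_register": "Watch out for the high notes!",   # X1 -> A1
    "interval_fifth": "Watch the pitch change!",        # X2 -> A2
    "specific_rhythm": "Watch the rhythm!",             # X3 -> A3
    "song_opening": "Careful at the start!",            # X4 -> A4
}

def advice_for(attribute):
    """Return the singing advice registered for a music attribute, if any."""
    return reference_info_db.get(attribute)

print(advice_for("high_register"))  # -> Watch out for the high notes!
```

Because the table is keyed by attribute rather than by position in any one piece, the same DB can be applied to every piece the singer performs, which matches the role the reference information DB plays in the second embodiment.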
FIG. 6 is a flowchart of the singing advice specifying process by which the analysis processing unit 22 of the second embodiment specifies singing advice A. As in the first embodiment, the singing advice specifying process of FIG. 6 is started when the singing of the target piece begins.
When the singing advice specifying process starts, the analysis processing unit 22 refers to the reference information DB of the target singer and searches the target piece for locations corresponding to the music attributes X designated in the reference information DB (hereinafter "indicated sections") (SB1). For example, when a particular register (for example, a high register) is designated as a music attribute X in the reference information DB, sections of the target piece lying in that register are searched for as indicated sections, and when a particular interval (for example, a fifth) is designated as a music attribute X in the reference information DB, sections of the target piece in which successive pitches are separated by that interval are searched for as indicated sections. Likewise, when a particular rhythm is designated as a music attribute X in the reference information DB, sections of the target piece having that rhythm are searched for as indicated sections, and when a particular section (for example, immediately after the start) is designated as a music attribute X, that section of the target piece is searched for as an indicated section. It is also possible to take plural types of music attribute X into account when searching for indicated sections; for example, sections that are both a "particular rhythm" and a "particular interval" may be searched for as indicated sections.
The analysis processing unit 22 specifies singing advice A for each indicated section found in the target piece by the above procedure (SB2). Specifically, for each of the plurality of indicated sections found in the target piece, the analysis processing unit 22 specifies, from the reference information DB, the singing advice A corresponding to the music attribute X of that indicated section. The above is a specific example of the singing advice specifying process in the second embodiment.
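Steps SB1 and SB2 can be sketched as a scan over the notes of the target piece, assuming the score is available as a note list. The note data, the register threshold, and the attribute predicates below are all hypothetical illustrations rather than the patented implementation:

```python
# Hypothetical score data: each note is (start_time, MIDI pitch, duration).
notes = [(0.0, 60, 0.5), (0.5, 67, 0.5), (1.0, 74, 0.5), (1.5, 76, 0.5)]

HIGH_REGISTER_THRESHOLD = 72  # assumed boundary for the "high register" attribute

def find_indicated_sections(notes, db):
    """SB1: search the piece for locations matching attributes in the DB."""
    sections = []
    for i, (t, pitch, dur) in enumerate(notes):
        if "high_register" in db and pitch >= HIGH_REGISTER_THRESHOLD:
            sections.append((t, "high_register"))
        if "interval_fifth" in db and i > 0 and abs(pitch - notes[i - 1][1]) == 7:
            sections.append((t, "interval_fifth"))  # 7 semitones = a perfect fifth
    return sections

def attach_advice(sections, db):
    """SB2: look up the advice registered for each indicated section's attribute."""
    return [(t, db[attr]) for t, attr in sections]

db = {"high_register": "Watch out for the high notes!",
      "interval_fifth": "Watch the pitch change!"}
plan = attach_advice(find_indicated_sections(notes, db), db)
```

A location can appear more than once in the result when it matches several attributes at once, which corresponds to the text's note that plural types of music attribute X may be taken into account together.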
The presentation processing unit 24 of the second embodiment presents the singing advice A specified by the analysis processing unit 22 in the singing advice specifying process described above to the target singer for each indicated section of the target piece. Specifically, at a point in time preceding the start of each indicated section of the target piece by a predetermined interval, the presentation processing unit 24 causes the display device 18 to display the singing advice A specified by the analysis processing unit 22 for that indicated section. As will be understood from the above description, before the target singer sings each indicated section presumed to be a weak point, singing advice A for improving the singing of that section is presented to the target singer in sequence.
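The timing rule just described (display each piece of advice a fixed interval before its indicated section begins) can be sketched as follows; the 2-second lead time and the function name are assumptions made only for illustration:

```python
LEAD_TIME = 2.0  # assumed "predetermined interval", in seconds

def schedule_advice(indicated_sections):
    """Return (display_time, advice) pairs, each preceding its section's start
    by LEAD_TIME seconds (clamped so nothing is scheduled before time 0)."""
    schedule = [(max(0.0, start - LEAD_TIME), advice)
                for start, advice in indicated_sections]
    return sorted(schedule)

plan = schedule_advice([(12.0, "Watch out for the high notes!"),
                        (30.5, "Watch the rhythm!")])
```

Sorting the schedule by display time reflects the sequential presentation described above: advice appears one item at a time, in the order in which the indicated sections occur in the piece.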
As described above, in the second embodiment, singing advice A corresponding to the tendency of the group of the reference voice group Q associated with the singing voice of the target singer is presented to the target singer, so that, as in the first embodiment, singing advice A appropriate for each individual target singer can be presented. In the second embodiment in particular, the reference information DB, which designates singing advice A for each music attribute X according to the tendency of the group of the plurality of reference voices R previously uttered by the target singer within the reference voice group Q, is consulted, so the effect of being able to present singing advice A appropriate for each individual target singer is especially pronounced. For example, since the reference information DB designating singing advice A with a particular interval as a music attribute X is consulted, effective singing advice A for overcoming the weakness can be presented to a singer who has trouble singing successive pitches separated by that particular interval.
<Third Embodiment>
In the first and second embodiments, the analysis processing unit 22 specifies singing advice A according to the tendency of the plurality of reference voices R in the group G to which the singing voice V of the target singer belongs. The analysis processing unit 22 of the third embodiment instead specifies a comment on an evaluation result obtained by evaluating (scoring) the singing voice V according to the tendency of the plurality of reference voices R in the group G to which the singing voice V of the target singer belongs. Specifically, an evaluation result is specified in which the singing voice V is evaluated with emphasis on evaluation items corresponding to the tendency of the reference voices R in the group G to which the singing voice V belongs.
For example, considering that beginners mainly remember the chorus of a piece (and remember little else), when the singing voice V belongs to a group G of reference voices R that tend to exhibit large volume fluctuation and large pitch error (that is, a beginners' group), the analysis processing unit 22 calculates the evaluation result with the weight applied to the chorus of the piece set to a larger value than that applied to the other sections. For a group G of reference voices R whose pitch evaluation results tend to be higher than their intonation evaluation results, the analysis processing unit 22 calculates the evaluation result with the weight applied to pitch set to a larger value than those applied to other elements such as intonation. For a group G of reference voices R whose intonation evaluation results tend to be higher than their pitch evaluation results and that frequently use various singing techniques (kobushi, shakuri, etc.), the analysis processing unit 22 calculates the evaluation result with the weights applied to intonation and singing technique set to larger values than those applied to other elements such as pitch. For a group G of reference voices R that tend to exhibit high sound pressure and large pitch fluctuation (that is, a tendency toward impassioned singing), the analysis processing unit 22 calculates the evaluation result with the weight applied to intonation set to a larger value than those applied to other elements. The presentation processing unit 24 causes the display device 18 to display a comment on the evaluation result specified by the analysis processing unit 22.
In the third embodiment, a comment on an evaluation result corresponding to the tendency of the plurality of reference voices R in the group G to which the singing voice V of the target singer belongs is presented to the target singer. That is, a comment appropriate for the singing voice V of each individual target singer is presented to the target singer. There is therefore the advantage that the singing of the target singer can be effectively improved. In the third embodiment, presentation of the comment on the evaluation result may also be omitted; that is, the present invention can also be realized as an apparatus (a configuration from which the presentation processing unit 24 is omitted) that evaluates the singing voice V according to the tendency of the plurality of reference voices R in the group G to which the singing voice V of the target singer belongs.
<Modifications>
Each of the above embodiments may be modified in various ways. Specific modifications are illustrated below. Two or more aspects arbitrarily selected from the following examples may be combined as appropriate.
(1) The first embodiment illustrates a configuration in which the reference information DA for each piece of music is created in advance, but the reference information DA may also be generated in real time for each performance of each piece. For example, a group G of reference voices R whose musical tendency is similar to the singing voice V of the target singer may be extracted from the reference voice group Q, and the analysis processing unit 22 may generate the reference information DA using the reference voices R of that group G. For example, a configuration that generates the reference information DA from a group G of a plurality of reference voices R whose evaluation indices E are close to that of the singing voice V (for example, a plurality of reference voices R lying within ±5% of the evaluation index E and pitch P of the singing voice V) is suitable.
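Extracting the group of reference voices close to the singing voice can be sketched as a filter on the evaluation index. The ±5% band follows the example in the text; the data layout and function name are illustrative assumptions:

```python
def extract_similar_group(reference_voices, target_index, band=0.05):
    """Keep reference voices whose evaluation index E lies within +/-band
    (relative) of the target singer's index, per the +/-5% example."""
    lo = target_index * (1.0 - band)
    hi = target_index * (1.0 + band)
    return [v for v in reference_voices if lo <= v["E"] <= hi]

# Hypothetical reference voices with precomputed evaluation indices E.
refs = [{"id": 1, "E": 78.0}, {"id": 2, "E": 95.0}, {"id": 3, "E": 82.0}]
group = extract_similar_group(refs, target_index=80.0)  # keeps ids 1 and 3
```

The same filter could be applied to pitch P in parallel, keeping only the reference voices that fall inside both bands.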
(2) In the first embodiment, the plurality of reference voices R corresponding to a particular piece of music in the reference voice group Q are classified into N groups G[1] to G[N], but the reference voices R may be classified in any manner, as already described in the first embodiment. For example, a predetermined number of reference voices R (for example, the top 5%) ranked highest in descending order of the evaluation index E among the plurality of reference voices R corresponding to a particular piece may be classified into a group G[n].
(3) In each of the above embodiments, the point in time at which the difference between the average pitch P[n] of the group G[n] and the exemplary pitch P0 is maximal is selected as the indication time point T, but the indication time point T may be selected in other ways. For example, a point in time at which the dispersion (for example, the variance or distribution width) of the evaluation indices E or pitches of the plurality of reference voices R included in the group G[n] increases, or a point in time at which the mean of the evaluation indices E of the plurality of reference voices R is minimal, may be selected as the indication time point T. It is also possible to select as the indication time point T a point in time at which the difference between the average pitch P[n] and the exemplary pitch P0 exceeds a predetermined threshold.
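Both the argmax rule of the embodiments and the threshold variant of this modification can be sketched over sampled pitch tracks. The frame data below is hypothetical and the pitch values are arbitrary:

```python
# Hypothetical frame-wise pitch tracks (e.g., one value per analysis frame).
avg_pitch   = [60.0, 60.2, 61.5, 60.1]  # average pitch P[n] of group G[n]
model_pitch = [60.0, 60.0, 60.0, 60.0]  # exemplary pitch P0

def indication_point_argmax(p_group, p0):
    """T as the frame where |P[n] - P0| is maximal (the embodiments' rule)."""
    diffs = [abs(a - b) for a, b in zip(p_group, p0)]
    return diffs.index(max(diffs))

def indication_points_threshold(p_group, p0, threshold):
    """Variant (3): every frame where the difference exceeds a threshold."""
    return [i for i, (a, b) in enumerate(zip(p_group, p0))
            if abs(a - b) > threshold]

t = indication_point_argmax(avg_pitch, model_pitch)               # frame 2
frames = indication_points_threshold(avg_pitch, model_pitch, 1.0)  # [2]
```

The threshold variant can yield several indication time points per piece, whereas the argmax rule yields one per maximum, which is one practical reason to choose between them.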
(4) In the first embodiment, the group G to which the singing voice belongs is updated for each section of the target piece, but the group G of a selected section may instead be specified according to the evaluation indices E of the singing voice V over a plurality of sections including the selected section. Specifically, among the N groups G[1] to G[N] corresponding to mutually different ranges of the evaluation index E, the group G[n] whose range contains the weighted sum of the evaluation indices E over a plurality of sections ending with the selected section is specified as the group G. The weight applied to the evaluation index E of each section is set, for example, to a larger value the closer the section is to the selected section.
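This weighted-sum variant can be sketched as follows: recent sections receive larger weights, the weighted sum is normalized, and the group is chosen by which evaluation-index range contains the result. The weights and group ranges are hypothetical values for illustration:

```python
def weighted_index(section_indices, weights):
    """Normalized weighted sum of per-section evaluation indices E; the
    weights grow toward the selected (most recent) section."""
    return sum(e * w for e, w in zip(section_indices, weights)) / sum(weights)

# Hypothetical group ranges over the evaluation index E (N = 3 here).
GROUP_RANGES = [("G1", 0.0, 50.0), ("G2", 50.0, 75.0), ("G3", 75.0, 100.0)]

def group_for(value):
    """Pick the group whose evaluation-index range contains the value."""
    for name, lo, hi in GROUP_RANGES:
        if lo <= value < hi:
            return name
    return GROUP_RANGES[-1][0]

# Three sections ending with the selected one; closer sections weigh more.
e = weighted_index([60.0, 70.0, 80.0], weights=[1.0, 2.0, 3.0])
g = group_for(e)
```

Normalizing by the weight total keeps the result on the same scale as the per-section indices, so the same group ranges can be reused unchanged.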
(5) A plurality of pieces of reference information D (DA, DB) may be selectively applied to the specification of singing advice A in accordance with an instruction from the user. For example, in the first embodiment, a pair consisting of reference information DA1 with many indication time points T and reference information DA2 with few indication time points T is prepared for each piece of music, and the analysis processing unit 22 selectively uses the reference information DA1 or the reference information DA2 in accordance with an instruction from the user. When the reference information DA1 is applied, singing advice A is presented at many indication time points T in the target piece (that is, strict advice); when the reference information DA2 is applied, the number of indication time points T at which singing advice A is presented decreases (that is, lenient advice).
(6) In the second embodiment, the evaluation index E may be calculated for each section of the target piece, and the target singer may be notified when the evaluation index E of an indicated section exceeds a predetermined reference value (that is, when the target singer has overcome a weakness). With this configuration, the target singer can recognize that a weakness has been overcome, which has the advantage of sustaining the target singer's motivation to sing.
(7) The first and second embodiments illustrate the presentation of singing advice A, and the third embodiment illustrates the presentation of an evaluation result, but the content presented to the target singer is not limited to these examples. As understood from the illustrations of the above embodiments, the analysis processing unit 22 is comprehensively expressed as an element that specifies a comment (singing advice A or an evaluation result) corresponding to the tendency of the reference voices R in the group G to which the singing voice V of the target singer belongs.
(8) The singing analysis device 100 may also be realized as a server device (for example, a web server) that communicates with a communication terminal such as a networked karaoke device. For example, in a configuration in which the singing analysis device 100 of the first embodiment is realized as a server device, the analysis processing unit 22 specifies the singing advice A corresponding to the group G of the singing voice V received from the communication terminal via a communication network (the singing advice specifying process), and the presentation processing unit 24 transmits to the communication terminal a command for presenting the singing advice A to the target singer.
The singing analysis device according to each of the above aspects may be realized by hardware (electronic circuitry), such as a DSP (Digital Signal Processor), dedicated to the presentation of singing advice, or through the cooperation of a general-purpose arithmetic processing device such as a CPU (Central Processing Unit) and a program. A program according to a preferred aspect of the present invention causes a computer to function as an analysis processing unit that specifies a comment corresponding to the tendency of the reference voices of the group, among a plurality of reference voices recorded in advance, corresponding to the singing voice of a target singer, and as a presentation processing unit that presents the comment specified by the analysis processing unit to the target singer. A program according to another aspect causes a computer to function as an analysis processing unit that evaluates the singing voice of a target singer according to the tendency of the reference voices of the group, among a plurality of reference voices recorded in advance, corresponding to that singing voice. The program of the present invention may be provided in a form stored on a computer-readable recording medium and installed on a computer. The recording medium is, for example, a non-transitory recording medium, a good example being an optical recording medium (optical disc) such as a CD-ROM, but it may encompass any known form of recording medium such as a semiconductor recording medium or a magnetic recording medium. The program of the present invention may also be provided in the form of distribution via a communication network and installed on a computer.
The present invention is also specified as a method of operating the singing analysis device according to each of the above aspects (a singing analysis method). A singing analysis method according to a preferred aspect of the present invention includes an analysis process of specifying a comment corresponding to the tendency of the reference voices of the group, among a plurality of reference voices recorded in advance, corresponding to the singing voice of a target singer, and a presentation process of presenting the comment specified in the analysis process to the target singer. A singing analysis method according to another aspect includes an analysis process of evaluating the singing voice of a target singer according to the tendency of the reference voices of the group, among a plurality of reference voices recorded in advance, corresponding to that singing voice.
This application is based on Japanese Patent Application No. 2014-045957 filed on March 10, 2014, the contents of which are incorporated herein by reference.
According to the present invention, it is possible to present to a singer an appropriate comment corresponding to the singer's singing tendencies.
DESCRIPTION OF REFERENCE SIGNS: 100: singing analysis device; 12: arithmetic processing device; 14: storage device; 16: sound collection device; 18: display device; 22: analysis processing unit; 24: presentation processing unit.

Claims (9)

  1.  A singing analysis device comprising:
      an analysis processing unit that specifies a comment corresponding to a tendency of the reference voices of a group, among a plurality of reference voices recorded in advance, corresponding to a singing voice of a target singer; and
      a presentation processing unit that presents the comment specified by the analysis processing unit to the target singer.
  2.  The singing analysis device according to claim 1, wherein the analysis processing unit specifies, as the comment, singing advice corresponding to the tendency of the reference voices of the group corresponding to the singing voice of the target singer.
  3.  The singing analysis device according to claim 2, wherein the analysis processing unit refers to reference information that designates singing advice for each of a plurality of groups into which a plurality of reference voices sharing a piece of music with the singing voice of the target singer are classified, and specifies the singing advice of the group, among the plurality of groups, to which the singing voice of the target singer belongs.
  4.  The singing analysis device according to claim 3, wherein the analysis processing unit sequentially updates the group of the singing voice while the target singer sings the piece of music.
  5.  The singing analysis device according to claim 2, wherein the analysis processing unit calculates an evaluation index for the singing voice of the target singer, specifies the group, among the plurality of groups, in which the evaluation index is contained, selects unit information corresponding to the specified group, and specifies the singing advice designated by that unit information.
  6.  The singing analysis device according to claim 2, wherein the analysis processing unit refers to reference information that designates singing advice for each music attribute according to a tendency of the reference voices of a group, among a plurality of reference voices, in which voices of the target singer have been collected, and specifies the singing advice for a location corresponding to the music attribute in a piece of music sung by the target singer.
  7.  The singing analysis device according to claim 6, wherein the reference information designates singing advice with a particular interval between successive pitches as the music attribute, and
      the analysis processing unit specifies, for a location in the piece of music sung by the target singer at which the particular interval occurs, the singing advice designated for that interval in the reference information.
  8.  The singing analysis device according to claim 1, wherein the analysis processing unit specifies, as the comment, an evaluation result obtained by evaluating the singing voice of the target singer according to the tendency specified by the analysis processing unit.
  9.  A singing analysis method comprising:
      specifying a comment corresponding to a tendency of the reference voices of a group, among a plurality of reference voices recorded in advance, corresponding to a singing voice of a target singer; and
      presenting the specified comment to the target singer.
PCT/JP2015/057063 2014-03-10 2015-03-10 Singing analyzer WO2015137360A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
JP2014045957A JP2015169867A (en) 2014-03-10 2014-03-10 singing analyzer
JP2014-045957 2014-03-10

Publications (1)

Publication Number Publication Date
WO2015137360A1 true WO2015137360A1 (en) 2015-09-17

Family

ID=54071803

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2015/057063 WO2015137360A1 (en) 2014-03-10 2015-03-10 Singing analyzer

Country Status (2)

Country Link
JP (1) JP2015169867A (en)
WO (1) WO2015137360A1 (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2000029473A (en) * 1998-07-08 2000-01-28 Ricoh Co Ltd Playing music reproducing device
JP2007256617A (en) * 2006-03-23 2007-10-04 Yamaha Corp Musical piece practice device and musical piece practice system
JP2011203479A (en) * 2010-03-25 2011-10-13 Xing Inc Karaoke system, control method of karaoke system, and control program of karaoke system and information recording medium thereof
JP2012173721A (en) * 2011-02-24 2012-09-10 Yamaha Corp Singing voice evaluation device
JP2012203343A (en) * 2011-03-28 2012-10-22 Yamaha Corp Singing support device

Also Published As

Publication number Publication date
JP2015169867A (en) 2015-09-28

Similar Documents

Publication Publication Date Title
US9818396B2 (en) Method and device for editing singing voice synthesis data, and method for analyzing singing
EP3843083A1 (en) Method, system, and computer-readable medium for creating song mashups
CN103187046B (en) Display control unit and method
JP2016136251A (en) Automatic transcription of musical content and real-time musical accompaniment
US9355634B2 (en) Voice synthesis device, voice synthesis method, and recording medium having a voice synthesis program stored thereon
JP6759545B2 (en) Evaluation device and program
JP2008026622A (en) Evaluation apparatus
Mayor et al. Performance analysis and scoring of the singing voice
JP6102076B2 (en) Evaluation device
WO2014142200A1 (en) Voice processing device
JP2017027070A (en) Evaluation device and program
JP5447624B2 (en) Karaoke equipment
JP5919928B2 (en) Performance evaluation apparatus and program
WO2015137360A1 (en) Singing analyzer
JP6219750B2 (en) Singing battle karaoke system
JP5618743B2 (en) Singing voice evaluation device
JP5585320B2 (en) Singing voice evaluation device
JP5830840B2 (en) Voice evaluation device
JP6954780B2 (en) Karaoke equipment
JP5034642B2 (en) Karaoke equipment
CN112837698A (en) Singing or playing evaluation method and device and computer readable storage medium
JP5416396B2 (en) Singing evaluation device and program
JP2007240552A (en) Musical instrument sound recognition method, musical instrument annotation method and music piece searching method
JP2008040258A (en) Musical piece practice assisting device, dynamic time warping module, and program
JP2016180965A (en) Evaluation device and program

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 15760871

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 15760871

Country of ref document: EP

Kind code of ref document: A1