WO2007088820A1 - Karaoke machine and sound processing method - Google Patents

Karaoke machine and sound processing method

Info

Publication number
WO2007088820A1
Authority
WO
WIPO (PCT)
Prior art keywords
voice
singing
lyrics
model
section
Prior art date
Application number
PCT/JP2007/051413
Other languages
English (en)
Japanese (ja)
Inventor
Akane Noguchi
Original Assignee
Yamaha Corporation
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Yamaha Corporation filed Critical Yamaha Corporation
Publication of WO2007088820A1 publication Critical patent/WO2007088820A1/fr

Classifications

    • G: PHYSICS
    • G10: MUSICAL INSTRUMENTS; ACOUSTICS
    • G10H: ELECTROPHONIC MUSICAL INSTRUMENTS; INSTRUMENTS IN WHICH THE TONES ARE GENERATED BY ELECTROMECHANICAL MEANS OR ELECTRONIC GENERATORS, OR IN WHICH THE TONES ARE SYNTHESISED FROM A DATA STORE
    • G10H 1/00: Details of electrophonic musical instruments
    • G10H 1/36: Accompaniment arrangements
    • G10H 1/361: Recording/reproducing of accompaniment for use with an external source, e.g. karaoke systems
    • G10H 1/363: Recording/reproducing of accompaniment for use with an external source, using optical disks, e.g. CD, CD-ROM, to store accompaniment information in digital form
    • G10H 1/366: Recording/reproducing of accompaniment for use with an external source, with means for modifying or correcting the external signal, e.g. pitch correction, reverberation, changing a singer's voice
    • G10H 2210/00: Aspects or methods of musical processing having intrinsic musical character, i.e. involving musical theory or musical parameters or relying on musical knowledge, as applied in electrophonic musical tools or instruments
    • G10H 2210/031: Musical analysis, i.e. isolation, extraction or identification of musical elements or musical parameters from a raw acoustic signal or from an encoded audio signal
    • G10H 2210/046: Musical analysis for differentiation between music and non-music signals, based on the identification of musical parameters, e.g. based on tempo detection
    • G10H 2210/066: Musical analysis for pitch analysis as part of wider processing for musical purposes, e.g. transcription, musical performance evaluation; Pitch recognition, e.g. in polyphonic sounds; Estimation or use of missing fundamental
    • G10H 2210/091: Musical analysis for performance evaluation, i.e. judging, grading or scoring the musical qualities or faithfulness of a performance, e.g. with respect to pitch, tempo or other timings of a reference performance
    • G10L: SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L 25/00: Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L 25/03: Speech or voice analysis techniques characterised by the type of extracted parameters
    • G10L 25/15: Speech or voice analysis techniques in which the extracted parameters are formant information

Definitions

  • The present invention relates to a technique for scoring a singer's singing ability.
  • Some karaoke devices that perform automatically based on music data analyze the singer's voice input to a microphone and score the singer's singing ability.
  • The karaoke apparatus disclosed in Patent Document 1 recognizes the text of a singer's voice input to a microphone and evaluates how closely it matches the lyric text of the music. According to this karaoke apparatus, it is possible to evaluate whether or not the singer remembers the lyrics correctly.
  • Patent Document 1: JP-A-10-91172
  • The present invention has been made against this background, and its purpose is to make it possible to evaluate whether a singer remembers the lyrics correctly without complicating the system.
  • The present invention provides a karaoke apparatus having: a storage unit that stores model voice data representing a model voice produced when the song is sung correctly according to its lyrics; a voice input unit to which a singer's singing voice is input; a specifying unit that divides the model voice represented by the model voice data into a plurality of model voice sections and specifies, for each of the divided model voice sections, the section of the singing voice input to the voice input unit that corresponds to that model voice section; an evaluation unit that performs an evaluation by comparing each singing voice section with its corresponding model voice section; and a display unit that displays the evaluation result of the evaluation unit.
  • The evaluation unit may obtain a degree of coincidence between the singing voice of the singing voice section specified by the specifying unit and the model voice of the model voice section corresponding to that singing voice section, and perform the evaluation based on the obtained degree of coincidence.
  • The storage unit may also store lyrics data representing the lyrics of the music; when the degree of coincidence obtained by the evaluation unit is less than a predetermined value, the lyrics corresponding to the model voice section whose degree of coincidence is less than the predetermined value may be identified from the lyrics represented by the lyrics data and displayed on the display unit.
  • The evaluation unit may obtain a degree of coincidence between the formant frequencies of the singing voice and the formant frequencies of the model voice.
  • The evaluation unit may obtain a first spectrum envelope from the voice waveform of the singing voice section specified by the specifying unit and a second spectrum envelope from the voice waveform of the model voice section corresponding to that singing voice section, extract the formant frequencies of the singing voice from the first spectrum envelope, and extract the formant frequencies of the model voice from the second spectrum envelope.
  • The evaluation unit may obtain a degree of coincidence between the voice waveform of the singing voice section specified by the specifying unit and the voice waveform of the model voice section corresponding to that singing voice section, and perform the evaluation based on that degree of coincidence.
  • The present invention also provides a sound processing method comprising: a specifying process that divides the model voice represented by model voice data (the voice of the song sung correctly according to its lyrics) into a plurality of model voice sections and specifies, for each model voice section, the corresponding section of the singer's singing voice input to the voice input unit; an evaluation process that performs an evaluation by comparing each singing voice section with its corresponding model voice section; and a display process that displays the evaluation result on the display unit.
  • The evaluation process may obtain a degree of coincidence between the singing voice of the specified singing voice section and the model voice of the model voice section corresponding to that singing voice section, and perform the evaluation based on the obtained degree of coincidence.
  • When the degree of coincidence obtained by the evaluation process is less than a predetermined value, the sound processing method may further identify, from the lyrics represented by the lyrics data, the lyrics of the music corresponding to the model voice section whose degree of coincidence is less than the predetermined value, and the display process may display the identified lyrics on the display unit.
  • In the evaluation process, the degree of coincidence between the formant frequencies of the singing voice and the formant frequencies of the model voice may be obtained.
  • The evaluation process may obtain a first spectrum envelope from the voice waveform of the specified singing voice section and a second spectrum envelope from the voice waveform of the model voice section corresponding to that singing voice section, extract the formant frequencies of the singing voice from the first spectrum envelope, and extract the formant frequencies of the model voice from the second spectrum envelope.
  • The evaluation process may obtain the degree of coincidence between the voice waveform of the singing voice in the specified singing voice section and the voice waveform of the model voice in the model voice section corresponding to that singing voice section, and perform the evaluation based on that degree of coincidence.
  • FIG. 1 is an external view of a karaoke apparatus according to an embodiment of the present invention.
  • FIG. 2 is a block diagram showing a hardware configuration of the karaoke apparatus according to the embodiment of the present invention.
  • FIG. 3 is a diagram illustrating a format of music data in the embodiment of the present invention.
  • FIG. 4 is a diagram illustrating an example of a lyrics table format.
  • FIG. 5 is a diagram illustrating the waveform of the model voice and the waveform of the singing voice.
  • FIG. 6 is a diagram showing the waveform of the model voice and the waveform of the singing voice divided into multiple frames.
  • FIG. 1 is an external view of a karaoke apparatus according to an embodiment of the present invention. As shown in the figure, the karaoke device 1 is connected to a monitor 2, speakers 3L and 3R, and a microphone 4. The karaoke device 1 is operated remotely by infrared signals transmitted from a remote control device 5.
  • FIG. 2 is a block diagram showing a hardware configuration of the karaoke apparatus 1.
  • A CPU (Central Processing Unit) 102 controls each part of the karaoke device 1 by executing various programs stored in a ROM (Read Only Memory) 103, using a RAM (Random Access Memory) 104 as a work area.
  • The RAM 104 has a music storage area for temporarily storing music data.
  • The storage unit 105 includes a hard disk device and stores various data, such as the music data described later and digital data of the singing voice input from the microphone 4.
  • The communication unit 108 receives music data from a host computer (not shown), which is a music data distribution source, via a communication network (not shown) such as the Internet, and transfers the received music data to the storage unit 105 under the control of the CPU 102.
  • The music data may instead be stored in the storage unit 105 in advance.
  • Alternatively, the karaoke device 1 may be provided with a reading device that reads recording media such as CD-ROMs and DVDs, in which case the music data recorded on such media is read by the reading device and transferred to the storage unit 105 for storage.
  • The music data in the present embodiment includes a header, musical tone data (WAVE data representing the karaoke performance sound), model voice data (WAVE-format data representing the waveform of the model voice produced when the lyrics of the music are sung correctly), and a lyrics table storing lyrics data representing the lyrics of the music.
  • FIG. 4 is a diagram illustrating the format of the lyrics table.
  • In the lyrics table, lyrics data representing the lyrics of the music to be played is associated with time interval data indicating the time interval in which the lyrics represented by that lyrics data should be pronounced when tones are output according to the musical tone data.
  • For example, the lyrics data on the first line represents the lyrics "Kamereon ga", and the time interval data "01:00-01:02" associated with it indicates that, in the model voice, these lyrics are pronounced between 1 minute 0 seconds and 1 minute 2 seconds after the music performance begins.
  • The lyrics data on the second line represents the lyrics "Ichikitaichi", and the time interval data "01:03-01:06" associated with it indicates that, in the model voice, these lyrics are pronounced between 1 minute 3 seconds and 1 minute 6 seconds after the music performance begins.
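  • For illustration only, a minimal Python sketch of such a lyrics table and the time-interval lookup it supports might look as follows; the field names and the mm:ss string format are assumptions made for this example, not part of the patent.

```python
# Hypothetical sketch of a lyrics table: each row pairs lyrics data with the
# time interval (mm:ss) in which the model voice pronounces those lyrics.

def to_seconds(mmss: str) -> int:
    """Convert an "mm:ss" string into a number of seconds."""
    minutes, seconds = mmss.split(":")
    return int(minutes) * 60 + int(seconds)

LYRICS_TABLE = [
    {"lyrics": "Kamereon ga", "start": "01:00", "end": "01:02"},
    {"lyrics": "Ichikitaichi", "start": "01:03", "end": "01:06"},
]

def lookup_interval(elapsed_seconds: float):
    """Return the row whose time interval contains the elapsed time, if any."""
    for row in LYRICS_TABLE:
        if to_seconds(row["start"]) <= elapsed_seconds <= to_seconds(row["end"]):
            return row
    return None

print(lookup_interval(60))  # -> the "Kamereon ga" row
```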
  • The microphone 4 converts the singer's singing voice into an audio signal and outputs it.
  • The audio signal output from the microphone 4 is input to an audio processing DSP (Digital Signal Processor) 111 and to an amplifier 112.
  • The voice processing DSP 111 performs A/D conversion on the input audio signal and generates singing voice data representing the singing voice.
  • This singing voice data is stored in the storage unit 105 and compared with the model voice data for scoring the singer's singing ability.
  • The input unit 106 detects signals generated by input operations on the operation panel provided on the karaoke apparatus 1 or on the remote control apparatus 5, and outputs the detection results to the CPU 102.
  • The display control unit 107 displays video and the score of the singer's singing ability on the monitor 2 under the control of the CPU 102.
  • The tone generator 109 generates a musical tone signal corresponding to the supplied musical tone data and outputs the generated signal to the effect DSP 110 as a karaoke performance sound.
  • The effect DSP 110 applies effects such as reverberation and echo to the musical tone signal generated by the tone generator 109.
  • The effected musical tone signal is D/A converted by the effect DSP 110 and output to the amplifier 112.
  • The amplifier 112 mixes and amplifies the musical tone signal output from the effect DSP 110 and the audio signal output from the microphone 4, and outputs the result to the speakers 3L and 3R. As a result, the melody of the music and the voice of the singer are output from the speakers 3L and 3R.
  • The song data of the designated song is transferred from the storage unit 105 to the music storage area of the RAM 104 by the CPU 102.
  • The CPU 102 executes karaoke accompaniment processing by sequentially reading the various data included in the music data stored in this music storage area.
  • The CPU 102 reads out the musical tone data included in the music data and outputs it to the tone generator 109.
  • The tone generator 109 generates a musical tone signal of a predetermined timbre based on the supplied musical tone data and outputs the generated signal to the effect DSP 110.
  • Effects such as reverberation and echo are applied to the musical tone signal output from the tone generator 109.
  • The musical tone signal to which the effects have been applied is D/A converted by the effect DSP 110 and output to the amplifier 112.
  • The amplifier 112 amplifies the musical tone signal output from the effect DSP 110 and outputs it to the speakers 3L and 3R. As a result, the melody of the music is output from the speakers 3L and 3R.
  • The CPU 102 starts counting the elapsed time once output of the music begins.
  • The singer's voice is input to the microphone 4, and an audio signal is output from the microphone 4.
  • The voice processing DSP 111 performs A/D conversion on the audio signal output from the microphone 4 to generate singing voice data representing the singing voice.
  • This singing voice data is stored in the storage unit 105.
  • The CPU 102 continues counting the elapsed time and searches the lyrics table for a time interval that includes the counted time (a time interval whose start time has been reached). It then reads out the retrieved time interval and the lyrics data stored in association with it. For example, when the counted elapsed time is 01:00, the time interval "01:00-01:02" and the lyrics data "Kamereon ga" on the first line of the lyrics table shown in FIG. 4 are read out.
  • When the CPU 102 reads a time interval, it compares the voice input to the microphone 4 during that time interval with the model voice of the same time interval, and judges whether or not the singer sang the lyrics correctly. Specifically, the CPU 102 analyzes the voice represented by the model voice data and, as shown in FIG. 5, extracts the voice waveform A lying in the read time interval (01:00-01:02) on the time axis of the waveform represented by the model voice data. The CPU 102 also analyzes the stored singing voice data and, as shown in FIG. 5, extracts the voice waveform B lying in the read time interval on the time axis of the waveform represented by the singing voice data.
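  • As a rough sketch (assuming both recordings are decoded into NumPy arrays at an assumed 16 kHz sampling rate), extracting the waveform segments A and B for a read time interval amounts to simple slicing:

```python
import numpy as np

def extract_interval(waveform: np.ndarray, start_s: float, end_s: float,
                     sample_rate: int = 16000) -> np.ndarray:
    """Return the slice of the waveform lying between start_s and end_s seconds."""
    return waveform[int(start_s * sample_rate):int(end_s * sample_rate)]

# Placeholders standing in for the decoded model voice data and singing voice data.
model_voice = np.zeros(16000 * 180)
singing_voice = np.zeros(16000 * 180)

# Waveform A: model voice between 01:00 and 01:02; waveform B: the same interval of the singing voice.
waveform_a = extract_interval(model_voice, 60.0, 62.0)
waveform_b = extract_interval(singing_voice, 60.0, 62.0)
```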
  • As shown in FIG. 6(a), the extracted voice waveform A is divided at a predetermined time interval (for example, 10 ms) into a plurality of frames.
  • As shown in FIG. 6(b), the extracted voice waveform B is likewise divided at the predetermined time interval (for example, 10 ms) into a plurality of frames.
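  • A minimal sketch of this frame division, assuming 16 kHz PCM samples held in a NumPy array (the sampling rate and array layout are assumptions of the example):

```python
import numpy as np

def split_into_frames(waveform: np.ndarray, sample_rate: int = 16000,
                      frame_ms: int = 10) -> np.ndarray:
    """Cut a 1-D waveform into consecutive, non-overlapping frames of frame_ms milliseconds."""
    frame_len = int(sample_rate * frame_ms / 1000)
    usable = len(waveform) - len(waveform) % frame_len  # drop the trailing partial frame
    return waveform[:usable].reshape(-1, frame_len)

# Example: 1.5 s of audio at 16 kHz -> 150 frames of 10 ms (160 samples) each.
frames = split_into_frames(np.zeros(24000))
print(frames.shape)  # (150, 160)
```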
  • The CPU 102 associates the voice waveform of each frame of the model voice with the voice waveform of each frame of the singing voice using a DP (Dynamic Programming) matching method.
  • For example, if the voice waveform of frame A1 of the model voice corresponds to the voice waveform of frame B1 of the singing voice, frame A1 and frame B1 are associated with each other. If the voice waveform of frame A2 of the model voice corresponds to the voice waveforms of frames B2 through B3 of the singing voice, frame A2 is associated with frames B2 through B3.
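  • The patent does not spell out the DP matching itself; the following is a generic dynamic-time-warping sketch over per-frame feature vectors (the Euclidean frame distance and the feature representation are assumptions) that yields the kind of one-to-many frame correspondence described above.

```python
import numpy as np

def dtw_align(model_feats: np.ndarray, sung_feats: np.ndarray):
    """Align model frames to singing frames with classic dynamic time warping.

    model_feats, sung_feats: (num_frames, feat_dim) arrays.
    Returns a list of (model_frame_index, sung_frame_index) pairs; one model
    frame may map to several singing frames (e.g. A2 -> B2 and B3).
    """
    n, m = len(model_feats), len(sung_feats)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = np.linalg.norm(model_feats[i - 1] - sung_feats[j - 1])
            cost[i, j] = d + min(cost[i - 1, j], cost[i, j - 1], cost[i - 1, j - 1])

    # Backtrack the cheapest path from the end of both sequences to the start.
    path, i, j = [], n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        step = int(np.argmin([cost[i - 1, j - 1], cost[i - 1, j], cost[i, j - 1]]))
        if step == 0:
            i, j = i - 1, j - 1
        elif step == 1:
            i -= 1
        else:
            j -= 1
    return list(reversed(path))
```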
  • The CPU 102 then compares the characteristics of the voice waveforms between corresponding frames. Specifically, the CPU 102 performs a Fourier transform on the voice waveform of each frame of the model voice, obtains the logarithm of the resulting amplitude spectrum, and applies an inverse transform to generate a spectrum envelope for each frame. From the obtained spectrum envelope, the CPU 102 extracts the first formant frequency f11, the second formant frequency f12, and the third formant frequency f13.
  • Similarly, the CPU 102 performs a Fourier transform on the voice waveform of each frame of the singing voice associated with a frame of the model voice, obtains the logarithm of the amplitude spectrum, and applies an inverse transform to generate a spectrum envelope for each frame. From the obtained spectrum envelope, the CPU 102 extracts the first formant frequency f21, the second formant frequency f22, and the third formant frequency f23.
  • For example, the CPU 102 generates the spectrum envelope of frame A1 of the model voice and extracts the formant frequencies f11 to f13 of the first to third formants from it. The CPU 102 then generates the spectrum envelope of the voice waveform of frame B1 associated with frame A1 and extracts the formant frequencies f21 to f23 of the first to third formants from it. Likewise, the CPU 102 generates the spectrum envelope of frame A2 of the model voice and extracts its formant frequencies f11 to f13, then generates the spectrum envelope of the voice waveform of frames B2 through B3 associated with frame A2 and extracts its formant frequencies f21 to f23.
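  • The Fourier transform, log amplitude, and inverse transform procedure described above is essentially cepstral smoothing. A rough sketch of one frame's processing follows; the analysis window, the liftering order, and the peak picking with scipy.signal.find_peaks are assumptions of the example, and with 10 ms frames the 100 Hz frequency resolution makes the formant estimates coarse.

```python
import numpy as np
from scipy.signal import find_peaks

def formant_frequencies(frame: np.ndarray, sample_rate: int = 16000,
                        num_formants: int = 3, lifter: int = 30) -> np.ndarray:
    """Estimate formant frequencies of one frame via a cepstrally smoothed spectrum envelope."""
    spectrum = np.fft.rfft(frame * np.hanning(len(frame)))   # Fourier transform
    log_mag = np.log(np.abs(spectrum) + 1e-10)               # log amplitude spectrum
    cepstrum = np.fft.irfft(log_mag)                         # inverse transform -> cepstrum
    cepstrum[lifter:len(cepstrum) - lifter] = 0.0            # keep only low quefrencies
    envelope = np.fft.rfft(cepstrum).real                    # smoothed spectrum envelope (log scale)
    peaks, _ = find_peaks(envelope)                          # candidate formant peaks
    freqs = peaks * sample_rate / (2 * (len(envelope) - 1))  # bin index -> Hz
    return freqs[:num_formants]                              # roughly f1, f2, f3
```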
  • The CPU 102 compares the formant frequencies f11 to f13 extracted from each frame of the model voice with the formant frequencies f21 to f23 extracted from the singing voice frames associated with that model voice frame. Specifically, the CPU 102 compares the difference between f11 and f21, the difference between f12 and f22, and the difference between f13 and f23; if a difference is greater than or equal to a predetermined value, mismatch information D indicating that the formant frequencies do not match is added to that frame of the model voice.
  • For example, if the differences between the formant frequencies f11 to f13 of frame A1 and the formant frequencies f21 to f23 of frame B1 are all less than the predetermined value, the CPU 102 judges that the voices of the corresponding frames match, and mismatch information D is not added to frame A1. If the differences between the formant frequencies f11 to f13 of frame A2 and the formant frequencies f21 to f23 of the voice waveform of frames B2 through B3 are greater than or equal to the predetermined value, mismatch information D is added to frame A2.
  • The CPU 102 determines match/mismatch between the formant frequencies of each frame of the model voice and the formant frequencies of the associated singing voice frames, and counts the number N of model voice frames to which mismatch information D has been added. Next, the CPU 102 compares N with the total number M of frames into which the model voice data was divided. If N is more than half of M, the CPU 102 determines that the lyrics pronounced by the singer differ from the lyrics of the model voice, i.e., from the lyrics represented by the read lyrics data.
  • Otherwise, the CPU 102 determines that the lyrics pronounced by the singer are the same as the lyrics of the model voice. For example, if the number N of frames with mismatch information is less than half of the total number of frames M for the voice "Kamereon ga" represented by the model voice data, the CPU 102 determines that the lyrics pronounced by the singer are the same as the lyrics of the model voice.
  • The determination that the lyrics pronounced by the singer differ from the lyrics of the model voice may instead be made when the ratio of N to the total number of frames M exceeds a predetermined ratio other than 50%.
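  • Putting the comparison and the decision rule together, a minimal sketch might look like the following; the 300 Hz tolerance is an assumed stand-in for the patent's unspecified "predetermined value", and the 50% ratio is the default described above.

```python
def lyrics_match(model_formants, sung_formants, tolerance_hz=300.0, ratio=0.5) -> bool:
    """Decide whether the sung lyrics match the model lyrics.

    model_formants / sung_formants: per-frame (f1, f2, f3) tuples, already
    aligned frame-to-frame by the DP matching step.
    """
    mismatch = [
        any(abs(fm - fs) >= tolerance_hz for fm, fs in zip(m, s))  # mismatch information D per frame
        for m, s in zip(model_formants, sung_formants)
    ]
    n, m_total = sum(mismatch), len(mismatch)
    return n <= ratio * m_total  # True: judged the same lyrics; False: judged different
```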
  • The CPU 102 continues counting the elapsed time in parallel with the comparison between the model voice and the singing voice.
  • When the counted elapsed time reaches 01:03, the CPU 102 reads the time interval "01:03-01:06" and the lyrics data "Ichikitaichi" from the second row of the lyrics table shown in FIG. 4.
  • Meanwhile, singing voice data representing the voice sung by the singer during this time interval is generated and stored in the storage unit 105.
  • As before, the CPU 102 divides the waveform of the voice input to the microphone 4 during this time interval and the waveform of the model voice of this time interval into a plurality of frames, associates the voice waveform of each frame of the model voice with the voice waveform of each frame of the singing voice, and compares the formant frequencies of the voice waveforms between the associated frames. The CPU 102 then judges, for each frame of the model voice, whether its formant frequencies match those of the associated singing voice waveform and adds mismatch information D where they do not, after which it compares the total number M of frames of the divided model voice data with the number N of frames to which mismatch information was added, thereby determining whether or not the singer sang the lyrics correctly.
  • The CPU 102 repeats the reading of lyrics data and model voice data and the determination of whether the singer sang the lyrics correctly as the music is played back. When all the performance event data has been read, the karaoke accompaniment processing ends.
  • According to the present embodiment, it is possible to determine whether or not the singer has sung according to the lyrics without performing voice recognition using a dictionary.
  • Since all that is required is voice data of the song sung correctly according to the lyrics, lyrics in various languages can be handled without complicating the system, and the singer's ability to remember the lyrics correctly can be evaluated.
  • In a modification, the pitch of the voice represented by the singing voice data may be corrected so that the pitch of the voice waveform represented by the singing voice data matches the pitch of the voice waveform represented by the model voice data.
  • It is also possible to detect the pitch variation of the voice waveform represented by the model voice data (for example, a singing technique in which a note is first sung lower than the target pitch and then brought up toward it), obtain the degree of coincidence between that pitch variation and the pitch variation of the voice waveform represented by the singing voice data, and use it to determine whether or not the singer is singing correctly.
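  • As an illustrative sketch of that modification, per-frame pitch could be estimated by autocorrelation and the two pitch contours compared by correlation; the autocorrelation estimator, the 80-400 Hz search range, and the assumption of analysis frames long enough to contain at least one pitch period are all choices made for the example.

```python
import numpy as np

def frame_pitch(frame: np.ndarray, sample_rate: int = 16000,
                fmin: float = 80.0, fmax: float = 400.0) -> float:
    """Rough per-frame pitch estimate by picking the strongest autocorrelation lag."""
    frame = frame - frame.mean()
    ac = np.correlate(frame, frame, mode="full")[len(frame) - 1:]
    lo, hi = int(sample_rate / fmax), int(sample_rate / fmin)
    lag = lo + int(np.argmax(ac[lo:hi]))
    return sample_rate / lag

def pitch_contour_similarity(model_frames, sung_frames) -> float:
    """Correlation between the model and singing pitch contours (frames already aligned)."""
    model_f0 = np.array([frame_pitch(f) for f in model_frames])
    sung_f0 = np.array([frame_pitch(f) for f in sung_frames])
    return float(np.corrcoef(model_f0, sung_f0)[0, 1])
```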
  • Alternatively, the voice waveform represented by the model voice data and the voice waveform represented by the singing voice data may each be divided into a plurality of frequency bands by a plurality of band-pass filters, and the correctness of the lyrics may be determined from the degree of coincidence of the voice feature quantities obtained for each frequency band.
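  • As an illustrative sketch of that band-by-band comparison (the band edges, the Butterworth filters, and the cosine-similarity measure are assumptions of the example; the patent only states that a plurality of band-pass filters is used):

```python
import numpy as np
from scipy.signal import butter, lfilter

def band_energies(waveform: np.ndarray, sample_rate: int = 16000,
                  bands=((100, 500), (500, 1500), (1500, 3500))) -> np.ndarray:
    """Energy of the signal in each frequency band after band-pass filtering."""
    energies = []
    for low, high in bands:
        b, a = butter(4, [low, high], btype="bandpass", fs=sample_rate)
        energies.append(float(np.sum(lfilter(b, a, waveform) ** 2)))
    return np.array(energies)

def band_feature_agreement(model_wave: np.ndarray, sung_wave: np.ndarray) -> float:
    """Cosine similarity between the per-band energies of the model voice and the singing voice."""
    a, b = band_energies(model_wave), band_energies(sung_wave)
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-10))
```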
  • The formant frequencies obtained by analyzing the voice waveform represented by the stored model voice data may be stored in the storage unit 105 in advance, and the stored formant frequencies may be compared with the formant frequencies of each frame of the singer's voice waveform to determine the degree of coincidence.
  • The correctness of the lyrics sung by the singer may also be determined after the singer has finished singing the music.
  • In addition, a message or image informing the singer that the lyrics are incorrect may be displayed on the monitor 2.

Landscapes

  • Physics & Mathematics (AREA)
  • Engineering & Computer Science (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Reverberation, Karaoke And Other Acoustics (AREA)

Abstract

The present invention makes it possible to evaluate whether a singer has memorized the lyrics of a song correctly without complicating the system. A CPU (102) divides the waveform of a model voice represented by model voice data into frames and divides the waveform of a singing voice represented by singing voice data stored in a storage unit (105) into frames. The CPU (102) then associates the voice waveform of each frame of the model voice with the voice waveform of each frame of the singing voice, compares the formant frequencies of the voice waveforms of the corresponding frames, and judges whether or not the model voice matches the singing voice. If it does not, the CPU (102) displays on a monitor (2) the lyrics corresponding to the voice represented by the model voice data.
PCT/JP2007/051413 2006-01-31 2007-01-29 Karaoke machine and sound processing method WO2007088820A1 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
JP2006022648A JP4862413B2 (ja) 2006-01-31 2006-01-31 Karaoke apparatus
JP2006-022648 2006-01-31

Publications (1)

Publication Number Publication Date
WO2007088820A1 true WO2007088820A1 (fr) 2007-08-09

Family

ID=38327393

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2007/051413 WO2007088820A1 (fr) Karaoke machine and sound processing method

Country Status (2)

Country Link
JP (1) JP4862413B2 (fr)
WO (1) WO2007088820A1 (fr)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP6217304B2 (ja) * 2013-10-17 2017-10-25 Yamaha Corporation Singing evaluation apparatus and program
CN104978961B (zh) * 2015-05-25 2019-10-15 Guangzhou Kugou Computer Technology Co., Ltd. Audio processing method, apparatus and terminal
US20180158469A1 (en) * 2015-05-25 2018-06-07 Guangzhou Kugou Computer Technology Co., Ltd. Audio processing method and apparatus, and terminal

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH1195760A (ja) * 1997-09-16 1999-04-09 Ricoh Co Ltd Music reproducing apparatus
JP2001117568A (ja) * 1999-10-21 2001-04-27 Yamaha Corp Singing evaluation apparatus and karaoke apparatus
JP2006227587A (ja) * 2005-01-20 2006-08-31 Advanced Telecommunication Research Institute International Pronunciation evaluation apparatus and program

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPS60262187A (ja) * 1984-06-08 1985-12-25 Matsushita Electric Industrial Co., Ltd. Scoring apparatus
JP3754741B2 (ja) * 1996-03-07 2006-03-15 Xing Inc. Karaoke apparatus
JP3673405B2 (ja) * 1998-07-08 2005-07-20 Ricoh Co., Ltd. Musical piece reproducing apparatus

Also Published As

Publication number Publication date
JP4862413B2 (ja) 2012-01-25
JP2007206183A (ja) 2007-08-16

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application
NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 07707644

Country of ref document: EP

Kind code of ref document: A1