JP2008170685A - Voice evaluation device and karaoke device - Google Patents


Info

Publication number
JP2008170685A
JP2008170685A (application JP2007003395A)
Authority
JP
Japan
Prior art keywords
viewer
scoring
behavior
means
music
Prior art date
Legal status
Granted
Application number
JP2007003395A
Other languages
Japanese (ja)
Other versions
JP4655047B2 (en)
Inventor
Takahiro Tanaka
孝浩 田中
Original Assignee
Yamaha Corp
ヤマハ株式会社
Priority date
Filing date
Publication date
Application filed by Yamaha Corp (ヤマハ株式会社)
Priority to JP2007003395A
Publication of JP2008170685A
Application granted
Publication of JP4655047B2
Application status: Expired - Fee Related
Anticipated expiration

Abstract

The present invention provides a technique capable of scoring singing or musical-instrument performance in a manner closer to human evaluation than conventional techniques.
When a song is selected by a singer, the karaoke apparatus 10 starts karaoke accompaniment of the selected song. When the singer sings along with the accompaniment, the singing voice is picked up by the microphone 15 and converted into voice data. The control unit 11 of the karaoke apparatus 10 compares the voice data output from the microphone 15 with guide melody data and scores the singing voice based on the comparison result. At the same time, the control unit 11 analyzes the video signal supplied from the photographing unit 18 to detect the behavior of the viewers, and corrects the scoring result based on the detected viewer behavior.
[Selection] Figure 1

Description

  The present invention relates to a technique for evaluating speech.

For karaoke apparatuses, various methods for scoring the skill of a singer have been proposed: for example, a method that detects the volume and pitch of the singing voice and compares them with a guide melody (see Patent Documents 1 to 6).
Patent Document 1: JP-A-10-78749; Patent Document 2: JP-A-10-78750; Patent Document 3: JP-A-10-49183; Patent Document 4: JP-A-10-69216; Patent Document 5: JP-A-10-91172; Patent Document 6: JP-A-10-282978

Incidentally, most people who sing do not sing mechanically along the score but use singing techniques such as vibrato and kobushi (a melisma-like vocal ornament). In conventional apparatuses, however, a song is generally scored by its difference from reference data such as a guide melody, so the more mechanically a singer follows the reference, the higher the score. Singing in one's own style, with various singing techniques or arrangements, often worsens the scoring result. The same applies to the performance of musical instruments.
The present invention has been made in view of such circumstances, and provides a technique capable of scoring singing or musical-instrument performance in a manner closer to human evaluation than conventional techniques.

A speech evaluation apparatus according to a preferred aspect of the present invention comprises: reference data storage means for storing reference data referred to when scoring speech; scoring means for scoring the speech based on a comparison between a speech signal representing the speech and the reference data stored in the reference data storage means; viewer behavior detection means for detecting the behavior of a viewer watching the person who emits the speech; correction means for analyzing the behavior detected by the viewer behavior detection means and correcting the scoring result of the scoring means based on the analysis result; and output means for outputting data indicating the scoring result corrected by the correction means.
In the above aspect, the viewer behavior detection means may detect the viewer's behavior by performing image analysis on video captured by photographing means that photographs the viewer.
The viewer behavior detection means may also detect the viewer's behavior by analyzing a voice signal output from sound collecting means that collects the viewer's voice.
Alternatively, the viewer behavior detection means may detect the viewer's behavior based on an operation signal output from operation means that outputs a signal corresponding to the operated content.

  Further, the above aspect may comprise correspondence storage means for storing correspondences between combinations of a music type and a viewer behavior and correction mode data indicating how the scoring is to be corrected, and music type determination means for determining the type of the music. In that case, the correction means may refer to the correspondence storage means to specify the scoring correction corresponding to the combination of the music type determined by the music type determination means and the behavior detected by the viewer behavior detection means, and may correct the scoring result of the scoring means in the specified manner.

  A karaoke apparatus according to a preferred aspect of the present invention comprises the above-described voice evaluation apparatus, music data storage means for storing music data representing music, and reproducing means for reproducing the music data stored in the music data storage means.

  According to the present invention, singing or musical-instrument performance can be scored in a manner closer to human evaluation than before.

Next, the best mode for carrying out the present invention will be described.
<A: Configuration>
FIG. 1 is a block diagram showing the hardware configuration of a karaoke apparatus 10 according to an embodiment of the present invention. The karaoke apparatus 10 has a function of playing karaoke accompaniment and a function of scoring a singer's song. In the figure, the control unit 11 includes a CPU (Central Processing Unit), a ROM (Read Only Memory), and a RAM (Random Access Memory); it reads and executes a computer program stored in the ROM or the storage unit 12, and controls the respective units of the karaoke apparatus 10 via the bus BUS. The storage unit 12 stores programs executed by the control unit 11 and data used during execution, and is, for example, a hard disk device. The display unit 13 includes a liquid crystal panel or the like and, under the control of the control unit 11, displays various screens such as a menu screen for operating the karaoke apparatus 10 and a karaoke screen in which a lyrics telop (caption) is superimposed on a background image. The operation unit 14 outputs an operation signal corresponding to a user operation to the control unit 11. The microphone 15 is a sound collecting device that picks up the sound emitted by the user (hereinafter referred to as "input sound") and outputs an analog electric signal representing the waveform of the input sound on the time axis. The sound processing unit 16 converts the electrical signal input from the microphone 15 into digital data; it also converts digital data into an analog signal and outputs it to the speaker 17. The speaker 17 is a sound emitting unit that emits sound with an intensity corresponding to the analog signal output by the sound processing unit 16.
The photographing unit 18 is a photographing unit that photographs a person who views the singer's song (hereinafter referred to as “viewer”), and outputs a video signal representing the photographed video to the control unit 11. The photographing unit 18 may be, for example, a surveillance camera installed in a karaoke box, or may be a dedicated video photographing device provided for viewer photographing.

  In this embodiment, the case where the microphone 15 and the speaker 17 are included in the karaoke apparatus 10 is described. However, the audio processing unit 16 may instead be provided with an input terminal and an output terminal, with an external microphone connected to the input terminal via an audio cable and, similarly, an external speaker connected to the output terminal via an audio cable. Likewise, although this embodiment describes the case where the audio signal input from the microphone 15 to the audio processing unit 16 and the audio signal output from the audio processing unit 16 to the speaker 17 are analog signals, digital audio data may be input and output instead. In that case, the audio processing unit 16 need not perform A/D or D/A conversion.

  The storage unit 12 has a music data storage area 121 and a correspondence storage area 122 as shown in the figure. The music data storage area 121 stores music data for the many songs played back during karaoke performance. Each set of music data includes accompaniment data constituting the accompaniment sounds of the song in a data format such as MIDI (Musical Instrument Digital Interface), lyrics data representing the lyrics displayed as a lyrics telop during karaoke accompaniment, and guide melody data representing the guide melody of the song, also in a format such as MIDI. In this embodiment, the guide melody data is used as the reference data referred to when scoring the singer's voice.

  The correspondence storage area 122 stores a table indicating correspondences between viewer behaviors and correction mode data indicating how the scoring is corrected. FIG. 2 shows an example of the contents of this table. As shown in the figure, the items "viewer behavior" and "correction mode" are stored in association with each other. The "viewer behavior" item stores behavior type information indicating the type of viewer behavior, such as "applause", "linking shoulders", "smile", or "looking down". The "correction mode" item stores correction mode data indicating how the scoring is corrected, such as "+15", "+20", or "−10". The control unit 11 of the karaoke apparatus 10 corrects the scoring result of a song with reference to this table.

  Next, the functional configuration of the karaoke apparatus 10 will be described with reference to FIG. FIG. 3 is a block diagram showing a functional configuration of the karaoke apparatus 10. In the figure, the scoring unit 111 and the behavior detecting unit 112 are realized by the control unit 11 of the karaoke apparatus 10 executing a computer program stored in the ROM or the storage unit 12. The arrows in the figure schematically show the flow of data. Note that the scoring unit 111, the behavior detection unit 112, and the like may be realized by hardware such as a DSP (Digital Signal Processor) dedicated to audio processing.

In FIG. 3, the scoring unit 111 has a function of comparing the singer voice data supplied via the voice processing unit 16 with the guide melody data stored in the music data storage area 121 and scoring the singer's voice (hereinafter referred to as "singer voice") based on the comparison result. For this scoring, for example, voice characteristics such as pitch and power may be compared between the singer voice data and the guide melody data, with a smaller difference yielding a better score. The content of this scoring process is the same as the scoring process performed in conventional apparatuses, and a detailed description is omitted here.
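As an illustration only, the pitch comparison described above might be sketched as follows. The function name and the per-semitone penalty scheme are assumptions for this sketch, not the patent's actual algorithm:

```python
def score_singing(sung_pitches, guide_pitches, max_score=100.0, penalty_per_semitone=8.0):
    """Score a performance by its average pitch deviation from the guide melody.

    Both inputs are per-frame pitch estimates in semitones (e.g. MIDI note
    numbers); a smaller deviation yields a higher score.
    """
    deviations = [abs(s - g) for s, g in zip(sung_pitches, guide_pitches)]
    mean_dev = sum(deviations) / len(deviations)
    # Subtract a penalty proportional to the mean deviation, floored at zero.
    return max(0.0, max_score - penalty_per_semitone * mean_dev)
```

A perfect match returns the maximum score; singing one semitone sharp throughout would cost one penalty unit.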
In this embodiment, the guide melody data is used as the reference data referred to when scoring the voice, but the reference data is not limited to guide melody data; any data that allows scoring may be used.

  Next, the behavior detection unit 112 shown in FIG. 3 has a function of analyzing the video signal output from the photographing unit 18 and detecting the viewer's behavior. An example of the viewer behavior detection processing performed by the behavior detection unit 112 is described below. In this embodiment, the behavior detection unit 112 analyzes the video signal output from the photographing unit 18, performs person (individual) detection processing and face detection processing, and detects the viewer's behavior based on the detection results.

  The behavior detection unit 112 distinguishes individual objects and moving objects in the video stream over a predetermined time and performs person detection. It then detects the viewer's behavior by comparing the detected motion pattern of each person with predetermined patterns. Specifically, for example, data indicating an applause motion pattern is stored in the storage unit 12 in advance; this data is compared with the detected motion pattern, and when the degree of coincidence is at or above a predetermined level, it is determined that the viewer is applauding. Similarly, the behavior detection unit 112 determines whether viewers are linking shoulders, doing a wave, or forming a huddle. The behavior detection unit 112 outputs a parameter indicating the determination result to the scoring unit 111. When there are a plurality of viewers, data indicating the ratio of viewers performing the motion (applause, linking shoulders, and so on) to the total number of viewers may be used as the parameter. The method for detecting the viewer's behavior is not limited to this; any method that suitably detects it may be used.
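The "degree of coincidence" matching above might be sketched as follows, assuming (purely for illustration) that a motion sequence has already been reduced to per-frame labels; the threshold value and encoding are hypothetical:

```python
def matches_pattern(observed, reference, threshold=0.8):
    """Return True when the observed motion sequence coincides with the
    stored reference pattern in at least `threshold` of the reference frames."""
    if not reference:
        return False
    n = min(len(observed), len(reference))
    hits = sum(1 for i in range(n) if observed[i] == reference[i])
    return hits / len(reference) >= threshold

# Hypothetical stored pattern: 1 marks frames where the hands move together.
APPLAUSE_PATTERN = [1, 0, 1, 0, 1, 0]
```

When several viewers are present, the same check can be run per person and the fraction of matching viewers used as the parameter.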

  In addition, the behavior detection unit 112 performs skin color detection, collates the captured video with images of predetermined patterns, and detects face portions from the video based on the degree of coincidence. The behavior detection unit 112 then determines the orientation of each viewer's face from the detected face image and outputs a parameter indicating the determination result (for example, facing downward) to the scoring unit 111.

The behavior detection unit 112 also compares the detected face image with predetermined patterns to detect a facial expression. Specifically, for example, data indicating a smiling face is stored in the storage unit 12 in advance; this data is compared with the detected face data, and when the degree of coincidence is at or above a predetermined level, it is determined that the viewer is smiling. The behavior detection unit 112 outputs a parameter indicating the determination result to the scoring unit 111. When there are a plurality of viewers, data indicating the ratio of viewers showing the expression (smiles and so on) to the total number of viewers may be used as the parameter. The method for detecting the orientation and expression of a viewer's face is likewise not limited to this; any method that suitably detects them may be used.
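The ratio parameter for multiple viewers might look like the following sketch, where per-viewer similarity scores against a stored smile template (a hypothetical representation) have already been computed:

```python
def expression_parameter(match_scores, threshold=0.7):
    """Given per-viewer similarity scores against a stored 'smile' template,
    return the fraction of viewers judged to be smiling."""
    smiling = [score >= threshold for score in match_scores]
    return sum(smiling) / len(smiling)
```

A value near 1.0 would indicate that nearly everyone is smiling; the scoring unit can then treat this fraction as the behavior parameter.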
In this embodiment, the behavior detection unit 112 periodically (for example, every 10 seconds) outputs a parameter indicating the detection result to the scoring unit 111.

  The scoring unit 111 analyzes the detected behavior based on the parameters supplied from the behavior detection unit 112 and corrects the scoring result based on the analysis result. Specifically, the scoring unit 111 refers to the table stored in the correspondence storage area 122 to specify the correction mode corresponding to the supplied parameter, and corrects the scoring result in the specified mode. In the example shown in FIG. 2, when a parameter indicating that the viewers are smiling is received, the scoring unit 111 adds "20" to the scoring result.
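The table lookup and correction might be sketched as follows; the table contents here only mirror the kind of entries FIG. 2 describes and are otherwise hypothetical, as is the clamping to a 0-100 range:

```python
CORRECTIONS = {  # hypothetical contents in the style of FIG. 2
    "applause": +15,
    "smile": +20,
    "looking_down": -10,
}

def apply_correction(score, behavior, table=CORRECTIONS):
    """Look up the correction mode for a detected behavior and apply it,
    keeping the corrected score within a 0-100 range."""
    corrected = score + table.get(behavior, 0)  # unknown behaviors: no change
    return min(100, max(0, corrected))
```

For example, a raw score of 70 with a "smile" parameter would be corrected to 90.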

  Because the scoring unit 111 corrects the scoring result according to viewer behavior, when, for example, many viewers are looking down during the singing, it is determined that they find the performance boring. Conversely, when many viewers are smiling during the singing, it is determined that the place has a pleasant atmosphere; when their expressions are recognized as laughing, it is determined that the place is lively, and points are added.

<B: Operation>
Next, the operation of this embodiment is described. First, the singer operates the operation unit 14 of the karaoke apparatus 10 to select the song to sing. The operation unit 14 outputs an operation signal corresponding to the operated content to the control unit 11, and the control unit 11 selects the song according to that signal and starts its karaoke accompaniment. That is, under the control of the control unit 11, the audio processing unit 16 reads accompaniment data from the music data storage area 121, converts it into an analog signal, and supplies it to the speaker 17, which emits the accompaniment sound. The singer sings along with the accompaniment sound emitted from the speaker 17. The singer's voice is picked up by the microphone 15, converted into a voice signal, and output to the audio processing unit 16, which converts it into digital data (hereinafter referred to as "singer voice data").

  The control unit 11 performs the processing of the scoring unit 111 described above; that is, it compares the singer voice data with the guide melody data and scores the singer voice based on the comparison result. The control unit 11 also performs the processing of the behavior detection unit 112 described above; that is, it detects the viewer's behavior based on the video signal from the photographing unit 18 and corrects the scoring result in the correction mode corresponding to the detected behavior.

  Next, the control unit 11 outputs data indicating the corrected scoring result to the display unit 13. The display unit 13 displays the scoring result based on the data supplied from the control unit 11. The singer can grasp the scoring result for his / her song by checking the screen displayed on the display unit 13.

Thus, in this embodiment, the control unit 11 detects viewer behavior and scores with the detection result taken into account, so viewer behavior is reflected in the scoring. By incorporating the viewers' reactions into the evaluation criteria, more human points of evaluation can be taken into account: viewer behavior often expresses how viewers feel while watching, which makes the scoring closer to human evaluation than with conventional apparatuses. For example, when the viewers are applauding, the place is usually lively, and the scoring result is raised; when many viewers are looking down, the place is usually not lively, and the scoring result is lowered.
As described above, this embodiment analyzes the enjoyment of the occasion from the gestures and sounds with which viewers react, and incorporates it into the scoring result. This makes it possible to reflect elements (human reactions) that conventional parameter-based evaluation algorithms could not capture, and adds realism to the scoring.

<C: Modification>
Although an embodiment of the present invention has been described above, the present invention is not limited to it and can be implemented in various other forms; examples are given below. The following aspects may also be combined as appropriate.
(1) In the above embodiment, the singer's voice picked up by the microphone 15 is scored, but the input voice may instead be audio data stored in advance in the storage unit 12. Alternatively, the karaoke apparatus 10 may be provided with a communication unit, and audio data received via that unit may be scored. In short, any audio data input to the control unit 11 may be scored.

(2) In the above embodiment, a karaoke apparatus is used as the voice evaluation apparatus according to the present invention, but the apparatus is not limited to a karaoke apparatus; various devices such as a personal computer, a mobile communication terminal, a portable game machine, or a portable music player are also applicable as the voice evaluation apparatus according to the present invention.

(3) In the above embodiment, viewer behavior detection and voice scoring are performed in real time, but the present invention is not limited to this; scoring may instead be performed when the singing has finished. The timing is arbitrary.

(4) In the above embodiment, the scoring result is notified to the singer by displaying it on the display unit 13, but the notification mode is not limited to this. For example, a voice message indicating the scoring result may be output; information indicating the result may be sent to the singer's mail terminal as e-mail; or the information may be written to a storage medium, in which case the singer can read it later with a computer. The scoring result may also be printed on a predetermined sheet. In short, the information indicating the scoring result may be output by any means that conveys it to the singer.
Further, the scoring result may be notified in real time during the singing or after the singing has finished; the timing of the notification is arbitrary.

(5) In the above embodiment, a surveillance camera installed in a karaoke box is used as the photographing unit, but the photographing unit is not limited to this; for example, video from a dedicated camera or from a camera built into a mobile communication terminal may be used.
In the above embodiment, the photographing unit captures moving images, but the present invention is not limited to this; still images may be captured and the viewer's behavior detected from them. In that case, for example, the viewer is photographed every predetermined unit time (for example, one minute), the viewer's face is recognized from the captured image, and it is determined whether the viewer is laughing. Expressions other than laughter may also be recognized.
In short, any photographing unit that captures the viewer may be used.

The method for detecting the viewer's behavior is also not limited to the method described in the above embodiment. For example, the viewer's voice may be collected and analyzed to detect the behavior. In this case, sound collecting means (such as a microphone) that outputs a sound signal representing the collected viewer's voice is provided, and the output sound signal is analyzed to detect the viewer's behavior. Specifically, for example, the sound signal output from the sound collecting means is compared with predetermined sound patterns, and when the degree of coincidence is within a predetermined range, the behavior corresponding to that sound pattern is identified as the viewer's behavior.
As the sound collecting means, a microphone installed in the karaoke box may be used, or a dedicated microphone for collecting the viewers' voices may be provided.
As the sound patterns, the voice characteristics (pitch, spectrum, and so on) of sounds emitted in various behaviors, such as booing, jeering, or applause, may be stored in advance and collated with the pattern of the collected sound.
For example, when booing or jeering is detected as the viewer's behavior, the control unit 11 may subtract points, and when applause is detected, it may add points.
In this case, for example, if the environmental sound recorded through the microphone is quiet, it may be judged that the audience is listening attentively to the song and the score raised; if there is loud applause during an interlude, it may be judged that the atmosphere is good and the score raised. The atmosphere of the place, as it appears in viewer behavior, can thus be incorporated into the scoring.
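Matching a collected sound against stored patterns might be sketched as follows; the three-bin "spectra" and the distance threshold are stand-ins for whatever features and matching range an implementation would actually use:

```python
import math

SOUND_PATTERNS = {  # hypothetical stored spectral patterns (normalised)
    "applause": [0.1, 0.3, 0.6],
    "booing":   [0.7, 0.2, 0.1],
}

def classify_sound(spectrum, patterns=SOUND_PATTERNS, max_distance=0.3):
    """Match a collected spectrum against the stored patterns and return the
    behavior whose pattern is closest, or None if nothing is close enough."""
    best, best_d = None, max_distance
    for name, reference in patterns.items():
        d = math.dist(spectrum, reference)  # Euclidean distance
        if d < best_d:
            best, best_d = name, d
    return best
```

A spectrum that matches no stored pattern within the range yields None, which an implementation could treat as "quiet / no reaction".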
Note that the viewer's behavior may be detected by performing both the analysis of the viewer's voice and the analysis of the viewer's video.

  As another configuration for detecting viewer behavior, operation means operated by the viewers may be provided, and the behavior detected according to the operated content. Specifically, for example, operation means provided with various buttons for evaluating the singer's skill outputs, when operated by a viewer, an operation signal corresponding to the operated content to the control unit 11, and the control unit 11 detects the viewer's behavior from that signal.

As described above, the viewer's behavior may be detected by analyzing captured video of the viewer, by collecting and analyzing the viewer's voice, or according to the content of an operation. Further, for example, a vibration sensor or the like may be attached to the viewer, and the behavior detected from the sensor's output signal; in that case, the greater the amount of vibration, the livelier the place may be judged to be.
In short, any device that can detect the behavior of the viewer may be used.
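The vibration-sensor variant mentioned above might be sketched like this; counting threshold crossings as a liveliness measure is one assumed interpretation of "the greater the vibration, the livelier the place":

```python
def excitement_from_vibration(samples, threshold=0.5):
    """Count upward threshold crossings in a vibration-sensor signal.

    A larger count is read as a livelier audience; the scoring unit could
    map this count to a correction value.
    """
    return sum(
        1 for prev, cur in zip(samples, samples[1:])
        if prev < threshold <= cur
    )
```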

(6) In the above embodiment, the scoring result is corrected according to viewer behavior (reaction), but the present invention is not limited to this; scoring may be performed from viewer behavior alone. Even then, the liveliness of the place can be judged by detecting the behavior. For example, not only singing but also rakugo, manzai comedy, and the like can be scored by detecting viewer behavior; when several performers are being evaluated, relative scoring among them is possible.

(7) In the above embodiment, the singer voice data is compared with the guide melody data and the singer voice scored based on the comparison result, but the scoring method is not limited to this; any method that scores the singer's voice may be used. For example, model voice data representing a model singing voice may be compared with the singer voice data and the singer voice scored based on the comparison result. Specifically, the pitch, power, spectrum, and so on of the model voice data and the singer voice data may be compared, with a higher degree of coincidence yielding a higher score. Singing techniques (vibrato, kobushi, and so on) may also be compared between the singer voice and the model voice data, and the singer voice scored by their degree of coincidence.
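A weighted comparison of the features named above might be sketched as follows; the feature tuple, the normalisation to a 0-1 range, and the weights are all assumptions for this sketch:

```python
def model_similarity_score(sung, model, weights=(0.6, 0.3, 0.1)):
    """Score by weighted agreement of pitch, power and a spectral feature
    between the singer and a model (exemplary) performance.

    `sung` and `model` are (pitch, power, spectral_feature) tuples whose
    components are pre-normalised to the 0-1 range.
    """
    score = 0.0
    for w, s, m in zip(weights, sung, model):
        score += w * (1.0 - abs(s - m))  # closer agreement -> larger term
    return 100.0 * score
```

An exact match across all three features yields the full score; each feature's mismatch reduces the score in proportion to its weight.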

(8) In the above embodiment, the behavior detection unit 112 periodically outputs a parameter indicating the detection result to the scoring unit 111, but the way the scoring unit 111 acquires parameters is not limited to this. For example, the behavior detection unit 112 may periodically write data indicating the detection result to a predetermined storage area of the storage unit 12, from which the scoring unit 111 periodically reads it. Alternatively, the behavior detection unit 112 may store the data with a time stamp, so that the scoring unit 111 can refer to the time stamps and acquire the parameter for an arbitrary time (for example, ten minutes earlier). Any acquisition method may be used as long as the scoring unit 111 can obtain the parameters.
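The time-stamped variant might be sketched as follows; the class name and in-memory lists are illustrative stand-ins for the "predetermined storage area" of the storage unit 12:

```python
import bisect

class ParameterLog:
    """Store behavior parameters with a time stamp and look up the most
    recent value at (or before) any requested time."""

    def __init__(self):
        self._times, self._values = [], []

    def record(self, t, value):
        # Assumes records arrive in time order, as with periodic detection.
        self._times.append(t)
        self._values.append(value)

    def at(self, t):
        i = bisect.bisect_right(self._times, t) - 1
        return self._values[i] if i >= 0 else None
```

The scoring unit could then ask for the parameter at any earlier moment of the song rather than only the latest one.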

(9) A configuration is also possible in which correspondence storage means stores correspondences between combinations of a music type and a viewer behavior and correction mode data indicating how the scoring is corrected; the control unit 11 determines the type of the music, specifies, with reference to the correspondence storage means, the scoring correction corresponding to the combination of the determined music type and the detected viewer behavior, and corrects the scoring result in the specified manner.
FIG. 4 shows an example of the contents of the table used in this case. As shown in the figure, the table stores the items "music type", "viewer behavior", and "correction mode" in association with one another. The "music type" item stores music type information representing the type of music, such as "ballad", "rap", or "arbitrary"; an entry whose music type is "arbitrary" is referred to for all music types. The "viewer behavior" item stores behavior type information such as "smile" or "applause". The "correction mode" item stores correction mode data such as "−10", "+20", or "+15".
In this aspect, the control unit 11 of the karaoke apparatus 10 first determines the type of music. The determination may be made, for example, from a song code included in the music data; genre data indicating the genre may be included in the music data and referred to; or the control unit 11 may analyze the rhythm, tempo, melody, and so on of the music and determine the type from the analysis result. Having determined the type, the control unit 11 searches the table using the combination of the determined type and the detected viewer behavior as a key, and corrects the scoring result in the retrieved correction mode.
According to this aspect, even when the viewer's behavior is the same, different scoring corrections can be applied depending on the type of music (for example, ballad or rap).
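As an illustration, the table lookup described above can be sketched as follows. This is a minimal sketch in Python; the table keys, the handling of the "any" wildcard, the example correction values, and the 0-100 clamping range are assumptions chosen for illustration, not values taken from the specification.

```python
# Hypothetical correction table in the spirit of FIG. 4.
# Keys are (music type, viewer behavior); values are point corrections.
CORRECTION_TABLE = {
    ("ballad", "smile"): +15,
    ("rap", "applause"): +20,
    ("any", "smile"): +10,   # "any" applies to all music types
    ("any", "yawn"): -10,
}

def lookup_correction(music_type, behavior):
    """Search the table with (music type, behavior) as a key; fall back
    to an entry whose music type is "any" when no exact match exists."""
    if (music_type, behavior) in CORRECTION_TABLE:
        return CORRECTION_TABLE[(music_type, behavior)]
    return CORRECTION_TABLE.get(("any", behavior), 0)

def corrected_score(base_score, music_type, behavior):
    # Clamp the corrected result to an assumed 0-100 scoring range.
    return max(0, min(100, base_score + lookup_correction(music_type, behavior)))
```

With this sketch, the same behavior ("smile") yields +15 for a ballad but only the generic +10 for a rap song, matching the idea that the correction depends on the music type.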

(10) In the above-described embodiment, a single scoring is performed for one piece of music; instead, the scoring method may be changed according to the part of the music. For example, the scoring process and the scoring correction process may be performed only on the chorus of the music. Alternatively, referring to the table shown in FIG. 4 described above, the first half of the music may be corrected in the correction mode corresponding to "ballad", while the second half is corrected in the mode corresponding to "rap".
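The per-section correction could be sketched as follows. All names here are hypothetical: the sketch assumes that phrase-level base scores with their start times are already available, and that the split point between the "ballad" and "rap" modes is simply the midpoint of the song.

```python
def correction_mode_for_position(elapsed_sec, total_sec):
    """Choose the correction mode by position in the song: the first
    half is treated as "ballad", the second half as "rap"."""
    return "ballad" if elapsed_sec < total_sec / 2 else "rap"

def score_sections(sections, total_sec, mode_table):
    """sections: list of (start_sec, base_score) per scored phrase.
    mode_table maps a correction mode name to an additive correction.
    Returns the average of the per-section corrected scores."""
    corrected = []
    for start_sec, base in sections:
        mode = correction_mode_for_position(start_sec, total_sec)
        corrected.append(base + mode_table[mode])
    return sum(corrected) / len(corrected)
```

Restricting the loop to phrases inside the chorus would implement the first variant (scoring only the chorus) under the same assumptions.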

  Further, not only video but also microphone input or the like may be used to analyze shouts, cheers, applause, and so on, and weight them against the music sequence. Specifically, for example, the excitement of the room may be evaluated from the loudness of the applause at the moment the song enters the interlude just after the first chorus. In such an evaluation, rather than comparing within a single piece, changes relative to the immediately preceding piece, or to the same karaoke box and user, may be considered. In other words, introducing such relative results makes it easier to capture the "atmosphere of the room".
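A minimal sketch of this applause-loudness measurement is given below, assuming raw microphone samples and a known time at which the chorus ends. The RMS window length and the ratio-based comparison with the preceding song are illustrative choices, not part of the specification.

```python
import math

def rms(samples):
    """Root-mean-square level of an audio frame."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))

def excitement(mic_samples, sample_rate, chorus_end_sec, window_sec=2.0):
    """Measure applause loudness in the interlude window that starts
    right after the first chorus ends."""
    start = int(chorus_end_sec * sample_rate)
    end = start + int(window_sec * sample_rate)
    return rms(mic_samples[start:end])

def relative_excitement(current, previous):
    """Compare with the immediately preceding song in the same room, so
    that a loud room does not automatically outscore a quiet one."""
    return current / previous if previous > 0 else 1.0
```

The relative form is what makes the "atmosphere of the room" readable: the same absolute applause level counts for more in a room that was quiet during the previous song.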

(11) In the above-described embodiment, the case where a singer's song is scored is described as an example. However, the present invention is not limited thereto, and the performance of a person playing a musical instrument may be scored. As described above, "voice" in the present invention includes various sounds, such as a voice produced by a person and the performance sound of a musical instrument.

(12) In the above-described embodiment, the control unit 11 corrects the scoring result in the correction mode shown in FIG. 2, but the scoring correction mode is not limited to this. For example, if the viewer is laughing, the score may be scaled to "120%"; if the viewer looks depressed, the score may be scaled to "80%"; and so on. In short, any mode may be used as long as the viewer's behavior is reflected in the scoring result.
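The multiplicative variant could look like the following sketch. The behavior names and the clamp at 100 points are assumptions; the 120% and 80% factors are the ones given as examples in the text above.

```python
# Hypothetical percentage factors per detected viewer behavior.
BEHAVIOR_FACTORS = {"laughing": 1.20, "depressed": 0.80}

def apply_multiplicative_correction(base_score, behavior):
    """Scale the scoring result instead of adding or subtracting points;
    unknown behaviors leave the score unchanged (factor 1.0)."""
    factor = BEHAVIOR_FACTORS.get(behavior, 1.0)
    return min(100.0, base_score * factor)
```

Unlike the additive table of FIG. 4, a multiplicative correction rewards an already-high base score more in absolute terms, which is one reason a clamp at the maximum score is assumed here.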

(13) In the above-described embodiment, the correction mode is specified with reference to the table stored in the correspondence storage area 122, but the method of specifying the correction mode is not limited to this. For example, the viewer's behavior may be factored into the scoring result using a predetermined algorithm; any method may be used as long as the viewer's behavior is reflected in the scoring result.
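One such predetermined algorithm might be a weighted sum over the behaviors detected during the song, sketched below. The weight values and behavior counts are hypothetical; only the idea of replacing the table with a formula comes from the text.

```python
def algorithmic_correction(base_score, behavior_counts, weights):
    """Fold detected behaviors into the score with a weighted sum
    rather than a lookup table. behavior_counts maps a behavior name
    to how many times it was detected during the song; weights maps a
    behavior name to a per-occurrence point delta."""
    delta = sum(weights.get(b, 0.0) * n for b, n in behavior_counts.items())
    # Clamp to an assumed 0-100 scoring range.
    return max(0.0, min(100.0, base_score + delta))
```

A formula like this scales naturally with repeated behaviors (three smiles count more than one), which a single table lookup cannot express.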

(14) In the embodiment described above, the karaoke apparatus 10 realizes all the functions of the embodiment. Alternatively, two or more devices connected via a network may share these functions, and a system comprising the plural devices may realize the karaoke apparatus 10 of the embodiment. For example, a terminal device including a microphone, a speaker, and a camera and a dedicated computer device having the scoring function may be connected via a network to form such a system.

(15) The program executed by the control unit 11 of the karaoke apparatus 10 described above can be provided recorded on a recording medium such as a magnetic tape, a magnetic disk, a flexible disk, an optical recording medium, a magneto-optical recording medium, a RAM, or a ROM. It can also be downloaded to the karaoke apparatus 10 via a network such as the Internet.

FIG. 1 is a block diagram showing an example of the configuration of a karaoke apparatus. FIG. 2 is a diagram showing an example of the contents of a table stored in the correspondence storage area. FIG. 3 is a block diagram showing an example of the functional configuration of a karaoke apparatus. FIG. 4 is a diagram showing an example of the contents of a table stored in the correspondence storage area.

Explanation of symbols

DESCRIPTION OF SYMBOLS 10: karaoke apparatus, 11: control unit, 12: storage unit, 13: display unit, 14: operation unit, 15: microphone, 16: audio processing unit, 17: speaker, 18: photographing unit, 111: scoring unit, 112: behavior detection unit, 121: music data storage area, 122: correspondence storage area.

Claims (6)

  1. A voice evaluation apparatus comprising:
    reference data storage means for storing reference data to be referred to when scoring a voice;
    scoring means for scoring the voice based on a comparison between a voice signal representing the voice and the reference data stored in the reference data storage means;
    viewer behavior detection means for detecting the behavior of a viewer watching the person producing the voice;
    correction means for analyzing the behavior detected by the viewer behavior detection means and correcting the scoring result of the scoring means based on the analysis result; and
    output means for outputting data indicating the scoring result corrected by the correction means.
  2. The voice evaluation apparatus according to claim 1, wherein the viewer behavior detection means detects the viewer's behavior by analyzing an image captured by photographing means for photographing the viewer.
  3. The voice evaluation apparatus according to claim 1, wherein the viewer behavior detection means detects the viewer's behavior by analyzing an audio signal output from sound collection means for collecting the viewer's voice.
  4. The voice evaluation apparatus according to claim 1, wherein the viewer behavior detection means detects the viewer's behavior based on an operation signal output from operation means that outputs an operation signal corresponding to the operated content.
  5. The voice evaluation apparatus according to any one of claims 1 to 4, further comprising:
    correspondence storage means for storing correspondences between combinations of a music type and a viewer behavior and correction mode data indicating a scoring correction mode; and
    music type determination means for determining the type of the music of the voice,
    wherein the correction means refers to the correspondence storage means to specify the scoring correction mode corresponding to the combination of the music type determined by the music type determination means and the behavior detected by the viewer behavior detection means, and corrects the scoring result of the scoring means in the specified correction mode.
  6. A karaoke apparatus comprising:
    the voice evaluation apparatus according to any one of claims 1 to 5;
    music data storage means for storing music data representing music; and
    reproduction means for reproducing the music data stored in the music data storage means.
JP2007003395A 2007-01-11 2007-01-11 Voice evaluation device and karaoke device Expired - Fee Related JP4655047B2 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
JP2007003395A JP4655047B2 (en) 2007-01-11 2007-01-11 Voice evaluation device and karaoke device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
JP2007003395A JP4655047B2 (en) 2007-01-11 2007-01-11 Voice evaluation device and karaoke device

Publications (2)

Publication Number Publication Date
JP2008170685A true JP2008170685A (en) 2008-07-24
JP4655047B2 JP4655047B2 (en) 2011-03-23

Family

ID=39698834

Family Applications (1)

Application Number Title Priority Date Filing Date
JP2007003395A Expired - Fee Related JP4655047B2 (en) 2007-01-11 2007-01-11 Voice evaluation device and karaoke device

Country Status (1)

Country Link
JP (1) JP4655047B2 (en)

Cited By (15)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2011203357A (en) * 2010-03-24 2011-10-13 Xing Inc Karaoke system, karaoke apparatus and computer program
JP2012068373A (en) * 2010-09-22 2012-04-05 Xing Inc Music upsurge determination device and music upsurge determination program
JP2012165407A (en) * 2007-12-28 2012-08-30 Casio Comput Co Ltd Imaging apparatus and program
JP2012198305A (en) * 2011-03-18 2012-10-18 Yamaha Corp Display controller
JP2012203080A (en) * 2011-03-24 2012-10-22 Xing Inc Karaoke machine and karaoke program
JP2012208156A (en) * 2011-03-29 2012-10-25 Xing Inc Karaoke system and karaoke program
EP2993615A1 (en) * 2014-09-05 2016-03-09 Omron Corporation Scoring device and scoring method
JP2016102962A (en) * 2014-11-28 2016-06-02 株式会社第一興商 Karaoke rating system considering listener evaluation
JP2016102961A (en) * 2014-11-28 2016-06-02 株式会社第一興商 Karaoke rating system considering listener evaluation
WO2016092933A1 (en) * 2014-12-08 2016-06-16 ソニー株式会社 Information processing device, information processing method, and program
JP2016188978A (en) * 2015-03-30 2016-11-04 ブラザー工業株式会社 Karaoke device and program
JP2016191794A (en) * 2015-03-31 2016-11-10 ブラザー工業株式会社 Karaoke device and program
JP2017027070A (en) * 2016-09-16 2017-02-02 ヤマハ株式会社 Evaluation device and program
JP2017049542A (en) * 2015-09-04 2017-03-09 ブラザー工業株式会社 Operation evaluation device and program
JP2017058526A (en) * 2015-09-16 2017-03-23 株式会社エクシング Karaoke device and program for karaoke

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2002041067A (en) * 2000-07-25 2002-02-08 Namco Ltd Karaoke system and recording medium
JP2006227247A (en) * 2005-02-17 2006-08-31 Casio Comput Co Ltd Karaoke machine and singing evaluation process program for karaoke playing


Cited By (18)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2012165407A (en) * 2007-12-28 2012-08-30 Casio Comput Co Ltd Imaging apparatus and program
JP2011203357A (en) * 2010-03-24 2011-10-13 Xing Inc Karaoke system, karaoke apparatus and computer program
JP2012068373A (en) * 2010-09-22 2012-04-05 Xing Inc Music upsurge determination device and music upsurge determination program
JP2012198305A (en) * 2011-03-18 2012-10-18 Yamaha Corp Display controller
JP2012203080A (en) * 2011-03-24 2012-10-22 Xing Inc Karaoke machine and karaoke program
JP2012208156A (en) * 2011-03-29 2012-10-25 Xing Inc Karaoke system and karaoke program
EP2993615A1 (en) * 2014-09-05 2016-03-09 Omron Corporation Scoring device and scoring method
CN105405436A (en) * 2014-09-05 2016-03-16 欧姆龙株式会社 Scoring device and scoring method
JP2016057337A (en) * 2014-09-05 2016-04-21 オムロン株式会社 Point rating device and point rating method
US9892652B2 (en) 2014-09-05 2018-02-13 Omron Corporation Scoring device and scoring method
JP2016102961A (en) * 2014-11-28 2016-06-02 株式会社第一興商 Karaoke rating system considering listener evaluation
JP2016102962A (en) * 2014-11-28 2016-06-02 株式会社第一興商 Karaoke rating system considering listener evaluation
WO2016092933A1 (en) * 2014-12-08 2016-06-16 ソニー株式会社 Information processing device, information processing method, and program
JP2016188978A (en) * 2015-03-30 2016-11-04 ブラザー工業株式会社 Karaoke device and program
JP2016191794A (en) * 2015-03-31 2016-11-10 ブラザー工業株式会社 Karaoke device and program
JP2017049542A (en) * 2015-09-04 2017-03-09 ブラザー工業株式会社 Operation evaluation device and program
JP2017058526A (en) * 2015-09-16 2017-03-23 株式会社エクシング Karaoke device and program for karaoke
JP2017027070A (en) * 2016-09-16 2017-02-02 ヤマハ株式会社 Evaluation device and program

Also Published As

Publication number Publication date
JP4655047B2 (en) 2011-03-23

Similar Documents

Publication Publication Date Title
Castellano et al. Automated analysis of body movement in emotionally expressive piano performances
US6856923B2 (en) Method for analyzing music using sounds instruments
US8407055B2 (en) Information processing apparatus and method for recognizing a user's emotion
JP4438144B2 (en) Signal classification method and apparatus, descriptor generation method and apparatus, signal search method and apparatus
JP4124247B2 (en) Music practice support device, control method and program
US20110319160A1 (en) Systems and Methods for Creating and Delivering Skill-Enhancing Computer Applications
JP5147389B2 (en) Music presenting apparatus, music presenting program, music presenting system, music presenting method
KR100267663B1 (en) Karaoke apparatus responsive to oral request of entry songs
US7323631B2 (en) Instrument performance learning apparatus using pitch and amplitude graph display
US5005459A (en) Musical tone visualizing apparatus which displays an image of an animated object in accordance with a musical performance
EP1703488A2 (en) Music search system and music search apparatus
US8138409B2 (en) Interactive music training and entertainment system
Wanderley Non-obvious performer gestures in instrumental music
US20090038468A1 (en) Interactive Music Training and Entertainment System and Multimedia Role Playing Game Platform
JP2004086067A (en) Speech generator and speech generation program
JP2010518459A (en) Web portal for editing distributed audio files
US9159338B2 (en) Systems and methods of rendering a textual animation
RU2488179C2 (en) Feedback related to gestures in electronic entertainment system
JP2004326840A (en) Music data selection device, music data selection method, music data selection program, and information recording medium recorded with the program
WO2004038694A1 (en) Musical composition reproduction method and device, and method for detecting a representative motif section in musical composition data
TWI433027B (en) An adaptive user interface
CN1162167A (en) Formant conversion device for correcting singing sound for imitating standard sound
JP2002510403A (en) A method and apparatus for real-time correlation of the performance of the music score
JP4206332B2 (en) Input device, game system, program, and information storage medium
US8538566B1 (en) Automatic selection of representative media clips

Legal Events

Date Code Title Description
A621 Written request for application examination

Free format text: JAPANESE INTERMEDIATE CODE: A621

Effective date: 20091117

A977 Report on retrieval

Free format text: JAPANESE INTERMEDIATE CODE: A971007

Effective date: 20100519

A131 Notification of reasons for refusal

Free format text: JAPANESE INTERMEDIATE CODE: A131

Effective date: 20100525

A521 Written amendment

Free format text: JAPANESE INTERMEDIATE CODE: A523

Effective date: 20100726

A131 Notification of reasons for refusal

Free format text: JAPANESE INTERMEDIATE CODE: A131

Effective date: 20100817

A521 Written amendment

Free format text: JAPANESE INTERMEDIATE CODE: A523

Effective date: 20101013

TRDD Decision of grant or rejection written
A01 Written decision to grant a patent or to grant a registration (utility model)

Free format text: JAPANESE INTERMEDIATE CODE: A01

Effective date: 20101124


A61 First payment of annual fees (during grant procedure)

Free format text: JAPANESE INTERMEDIATE CODE: A61

Effective date: 20101207

FPAY Renewal fee payment (event date is renewal date of database)

Free format text: PAYMENT UNTIL: 20140107

Year of fee payment: 3

R150 Certificate of patent or registration of utility model

Free format text: JAPANESE INTERMEDIATE CODE: R150

LAPS Cancellation because of no payment of annual fees