CN1143262C - Method of generating images synchronized with sound and karaoke apparatus

Info

Publication number: CN1143262C
Application number: CNB981202756A
Authority: CN
Inventors: 浅井三平; 人; 稻叶尚人
Original assignee: Pioneer Corp
Current assignee: Pioneer Corp
Priority date: 1997-09-19
Filing date: 1998-09-19
Publication date: 2004-03-24
Anticipated expiration: 2018-09-19
Also published as: JPH1195778A; JP3895014B2; CN1212416A; TW372311B; HK1018838A1

Abstract

The present invention relates to a synchronized image generation method for displaying video that moves in synchronization with the reproduction of a musical piece, and to a karaoke apparatus using the method. In the prior art, video synchronized with reproduction must be attached to each musical piece, so a large amount of image data is required. The present invention attaches video whose action is synchronized with the reproduction of a musical piece using only a small amount of data. To that end, the image of an object such as a human body is divided into several elements. Shape data defining the shape of each element and action data defining the position or movement of each element are prepared, and both are used to generate the image representing the object.

Description

Synchronized image generation method and karaoke apparatus using the method

Technical field

The present invention relates to a synchronized image generation method that generates video whose action is synchronized with the reproduction of sound, and to a karaoke apparatus using the method. More specifically, it relates to a synchronized image generation method that generates video representing an object (a person, an animal, or a likeness thereof) from data obtained by measuring the shape and movement of the object, and displays that video in synchronization with the reproduced sound, and to a karaoke apparatus using the method.

Background technology

Karaoke apparatuses are known that, when reproducing a musical piece (the accompaniment part) such as a folk song or popular song, display the lyrics of the piece and a background image corresponding to it. Such an apparatus has a storage device that uses a medium such as a VCD (Video CD) or LD (Laser Disc). The storage device holds sound data for reproducing the music, lyrics data representing the lyrics, and image data representing the background image. The image data are obtained by, for example, photographing scenery. The apparatus reads the sound data from the storage device to reproduce the piece and, at the same time, reads the image data from the storage device to display the background image.

However, the storage device in such a karaoke apparatus must use a large-capacity medium such as a VCD or LD. As the number of pieces used for karaoke grows, the volume of image data in particular grows rapidly, so that even these media cannot hold the sound data, lyrics data, and image data for every piece. Moreover, most karaoke pieces are popular songs, so new pieces must be added continually. For these reasons, so-called communication karaoke apparatuses, which have a communication function and receive the sound data and lyrics data for reproduction over a telephone line, have spread rapidly in recent years.

Because a communication karaoke apparatus receives the sound data and lyrics data over a telephone line, the number of pieces it can reproduce is no longer limited by the capacity of its storage device, and a very large number of pieces can be reproduced. Even the newest piece can be reproduced by receiving its sound data and lyrics data over the telephone line.

The image data for the background image, however, are obtained by photographing scenery and the like, so their volume is far larger than that of the sound data. Transmitting the image data over a telephone line in the same way as the sound data is therefore impractical in terms of both time and cost. Consequently, even a communication karaoke apparatus displays background images in the same way as a conventional apparatus, that is, from image data stored in advance on a medium such as a VCD or LD. Because the variety of background images is limited by the capacity of the medium, it is quite difficult to provide a different background image for each reproduced piece. Such an apparatus therefore displays background images that have no direct relation to the selected piece.

In addition, particularly for rhythmic pieces, it is desirable for the karaoke apparatus to display the movements of a singer performing the piece, such as swaying or moving to the beat.

One way to achieve this is to film the singer while singing to obtain video data (for example, a video signal), and to reproduce this video data together with the piece.

However, the singer's movements differ for every piece, so video data representing the singer's movements must be prepared for each piece. The volume of video data then grows rapidly and, even with data compression, far exceeds the capacity of the storage device in such a karaoke apparatus. It is therefore difficult to prepare video data of the singer's movements for every piece.

Furthermore, to reproduce the singer's movements as described, the video data representing them must be kept accurately synchronized with the reproduced piece. A karaoke apparatus that displays video changing in synchronization with a piece is known, for example from Japanese Laid-Open Patent Publication No. Hei 7-199976. That apparatus holds video data consisting of image data and time data, and synchronizes the display of video with reproduction of the piece according to the time data. However, if a set of video data consisting of image data and time data is prepared for every piece, the total volume of data increases rapidly, so it remains difficult to prepare such video data for every piece.

Meanwhile, as the number of karaoke pieces has grown rapidly, a large amount of sound data for all the pieces released so far is already stored in the karaoke apparatuses now installed and in the center host computers that send sound data to communication karaoke apparatuses. To realize a karaoke apparatus that displays singer video moving in time with reproduction, video data showing the singer's movements must be added for this large number of stored pieces. If adding the video data required re-editing the large volume of sound data stored so far, this would be impractical in terms of both time and cost.

Karaoke also requires reproduction at various tempos (speeds). A karaoke apparatus therefore usually has a function that makes the tempo of the reproduced piece slower or faster than the reference tempo to suit the user's preference. An apparatus that displays singer video moving in time with reproduction must thus keep the singer's movements natural and smooth even when the tempo of the piece is changed.

Karaoke users may also want to start reproduction partway through a piece for practice, or to go back over part of a piece and reproduce it again. A karaoke apparatus therefore usually needs a function for reproducing a piece from the middle. An apparatus that displays singer video moving in time with reproduction must thus display the singer's movements corresponding to the reproduced portion when reproduction starts partway through, and must keep those movements accurately synchronized with the piece.

Summary of the invention

The present invention was made in view of the above problems. One object of the invention is to provide a synchronized image generation method that, even when there are many kinds of sound, can display video that moves in synchronization with sound reproduction and whose action differs from sound to sound, and a karaoke apparatus using the method.

Another object of the invention is to provide a synchronized image generation method that displays video moving in synchronization with sound reproduced from existing sound data, without re-editing that sound data, and a karaoke apparatus using the method.

A further object of the invention is to provide a synchronized image generation method that, even when the reproduction speed of the sound is changed, keeps the displayed video synchronized with the reproduced sound and keeps its action smooth, and a karaoke apparatus using the method.

Yet another object of the invention is to provide a synchronized image generation method that, even when reproduction of a sound such as a musical piece starts partway through, displays video whose action is accurately synchronized with the sound, and a karaoke apparatus using the method.

To solve the above problems, the synchronized image generation method according to a first aspect of the invention comprises: a sound reproduction step of reproducing sound from sound data; a synchronizing signal output step of outputting a synchronizing signal corresponding to the reproduction speed of the sound reproduced in the sound reproduction step; a video generation step of generating video of an object (a person, an animal, or a likeness thereof) by dividing the image of the object into several constituent elements and using shape data that define the shapes of those elements and action data that define the position or movement of each element; and a display step of displaying the video generated in the video generation step on a display device in synchronization with the synchronizing signal output in the synchronizing signal output step.

The sound reproduction step reproduces sound, such as a musical piece, from the sound data. The synchronizing signal output step outputs a synchronizing signal corresponding to the reproduction speed of that sound. The video generation step uses the shape data and action data to generate video showing the object. The display step displays the generated video on the display device in synchronization with the synchronizing signal. Video of a person, animal, or the like whose action is synchronized with the sound can thus be displayed.

The shape data are obtained by dividing the image of the object into several constituent elements and define the shapes of those elements. The action data define the position or movement of each constituent element of the object. The volume of the shape data and action data is therefore far smaller than that of video data (for example, a video signal) obtained by filming a human body or the like directly. With the present invention, video of a human body or the like moving in synchronization with sound can thus be produced from a small amount of data.
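The division into shape data and action data can be illustrated with a minimal Python sketch. The class names, the element names, and the use of a two-dimensional offset as the action data are assumptions made for this illustration only; the text above does not specify a data layout. The point shown is that the shapes are stored once, while each frame needs only a small record per element.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ElementShape:
    """Shape data for one constituent element (hypothetical layout)."""
    name: str
    vertices: tuple  # outline points in the element's own coordinates


@dataclass(frozen=True)
class ElementPose:
    """Action data for one element at one sample time (position only, for brevity)."""
    dx: float
    dy: float


def render_frame(shapes, poses):
    """Place each element's vertices using that element's pose for this frame."""
    placed = {}
    for shape in shapes:
        pose = poses[shape.name]
        placed[shape.name] = [(x + pose.dx, y + pose.dy) for x, y in shape.vertices]
    return placed


# Shape data: defined once and reused for every frame.
SHAPES = [
    ElementShape("head", ((0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0))),
    ElementShape("torso", ((0.0, 0.0), (1.0, 0.0), (1.0, 2.0), (0.0, 2.0))),
]

# Action data: one small record per element per sample.
ACTION_DATA = [
    {"head": ElementPose(0.0, 2.0), "torso": ElementPose(0.0, 0.0)},
    {"head": ElementPose(0.5, 2.0), "torso": ElementPose(0.25, 0.0)},
]

FRAMES = [render_frame(SHAPES, poses) for poses in ACTION_DATA]
```

In this sketch, adding a new piece would mean adding only another list like `ACTION_DATA`; the entries in `SHAPES` stay unchanged.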

In the synchronized image generation method according to a second aspect of the invention, the action data are generated by reproducing the sound from the sound data at a reference speed, having the object move in time with the reproduced sound, and measuring the position or movement of each constituent element of the object at a predetermined period. In the video generation step, images representing the object as it changes over time are generated successively from the action data and shape data and output at the predetermined period, so that video of the object moving in time with the sound is reproduced.

Also according to the second aspect, when the reproduction speed of the sound in the sound reproduction step equals the reference speed, the video generation step generates the video of the object from the action data and shape data. As described above, the action data consist of the position or movement of each constituent element, measured at the predetermined period, of the object moving in time with the sound reproduced at the reference speed. Successively generating images of the object from the action data and shape data and outputting them at the predetermined period therefore automatically reproduces video of the object moving in time with the sound reproduced at the reference speed.

Also according to the second aspect, when the reproduction speed of the sound in the sound reproduction step is slower than the reference speed, the video generation step interpolates the action data to generate interpolated action data, and generates the video of the object from the action data, the interpolated action data, and the shape data. That is, images representing the object as it changes over time are generated successively from these data and output at the predetermined period. Specifically, images generated from the interpolated action data and shape data are inserted among the images generated from the action data and shape data to produce the video of the object. Simply inserting these interpolated images slows the object's action, so the action of the object can easily be synchronized with sound reproduced more slowly than the reference speed.
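The insertion of interpolated action data can be sketched as follows. This is only an illustration under stated assumptions: the text does not say which interpolation is used, so linear interpolation is assumed here, each sample is reduced to a tuple of coordinates, and the function names are hypothetical.

```python
def lerp(a, b, t):
    """Linearly interpolate between two samples (tuples of coordinates)."""
    return tuple(x + (y - x) * t for x, y in zip(a, b))


def insert_interpolated(frames, extra_per_gap):
    """Insert `extra_per_gap` interpolated samples between each pair of
    measured samples. Output at the same fixed period, the longer sequence
    makes the motion take proportionally more time."""
    if not frames:
        return []
    out = []
    for cur, nxt in zip(frames, frames[1:]):
        out.append(cur)
        for i in range(1, extra_per_gap + 1):
            out.append(lerp(cur, nxt, i / (extra_per_gap + 1)))
    out.append(frames[-1])
    return out


# Measured action data for one element: (x, y) at each sample time.
measured = [(0.0, 0.0), (2.0, 4.0), (4.0, 0.0)]

# One interpolated sample per gap roughly halves the apparent speed.
slowed = insert_interpolated(measured, 1)
```

With one extra sample per gap, `slowed` contains the three measured samples plus the midpoints between them, so the same motion spans about twice as many output periods.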

In the synchronized image generation method according to a third aspect of the invention, when the reproduction speed of the sound in the sound reproduction step is faster than the reference speed, the video generation step thins out the action data and generates the video of the object from the thinned action data and the shape data. Thinning the action data speeds up the object's action, so the action can easily be synchronized with sound reproduced faster than the reference speed.
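Thinning can be sketched in the same spirit. The selection rule below (keep the sample whose index is the integer part of `i * speed_ratio`) is an assumption chosen for simplicity; the text only states that samples are skipped when reproduction is faster than the reference speed. The function name and the `speed_ratio` parameter are hypothetical.

```python
def thin_action_data(frames, speed_ratio):
    """Keep roughly one sample out of every `speed_ratio` samples.

    `speed_ratio` is the reproduction speed divided by the reference speed
    and must be at least 1.0 for thinning to make sense.
    """
    if speed_ratio < 1.0:
        raise ValueError("thinning applies only at or above the reference speed")
    kept = []
    i = 0
    while True:
        idx = int(i * speed_ratio)  # index of the measured sample to show next
        if idx >= len(frames):
            break
        kept.append(frames[idx])
        i += 1
    return kept


samples = list(range(6))                    # stand-ins for six measured samples
double_speed = thin_action_data(samples, 2.0)
one_and_half = thin_action_data(samples, 1.5)
```

Because the thinned sequence is still output at the fixed period, fewer images cover the same motion and the object appears to move faster.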

In the synchronized image generation method according to a fourth aspect of the invention, the action data are generated by reproducing the sound from the sound data at the reference speed, having the object move in time with the reproduced sound, and measuring the position or movement of each constituent element of the object at a period that is an integral multiple of the display period of the display device. In the video generation step, images representing the object as it changes over time are generated successively from the action data and shape data and output at that period, so that video of the object moving in time with the sound is reproduced. Because the output period of the images is an integral multiple of the display period of the display device, every output image is shown by the display device, which makes the object's action smooth.
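The integral-multiple condition can be checked with exact rational arithmetic, as in the sketch below. The 1/60 second display period is a hypothetical value picked for the example, not a figure from the text, and the function name is likewise an assumption.

```python
from fractions import Fraction


def every_image_is_displayed(output_period, display_period):
    """True when the image output period is a whole multiple of the display
    period, so that each output image coincides with a display refresh."""
    ratio = Fraction(output_period) / Fraction(display_period)
    return ratio >= 1 and ratio.denominator == 1


DISPLAY_PERIOD = Fraction(1, 60)  # hypothetical refresh period in seconds

aligned = every_image_is_displayed(Fraction(1, 30), DISPLAY_PERIOD)     # 2 refreshes per image
misaligned = every_image_is_displayed(Fraction(1, 25), DISPLAY_PERIOD)  # 2.4 refreshes per image
```

When the ratio is not a whole number, some images would fall between refreshes and be shown for uneven durations or dropped, which is the unevenness the fourth aspect avoids.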

In the synchronized image generation method according to a fifth aspect of the invention, the action data are generated by reproducing the sound from the sound data at the reference speed, having the object move in time with the reproduced sound, and measuring the position or movement of each constituent element of the object at a period shorter than the period of the synchronizing signal output in the synchronizing signal output step. In the video generation step, images representing the object as it changes over time are generated successively from the action data and shape data and output at a period shorter than the period of the synchronizing signal, so that video of the object moving in time with the sound is reproduced. Shortening the period at which the images are output makes the object's action smoother and allows the sound and the action to be synchronized accurately. Even when the reproduction speed of the sound is faster than the reference speed, thinning the action data accordingly while keeping the output period short keeps the object's action smooth and accurately synchronized with the sound.

In the synchronized image generation method according to a sixth aspect of the invention, a code identifying the reproduction position of the sound data is added to the synchronizing signal output in the synchronizing signal output step. That is, a different code is attached to each clock pulse of the synchronizing signal. The reproduction position of the sound data can thus be identified, and the image corresponding to that position can be displayed. Therefore, even when reproduction of a sound such as a musical piece starts partway through, video of the object whose action is accurately synchronized with the sound can be displayed.
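How a position code on each synchronizing pulse allows reproduction to start partway through can be sketched as below. The encoding is an assumption made for illustration: codes are taken to be consecutive integers starting at 0, and a fixed number of action samples is assumed per pulse at the reference speed. The class and parameter names are hypothetical.

```python
class FrameLocator:
    """Map the code carried by a synchronizing pulse to the action samples
    that belong to that stretch of the piece."""

    def __init__(self, frames_per_pulse):
        self.frames_per_pulse = frames_per_pulse

    def frames_for(self, pulse_code):
        """Return the indices of the action samples to show for this pulse."""
        start = pulse_code * self.frames_per_pulse
        return list(range(start, start + self.frames_per_pulse))


locator = FrameLocator(frames_per_pulse=4)

# Reproduction from the beginning: the first pulse carries code 0.
from_start = locator.frames_for(0)

# Reproduction started partway through: the first pulse received carries
# code 10, so the video jumps straight to the matching samples.
from_middle = locator.frames_for(10)
```

Because each pulse identifies its own position, the video generator does not need to count pulses from the start of the piece.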

The karaoke apparatus according to a seventh aspect of the invention comprises: a sound data storage component that stores sound data; a sound reproduction component that reproduces sound from the sound data stored in the sound data storage component; a synchronizing signal output component that outputs a synchronizing signal corresponding to the reproduction speed of the sound reproduced by the sound reproduction component; a shape data storage component that stores shape data defining the shapes of the constituent elements into which the image of an object (a person, an animal, or a likeness thereof) is divided; an action data storage component that stores action data defining the position or movement of each constituent element of the object; a video generation component that generates video of the object from the shape data stored in the shape data storage component and the action data stored in the action data storage component; and a display component that displays the video generated by the video generation component on a display device in synchronization with the synchronizing signal output by the synchronizing signal output component.

With this structure, sound is reproduced during karaoke from the sound data stored in the sound data storage component. Video showing the object is generated from the shape data stored in the shape data storage component and the action data stored in the action data storage component, and is displayed in synchronization with the reproduced sound. Video of the object moving in synchronization with the reproduced sound can thus be displayed.

Notably, the sound data are stored in the sound data storage component, the shape data in the shape data storage component, and the action data in the action data storage component, so each kind of data can be accessed independently. For example, the sound data and action data can differ for every piece while the shape data are the same for every piece. It is also possible to change only the shape data, or to add action data and shape data to existing sound data.

The karaoke apparatus according to an eighth aspect of the invention further has a reproduction speed changing component that changes the reproduction speed of the sound reproduced by the sound reproduction component. When the reproduction speed is changed with this component, the period of the synchronizing signal changes with it, and the speed of the action of the video, which moves in synchronization with the synchronizing signal, changes accordingly. The sound and the object's action therefore remain synchronized even when the reproduction speed of the sound changes.

In the karaoke apparatus according to a ninth aspect of the invention, the action data are generated by reproducing the sound from the sound data at the reference speed, having the object move in time with the reproduced sound, and measuring the position or movement of each constituent element of the object at a predetermined period. When the reproduction speed of the sound reproduced by the sound reproduction component equals the reference speed, the video generation component generates the video of the object from the action data and shape data. When the reproduction speed is slower than the reference speed, the video generation component interpolates the action data to generate interpolated action data and generates the video of the object from the action data, the interpolated action data, and the shape data. As in the second aspect of the invention, the action of the displayed video is thus smooth both when the sound is reproduced at the reference speed and when it is reproduced more slowly.

In the karaoke apparatus according to a tenth aspect of the invention, when the reproduction speed of the sound reproduced by the sound reproduction component is faster than the reference speed, the video generation component thins out the action data and generates the video of the object from the thinned action data and the shape data. As in the third aspect of the invention, video whose action is synchronized with sound reproduced faster than the reference speed can thus be reproduced easily.

In the karaoke apparatus according to an eleventh aspect of the invention, the action data are generated by reproducing the sound from the sound data at the reference speed, having the object move in time with the reproduced sound, and measuring the position or movement of each constituent element of the object at a period that is an integral multiple of the display period of the display device. As in the fourth aspect of the invention, the action of the displayed video is thus smooth.

In the karaoke apparatus according to a twelfth aspect of the invention, the action data are generated by reproducing the sound from the sound data at the reference speed, having the object move in time with the reproduced sound, and measuring the position or movement of each constituent element of the object at a period shorter than the period of the synchronizing signal output by the synchronizing signal output component. As in the fifth aspect of the invention, the object's action is thus smooth and the sound and the displayed video are accurately synchronized even when the reproduction speed of the sound is relatively fast.

In the karaoke apparatus according to a thirteenth aspect of the invention, a code identifying the reproduction position of the sound data is added to the synchronizing signal output by the synchronizing signal output component. As in the sixth aspect of the invention, the reproduction position of the sound data can be identified from the code attached to the synchronizing signal, and the image corresponding to that position can be displayed. Therefore, even when reproduction of a sound such as a musical piece starts partway through, video whose action is accurately synchronized with the sound can be displayed.

The karaoke apparatus according to a fourteenth aspect of the invention further has a data receiving component that receives sound data and action data transmitted from outside, stores the received sound data in the sound data storage component, and stores the received action data in the action data storage component. Of the sound data, action data, and shape data, only the sound data and action data are thus received from outside. Sound is reproduced from the received sound data, and video is displayed using the received action data and the shape data already stored in the karaoke apparatus. Receiving only the comparatively small sound data and action data reduces the amount of data transmitted to the karaoke apparatus from outside.

In the karaoke apparatus according to a fifteenth aspect of the invention, the shape data storage component stores several sets of shape data for forming several objects of different shapes. One set of shape data is selected either by selection data included in the sound data or by external input, and the video generation component generates the video of the object from the selected shape data and the action data stored in the action data storage component. For example, if the music data include selection data for choosing a set of shape data, the shape data are selected according to those selection data and the shape of the object is set by the selected shape data. Alternatively, the shape data can be selected by external input. In either case, objects of various shapes, such as an animal, a person, a man, or a woman, can be formed.
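The selection of shape data can be sketched as a simple lookup. Everything named here is hypothetical: the keys in the store, the default, and in particular the rule that external input takes priority over selection data carried with the piece, which the text does not specify.

```python
# Hypothetical shape data store: key -> stand-in for one set of shape data.
SHAPE_STORE = {
    "woman": "shape data for a female figure",
    "man": "shape data for a male figure",
    "animal": "shape data for an animal figure",
}


def select_shape(store, song_selection=None, external_selection=None,
                 default_key="woman"):
    """Pick one set of shape data.

    Assumed priority for this sketch: external input first, then the
    selection data carried with the piece, then a default.
    Unknown keys fall back to the default.
    """
    for key in (external_selection, song_selection):
        if key is not None and key in store:
            return store[key]
    return store[default_key]


by_song = select_shape(SHAPE_STORE, song_selection="man")
by_user = select_shape(SHAPE_STORE, song_selection="man",
                       external_selection="animal")
fallback = select_shape(SHAPE_STORE, song_selection="robot")
```

The same action data can then be applied to whichever shape data were selected, so one recorded movement can drive differently shaped objects.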

Description of drawings

Fig. 1 is a schematic block diagram of a communication karaoke apparatus according to an embodiment of the present invention.

Fig. 2 is an explanatory diagram of the structure of the music data in this embodiment.

Fig. 3 is an explanatory diagram of the structure of the image data in this embodiment.

Fig. 4 is an explanatory diagram of the structure of the action data in this embodiment.

Fig. 5 is an explanatory diagram of the object model in this embodiment.

Fig. 6 is an explanatory diagram of the structure of the shape data in this embodiment.

Fig. 7 is a schematic perspective view of the constituent elements of the object model in this embodiment.

Fig. 8 is an explanatory diagram of the reproduction frame number table in this embodiment.

Fig. 9 is a timing chart showing how the MIDI clock, the synchronizing signal, and the frames are synchronized in this embodiment when a piece is reproduced at the reference speed.

Fig. 10 is a timing chart showing how the MIDI clock, the synchronizing signal, and the frames are synchronized in this embodiment when a piece is reproduced more slowly than the reference speed.

Fig. 11 is a timing chart showing how the MIDI clock, the synchronizing signal, and the frames are synchronized in this embodiment when a piece is reproduced faster than the reference speed.

Fig. 12 is a flowchart of the synchronized image generation processing used in the communication karaoke apparatus of this embodiment.

Fig. 13 is a flowchart of the synchronized image generation processing, continuing from Fig. 12.

Fig. 14 is a flowchart of the synchronized image generation processing, continuing from Fig. 13.

Fig. 15 is an explanatory diagram of the video displayed by the communication karaoke apparatus of this embodiment.

Embodiment

An embodiment of the present invention is described below with reference to Figs. 1 to 15. In this embodiment, the communication karaoke apparatus 100 shown in Fig. 1 is taken as an example of a karaoke apparatus that uses the synchronized image generation method of the invention.

(1) Structure and operation of the communication karaoke apparatus

First, the communication karaoke apparatus 100 is described. As shown in Fig. 1, the communication karaoke apparatus 100 has a karaoke performance unit 10, a video reproduction unit 30, a background image generation unit 40, and a synthesis unit 50. The karaoke performance unit 10 reproduces a musical piece and, while doing so, generates a lyrics image showing the lyrics of the piece. Here, "musical piece" means a piece such as a folk song or popular song, and in particular the accompaniment part of such a piece. The video reproduction unit 30 generates video of a singer whose action is synchronized with reproduction of the piece (hereinafter "singer video"). When a piece has several singers, the video reproduction unit 30 can generate video of several singers. Some pieces also have dancers (backup dancers) in addition to the singer. In that case the video reproduction unit 30 can generate not only the singer video but also video of the dancers accompanying the singer. The background image generation unit 40 generates the background image. The synthesis unit 50 combines the lyrics image, the singer video, and the background image. The karaoke performance unit 10 is connected to a center host computer 200 by a telephone line. It is also connected to a mixing amplifier 60, which mixes the sound of the piece with the sound input from a microphone 80. The mixing amplifier 60 is connected to a loudspeaker 70 and the microphone 80. The synthesis unit 50 is connected to a monitor 90, which displays the video and images output by the synthesis unit 50.

The karaoke performance unit 10 is described next. It consists of a sound CPU (Central Processing Unit) 11, a RAM (Random Access Memory) 12, a ROM (Read Only Memory) 13, a CD-ROM reader 14, a music data storage unit 16, an input unit 17, a lyrics image generating unit 18, a sound source unit 19, and a FIFO (First In First Out) circuit 20. These components are connected to one another by a bus 22.

The sound CPU 11 performs overall control of the karaoke performance unit 10 and carries out automatic performance of the piece. Specifically, the sound CPU 11 can be configured for a format such as MIDI (Musical Instrument Digital Interface), in which case it performs the piece automatically according to MIDI data. The sound CPU 11 has a timer and generates the synchronizing signal described later from the MIDI clock. It can also change the playback speed of the piece, that is, its tempo.

The RAM 12 serves as a working storage area when the sound CPU 11 performs control processing, and temporarily stores various data. The ROM 13 stores the control programs that govern the operation of the karaoke performance unit 10.

The CD-ROM reader 14 reads the music data and image data described later from a CD-ROM. The CD-ROM is loaded from outside and stores various music data and image data. Music data read by the CD-ROM reader 14 is transferred to the music data storage unit 16. It is also transferred to the RAM 12, in particular when the piece is to be played back immediately by the sound CPU 11 and the sound source unit 19.

The music data storage unit 16 can be, for example, a hard disk, and can store music data for roughly 2000 pieces. It is rewritable, so that music data received from the central host computer 200 through a modem 21 and music data transferred from the CD-ROM reader 14 can be added to it. The music data storage unit 16 is connected to the bus 22 through an interface circuit 15.

The input unit 17 is used to enter the various instructions that control the communication karaoke apparatus 100: selecting the piece to be played back, setting the playback tempo, setting the key of the piece, setting viewpoint information and light source information, fast-forwarding or rewinding the piece, and so on.

The lyrics image generating unit 18 can be, for example, an OSD (On Screen Display) circuit, and generates the lyrics image shown on the monitor 90 during playback. As described later, the music data contains sound data for playing back the piece and lyrics data for generating the lyrics image. The lyrics image generating unit 18 generates the lyrics image from the lyrics data contained in the music data.

The sound source unit 19 synthesizes sound from the sound data contained in the music data. For example, the sound data can be MIDI data for automatic performance conforming to the MIDI specification, and the sound source unit 19 can be a synthesizer that produces musical tones from this MIDI data.

The FIFO circuit 20 buffers data transferred between the karaoke performance unit 10 and the video playback unit 30. The synchronizing signal generated by the sound CPU 11 may also be output directly to the video playback unit 30 without passing through the FIFO circuit 20.

The modem 21 is connected to the central host computer 200 through the telephone line, and receives and demodulates data sent from the central host computer 200 over that line. The central host computer 200 stores a large amount of music data together with the corresponding image data, and sends music data and image data to the communication karaoke apparatus 100 when required. The modem 21 receives and demodulates this music data and image data, and then passes it to the RAM 12, the music data storage unit 16, the image data storage unit 34 in the video playback unit 30, and so on.

The video playback unit 30 is described next. It has a video CPU 31, a ROM 32, an image data storage unit 34, a shape data storage unit 35, and a singer video generating unit 36. These components are connected to one another by a bus 37.

The video CPU 31 performs overall control of the video playback unit 30. The ROM 32 stores the control programs that govern the operation of the video playback unit 30, including the control program for the synchronized image generation processing described later.

The image data storage unit 34 can be, for example, a hard disk, and stores image data. It is rewritable, so that image data received from the central host computer 200 through the modem 21 and image data transferred from the CD-ROM reader 14 can be stored and updated. The image data storage unit 34 is connected to the bus 37 through an interface circuit 33.

The shape data storage unit 35 stores the shape data described later, and can be, for example, a RAM, a ROM, or a hard disk.

The singer video generating unit 36 has components such as a sub-CPU, memory, and an OSD circuit. It generates instantaneous image data from the action data contained in the image data stored in the image data storage unit 34 and from the shape data stored in the shape data storage unit 35, and outputs the generated instantaneous image data to the compositing unit 50.

Specifically, as shown in Fig. 15, during playback of a piece the communication karaoke apparatus 100 displays a lyrics image Im1, a background image Im2, and a singer video Im3 on the monitor 90. The singer video Im3 is a moving image matched to the piece. It is formed by displaying a series of still images of the singer one after another at a predetermined video playback period (for example, 1/15 second). In this embodiment each of these still images is called a "frame". "Instantaneous image data" is the data that forms one such still image, in other words the data for one frame of singer video. The singer video generating unit 36 continuously generates instantaneous image data from the action data and shape data and passes it to the compositing unit 50 at the video playback period. The still images of the singer are thus displayed in succession, producing a moving singer video Im3 matched to the piece.

More specifically, the singer video generating unit 36 is provided with a work memory 36A and a display memory 36B. The unit receives action data and shape data from the image data storage unit 34 and the shape data storage unit 35 respectively, stores them in the work memory 36A, and generates instantaneous image data there. It then transfers the instantaneous image data from the work memory 36A to the display memory 36B. The display memory 36B corresponds one-to-one to a frame of singer video; it is a so-called video memory. One frame of singer video is therefore formed in the display memory 36B from the instantaneous image data transferred from the work memory 36A. The singer video formed in the display memory 36B is output to the compositing unit 50, where it is composited with the lyrics image and the background image, and is then displayed on the monitor 90.

The background image generating unit 40 can be, for example, an LD (Laser Disc) player, and generates the background image Im2 shown on the monitor 90. Specifically, it reads the image data for the background recorded on an LD, generates the background image from that data, and outputs it to the compositing unit 50.

The monitor 90 can be, for example, a CRT (Cathode-Ray Tube) display or an LCD.

With the communication karaoke apparatus 100 described above, a user who wants to sing karaoke first operates the input unit 17 to select a piece, set the tempo, and so on, and then enters an instruction to prepare the performance. The music data corresponding to the selected piece is then retrieved from the music data stored in the music data storage unit 16 of the karaoke performance unit 10. If the music data storage unit 16 does not hold it, the music data is read from the CD-ROM loaded in the CD-ROM reader 14. If neither the music data storage unit 16 nor the CD-ROM holds it, a request to send the music data for the selected piece is issued to the central host computer 200, and the music data sent by the central host computer 200 is received through the modem 21.
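The lookup order described above (storage unit 16, then the CD-ROM, then the host computer 200) can be sketched as a simple fallback chain. This is only an illustration: the function name `fetch_music_data` and the three dictionaries standing in for the storage locations are hypothetical and do not appear in the source. The same order is described for image data.

```python
from typing import Callable, Optional, Sequence

# A source maps a piece number to its music data, or returns None if absent.
Source = Callable[[int], Optional[bytes]]


def fetch_music_data(piece_number: int, sources: Sequence[Source]) -> bytes:
    """Try each source in priority order and return the first hit."""
    for source in sources:
        data = source(piece_number)
        if data is not None:
            return data
    raise LookupError(f"no music data for piece {piece_number}")


# Hypothetical stand-ins for storage unit 16, the CD-ROM, and host computer 200.
hard_disk = {1: b"piece-1"}
cd_rom = {2: b"piece-2"}
host_computer = {3: b"piece-3"}
sources = [hard_disk.get, cd_rom.get, host_computer.get]
```

In a real apparatus the third source would issue a transfer request over the modem 21 rather than read a local dictionary.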

The music data contains sound data for playing back the piece and lyrics data for generating the lyrics image. The music data is separated into sound data and lyrics data; the sound data is passed to the RAM 12 and the lyrics data to the lyrics image generating unit 18.

The sound CPU 11 then performs the piece automatically according to the sound data transferred from the RAM 12. The sound synthesized by the sound source unit 19 is output to the loudspeaker 70 through the mixing amplifier 60, and the piece is played back. At the same time, the sound CPU 11 outputs a synchronizing signal corresponding to the performance to the video playback unit 30, and the lyrics image generating unit 18 generates the lyrics image from the lyrics data and outputs it to the compositing unit 50.

Meanwhile, the image data corresponding to the selected piece is retrieved from the image data stored in the image data storage unit 34 of the video playback unit 30. If the image data storage unit 34 does not hold it, the image data is read from the CD-ROM loaded in the CD-ROM reader 14. If neither the image data storage unit 34 nor the CD-ROM holds it, a request to send the image data for the selected piece is issued to the central host computer 200, and the image data sent by the central host computer 200 is received through the modem 21.

The action data is then extracted from this image data and transferred, together with the shape data stored in the shape data storage unit 35, to the work memory 36A in the singer video generating unit 36, which generates the instantaneous image data that forms each frame of singer video. The generated instantaneous image data is output to the compositing unit 50 at the video playback period.

Meanwhile, the background image generating unit 40 outputs the background image corresponding to the selected piece to the compositing unit 50. The compositing unit 50 composites the lyrics image output by the lyrics image generating unit 18, the singer video output by the singer video generating unit 36, and the background image output by the background image generating unit 40, and the result is displayed on the monitor 90.

In this way, while the selected piece is played back, the moving singer video, the lyrics image, and the background image are displayed on the monitor 90 in synchronization with the playback.

(2) Structure of the music data and image data

The structure of the music data and the image data is described next.

Fig. 2 shows one possible structure of the music data. The music data consists of a header Hm, a piece number Nm, tempo data Tm, background data Bm, several items of performance data Pm, and so on.

The piece number Nm identifies the piece. The tempo data Tm indicates the reference tempo (reference speed) for playback of the piece. The background data Bm is a number that identifies the background image corresponding to the piece.

The performance data Pm includes pairs consisting of sound data Sd and time data Tsd, pairs consisting of lyrics data Wd and time data Twd, and so on. The sound data Sd is the data for playing back the piece and can be, for example, MIDI data. More specifically, the sound data Sd includes commands such as one to start sounding a note, one to stop sounding a note, and one specifying which note to sound. The time data Tsd is placed next to the sound data Sd and controls the time (in clock pulses) at which the command in the sound data is executed. The lyrics data Wd is the data for generating the lyrics image. The time data Twd is placed next to the lyrics data Wd and controls the time (in clock pulses) at which the lyrics image is generated.

Fig. 3 shows one possible structure of the image data. The image data consists of a header Hp, a piece number Np, a data count Dp, several items of action data Mp, and so on. The piece number Np identifies the piece. The data count Dp indicates the number of items of action data contained in the image data. The action data Mp is described below.

(3) Structure of the action data and shape data

The structure of the action data and the shape data is described next. As mentioned above, these are the data from which the instantaneous image data is generated.

The action data defines the motion of the singer video. As shown in Fig. 3, the image data contains several items of action data, each corresponding to one frame of singer video. Because the action data is contained in the image data, it can be set separately for each piece. In other words, the action data differs from piece to piece and is specific to the piece.

Specifically, an object representing a person, an animal, or an imitation of one is divided into several elements, and the action data sets the position and rotation of those elements.

For example, as shown in Fig. 5, a human body model imitating a person is defined and notionally divided into elements such as the waist, chest, head, arms, and legs. Each element produced by this division is called a "segment". The model in Fig. 5, for example, is divided into segments L1 to L17, of which segment L1 corresponds to the waist of the model and segment L5 to the head. Segment L1 can move along the X, Y, and Z axes and can also rotate about axes in the X, Y, and Z directions. Segments L2 to L17 are each connected through a connecting portion R and, taking the connecting portion R as reference, can rotate about axes in the X, Y, and Z directions. The action data records the position and rotation of these segments.

Fig. 4 shows the content of the action data. As shown in Fig. 4, segment L1 is given position coordinates (X, Y, Z) and rotation angles (Xr, Yr, Zr). Segments L2 to L17 are given rotation angles (Xr, Yr, Zr) only.
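A minimal sketch of one frame of action data as laid out in Fig. 4 might look as follows. The class and field names are hypothetical, chosen only for illustration; the source specifies just that L1 carries a position plus rotation angles and that L2 to L17 carry rotation angles only.

```python
from dataclasses import dataclass
from typing import List, Tuple

Vec3 = Tuple[float, float, float]
SEGMENT_COUNT = 17  # L1 .. L17 in Fig. 5


@dataclass
class FrameActionData:
    """Action data for one frame of singer video (hypothetical layout)."""
    root_position: Vec3    # (X, Y, Z) of L1, the waist
    rotations: List[Vec3]  # (Xr, Yr, Zr) for L1 .. L17, in order

    def __post_init__(self) -> None:
        if len(self.rotations) != SEGMENT_COUNT:
            raise ValueError("expected one rotation triple per segment")

    def value_count(self) -> int:
        """Number of scalar values stored for this frame."""
        return 3 + 3 * len(self.rotations)


# A neutral pose: model at the origin, every joint unrotated.
rest_pose = FrameActionData((0.0, 0.0, 0.0), [(0.0, 0.0, 0.0)] * SEGMENT_COUNT)
```

With this layout a frame holds 3 position values and 17 rotation triples, 54 scalar values in all.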

The action data can be created as follows. First, the piece is played back at the reference tempo set in advance for it. A person then actually moves in time with the piece, performing the same motions as the singer. The position, rotation, and so on of each element of the performer's body are measured at a predetermined measurement period, for example every 1/15 second. This yields, at each measurement period, the position coordinates and rotation angles of segment L1 and the rotation angles of segments L2 to L17. The action data is produced in this way.

The shape data, on the other hand, is obtained by dividing the image of an object representing a person, an animal, or an imitation of one into several elements, and sets the shape of those elements. It is stored in the shape data storage unit 35. Unlike the action data, the shape data is not set separately for each piece; in general it can be shared among several pieces. Several kinds of shape data can be prepared according to the kind of singer video to be displayed. For example, switching from one kind of shape data to another changes the displayed singer video from a human figure to an animal figure.

Shape data can be set separately for each of segments L1 to L17. Fig. 6 shows one possible structure of the shape data of segment L1. As shown in Fig. 6, the shape data comprises vertex coordinate data made up of vertex coordinates A1 to A8, polygon shape data made up of polygon data P1 to P6, and so on. As shown in Fig. 7, the vertex coordinates A1 to A8 define the three-dimensional shape of segment L1. The polygon shape data defines the texture and properties of each face of segment L1. Specifically, each item of polygon data P1 to P6 consists of surface data and vertex numbers. The surface data specifies the surface color, outline, transparency, whether an image (texture) is applied to the surface, and so on. The vertex numbers indicate which vertices form each face. The shape data of segment L1 is structured in this way, and the shape data of segments L2 to L17 is structured in essentially the same way. The shape data also contains information about the connecting portion R of each segment (see Fig. 5).
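As a rough illustration of shape data with eight vertices A1 to A8 and six polygons P1 to P6, the sketch below models one element as a rectangular box. The box shape, the class names, the field names, and the dimensions are all assumptions made for the example; Fig. 7 may depict a different solid.

```python
from dataclasses import dataclass
from typing import List, Tuple

Vec3 = Tuple[float, float, float]


@dataclass
class PolygonData:
    """One face: surface data plus the numbers of the vertices forming it."""
    color: Tuple[int, int, int]              # surface color (RGB)
    transparency: float                      # 0.0 = opaque
    textured: bool                           # whether a texture image is applied
    vertex_numbers: Tuple[int, int, int, int]  # 1-based, so 1 means A1


@dataclass
class SegmentShape:
    vertices: List[Vec3]         # A1 .. A8
    polygons: List[PolygonData]  # P1 .. P6


def box_shape(width: float, height: float, depth: float) -> SegmentShape:
    """Build a box-shaped element (an assumed shape, for illustration only)."""
    w, h, d = width, height, depth
    vertices = [
        (0.0, 0.0, 0.0), (w, 0.0, 0.0), (w, h, 0.0), (0.0, h, 0.0),  # A1..A4
        (0.0, 0.0, d), (w, 0.0, d), (w, h, d), (0.0, h, d),          # A5..A8
    ]
    faces = [
        (1, 2, 3, 4), (5, 6, 7, 8),  # z = 0 and z = depth
        (1, 2, 6, 5), (3, 4, 8, 7),  # y = 0 and y = height
        (2, 3, 7, 6), (4, 1, 5, 8),  # x = width and x = 0
    ]
    polygons = [PolygonData((200, 200, 200), 0.0, False, f) for f in faces]
    return SegmentShape(vertices, polygons)


waist_shape = box_shape(0.4, 0.2, 0.25)
```

Because faces refer to vertices by number rather than repeating coordinates, each coordinate triple is stored once even though every vertex belongs to three faces.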

The communication karaoke apparatus 100 of this embodiment therefore produces singer video from action data and shape data. As Figs. 4 and 6 show, both consist of small items such as coordinate data, angle data, number data, and data indicating image color, so singer video can be produced from a small amount of data. Moreover, the singer video data produced by the communication karaoke apparatus 100 is separated into action data and shape data. Both can be kept small, and the action data can be made smaller than the shape data. Action data that differs for each piece can therefore be created and attached to each piece as data specific to it: because the action data is so small, attaching different action data to every piece does not unduly increase the data processing or storage burden. For example, when action data is sent from the central host computer 200 to the communication karaoke apparatus 100, it can be sent in a short time. Even when the number of pieces grows and action data is stored for every piece, the storage capacity required of the CD-ROM and of the storage device of the central host computer 200 is greatly reduced.
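To give a feel for the scale involved, the following back-of-the-envelope calculation estimates the raw size of the action data for one song. The 17 elements and the 1/15 second period come from the text; the 4 bytes per value and the 240 second duration are assumptions introduced here, and the source states no actual sizes.

```python
SEGMENTS = 17                         # L1 .. L17
VALUES_PER_FRAME = 3 + 3 * SEGMENTS   # L1 position + 17 rotation triples
FRAMES_PER_SECOND = 15                # 1/15 second measurement period
BYTES_PER_VALUE = 4                   # assumption: 32-bit values
DURATION_SECONDS = 240                # assumption: a four-minute song


def action_data_bytes(seconds: int = DURATION_SECONDS,
                      bytes_per_value: int = BYTES_PER_VALUE) -> int:
    """Uncompressed action data size for a song of the given length."""
    return VALUES_PER_FRAME * bytes_per_value * FRAMES_PER_SECOND * seconds


print(action_data_bytes())  # 777600
```

Under these assumptions the action data for a whole song is well under one megabyte before any compression.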

(4) Synchronization of the piece and the singer video

The way the piece played back by the karaoke performance unit 10 is synchronized with the singer video generated by the video playback unit 30 is described next.

The piece is played back mainly by the automatic performance function of the sound CPU 11. As mentioned above, the sound CPU 11 performs the piece automatically from the sound data in MIDI format. The basic timing of the automatic performance is controlled by the MIDI clock. For example, the MIDI clock can be a signal that outputs 24 clock pulses per quarter note of the piece. When the tempo of the piece changes, the pulse interval of the MIDI clock changes with it.
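The dependence of the MIDI clock interval on tempo can be made concrete with a short calculation. The function name below is hypothetical, and the tempo is assumed to be given in quarter notes per minute.

```python
from fractions import Fraction

MIDI_CLOCKS_PER_QUARTER_NOTE = 24


def midi_clock_interval(tempo_bpm: int) -> Fraction:
    """Seconds between MIDI clock pulses at the given tempo."""
    seconds_per_quarter_note = Fraction(60, tempo_bpm)
    return seconds_per_quarter_note / MIDI_CLOCKS_PER_QUARTER_NOTE


print(midi_clock_interval(120))  # 1/48
print(midi_clock_interval(60))   # 1/24
```

Halving the tempo doubles the interval, which is why any signal derived from the MIDI clock follows tempo changes automatically.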

A reference tempo is set in advance for each piece and is contained in the music data as the tempo data Tm. The reference tempo differs from piece to piece: a fast reference tempo can be set for a rhythmic piece such as rock music, and a slow one for a slow piece such as a folk ballad. A piece is normally played back at its reference tempo. When the tempo is changed through the input unit 17, however, the piece is played back at the entered tempo. For example, if the tempo is changed to one faster than the reference tempo, the piece is played back at that faster tempo.

The singer video, on the other hand, is generated mainly by the singer video generating unit 36. As mentioned above, the singer video generating unit 36 generates, from the action data and shape data, instantaneous image data corresponding to one frame of singer video and passes it to the compositing unit 50 at the video playback period. That is, it generates instantaneous image data successively in the work memory 36A and transfers it to the display memory 36B at the video playback period. Singer video that changes at the video playback period is thus displayed on the monitor 90 through the compositing unit 50.

The video playback period is the same as the measurement period used when the action data was created, for example 1/15 second. As described above, the action data is created by playing back the piece at the reference tempo Tm, having a person actually move (sway, dance) in time with it, and measuring the position and rotation of each element of the moving body at the predetermined measurement period. By generating instantaneous image data from this action data and outputting it at a video playback period equal to the measurement period, the measured human motion is essentially reproduced, and singer video with smooth motion is obtained. The video playback period is normally kept constant.

The video playback period can also be set to an integer multiple of the display period of the monitor 90. When the monitor 90 is a CRT display or an LCD, it has its own display period. For example, the display period of a typical NTSC CRT display is 1/30 second. When the monitor 90 is an NTSC CRT display, the video playback period can be set to twice 1/30 second, that is, 1/15 second. Each frame of singer video is then displayed cleanly on the monitor 90, and the displayed video moves smoothly.

The piece and the singer video are synchronized mainly by the singer video generating unit 36, using the synchronizing signal output by the sound CPU 11 and the playback frame number table Ts.

The synchronizing signal can be a signal that outputs one clock pulse per eighth note of the piece. It can be generated by frequency-dividing the MIDI clock, and its pulse interval changes when the tempo of the piece changes. The synchronizing signal is generated by the sound CPU 11.
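Since the MIDI clock gives 24 pulses per quarter note and the synchronizing signal gives one pulse per eighth note, the division ratio works out to 12. The generator below is a minimal sketch of such a divider; its name is hypothetical, and it assumes that the first synchronizing pulse coincides with the first MIDI clock, which the source does not state.

```python
from typing import Iterator, Tuple

MIDI_CLOCKS_PER_QUARTER_NOTE = 24
SYNC_PULSES_PER_QUARTER_NOTE = 2  # one pulse per eighth note
DIVISOR = MIDI_CLOCKS_PER_QUARTER_NOTE // SYNC_PULSES_PER_QUARTER_NOTE  # 12


def sync_pulses(midi_clock_count: int) -> Iterator[Tuple[int, int]]:
    """Yield (midi_clock_index, pulse_number) for each synchronizing pulse."""
    pulse_number = 0
    for clock_index in range(midi_clock_count):
        if clock_index % DIVISOR == 0:
            pulse_number += 1
            yield clock_index, pulse_number


# Two quarter notes of MIDI clock produce four synchronizing pulses.
print(list(sync_pulses(48)))  # [(0, 1), (12, 2), (24, 3), (36, 4)]
```

Because the divider only counts incoming clocks, the synchronizing pulse interval stretches or shrinks with the tempo without any extra logic.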

The playback frame number table Ts is created as follows. First, as shown in Fig. 9, a pulse number is assigned to each clock pulse of the synchronizing signal. The pulse numbers increase in order, 1, 2, 3, ..., from the beginning of the piece to its end. Next, as also shown in Fig. 9, a frame number is assigned to each frame of singer video. The frame numbers likewise increase in order, F1, F2, F3, ..., from the beginning of the piece to its end. For playback at the reference tempo, the pulse interval of the synchronizing signal is calculated, and from this interval and the video playback period the frame that coincides in time with each clock pulse of the synchronizing signal is determined. The correspondence between pulse numbers and frame numbers is recorded in the playback frame number table Ts, as shown in Fig. 8. The table thus records, against each pulse number, the number of the frame that coincides in time with that clock pulse when the piece is played back at the reference tempo. A frame whose number is recorded in the table Ts, that is, a frame that coincides with a clock pulse of the synchronizing signal at the reference tempo, is called a "synchronization frame". In the table Ts of Fig. 8, for example, the synchronization frames are F1, F4, F7, and F11. In Figs. 9 to 11, the hatched frames are synchronization frames. The table Ts can be created by the singer video generating unit 36 just before playback of the piece begins.
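One way to build such a table is sketched below. Two things here are assumptions rather than statements from the source: the rule that a pulse is matched to the frame being displayed at the instant the pulse occurs (a floor division), and the reference tempo of 135 quarter notes per minute, which was picked only because it reproduces the F1, F4, F7, F11 entries of Fig. 8 with a 1/15 second frame period.

```python
from fractions import Fraction
from typing import Dict


def build_playback_frame_table(reference_tempo_bpm: int,
                               frame_period: Fraction,
                               pulse_count: int) -> Dict[int, int]:
    """Map each synchronizing pulse number to a frame number at reference tempo."""
    # One synchronizing pulse per eighth note, i.e. two per quarter note.
    pulse_interval = Fraction(60, reference_tempo_bpm) / 2
    table = {}
    for pulse_number in range(1, pulse_count + 1):
        elapsed = (pulse_number - 1) * pulse_interval
        # Frames are numbered from 1; frame k is on screen during
        # [(k - 1) * frame_period, k * frame_period).
        table[pulse_number] = int(elapsed // frame_period) + 1
    return table


table_ts = build_playback_frame_table(135, Fraction(1, 15), 4)
print(table_ts)  # {1: 1, 2: 4, 3: 7, 4: 11}
```

Exact fractions are used instead of floating point so that a pulse falling exactly on a frame boundary is assigned consistently.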

The "principle of synchronizing the piece and the singer video" is described below with reference to Figs. 8 to 11.

1. When the piece is played back at the reference tempo

When the piece is played back at the reference tempo, the singer video generating unit 36 displays the first frame F1 when playback begins, as shown in Fig. 9, and then displays each subsequent frame in turn in parallel with playback of the piece. That is, in parallel with playback, it generates the instantaneous image data for each frame and outputs it to the compositing unit 50 at the video playback period t shown in Fig. 9. The compositing unit 50 outputs the image corresponding to the instantaneous image data to the monitor 90 in synchronization with the monitor's own display period.

The action data used to form the instantaneous image data was created by measuring human motion matched to the piece while the piece was played back at the reference tempo. Therefore, when the piece is played back at the reference tempo, it is enough to make the start of playback coincide with the start of display of the frames and to let playback and frame display proceed in parallel; the piece and the motion of the singer video then stay synchronized automatically. However, when the playback tempo changes during playback, or when the piece is paused and then resumed, the time correspondence between playback of the piece and display of the frames must be checked frequently. During playback, the singer video generating unit 36 checks this correspondence frequently using the synchronizing signal and the playback frame number table Ts. In the table Ts of Fig. 8, for example, clock pulse 1 corresponds to frame F1, clock pulse 2 to frame F4, clock pulse 3 to frame F7, and clock pulse 4 to frame F11. The singer video generating unit 36 displays the singer video while confirming that the correspondence between pulse numbers and frame numbers agrees with the entries in the table Ts.

2. When the piece is played back at a tempo slower than the reference tempo

When the piece is played back at a tempo slower than the reference tempo, the singer video generating unit 36 inserts interpolation frames G1 and G2 between frames, as shown in Fig. 10. If the frames were simply displayed in order at the video playback period t while the piece is played back at a slower tempo, the motion of the singer video would run ahead of the piece, and the two would no longer be synchronized. The singer video generating unit 36 therefore inserts interpolation frames G1 and G2 between frames so that the correspondence between pulse numbers and frame numbers agrees with the entries in the playback frame number table Ts.

The action data used to generate the instantaneous image data for the interpolation frames G1 and G2 (hereinafter "interpolation action data") can be calculated from the action data of the frame immediately before the insertion position (hereinafter "preceding action data") and the action data of the frame immediately after it (hereinafter "following action data"). More specifically, when the motion of the singer video is vigorous and the difference between the preceding and following action data is large, the mean of the two can be used as the interpolation action data. When the motion is slow and the difference is small, either the preceding or the following action data can be used as it is. In this way, even when the piece is played back at a tempo slower than the reference tempo, the piece and the singer video are easily synchronized and the displayed singer video moves smoothly.
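The rule for choosing interpolation action data can be sketched as follows. Action data is treated here as a flat list of numbers, and the `threshold` that separates a "large" difference from a "small" one is a hypothetical parameter; the source gives no numeric criterion. Averaging angles component by component is also a simplification that ignores wrap-around at 360 degrees.

```python
from typing import List, Sequence


def interpolate_action_data(preceding: Sequence[float],
                            following: Sequence[float],
                            threshold: float) -> List[float]:
    """Return action data for a frame inserted between two existing frames."""
    if len(preceding) != len(following):
        raise ValueError("action data must have the same length")
    largest_change = max(abs(b - a) for a, b in zip(preceding, following))
    if largest_change > threshold:
        # Fast motion: use the mean so the inserted frame sits midway.
        return [(a + b) / 2 for a, b in zip(preceding, following)]
    # Slow motion: repeating the preceding frame is visually sufficient.
    return list(preceding)


fast = interpolate_action_data([0.0, 10.0], [20.0, 30.0], threshold=5.0)
slow = interpolate_action_data([0.0, 10.0], [1.0, 11.0], threshold=5.0)
print(fast, slow)  # [10.0, 20.0] [0.0, 10.0]
```

Copying a neighboring frame when little changes avoids arithmetic on every inserted frame, at no visible cost to smoothness.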

3. When the piece is played back at a tempo faster than the reference tempo

When the piece is played back at a tempo faster than the reference tempo, the singer video generating unit 36 thins out some of the frames, as shown in Figure 11. Thinning is applied, for example, at frames F3, F6, F9, F10, F13 and so on. Each frame is displayed in playback order at the video playback period t, so if the piece is played back at a faster tempo, the movement of the singer video lags behind the music and the two fall out of synchronization. The singer video generating unit 36 therefore thins out frames so that the correspondence between clock pulse numbers and frame numbers agrees with what is recorded in the playback frame number table Ts. In this way, even when the piece is played back at a tempo faster than the reference tempo, the piece and the singer video are easily synchronized.

(5) Synchronous image generation processing

The synchronous image generation processing is described below with reference to the flowcharts shown in Figures 12 to 14. This processing displays the singer video so that it moves in synchronization with playback of the piece, and is a concrete implementation of the "principle of synchronizing the piece and the singer video" described above. It is carried out mainly by the singer video generating unit 36 according to a control program stored in the ROM 32 of the video playback unit 30.

First, the user operates the input unit 17 of the communication karaoke apparatus 100 to select the piece to be played back. In response, the audio CPU 11 in the karaoke performance unit 10 transfers the music data for the selected piece from the music data storage unit 16 or the like to the RAM 12. The audio CPU 11 also extracts tempo data from the transferred music data and identifies the reference tempo of the piece from this tempo data. The audio CPU 11 further outputs the extracted tempo data to the video CPU 31.

While the audio CPU 11 is operating in this way, the video CPU 31 of the video playback unit 30 reads the image data for the selected piece from the image data storage unit 34 and transfers the action data contained in that image data to the singer video generating unit 36. The video CPU 31 also reads shape data from the shape data storage unit 35 and transfers it to the singer video generating unit 36. The video CPU 31 then outputs the tempo data received from the audio CPU 11 to the singer video generating unit 36. The synchronous image generation processing then starts in the singer video generating unit 36.

In step S1 shown in Figure 12, the singer video generating unit 36 receives the action data, the shape data and the tempo data. When shape data is received, it is stored in the working memory 36A (step S5). When action data is received, it is likewise stored in the working memory 36A (step S5). When tempo data is received, it is stored in a memory provided inside the singer video generating unit 36.

Next, in step S6, the singer video generating unit 36 determines whether the required shape data has been received. If it has not, processing returns to step S1 and shape data is received again.

Next, in step S7, the singer video generating unit 36 determines whether the required action data for the selected piece has been received. If it has not, processing returns to step S1 and action data is received again.

Next, in step S8, the singer video generating unit 36 sets the viewpoint position and the light source position of the singer video to be displayed on the monitor 90 to their initial positions. The viewpoint position represents the direction from which the singer is shot. For example, the singer video generating unit 36 can select a video shot from the front of the singer, a video shot from the singer's left, or a video shot from the singer's right. It can also select a close-up video shot near the singer or a video shot from a distance. The light source position represents the position of the lighting (spotlight) that illuminates the singer. For example, the singer video generating unit 36 can select a video in which the singer is lit from the upper left, from the upper right, or from the front. When the light source position is set to its initial position, the video in which the singer is lit from the upper left is selected.

Next, in step S9, the singer video generating unit 36 calculates, from the tempo data received in step S4, the pulse period of the synchronization signal for the case in which the piece is played back at the reference tempo. It then generates the playback frame number table Ts shown in Figure 8 from this pulse period and the video playback period.
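As a rough illustration of step S9, the sketch below derives the pulse period from a reference tempo and builds a table that maps clock pulse numbers to frame numbers. The layout of the actual table Ts (Figure 8) is not reproduced in this section, so the dictionary form, the function names `sync_period` and `build_frame_table`, and the rounding to the nearest frame are assumptions. The only values taken from the description are one clock pulse per eighth note and a video playback period of 1/15 second.

```python
from fractions import Fraction
from typing import Dict


def sync_period(reference_bpm: int, pulses_per_quarter: int = 2) -> Fraction:
    """Pulse period in seconds; one pulse per eighth note means 2 per quarter note."""
    return Fraction(60, reference_bpm * pulses_per_quarter)


def build_frame_table(reference_bpm: int,
                      n_pulses: int,
                      frame_period: Fraction = Fraction(1, 15)) -> Dict[int, int]:
    """Map clock pulse numbers (from 1) to the frame number shown at that pulse."""
    period = sync_period(reference_bpm)
    table = {}
    for pulse in range(1, n_pulses + 1):
        elapsed = (pulse - 1) * period          # time of this pulse from the start
        # Frame numbers start at 1; round to the nearest frame boundary
        # (an assumption for tempos that do not divide evenly).
        table[pulse] = round(elapsed / frame_period) + 1
    return table
```

For a reference tempo of 150 quarter notes per minute the pulse period is 1/5 second, so exactly three frames fit between pulses and the table reads pulse 1 to frame 1, pulse 2 to frame 4, and so on.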

Next, in step S10 shown in Figure 13, the singer video generating unit 36 identifies the playback position of the piece. A piece is normally played back from its beginning, but when the user has operated the input unit 17 to specify a playback position, playback starts from the specified position. In that case, playback position data indicating the specified position is sent from the input unit 17 to the singer video generating unit 36. Each clock pulse of the synchronization signal carries a clock pulse number. The singer video generating unit 36 matches the playback position data sent from the input unit 17 against the clock pulse numbers to determine the playback position of the piece.

Next, in step S11, the singer video generating unit 36 determines whether an instruction to start playback of the piece has been received. When the user operates the input unit 17 to enter an instruction to start playback, the instruction is passed from the input unit 17 to the audio CPU 11. In response, the audio CPU 11 starts playback from the beginning of the piece or from the specified playback position. The instruction to start playback is also passed from the input unit 17 to the singer video generating unit 36. When the determination in step S11 is "YES", the singer video generating unit 36 correspondingly starts displaying the singer video from the beginning of the piece or from the specified playback position.

Next, in step S12, the singer video generating unit 36 determines whether an instruction to end playback of the piece has been received. When the user operates the input unit 17 to enter an instruction to end (or abort) playback, the instruction is passed from the input unit 17 to the audio CPU 11, which ends playback of the piece in response. The instruction to end playback is also passed from the input unit 17 to the singer video generating unit 36. When the determination in step S12 is "YES", the singer video generating unit 36 ends display of the singer video and ends the synchronous image generation processing. When no instruction to end playback has been received, the determination in step S12 is "NO" and processing proceeds to step S13.

In steps S13 and S14, the singer video generating unit 36 determines whether instructions to set the viewpoint position and the light source position have been received. The user can operate the input unit 17 to enter such instructions whenever desired. The instructions to set the viewpoint position and the light source position are sent from the input unit 17 to the singer video generating unit 36, which stores them in its internal memory (step S15).

Next, in step S16, the singer video generating unit 36 generates the instantaneous image data for each frame of the singer video from the shape data and the action data stored in the working memory 36A.

Then, once 1/15 second has elapsed since the instantaneous image data was generated, the singer video generating unit 36 determines in step S18 whether a clock pulse of the synchronization signal output from the audio CPU 11 has been received.

When the singer video generating unit 36 has received a clock pulse of the synchronization signal, processing proceeds to step S22. In step S22, the singer video generating unit 36 determines whether the frame of the singer video currently being displayed at the moment the clock pulse is received is a synchronization frame. If it is, the singer video generating unit 36 performs display processing for that synchronization frame.

As described above, a synchronization frame is a frame that coincides in time with a clock pulse of the synchronization signal when the piece is played back at the reference tempo. When the piece is played back at the reference tempo, the pulse period of the synchronization signal therefore coincides with the interval at which synchronization frames are displayed. As shown in Figure 9, in that case the frame of the singer video displayed at the moment a clock pulse is received is automatically a synchronization frame.

If, on the other hand, the determination in step S22 is that the frame currently being displayed when the clock pulse is received is not a synchronization frame, the singer video generating unit 36 discards that frame in step S23, substitutes for it the synchronization frame nearest in time among the synchronization frames that follow it, and performs display processing.

This situation arises when the tempo of the piece is faster than the reference tempo. As the tempo increases, the clock pulse period of the synchronization signal becomes shorter. The period at which each frame of the singer video is displayed, on the other hand, is the video playback period, which is always kept constant. Whether frames are thinned out or interpolation frames are inserted, the frame display period therefore stays constant. Consequently, when the tempo is faster than the reference tempo, the clock pulse period is shorter than the interval at which synchronization frames are displayed, and the frame being displayed when a clock pulse is received may not be a synchronization frame. In that case, as shown in Figure 11, that frame is discarded and replaced by the nearest synchronization frame among those that follow it, and the replacement is displayed. Thinning is carried out in this way.

If, on the other hand, the singer video generating unit 36 has not received a clock pulse of the synchronization signal in step S18, processing proceeds to step S19. In step S19, the singer video generating unit 36 determines whether the frame of the singer video currently being displayed is a synchronization frame. If the determination in step S19 is that it is not a synchronization frame, the singer video generating unit 36 performs display processing for that frame.

If, however, the determination in step S19 is that the frame currently being displayed is a synchronization frame, processing proceeds to step S20. The singer video generating unit 36 generates an interpolation frame in step S20 and, in the following step S21, inserts the generated interpolation frame immediately before the current synchronization frame in display order. The singer video generating unit 36 then performs display processing for the inserted interpolation frame.

This situation arises when the tempo of the piece is slower than the reference tempo. As the tempo decreases, the clock pulse period of the synchronization signal becomes longer. The period at which each frame of the singer video is displayed, on the other hand, is the video playback period, which is always kept constant. Whether frames are thinned out or interpolation frames are inserted, the frame display period therefore stays constant. Consequently, when the tempo is slower than the reference tempo, the clock pulse period is longer than the interval at which synchronization frames are displayed, and the frame currently being displayed may be a synchronization frame even though no clock pulse has been received. In that case, as shown in Figure 10, an interpolation frame is generated and inserted immediately before the current synchronization frame in display order.
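The branching in steps S18 to S23 can be illustrated with a small simulation. It is a sketch under several assumptions: the function name `simulate_display`, the labelling of frames as "F1", "F2", ... and of interpolation frames as "G", and the reading of "the frame currently being displayed" as the next frame due for display are all choices made here. Synchronization frames are assumed to be every `frames_per_pulse`-th recorded frame starting from F1, and clock pulses are assumed to arrive at times 0, P, 2P, and so on.

```python
from fractions import Fraction
from typing import List


def simulate_display(frames_per_pulse: int,
                     pulse_period: Fraction,
                     frame_period: Fraction,
                     ticks: int) -> List[str]:
    """Return the labels of the frames shown on each display tick."""
    shown: List[str] = []
    next_frame = 1      # number of the next recorded frame due for display
    pulses_seen = 0     # clock pulses already handled
    for tick in range(ticks):
        now = tick * frame_period
        pulses_due = now // pulse_period + 1          # pulses emitted up to now
        pulse_received = pulses_due > pulses_seen     # step S18
        is_sync = (next_frame - 1) % frames_per_pulse == 0
        if pulse_received:
            pulses_seen = pulses_due
            if not is_sync:
                # Step S23: discard this frame and jump to the nearest
                # following synchronization frame (thinning).
                next_frame += frames_per_pulse - (next_frame - 1) % frames_per_pulse
            shown.append(f"F{next_frame}")
            next_frame += 1
        elif is_sync:
            # Steps S20 and S21: no pulse yet, so hold the synchronization
            # frame back and show an interpolation frame before it.
            shown.append("G")
        else:
            # Step S19, "not a synchronization frame": display it as is.
            shown.append(f"F{next_frame}")
            next_frame += 1
    return shown
```

With four frames per pulse at the reference tempo, a pulse period of six frame periods produces F1 F2 F3 F4 G G F5, and a pulse period of three frame periods produces F1 F2 F3 F5 F6 F7 F9, which is the insertion and thinning behaviour described for Figures 10 and 11.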

Display processing for a frame is performed in steps S24 to S27. In step S24, the singer video generating unit 36 calculates the coordinates of the polygons in the display coordinate system from the action data for the frame to be displayed and the viewpoint position information. The viewpoint position of the singer video is thus determined according to the viewpoint setting instruction entered in step S13.

Next, in step S25, the singer video generating unit 36 applies textures (texture: an image representing surface appearance and the like) according to the shape data, the light source position information and the viewpoint position information. A surface pattern and texture that vary with the viewpoint position are thereby applied to each face of the polygons (polygon: the unit element corresponding to each constituent element of the singer video).

Next, in step S26, the singer video generating unit 36 performs shading of the polygons (shading: processing that adds shadows to an image) according to the light source position information and the viewpoint position information. Shadows and the like determined by the light source direction are thereby added to each polygon.

Next, in step S27, the singer video generating unit 36 transfers the instantaneous image data generated in the working memory 36A to the display memory 36B. The image for each frame of the singer video is thereby output to the synthesis unit 50, combined there with the lyrics image and the background image, and displayed on the monitor 90.

After steps S24 to S27 are finished, processing returns to step S12, and steps S12 to S27 are repeated. In this way, a singer video whose movement matches the piece, as shown in Figure 15, is displayed on the monitor 90 while the piece is played back. When the user enters an instruction to end playback, or when playback reaches the end of the piece, playback ends and the synchronous image generation processing ends with it.

As described above, the communication karaoke apparatus 100 of this embodiment uses action data and shape data, and can therefore generate a moving singer video that matches the piece from a comparatively small volume of data. In particular, because the data for generating the singer video is separated into action data and shape data, a singer video that moves differently for each piece can be displayed simply by attaching to each piece action data, whose data volume is smaller than that of the shape data. Because the data volume of the action data is very small, attaching action data to every piece causes only a minimal increase in storage area and in data transfer time. For the same reason, the action data can be sent from the center host computer 200 over a telephone line in a short time. Even if the action data is stored in a storage device in the center host computer 200, it can therefore be transferred to the communication karaoke apparatus 100 in a short time. The communication karaoke apparatus 100 can thus play back a wide variety of pieces while displaying a singer video whose movement matches each piece.

Moreover, because the action data and the shape data are stored separately, several kinds of shape data can be provided for the kinds of video to be displayed, and changing the kind of shape data greatly changes the appearance of the moving video that accompanies the piece. For example, two kinds of shape data may be provided, one forming a male video and one forming a female video, while the action data is left unchanged; both a moving male video and a moving female video can then be displayed for the same piece. A means of automatically selecting shape data according to the type of piece (its style, whether the singer is male or female, and so on) may also be provided. For example, by recording in the music data selection data used to select shape data, a singer video suited to the piece can be selected automatically. The video CPU 31 or the singer video generating unit 36 may also be configured to select shape data in response to a shape data selection instruction that the user enters through the input unit 17.

In the communication karaoke apparatus 100 of this embodiment, the music data and the action data are completely separate, so action data can easily be added to existing music data. For example, action data corresponding to already stored music data can be generated by playing back that music data, having a person move in time with the piece being played, and measuring the person's movement. The action data generated in this way can be transferred from the center host computer 200 to the communication karaoke apparatus 100, so the corresponding action data is easily attached to the already stored music data. The already stored music data therefore need not be recreated and can be used effectively.

Furthermore, in the embodiment described above, the video playback period at which each frame is displayed is shorter than the pulse period of the synchronization signal. The tempo of a piece is normally expressed as the number of quarter notes per minute. For fast pieces, this number reaches about 240 at most. The pulse period of the synchronization signal, on the other hand, is the period of the clock pulses output once per eighth note of the piece, so for a piece played back at 240 quarter notes per minute the pulse period of the synchronization signal is 1/8 second. The video playback period at which each frame is displayed can be, for example, 1/15 second. The video playback period is therefore shorter than the period of the synchronization signal.
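The figures quoted above can be checked with a short calculation. The symbols T_sync (pulse period of the synchronization signal) and t (video playback period) are introduced here only for this check; the one fact used beyond the text is that a quarter note contains two eighth notes.

```latex
T_{\mathrm{sync}} = \frac{60\ \mathrm{s}}{240 \times 2} = \frac{1}{8}\ \mathrm{s},
\qquad
t = \frac{1}{15}\ \mathrm{s} < T_{\mathrm{sync}},
\qquad
\frac{T_{\mathrm{sync}}}{t} = \frac{15}{8} \approx 1.9
```

Even at the fastest tempo considered, slightly fewer than two frame periods fit into one pulse period, so at least one frame is displayed between consecutive clock pulses.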

Because the video playback period at which each frame is displayed is shorter than the period of the synchronization signal, several frames can be displayed between successive clock pulses of the synchronization signal. For example, as shown in Figure 9, the four frames F1 to F4 are displayed between the output of clock pulse 1 and the output of the following clock pulse 2. Since several frames can be displayed between clock pulses, the piece and the video can easily be kept synchronized even when the tempo of the piece is faster than the reference tempo, by thinning out some of the frames as shown in Figure 11. Likewise, even when the tempo of the piece is slower than the reference tempo, the piece and the video can be kept synchronized easily and accurately by inserting interpolation frames between frames as shown in Figure 10.

Furthermore, in the communication karaoke apparatus 100 of this embodiment, each clock pulse of the synchronization signal carries a clock pulse number that identifies the playback position of the piece. The playback position can therefore be identified at frequent intervals, and operations such as replay, fast-forward and rewind can easily be performed while the piece is in progress.

In the embodiment described above, the interpolation frame is inserted immediately before a synchronization frame, but it may instead be inserted immediately after a synchronization frame or between other frames. When the tempo of the piece is much slower than the reference tempo, several interpolation frames must be inserted in succession. In that case, the insertion positions of the interpolation frames may be spread out, which makes the movement of the displayed object smoother. Similarly, when the tempo of the piece is much faster than the reference tempo, several frames must be thinned out in succession. In that case, the positions of the thinned-out frames may also be spread out, which again makes the movement of the displayed object smoother. The spreading of interpolation frame insertion positions and of thinned-out frame positions may be calculated during playback of the piece, or it may be calculated when the playback tempo is set.
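One simple way to spread the positions, shown purely as an assumption since the description does not say how the spreading is computed, is to space them evenly across the run of frames between two clock pulses. The function name `spread_positions` and the zero-based slot indexing are hypothetical.

```python
from typing import List


def spread_positions(total_slots: int, count: int) -> List[int]:
    """Return `count` evenly spaced zero-based slot indices within `total_slots`.

    The same indices can serve either as insertion positions for interpolation
    frames or as positions of frames to be thinned out. For count <= total_slots
    the returned indices are distinct.
    """
    # Place each position at the centre of its share of the run, using
    # integer arithmetic so the result is exact.
    return [((2 * i + 1) * total_slots) // (2 * count) for i in range(count)]
```

For example, three interpolation frames spread over a run of twelve slots land at indices 2, 6 and 10 rather than bunching together just before the synchronization frame.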

In the embodiment described above, the video playback period is 1/15 second, but the present invention is not limited to this value. The period may be set to an optimum value according to the data volume of the action data, the time required for image generation processing, and so on.

In the embodiment described above, the shape data is stored in the shape data storage unit 35 in the video playback unit 30, but the present invention is not limited to this. For example, the shape data may be stored in the RAM 12. Likewise, in the embodiment described above the action data is stored in the image data storage unit 34, but it may instead be stored, for example, in the music data storage unit 16 together with the music data. In other words, the shape data, the action data and the music data may each be organized as an independent file and accessed separately; each may have its own address and be added, changed or deleted independently, and each may be stored in a separate storage device.

In the embodiment described above, the case in which the shape data is stored in the shape data storage unit 35 was described as an example, but the present invention is not limited to this. The apparatus may be configured so that shape data can be added or changed by means of a CD-ROM or the like. It may also be configured so that shape data can be received from the center host computer 200 and modified, added, changed or deleted as appropriate. In that case, the shape data can be transmitted at suitable times when the communication karaoke apparatus 100 is not in use.

In the embodiment described above, the synchronous image generation method was described as applied to a communication karaoke apparatus, but the present invention is not limited to this. The invention is also applicable to, for example, an apparatus that synchronizes aerobics music with an aerobics instruction video, or an apparatus that synchronizes broadcast sound with a sign-language video.

As described in detail above, with the synchronous image generation method according to the first aspect of the present invention, a video that represents an object such as a human body and moves in synchronization with sound playback can be generated from a comparatively small volume of data by using action data and shape data. Therefore, even when there are many different sounds, a moving video can be produced for each of them.

With the synchronous image generation method according to the second aspect of the present invention, both when the sound is played back at the reference speed and when it is played back at a speed slower than the reference speed, the video representing the object can be synchronized with the sound playback and made to move smoothly.

With the synchronous image generation method according to the third aspect of the present invention, even when the sound is played back at a speed faster than the reference speed, the video representing the object can be synchronized with the sound playback and made to move smoothly.

With the synchronous image generation method according to the fourth aspect of the present invention, the output period of the images representing the object can be made to coincide with an integral multiple of the display period of the display device, so that every output image can be displayed on the display device. The movement of the object can therefore be made smooth.

With the synchronous image generation method according to the fifth aspect of the present invention, even when the playback speed of the sound is faster than the reference speed, the object can be made to move smoothly by thinning out the action data, and the sound and the movement of the object can be synchronized accurately.

With the synchronous image generation method according to the sixth aspect of the present invention, the playback position of the audio data can be identified, so that the video corresponding to that playback position can be displayed. Therefore, even when playback of a piece or other sound is manipulated while it is in progress, a video whose movement is correctly synchronized with the sound can be displayed.

With the karaoke apparatus according to the seventh aspect of the present invention, the audio data, the shape data and the action data can each be accessed independently. The audio data and the action data can thus differ for each piece while the shape data is shared among pieces. Action data and shape data can also be added to already stored audio data. The already stored audio data therefore need not be recreated, and action data and shape data can be attached to it afterwards.

With the karaoke apparatus according to the eighth aspect of the present invention, the playback of the sound and the display of the video can be synchronized even when the playback speed of the sound changes.

With the karaoke apparatus according to the ninth aspect of the present invention, both when the sound is played back at the reference speed and when it is played back at a speed slower than the reference speed, the video representing the object can be synchronized with the sound playback and made to move smoothly.

With the karaoke apparatus according to the tenth aspect of the present invention, even when the sound is played back at a speed faster than the reference speed, the video representing the object can be synchronized with the sound playback and made to move smoothly.

With the karaoke apparatus according to the eleventh aspect of the present invention, the output period of the images representing the object can be made to coincide with an integral multiple of the display period of the display device, so that every output image can be displayed on the display device. The movement of the object can therefore be made smooth.

With the karaoke apparatus according to the twelfth aspect of the present invention, even when the playback speed of the sound is faster than the reference speed, the object can be made to move smoothly by thinning out the action data, and the sound and the movement of the object can be synchronized accurately.

With the karaoke apparatus according to the thirteenth aspect of the present invention, the playback position of the audio data can be identified, so that the video corresponding to that playback position can be displayed. Therefore, even when a piece or other sound is played back from partway through, a video whose movement is correctly synchronized with the sound can be displayed.

With the karaoke apparatus according to the fourteenth aspect of the present invention, only audio data and action data, whose data volume is comparatively small, need be received from outside, so the volume of data transferred to the karaoke apparatus from outside is small. A communication karaoke apparatus that displays a singer video or the like whose movement matches the playback of the piece can therefore be obtained. For example, when the center host computer sends audio data and action data for the latest piece to the communication karaoke apparatus, the apparatus can not only play back that piece immediately but also display a moving video of a singer singing it.

With the karaoke apparatus according to the fifteenth aspect of the present invention, the shape and other attributes of the displayed object can be set and changed according to the kind of sound and the user's preference.

Claims

1. A synchronous image generation method comprising:

a sound playback step of playing back sound according to audio data;

a synchronization signal output step of outputting a synchronization signal corresponding to the playback speed of the sound played back in the sound playback step;

a video generation step of dividing a video representing an object consisting of a person, an animal or a likeness thereof into a plurality of constituent elements, and generating the video representing the object by using shape data that sets the shapes of the constituent elements and action data that sets the position or movement of each constituent element; and

a display step of displaying the video generated in the video generation step on a display device in synchronization with the synchronization signal output in the synchronization signal output step.

2. The synchronous image generation method according to claim 1, characterized in that the action data is generated by playing back sound at a reference speed according to the audio data, having the object move in time with the sound being played back, and measuring the position or movement of each constituent element of the object at a predetermined period, and

in the video generation step, when the playback speed of the sound played back in the sound playback step is the reference speed, the video representing the object is generated using the action data and the shape data, and when the playback speed of the sound played back in the sound playback step is slower than the reference speed, interpolation action data is generated by interpolating the action data, and the video representing the object is generated using the action data, the interpolation action data and the shape data.

3. The synchronous image generation method as claimed in claim 2, characterized in that when the reproduction speed of the sound reproduced in said sound reproduction step is faster than said reference speed, said action data is thinned out, and video showing said object is generated by using the thinned-out action data and said shape data.

4. The synchronous image generation method as claimed in any one of claims 1 to 3, characterized in that said action data is generated by reproducing sound at a reference speed according to said sound data, moving said object in time with the reproduced sound, and measuring the position or movement of each constituent element of said object at a period that is an integral multiple of the display period of said display device.

5. The synchronous image generation method as claimed in any one of claims 1 to 3, characterized in that said action data is generated by reproducing sound at a reference speed according to said sound data, moving said object in time with the reproduced sound, and measuring the position or movement of each constituent element of said object at a period shorter than the period of the synchronizing signal output in said synchronizing signal output step.

6. The synchronous image generation method as claimed in any one of claims 1 to 3, characterized in that a code identifying the reproduction position in the sound data is attached to the synchronizing signal output in said synchronizing signal output step.

7. A karaoke apparatus, comprising:

a sound data storage unit that stores sound data;

a sound reproduction unit that reproduces sound according to the sound data stored in said sound data storage unit;

a synchronizing signal output unit that outputs a synchronizing signal corresponding to the reproduction speed of the sound reproduced by said sound reproduction unit;

a shape data storage unit that stores shape data, in which video showing an object consisting of a person, an animal, or an imitation thereof is divided into several constituent elements and the shape data sets the shapes of these constituent elements;

an action data storage unit that stores action data setting the position or movement of each constituent element of said object;

a video generation unit that generates video showing said object by using the shape data stored in said shape data storage unit and the action data stored in said action data storage unit; and

a display unit that displays the video generated by said video generation unit on a display device in synchronization with the synchronizing signal output by said synchronizing signal output unit.

8. The karaoke apparatus as claimed in claim 7, characterized by further comprising a reproduction speed changing unit that changes the reproduction speed of the sound reproduced by said sound reproduction unit.

9. The karaoke apparatus as claimed in claim 8, characterized in that said action data is generated by reproducing sound at a reference speed according to said sound data, moving said object in time with the reproduced sound, and measuring the position or movement of each constituent element of said object at a predetermined period; and

when the reproduction speed of the sound reproduced by said sound reproduction unit is said reference speed, said video generation unit generates video showing said object by using said action data and shape data, and when the reproduction speed of the sound reproduced by said sound reproduction unit is slower than said reference speed, said video generation unit interpolates said action data to generate interpolated action data and generates video showing said object by using said action data, the interpolated action data, and shape data.

10. The karaoke apparatus as claimed in claim 9, characterized in that when the reproduction speed of the sound reproduced by said sound reproduction unit is faster than said reference speed, said video generation unit thins out said action data and generates video showing said object by using the thinned-out action data and said shape data.

11. The karaoke apparatus as claimed in any one of claims 7 to 10, characterized in that said action data is generated by reproducing sound at a reference speed according to said sound data, moving said object in time with the reproduced sound, and measuring the position or movement of each constituent element of said object at a period that is an integral multiple of the display period of said display device.

12. The karaoke apparatus as claimed in any one of claims 7 to 10, characterized in that said action data is generated by reproducing sound at a reference speed according to said sound data, moving said object in time with the reproduced sound, and measuring the position or movement of each constituent element of said object at a period shorter than the period of the synchronizing signal output by said synchronizing signal output unit.

13. The karaoke apparatus as claimed in any one of claims 7 to 10, characterized in that a code identifying the reproduction position in the sound data is attached to the synchronizing signal output by said synchronizing signal output unit.

14. The karaoke apparatus as claimed in any one of claims 7 to 10, characterized by further comprising a data receiving unit that receives sound data and action data transmitted from outside, stores the received sound data in said sound data storage unit, and stores the received action data in said action data storage unit.

15. The karaoke apparatus as claimed in any one of claims 7 to 10, characterized in that said shape data storage unit stores several sets of shape data for forming several objects of different shapes, one of said sets of shape data is selected by selection data included in said sound data or by input from outside, and said video generation unit generates video showing said object by using the selected shape data and the action data stored in said action data storage unit.
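The handling of action data at different reproduction speeds recited in the claims above (use as measured at the reference speed, interpolation when slower, thinning when faster) can be illustrated by the following sketch. It is a minimal example under stated assumptions and not the claimed implementation: the name `resample_action_data`, the representation of each measurement as a flat list of numbers, the use of linear interpolation, and nearest-sample thinning are all hypothetical choices, and the measurement period is assumed to equal the display period.

```python
from typing import List, Sequence

# One measurement of the object: a flat list of element positions.
# This layout is a hypothetical simplification for illustration.
Frame = List[float]


def resample_action_data(frames: Sequence[Frame], speed_ratio: float) -> List[Frame]:
    """Return one frame per display period for a given reproduction speed.

    speed_ratio is the reproduction speed divided by the reference speed.
    """
    if speed_ratio <= 0:
        raise ValueError("speed_ratio must be positive")
    if not frames:
        return []
    last = len(frames) - 1
    out_count = int(last / speed_ratio) + 1
    result: List[Frame] = []
    for i in range(out_count):
        pos = i * speed_ratio  # position on the measured timeline
        if speed_ratio >= 1.0:
            # Reference speed or faster: thin out by taking the nearest
            # measured frame, so only measured data is displayed.
            idx = min(int(pos + 0.5), last)
            result.append(list(frames[idx]))
        else:
            # Slower: linearly interpolate between neighbouring measurements
            # (linear interpolation is an assumption of this sketch).
            lo = min(int(pos), last)
            hi = min(lo + 1, last)
            t = pos - lo
            result.append([a + (b - a) * t for a, b in zip(frames[lo], frames[hi])])
    return result
```

At half the reference speed the sketch places one interpolated frame between each pair of measured frames, and at twice the reference speed it keeps every second measured frame, so the displayed movement stays aligned with the reproduced sound in both cases.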