CN101630507A

CN101630507A - Method, device and system for realizing remote karaoke

Info

Publication number: CN101630507A
Application number: CN200910163162A
Authority: CN
Inventors: 杨海曜
Original assignee: Shenzhen Huawei Communication Technologies Co Ltd
Current assignee: Huawei Device Co Ltd; Huawei Device Shenzhen Co Ltd
Priority date: 2009-08-18
Filing date: 2009-08-18
Publication date: 2010-01-20
Anticipated expiration: 2029-08-18
Also published as: CN101630507B

Abstract

The invention relates to a method, a device and a system for realizing remote karaoke. The method comprises: receiving voice audio sent by a first terminal and a second terminal; receiving accompaniment audio sent by an accompaniment song library server according to an accompaniment request from the first terminal or the second terminal; mixing the accompaniment audio with the voice audio to obtain mixed audio; and sending the mixed audio to the first terminal and/or the second terminal. In this implementation the voice audio comes from the terminals and the accompaniment audio comes from the accompaniment song library server, so the two do not interfere with each other when they are mixed, which improves the audio quality of remote karaoke.

Description

Method, device and system for realizing remote karaoke

Technical field

The present invention relates to the field of communication technologies, and in particular to a method, device and system for realizing remote karaoke.

Background technology

At present, karaoke is generally realized, both domestically and internationally, as follows: within a single location, the two or more sound signals of participants singing a duet or taking turns are gathered over analog or digital cabling into an analog or digital processing center for mixing, and the mixed sound signal is then played back through loudspeakers or similar means, thereby realizing the karaoke function.

As people's expectations of service experience rise, remote karaoke has become a need. For example, the broadcasting and film industry needs remote karaoke technology so that people in different places can sing the same song together. The ring-type implementation of remote karaoke can be realized with the audio mixing principle of an Internet Protocol (IP) conference call center. In this method, a control center with a designated IP address receives audio signals from at least two terminals; the audio signal of every terminal contains voice audio, and that of one terminal also contains accompaniment audio. After receiving the audio signals from the at least two terminals, the control center mixes them and then broadcasts the result over the IP network. The broadcast may be directed at all the terminals that take part in the mixing, or at other terminals that do not.

When the function of the control center in the ring-type implementation is integrated into a terminal, the result is the star-type implementation of remote karaoke. In this method, the terminal receives the audio of the other terminals, synthesizes the mixed audio, and then sends the mixed audio to the terminals participating in the session.

In the course of realizing the present invention, the inventor found the following: in remote karaoke with several participating terminals, one of the terminals uploads its own voice audio and the accompaniment audio to the control center as a single stream. When the control center mixes the audio signals, the voice audio sent by the other terminals is all sung against that accompaniment, so the more terminals take part in the mixing, the quieter the accompaniment becomes and the noisier the mixed audio.

Summary of the invention

The technical problem to be solved by the embodiments of the present invention is to provide a method, device and system for realizing remote karaoke that improve the audio quality of remote karaoke.

To solve the above technical problem, an embodiment of the ring-type remote karaoke method provided by the present invention can be realized through the following technical solution:

Receiving voice audio sent by a first terminal and a second terminal, and receiving accompaniment audio sent by an accompaniment song library server according to an accompaniment request from the first terminal or the second terminal;

Mixing the accompaniment audio with the voice audio to obtain mixed audio;

Sending the mixed audio to the first terminal and/or the second terminal.

An embodiment of the present invention also provides a star-type method for realizing remote karaoke, comprising:

Receiving voice audio sent by the other terminals participating in the session, collecting the voice audio of the local terminal, and receiving accompaniment audio sent by an accompaniment song library server according to an accompaniment request from the local terminal or the other terminals;

Mixing the accompaniment audio, the received voice audio and the voice audio of the local terminal to obtain mixed audio;

Playing the mixed audio, and sending the mixed audio to the other terminals.

An embodiment of the present invention also provides another ring-type method for realizing remote karaoke, comprising:

Sending the collected voice audio of the local terminal to a control center;

Receiving mixed audio sent by the control center, the mixed audio containing the audio information of the terminals participating in the session;

Receiving accompaniment audio sent by an accompaniment song library server;

Synthesizing the mixed audio and the accompaniment audio to obtain target audio, and playing the target audio.

An embodiment of the present invention also provides a control center, comprising:

An audio receiving unit, configured to receive voice audio sent by a first terminal and a second terminal, and to receive accompaniment audio sent by an accompaniment song library server;

An audio mixing unit, configured to mix the accompaniment audio with the voice audio to obtain mixed audio;

An audio sending unit, configured to send the mixed audio to the first terminal and/or the second terminal.

An embodiment of the present invention also provides a terminal, comprising:

An audio receiving unit, configured to receive voice audio sent by the other terminals participating in the session, and to receive accompaniment audio sent by an accompaniment song library server according to an accompaniment request from the local terminal or the other terminals;

An audio collection unit, configured to collect the voice audio of the local terminal;

An audio synthesis unit, configured to mix the accompaniment audio, the received voice audio and the voice audio of the local terminal to obtain mixed audio;

An audio playing unit, configured to play the mixed audio;

A mixed audio sending unit, configured to send the mixed audio to the other terminals.

An embodiment of the present invention also provides another terminal, comprising:

An audio sending unit, configured to send the collected voice audio of the local terminal to a control center;

A mixed audio receiving unit, configured to receive mixed audio sent by the control center, the mixed audio containing the audio information of the terminals participating in the session;

An accompaniment audio receiving unit, configured to receive accompaniment audio sent by an accompaniment song library server;

An audio synthesis unit, configured to synthesize the mixed audio and the accompaniment audio to obtain target audio;

An audio playing unit, configured to play the target audio.

An embodiment of the present invention also provides a system for realizing remote karaoke, comprising:

An accompaniment song library server, configured to send accompaniment audio to a control center according to a request from a first terminal or a second terminal;

A control center, configured to receive voice audio sent by the first terminal and the second terminal, and to receive the accompaniment audio sent by the accompaniment song library server according to the request from the first terminal or the second terminal; to mix the accompaniment audio with the voice audio to obtain mixed audio; and to send the mixed audio to the first terminal and/or the second terminal.

The above technical solution has the following beneficial effect: the voice audio comes from the terminals and the accompaniment audio comes from the accompaniment song library server, so the two do not interfere with each other when they are mixed, which improves the audio quality of remote karaoke.

Description of drawings

To describe the technical solutions in the embodiments of the present invention or in the prior art more clearly, the accompanying drawings needed for describing the embodiments or the prior art are briefly introduced below. The drawings described below are evidently only some embodiments of the present invention, and a person of ordinary skill in the art can derive other drawings from them without creative effort.

Fig. 1 is a schematic flowchart of method embodiment one of the present invention;

Fig. 2 is a schematic flowchart of method embodiment two of the present invention;

Fig. 3 is a schematic flowchart of method embodiment three of the present invention;

Fig. 4 is a schematic flowchart of method embodiment four of the present invention;

Fig. 5 is a schematic structural diagram of a control center according to device embodiment five of the present invention;

Fig. 6 is a schematic structural diagram of another control center according to device embodiment five of the present invention;

Fig. 7 is a schematic structural diagram of another control center according to device embodiment five of the present invention;

Fig. 8 is a schematic structural diagram of another control center according to device embodiment five of the present invention;

Fig. 9 is a schematic structural diagram of another control center according to device embodiment five of the present invention;

Fig. 10 is a schematic structural diagram of a terminal according to device embodiment six of the present invention;

Fig. 11 is a schematic structural diagram of another terminal according to device embodiment six of the present invention;

Fig. 12 is a schematic structural diagram of a terminal according to device embodiment seven of the present invention;

Fig. 13 is a schematic structural diagram of another terminal according to device embodiment seven of the present invention;

Fig. 14 is a schematic diagram of the system architecture according to device embodiment eight of the present invention.

Embodiment

The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. The described embodiments are evidently only some, not all, of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art on the basis of these embodiments without creative effort fall within the protection scope of the present invention.

The technical problem to be solved by the embodiments of the present invention is to provide a method, device and system for realizing remote karaoke that improve the audio quality of remote karaoke.

Embodiment one. As shown in Fig. 1, a ring-type method for realizing remote karaoke provided by an embodiment of the present invention comprises:

Step 101: Receive voice audio sent by a first terminal and a second terminal, and receive accompaniment audio sent by an accompaniment song library server. Video images sent by the first terminal and the second terminal may also be received in this step. The video may be a video image of the first terminal's side, captured by the first terminal and then sent to the control center, and a video image of the second terminal's side, captured by the second terminal and then sent to the control center.

Optionally, before step 101 the method may further include: receiving registration requests from the first terminal and the second terminal and, after receiving them, binding the first terminal and the second terminal. The registration process is described further in later embodiments.

In addition, the parameters with which the first terminal and the second terminal send voice audio may be negotiated; sending with preset fixed parameters is of course also possible and does not affect the realization of the embodiments of the present invention.

The first terminal and the second terminal are both terminals participating in the remote karaoke and are called the first terminal and the second terminal only for convenience of description. They may stand for any number of terminals; the number of terminals is not limited. The binding can be understood as forming the terminals that participate in the remote karaoke into a session group. Specifically, the control center creates a karaoke session group and stores the identifiers of the terminals that join it; before sending synthesized audio and/or a synthesized video image, it reads the identifiers of the terminals in the session group and then sends the synthesized audio and/or synthesized video image to the terminals so identified.

Step 102: Mix the accompaniment audio with the voice audio to obtain mixed audio. If video was received in the previous step, this step may also synthesize the video images from the first terminal and the second terminal to obtain a synthesized video image.

Step 103: Send the mixed audio to the first terminal and/or the second terminal. If a synthesized video image was produced in the previous step, this step may also send it to the first terminal and/or the second terminal.

The parameters with which the mixed audio is sent to the first terminal and/or the second terminal may also be negotiated with the terminals; sending with preset parameters and without negotiation is of course also possible and does not affect the realization of the embodiments of the present invention.

The above method steps are executed by the control center. The accompaniment song library server may be an independent server or may be integrated into the control center.
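The binding described for step 101 (store the identifiers of the terminals in a session group, then read them back before each send) can be sketched as follows. This is a minimal illustration, not the patent's implementation: the names `KaraokeSession`, `bind`, `terminal_ids` and `send_to_session`, and the `send` callback, are hypothetical.

```python
from typing import Callable, List


class KaraokeSession:
    """Hypothetical session group holding the IDs of bound terminals."""

    def __init__(self) -> None:
        self._terminal_ids: List[str] = []

    def bind(self, terminal_id: str) -> None:
        # Store each terminal identifier once, in joining order.
        if terminal_id not in self._terminal_ids:
            self._terminal_ids.append(terminal_id)

    def terminal_ids(self) -> List[str]:
        # Return a copy so callers cannot modify the stored group.
        return list(self._terminal_ids)


def send_to_session(session: KaraokeSession, payload: bytes,
                    send: Callable[[str, bytes], None]) -> int:
    """Read the bound identifiers, then send the payload to each terminal."""
    count = 0
    for terminal_id in session.terminal_ids():
        send(terminal_id, payload)  # `send` is an assumed transport callback
        count += 1
    return count
```

In this sketch the transport is left to the caller, since the source does not prescribe how the mixed audio is carried to each terminal.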

In the above embodiment, the voice audio comes from the terminals and the accompaniment audio comes from the accompaniment song library server, so the two do not interfere with each other when they are mixed, which improves the audio quality of remote karaoke.
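As a rough illustration of step 102, the sketch below adds one accompaniment frame to the voice frames of the terminals, sample by sample. It assumes 16-bit PCM frames of equal length and simple additive mixing with clipping; the source does not specify a mixing algorithm, and the name `mix_frames` is hypothetical.

```python
from typing import List, Sequence

INT16_MIN, INT16_MAX = -32768, 32767


def mix_frames(voices: Sequence[Sequence[int]],
               accompaniment: Sequence[int]) -> List[int]:
    """Add the single accompaniment frame to every terminal's voice frame."""
    n = len(accompaniment)
    if any(len(v) != n for v in voices):
        raise ValueError("all frames must have the same number of samples")
    mixed = []
    for i in range(n):
        # The accompaniment is added exactly once, however many voices there are.
        total = accompaniment[i] + sum(v[i] for v in voices)
        # Clip to the 16-bit range (an assumption about the sample format).
        mixed.append(max(INT16_MIN, min(INT16_MAX, total)))
    return mixed
```

Because the accompaniment enters the sum only once, adding more voice streams does not add further copies of it, which is the property the embodiment relies on.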

If the audio synthesis in the above implementation is performed by a terminal, the scheme may proceed as follows:

Send the collected voice audio of the local terminal to the control center; the collected video image of the local terminal may also be sent to the control center;

Receive the mixed audio sent by the control center, the mixed audio containing the audio information of the terminals participating in the session; a synthesized video image sent by the control center may also be received, the video image containing the video information of the terminals participating in the session;

Synthesize the mixed audio and the accompaniment audio to obtain target audio, and play the target audio;

The synthesized video image may also be played while the target audio is played.

The above implementation may be executed by a terminal participating in the session. On the same principle as the scheme disclosed in Fig. 1, the control center synthesizes the audio from the terminals and sends it to the terminals, and each terminal then synthesizes the mixed audio with the accompaniment audio to obtain the final synthesized audio. This embodiment likewise achieves the purpose of improving the audio quality of remote karaoke.

Embodiment two. An embodiment of the present invention also provides a star-type method for realizing remote karaoke which, as shown in Fig. 2, comprises:

Step 201: Receive voice audio sent by the other terminals participating in the session, collect the voice audio of the local terminal, and receive accompaniment audio sent by the accompaniment song library server according to an accompaniment request from the local terminal or the other terminals;

This step may further include: receiving video images sent by the other terminals, and collecting the video image of the local terminal;

Step 202: Mix the accompaniment audio, the received voice audio and the voice audio of the local terminal to obtain mixed audio;

This step may further include: synthesizing the video images from the other terminals with the video image of the local terminal to obtain a synthesized video image;

Step 203: Play the mixed audio, and send the mixed audio to the other terminals.

This step may further include: playing the synthesized video image, and sending the synthesized video image to the other terminals.

In the above embodiment, the voice audio comes from the terminals and the accompaniment audio comes from the accompaniment song library server, so the two do not interfere with each other when they are mixed, which improves the audio quality of remote karaoke. Adding video further enriches the visual effect of the karaoke.

Embodiment three. This embodiment is a specific embodiment of the star-type method for realizing remote karaoke. In this embodiment the control center is integrated in the first terminal and the accompaniment song library server is an independent server. Because the audio synthesis capability of a terminal is limited, this scheme can be adopted when fewer than 6 terminals participate in the remote karaoke. Integrating the control center in the first terminal may mean that the first terminal contains an audio mixing module or a similar function, for example a built-in multipoint control unit (Multipoint Control Unit, MCU). It should be understood that there may be several second terminals in this embodiment; the exact number is not limited. The first terminal may be a video conference terminal with audio mixing capability; a terminal may also be a video conference terminal or a videophone without audio mixing capability. In this embodiment the first terminal is denoted video conference terminal A and the second terminal is denoted video conference terminal B.

The specific implementation process of remote karaoke is shown in Fig. 3 (the process may be carried out with the H.323 video conferencing framework):

301. Registration with the control center:

1) After starting up and powering on, a video conference terminal sends a registration request to the control center at the designated IP address;

2) The control center (which includes a multipoint control unit, MCU, or a gatekeeper, GK; the MCU can be used to receive the sound and images of each site, and the gatekeeper can be used to perform conference scheduling) accepts the registration request and verifies the identity and password of the terminal. It may then also assign a "nickname" or "short call number" to each video conference terminal that applied for registration.

In this embodiment video conference terminal A and video conference terminal B are both video conference terminals; where A or B is not specified, the text refers to video conference terminal A and/or video conference terminal B.

302. Registration with the accompaniment song library server:

1) A video conference terminal logs in to the accompaniment song library server with a preset account and password, and sends the server a request to access the catalogue and songs of the accompaniment song library stored on it;

2) The accompaniment song library server verifies the login account, password and other credentials of each video conference terminal, and grants or refuses each terminal access to the catalogue and songs of the accompaniment song library it stores (the extent of the catalogue opened to a terminal may also be set according to its permissions).

In addition, each terminal may report its own capabilities to the accompaniment song library server, for example the video coding protocols it supports (such as H.264/H.263, MPEG-2/4, AVS), its audio capability (audio protocols such as G.711/723/728/729 and AAC), resolution, maximum bit rate, and whether it supports dual streams. The server records the capabilities of each terminal. In 2), the accompaniment song library server (karaoke television, KTV) sends the accompaniment song to the control center as an Internet Protocol (IP) media stream suited to each terminal.

303. Calling:

1) Video conference terminal A sends a point-to-point invitation to video conference terminal B, inviting it to take part in a karaoke session. Specifically: when video conference terminal A knows the IP address of terminal B, it can call that IP address directly (see the H.323 protocol or a protocol of the same type). When it does not know the real IP address of terminal B but knows its "nickname" or "short call number", it can send the conference invitation to the control center, and the control center maps the nickname or number to an IP address (similar to resolving a domain name to an IP address) and then forwards the conference invitation to terminal B.

2) After receiving the invitation, video conference terminal B can choose to accept or refuse to sing karaoke; if it accepts, a communication connection is established.

If further terminals such as C, D and so on need to join, then once video conference terminals A and B are connected, terminal A can call C or D by the method above, or terminal C or D can call terminal A, and terminal A decides whether to admit them.
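The address resolution in step 303 can be sketched as below: a callee given as an IP address is used directly, and anything else is looked up as a nickname or short number. The function name `resolve_callee` and the dictionary-based directory are assumptions made for illustration; the source only says the mapping resembles domain name resolution.

```python
import ipaddress
from typing import Dict, Optional


def resolve_callee(callee: str, directory: Dict[str, str]) -> Optional[str]:
    """Return the IP address to call, or None if the nickname is unknown."""
    try:
        # A valid IPv4/IPv6 literal can be called directly.
        ipaddress.ip_address(callee)
        return callee
    except ValueError:
        # Otherwise treat it as a nickname or short number registered in step 301.
        return directory.get(callee)
```

Here `directory` stands in for whatever registration table the control center keeps; a `None` result would correspond to an invitation that cannot be forwarded.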

304. Negotiation:

1) Video conference terminal A sends a call request to video conference terminal B;

2) After receiving the call request from video conference terminal A, video conference terminal B displays the caller's IP address, nickname, number or the like, and then decides whether to answer (that is, whether to accept the invitation to sing karaoke). If it decides to accept, the two video conference terminals negotiate their capabilities according to the video conferencing framework in use, for example H.323 or the Session Initiation Protocol (SIP), and determine the media stream protocols and bit rates both sides can accept (for example, video using the H.264 protocol at 4CIF resolution (704 × 576) and 2 Mbit/s; audio using the Advanced Audio Coding Low Delay (AAC LD) protocol, two channels, at 384 kbit/s).

3) When negotiation is complete, the IP link between video conference terminal A and video conference terminal B is established and video conferencing communication begins (karaoke participation starts).

In addition, the called video conference terminal B may report to the accompaniment song library server that it has accepted the call from video conference terminal A and has established a stable IP link with it. (This step makes it easier to bind the terminals and to synchronize the downlink media stream and catalogue information that the accompaniment song library server sends to each terminal.)

Further video conference terminals such as C, D and so on may likewise negotiate whether to connect by the method above, report to the accompaniment song library server, and apply to be bound to the logical IP link of the same downlink media stream.

The negotiation only needs to be completed before the control center synthesizes the audio and video. Whether it takes place before or after the terminals send audio and video to the control center does not affect the realization of the embodiments of the present invention, and the embodiments do not limit this.
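The capability negotiation in step 304 can be pictured as choosing, for each medium, a protocol both sides support, plus a bit rate neither side exceeds. The sketch below is an assumption-laden simplification and not the H.323 or SIP procedure itself: the capability dictionaries, the key names and the rule "caller's preference order wins" are all hypothetical.

```python
from typing import Dict, List, Optional


def first_common(preferred: List[str], supported: List[str]) -> Optional[str]:
    """Return the first entry of `preferred` that also appears in `supported`."""
    for item in preferred:
        if item in supported:
            return item
    return None


def negotiate(caller: Dict, callee: Dict) -> Optional[Dict]:
    """Pick a video codec, an audio codec and a bit rate both sides accept."""
    video = first_common(caller["video"], callee["video"])
    audio = first_common(caller["audio"], callee["audio"])
    if video is None or audio is None:
        return None  # no common protocol, so the call cannot be set up
    return {
        "video": video,
        "audio": audio,
        # Use the lower of the two maximum rates so both sides can keep up.
        "kbps": min(caller["max_kbps"], callee["max_kbps"]),
    }
```

For example, a caller offering H.264 then H.263 and a callee supporting both would settle on H.264 under this rule.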

305. Song selection:

1) A video conference terminal sends a request message to the accompaniment song library server specifying the accompaniment song it needs;

2) When the video conference terminal of any party selects a song from the catalogue of available songs on the accompaniment song library server (KTV), the server streams the song, as a downlink IP media stream, to every terminal logically bound to the same session (for example terminals A and B). Karaoke playback begins at this point. (Song selection may also be completed before the call invitation is made.)

If several terminals have each selected several songs, the songs can be ordered by the time at which they were selected, and the accompaniment song library server plays them downstream one by one.
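Ordering the selected songs by selection time, as described above, amounts to a priority queue keyed on a timestamp. The sketch below is one way to picture it; the class `SongQueue` and its method names are hypothetical, and the tie-breaking counter is an added assumption for songs selected at the same instant.

```python
import heapq
import itertools
from typing import List, Optional, Tuple


class SongQueue:
    """Hypothetical playlist that releases songs in order of selection time."""

    def __init__(self) -> None:
        self._heap: List[Tuple[float, int, str, str]] = []
        self._counter = itertools.count()  # keeps arrival order on equal times

    def select(self, selected_at: float, terminal_id: str, song: str) -> None:
        heapq.heappush(
            self._heap, (selected_at, next(self._counter), terminal_id, song)
        )

    def next_song(self) -> Optional[Tuple[str, str]]:
        """Pop the earliest selection as (terminal_id, song), or None if empty."""
        if not self._heap:
            return None
        _, _, terminal_id, song = heapq.heappop(self._heap)
        return terminal_id, song
```

A chairman terminal that reorders or deletes songs (step 308) would need additional operations that this sketch leaves out.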

306. Audio mixing:

1) Each participating terminal can preset or adjust in real time the volume, reverberation and other effects of its own microphone (for example by setting a weight that raises or lowers the volume). It then sends its locally digitized voice signal to video conference terminal A using the audio protocol agreed during negotiation;

2) After mixing, video conference terminal A forms a separate audio media stream and sends it to every participating terminal.
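The weight-based volume control mentioned in step 306 can be illustrated as a weighted sum of the terminals' frames. This sketch assumes 16-bit PCM samples, a default weight of 1.0 and clipping after summation; none of these details come from the source, and `weighted_mix` is a hypothetical name.

```python
from typing import Dict, List


def weighted_mix(frames: Dict[str, List[int]],
                 weights: Dict[str, float]) -> List[int]:
    """Scale each terminal's frame by its weight, sum, and clip to 16 bits."""
    lengths = {len(frame) for frame in frames.values()}
    if len(lengths) > 1:
        raise ValueError("all frames must have the same number of samples")
    n = lengths.pop() if lengths else 0
    mixed = []
    for i in range(n):
        # A weight above 1.0 raises a terminal's volume, below 1.0 lowers it.
        total = sum(weights.get(tid, 1.0) * frame[i]
                    for tid, frame in frames.items())
        mixed.append(max(-32768, min(32767, int(round(total)))))
    return mixed
```

The same weights could equally be applied at each terminal before sending; the source leaves open where the adjustment takes place.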

307. Multi-picture synthesis:

1) Each participating terminal may send the video image it has captured (for example with a camera) to the control center;

2) The control center broadcasts the received video images to every participating terminal, or splices several video images into one new video image (which may be a multi-picture image). It then sends the synthesized video image to every participating terminal, or to certain designated terminals.
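Splicing several video images into one multi-picture image can be sketched, in its simplest form, as placing frames of equal height next to each other. The representation of a frame as a list of pixel rows and the name `compose_side_by_side` are assumptions for illustration; real multipoint video processing would also scale and encode the result.

```python
from itertools import chain
from typing import List

Frame = List[List[int]]  # rows of pixel values; an assumed toy representation


def compose_side_by_side(frames: List[Frame]) -> Frame:
    """Join frames of equal height horizontally into one multi-picture frame."""
    if not frames:
        raise ValueError("at least one frame is required")
    height = len(frames[0])
    if any(len(frame) != height for frame in frames):
        raise ValueError("all frames must have the same height")
    # For each row index, concatenate that row from every frame, left to right.
    return [
        list(chain.from_iterable(frame[row] for frame in frames))
        for row in range(height)
    ]
```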

308. Scheduling:

Scheduling can be carried out by designating a chairman terminal. The control center can generally designate the terminal that contains the audio mixing module (video conference terminal A) as the chairman terminal of the karaoke session; alternatively, after receiving a setting request from a terminal, it may set a particular terminal as the chairman terminal. The following permissions may be initialized:

1) Chairman permission may be granted to any terminal participating in the session, which can then schedule the karaoke session centrally. A shared chairmanship may also be adopted, in which any participant can take part in scheduling. Alternatively, according to the songs each participant has chosen, a single chairmanship may be handed over automatically, so that whichever terminal is singing is the chairman;

A terminal that has been made chairman may have the following permissions:

Delete songs from the list of selected songs, or set their priority;

Control the progress of a song: for example replay, fast-forward, rewind, slow play, pause;

Control the channel of a song: for example switch between original vocals, accompaniment and lead vocal;

Adjust the volume, effects and so on of the overall accompaniment background audio;

Allow or block the audio channel of any terminal from joining the mix;

Adjust the effect of the audio mixing unit: for example increase or reduce the mixing ratio of a terminal, or adjust the overall reverberation and sound effects;

Allow or refuse new terminals that apply to join the session (join the singing), and remove terminals that have already joined (detaching them from the logical path of this singing session);

Monitor the status reports that each participating (singing) terminal sends periodically (for example one IP packet per second);

Allow or refuse the broadcasting of any one of the video images. The video center processes this as follows: after receiving the allow or refuse instruction, it synthesizes the received video images that come from terminals participating in the session and are in the enabled state; the specific synthesis method can follow multipoint video processing. The result of video synthesis may be several video pictures combined into one new image (note: the new image may be a "multi-picture" image). The newly generated "multi-picture" video image is then delivered to every terminal participating in the session, or to certain designated terminals;

Declare the session (singing) closed and release the corresponding logical bindings.

2) Video conference terminal A receives a control message for the karaoke session sent by the chairman terminal. The control message is at least one of: deleting a selected accompaniment, setting the play order of the selected accompaniments, controlling the progress of a song, controlling the channel of a song, adjusting the volume or effect of the accompaniment background audio, allowing or blocking the audio of the first terminal and/or the second terminal from being mixed, allowing or blocking the video image of the first terminal and/or the second terminal from being synthesized, adjusting the effect of the audio mixing, allowing or refusing a new terminal to join the session, and declaring the session closed;

3) According to the control message, video conference terminal A performs the corresponding operation on the karaoke session: deleting a selected accompaniment, setting the play order of the selected accompaniments, controlling the progress of a song, controlling the channel of a song, adjusting the volume or effect of the accompaniment background audio, allowing or blocking the audio of the first terminal and/or the second terminal from being mixed, allowing or blocking the video image of the first terminal and/or the second terminal from being synthesized, adjusting the effect of the audio mixing, allowing or refusing a new terminal to join the session, or declaring the session closed.
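The handling of chairman control messages in items 2) and 3) can be sketched as a small dispatcher over session state. Only a subset of the listed operations is shown, and everything about the representation is an assumption: the message format (a dictionary with a `type` key), the type strings and the `SessionState` fields are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class SessionState:
    """Hypothetical state that video conference terminal A keeps per session."""
    members: Set[str] = field(default_factory=set)
    shielded_audio: Set[str] = field(default_factory=set)
    song_queue: List[str] = field(default_factory=list)
    closed: bool = False


def apply_control(state: SessionState, message: Dict[str, str]) -> None:
    """Apply one control message from the chairman terminal to the session."""
    kind = message["type"]
    if kind == "delete_song":
        if message["song"] in state.song_queue:
            state.song_queue.remove(message["song"])
    elif kind == "shield_audio":
        state.shielded_audio.add(message["terminal"])      # exclude from mixing
    elif kind == "allow_audio":
        state.shielded_audio.discard(message["terminal"])  # include in mixing
    elif kind == "admit_terminal":
        state.members.add(message["terminal"])
    elif kind == "close_session":
        state.closed = True
    else:
        raise ValueError(f"unsupported control message: {kind}")
```

A mixer built on this state would skip the frames of any terminal listed in `shielded_audio` before summing.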

309, exit:

1) each terminal that has joined may apply to the chairman to exit this meeting;

2) the chairman terminal records the exit, deletes this terminal from the IP logical links of this meeting (singing session), and releases it from the call procedure of this meeting according to the corresponding scheduling protocol of the video conferencing framework (for example, H.323 or SIP). A terminal that has exited may then apply to join another meeting (singing session), or may apply to the accompaniment library server to log out. This is the normal exit flow. Or

1) the control center monitors the status information that each terminal reports periodically (for example, one status packet per second, as above). If no status packet has been received after a period of time (for example, 4 to 5 seconds), the chairman terminal judges that this terminal is abnormal. The chairman terminal then deletes the logical binding relationship of this terminal, and may also notify each participant terminal that this terminal has exited abnormally. An example is an abnormal exit caused by a terminal power failure or a network interruption. Or

1) the accompaniment library server judges from a terminal operation timeout that the terminal has logged out.

In addition: if the chairman terminal on duty exits abnormally, the chairman authority automatically returns to the terminal on the side that has the audio mixing module. If the terminal acting as the audio mixing module (video conference terminal A) exits abnormally, each participant terminal, after receiving no mixed audio within a set time, leaves this meeting (singing session), and the meeting ends abnormally.
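The abnormal-exit detection in step 309 amounts to a heartbeat timeout. A minimal sketch follows, assuming one status report per second and a 5-second timeout as in the example figures above; the `HeartbeatMonitor` class and its method names are hypothetical, and timestamps are passed in explicitly so the logic is easy to check.

```python
HEARTBEAT_TIMEOUT_S = 5.0  # the text gives roughly 4 to 5 seconds as an example


class HeartbeatMonitor:
    """Tracks the last status report of each terminal (hypothetical helper)."""

    def __init__(self, timeout_s=HEARTBEAT_TIMEOUT_S):
        self.timeout_s = timeout_s
        self.last_seen = {}  # terminal id -> time of the last status packet

    def report(self, terminal_id, now):
        # Called whenever a status packet arrives from a terminal.
        self.last_seen[terminal_id] = now

    def expire(self, now):
        """Remove and return the terminals whose last report is older than the timeout."""
        dead = sorted(t for t, seen in self.last_seen.items()
                      if now - seen > self.timeout_s)
        for terminal_id in dead:
            del self.last_seen[terminal_id]  # stands in for deleting the logical binding
        return dead
```

The list returned by `expire` is what the chairman terminal would use to notify the remaining participants of the abnormal exit.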

310, end:

1) the chairman terminal applies to the accompaniment library server to close the logical binding relationship of each terminal, and each terminal releases its call link according to the corresponding scheduling protocol of the video conferencing framework (for example, H.323 or SIP). This meeting (singing session) then ends.

A terminal that has exited may then apply to join another meeting (singing session), or may send an exit application to the accompaniment library server to log out.

Embodiment four: this embodiment is a specific embodiment of the ring-type implementation method of remote karaoke. In this embodiment the control center is an independent server, and the accompaniment library server is also an independent server. The accompaniment library server may alternatively be integrated into the control center without affecting the implementation of the embodiment of the invention. If the control center has a strong audio mixing capability, this scheme can be used in scenarios where many terminals take part in remote karaoke. In this scheme the control center may also have a multi-picture function. Under this architecture the first terminal and the second terminal may be terminals of the same type, and there may still be more than two terminals; the specific number is not limited. The first terminal and the second terminal may be video conference terminals or videophones. In this embodiment the first terminal is represented by video TV terminal A and the second terminal by video TV terminal B.

The specific implementation process of remote karaoke is shown in Figure 4 (this process can be carried out using the H.323 video conferencing framework or a network framework protocol of the same type):

401, registration:

Same as the control center registration and the accompaniment library server registration in embodiment three. Each terminal participating in the session needs to register with both the control center and the accompaniment library server.

402, call:

For example: video conference terminal A prepares to invite video conference terminals B, C, D... to take part in a karaoke singing session; video conference terminal A may also invite only certain terminals individually.

1) terminal A calls terminals B, C, D... through the master control center, using the known "nickname" or "short call number" of terminals B, C, D..., and the calls converge at the MCU of the master control center.

2) the master control center assigns a meeting serial number to this meeting. The serial number may also be a name submitted by terminal A when it applies to convene the meeting, and terminal A may apply to set information such as a password required to join this meeting. The master control center approves this meeting serial number after checking that it does not duplicate an existing one and that it conforms to the established naming rules.

If there are further terminals E, F..., then once this meeting has been convened and connected, a terminal that wants to join can apply to join the meeting (singing session) using information such as this meeting serial number and its password.
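The convening and joining rules of step 402 can be illustrated with a small registry. This is a sketch under stated assumptions: the `MeetingRegistry` class and its methods are hypothetical, and the "naming rule" is reduced to "non-empty and not already in use", which the text does not specify.

```python
class MeetingRegistry:
    """Hypothetical bookkeeping kept by the master control center."""

    def __init__(self):
        self.meetings = {}  # serial number -> {"password": ..., "members": set of terminals}

    def convene(self, serial, host, password=None):
        # Assumed naming rule: the serial must be non-empty and must not duplicate another.
        if not serial or serial in self.meetings:
            raise ValueError("serial number is empty or already in use")
        self.meetings[serial] = {"password": password, "members": {host}}

    def join(self, serial, terminal, password=None):
        """Admit a late terminal only if the serial exists and the password matches."""
        meeting = self.meetings.get(serial)
        if meeting is None or meeting["password"] != password:
            return False
        meeting["members"].add(terminal)
        return True
```

For example, after `convene("party-1", "A", password="1234")`, a call to `join("party-1", "E", password="1234")` admits terminal E, while a wrong password leaves the member set unchanged.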

403, accompaniment library server registration; 404, negotiation; 405, song selection; 406, audio mixing; 407, multi-picture compositing; 408, scheduling; 409, exit; 410, end: these are similar to 302, accompaniment library server registration; 304, negotiation; 305, song selection; 306, audio mixing; 307, multi-picture compositing; 308, scheduling; 309, exit; and 310, end, in embodiment three, and are not repeated here.

In this embodiment, when video is added, each terminal needs to capture its own video and send it to the control center. During negotiation the video parameters also need to be negotiated; the specific negotiation process can still follow the video conferencing framework, and the video is sent using the negotiated parameters. The multipoint control unit (MCU) module of the control center composites the video pictures it receives and then sends the result to the terminals participating in the session or to a specified range of terminals. Video can be sent in a manner similar to audio (sending video adapted to each terminal's capability). In the scheduling process the chairman gains new permissions: turning the multi-picture effect on and off, adjusting the multi-picture state and order, and adding the video of any of the terminals to the multi-picture image. Participants can thus see the accompaniment lyrics picture and also the facial expressions and movements of the other participants, which greatly enriches the atmosphere when singing.

Embodiment five: as shown in Figure 5, the embodiment of the invention also provides a control center, comprising:

an audio receiving unit 501, configured to receive the voice audio sent by the first terminal and the second terminal, and to receive the accompaniment audio sent by the accompaniment library server;

an audio mixing unit 502, configured to mix the accompaniment audio and the voice audio to obtain mixed audio;

an audio sending unit 503, configured to send the mixed audio to the first terminal and/or the second terminal.
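The mixing performed by the audio mixing unit 502 is not specified in detail. One common approach, shown here only as an assumed sketch, is to sum time-aligned signed 16-bit PCM samples and clip the result; the function name `mix_pcm16` is hypothetical.

```python
def mix_pcm16(*streams):
    """Mix equal-length sequences of signed 16-bit samples by summing and clipping."""
    if not streams:
        return []
    length = len(streams[0])
    if any(len(s) != length for s in streams):
        raise ValueError("all streams must have the same number of samples")
    # Clip each summed sample to the signed 16-bit range to avoid wrap-around.
    return [max(-32768, min(32767, sum(samples))) for samples in zip(*streams)]
```

Here the accompaniment and each terminal's voice would be passed as separate streams, for example `mix_pcm16(accompaniment, voice_a, voice_b)`.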

Further, as shown in Figure 6, the control center may add video processing to make the karaoke session richer, and further comprises:

a video receiving unit 601, configured to receive the video images sent by the first terminal and the second terminal;

a video compositing unit 602, configured to composite the video images from the first terminal and the second terminal to obtain a composite video image;

a video sending unit 603, configured to send the composite video image to the first terminal and/or the second terminal.

By adding video, this embodiment further enriches the karaoke session and gives the user a better experience.
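As a rough illustration of what the video compositing unit 602 produces, the sketch below joins frames side by side into one "multi-picture" frame. It assumes each frame is a list of pixel rows of equal height; the function name and the frame representation are hypothetical, and a real MCU would also scale and encode the pictures.

```python
def compose_side_by_side(frames):
    """Join frames of equal height into one wide frame, row by row."""
    if not frames:
        raise ValueError("at least one frame is required")
    height = len(frames[0])
    if any(len(frame) != height for frame in frames):
        raise ValueError("all frames must have the same height")
    # For every row index, concatenate that row from each frame, left to right.
    return [sum((frame[row] for frame in frames), []) for row in range(height)]
```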

Further, as shown in Figure 7, the control center may also comprise:

a mixing effect information receiving unit 701, configured to receive information, sent by the first terminal and/or the second terminal, for adjusting that terminal's own audio attributes and/or mixing effect;

the audio mixing unit 502 is specifically configured to mix the accompaniment audio corresponding to the first terminal with the voice audio according to the mixing effect required by that adjustment information, obtaining the mixed audio corresponding to the first terminal; and to mix the accompaniment audio corresponding to the second terminal with the voice audio according to the mixing effect required by that adjustment information, obtaining the mixed audio corresponding to the second terminal.

In this embodiment, mixing the audio according to the requirements set by each terminal better meets the differing requirements of the terminals.
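Per-terminal mixing can be pictured as running the same mix once per terminal with that terminal's own settings. The sketch below assumes the adjustment information reduces to two gain factors; the keys `accompaniment_gain` and `voice_gain` and the function name are hypothetical, since the text does not define the format of the adjustment information.

```python
def mix_for_terminal(accompaniment, voices, settings):
    """Mix one terminal's personal output.

    accompaniment: list of signed 16-bit samples
    voices: dict of terminal id -> list of samples, same length as accompaniment
    settings: dict with optional hypothetical keys "accompaniment_gain" and "voice_gain"
    """
    acc_gain = settings.get("accompaniment_gain", 1.0)
    voice_gain = settings.get("voice_gain", 1.0)
    mixed = []
    for i, acc in enumerate(accompaniment):
        total = acc_gain * acc + voice_gain * sum(v[i] for v in voices.values())
        mixed.append(max(-32768, min(32767, int(round(total)))))  # clip to 16-bit range
    return mixed
```

Calling the function once with the first terminal's settings and once with the second terminal's settings yields the two terminal-specific mixed audio streams.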

Further, as shown in Figure 8, the control center may also comprise:

a control message receiving unit 801, configured to receive the control message sent by the chairman terminal;

a control execution unit 802, configured to perform the corresponding control operation on the karaoke session according to the control message.

The technical problem solved by this embodiment is how the chairman terminal carries out session control.

Further, as shown in Figure 9, the control center may also comprise:

an exit information acquisition unit 901, configured to receive an exit request sent by the first terminal or the second terminal, or to receive a control message sent by the chairman terminal for removing a terminal, or to detect that the first terminal or the second terminal has exited abnormally;

a connection release unit 902, configured to release the session connection of the terminal concerned when an exit request sent by the first terminal or the second terminal is received, or when a control message for removing a terminal is received from the chairman terminal, or when an abnormal exit of the first terminal or the second terminal is detected.

The technical problem solved by this embodiment is how a terminal exits the session.
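The three exit triggers handled by units 901 and 902 all lead to the same action, which a short sketch makes explicit. The trigger names, the `connections` dictionary and the function name are hypothetical.

```python
EXIT_TRIGGERS = ("exit_request", "chairman_removal", "abnormal_exit")


def release_connection(connections, terminal_id, trigger):
    """Drop one terminal's session connection; return True if it was connected."""
    if trigger not in EXIT_TRIGGERS:
        raise ValueError(f"unknown exit trigger: {trigger}")
    # connections maps terminal id -> an opaque connection handle.
    return connections.pop(terminal_id, None) is not None
```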

Embodiment six: the embodiment of the invention also provides a terminal which, as shown in Figure 10, comprises:

an audio receiving unit 1001, configured to receive the voice audio sent by the other terminals participating in the session, and to receive the accompaniment audio that the accompaniment library server sends according to the accompaniment demand of this terminal or of the other terminals;

an audio capture unit 1002, configured to capture the voice audio of this terminal;

an audio synthesis unit 1003, configured to mix the accompaniment audio, the received voice audio and the voice audio of this terminal to obtain mixed audio;

an audio playing unit 1004, configured to play the mixed audio;

a mixed audio sending unit 1005, configured to send the mixed audio to the other terminals.

Further, as shown in Figure 11, the terminal also comprises:

a video receiving unit 1101, configured to receive the video images sent by the other terminals;

a video capture unit 1102, configured to capture the video image of this terminal;

a video synthesis unit 1103, configured to composite the video images from the other terminals with the video image of this terminal to obtain a composite video image;

a video playing unit 1104, configured to play the composite video image;

a video sending unit 1105, configured to send the composite video image to the other terminals.

Embodiment seven: the embodiment of the invention also provides a terminal which, as shown in Figure 12, comprises:

an audio sending unit 1201, configured to send the captured voice audio of this terminal to the control center;

a mixed audio receiving unit 1202, configured to receive the mixed audio sent by the control center, the mixed audio comprising the audio of the terminals participating in the session;

an accompaniment audio receiving unit 1203, configured to receive the accompaniment audio sent by the accompaniment library server;

an audio synthesis unit 1204, configured to combine the mixed audio and the accompaniment audio to obtain target audio;

an audio playing unit 1205, configured to play the target audio.
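In this embodiment the final combination happens on the terminal: the voices arrive already mixed from the control center and the accompaniment arrives separately. A minimal sketch of the audio synthesis unit 1204 follows, assuming signed 16-bit samples and padding the shorter stream with silence; the function name and the padding choice are assumptions, and the text does not address how the two streams are time-aligned.

```python
from itertools import zip_longest


def synthesize_target_audio(mixed_voices, accompaniment):
    """Add the accompaniment to the already mixed voices, clipping to 16 bits."""
    return [
        max(-32768, min(32767, voice + acc))
        # fillvalue=0 pads whichever stream is shorter with silence
        for voice, acc in zip_longest(mixed_voices, accompaniment, fillvalue=0)
    ]
```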

Further, as shown in Figure 13, the terminal also comprises:

a video sending unit 1301, configured to send the video image captured by this terminal to the control center;

a composite image receiving unit 1302, configured to receive the composite video image sent by the control center, the composite video image comprising the video of the terminals participating in the session;

a video playing unit 1303, configured to play the composite video image.

Embodiment eight: as shown in Figure 14, the embodiment of the invention also provides a system for implementing remote karaoke, comprising:

an accompaniment library server 1401, configured to send accompaniment audio to the control center 1402 according to the demand of the first terminal or the second terminal;

a control center 1402, configured to receive the voice audio sent by the first terminal and the second terminal, and to receive the accompaniment audio that the accompaniment library server 1401 sends according to the demand of the first terminal or the second terminal; to mix the accompaniment audio and the voice audio to obtain mixed audio; and to send the mixed audio to the first terminal and/or the second terminal.

Further, the control center 1402 is also configured to receive the video images sent by the first terminal and the second terminal, and to composite the video images from the first terminal and the second terminal to obtain a composite video image;

and to send the composite video image to the first terminal and/or the second terminal.

Further, the control center 1402 is also configured to negotiate the audio transmission parameters with the first terminal and with the second terminal separately;

sending the mixed audio signal to the first terminal or the second terminal comprises: sending the mixed audio to the first terminal according to the audio transmission parameters negotiated with the first terminal; and sending the mixed audio to the second terminal according to the audio transmission parameters negotiated with the second terminal.
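Sending according to separately negotiated parameters means the same mixed audio may be packaged differently for each terminal. The sketch below assumes the only negotiated parameter is a frame size in samples; the `frame_size` key and both function names are hypothetical, and real negotiation under H.323 or SIP would also cover codec and bit rate.

```python
def frames_for_terminal(mixed, frame_size):
    """Split the mixed samples into frames of the negotiated size."""
    if frame_size <= 0:
        raise ValueError("frame size must be positive")
    return [mixed[i:i + frame_size] for i in range(0, len(mixed), frame_size)]


def plan_delivery(mixed, negotiated):
    """Build, per terminal, the list of frames to send using its own parameters."""
    return {
        terminal_id: frames_for_terminal(mixed, params["frame_size"])
        for terminal_id, params in negotiated.items()
    }
```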

Further, the accompaniment library server 1401 is specifically configured to send accompaniment audio to the control center according to the transmission parameters set by the first terminal, and to send accompaniment audio to the control center according to the transmission parameters set by the second terminal;

the control center 1402 is specifically configured to mix the accompaniment audio sent according to the transmission parameters set by the first terminal with the voice audio, obtaining the mixed audio corresponding to the first terminal and sending it to the first terminal; and to mix the accompaniment audio sent according to the transmission parameters set by the second terminal with the voice audio, obtaining the mixed audio corresponding to the second terminal and sending it to the second terminal.

Further, the control center 1402 is also configured to receive information, sent by the first terminal and/or the second terminal, for adjusting that terminal's own audio attributes and/or mixing effect;

the mixing of the accompaniment audio and the voice audio by the control center 1402 to obtain mixed audio comprises:

mixing the accompaniment audio and the voice audio according to the adjusted audio attribute and/or mixing effect information of the first terminal to obtain the mixed audio corresponding to the first terminal; and mixing the accompaniment audio and the voice audio according to the adjusted audio attribute and/or mixing effect information of the second terminal to obtain the mixed audio corresponding to the second terminal.

Further, the control center 1402 is also configured to receive the control message sent by the chairman terminal, and to perform the corresponding control operation on the karaoke session according to the control message.

Further, the control center 1402 is also configured to release the session connection of the terminal concerned when an exit request sent by the first terminal or the second terminal is received, or when a control message for removing a terminal is received from the chairman terminal, or when an abnormal exit of the first terminal or the second terminal is detected.

One of ordinary skill in the art will appreciate that all or part of the steps of the methods in the above embodiments may be completed by a program instructing the relevant hardware. The program may be stored in a computer-readable storage medium, which may be a read-only memory (ROM), a magnetic disk, an optical disc, or the like.

The method, device and system for implementing remote karaoke provided by the embodiments of the invention have been described in detail above. Specific examples are used herein to set out the principle and implementation of the invention, and the description of the above embodiments is only intended to help in understanding the method of the invention and its core idea. At the same time, a person of ordinary skill in the art may, following the idea of the invention, make changes to the specific implementation and scope of application. In summary, the content of this description should not be construed as limiting the invention.

Claims

1. A ring-type method for implementing remote karaoke, characterized by comprising:

2. The method according to claim 1, characterized by further comprising:

receiving the video images sent by the first terminal and the second terminal;

compositing the video images from the first terminal and the second terminal to obtain a composite video image; and sending the composite video image to the first terminal and/or the second terminal.

3. The method according to claim 1, characterized in that receiving the accompaniment audio sent by the accompaniment library server according to the accompaniment demand of the first terminal or the second terminal comprises:

receiving the accompaniment audio that the accompaniment library server sends according to the transmission parameters set by the first terminal; and receiving the accompaniment audio that the accompaniment library server sends according to the transmission parameters set by the second terminal;

mixing the accompaniment audio and the voice audio to obtain mixed audio comprises:

mixing the accompaniment audio sent according to the transmission parameters set by the first terminal with the voice audio to obtain the mixed audio corresponding to the first terminal; and mixing the accompaniment audio sent according to the transmission parameters set by the second terminal with the voice audio to obtain the mixed audio corresponding to the second terminal;

sending the mixed audio to the first terminal and/or the second terminal comprises:

sending the mixed audio corresponding to the first terminal to the first terminal;

sending the mixed audio corresponding to the second terminal to the second terminal.

4. The method according to claim 1, characterized in that, before mixing the accompaniment audio and the voice audio, the method further comprises: receiving information, sent by the first terminal and/or the second terminal, for adjusting that terminal's own audio attributes and/or mixing effect;

mixing the accompaniment audio and the voice audio according to the adjusted audio attribute and/or mixing effect information of the first terminal to obtain the mixed audio corresponding to the first terminal; and mixing the accompaniment audio and the voice audio according to the adjusted audio attribute and/or mixing effect information of the second terminal to obtain the mixed audio corresponding to the second terminal;

sending the mixed audio to the first terminal and/or the second terminal comprises:

sending the mixed audio corresponding to the first terminal to the first terminal; and sending the mixed audio corresponding to the second terminal to the second terminal.

5. The method according to any one of claims 1 to 4, characterized by further comprising:

receiving the control message, sent by the chairman terminal, for controlling the karaoke session, the control message being at least one of: deleting a selected accompaniment audio, setting the playing order of the selected accompaniment audio, adjusting the progress of a song, adjusting the sound channel of a song, adjusting the volume or effect of the accompaniment background audio, allowing or blocking the audio of the first terminal and/or the second terminal from being mixed, allowing or blocking the video image of the first terminal and/or the second terminal from being composited, adjusting the effect of the mixing, allowing or refusing a new terminal to join the session, and announcing the end of the session;

performing, according to the control message, the corresponding operation on the karaoke session: deleting a selected accompaniment audio, setting the playing order of the selected accompaniment audio, adjusting the progress of a song, adjusting the sound channel of a song, adjusting the volume or effect of the accompaniment background audio, allowing or blocking the audio of the first terminal and/or the second terminal from being mixed, allowing or blocking the video image of the first terminal and/or the second terminal from being composited, adjusting the effect of the mixing, allowing or refusing a new terminal to join the session, or announcing the end of the session.

6. The method according to claim 5, characterized by further comprising:

when an exit request sent by the first terminal or the second terminal is received, or a control message for removing a terminal is received from the chairman terminal, or an abnormal exit of the first terminal or the second terminal is detected, releasing the session connection of the terminal that sent the exit request, of the terminal removed by the control message, or of the terminal that exited abnormally.

7. A star-type method for implementing remote karaoke, characterized by comprising:

8. The method according to claim 7, characterized by further comprising:

receiving the video images sent by the other terminals, and capturing the video image of this terminal;

compositing the video images from the other terminals with the video image of this terminal to obtain a composite video image;

playing the composite video image, and sending the composite video image to the other terminals.

9. A ring-type method for implementing remote karaoke, characterized by comprising:

sending the captured voice audio of this terminal to the control center;

10. The method according to claim 9, characterized by further comprising:

sending the video image captured by this terminal to the control center;

receiving the composite video image sent by the control center, the composite video image comprising the video of the terminals participating in the session;

playing the target audio further comprises: playing the composite video image.

11. A control center, characterized by comprising:

12. The control center according to claim 11, characterized by further comprising:

a video receiving unit, configured to receive the video images sent by the first terminal and the second terminal;

a video compositing unit, configured to composite the video images from the first terminal and the second terminal to obtain a composite video image;

a video sending unit, configured to send the composite video image to the first terminal and/or the second terminal.

13. The control center according to claim 11 or 12, characterized by further comprising:

a mixing effect information receiving unit, configured to receive information, sent by the first terminal and/or the second terminal, for adjusting that terminal's own audio attributes and/or mixing effect;

the audio mixing unit is specifically configured to mix the accompaniment audio corresponding to the first terminal with the voice audio according to the mixing effect required by that adjustment information, obtaining the mixed audio corresponding to the first terminal; and to mix the accompaniment audio corresponding to the second terminal with the voice audio according to the mixing effect required by that adjustment information, obtaining the mixed audio corresponding to the second terminal.

14. The control center according to claim 11 or 12, characterized by further comprising:

a control message receiving unit, configured to receive the control message sent by the chairman terminal;

a control execution unit, configured to perform the corresponding control operation on the karaoke session according to the control message.

15. The control center according to claim 14, characterized by further comprising:

an exit information acquisition unit, configured to receive an exit request sent by the first terminal or the second terminal, or to receive a control message sent by the chairman terminal for removing a terminal, or to detect that the first terminal or the second terminal has exited abnormally;

a connection release unit, configured to release the session connection of the terminal concerned when an exit request sent by the first terminal or the second terminal is received, or when a control message for removing a terminal is received from the chairman terminal, or when an abnormal exit of the first terminal or the second terminal is detected.

16. A terminal, characterized by comprising:

an audio capture unit, configured to capture the voice audio of this terminal;

an audio playing unit, configured to play the mixed audio;

17. The terminal according to claim 16, characterized by further comprising:

a video receiving unit, configured to receive the video images sent by the other terminals;

a video capture unit, configured to capture the video image of this terminal;

a video synthesis unit, configured to composite the video images from the other terminals with the video image of this terminal to obtain a composite video image;

a video playing unit, configured to play the composite video image;

a video sending unit, configured to send the composite video image to the other terminals.

18. A terminal, characterized by comprising:

an audio playing unit, configured to play the target audio.

19. The terminal according to claim 18, characterized by further comprising:

a video sending unit, configured to send the video image captured by this terminal to the control center;

a composite image receiving unit, configured to receive the composite video image sent by the control center, the composite video image comprising the video of the terminals participating in the session;

a video playing unit, configured to play the composite video image.

20. A system for implementing remote karaoke, characterized by comprising:

21. The system according to claim 20, characterized in that

the control center is also configured to receive the video images sent by the first terminal and the second terminal, and to composite the video images from the first terminal and the second terminal to obtain a composite video image;

and to send the composite video image to the first terminal and/or the second terminal.

22. The system according to claim 20 or 21, characterized in that

the control center is also configured to negotiate the audio transmission parameters with the first terminal and with the second terminal separately;

sending the mixed audio signal to the first terminal or the second terminal comprises: sending the mixed audio to the first terminal according to the audio transmission parameters negotiated with the first terminal; and sending the mixed audio to the second terminal according to the audio transmission parameters negotiated with the second terminal.

23. The system according to claim 20 or 21, characterized in that

the accompaniment library server is specifically configured to send accompaniment audio to the control center according to the transmission parameters set by the first terminal, and to send accompaniment audio to the control center according to the transmission parameters set by the second terminal;

the control center is specifically configured to mix the accompaniment audio sent according to the transmission parameters set by the first terminal with the voice audio, obtaining the mixed audio corresponding to the first terminal and sending it to the first terminal; and to mix the accompaniment audio sent according to the transmission parameters set by the second terminal with the voice audio, obtaining the mixed audio corresponding to the second terminal and sending it to the second terminal.

24. The system according to claim 20 or 21, characterized in that

the control center is also configured to receive information, sent by the first terminal and/or the second terminal, for adjusting that terminal's own audio attributes and/or mixing effect;

the mixing of the accompaniment audio and the voice audio by the control center to obtain mixed audio comprises:

mixing the accompaniment audio and the voice audio according to the adjusted audio attribute and/or mixing effect information of the first terminal to obtain the mixed audio corresponding to the first terminal; and mixing the accompaniment audio and the voice audio according to the adjusted audio attribute and/or mixing effect information of the second terminal to obtain the mixed audio corresponding to the second terminal.

25. The system according to claim 20 or 21, characterized in that

the control center is also configured to receive the control message sent by the chairman terminal, and to perform the corresponding control operation on the karaoke session according to the control message.

26. The system according to claim 25, characterized in that

the control center is also configured to release the session connection of the terminal concerned when an exit request sent by the first terminal or the second terminal is received, or when a control message for removing a terminal is received from the chairman terminal, or when an abnormal exit of the first terminal or the second terminal is detected.