Summary of the invention
In view of this, the invention provides an audio/video synchronization apparatus and method based on the HLS protocol, which solve the picture or sound lag caused by persistent desynchronization between the audio signal to be played and the video signal to be played.
Specifically, the apparatus comprises a marking module, a demultiplexing module, a decoding module, a detection module, and a playing module arranged in sequence, wherein:
The marking module is configured to, after receiving a synchronization request, add a synchronization mark to the first audio/video data obtained thereafter, and to notify the playing module;
The demultiplexing module is configured to demultiplex the audio/video data into audio data and video data, and to update the timestamp of the audio data and/or the video data according to the synchronization mark;
The decoding module is configured to decode the audio data and the video data, and to output an audio signal to be played and a video signal to be played;
The detection module is configured to detect whether the timestamps carried in the audio signal to be played and the video signal to be played are consistent, and to send a synchronization request to the marking module when they are inconsistent;
The playing module is configured to control the audio signal to be played and the video signal to be played to be output synchronously according to the synchronization mark.
Further, the detection module is configured to send the synchronization request to the marking module if the timestamps carried in the audio signal to be played and the video signal to be played remain inconsistent throughout a predetermined time.
Further, the process by which the demultiplexing module updates the timestamp of the audio data and/or the video data comprises: the demultiplexing module corrects the timestamp of the video data according to the timestamp of the audio data.
Further, the process by which the demultiplexing module updates the timestamp of the audio data and/or the video data comprises: the demultiplexing module corrects the timestamps of the audio data and the video data according to the timestamp of the audio/video data.
Further, the process by which the playing module controls the audio signal to be played and the video signal to be played to be output synchronously according to the synchronization mark comprises: storing whichever of the audio signal to be played and the video signal to be played carrying the synchronization mark arrives first in an output buffer; and, after the other signal carrying the synchronization mark is received, controlling the audio signal to be played and the video signal to be played that carry the synchronization mark to be output synchronously.
The method comprises:
After receiving a synchronization request, adding a synchronization mark to the first audio/video data obtained thereafter;
Demultiplexing the audio/video data into audio data and video data, and updating the timestamp of the audio data and/or the video data according to the synchronization mark;
Decoding the audio data and the video data, and outputting an audio signal to be played and a video signal to be played;
Detecting whether the timestamps carried in the audio signal to be played and the video signal to be played are consistent, and sending a synchronization request when they are inconsistent;
Controlling the audio signal to be played and the video signal to be played to be output synchronously according to the synchronization mark.
Further, the method also comprises:
Sending the synchronization request if the timestamps carried in the audio signal to be played and the video signal to be played remain inconsistent throughout a predetermined time.
Further, the process of updating the timestamp of the audio data and/or the video data comprises: correcting the timestamp of the video data according to the timestamp of the audio data.
Further, the process of updating the timestamp of the audio data and/or the video data comprises: correcting the timestamps of the audio data and the video data according to the timestamp of the audio/video data.
Further, the process of controlling the audio signal to be played and the video signal to be played to be output synchronously according to the synchronization mark comprises: storing whichever of the audio signal to be played and the video signal to be played carrying the synchronization mark arrives first in an output buffer; and, after the other signal carrying the synchronization mark is received, controlling the audio signal to be played and the video signal to be played that carry the synchronization mark to be output synchronously.
As can be seen from the above description, the present invention detects the timestamps of the audio signal to be played and the video signal to be played, and thereby detects whether their output is synchronized. When the output is out of sync, a synchronization mark is added to the audio/video data awaiting demultiplexing. According to this synchronization mark, the audio signal to be played and the video signal to be played are controlled to be output synchronously; at the same time, the timestamps of the audio data and the video data are updated during demultiplexing, so that the timestamps of subsequent audio and video signals to be played are synchronized and their output is in turn synchronized. The problem of picture or sound lag caused by persistent desynchronization between the audio signal to be played and the video signal to be played is therefore solved.
Detailed description of the embodiments
To make the objects, technical solutions, and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments are described clearly below with reference to the accompanying drawings. Obviously, the described embodiments are some, rather than all, of the embodiments of the present invention. All other embodiments obtained by those of ordinary skill in the art on the basis of the embodiments of the present invention without creative effort fall within the scope of protection of the present invention.
To address the above problems in the prior art, the present invention provides an audio/video synchronization apparatus based on the HLS protocol. The following embodiments are provided to further illustrate the present invention:
Embodiment 1
Referring to Fig. 2, the HLS-based audio/video synchronization apparatus of this embodiment is provided, in sequence, with: a marking module 110, a demultiplexing module 120, a decoding module 130, a detection module 140, and a playing module 150.
The marking module 110 is configured to, after receiving a synchronization request, add a synchronization mark to the first audio/video data obtained thereafter, and to notify the playing module 150.
The demultiplexing module 120 is configured to demultiplex the audio/video data into audio data and video data, and to update the timestamp of the audio data and/or the video data according to the synchronization mark.
The decoding module 130 is configured to decode the audio data and the video data, and to output an audio signal to be played and a video signal to be played.
The detection module 140 is configured to detect whether the timestamps carried in the audio signal to be played and the video signal to be played are consistent, and to send a synchronization request to the marking module 110 when they are inconsistent.
The playing module 150 is configured to control the audio signal to be played and the video signal to be played to be output synchronously according to the synchronization mark.
The present invention provides the detection module 140 to judge, from the timestamps carried by the audio signal to be played and the video signal to be played, whether the two signals are synchronized. When they are not, the detection module 140 sends a synchronization request to the marking module 110, and the marking module 110, after receiving this request, adds a synchronization mark to the first audio/video data it obtains next. The synchronization mark prompts the demultiplexing module 120 and the playing module 150 to synchronize.
In a specific implementation of this embodiment, the demultiplexing module 120 updates the timestamp of the audio data and/or the video data according to the synchronization mark, so as to correct a mismatch between the timestamps of the audio data and the video data. The playing module 150 controls the audio signal to be played and the video signal to be played to be output synchronously according to the synchronization mark.
As can be seen from the above description, the present invention detects the timestamps of the audio signal to be played and the video signal to be played, and thereby detects whether their output is synchronized. When the output is out of sync, a synchronization mark is added to the audio/video data awaiting demultiplexing. According to this synchronization mark, the audio signal to be played and the video signal to be played are controlled to be output synchronously; at the same time, the timestamps of the audio data and the video data are updated during demultiplexing, so that the timestamps of subsequent audio and video signals to be played are synchronized and their output is in turn synchronized. The problem of picture or sound lag caused by persistent desynchronization between the audio signal to be played and the video signal to be played is therefore solved.
Embodiment 2
This embodiment provides a more comprehensive HLS-based audio/video synchronization apparatus, which is provided, in sequence, with a marking module, a demultiplexing module, a decoding module, a detection module, and a playing module, wherein:
The marking module is configured to, after receiving a synchronization request, add a synchronization mark to the first audio/video data obtained thereafter, and to notify the playing module.
The demultiplexing module is configured to demultiplex the audio/video data into audio data and video data, and to update the timestamp of the audio data and/or the video data according to the synchronization mark.
The decoding module is configured to decode the audio data and the video data, and to output an audio signal to be played and a video signal to be played.
The detection module is configured to detect whether the timestamps carried in the audio signal to be played and the video signal to be played are consistent, and to send a synchronization request to the marking module when they are inconsistent.
The playing module is configured to control the audio signal to be played and the video signal to be played to be output synchronously according to the synchronization mark.
Compared with the prior art, the present invention places a detection module between the decoding module and the playing module. The detection module detects whether the timestamps carried in the audio signal to be played and the video signal to be played are consistent, and thereby whether the output of the two signals is synchronized; when the timestamps are inconsistent, the detection module sends a synchronization request to the marking module. To work with the detection module, the present invention also places a marking module before the demultiplexing module. The marking module receives the synchronization request sent by the detection module and, after receiving it, adds a synchronization mark to the first audio/video data obtained thereafter. Further, the playing module controls the decoded audio signal to be played and video signal to be played to be output synchronously according to this synchronization mark, so that the audio and video output is synchronized. The demultiplexing module updates the timestamp of the audio data and/or the video data according to this synchronization mark, so that the timestamps of the audio data and the video data are synchronized, which further ensures that subsequent output stays synchronized.
According to the HLS protocol, for a complete piece of multimedia data, multiple pieces of audio/video data in TS format are downloaded according to the URLs in the m3u8 file. TS is an encapsulation format: audio data, video data, and auxiliary information are all encapsulated into 188-byte TS packets. Referring to the TS packet structure shown in Fig. 3, a TS packet is formed by packing a TS header together with a PES stream. The PES stream is in turn formed by packing a PES header together with an ES stream, and the ES stream contains the audio data and the video data.
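The 188-byte packet structure described above can be illustrated with a minimal Python sketch that splits a TS segment into packets and reads the fixed 4-byte TS header. The field layout (sync byte 0x47, 13-bit PID, and so on) follows the MPEG-2 transport stream standard rather than anything stated in this description, and the names `TsHeader`, `parse_ts_header`, and `split_packets` are hypothetical.

```python
from dataclasses import dataclass

TS_PACKET_SIZE = 188   # packet size stated in the description
TS_SYNC_BYTE = 0x47    # per the MPEG-2 TS standard (not stated in the description)


@dataclass
class TsHeader:
    payload_unit_start: bool        # set when a new PES packet begins in this payload
    pid: int                        # 13-bit packet identifier
    adaptation_field_control: int   # 2-bit field
    continuity_counter: int         # 4-bit counter


def split_packets(segment: bytes) -> list:
    """Split a downloaded TS segment into 188-byte packets."""
    if len(segment) % TS_PACKET_SIZE != 0:
        raise ValueError("segment length is not a multiple of 188 bytes")
    return [segment[i:i + TS_PACKET_SIZE]
            for i in range(0, len(segment), TS_PACKET_SIZE)]


def parse_ts_header(packet: bytes) -> TsHeader:
    """Decode the fixed 4-byte header at the start of one TS packet."""
    if len(packet) != TS_PACKET_SIZE:
        raise ValueError("a TS packet must be exactly 188 bytes")
    if packet[0] != TS_SYNC_BYTE:
        raise ValueError("missing sync byte 0x47")
    b1, b2, b3 = packet[1], packet[2], packet[3]
    return TsHeader(
        payload_unit_start=bool(b1 & 0x40),
        pid=((b1 & 0x1F) << 8) | b2,
        adaptation_field_control=(b3 >> 4) & 0x03,
        continuity_counter=b3 & 0x0F,
    )
```

A demultiplexer would typically use the PID to route each packet's payload to the audio or video PES stream; that routing is outside the scope of this sketch.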
In a specific implementation of this embodiment, the marking module obtains the downloaded audio/video data. When no synchronization request has been received, the output of the audio signal to be played and the video signal to be played is currently synchronized; the marking module needs no special processing and simply passes the audio/video data to the demultiplexing module. When a synchronization request has been received, the audio signal to be played and the video signal to be played are out of sync. To correct this as quickly as possible, the marking module adds a synchronization mark to the first audio/video data it obtains next. The synchronization mark may be placed in the TS header, for example by adding a corresponding flag in one of the TS header fields.
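The marking behaviour described above can be sketched as follows. This is only an illustration under stated assumptions: the description does not say which TS header field carries the mark, so the sketch models it as a boolean `sync_mark` attribute on a hypothetical `Segment` object, and the `notify_player` callback stands in for the notification sent to the playing module.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Segment:
    """Hypothetical container for one downloaded piece of audio/video data."""
    data: bytes
    sync_mark: bool = False  # stands in for a flag in a TS header field


class MarkingModule:
    def __init__(self, notify_player: Optional[Callable[[Segment], None]] = None):
        self._pending_request = False
        self._notify_player = notify_player

    def request_sync(self) -> None:
        """Called by the detection module when audio and video are out of sync."""
        self._pending_request = True

    def process(self, segment: Segment) -> Segment:
        """Pass a segment through, marking only the first one after a request."""
        if self._pending_request:
            segment.sync_mark = True
            self._pending_request = False
            if self._notify_player is not None:
                self._notify_player(segment)
        return segment
```

With no pending request, `process` returns the segment unchanged, which matches the case in which no special processing is needed.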
The demultiplexing module demultiplexes the audio/video data into audio data and video data. It parses the TS packets of the audio/video data and extracts the audio data and the video data from them. If it finds the synchronization mark during parsing, it updates the timestamp of the audio data and/or the video data. Specifically, each piece of audio data and video data contains a timestamp, which is the presentation time, for example the time shown on the progress bar at the bottom of the screen when watching a film. When the demultiplexing module finds the synchronization mark, it adjusts the timestamps of the audio data and the video data so that they are synchronized.
Specifically, in an exemplary implementation of the present invention, the demultiplexing module has two modes for synchronously adjusting the timestamps of the audio data and/or the video data. The first mode corrects the timestamp of the video data according to the timestamp of the audio data; that is, the timestamp of the audio data is taken as the reference, and the timestamp of the video data is corrected to match it. The second mode corrects the timestamps of the audio data and the video data according to the timestamp of the audio/video data. According to the HLS protocol, each piece of audio/video data has a corresponding duration; from the durations listed for the audio/video data in the m3u8 file, the position in time of a given piece of audio/video data within the whole multimedia data can be calculated, and this is the timestamp of that audio/video data. When the timestamp of the audio data differs considerably from the timestamp of the audio/video data, the timestamps of the audio data and the video data are corrected according to the timestamp of the audio/video data; that is, both are set to the timestamp of the audio/video data, so as to keep the timestamps of the audio data and the video data relatively accurate.
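The two correction modes can be sketched in Python as below. Several details are assumptions, since the description leaves them open: the segment timestamp is taken to be the segment's start time (the sum of the durations of all preceding segments in the m3u8 file), timestamps are expressed in seconds, and `max_drift` is a hypothetical threshold standing in for "differs considerably".

```python
def segment_start_times(durations):
    """Derive each segment's timestamp from the per-segment durations
    listed in the m3u8 file (assumed here to be its start time, in seconds)."""
    starts = []
    elapsed = 0.0
    for duration in durations:
        starts.append(elapsed)
        elapsed += duration
    return starts


def correct_timestamps(audio_ts, video_ts, segment_ts, max_drift=1.0):
    """Return corrected (audio_ts, video_ts).

    Mode 2: if the audio timestamp is far from the segment timestamp,
    reset both to the segment timestamp.
    Mode 1: otherwise take the audio timestamp as the reference and
    align the video timestamp to it.
    max_drift is a hypothetical threshold, not a value from the description.
    """
    if abs(audio_ts - segment_ts) > max_drift:
        return segment_ts, segment_ts
    return audio_ts, audio_ts
```

For example, with segment durations of 10 s, 10 s, and 8 s, the third segment would be assigned a timestamp of 20 s; an audio timestamp of 25 s at the start of that segment would then be treated as drifted and both timestamps would be reset to 20 s.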
The decoding module decodes the audio data and the video data, restoring the digital data into a playable audio signal to be played and video signal to be played for output.
The detection module detects the timestamps carried by the audio signal to be played and the video signal to be played before they are output. This timestamp is the specific time position of the audio/video data; for example, 01:20:38 represents 1 hour, 20 minutes, 38 seconds. In general, the time shown on the playback progress bar comes from this timestamp. When the timestamps carried in the audio signal to be played and the video signal to be played are inconsistent, the output of the two signals is out of sync, and the detection module sends a synchronization request to the marking module. Further, the detection module sends the synchronization request only when the timestamps carried in the audio signal to be played and the video signal to be played remain inconsistent throughout a predetermined time, so as to avoid placing excessive load on the marking module through frequent requests. Preferably, the predetermined time may be 3 seconds.
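The predetermined-time rule above can be sketched as a small state machine. This is a minimal illustration, assuming timestamps in seconds and a caller that supplies the current time; the `tolerance` parameter is a hypothetical addition (the description only speaks of timestamps being "inconsistent"), and `send_sync_request` stands in for the link to the marking module.

```python
class DetectionModule:
    def __init__(self, send_sync_request, hold_seconds=3.0, tolerance=0.0):
        self._send_sync_request = send_sync_request
        self._hold_seconds = hold_seconds   # the preferred 3-second window
        self._tolerance = tolerance         # hypothetical allowed difference
        self._mismatch_since = None         # time the current mismatch began

    def check(self, audio_ts, video_ts, now):
        """Compare one pair of timestamps; return True if a request was sent."""
        if abs(audio_ts - video_ts) <= self._tolerance:
            # Consistent again: forget any mismatch that was in progress.
            self._mismatch_since = None
            return False
        if self._mismatch_since is None:
            self._mismatch_since = now
        if now - self._mismatch_since >= self._hold_seconds:
            self._send_sync_request()
            # Start a fresh window so requests are not sent on every check.
            self._mismatch_since = None
            return True
        return False
```

Passing `now` explicitly keeps the sketch deterministic; a real player would more likely read a monotonic clock at this point.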
The playing module outputs the decoded audio signal to be played and video signal to be played so that they are played. When the output of the audio signal to be played and the video signal to be played is out of sync, the playing module outputs them synchronously according to the synchronization mark. Specifically, the playing module stores whichever of the audio signal to be played and the video signal to be played carrying the synchronization mark arrives first in an output buffer and, after receiving the other signal carrying the synchronization mark, controls the two marked signals to be output synchronously. Because the synchronization mark is applied to the audio/video data by the marking module after it receives a synchronization request, the audio signal to be played and the video signal to be played that carry the synchronization mark come from the same audio/video data and need to be output synchronously. If the playing module first receives the audio signal to be played carrying the synchronization mark, the audio signal to be played is being output ahead of the video signal to be played; the audio signal is therefore stored in the output buffer, and when the video signal to be played carrying the synchronization mark is received, the two marked signals are output synchronously. If the playing module first receives the video signal to be played carrying the synchronization mark, the video signal to be played is being output ahead of the audio signal to be played; the video signal is therefore stored in the output buffer, and when the audio signal to be played carrying the synchronization mark is received, the two marked signals are output synchronously. In this way, the audio and video are synchronized.
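The buffering rule described above can be sketched as follows. It is a simplified model under stated assumptions: signals are opaque objects, the strings "audio" and "video" are hypothetical labels for the two streams, and returning a pair stands in for handing both signals to the output devices at the same moment.

```python
class PlayingModule:
    def __init__(self):
        # Output buffer: holds the marked signal that arrived first.
        self._held = {}

    def receive_marked(self, kind, signal):
        """Accept a signal carrying the synchronization mark.

        Returns None while waiting for the other stream, or an
        (audio_signal, video_signal) pair once both marked signals
        are available and can be output together.
        """
        if kind not in ("audio", "video"):
            raise ValueError("kind must be 'audio' or 'video'")
        other = "video" if kind == "audio" else "audio"
        if other in self._held:
            partner = self._held.pop(other)
            return (signal, partner) if kind == "audio" else (partner, signal)
        # The other stream is lagging: hold this signal until it catches up.
        # A second marked signal of the same kind replaces the held one.
        self._held[kind] = signal
        return None
```

The same code path covers both orders of arrival, so whichever stream runs ahead is the one that waits in the buffer.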
Embodiment 3
Corresponding to the above apparatus, the present invention provides an audio/video synchronization method based on the HLS protocol. Referring to Fig. 4, the method comprises:
After receiving a synchronization request, adding a synchronization mark to the first audio/video data obtained thereafter;
Demultiplexing the audio/video data into audio data and video data, and updating the timestamp of the audio data and/or the video data according to the synchronization mark;
Decoding the audio data and the video data, and outputting an audio signal to be played and a video signal to be played;
Detecting whether the timestamps carried in the audio signal to be played and the video signal to be played are consistent, and sending a synchronization request when they are inconsistent;
Controlling the audio signal to be played and the video signal to be played to be output synchronously according to the synchronization mark.
Further, the method also comprises:
Sending the synchronization request if the timestamps carried in the audio signal to be played and the video signal to be played remain inconsistent throughout a predetermined time. This avoids placing excessive load on the marking module through frequent requests. Preferably, the predetermined time may be 3 seconds.
Further, the process of updating the timestamp of the audio data and/or the video data comprises: correcting the timestamp of the video data according to the timestamp of the audio data.
Further, the process of updating the timestamp of the audio data and/or the video data comprises: correcting the timestamps of the audio data and the video data according to the timestamp of the audio/video data.
Specifically, in an exemplary implementation of the present invention, there are two modes for synchronously adjusting the timestamps of the audio data and the video data. The first mode corrects the timestamp of the video data according to the timestamp of the audio data; that is, the timestamp of the audio data is taken as the reference, and the timestamp of the video data is corrected to match it. The second mode applies when the timestamp of the audio data differs considerably from the timestamp of the audio/video data: the timestamps of the audio data and the video data are corrected according to the timestamp of the audio/video data, that is, both are set to the timestamp of the audio/video data, so as to keep the timestamps of the audio data and the video data relatively accurate.
Further, the process of controlling the audio signal to be played and the video signal to be played to be output synchronously according to the synchronization mark comprises: storing whichever of the audio signal to be played and the video signal to be played carrying the synchronization mark arrives first in an output buffer; and, after the other signal carrying the synchronization mark is received, controlling the audio signal to be played and the video signal to be played that carry the synchronization mark to be output synchronously.
Compared with the prior art, the present invention detects, after decoding and before playback, whether the timestamps carried in the audio signal to be played and the video signal to be played are consistent, and thereby whether the output of the two signals is synchronized; when the timestamps are inconsistent, a synchronization request is sent. On the one hand, before demultiplexing and after the synchronization request is received, a synchronization mark is added to the first audio/video data obtained thereafter; the decoded audio signal to be played and video signal to be played are then controlled to be output synchronously according to this synchronization mark, so that the audio and video output is synchronized. On the other hand, the timestamp of the audio data and/or the video data is updated according to this synchronization mark, so that the timestamps of the audio data and the video data are synchronized, which further ensures that subsequent output stays synchronized.
As can be seen from the above description, the present invention detects the timestamps of the audio signal to be played and the video signal to be played, and thereby detects whether their output is synchronized. When the output is out of sync, a synchronization mark is added to the audio/video data awaiting demultiplexing. According to this synchronization mark, the audio signal to be played and the video signal to be played are controlled to be output synchronously; at the same time, the timestamps of the audio data and the video data are updated during demultiplexing, so that the timestamps of subsequent audio and video signals to be played are synchronized and their output is in turn synchronized. The problem of picture or sound lag caused by persistent desynchronization between the audio signal to be played and the video signal to be played is therefore solved.
The foregoing are merely preferred embodiments of the present invention and are not intended to limit the present invention. Any modification, equivalent replacement, or improvement made within the spirit and principles of the present invention shall fall within the scope of protection of the present invention.