Detailed Description of the Embodiments
To make the purposes, technical solutions and advantages of the present invention clearer, the technical solutions in the embodiments of the present invention are described in further detail below with reference to the accompanying drawings. It should be understood that, throughout the drawings, the same or similar reference numerals denote the same or similar elements, or elements having the same or similar functions. The described embodiments are some, rather than all, of the embodiments of the present invention; where no conflict arises, the embodiments of the present application and the features in those embodiments may be combined with one another. All other embodiments obtained by those of ordinary skill in the art on the basis of the embodiments of the present invention, without creative effort, shall fall within the protection scope of the present invention.
Herein, "first", "second" and the like are used only to distinguish items from one another, and do not indicate their importance, order, or the like.
The division into modules, units or components herein is merely a division by logical function; other divisions are possible in an actual implementation, for example, multiple modules and/or units may be combined or integrated into another system. Modules, units and components described as separate parts may or may not be physically separate. A component shown as a unit may or may not be a physical unit; that is, it may be located in one specific place or distributed over network units. Some or all of the units may therefore be selected according to actual needs to implement the solution of an embodiment.
A first embodiment of the low-latency media communication method for live Internet video streaming provided by the present invention is described in detail below with reference to Fig. 1. This embodiment is mainly applied to live Internet video streaming. It evaluates the video content of the audio-video, carrying the live content, that the sender sends out, and adjusts the priorities of video transmission and audio transmission accordingly, so that important data is sent first when network communication is delayed or unstable. This reduces the interference caused by network delay and ensures, as far as possible, that the receiver is not interrupted and does not miss the key content of the live stream.
As shown in Fig. 1, the low-latency media communication method provided by this embodiment includes the following steps:
Step 100: judge whether the scene change degree between the currently acquired video frame image and the previous video frame image exceeds a preset scene change threshold.
A video frame is a single still image, and multiple video frames combine into a segment of video; the same holds for audio frames. While the audio-video is being encoded to obtain video frames and audio frames, a scene judgment mechanism can be applied to the video frames. Specifically, the image of the video frame Vn that currently needs to be sent to the receiver and the image of the previous video frame Vn-1 are analyzed to obtain the scene change degree Sn of the two adjacent frames, and it is judged whether Sn exceeds the preset scene change threshold St.
If the scene change degree Sn exceeds St, the scene of video frame Vn has changed significantly relative to video frame Vn-1. For example, if the picture of video frame Vn shows the audience below the stage at a concert while the picture of video frame Vn-1 shows the singer singing on stage, the scene change degree of the two frames exceeds the scene change threshold St, and the change is judged to be significant.
If the scene change degree Sn does not exceed St, the scene of video frame Vn has not changed significantly relative to video frame Vn-1. For example, if the pictures of video frame Vn and video frame Vn-1 both show the singer singing on stage, the scene change degree of the two frames does not exceed the scene change threshold St, and the change is judged not to be significant.
It should be noted that a segment of audio-video may contain many video frames and audio frames. Apart from the first video frame and the first audio frame, every subsequent video frame needs to undergo the above scene change comparison with the frame preceding it.
It should also be noted that the image compared against the current video frame image need not be the immediately preceding video frame image; it may be the N-th preceding video frame image with N > 1, for example the second or third preceding frame, that is, a frame-skipping comparison. The specific value of N can be set according to the actual situation, but should not be too large; otherwise a case in which the scene changes and then changes back within a short period may be missed.
Step 200: packetize the current video frame and the corresponding acquired audio frame separately and, according to the result of judging whether the scene change degree exceeds the preset scene change threshold, add a corresponding scene marker to each video data packet and each audio data packet.
After it has been judged whether the picture of video frame Vn has changed significantly relative to that of video frame Vn-1, video frame Vn and the corresponding audio frame An are each packetized. When data is transmitted over the Internet, it can be divided into multiple data packets that are sent in batches; after receiving the packets, the receiver reassembles them to obtain data that can be played in full.
Packetizing video frame Vn and audio frame An yields multiple video data packets and multiple audio data packets, and a scene marker is added to each video data packet and each audio data packet. The scene marker added depends on whether, before packetization, the scene change degree exceeded the preset scene change threshold. If the scene of video frame Vn has changed significantly relative to video frame Vn-1, the scene marker added to the data packets of video frame Vn differs from the one added to the data packets of video frame Vn-1. Take the above example, in which video frame Vn shows the audience below the stage at the concert and video frame Vn-1 shows the singer singing on stage: if the scene marker added to the data packets of video frame Vn-1 and audio frame An-1 is Sm, then the scene marker added to the data packets of video frame Vn and audio frame An is Sm+1. If, on the other hand, the pictures of video frame Vn and video frame Vn-1 show no significant change, for example both show the singer singing on stage, the scene marker added to the data packets of video frame Vn and audio frame An is the same as that of video frame Vn-1 and audio frame An-1, namely Sm.
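By way of illustration only, the packetization and scene-marker rule described above might be sketched in Python as follows. The names `Packet`, `packetize`, `next_scene_marker` and `reassemble`, the integer representation of the marker Sm, and the 4-byte payload size are assumptions introduced for this sketch; the embodiment does not prescribe them.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Packet:
    kind: str          # "video" or "audio" (hypothetical field names)
    frame_index: int   # n in Vn / An
    seq: int           # position of this packet within its frame
    scene_marker: int  # Sm, Sm+1, ... represented here as an integer
    payload: bytes


def next_scene_marker(previous_marker: int, scene_changed: bool) -> int:
    """Keep Sm for the same scene; move to Sm+1 after a significant change."""
    return previous_marker + 1 if scene_changed else previous_marker


def packetize(kind: str, frame_index: int, data: bytes,
              scene_marker: int, payload_size: int = 4) -> List[Packet]:
    """Split one frame into packets, each carrying the frame's scene marker."""
    chunks = [data[i:i + payload_size]
              for i in range(0, len(data), payload_size)] or [b""]
    return [Packet(kind, frame_index, seq, scene_marker, chunk)
            for seq, chunk in enumerate(chunks)]


def reassemble(packets: List[Packet]) -> bytes:
    """Receiver side: restore a frame from its packets in sequence order."""
    return b"".join(p.payload for p in sorted(packets, key=lambda p: p.seq))
```

In this sketch the audio frame An would be packetized with the same marker value as video frame Vn, so that both kinds of packet carry the scene information.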
It can be understood that the data packets may be UDP packets or TCP packets. UDP (User Datagram Protocol) is a connectionless protocol that provides a simple, transaction-oriented, unreliable message delivery service and supports network applications that need to transfer data between computers. UDP does not guarantee reliable delivery of the packets it transmits; it suits transfers of small amounts of data at a time and is relatively fast. TCP (Transmission Control Protocol) is a connection-oriented protocol that includes a dedicated delivery guarantee mechanism and ensures the order in which data is sent and received, but it consumes system resources and, under the same conditions, is slower than UDP.
Step 300: buffer the video data packets and audio data packets, adjust the transmission priorities of the video data packets and audio data packets according to the scene markers, and send the video data packets and audio data packets according to those priorities.
The sender buffers data in a send buffer pool. After video frame Vn and audio frame An have each been packetized, the video data packets are placed in the video send buffer of the send buffer pool and the audio data packets in the audio send buffer of the send buffer pool. The buffer pool holds data so that it is sent out once a certain amount has accumulated.
After the video data packets and audio data packets have been placed in the send buffer pool, the scene marker of the data packets of the current video frame is compared with the scene marker of the data packets of the previous video frame to judge whether the adjacent video frames belong to the same scene. Identical scene markers indicate that the current and previous video frames are in the same scene; different scene markers indicate different scenes. This determines whether the transmission-priority mode for the same scene or the mode for different scenes is applied.
It can be understood that, since scene markers are also added to the audio data packets, whether the scene is the same can also be determined by checking whether the scene marker of the data packets of the current audio frame is identical to that of the data packets of the previous audio frame.
When the current video frame and the previous video frame are in the same scene, the live picture of the current video frame is essentially the same as that of the previous video frame; for example, both frames show the singer singing on stage. The main content of the live stream at this point is the song being sung (audio) rather than the singer's appearance and the stage background (video), so the audio carries more of the live content. The transmission priority of the audio data packets is therefore adjusted to be higher than that of the video data packets; that is, when the picture of the current frame is judged to be in the same scene as that of the previous frame, the ratio of video data packets to audio data packets sent from the send buffer pool is tilted toward the audio data packets.
For example, at the initial moment the send buffer pool sends m data packets per millisecond, with an audio-to-video data packet ratio of 1:9. After the data packets of the first video frame and the first audio frame have been sent, the data packets of the second video frame and the second audio frame enter the send buffer pool, and the second video frame is detected to be in the same scene as the first. The audio-to-video data packet ratio in the send buffer pool is then adjusted to 2:8, so that, for the same number of packets received, the receiver gets a somewhat larger share of audio data packets.
If the network is in good condition, the receiver can still receive video data packets and audio data packets in step even though the share of audio data packets sent has increased. If the network suddenly deteriorates, for example becoming slow or unstable, then because the share of audio data packets sent has increased, the share of audio data packets that the receiver (the viewing side) receives also increases, and the audio data finishes buffering and starts playing before the video data. On the receiver's side, the picture in the concert live-stream window may stutter, but the singer's song stays as continuous as possible, ensuring as far as possible that the receiver is not interrupted and does not miss the main content of the concert.
When the current video frame and the previous video frame are in different scenes, the live picture of the current video frame differs considerably from that of the previous one; for example, the scene switches from the audience below the stage in the previous frame to the singer singing on stage in the current frame. The singer may be giving a striking performance while singing, so the image information may be more important than, or as important as, the sound information. The transmission priority of the video data packets is therefore adjusted to be higher than or equal to that of the audio data packets; that is, when the picture of the current frame is judged to be in a different scene from that of the previous frame, the ratio of video data packets to audio data packets sent from the send buffer pool is tilted toward the video data packets, or is restored to the initial ratio.
For example, after the data packets of the n-th video frame and the n-th audio frame have been sent, the data packets of the (n+1)-th video frame and the (n+1)-th audio frame enter the send buffer pool, and the scene of the (n+1)-th video frame is detected to differ from that of the n-th. The audio-to-video data packet ratio in the send buffer pool is then adjusted from the same-scene ratio of 2:8 to 0.5:9.5, so that, for the same number of packets received, the receiver gets a somewhat larger share of video data packets; alternatively, the ratio is restored to the initial 1:9 so that the two are sent in balance.
If the network is in good condition, the receiver can still receive video data packets and audio data packets in step even though the share of video data packets sent has increased relative to the initial ratio. If the network is in poor condition, then because the share of video data packets sent has increased, the share of video data packets that the receiver (the viewing side) receives also increases, and the video data finishes buffering and starts playing before the audio data. On the receiver's side, the sound in the live-stream window may stutter, but the singer's on-stage performance stays as continuous as possible, ensuring as far as possible that the receiver is not interrupted and does not miss the key content of the singer's live performance. When the audio-to-video data packet ratio in the send buffer pool is restored to the initial, normal ratio, the audio and video data packets that the receiver (the viewing side) receives are also more balanced; both picture and sound may stutter, but compared with the prioritized case, sound and picture are treated more evenly.
As the above shows, the transmission priority is in effect the bandwidth allocation for data transmission.
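The ratio adjustment in the examples above can be illustrated with a short Python sketch. The ratios 1:9, 2:8 and 0.5:9.5 come from the examples; the function names `send_ratio` and `split_budget`, the `restore_initial` flag, and the rounding rule are assumptions made for this sketch only.

```python
# Audio : video packet ratios taken from the examples in the text.
INITIAL_RATIO = (1.0, 9.0)
SAME_SCENE_RATIO = (2.0, 8.0)
DIFFERENT_SCENE_RATIO = (0.5, 9.5)


def send_ratio(current_marker, previous_marker, restore_initial=False):
    """Pick the audio:video ratio from the scene markers of adjacent frames."""
    if current_marker == previous_marker:
        return SAME_SCENE_RATIO          # same scene: tilt toward audio
    if restore_initial:
        return INITIAL_RATIO             # different scene: balanced sending
    return DIFFERENT_SCENE_RATIO         # different scene: tilt toward video


def split_budget(packets_per_tick, ratio):
    """Split the per-tick packet budget m into (audio, video) packet counts."""
    audio_share, video_share = ratio
    audio = round(packets_per_tick * audio_share / (audio_share + video_share))
    return audio, packets_per_tick - audio
```

With a budget of 100 packets per tick, the same-scene case yields 20 audio and 80 video packets, and the different-scene case yields 5 audio and 95 video packets, which shows the priority acting as a bandwidth split.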
Throughout the live stream, regardless of the quality of the network environment, the data packets in the send buffer pool are always judged to be in the same scene or in different scenes according to their scene markers, and are sent according to priority after the priority has been adjusted accordingly.
Step 400: the receiver buffers the received video data packets and audio data packets according to the scene markers, so as to play the audio-video data.
The receiver buffers data in a receive buffer pool. After the sender (the streaming side) sends the audio and video data packets according to the transmission priorities, the receiver (the viewing side) receives them and places them in the receive buffer pool, with audio data packets going into the audio receive buffer and video data packets into the video receive buffer. Once a certain amount of audio and video data has been buffered, the data packets are decoded according to the scene marker carried by each packet and certain other features to obtain video frames and audio frames, which are played frame by frame so that the receiver can watch and hear the concert, thereby realizing live Internet video streaming.
When the network is in good condition, the audio and video data packets sent by the sender are received quickly, so whatever the transmission priority of the packets, the receiver can buffer the audio and video in time and play them, and the receiver's picture and sound remain continuous.
When the network is in poor condition, the sender sends the audio and video data packets according to the transmission priorities, and the receiver buffers audio and video data packets at the same time. In the same scene, the audio data packets have the higher transmission priority and the audio data packets containing the important data are sent first; the receiver finishes buffering the audio data first, so the audio is played first, reducing the interference caused by network delay. In different scenes, the video data packets have the higher transmission priority and the video data packets containing the important data are sent first; the receiver finishes buffering the video data first, so the video is played first, likewise reducing the interference caused by network delay. When the network returns to good condition, the sender still sends data packets according to the transmission priorities determined from scene identification, while on the receiver's side the buffering speed increases because the network environment is good. Because the earlier poor network condition left sound and picture out of sync, the receiver can at this point discard the part of the lagging stream that was never played because it was not buffered in time, and play the subsequent audio-video content in sync starting from the playback point of the stream that is ahead.
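The resynchronization described above, in which the lagging stream skips forward to the playback point of the stream that is ahead, might look roughly like the following Python sketch. The function name `resync` and the use of millisecond playback positions are assumptions for illustration.

```python
def resync(audio_pos_ms: int, video_pos_ms: int):
    """Align both streams to the playback point of whichever is ahead.

    Returns (new_audio_pos, new_video_pos, skipped_ms, lagging_stream),
    where lagging_stream is "audio", "video", or None if already in sync.
    """
    lead = max(audio_pos_ms, video_pos_ms)
    if audio_pos_ms == video_pos_ms:
        return lead, lead, 0, None
    lagging = "video" if video_pos_ms < audio_pos_ms else "audio"
    # The unplayed part of the lagging stream is discarded, not replayed.
    skipped = lead - min(audio_pos_ms, video_pos_ms)
    return lead, lead, skipped, lagging
```

For instance, if audio has reached 12000 ms while video is stuck at 9500 ms, the sketch drops 2500 ms of video and resumes both streams at 12000 ms.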
In one embodiment, the video frames and audio frames acquired in step 100 are obtained by encoding audio-video recorded in real time.
Each video frame and audio frame serving as raw data is usually obtained by encoding an audio-video file. The audio-video file is recorded on site in real time by recording equipment such as a camera. When the concert begins, the recording side starts recording the picture and sound of the concert venue live; the file being recorded is encoded immediately to obtain video frames and audio frames, and the subsequent scene change judgment is carried out during encoding.
When scene changes between two frame images are judged in step 100, the judgment of the scene change between the current frame image and the previous frame image is carried out during encoding.
In one embodiment, the encoding mode of the audio-video is determined according to the network state.
The encoding mode includes parameters such as video bitrate, video resolution, video frame rate, audio bitrate and audio sampling rate. The video bitrate is the number of data bits transmitted per unit time during data transmission, that is, how much data is sampled per unit time, in kbps (kilobits per second). The more data is sampled per unit time, the higher the precision and the larger the data volume, and the closer the processed file is to the original file. The audio bitrate works on the same principle as the video bitrate, applied to audio.
Resolution is a parameter that measures the amount of data in an image, often expressed in ppi (pixels per inch). The higher the resolution, the clearer the video and the larger the data volume.
Frame rate is a measure of the number of frames displayed, in FPS (frames per second). The higher the frame rate, the smoother and more lifelike the picture; if the frame rate is too low, the picture stutters.
The sampling rate is the number of signal samples taken per second. The higher the sampling frequency, the more sample data is obtained per unit time, the more accurately the signal waveform is represented, the more faithfully the sound quality and tone of the audio are reproduced, and the larger the data volume.
It can be understood that when the network is in poor condition, the bitrate, resolution, frame rate, sampling rate and so on used for encoding can be set relatively low to compress and reduce the data volume, so that the receiver can still receive the live content accurately while the demand on network speed is lowered, ensuring normal data transmission. When the network is in good condition, the bitrate, resolution, frame rate, sampling rate and so on used for encoding can be set to normal or higher values to improve the viewing experience of the viewing side.
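As a rough illustration of choosing an encoding mode from the network state, the Python sketch below selects one of three parameter sets from a measured bandwidth. Every number, the 1.5x headroom factor, the use of pixel width and height for resolution, and the names `EncodingProfile` and `choose_profile` are hypothetical values chosen for the sketch, not values given by the embodiment.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EncodingProfile:
    video_bitrate_kbps: int
    width: int
    height: int
    frame_rate_fps: int
    audio_bitrate_kbps: int
    audio_sample_rate_hz: int


# Hypothetical parameter sets, ordered from cheapest to most demanding.
LOW = EncodingProfile(500, 640, 360, 15, 48, 22050)
NORMAL = EncodingProfile(1500, 1280, 720, 25, 96, 44100)
HIGH = EncodingProfile(4000, 1920, 1080, 30, 128, 48000)


def choose_profile(measured_kbps: float, headroom: float = 1.5) -> EncodingProfile:
    """Return the most demanding profile the measured bandwidth can carry."""
    for profile in (HIGH, NORMAL, LOW):
        needed = profile.video_bitrate_kbps + profile.audio_bitrate_kbps
        if measured_kbps >= headroom * needed:
            return profile
    return LOW  # poor network: fall back to the smallest data volume
```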
In one embodiment, whether the scene change degree exceeds the preset scene change threshold is judged by the frame difference method and/or the background subtraction method.
When judging in step 100 whether there is a scene change, the frame difference method may be used to judge whether the two video frame images show a scene change, or the background subtraction method may be used; the two methods may also be used together as a two-layer judgment to further improve the accuracy of the result.
The frame difference method subtracts the pixel values of two images in the video stream that are adjacent or separated by a few frames to obtain the absolute value of the brightness difference between the two images, and determines whether the scene has changed significantly between the two frames by judging whether this absolute value is greater than a preset threshold.
The background subtraction method detects moving objects by comparing the current frame in an image sequence with a reference background model. A difference operation between the currently acquired image frame and the background image yields a grayscale image of the moving target region; the grayscale image is thresholded and its absolute value taken, and whether the scene has changed significantly between the two frames is determined by judging whether this absolute value is greater than a preset threshold. The background image is updated according to the currently acquired image frame.
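The background subtraction method can be sketched in Python in a similar way. The running-average background update, the per-pixel threshold, and the use of the fraction of foreground pixels as the quantity compared with the scene threshold are all assumptions of this sketch; the text does not fix how the background model is updated or how the thresholded image is reduced to a single decision.

```python
def background_subtract(frame, background, pixel_threshold):
    """Binary mask: 1 where |frame - background| exceeds pixel_threshold."""
    return [[1 if abs(p - b) > pixel_threshold else 0
             for p, b in zip(frame_row, bg_row)]
            for frame_row, bg_row in zip(frame, background)]


def foreground_ratio(mask):
    """Fraction of pixels marked as differing from the background."""
    cells = [v for row in mask for v in row]
    return sum(cells) / len(cells)


def update_background(background, frame, alpha=0.1):
    """Blend the current frame into the background model (running average)."""
    return [[(1 - alpha) * b + alpha * p
             for p, b in zip(frame_row, bg_row)]
            for frame_row, bg_row in zip(frame, background)]
```

A caller would compare `foreground_ratio(...)` with a preset scene threshold and then call `update_background` so that the background follows the currently acquired frame.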
In one embodiment, the buffer size of the receiver is variable and is adjusted according to the scene markers.
The receiver includes a receive buffer pool. The sizes of the video receive buffer and the audio receive buffer of the receive buffer pool may be fixed or variable. If they are set to be variable, the basis for the change is the scene markers added to the data packets in step 200.
While the received audio and video data packets are being buffered and decoded, the scene markers they carry can be checked. If the scene markers carried by the packets are identical, the pictured scene of the two video frames has not changed significantly; the buffer capacity of the receive buffer pool can then be reduced appropriately, which amounts to shortening the buffering time of the audio-video data so that data already received is played as soon as possible. When the network is in poor condition, the audio data packets take up relatively little space and somewhat lower video picture quality can be tolerated within the same scene; the buffer capacity of the receive buffer pool is therefore reduced appropriately so that the audio content is played as soon as possible, reducing stuttering during the live stream.
If the scene markers carried by the packets are detected to differ, the pictured scene of the two video frames has changed significantly; the buffer capacity of the receive buffer pool can then be increased appropriately, which amounts to lengthening the buffering time of the audio-video data so that received audio-video data is played only after a stretch of it has been buffered. When the network is in poor condition, the video data packets take up relatively much space, so playback should begin only after a certain duration of video has been buffered. The buffer capacity of the receive buffer pool is therefore increased appropriately to prevent playback from starting as soon as a short stretch, for example less than one second of video, has been buffered, thereby reducing the sense of stuttering during the live stream.
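The buffer adjustment described above might be sketched in Python as follows. Expressing the capacity as a buffering time in milliseconds, the 200 ms step, and the lower and upper bounds are assumptions of the sketch; the embodiment only says that the capacity is reduced or increased "appropriately".

```python
def adjust_buffer_capacity(capacity_ms, current_marker, previous_marker,
                           step_ms=200, min_ms=200, max_ms=3000):
    """Shrink the receive buffer in the same scene, grow it on a scene change."""
    if current_marker == previous_marker:
        # Same scene: play received data sooner.
        return max(min_ms, capacity_ms - step_ms)
    # Different scene: buffer a longer stretch of video before playing.
    return min(max_ms, capacity_ms + step_ms)
```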
In one embodiment, after sending a data packet the sender waits for a confirmation signal fed back by the receiver and deletes the corresponding data packet from the buffer after receiving the confirmation signal; if the sender does not receive the confirmation signal within a set time, it automatically resends the data packet.
After the sender's send buffer pool sends a data packet, the receiver, on receiving it, feeds back a confirmation signal to the sender to indicate that it has received the packet, and the sender can then delete that packet from the send buffer pool. If the packet is lost, however, the sender, having received no confirmation signal within the set time, determines that the packet it sent has been lost and automatically resends it, trying again to get the packet to the receiver.
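The confirm-or-resend behavior can be illustrated with a small in-memory Python sketch. The class name `SendBuffer`, the `wire` list standing in for the network, and the explicit `now` argument used in place of a real clock are all assumptions made so that the example is self-contained.

```python
class SendBuffer:
    """Keeps each sent packet until it is confirmed; resends after a timeout."""

    def __init__(self, timeout_s):
        self.timeout_s = timeout_s
        self.pending = {}   # packet_id -> (payload, time of last send)
        self.wire = []      # stand-in for the network: every transmission

    def send(self, packet_id, payload, now):
        self.pending[packet_id] = (payload, now)
        self.wire.append((packet_id, payload))

    def on_ack(self, packet_id):
        """Confirmation received: the packet can be deleted from the buffer."""
        self.pending.pop(packet_id, None)

    def resend_expired(self, now):
        """Resend every packet whose confirmation is overdue; return their ids."""
        resent = []
        for packet_id, (payload, sent_at) in list(self.pending.items()):
            if now - sent_at > self.timeout_s:
                self.send(packet_id, payload, now)
                resent.append(packet_id)
        return resent
```

A sender loop would call `resend_expired` periodically with the current time; packets that were confirmed in the meantime are no longer in `pending` and are therefore never resent.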
A second embodiment of the low-latency media communication method for live Internet video streaming provided by the present invention is described in detail below with reference to Fig. 2. This embodiment evaluates the video content of the audio-video, carrying the live content, that the sender sends out, and adjusts the priorities of video transmission and audio transmission accordingly, so that important data is sent first when network communication is delayed or unstable, reducing the interference caused by network delay and ensuring as far as possible that the receiver is not interrupted and does not miss the key content of the live stream. In addition, by adding time identifiers to the audio-video data, it limits how long the data packets whose sending is deferred may be delayed, preventing low-priority data packets from falling too far behind.
As shown in Fig. 2, the low-latency media communication method provided by this embodiment includes the following steps:
Step 100: judge whether the scene change degree between the currently acquired video frame image and the previous video frame image exceeds a preset scene change threshold, and add a corresponding time identifier to each video frame and each audio frame.
While the audio-video is being encoded to obtain video frames and audio frames, a scene judgment mechanism can be applied to the video frames. Specifically, the image of the video frame Vn that currently needs to be sent to the receiver and the image of the previous video frame Vn-1 are analyzed to obtain the scene change degree Sn of the two adjacent frames, and it is judged whether Sn exceeds the preset scene change threshold St. If Sn exceeds St, the scene of video frame Vn has changed significantly relative to video frame Vn-1. If Sn does not exceed St, the scene of video frame Vn has not changed significantly relative to video frame Vn-1.
It should be noted that the image compared against the current video frame image need not be the immediately preceding video frame image; it may be the N-th preceding video frame image with N > 1, for example the second or third preceding frame, that is, a frame-skipping comparison. The specific value of N can be set according to the actual situation, but should not be too large; otherwise a case in which the scene changes and then changes back within a short period may be missed.
Since each video frame is acquired at a different time, a time identifier is added to each video frame according to the acquisition time, a clock signal, or the like, and the moment contained in the time identifier differs from one video frame to the next. Each audio frame has the same time identifier as the video frame it is synchronized with, so the moments contained in the time identifiers of the audio frames also differ from one another. Specifically, the time identifier may be a timestamp. A timestamp is complete, verifiable data showing that a piece of data already existed before a particular time; it is usually a character string that uniquely identifies a moment in time.
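The rule that synchronized video and audio frames share one time identifier can be sketched in Python as follows. The `Frame` class, the millisecond integer timestamp, and the derivation of the timestamp from a start time and a fixed frame interval are assumptions of the sketch; the text allows the identifier to come from the acquisition time, a clock signal, or the like.

```python
from dataclasses import dataclass


@dataclass
class Frame:
    kind: str               # "video" or "audio"
    index: int              # n in Vn / An
    timestamp_ms: int = -1  # time identifier; -1 means not yet stamped


def stamp_frames(video_frames, audio_frames, start_ms, frame_interval_ms):
    """Give each video frame a distinct timestamp and copy it to its audio frame."""
    for video, audio in zip(video_frames, audio_frames):
        ts = start_ms + video.index * frame_interval_ms
        video.timestamp_ms = ts
        audio.timestamp_ms = ts  # synchronized frames share the identifier
```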
It can be understood that the time identifiers may also be added to the video frames and audio frames first, with the judgment of whether the scene change degree of the two video frame images exceeds the scene change threshold made afterwards; there is no mandatory order between the two.
Step 200: packetize the current video frame and the corresponding acquired audio frame separately and, according to the result of judging whether the scene change degree exceeds the preset scene change threshold, add a corresponding scene marker to each video data packet and each audio data packet.
After it has been judged whether the picture of video frame Vn has changed significantly relative to that of video frame Vn-1, video frame Vn and the corresponding audio frame An are each packetized.
Packetizing video frame Vn and audio frame An yields multiple video data packets and multiple audio data packets, and a scene marker is added to each video data packet and each audio data packet. If the scene of video frame Vn has changed significantly relative to video frame Vn-1, the scene marker added to the data packets of video frame Vn differs from the one added to the data packets of video frame Vn-1; if the scene of video frame Vn has not changed significantly relative to video frame Vn-1, the same scene marker is added to the data packets of video frame Vn and of video frame Vn-1. The scene marker of an audio frame follows that of the video frame: a video frame and an audio frame of the same moment have the same scene marker.
It should be noted that after a video frame or audio frame carrying a time identifier is packetized, each resulting data packet still carries the time identifier.
Step 300: buffer the video data packets and audio data packets, adjust the transmission priorities of the video data packets and audio data packets according to the scene markers, and send the video data packets and audio data packets according to those priorities.
After video frame Vn and audio frame An have each been packetized, the video data packets are placed in the video send buffer of the send buffer pool and the audio data packets in the audio send buffer of the send buffer pool.
After the video data packets and audio data packets have been placed in the send buffer pool, the scene marker of the data packets of the current video frame is compared with the scene marker of the data packets of the previous video frame. Identical scene markers indicate that the video frame to which the current video data packet belongs and the video frame to which the previous video data packet belongs are in the same scene; different scene markers indicate different scenes.
When the current video frame and the previous video frame are in the same scene, the transmission priority of the audio data packets is adjusted to be higher than that of the video data packets, so that the audio data packets take an appropriately larger share of the bandwidth allocated for sending data. Because the share of audio data packets sent increases, the share of audio data packets that the receiver (the viewing side) receives also increases, and the audio data finishes buffering and starts playing before the video data. During a live stream, for example of a concert, the singer's song can thus stay as continuous as possible, ensuring as far as possible that the receiver is not interrupted and does not miss the key content of the performance.
When the current video frame and the previous video frame belong to different scenes, the transmission priority of the video data packets is adjusted to be greater than or equal to that of the audio data packets, so that the video data packets occupy a suitably larger share of the bandwidth allocated for sending data. Because the share of video data packets sent increases, the share of video data packets received by the recipient (the viewing side) also increases, and the video data can finish buffering and be played ahead of the audio data. The picture of the concert can therefore remain as continuous as possible, ensuring as far as possible that the recipient experiences no interruption and does not miss the key content of the performance.
Throughout the live broadcast, regardless of network quality, the data packets in the send buffer pool are always judged as same-scene or different-scene according to their scene markers, their priorities are readjusted accordingly, and they are then sent out according to priority.
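The priority adjustment of step 300 can be illustrated with a small scheduling sketch. The text does not specify how the bandwidth share is computed, so the 70% share, the per-round packet budget, and the function names `favored_kind` and `drain` are assumptions made only for this example.

```python
from collections import deque


def favored_kind(current_marker: int, previous_marker: int) -> str:
    """Same scene favors audio; a scene change favors video."""
    return "audio" if current_marker == previous_marker else "video"


def drain(video_q: deque, audio_q: deque, favored: str,
          budget: int, favored_percent: int = 70) -> list:
    """Take up to `budget` packets, giving the favored queue the larger share.

    The favored queue is served first up to its quota, the other queue fills
    the remainder, and any leftover budget goes back to the favored queue.
    """
    queues = {"video": video_q, "audio": audio_q}
    other = "audio" if favored == "video" else "video"
    quota = budget * favored_percent // 100
    sent = []
    while len(sent) < quota and queues[favored]:
        sent.append(queues[favored].popleft())
    while len(sent) < budget and queues[other]:
        sent.append(queues[other].popleft())
    while len(sent) < budget and queues[favored]:
        sent.append(queues[favored].popleft())
    return sent


# Hypothetical usage: ten packets of each kind are waiting and the scene is unchanged.
video_q = deque(f"v{i}" for i in range(10))
audio_q = deque(f"a{i}" for i in range(10))
batch = drain(video_q, audio_q, favored_kind(3, 3), budget=10)
```

A fixed percentage is only one way to realize a "suitably larger" bandwidth share; a weighted or strict-priority queue would fit the description equally well.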
Step 400: the recipient buffers the received video data packets and audio data packets according to the scene markers in order to play the audio and video data, and, according to the time identifier carried by each data packet and a preset duration threshold, limits how far playback of the data with the higher transmission priority may run ahead of playback of the data with the lower transmission priority.
The sender sends out the data that has finished buffering in the send buffer pool. The recipient receives the video data packets and audio data packets sent by the sender, and identifies and buffers them according to the scene markers they carry and data such as the packet headers. After receiving the data packets, the receive buffer pool also compares the scene markers of the data packets of the current video frame with those of the data packets of the previous video frame to judge whether adjacent video frames belong to the same scene: identical scene markers indicate that the current video frame and the previous video frame are the same scene, and different scene markers indicate different scenes. After data buffering is complete, the time difference between audio and video playback must be checked to prevent either the audio or the video from running too far ahead of the other. After buffering, the audio and video content is decoded and played, achieving the live video broadcast.
When the sender preferentially sends the data packets with the higher priority, for example when the sender keeps sending audio data packets preferentially, the audio played by the recipient will run ahead of the video if network conditions are poor. If network conditions remain poor for a long time, then to keep the played audio from running too far ahead of the video, the time identifiers of the audio data packets must also be checked before the audio is played. If the time identifier of the audio data packet about to be played in the receive buffer pool is Tn and the time identifier of the video data packet is also Tn, the audio and video data packets are played simultaneously and there is no time difference between audio and video.
If the time identifier of the audio data packet about to be played in the receive buffer pool is Tn+10 while the time identifier of the video data packet is Tn, the audio leads the video by 10 units, which is equivalent to 10 frames. The number of leading frames must then be compared with the preset duration threshold. If the duration threshold is 20 frames, the number of leading frames is below the threshold, so the audio is allowed to continue playing ahead, while the video, whose data packets are not being delivered smoothly over the network, can only play with stuttering. If the duration threshold is 10 frames, the number of leading frames has reached the limit: the audio can no longer keep playing ahead, and the time difference between audio and video must not be widened further. Instead, the portion of the video that lags behind the audio is discarded without being played, and once the video data has buffered to Tn+10, audio and video are played together starting from the position whose time identifier is Tn+10.
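The comparison between the lead and the duration threshold can be written out as a small decision function. The function name `limit_lead` and the returned action labels are hypothetical; the numbers in the usage lines reuse the Tn, Tn+10, 20-frame and 10-frame values from the example above, with Tn taken as 100 purely for illustration.

```python
def limit_lead(leading_time_id: int, lagging_time_id: int,
               duration_threshold: int):
    """Decide how to play when one stream runs ahead of the other.

    Assumes leading_time_id >= lagging_time_id and that time identifiers and
    the threshold are expressed in the same unit (frames in this sketch).
    Returns an action label and the time identifier at which the lagging
    stream should play next.
    """
    lead = leading_time_id - lagging_time_id
    if lead == 0:
        # Both streams are at the same position: play them together.
        return ("play_in_sync", lagging_time_id)
    if lead < duration_threshold:
        # Lead is still within the limit: the leading stream may keep going.
        return ("continue_leading", lagging_time_id)
    # Limit reached: drop the lagging content and resume both streams
    # from the leading position once it has been buffered.
    return ("skip_lagging_to", leading_time_id)


Tn = 100  # arbitrary illustrative value
within_limit = limit_lead(Tn + 10, Tn, duration_threshold=20)
limit_reached = limit_lead(Tn + 10, Tn, duration_threshold=10)
```

With these inputs, `within_limit` is `("continue_leading", 100)` and `limit_reached` is `("skip_lagging_to", 110)`.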
If, while the audio is playing ahead as described above, the recipient detects that the scene markers of the received audio and video data packets have changed, the pictured scene of that frame differs from the previous frame, and video playback may then catch up on part of the audio's lead. If the scene subsequently stops changing, the audio may again extend its lead over the video.
If the audio leads the video by less than the duration threshold and the network state improves from poor to good, so that both audio and video data packets are received normally and buffered smoothly, the portion of the video that lags behind the audio can be discarded and the audio and video data with identical time identifiers played simultaneously; the time difference between audio and video is then zero.
In one embodiment, the duration threshold is variable and is adjusted according to the scene markers.
The duration threshold used to judge whether the lead of the data packets being played ahead exceeds the set value is adjustable; specifically, it is adjusted according to the scene markers of the received data packets.
If the scene markers of the current video frame's data packets received by the receive buffer pool are identical to the scene markers of the previous video frame's data packets, i.e., no scene change has occurred, the sender will send more audio data, so the duration threshold can be suitably increased, allowing the audio to play further ahead when the receive buffer pool has not received a complete video frame for some time.
If the scene markers of the current video frame's data packets received by the receive buffer pool differ from the scene markers of the previous video frame's data packets, i.e., a scene change has occurred, the sender will send more video data, so the duration threshold can be suitably reduced, so that the audio does not run too far ahead of the video and audio and video play as synchronously as possible.
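A minimal sketch of this adjustment follows. The step size and the lower and upper bounds are assumed values, since the text only says the threshold becomes "suitably" larger or smaller, and `adjust_duration_threshold` is a hypothetical name.

```python
def adjust_duration_threshold(threshold: int, current_marker: int,
                              previous_marker: int, step: int = 5,
                              lower: int = 5, upper: int = 30) -> int:
    """Raise the threshold while the scene is unchanged, lower it on a change.

    The result is clamped to [lower, upper] so that repeated adjustments
    cannot let the lead grow without bound or shrink to nothing.
    """
    if current_marker == previous_marker:
        return min(upper, threshold + step)  # same scene: allow a longer lead
    return max(lower, threshold - step)      # scene change: keep streams closer


# Hypothetical usage starting from a threshold of 10 frames.
same_scene = adjust_duration_threshold(10, current_marker=2, previous_marker=2)
new_scene = adjust_duration_threshold(10, current_marker=3, previous_marker=2)
```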
In one embodiment, the video frames and audio frames obtained in step 100 are obtained by encoding audio and video recorded in real time.
The video frames and audio frames that serve as the initial data are usually obtained by encoding an audio-video file. The audio-video file is recorded in real time on site by recording equipment such as a camera.
In one embodiment, the encoding mode of the audio and video is determined according to the network state.
The encoding mode includes parameters such as video bit rate, video resolution, video frame rate, audio bit rate and audio sampling rate. It will be understood that when the network state is poor, the bit rate, resolution, frame rate, sampling rate and so on used for encoding can be set relatively low to compress the data and reduce its volume, lowering the network speed required of the recipient while still allowing the live content to be received accurately, and thereby ensuring normal data transmission. When the network state is good, the bit rate, resolution, frame rate, sampling rate and so on used for encoding can be set to normal or higher values to improve the viewing experience of the viewing side.
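One way to picture this selection is a lookup from a measured network state to a set of encoding parameters. The description does not say how the network state is measured or which values are used, so the bandwidth estimate in kbit/s, the two cut-off points, and every number in the three profiles below are illustrative assumptions.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class EncodingProfile:
    video_bitrate_kbps: int
    resolution: Tuple[int, int]
    frame_rate: int
    audio_bitrate_kbps: int
    audio_sample_rate_hz: int


# Hypothetical profiles; real deployments would tune these values.
LOW = EncodingProfile(500, (640, 360), 15, 48, 22050)
NORMAL = EncodingProfile(1500, (1280, 720), 25, 96, 44100)
HIGH = EncodingProfile(4000, (1920, 1080), 30, 128, 48000)


def choose_profile(estimated_bandwidth_kbps: float) -> EncodingProfile:
    """Pick lower parameters on a poor network and higher ones on a good one."""
    if estimated_bandwidth_kbps < 1000:
        return LOW
    if estimated_bandwidth_kbps < 5000:
        return NORMAL
    return HIGH
```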
In one embodiment, the frame difference method and/or the background subtraction method is used to judge whether the degree of scene change exceeds the preset scene change threshold.
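The frame difference variant can be sketched in a few lines. This example covers frame differencing only, not background subtraction, and it assumes frames are given as equally sized 2D lists of 8-bit grayscale values. The per-pixel delta of 25 and the change threshold of 0.3 are assumed values, and `scene_changed` is a hypothetical name.

```python
def scene_changed(previous_frame, current_frame,
                  pixel_delta: int = 25, change_threshold: float = 0.3) -> bool:
    """Frame differencing: report a scene change when the fraction of pixels
    whose brightness changed by more than `pixel_delta` exceeds
    `change_threshold`."""
    total = 0
    changed = 0
    for prev_row, curr_row in zip(previous_frame, current_frame):
        for prev_px, curr_px in zip(prev_row, curr_row):
            total += 1
            if abs(prev_px - curr_px) > pixel_delta:
                changed += 1
    return total > 0 and changed / total > change_threshold


# Hypothetical 4x4 frames: a dark frame, a slightly noisier copy, and a bright frame.
dark = [[10] * 4 for _ in range(4)]
noisy = [[12] * 4 for _ in range(4)]
bright = [[200] * 4 for _ in range(4)]
```

Here `scene_changed(dark, noisy)` is `False` and `scene_changed(dark, bright)` is `True`; the result would determine whether the next frame's packets reuse the previous scene marker or receive a new one.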
In one embodiment, the buffer size of the recipient is variable and is adjusted according to the scene markers.
While the received audio and video data packets are being buffered and decoded, the scene markers they carry can be detected. If the detected scene markers are identical, the pictured scene of the two video frames has not changed significantly, and the buffer capacity of the receive buffer pool can be suitably reduced. When network conditions are poor, suitably reducing the buffer capacity of the receive buffer pool lets the audio content be played as soon as possible and reduces stuttering during the live broadcast.
If the detected scene markers differ, the pictured scene of the two video frames has changed significantly, and the buffer capacity of the receive buffer pool can be suitably increased. When network conditions are poor, suitably increasing the buffer capacity of the receive buffer pool prevents playback from starting as soon as only a short stretch of video, for example less than one second, has been buffered, thereby reducing the sense of stuttering during the live broadcast.
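The effect of a variable receive buffer on when playback starts can be sketched as below. Expressing capacity in milliseconds of buffered media, the 200 ms step, and the 400 ms to 3000 ms bounds are assumptions for illustration; the class and method names are hypothetical.

```python
class ReceiveBufferPool:
    """Receive buffer whose required fill level follows the scene markers."""

    def __init__(self, capacity_ms: int = 1000, min_ms: int = 400,
                 max_ms: int = 3000, step_ms: int = 200):
        self.capacity_ms = capacity_ms
        self.min_ms = min_ms
        self.max_ms = max_ms
        self.step_ms = step_ms

    def on_scene_markers(self, current_marker: int, previous_marker: int) -> int:
        """Shrink the buffer for an unchanged scene, grow it on a scene change."""
        if current_marker == previous_marker:
            self.capacity_ms = max(self.min_ms, self.capacity_ms - self.step_ms)
        else:
            self.capacity_ms = min(self.max_ms, self.capacity_ms + self.step_ms)
        return self.capacity_ms

    def ready_to_play(self, buffered_ms: int) -> bool:
        """Playback may start once the buffered media reaches the capacity."""
        return buffered_ms >= self.capacity_ms


pool = ReceiveBufferPool()
after_same_scene = pool.on_scene_markers(1, 1)   # capacity shrinks
can_play_early = pool.ready_to_play(800)
after_scene_change = pool.on_scene_markers(2, 1)  # capacity grows again
must_wait = not pool.ready_to_play(800)
```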
In one embodiment, after sending out a data packet, the sender waits for the confirmation signal fed back by the recipient and deletes the corresponding data packet from the buffer area once the confirmation signal is received; if the sender has not received the confirmation signal after a set time, it automatically resends the data packet.
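The confirm-and-resend behavior can be sketched with a buffer that keeps each packet until it is acknowledged. The class name, the use of sequence numbers as keys, and passing the current time in explicitly (which keeps the example deterministic) are assumptions, and the actual network send is left out.

```python
class RetransmitBuffer:
    """Keeps sent packets until confirmed and reports those due for resending."""

    def __init__(self, timeout: float):
        self.timeout = timeout
        self.pending = {}  # seq -> (packet, time of the most recent send)

    def send(self, seq: int, packet: bytes, now: float) -> bytes:
        """Record the packet as awaiting confirmation and hand it to the network."""
        self.pending[seq] = (packet, now)
        return packet

    def acknowledge(self, seq: int) -> None:
        """Delete the packet from the buffer once its confirmation arrives."""
        self.pending.pop(seq, None)

    def due_for_resend(self, now: float) -> list:
        """Return sequence numbers whose confirmation is overdue, restarting
        their timers as if they had just been resent."""
        overdue = []
        for seq, (packet, sent_at) in list(self.pending.items()):
            if now - sent_at > self.timeout:
                self.pending[seq] = (packet, now)
                overdue.append(seq)
        return overdue


# Hypothetical usage with a 0.5-second timeout.
buf = RetransmitBuffer(timeout=0.5)
buf.send(1, b"audio", now=0.0)
buf.send(2, b"video", now=0.0)
buf.acknowledge(1)
early = buf.due_for_resend(now=0.4)
late = buf.due_for_resend(now=0.6)
```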
The first embodiment of the low-delay media communication system for internet live video streaming provided by the present invention is described in detail below with reference to Fig. 3. This embodiment is a system for implementing the first embodiment of the foregoing method. The system is mainly used for internet live video streaming and can judge the video content in the audio and video containing live content sent by the sender, so as to adjust the priorities of video transmission and audio transmission. Important data is thus sent preferentially when network communication is delayed or unstable, reducing the interference caused by network delay and ensuring as far as possible that the recipient experiences no interruption and does not miss the key content of the live broadcast.
As shown in Fig. 3, the low-delay media communication system provided in this embodiment includes:
A scene judgment module, configured to judge whether the degree of scene change between the obtained current video frame image and the previous video frame image exceeds a preset scene change threshold. If the scene change threshold is exceeded, the two frames are determined to be different scenes; otherwise, the two frames are determined to be the same scene.
A data packetization module, configured to packetize, separately, the current video frame judged by the scene judgment module and the corresponding obtained audio frame.
A scene marking module, connected to the scene judgment module and configured to add, after packetization, corresponding scene markers to each video data packet and audio data packet according to the judgment of whether the degree of scene change exceeds the preset scene change threshold. If the scene judgment module determines that the two frames are the same scene, identical scene markers are added to the data packets; if the scene judgment module determines that the two frames are different scenes, different scene markers are added to the data packets.
A send buffer module, configured to buffer the video data packets and audio data packets after the corresponding scene markers have been added, adjust the transmission priorities of the video data packets and audio data packets according to the scene markers, and send out the video data packets and audio data packets according to the transmission priorities. The send buffer module includes a send buffer pool for buffering the video data packets and audio data packets. If the scene markers of the two frames are identical, the audio data is determined to have the higher priority; the share of audio data in the transmission bandwidth of the send buffer module then increases, and the quantity of audio data packets sent increases correspondingly. If the scene markers of the two frames differ, the video data is determined to have the higher priority; the share of video data in the transmission bandwidth of the send buffer module then increases, and the quantity of video data packets sent increases correspondingly.
A receive buffer module, connected to the playback end of the recipient and configured to receive the video data packets and audio data packets sent by the send buffer module and buffer them according to the scene markers in order to play the audio and video data. The receive buffer module includes a receive buffer pool for buffering the video data packets and audio data packets. If the audio data packets have the higher priority, the receive buffer module can finish buffering the audio first and play it ahead, so that the audio is played preferentially when network conditions are poor and the scene is unchanged. If the video data packets have the higher priority, the receive buffer module can finish buffering the video first and play it ahead, so that the video is played preferentially when network conditions are poor and the scene has changed.
In Fig. 3, the sending side is above the dotted line and the receiving side is below it.
In one embodiment, the system further includes a recording and encoding module, connected to video recording equipment and configured to record audio and video in real time and encode it to obtain the video frames and audio frames.
In one embodiment, the scene judgment module includes a first judging unit and/or a second judging unit. The first judging unit judges by the frame difference method whether the degree of scene change exceeds the preset scene change threshold; the second judging unit judges by the background subtraction method whether the degree of scene change exceeds the preset scene change threshold.
In one embodiment, the send buffer module includes a first adjusting unit, configured to adjust the buffer size of the send buffer module according to the scene markers. The first adjusting unit is connected to the send buffer pool. When two video frames are the same scene, the first adjusting unit can suitably reduce the buffer size of the send buffer module; when two video frames are different scenes, the first adjusting unit can suitably increase the buffer size of the send buffer module.
The second embodiment of the low-delay media communication system for internet live video streaming provided by the present invention is described in detail below with reference to Fig. 4. This embodiment is a system for implementing the second embodiment of the foregoing method. The system can judge the video content in the audio and video containing live content sent by the sender, so as to adjust the priorities of video transmission and audio transmission. Important data is thus sent preferentially when network communication is delayed or unstable, reducing the interference caused by network delay and ensuring as far as possible that the recipient experiences no interruption and does not miss the key content of the live broadcast. In addition, time identifiers are added to the audio and video data to limit how long the data packets sent later may lag, preventing the low-priority data packets from falling too far behind.
As shown in Fig. 4, the low-delay media communication system provided in this embodiment includes:
A scene judgment module, configured to judge whether the degree of scene change between the obtained current video frame image and the previous video frame image exceeds a preset scene change threshold. If the scene change threshold is exceeded, the two frames are determined to be different scenes; otherwise, the two frames are determined to be the same scene.
A time identifier module, configured to add a corresponding time identifier to each video frame and audio frame before packetization; a timestamp can be used as the time identifier.
A data packetization module, configured to packetize, separately, the current video frame judged by the scene judgment module and the corresponding obtained audio frame.
A scene marking module, connected to the scene judgment module and configured to add, after packetization, corresponding scene markers to each video data packet and audio data packet according to the judgment of whether the degree of scene change exceeds the preset scene change threshold. If the scene judgment module determines that the two frames are the same scene, identical scene markers are added to the data packets; if the scene judgment module determines that the two frames are different scenes, different scene markers are added to the data packets.
A send buffer module, configured to buffer the video data packets and audio data packets after the corresponding scene markers have been added, adjust the transmission priorities of the video data packets and audio data packets according to the scene markers, and send out the video data packets and audio data packets according to the transmission priorities. The send buffer module includes a send buffer pool for buffering the video data packets and audio data packets. If the scene markers of the two frames are identical, the audio data is determined to have the higher priority; the share of audio data in the transmission bandwidth of the send buffer module then increases, and the quantity of audio data packets sent increases correspondingly. If the scene markers of the two frames differ, the video data is determined to have the higher priority; the share of video data in the transmission bandwidth of the send buffer module then increases, and the quantity of video data packets sent increases correspondingly.
A receive buffer module, connected to the playback end of the recipient and configured to receive the video data packets and audio data packets sent by the send buffer module and buffer them according to the scene markers in order to play the audio and video data. The receive buffer module includes a receive buffer pool for buffering the video data packets and audio data packets.
A time difference adjustment module, connected to the receive buffer module and configured to limit, after the receive buffer module has buffered the data packets and according to the time identifier carried by each data packet and a preset duration threshold, how far playback of the data with the higher transmission priority runs ahead of playback of the data with the lower transmission priority, preventing the low-priority data packets from falling too far behind.
In Fig. 4, the sending side is above the dotted line and the receiving side is below it.
In one embodiment, the duration threshold is variable, and the time difference adjustment module adjusts it according to the scene markers. When two video frames in the receive buffer module are the same scene, the time difference adjustment module can suitably increase the duration threshold; when two video frames in the receive buffer module are different scenes, the time difference adjustment module can suitably reduce the duration threshold.
In one embodiment, the system further includes a recording and encoding module, connected to video recording equipment and configured to record audio and video in real time and encode it to obtain the video frames and audio frames.
In one embodiment, the scene judgment module includes a first judging unit and/or a second judging unit. The first judging unit judges by the frame difference method whether the degree of scene change exceeds the preset scene change threshold; the second judging unit judges by the background subtraction method whether the degree of scene change exceeds the preset scene change threshold.
In one embodiment, the send buffer module includes a first adjusting unit, configured to adjust the buffer size of the send buffer module according to the scene markers. The first adjusting unit is connected to the send buffer pool. When two video frames are the same scene, the first adjusting unit can suitably reduce the buffer size of the send buffer module; when two video frames are different scenes, the first adjusting unit can suitably increase the buffer size of the send buffer module.
For the specific arrangement of the components of this embodiment, such as the scene judgment module, data packetization module, scene marking module, send buffer module, receive buffer module and recording and encoding module, reference may be made to the structures described in the first embodiment of the foregoing system; they are not repeated here.
The above is merely a specific embodiment of the present invention, but the protection scope of the present invention is not limited thereto. Any change or substitution readily conceivable by a person skilled in the art within the technical scope disclosed by the present invention shall fall within the protection scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.