CN114697720A - Method and device for synchronizing self-adaptive audio and video RTP timestamp - Google Patents
- Publication number
- CN114697720A (application CN202011629055.0A)
- Authority
- CN
- China
- Prior art keywords
- packet
- video
- audio
- timestamp
- ntp
- Prior art date
- Legal status
- Granted
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/40—Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
- H04N21/43—Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
- H04N21/4302—Content synchronisation processes, e.g. decoder synchronisation
- H04N21/4307—Synchronising the rendering of multiple content streams or additional data on devices, e.g. synchronisation of audio on a mobile phone with the video output on the TV screen
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/60—Network structure or processes for video distribution between server and client or between remote clients; Control signalling between clients, server and network components; Transmission of management data between server and client, e.g. sending from server to client commands for recording incoming content stream; Communication details between server and client
- H04N21/63—Control signaling related to video distribution between client, server and network components; Network processes for video distribution between server and clients or between remote clients, e.g. transmitting basic layer and enhancement layers over different transmission paths, setting up a peer-to-peer communication via Internet between remote STB's; Communication protocols; Addressing
- H04N21/643—Communication protocols
- H04N21/6437—Real-time Transport Protocol [RTP]
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/80—Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
- H04N21/85—Assembly of content; Generation of multimedia applications
- H04N21/854—Content authoring
- H04N21/8547—Content authoring involving timestamps for synchronizing content
Abstract
The invention relates to a method and a device for synchronizing adaptive audio and video RTP timestamps. The method comprises the following steps: a sending end sends video packets and audio packets of an audio and video service to a receiving end; when the receiving end receives the RTP packet of the first video packet and the RTP packet of the first audio packet, the corresponding local timestamps are used as the absolute timestamp of the first video packet and the absolute timestamp of the first audio packet respectively, so that the video packets and audio packets are synchronized using absolute time; when the receiving end receives the NTP packet of the first video packet and the NTP packet of the first audio packet, a deviation value is calculated from the local timestamps of the nth video packet and the nth audio packet and from their NTP absolute timestamps, and smooth deviation compensation is applied to this value during subsequent local playback.
Description
Technical Field
The invention relates to the technical field of computer audio and video, in particular to a method for synchronizing timestamps of RTP (Real-time Transport Protocol)/RTCP (Real-time Transport Control Protocol) audio and video data packets.
Background
Audio and video synchronization control is a key technique in real-time audio and video, especially in video conferencing, and whether the audio and video are synchronized strongly affects the call experience. Real-time audio and video generally use the RTP/RTCP protocol to transmit media data, and audio and video packets are transmitted separately, so network delay can leave the received data unsynchronized; the receiving end must therefore buffer, reorder, synchronize and render the received data.
Two timestamps exist in the audio and video data packets: the relative timestamp in the RTP data packet and the NTP absolute timestamp in the RTCP data packet. A relative timestamp is present in every packet, whereas an absolute timestamp is typically sent only every few seconds, and some streams carry no absolute timestamp at all.
There are three existing timestamp synchronization methods. The first uses relative timestamps for synchronization control; it is simple and easy to implement, but the relative timestamps of audio and video generated by the sending end must start from a fixed value, and a receiving end joining mid-call cannot synchronize (unless the server translates the newly joined receiver's timestamps so that they start from a fixed value). The second uses absolute timestamps for synchronization control; it can achieve absolute synchronization, but the absolute timestamp must be sent by the sending end, and rendering cannot begin immediately ("instant open") before the first absolute timestamp arrives, which hurts user experience. The third uses the local timestamp at which each data packet is received for synchronization control; it is simple to implement, but it is too sensitive to network delay and jitter: as jitter grows, cumulative delay and clock drift are introduced, causing serious desynchronization.
Therefore, a method and an apparatus capable of adaptively performing audio-video synchronization are needed.
The above statements in the background are only intended to facilitate a thorough understanding of the present technical solutions (the technical means used, the technical problems solved, the technical effects produced, etc.) and should not be taken as an acknowledgement, or any form of suggestion, that this information constitutes prior art already known to a person skilled in the art.
Disclosure of Invention
To address the defects in the prior art, the invention provides an adaptive algorithm that combines the NTP absolute timestamp, the RTP relative timestamp and the local timestamp, transitions smoothly among control by these three timestamps, avoids their respective defects, improves user experience and thereby achieves a good synchronization effect.
According to an embodiment of the present invention, there is provided a synchronization method of an adaptive audio and video RTP timestamp, including: a sending end sends a video packet and an audio packet of an audio and video service to a receiving end, wherein the video packet and the audio packet each comprise an RTP packet and an NTP packet; when the receiving end receives the RTP packet of the first video packet and the RTP packet of the first audio packet, the local timestamps LV(1) and LA(1) at which the RTP relative timestamp RV(1) of the first video packet and the RTP relative timestamp RA(1) of the first audio packet are received are respectively used as the absolute timestamp of the first video packet and the absolute timestamp of the first audio packet, so that the video packet and the audio packet are synchronized using absolute time; when the receiving end receives the NTP packet of the first video packet and the NTP packet of the first audio packet, according to the calculated deviation value between the local timestamp LV(n) of the received nth video packet and the local timestamp LA(n) of the nth audio packet, and the calculated NTP absolute timestamp AV(n) of the received nth video packet and the NTP absolute timestamp AA(n) of the nth audio packet, smooth deviation compensation is applied to the deviation value in subsequent local playback, so that the video packet and the audio packet are synchronized using the deviation-compensated absolute time, wherein n is an integer greater than 1.
Preferably, when the NTP packet of the first video packet and the NTP packet of the first audio packet are received, the NTP absolute timestamp AV (1) of the first video packet and the NTP absolute timestamp AA (1) of the first audio packet are calculated by the following equations:
AA(1) = AA(ntp) + [RA(ntp) - RA(1)] / audio sampling rate
AV(1) = AV(ntp) + [RV(ntp) - RV(1)] / video sampling rate
The NTP packet of a video packet includes the absolute timestamp AV(ntp) and relative timestamp RV(ntp) of the video packet, and the NTP packet of an audio packet includes the absolute timestamp AA(ntp) and relative timestamp RA(ntp) of the audio packet.
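As a sketch, the two equations above can be written as a single helper (function and parameter names are illustrative, not from the patent; the patent's equations leave the tick-to-millisecond conversion implicit, so it is made explicit here):

```python
def first_packet_abs_ts(a_ntp_ms, r_ntp, r_first, sample_rate):
    """A(1) = A(ntp) + [R(ntp) - R(1)] / sampling rate, with the RTP tick
    difference converted to milliseconds."""
    return a_ntp_ms + (r_ntp - r_first) * 1000.0 / sample_rate

# 48 kHz audio: if the NTP packet's RTP tick is 480 ahead of RA(1),
# the computed AA(1) lies 10 ms from AA(ntp).
aa1 = first_packet_abs_ts(90000.0, 10480, 10000, 48000)
```

The same helper serves both streams; only the sampling rate argument differs (e.g. 48000 for audio, 90000 for video).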
Preferably, the NTP absolute timestamp AV(n) of the received nth video packet and the NTP absolute timestamp AA(n) of the nth audio packet are calculated by the following equations:
AV(n) = AV(1) + [RV(n) - RV(1)] / video sampling rate
AA(n) = AA(1) + [RA(n) - RA(1)] / audio sampling rate.
Preferably, when the NTP packet of the first video packet and the NTP packet of the first audio packet are received, the local timestamp LV(n) of the received nth video packet and the local timestamp LA(n) of the received nth audio packet are calculated by the following equations:
LV(n) = LV(1) + [RV(n) - RV(1)] / video sampling rate
LA(n) = LA(1) + [RA(n) - RA(1)] / audio sampling rate.
Preferably, the deviation value existing between the audio packet and the video packet is calculated by the following equation:
deviation = AA(n) - AV(n) - [LA(n) - LV(n)]
wherein a deviation value of 0 ms indicates that there is no fluctuation in network transmission, and a non-zero deviation value indicates that there is fluctuation in network transmission.
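The deviation equation above can be sketched directly (a hypothetical helper; the numeric values in the example are illustrative, chosen so that the local skew of 15 ms exceeds the absolute skew of 5 ms by 10 ms):

```python
def deviation_ms(aa_n, av_n, la_n, lv_n):
    """deviation = AA(n) - AV(n) - [LA(n) - LV(n)], all in milliseconds.
    A result of 0 means the network introduced no audio/video skew."""
    return (aa_n - av_n) - (la_n - lv_n)

# Audio arrived 15 ms "early" relative to video locally, while the absolute
# timestamps differ by only 5 ms -> 10 ms of network-induced deviation.
d = deviation_ms(90025.0, 90030.0, 80010.0, 80025.0)
```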
Preferably, when compensating for the deviation between the audio packet and the video packet, the local timestamp of the video packet is gradually compensated, relative to the audio packet, with a smoothing step of L, wherein each compensation amounts to deviation/L, the deviation remaining for subsequent video packets is deviation - deviation/L, and L is an integer greater than 1; when the remaining deviation reaches 0, the local timestamps of the video packets are no longer compensated.
Preferably, in compensating for the deviation value existing between the audio packet and the video packet, the estimated local timestamp LV(estimate_n) of the video packet is calculated by the following equation:
LV(estimate_n) = LV(n) + deviation/L
Preferably, when the receiving end next receives the relative timestamp RV(n+1) of the (n+1)th video packet, the local timestamp LV(n+1) of the (n+1)th video packet can be calculated by the following equation:
LV(n+1) = LV(estimate_n) + [RV(n+1) - RV(n)] / video sampling rate.
Preferably, when calculating the remaining compensation deviation for the video packets, the result is rounded to an integer.
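A sketch of one compensation step, combining the deviation/L update with the integer rounding just described (names are illustrative; `int()` truncates, which is one plausible reading of the patent's rounding):

```python
def compensate_step(lv_n_ms, deviation_ms, L):
    """Shift the video local timestamp by deviation/L and return the
    rounded deviation left over for subsequent packets (L > 1)."""
    lv_estimate = lv_n_ms + deviation_ms / L
    remaining = int(deviation_ms - deviation_ms / L)  # rounded remainder
    return lv_estimate, remaining
```

Applying the step repeatedly drives the remaining deviation toward 0, at which point the video local timestamps are no longer adjusted.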
According to an embodiment of the present invention, there is provided a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements a method according to an embodiment of the present invention.
Compared with the prior art, the method and the device for synchronizing the self-adaptive audio and video RTP timestamp can more effectively realize audio and video synchronization.
Drawings
Exemplary embodiments of the present invention will be described in more detail below with reference to the accompanying drawings. For purposes of clarity, the same reference numbers will be used in different drawings to identify the same elements. It is noted that the drawings are merely schematic and are not necessarily drawn to scale. In these drawings:
fig. 1 is a flowchart illustrating a method for synchronization of adaptive audio-video RTP timestamps according to an embodiment of the present invention.
Fig. 2 is a schematic diagram showing the structure of a computing system for implementing an exemplary embodiment of the present invention.
Detailed Description
The embodiments of the present invention are described in detail below. They are carried out on the basis of the technical scheme of the present invention, and detailed implementations and specific operating procedures are given, but the scope of the present invention is not limited to the embodiments described below.
The invention adds a local timestamp to the relative and absolute timestamps provided by conventional audio and video, so that audio and video playback shifts from depending on the relative and absolute timestamps to depending on the local timestamp, and the transmission delay caused by network fluctuation is finally eliminated by the algorithm described below.
Fig. 1 is a flowchart illustrating a synchronization method of an adaptive audio-video RTP timestamp according to an embodiment of the present invention.
The synchronization method of the adaptive audio and video RTP timestamp according to the embodiment of the invention is divided into two stages.
The first phase is an initial relative timestamp phase.
In step 101: the sending end sends the video packet and the audio packet to the receiving end, and the video packet and the audio packet only have RTP relative timestamps at the stage and have no NTP absolute timestamps.
In step 102: the local timestamps at which the RTP relative timestamps of the first video packet and the first audio packet are received are taken as their absolute timestamps, respectively. Network jitter is usually sporadic, and the sending time difference and receiving time difference advance at approximately equal rates, so this approximation is acceptable in the initial relative timestamp stage.
Assume that the RTP relative timestamp of the first video packet is RV (1), the RTP relative timestamp of the first audio packet is RA (1), the local timestamp of the first video packet is LV (1), and the local timestamp of the first audio packet is LA (1).
When the RTP relative timestamps of the nth video packet and the nth audio packet are received, the local timestamps LV(n) and LA(n) of the nth video packet and the nth audio packet may be calculated by the following equations:
LV(n) = LV(1) + [RV(n) - RV(1)] / video sampling rate
LA(n) = LA(1) + [RA(n) - RA(1)] / audio sampling rate
In the initial relative timestamp stage, the local timestamps at which the RTP relative timestamps of the first video packet and the first audio packet are received are used as absolute timestamps for synchronization control, which removes the limitations that the receiving end must have received an NTP absolute timestamp and that the RTP relative timestamps of the audio and video packets generated by the sending end must start from a fixed value.
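The initial stage can be sketched as a small helper that anchors each stream at its first RTP packet and extrapolates local timestamps from it (class and method names are illustrative; the instance values match the worked example later in this description):

```python
class StreamClock:
    """Extrapolates local timestamps from the first packet's anchor:
    L(n) = L(1) + [R(n) - R(1)] / sampling rate, ticks converted to ms."""
    def __init__(self, first_rtp_ts, first_local_ms, sample_rate):
        self.r1, self.l1, self.rate = first_rtp_ts, first_local_ms, sample_rate

    def local_ts(self, rtp_ts):
        return self.l1 + (rtp_ts - self.r1) * 1000.0 / self.rate

audio = StreamClock(10000, 80000.0, 48000)  # RA(1)=10000, LA(1)=80000 ms
video = StreamClock(20000, 80005.0, 90000)  # RV(1)=20000, LV(1)=80005 ms
```

One anchor per stream is enough; every later packet's local timestamp follows from its RTP ticks alone.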
The second stage is the NTP absolute timestamp stage.
In step 103: the NTP absolute timestamps of the video packets and the audio packets are received, and the NTP absolute timestamp stage is entered. Since the NTP absolute timestamps of both streams have now been received, the absolute time can be calculated correctly and the video and audio packets can be played according to absolute time. When switching from the local timestamp previously used as the absolute timestamp to the received NTP absolute timestamp, the absolute timestamp must transition smoothly, otherwise the user perceives an abrupt jump.
Assuming that the NTP packets for the nth video packet and the nth audio packet have been received, the absolute timestamps can be obtained: the NTP absolute timestamp of the nth video packet is AV(n) and that of the nth audio packet is AA(n). The local timestamp LV(n+1) of the (n+1)th video packet is then adjusted using the following equation:
LV(n+1) = LV(n) - [AV(n) - AA(n)]/L
where L is the deviation smoothing step size: the larger its value, the smoother the transition, but the longer it takes to synchronize to the absolute timestamp. The value of L can be adjusted according to actual service requirements.
The following describes a processing procedure of a synchronization method of an adaptive audio-video RTP timestamp according to an embodiment of the present invention with a specific example.
Suppose that a sending end establishes an audio/video link to a receiving end, the audio sampling rate is 48000 Hz, the video sampling rate is 90000 Hz, and the deviation smoothing step size L is 4.
Initial relative timestamp phase:
when a receiving end receives the RTP relative time stamp of the first audio packet, the local time of the receiving end is obtained, namely the RTP relative time stamp RA (1) of the first audio packet is 10000Hz, the local time stamp LA (1) is 80000ms, when the receiving end receives the RTP relative time stamp of the first video packet, the local time of the receiving end is obtained, namely the RTP relative time stamp RV (1) of the first video packet is 20000Hz, and the local time stamp LV (1) is 80005 ms. The RTP relative time stamp and the local time stamp of the first audio packet and the first video packet are recorded.
When the receiving end receives the RTP timestamps RA(n) and RV(n) of subsequent audio and video packets, the local timestamps LA(n) and LV(n) of those packets can be calculated by the following equations:
LA(n) = LA(1) + [RA(n) - RA(1)] / audio sampling rate
LV(n) = LV(1) + [RV(n) - RV(1)] / video sampling rate
Wherein n is an integer greater than 1.
In this stage, the local timestamps at which the first video packet and the first audio packet are received are used as absolute timestamps for synchronization control, which removes the limitations that the receiving end must have received an NTP absolute timestamp and that the RTP relative timestamps of the audio and video generated by the sending end must start from a fixed value.
NTP absolute timestamp stage:
when the receiving end receives the NTP packets of the first video packet and the first audio packet, respectively, according to absolute timestamps aa (NTP) and av (NTP) and relative timestamps ra (NTP) and rv (NTP) in the NTP packets (which are transmitted through RTCP data packets, where the RTCP data packets include an NTP absolute timestamp and an RTP relative timestamp corresponding to the absolute timestamp), the absolute timestamps of the received first video packet and first audio packet can be calculated by the following equation:
AA (1) ═ AA (ntp)) + [ RA (ntp)) -RA (1) ]/audio sampling rate
AV (1) (ntp) + [ RV (ntp) -RV (1) ]/video sampling rate
At this time, the RTP relative timestamp RA(n) of the received nth audio packet is 10480, so the local timestamp LA(n) of the nth audio packet, namely 80010 ms, can be calculated by:
LA(n) = LA(1) + [RA(n) - RA(1)] / audio sampling rate
and its NTP absolute timestamp, namely 90025 ms, by:
AA(n) = AA(1) + [RA(n) - RA(1)] / audio sampling rate
Similarly, the RTP relative timestamp RV(n) of the received nth video packet is 21800, so the local timestamp of the nth video packet, namely 80025 ms, can be calculated by:
LV(n) = LV(1) + [RV(n) - RV(1)] / video sampling rate
and its NTP absolute timestamp, namely 90030 ms, by:
AV(n) = AV(1) + [RV(n) - RV(1)] / video sampling rate
If the network transmitted the audio and video data without fluctuation, the relative time difference of the audio and video would equal the absolute time difference, i.e. LA(n) - LV(n) = AA(n) - AV(n).
From the above values one obtains [AA(n) - AV(n)] - [LA(n) - LV(n)] = 10 ms, which shows that, due to network fluctuation, there is a 10 ms deviation between the absolute timestamps A(n) and the local timestamps L(n) of the audio and video packets; this deviation may come from the audio, the video, or both.
According to an embodiment of the present invention, the audio packets are taken as the reference regardless of whether they deviate. That is, the audio packets are assumed to have no deviation, and all deviation is attributed to the video packets. Therefore, only the deviation of the video packets is considered in the calculation, and the deviation of the audio packets need not be considered.
If the relative time difference and the absolute time difference of the audio and video packets have deviation, the deviation of the video packets needs to be compensated.
The local timestamp LV(n) of the nth video packet is 80025 ms, the video deviation is known to be 10 ms, and the deviation smoothing step L is set to 4.
The estimated timestamp LV(estimate_n) of the nth video packet, namely 80027.5 ms, can be calculated by the following equation:
LV(estimate_n) = LV(n) + deviation/L
At this time, the deviation already compensated for this video packet is deviation/L = 10/4 = 2.5 ms, and the deviation remaining for subsequent video packets is deviation - deviation/L = 7.5 ms. To ensure that the deviation converges as soon as possible (the error approaches 0 ms), according to an embodiment of the present invention the remaining compensation deviation is rounded, i.e. (int)(deviation - deviation/L) = 7 ms.
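The arithmetic in this paragraph can be checked directly (a small sketch using the example's numbers; variable names are illustrative):

```python
L = 4
deviation = 10.0                              # ms, video deviation from the example
lv_n = 80025.0                                # ms, LV(n)
lv_estimate = lv_n + deviation / L            # LV(estimate_n): 80025 + 2.5
remaining = int(deviation - deviation / L)    # rounded remainder: int(7.5)
```

This reproduces the 80027.5 ms estimated timestamp and the 7 ms remaining deviation stated above.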
When the receiving end receives the (n+1)th video packet with relative timestamp RV(n+1) (here 23600), the estimated timestamp LV(estimate_n) is taken as the local timestamp of the nth video packet (since it is the local timestamp that has already been compensated), and the local timestamp LV(n+1) of the (n+1)th video packet, namely 80047.5 ms, can be calculated by the following equation:
LV(n+1) = LV(estimate_n) + [RV(n+1) - RV(n)] / video sampling rate
The calculated local timestamp LV(n+1) of the (n+1)th video packet is based on the previous compensation and is still not the final timestamp, because it must be compensated again: the estimated timestamp LV(estimate_n+1) of the (n+1)th video packet, namely 80049.25 ms, is obtained by the following equation:
LV(estimate_n+1) = LV(n+1) + deviation/L
After this round of deviation compensation, the remaining compensation deviation for subsequent video packets is (int)(deviation - deviation/L) = 5 ms.
The above steps are repeated in turn. When the remaining compensation deviation reaches 0 ms, deviation/L = 0 ms; the video packets are still calculated by the above formula, but the deviation is no longer compensated, and from the equation LV(estimate_n+1) = LV(n+1) + deviation/L with deviation/L = 0 ms, it follows that LV(estimate_n+1) = LV(n+1).
At this point it can be concluded that the local timestamp LV of the video has eliminated the network fluctuation and approaches the absolute timestamp.
Here, L is a deviation smoothing step width, the larger the value of L is, the smoother the transition is, but the longer the time required for synchronization to the absolute timestamp is, L is an integer greater than 1, and the value of L can be adjusted according to actual service needs.
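The convergence behaviour can be illustrated with a short loop (a sketch, not necessarily the patent's exact procedure: it simply iterates the deviation/L step with the integer rounding described above until the remainder reaches 0):

```python
def smooth_until_converged(deviation_ms, L=4):
    """Repeat the deviation/L compensation, rounding the remainder down
    each round, and count the rounds until nothing remains to compensate."""
    applied, rounds = 0.0, 0
    while deviation_ms > 0:
        step = deviation_ms / L
        applied += step
        deviation_ms = int(deviation_ms - step)  # rounded remainder
        rounds += 1
    return applied, rounds

applied, rounds = smooth_until_converged(10.0, L=4)
```

With the example's 10 ms deviation and L = 4, the remainder reaches 0 after several rounds; note that the integer rounding means the total applied compensation is slightly less than the full deviation, which is the price of faster convergence.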
After the NTP absolute timestamps of the video packets and audio packets have been received, the switch from the local timestamp previously used as the absolute timestamp to the received NTP absolute timestamp must be made smoothly, otherwise the user perceives an abrupt jump.
Fig. 2 is a schematic diagram showing the structure of a computing system for implementing an exemplary embodiment of the present invention.
Referring to fig. 2, a computing system 200 may include at least one processor 202, memory 203, I/O components 204, and a network interface 205 connected via a bus 201.
The processor 202 may be a Central Processing Unit (CPU) or a semiconductor device that performs processing on commands stored in the memory 203. The memory 203 may include various types of volatile or non-volatile storage media. For example, the memory 203 may include Read Only Memory (ROM) and Random Access Memory (RAM).
Thus, the operations of a method or algorithm described in connection with the embodiments disclosed herein may be embodied directly in hardware, in a software module executed by the processor 202, or in a combination of the two. A software module may reside on a storage medium (i.e., memory 203) such as RAM, flash memory, ROM, EPROM, EEPROM, registers, a hard disk, a removable disk, or a CD-ROM. An exemplary storage medium is coupled to the processor 202 so that the processor 202 can read information from, and write information to, the storage medium. Alternatively, the storage medium may be integral to the processor 202. The processor 202 and the storage medium may reside in an Application Specific Integrated Circuit (ASIC), and the ASIC may reside in a user terminal. Alternatively, the processor and the storage medium may reside as discrete components in a user terminal.
While the exemplary methods of the invention are described above as a series of acts for clarity, this is not intended to limit the order in which the steps are performed, and the steps may be performed concurrently or in a different order as needed. To implement a method according to the present invention, the illustrated steps may further include other steps, some steps may be omitted, or additional steps may be included.
The various embodiments of the invention are not an exhaustive list of all possible combinations, but are intended to describe representative aspects of the invention, and what is described in the various embodiments can be applied independently or in combinations of two or more.
In addition, various embodiments of the invention may be implemented in hardware, firmware, software, or a combination thereof. The hardware may be implemented by one or more of an Application Specific Integrated Circuit (ASIC), a Digital Signal Processor (DSP), a Digital Signal Processing Device (DSPD), a Programmable Logic Device (PLD), a Field Programmable Gate Array (FPGA), a general purpose processor, a controller, a microcontroller, a microprocessor, or the like.
The scope of the present invention is intended to include software or machine-executable instructions (e.g., operating systems, application programs, firmware, programs, etc.) that cause operations according to various embodiments to be performed on an apparatus or computer, as well as non-volatile computer-readable media on which such software or instructions are stored and which are executable on a device or computer.
The above description of exemplary embodiments has been presented only to illustrate the technical solutions of the present invention, and is not intended to be exhaustive or to limit the invention to the precise forms described. Obviously, many modifications and variations are possible to those skilled in the art in light of the above teachings. The exemplary embodiments were chosen and described in order to explain certain principles of the invention and its practical application to thereby enable others skilled in the art to understand, implement and utilize the invention in various exemplary embodiments and with various alternatives and modifications. It is intended that the scope of the invention be defined by the following claims and their equivalents.
Claims (9)
1. A synchronization method of an adaptive audio and video RTP timestamp is characterized by comprising the following steps:
a sending end sends a video packet and an audio packet of an audio and video service to a receiving end, wherein the video packet and the audio packet respectively comprise an RTP packet and an NTP packet;
when the receiving end receives the RTP packet of the first video packet and the RTP packet of the first audio packet, the local timestamps LV(1) and LA(1) at which the RTP relative timestamp RV(1) of the first video packet and the RTP relative timestamp RA(1) of the first audio packet are received are respectively used as the absolute timestamp of the first video packet and the absolute timestamp of the first audio packet, so that the video packet and the audio packet are synchronized using absolute time;
when the receiving end receives the NTP packet of the first video packet and the NTP packet of the first audio packet, according to a calculated deviation value between the local timestamp LV(n) of the nth video packet and the local timestamp LA(n) of the nth audio packet, and the calculated NTP absolute timestamp AV(n) of the nth video packet and NTP absolute timestamp AA(n) of the nth audio packet, smooth deviation compensation is applied to the deviation value in subsequent local playback, so that the video packet and the audio packet are synchronized using the deviation-compensated absolute time, wherein n is an integer greater than 1.
2. The method of synchronizing an adaptive audio-video RTP timestamp according to claim 1,
when the NTP packet of the first video packet and the NTP packet of the first audio packet are received, the NTP absolute timestamp AV(1) of the first video packet and the NTP absolute timestamp AA(1) of the first audio packet are calculated by the following equations:
AA(1) = AA(ntp) + [RA(ntp) - RA(1)] / audio sampling rate
AV(1) = AV(ntp) + [RV(ntp) - RV(1)] / video sampling rate
wherein the NTP packet of the video packet carries the absolute timestamp AV(ntp) and the relative timestamp RV(ntp) of the video packet, and the NTP packet of the audio packet carries the absolute timestamp AA(ntp) and the relative timestamp RA(ntp) of the audio packet.
3. The method of synchronizing an adaptive audio-video RTP timestamp according to claim 2,
the NTP absolute timestamp av (n) of the received nth video packet and the NTP absolute timestamp aa (n) of the nth audio packet are calculated by the following equations:
AV (n) ═ AV (1) + [ RV (n) -RV (1) ]/video sampling rate
AA (1) + [ RA (n) -RA (1) ]/audio sampling rate.
4. The method of synchronizing an adaptive audio-video RTP timestamp according to claim 3,
when the NTP packet of the first video packet and the NTP packet of the first audio packet are received, the local timestamp LV(n) of the received nth video packet and the local timestamp LA(n) of the received nth audio packet are calculated by the following equations:
LV(n) = LV(1) + [RV(n) - RV(1)] / video sampling rate
LA(n) = LA(1) + [RA(n) - RA(1)] / audio sampling rate.
5. The method for synchronizing an adaptive audio-video RTP timestamp according to claim 4,
the offset value provision existing between the audio packet and the video packet is calculated by the following equation:
deviation=AA(n)-AV(n)-[LA(n)-LV(n)]
wherein a deviation value of 0 ms indicates that there is no fluctuation in network transmission, and a non-zero deviation value indicates that there is fluctuation in network transmission.
6. The method of synchronizing an adaptive audio-video RTP timestamp according to claim 5,
when the deviation between the audio packets and the video packets is compensated, the local timestamps of the video packets are compensated gradually, taking the audio packets as the reference and L as the smoothing step, wherein the per-step compensation amount is deviation/L and the remaining compensation deviation of the subsequent video packets is deviation - deviation/L, L being an integer greater than 1;
when the remaining compensation deviation of the video packets reaches 0, the local timestamps of the video packets are no longer compensated.
7. The method for synchronizing an adaptive audio-video RTP timestamp according to claim 6,
in compensating for the deviation value existing between the audio packet and the video packet, the estimated local timestamp LV(estimate_n) of the video packet is calculated by the following equation:
LV(estimate_n) = LV(n) + deviation / L
when the receiving end next receives the relative timestamp RV(n+1) of the (n+1)th video packet, the local timestamp LV(n+1) upon receiving RV(n+1) is calculated by the following equation:
LV(n+1) = LV(estimate_n) + [RV(n+1) - RV(n)] / video sampling rate.
8. The method for synchronizing an adaptive audio-video RTP timestamp according to claim 6,
wherein, when calculating the remaining compensation deviation of the video packets, the remaining compensation deviation is rounded.
9. A computer-readable storage medium, on which a computer program is stored which, when being executed by a processor, carries out the method according to any one of claims 1 to 6.
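The timestamp arithmetic in claims 2 through 7 can be sketched as a few helper functions. This is an illustrative sketch only, not the patent's reference implementation; the function names, argument order, and the choice of floating-point seconds for absolute time are assumptions made for readability:

```python
def absolute_ts_of_first_packet(a_ntp, r_ntp, r1, sample_rate):
    """Claim 2: map the first packet's RTP relative timestamp to an NTP
    absolute timestamp, e.g. AA(1) = AA(ntp) + [RA(ntp) - RA(1)] / rate."""
    return a_ntp + (r_ntp - r1) / sample_rate


def absolute_ts_of_nth_packet(a1, rn, r1, sample_rate):
    """Claim 3: A(n) = A(1) + [R(n) - R(1)] / rate, for audio or video."""
    return a1 + (rn - r1) / sample_rate


def deviation(aa_n, av_n, la_n, lv_n):
    """Claim 5: deviation = AA(n) - AV(n) - [LA(n) - LV(n)].
    A value of 0 ms means no network fluctuation between the streams."""
    return aa_n - av_n - (la_n - lv_n)


def smooth_compensate(lv_n, dev, L):
    """Claims 6-7: spread the deviation over L steps, with the audio
    packets as reference; returns the estimated video local timestamp
    LV(estimate_n) and the remaining compensation deviation."""
    lv_estimate = lv_n + dev / L
    remaining = dev - dev / L
    return lv_estimate, remaining
```

For example, with a 90 kHz video clock, a packet whose RTP timestamp is 180000 ticks after the first packet maps to an absolute time 2 s later; a 0.5 s deviation smoothed with L = 5 nudges the video local timestamp by 0.1 s per step.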
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011629055.0A CN114697720B (en) | 2020-12-31 | 2020-12-31 | Synchronization method and device of adaptive audio and video RTP (real-time protocol) time stamps |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011629055.0A CN114697720B (en) | 2020-12-31 | 2020-12-31 | Synchronization method and device of adaptive audio and video RTP (real-time protocol) time stamps |
Publications (2)
Publication Number | Publication Date |
---|---|
CN114697720A true CN114697720A (en) | 2022-07-01 |
CN114697720B CN114697720B (en) | 2023-11-07 |
Family
ID=82133981
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202011629055.0A Active CN114697720B (en) | 2020-12-31 | 2020-12-31 | Synchronization method and device of adaptive audio and video RTP (real-time protocol) time stamps |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN114697720B (en) |
Citations (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20050281246A1 (en) * | 2004-06-22 | 2005-12-22 | Lg Electronics Inc. | Synchronizing video/audio data of mobile communication terminal |
US20070116057A1 (en) * | 2003-07-04 | 2007-05-24 | Liam Murphy | System and method for determining clock skew in a packet -based telephony session |
JP2009010863A (en) * | 2007-06-29 | 2009-01-15 | Oki Electric Ind Co Ltd | Audio/video synchronizing method, audio/video synchronizing system and audio/video receiving terminal |
US20100100917A1 (en) * | 2008-10-16 | 2010-04-22 | Industrial Technology Research Institute | Mobile tv system and method for synchronizing the rendering of streaming services thereof |
CN102571687A (en) * | 2010-12-10 | 2012-07-11 | 联芯科技有限公司 | Method for building synchronous status information among real-time media streams, device adopting same and SCC AS |
CN103414957A (en) * | 2013-07-30 | 2013-11-27 | 广东工业大学 | Method and device for synchronization of audio data and video data |
CN104092697A (en) * | 2014-07-18 | 2014-10-08 | 杭州华三通信技术有限公司 | Anti-replaying method and device based on time |
CN111385625A (en) * | 2018-12-29 | 2020-07-07 | 成都鼎桥通信技术有限公司 | Non-IP data transmission synchronization method and device |
Patent Citations (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20070116057A1 (en) * | 2003-07-04 | 2007-05-24 | Liam Murphy | System and method for determining clock skew in a packet -based telephony session |
US20050281246A1 (en) * | 2004-06-22 | 2005-12-22 | Lg Electronics Inc. | Synchronizing video/audio data of mobile communication terminal |
CN1738437A (en) * | 2004-06-22 | 2006-02-22 | Lg电子株式会社 | Synchronizing video/audio of mobile communication terminal |
JP2009010863A (en) * | 2007-06-29 | 2009-01-15 | Oki Electric Ind Co Ltd | Audio/video synchronizing method, audio/video synchronizing system and audio/video receiving terminal |
US20100100917A1 (en) * | 2008-10-16 | 2010-04-22 | Industrial Technology Research Institute | Mobile tv system and method for synchronizing the rendering of streaming services thereof |
CN102571687A (en) * | 2010-12-10 | 2012-07-11 | 联芯科技有限公司 | Method for building synchronous status information among real-time media streams, device adopting same and SCC AS |
CN103414957A (en) * | 2013-07-30 | 2013-11-27 | 广东工业大学 | Method and device for synchronization of audio data and video data |
CN104092697A (en) * | 2014-07-18 | 2014-10-08 | 杭州华三通信技术有限公司 | Anti-replaying method and device based on time |
CN111385625A (en) * | 2018-12-29 | 2020-07-07 | 成都鼎桥通信技术有限公司 | Non-IP data transmission synchronization method and device |
Also Published As
Publication number | Publication date |
---|---|
CN114697720B (en) | 2023-11-07 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
EP1775964B1 (en) | Method and device for stream synchronization of real-time multimedia transport over packet network | |
DK3118855T3 (en) | Method, device and system for synchronous audio playback | |
US7392102B2 (en) | Method of synchronizing the playback of a digital audio broadcast using an audio waveform sample | |
KR100968928B1 (en) | Apparatus and method for synchronization of audio and video streams | |
US20080152309A1 (en) | Method and apparatus for audio/video synchronization | |
US20030198254A1 (en) | Method of synchronizing the playback of a digital audio broadcast by inserting a control track pulse | |
CN108366283B (en) | Media synchronous playing method among multiple devices | |
US10602468B2 (en) | Software based audio timing and synchronization | |
KR20080007577A (en) | Synchronized audio/video decoding for network devices | |
JP2013134119A (en) | Transmitter, transmission method, receiver, reception method, synchronous transmission system, synchronous transmission method, and program | |
JP2001186180A (en) | Ip terminal device, method for estimating frequency error range, method of estimating frequency difference and method of calculating estimated required time | |
KR102566550B1 (en) | Method of display playback synchronization of digital contents in multiple connected devices and apparatus using the same | |
US7440474B1 (en) | Method and apparatus for synchronizing clocks on packet-switched networks | |
US20070009071A1 (en) | Methods and apparatus to synchronize a clock in a voice over packet network | |
US8477810B2 (en) | Synchronization using multicasting | |
US9991981B2 (en) | Method for operating a node of a communications network, a node and a communications network | |
CN114697720A (en) | Method and device for synchronizing self-adaptive audio and video RTP timestamp | |
JP4042396B2 (en) | Data communication system, data transmission apparatus, data reception apparatus and method, and computer program | |
KR100457508B1 (en) | Apparatus for setting time stamp offset and method thereof | |
US20040218633A1 (en) | Method for multiplexing, in MPEG stream processor, packets of several input MPEG streams into one output transport stream with simultaneous correction of time stamps | |
US11900010B2 (en) | Method of managing an audio stream read in a manner that is synchronized on a reference clock | |
JP4425115B2 (en) | Clock synchronization apparatus and program | |
JP2018121199A (en) | Receiving device and clock generation method | |
WO2020206465A1 (en) | Software based audio timing and synchronization | |
CA2651701C (en) | Generation of valid program clock reference time stamps for duplicate transport stream packets |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||