CN116489342B - Method and device for determining coding delay, electronic equipment and storage medium - Google Patents



Publication number
CN116489342B
CN116489342B (application CN202310729514.XA)
Authority
CN
China
Prior art keywords: frame; coding; source; determining; video data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202310729514.XA
Other languages
Chinese (zh)
Other versions
CN116489342A (en)
Inventor
徐进
葛涛
许春蕾
崔俊生
李岩
于亮
付威
刘博�
朱易
王子明
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Media Group
Original Assignee
China Media Group
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Application filed by China Media Group filed Critical China Media Group
Priority to CN202310729514.XA
Publication of CN116489342A
Application granted
Publication of CN116489342B


Classifications

    • H — ELECTRICITY
    • H04 — ELECTRIC COMMUNICATION TECHNIQUE
    • H04N — PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N17/00 — Diagnosis, testing or measuring for television systems or their details
    • H — ELECTRICITY
    • H04 — ELECTRIC COMMUNICATION TECHNIQUE
    • H04L — TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L65/00 — Network arrangements, protocols or services for supporting real-time applications in data packet communication
    • H04L65/80 — Responding to QoS

Abstract

The embodiments of the present application relate to the technical field of image encoding and decoding, and in particular to a method and apparatus for determining coding delay, an electronic device and a storage medium. The method comprises the following steps: acquiring the source recording time of each frame of source video data and the encoding recording time of each frame; determining a target encoding position of the encoded synchronization frame and a target source position of the source synchronization frame according to the first position of the encoded key frame, the second position of the source key frame, the type of the encoded key frame and the encoded video data; and determining the coding delay time of the encoder according to the encoding recording time and the source recording time. By recording the time at which each frame of video data enters the encoder and the time at which each frame is output from the encoder, the method can determine the delay of the encoder alone from the difference between the two. Moreover, because the frame used for determining the delay is a uniquely determinable frame, the delay time of the encoder can be determined more accurately from the times at which that frame enters and leaves the encoder.

Description

Method and device for determining coding delay, electronic equipment and storage medium
Technical Field
The present application relates to the field of image encoding and decoding technologies, and in particular, to a method and apparatus for determining encoding delay, an electronic device, and a storage medium.
Background
Remote distribution of audio and video requires encoding by an encoder and decoding by a decoder at the decoding terminal. To ensure that the decoding terminal can play the audio and video on time according to the timing requirements of the transmitting end, the delay introduced by encoding and decoding must be measured.
A prior-art method of detecting the codec delay time is shown in fig. 1. The test video before encoding is played on monitor 1, and the decoded test video is played on monitor 2. The test video carries a time code that varies from frame to frame (i.e., the time code on each picture identifies the frame number of the current frame). The playing times at which the two monitors display a frame with the same time code are obtained, and the difference between the two playing times gives the codec delay time.
The method for detecting the encoding and decoding delay time has the following defects:
1) It is not suitable for frequent testing on the full link of an operating system.
2) It can only measure the total codec delay of the full link, lacks flexibility, and cannot determine the separate contribution of encoding or decoding to the delay. In scenarios where encoders and decoders are deployed flexibly, re-detection is required whenever either the encoder or the decoder changes, which consumes considerable resources.
3) If the monitored signal source is high-precision, the requirements on the monitoring equipment are high. For example, if the source is an 8K uncompressed signal in SMPTE ST 2110 format, no analysis device currently available can directly monitor an 8K uncompressed all-IP signal based on SMPTE ST 2110; converting through an IP-to-SDI (Serial Digital Interface) gateway introduces a gateway conversion delay error into the monitoring, and the coding delay of the encoder and the decoding delay of the decoder still cannot be measured separately.
The companion-audio service of the "Hundred Cities, Thousand Screens" ("Baicheng Qianping") project is currently being promoted by the government. Where a public large screen is not suitable for playing sound, the service transmits 8K ultra-high-definition video over a dedicated line and transmits one channel of television companion audio separately over the Internet (hereinafter, independently transmitted companion audio), and uses the time stamp embedded in the video to synchronize the audio at the decoding terminal with the playback of the large-screen video, so that a viewer hears synchronized, high-quality audio while watching the large-screen program. During live broadcast, to ensure that the user terminal obtains video with high real-time performance, a low-delay encoder and a low-delay decoder must be adopted, and the encoding and decoding delay times must be measured.
To ensure universality, the decoding terminal may use decoders of various brands to decode the encoded audio transmitted over the Internet. Different brands of decoders may differ in their receive buffering, decoding model and image post-processing, and the clock behind the reported embedded UTC (Coordinated Universal Time) time may differ from the clock of the system where the encoder is located, which affects the accuracy of the detected delay time. In addition, only the delay time of the whole link can be measured; the delay time of the encoder or decoder cannot be determined individually, so neither can be adjusted accordingly.
Therefore, a test method is needed that can test the encoder end independently, without requiring the cooperation of the full-link system, and whose accuracy is independent of the terminal decoder.
Disclosure of Invention
In order to solve one of the above technical drawbacks, embodiments of the present application provide a method and apparatus for determining coding delay, an electronic device, and a storage medium.
In a first aspect of an embodiment of the present application, a method for determining a coding delay is provided, including:
Acquiring source recording time of each frame of source video data when the source video data is input into an encoder and encoding recording time of each frame when the encoder outputs encoded video data;
determining a target encoding position of an encoded synchronization frame corresponding to the encoded key frame and a target source position of a source synchronization frame corresponding to the source key frame, according to a first position of the encoded key frame in the encoded recording sequence, a second position of the source key frame in the source recording sequence, the type of the encoded key frame and the encoded video data;
and determining the coding delay time of the encoder according to the coding recording time corresponding to the target coding position and the source recording time corresponding to the source video sequence number of the source synchronous frame.
In a second aspect of the embodiment of the present application, there is provided an apparatus for determining coding delay, including:
the acquisition module is used for acquiring the source recording time of each frame of the source video data when the source video data is input into the encoder and the encoding recording time of each frame when the encoder outputs the encoded video data;
a first determining module, used for determining a target encoding position of an encoded synchronization frame corresponding to the encoded key frame and a target source position of a source synchronization frame corresponding to the source key frame, according to a first position of the encoded key frame in the encoded recording sequence, a second position of the source key frame in the source recording sequence, the type of the encoded key frame and the encoded video data;
and a second determining module, used for determining the coding delay time of the encoder according to the encoding recording time corresponding to the target encoding position and the source recording time corresponding to the source video sequence number of the source synchronization frame.
In a third aspect of an embodiment of the present application, there is provided an electronic device, including:
a memory;
a processor; and
a computer program;
wherein the computer program is stored in a memory and configured to be executed by a processor to implement a method of determining a coding delay as in any of the above.
In a fourth aspect of embodiments of the present application, there is provided a computer-readable storage medium having a computer program stored thereon; a computer program is executed by a processor to implement a method of determining a coding delay as in any of the above.
The method of the embodiment of the present application obtains, by recording, the time at which each frame of video data enters the encoder and the time at which each frame of encoded video data is output from the encoder, so that the delay of the encoder alone can be determined from the difference between the time the same frame entered the encoder and the time it was output. In addition, the frame used for determining the delay is one whose scene changes from the previous frame, i.e., a frame whose position in the video is uniquely determinable, so the delay of the encoder can be determined more accurately from the times at which this uniquely identifiable frame enters and leaves the encoder.
Drawings
The accompanying drawings, which are included to provide a further understanding of the application and are incorporated in and constitute a part of this specification, illustrate embodiments of the application and together with the description serve to explain the application and do not constitute a limitation on the application. In the drawings:
FIG. 1 is a schematic diagram of a method for determining a codec delay time according to the prior art;
FIG. 2 is a schematic diagram of another method for determining the codec delay time according to the prior art;
FIG. 3 is a flow chart of a method for determining coding delay according to one embodiment of the present application;
FIG. 4 is a schematic diagram of a method for determining coding delay according to an embodiment of the present application;
FIG. 5 is a flow chart of a method for determining coding delay according to one embodiment of the present application;
FIG. 6 is a flow chart of a method for determining coding delay according to one embodiment of the present application;
FIG. 7 is a flow chart of a method for determining coding delay according to one embodiment of the present application;
FIG. 8 is a flow chart of a method for determining coding delay according to one embodiment of the present application;
FIG. 9 is a schematic diagram of a method for determining codec delay according to an embodiment of the present application;
FIG. 10 is a flow chart of a method for determining codec delay according to an embodiment of the present application;
FIG. 11 is a flow chart of an apparatus for determining coding delay according to an embodiment of the present application;
fig. 12 is a schematic structural diagram of a computer device according to an embodiment of the present application.
Detailed Description
In order to make the technical solutions and advantages of the embodiments of the present application more apparent, the following detailed description of exemplary embodiments of the present application is provided in conjunction with the accompanying drawings, and it is apparent that the described embodiments are only some embodiments of the present application and not exhaustive of all embodiments. It should be noted that, without conflict, the embodiments of the present application and features of the embodiments may be combined with each other.
As shown in fig. 2, the application scenario of the embodiment of the present application may be the codec system of the companion-audio service of the "Hundred Cities, Thousand Screens" ("Baicheng Qianping") project.
The codec system of this service comprises:
(1) A video signal source, which provides unencoded source video data. The source video data may be 8K ultra-high-definition video at a frame rate of 50 fps, so that each frame lasts 20 ms; each frame needs to carry a time stamp so that video synchronization can be achieved accurately.
(2) A source switch, which distributes the source video data to one or more encoders for encoding according to the same or different encoding protocols.
(3) An encoder, which encodes the source video data according to an encoding protocol; the protocol used may be one satisfying the T/UWA 012.2-2022 standard. The encoder packs the time stamps used for synchronized playback into the time_stamp (Time Stamp) syntax structure defined by the T/UWA 012.2-2022 standard, as shown in the following table:
wherein utc_time is used to store the time at which a frame of source video data enters the encoder. During encoding, the encoder stores the utc_time of a frame in that frame's PES (Packetized Elementary Stream) header. The PES header, containing the time at which the video data entered the encoder, is stored in a data packet and sent to the switch.
(4) A TS-stream output switch, which distributes the data packets obtained by encoding the video to decoders.
(5) A decoder, which decodes the data packets to obtain the decoded video and the time stamp of each frame of the video.
In the process of implementing the present application, as shown in fig. 2, two prior-art methods for detecting the codec delay were considered. The first is the method of fig. 1 in the background art, which uses two monitors to compare the same frame of a test video containing a time code. The second compares the time stamp recorded when the video data entered the encoder with the system time of the decoding terminal at the moment the decoder outputs the decoded data for that frame. Both methods can only obtain the total delay time of the whole codec link and cannot isolate the delay time of the encoder or of the decoder.
The execution body of the embodiment of the present application may be a terminal computer, a server, or other equipment with computing capability. In view of the above problems, the embodiment of the present application provides a method for determining coding delay, as shown in fig. 3, comprising the following steps 301 to 303:
step 301, acquiring a source recording time of each frame of source video data when the source video data is input to an encoder and a coding recording time of each frame when the encoder outputs the coded video data.
The source recording time and the encoding recording time of each frame may be determined based on the same clock source, e.g., an NTP (Network Time Protocol) clock source. The detection system that records the source recording time and the encoding recording time may also use the clock source used by the encoder; for example, the encoder may be based on a PTP (Precision Time Protocol) clock. The recorded source video data may be ordinary, non-test video without an added time code. For example, in the "Hundred Cities, Thousand Screens" service, the recorded video is the video played on large screens in public places, and the video images do not contain a time code.
As shown in fig. 4, the method of the embodiment of the present application may record at a position before the video data of the signal source enters the encoder (recording point a) and at the position where the encoded data is output from the encoder (recording point b), simultaneously acquiring the time at which each frame of unencoded source video data enters the encoder (i.e., the source recording time) and the time at which each encoded frame is output from the encoder (i.e., the encoding recording time). Recording point a records the video received from the video signal source, and the recorded source video data may be stored as a file; for example, video transmitted by the signal source over the ST 2110 protocol is recorded and saved as a YUV file. Recording point b records the data packets containing the encoded TS (Transport Stream) and likewise stores them as a file. The TS consists of TS packets, which are analyzed immediately after being stored. The data of one frame of image is typically carried in several consecutively transmitted TS packets; the system time at which the TS packet containing the PES header was recorded is taken as the recording time of that frame's encoded image data (i.e., the encoding recording time). Recording points a and b may use different recording devices or the same device, and the recorded files are sent to a computer for processing.
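The recording step at point a can be sketched as follows — a minimal Python sketch, assuming a shared NTP/PTP-disciplined system clock and an iterable of raw frames (the function name and interface are illustrative, not taken from the patent):

```python
import time

def record_source_frames(frame_iter, clock=time.time):
    """Recording point a: tag each unencoded source frame with the system
    time at which it is about to enter the encoder (the source recording
    time). `clock` is injected so that a shared NTP/PTP-disciplined time
    source can be used, as the method requires."""
    for seq, frame in enumerate(frame_iter):
        # (source video sequence number, source recording time, frame data)
        yield seq, clock(), frame
```

In the recorded file, `seq` is the frame's position in the source recording sequence, which later plays the role of the source positions A1/A2.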
Step 302, determining a target coding position of a coding synchronization frame corresponding to the coding key frame and a target source position of a source synchronization frame corresponding to the source key frame according to a first position of the coding key frame in the coding recording sequence, a second position of the source key frame in the source recording sequence, a type of the coding key frame and the coded video data.
As described above, the encoded video data is in the form of data packets, and the encoded key frame is actually a frame in the decoded video — specifically, a frame whose scene changes significantly from the previous frame. For example, the previous frame shows a landscape while the encoded key frame shows a person. The encoded key frame is used as the basis for determining the coding delay because it is unique: from it, a source key frame can be identified in the source video data that has the same content as the encoded key frame and whose previous frame matches the previous frame of the encoded key frame. To determine the coding delay, the time difference must be measured between a frame whose position in the source video data is unique and the moment the encoded data of that frame is output from the encoder.
To find the encoded key frame where the scene change occurs, the data packets of the TS are decoded to obtain an ES (Elementary Stream) that can be played in video playback software. For example, if the source video data is an 8K-resolution YUV-format file, then after encoding it is decoded by an AV3 decoder back into a YUV-format file and played in a YUV player. The PTS (Presentation Time Stamp) of the encoded key frame, i.e., the first position of the encoded key frame in the encoded recording sequence, may be determined manually from the played picture. The second position (PTS) of the source key frame in the source recording sequence may likewise be determined manually.
The encoded synchronization frame may be the I frame (i.e., intra-coded reference frame) nearest to the encoded key frame. The encoded synchronization frame is needed for the following reason: the transmission order of encoded B frames and P frames differs from the order in which they are decoded and played by a player. If the encoded key frame is not an I frame, the time at which it was output from the encoder cannot be uniquely determined from its PTS (the first position), which represents only the play order, so its recording time cannot be determined accurately. It is therefore necessary to determine the encoded synchronization frame (I frame) whose PTS value is closest to that of the encoded key frame. The time at which the encoded synchronization frame was output from the encoder can then be uniquely determined from its PTS (i.e., the target encoding position), and the delay time of the encoder can be determined from the times at which this frame entered and left the encoder. If the encoded key frame is itself an I frame, it may be used directly as the encoded synchronization frame, and the source key frame as the source synchronization frame.
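The selection of the encoded synchronization frame described above can be sketched as follows (a hypothetical helper; in practice the frame types and PTS values would come from parsing the encoded stream):

```python
def nearest_i_frame_pts(key_pts, frames):
    """Given the PTS of the encoded key frame and (pts, frame_type) pairs
    from the encoded stream, return the PTS of the I frame closest to the
    key frame: the target encoding position. If the key frame is itself
    an I frame, its own PTS is returned."""
    i_frame_pts = [pts for pts, ftype in frames if ftype == "I"]
    return min(i_frame_pts, key=lambda pts: abs(pts - key_pts))

# Hypothetical GOP: an I frame every 9000 ticks (100 ms at 90 kHz)
frames = [(0, "I"), (1800, "P"), (3600, "B"), (9000, "I"), (10800, "P")]
print(nearest_i_frame_pts(3600, frames))  # → 0 (closer than 9000)
```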
The embodiment of the present application provides methods for obtaining the encoded synchronization frame and the source synchronization frame both when the encoded key frame is an I frame and when it is not.
In an alternative embodiment of the present application, the target encoding position of the encoded synchronization frame corresponding to the encoded key frame and the target source position of the source synchronization frame corresponding to the source key frame are determined, according to the first position of the encoded key frame in the encoded recording sequence, the second position of the source key frame in the source recording sequence, the type of the encoded key frame and the encoded video data, as follows:
If the type of the encoded key frame is an intra-coded reference frame, the first position is determined as the target encoding position, and the second position as the target source position of the source synchronization frame. If the type of the encoded key frame is not an intra-coded reference frame, the position of the intra-coded reference frame closest to the first position is determined as the target encoding position, and the target source position is determined from the difference between the current timestamp of the encoded key frame and that of the nearest intra-coded reference frame, the position of the source key frame, the current timestamp clock period, and the frame rate of the source video data.
In practical application, if the type of the encoded key frame is not an I frame, take a current-timestamp clock of 90 kHz and a frame rate of 50 frames/second as an example. Let the manually determined PTS of the encoded key frame be B1, the position of the source key frame be A1, and the PTS of the I frame (encoded synchronization frame) nearest to the encoded key frame be B2 (the target encoding position). The target source position A2 of the source synchronization frame may then be determined according to the following formulas (1) and (2):
D=(B2-B1)÷90÷20; (1)
A2=A1+D (2)
wherein D is the offset, in frames, between the source synchronization frame and the source key frame (dividing by 90 converts 90 kHz ticks to milliseconds, and dividing by 20 converts milliseconds to frames at 50 fps).
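Formulas (1) and (2) can be written out as a small Python helper (the 90 kHz clock and 50 fps frame rate are the example values above; the argument names are illustrative):

```python
def source_sync_position(a1, b1, b2, pts_clock_hz=90_000, fps=50):
    """Target source position A2 per formulas (1)-(2).

    a1: position of the source key frame in the source recording sequence
    b1: PTS of the encoded key frame, in PTS clock ticks
    b2: PTS of the nearest I frame (the target encoding position), in ticks
    """
    ticks_per_ms = pts_clock_hz / 1000       # 90 ticks per millisecond
    frame_ms = 1000 / fps                    # 20 ms per frame at 50 fps
    d = (b2 - b1) / ticks_per_ms / frame_ms  # formula (1): offset in frames
    return a1 + d                            # formula (2)

# Sync frame 3 frames after the key frame: 3 * 1800 ticks at 90 kHz, 50 fps
print(source_sync_position(100, 900_000, 905_400))  # → 103.0
```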
The embodiment of the present application provides a method for determining the video frames used to calculate the encoder delay. Because the frames used for determining the delay are I frames, the time at which a frame is output from the encoder can be determined more accurately from its presentation time stamp, and the delay time of the encoder can then be obtained more accurately from the difference between the frame's output time and its input time.
Step 303, determining the coding delay time of the encoder according to the coding recording time corresponding to the target coding position and the source recording time corresponding to the source video sequence number of the source synchronization frame.
Since the encoding recording time of each frame was obtained in step 301, the source recording time corresponding to the source video sequence number of the source synchronization frame can likewise be obtained, together with the encoding recording time of the encoded synchronization frame at the target encoding position in the encoded video data. The encoded synchronization frame and the source synchronization frame respectively give the time at which one frame was output from the encoder and the time at which the same frame entered it; subtracting the two yields the delay of one frame of image data through the encoder.
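Step 303 then reduces to a subtraction — a sketch assuming both recording times are expressed in seconds on the same clock source (the values in the example are hypothetical):

```python
def encoder_delay_ms(encoding_record_time, source_record_time):
    """Coding delay of the encoder: the time at which the sync frame's
    encoded data left the encoder minus the time the same frame entered
    it. Both times must come from the same clock source for the
    difference to be meaningful."""
    return (encoding_record_time - source_record_time) * 1000.0

# Hypothetical recording times (seconds since epoch on a shared clock)
print(encoder_delay_ms(12.50, 12.25))  # → 250.0 (ms)
```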
The method of the embodiment of the present application obtains, by recording, the time at which each frame of video data enters the encoder and the time at which each frame of encoded video data is output from the encoder, so that the delay of the encoder alone can be determined from the difference between the time the same frame entered the encoder and the time it was output. In addition, the frame used for determining the delay is one whose scene changes from the previous frame, i.e., a frame whose position in the video is uniquely determinable, so the delay of the encoder can be determined more accurately. The method achieves the technical effect that the accuracy of the detected coding delay is not affected by the terminal decoder, the encoder end can be tested independently without the cooperation of a full-link system, and the delay of the encoder can be determined accurately.
In an alternative embodiment of the present application, as shown in fig. 5, there is provided a method of determining whether the time stamp in the encoded video data is accurate, comprising steps 501-502:
Step 501, determining the target entry time at which the source key frame entered the encoder, according to the target source position and the packetized elementary stream header of each frame of the encoded video data.
Step 502, determining a difference between a target entry time and a target source time.
This embodiment is used to determine whether the time stamp embedded in the video data by the encoder is accurate. Typically, when the encoder encodes source video data, the time at which each frame entered the encoder is stored in the header of the encoded file packet (i.e., the packetized elementary stream header). Therefore, from the recorded time at which the source key frame entered the encoder (the recording time at the target source position), the embedded entry time can be checked, and whether the timestamp of the encoded key frame is accurate can be judged.
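Steps 501-502 can be sketched as a comparison of the embedded stamp against the recorded entry time, with a tolerance (the 1 ms tolerance is an illustrative assumption, not from the patent):

```python
def embedded_timestamp_error_ms(embedded_utc_time, recorded_entry_time):
    """Deviation of the utc_time the encoder embedded in the PES header
    from the independently recorded time the frame entered the encoder,
    in milliseconds (positive means the embedded stamp is late)."""
    return (embedded_utc_time - recorded_entry_time) * 1000.0

def timestamp_is_accurate(embedded_utc_time, recorded_entry_time,
                          tolerance_ms=1.0):
    """True if the embedded time stamp matches the recorded entry time
    within tolerance_ms (the default is an assumed illustrative bound)."""
    return abs(embedded_timestamp_error_ms(embedded_utc_time,
                                           recorded_entry_time)) <= tolerance_ms
```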
In an alternative embodiment of the present application, as shown in fig. 6, there is provided a method of determining, from the data packets output by the encoder, the time at which each frame of encoded video data left the encoder, comprising steps 601-603:
Step 601, for each user datagram protocol packet output by the encoder, obtaining the packet and its recording time.
Step 602, obtaining each transport stream packet contained in the user datagram protocol packet.
Step 603, if one of the transport stream packets contains a packetized elementary stream packet header, determining the recording time of that user datagram protocol packet as the encoding recording time of one frame.
Wherein each user datagram protocol packet comprises one or more transport stream packets, and the transport stream packet as the head of a frame of encoded video data comprises a packetized elementary stream packet header.
After encoding the video data, the encoder encapsulates the source video data ES into PES packets, converts these into TS packets, and packs the TS packets into UDP (User Datagram Protocol) packets for transmission to the decoder. Accordingly, the time at which each frame of encoded video data is output from the encoder must be obtained in the manner provided above: first, record the time at which each UDP packet is output from the encoder (i.e., the protocol packet recording time); then determine which TS packet within the UDP packets contains the PES packet header. The output time of the UDP packet containing the PES packet header is the time at which one frame of encoded image data was output from the encoder.
The embodiment of the present application determines the time at which the UDP packet containing the PES packet header was output from the encoder. Because the TS packet containing the PES packet header is the start of the group of TS packets corresponding to one video frame, the time at which one frame of video left the encoder can be located more precisely, and the delay time of the encoder can thus be calculated more accurately from that time.
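The parsing described above can be sketched with the payload_unit_start_indicator bit of the TS header (per MPEG-TS, a PES packet begins in a TS packet whose PUSI bit is set). This is a minimal sketch, not a full ISO/IEC 13818-1 demultiplexer:

```python
TS_PACKET_SIZE = 188

def starts_new_pes(ts_packet: bytes) -> bool:
    """True if this 188-byte TS packet carries the start of a PES packet,
    i.e. the head of one frame of encoded video data. Checks the sync
    byte (0x47) and the payload_unit_start_indicator bit (0x40 of the
    second header byte)."""
    return (len(ts_packet) == TS_PACKET_SIZE
            and ts_packet[0] == 0x47
            and bool(ts_packet[1] & 0x40))

def encoding_record_times(udp_packets):
    """udp_packets: iterable of (recording_time, payload) pairs, where
    each payload is a run of consecutive 188-byte TS packets. Yields the
    recording time of every UDP packet that contains a PES packet header,
    i.e. the encoding recording time of one frame."""
    for recording_time, payload in udp_packets:
        ts_packets = (payload[i:i + TS_PACKET_SIZE]
                      for i in range(0, len(payload), TS_PACKET_SIZE))
        if any(starts_new_pes(p) for p in ts_packets):
            yield recording_time
```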
In an alternative embodiment of the present application, as shown in fig. 7, there is provided a method for extracting a basic code stream for playing by a player from encoded video data, comprising steps 701 to 704:
step 701, determining a program mapping table of a designated program according to the program association table;
as above, encoded video data is typically output from the encoder in the form of data packets. In order to determine the encoded key frames of the encoded video data, the encoded video data needs to be decoded, and then the decoded video is played by a player. To meet the playing requirements of the player, PAT (Program Association Table ) and PMT (Program Map Table, program map table) need to be extracted from the encoded data. Determining the PAT may first determine whether the PID of the TS header of each TS packet is 0 in order according to the specific PID of the PAT, the PID of the PAT being 0, to determine the PAT. The PAT contains PMT of the specified program to be recorded.
Step 702, determining the video transport stream packet number of the video of the specified program according to the program mapping table.
According to the transport stream packet number (PID) of the video recorded in the PMT (i.e., the video transport stream packet number), all TS packets whose headers carry that PID can be collected, and the ES can be extracted from these TS packets. The ES extracted from the TS packets may contain multiple types of frames; to facilitate playback by a subsequent video player, an I frame can be located in the ES, and the video player can begin playback from that I frame.
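Step 702's filtering by the video PID can be sketched like this; concatenating the matching payloads (and then stripping the PES headers, not shown) yields the ES. The helper is illustrative, assuming 188-byte TS packets.

```python
TS_SIZE = 188  # standard MPEG-TS packet size

def collect_video_payload(ts_stream: bytes, video_pid: int) -> bytes:
    """Concatenate the payloads of every TS packet carrying the video PID
    obtained from the PMT; the PES-header removal that would follow to
    recover the raw ES is omitted here."""
    out = bytearray()
    for off in range(0, len(ts_stream) - TS_SIZE + 1, TS_SIZE):
        ts = ts_stream[off:off + TS_SIZE]
        if ts[0] != 0x47:                 # not aligned on a sync byte
            continue
        pid = ((ts[1] & 0x1F) << 8) | ts[2]
        if pid != video_pid:
            continue
        afc = (ts[3] >> 4) & 0x03
        payload = ts[4:]
        if afc in (2, 3) and payload:     # skip adaptation field if present
            payload = payload[1 + payload[0]:]
        out += payload
    return bytes(out)
```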
The embodiment of the application can acquire the PAT and PMT data in the encoded TS stream, so that an ES for playback can be obtained from the TS stream according to the PAT and the PMT and then played in a video player. In this way, the key frames at scene switches in the encoded video data can be determined more conveniently, and the delay time of the encoder can then be determined from these key frames.
In an alternative embodiment of the present application, as shown in fig. 8, there is provided a method of determining an encoder and decoder full link delay, comprising steps 801-803:
step 801, obtaining a decoding recording time of each frame when the decoder outputs the decoded video data.
As shown in fig. 9, in the embodiment of the present application, a recording point c may be set at the output end of the decoder for recording the video data output from the decoder (i.e., the decoded video data), and the decoding recording time of each frame of the decoded video data is obtained at the time of recording. The full-link delay time of the encoder and decoder is then the time difference between the moment recording point a records a frame of source video data entering the encoder and the moment the frame with the same content is output from the decoder and recorded by the recording device at recording point c.
For example, the encoder outputs a TS stream, which is decoded into a 4 × 12G-SDI signal by the decoder; the SDI baseband signal recording apparatus records the video frames output by the decoder through an SDI interface and records the system time at which each frame is recorded.
Step 802, determining a target decoding position of a decoding synchronization frame corresponding to the decoding key frame according to the target encoding position and a third position of the decoding key frame in the decoding recording sequence.
The decoded key frame may be determined by playing the decoded video data with video playback software. The decoded video data can be saved as a file in YUV format and played by a video player that supports the YUV format. The decoded key frame is the frame of the decoded video data whose content is the same as that of the encoded key frame, and it may be determined manually.
The manner of determining the decoding synchronization frame from the decoding key frame is similar to the method of determining the source synchronization frame from the source key frame above, and likewise depends on the type of the encoded key frame. If the encoded key frame is an I frame, the decoded key frame is itself the decoded sync frame. If the encoded key frame is not an I frame, the decoded sync frame can be obtained according to formulas (2) and (3):
C2 = C1 + D (3)
where D is the result of formula (2) above, C1 is the PTS of the decoded key frame, and C2 is the PTS of the decoded sync frame.
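Formula (3) and the lookup of the decoding synchronization frame can be sketched directly. D comes from formula (2), which is not reproduced in this excerpt, so it is treated here as a given PTS offset; the helper names are ours.

```python
def decoded_sync_pts(c1: int, d: int) -> int:
    """Formula (3): C2 = C1 + D.  c1 is the PTS of the decoded key frame,
    d is the PTS offset produced by formula (2)."""
    return c1 + d

def frame_index_for_pts(pts_list: list, target_pts: int) -> int:
    """Locate the decoding synchronization frame: the position in the
    decoding recording sequence whose PTS equals C2."""
    return pts_list.index(target_pts)
```

For example, with a 90 kHz PTS clock and a 3600-tick offset (two frames at 50 fps), a key frame at PTS 90000 maps to a sync frame at PTS 93600.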
Step 803, determining the total delay time of the encoding and decoding link consisting of the encoder and the decoder according to the decoding recording time corresponding to the target decoding position and the source recording time corresponding to the target source position.
The embodiment of the application can acquire the time at which each frame of decoded video is output from the decoder, and can determine the position of a single frame in the playing sequence of the video output from the decoder, so that the total delay time of the whole link formed by the encoder and the decoder can be determined more accurately from the time at which the frame enters the encoder and the time at which it is output from the decoder.
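Step 803 reduces to a time difference between two recorded instants. A minimal sketch, assuming the recording times are kept as per-frame lists indexed by position in the respective recording sequences (names and units are ours):

```python
def full_link_delay_ms(source_times: list, decode_times: list,
                       target_source_pos: int, target_decode_pos: int) -> float:
    """Total delay of the encoder-decoder link: decoding recording time of
    the decoding sync frame minus source recording time of the source sync
    frame with the same content.  Times are in seconds; result in ms."""
    return (decode_times[target_decode_pos]
            - source_times[target_source_pos]) * 1000.0
```

With a frame entering the encoder at t = 0.04 s and its same-content frame leaving the decoder at t = 0.29 s, the full-link delay is 250 ms.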
In an alternative embodiment of the present application, there is provided a method for determining a decoding delay time of a decoder, further comprising: and determining the decoding delay time of the decoder according to the decoding recording time corresponding to the target decoding position and the encoding recording time corresponding to the target encoding position.
An exemplary block diagram of a method for determining the coding delay time and the decoding delay time according to an embodiment of the present application is shown in fig. 10. The ST 2110 signal from the video source is recorded at recording point a to obtain the source video data, and the host UTC time at which each frame of the source video data arrives, i.e., the time at which each frame enters the encoder, is recorded. The host UTC time refers to the clock source on which the detection system is based. A custom-written program records the TS, parses the timestamp in the PES in real time, and records the host UTC time at which the TS packet carrying the PES header arrives; this corresponds to extracting the TS packets from the UDP packets above and determining the time at which each frame of encoded video data is output from the encoder according to the arrival time of each TS packet containing a PES packet header. Another custom-written program finds the first I frame after the first PAT/PMT and extracts that frame and the ES that follows it, corresponding to steps 701 to 704 of the method of extracting the elementary stream for player playback from the encoded video data above. A further custom-written program records the SDI signal, saves it as a V210-format file, and records the host time at which each frame arrives, corresponding to obtaining the decoded video data output from the decoder above and the time at which each frame of decoded video data is output from the decoder. The source video data, the decoded video data, and the decoded encoded video data are all files in YUV format; they are played with a player that supports the YUV format, and the encoded key frame, the source key frame, and the decoded key frame are determined manually from the content of each played frame.
The encoding synchronization frame, the source synchronization frame, and the decoding synchronization frame are obtained on the basis of the encoding key frame, the source key frame, and the decoding key frame. And determining UTC time corresponding to each frame according to sequence numbers of the coding synchronous frame, the source synchronous frame and the decoding synchronous frame, and further determining coding delay time and decoding delay time.
As shown in fig. 11, an embodiment of the present application provides an apparatus 1100 for determining a coding delay, including a first acquisition module 1110, a first determination module 1120, and a second determination module 1130:
the first obtaining module 1110 is configured to obtain a source recording time of each frame of the source video data when the source video data is input to the encoder and an encoding recording time of each frame of the source video data when the encoder outputs the encoded video data;
the first determining module 1120 is configured to determine, according to a first position of the encoded key frame in the encoded recording sequence, a second position of the source key frame in the source recording sequence, the type of the encoded key frame, and the encoded video data, a target encoding position of the encoding synchronization frame corresponding to the encoded key frame and a target source position of the source synchronization frame corresponding to the source key frame;
the second determining module 1130 is configured to determine a coding delay time of the encoder according to a coding recording time corresponding to the target coding position and a source recording time corresponding to a source video sequence number of the source synchronization frame.
In an alternative embodiment of the present application, the first obtaining module 1110 is specifically configured to:
for each user datagram protocol packet output by an encoder, acquiring the user datagram protocol packet and the protocol packet recording time of the user datagram protocol packet;
Obtaining each transport stream packet contained in the user datagram protocol packet;
if any transport stream packet contains a packetized elementary stream (PES) packet header, determining the protocol packet recording time as the encoding recording time of one frame.
In an alternative embodiment of the present application, the first obtaining module 1110 further includes:
a first determining submodule, configured to determine, if at least one transport stream packet contains the program association table, the video or audio program number of the program map table according to the program association table;
a first extraction sub-module, configured to extract, from each transport stream packet, a program transport stream packet having a packet number that is a program number;
the second extraction submodule is used for extracting a target transport stream packet which contains a packaging basic code stream packet head and corresponds to the earliest protocol packet recording time from all program transport stream packets;
and the third extraction sub-module is used for extracting the basic stream data from all the program transport stream packets according to the package basic stream packet heads of the target transport stream packets.
In an alternative embodiment of the present application, the apparatus 1100 further comprises:
the second acquisition module is used for acquiring the decoding recording time of each frame when the decoder outputs the decoded video data;
The third determining module is configured to determine, according to the first position of the encoded key frame in the encoded recording sequence, the second position of the source key frame in the source recording sequence, the type of the encoded key frame, and the encoded video data, the target encoding position of the encoding synchronization frame corresponding to the encoded key frame and the target source position of the source synchronization frame corresponding to the source key frame. The apparatus 1100 then further includes:
a fourth determining module, configured to determine, according to the target encoding position and the third position of the decoding key frame in the decoding recording sequence, a target decoding position of a decoding synchronization frame corresponding to the decoding key frame;
and the fifth determining module is used for determining the total delay time of the encoding and decoding link consisting of the encoder and the decoder according to the decoding recording time corresponding to the target decoding position and the source recording time corresponding to the target source position.
In an alternative embodiment of the present application, the fourth determining module is further configured to:
and determining the decoding delay time of the decoder according to the decoding recording time corresponding to the target decoding position and the encoding recording time corresponding to the target encoding position.
In an alternative embodiment of the present application, the third determining module is further configured to:
determining the target entry time of a source key frame into an encoder according to the target source position and the packet header of a packed basic code stream of each frame of encoded video data; the difference between the target entry time and the target source time is determined.
In an alternative embodiment of the present application, the third determining module is specifically configured to:
if the type of the coding key frame is determined to be the intra-frame reference coding frame, determining the first position as a target coding position, and determining the second position as a target source position of the source synchronous frame;
if the type of the coding key frame is determined not to be the intra-frame reference coding frame, determining a fourth position of the intra-frame reference coding frame closest to the first position as a target coding position; the target source position is determined based on the difference between the current timestamp of the encoded key frame and the current timestamp of the nearest intra-frame reference encoded frame, the source video sequence number of the source synchronization key frame, the current timestamp clock period, and the frame rate of the source video data.
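The second branch above can be sketched as a PTS-to-frame-count conversion. The 90 kHz PTS clock, the 50 fps frame rate, and the sign convention (an I frame later than the key frame gives a positive offset) are assumptions here, since formula (2) itself is not reproduced in this excerpt:

```python
def target_source_position(key_pos: int, pts_key: int, pts_i: int,
                           pts_clock_hz: int = 90000,
                           fps: float = 50.0) -> int:
    """Shift the key frame's position in the source recording sequence by
    the frame count corresponding to the PTS gap between the encoded key
    frame and its nearest intra-frame reference (I) frame.
    One frame spans pts_clock_hz / fps PTS ticks (1800 at 90 kHz / 50 fps)."""
    frame_offset = round((pts_i - pts_key) * fps / pts_clock_hz)
    return key_pos + frame_offset
```

For example, an I frame 3600 ticks after the key frame is two frames later at 50 fps, so the target source position is the key frame's source position plus two.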
In one embodiment, a computer device is provided, the internal structure of which may be as shown in FIG. 12. The computer device includes a processor, a memory, a network interface, and a database connected by a system bus. The processor of the computer device is configured to provide computing and control capabilities. The memory of the computer device includes a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system, a computer program, and a database. The internal memory provides an environment for the operation of the operating system and the computer program in the non-volatile storage medium. The database of the computer device is used for storing data. The network interface of the computer device is used for communicating with an external terminal through a network connection. In short, the computer device comprises a memory storing a computer program and a processor that implements the steps of any of the methods of determining the coding delay described above when executing the computer program.
It will be appreciated by those skilled in the art that embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein. The scheme in the embodiments of the application can be realized in various computer languages, such as the C language, the VHDL language, the Verilog language, the object-oriented programming language Java, the scripting language JavaScript, and the like.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
In the description of the present application, it should be understood that the terms "center", "longitudinal", "lateral", "length", "width", "thickness", "upper", "lower", "front", "rear", "left", "right", "vertical", "horizontal", "top", "bottom", "inner", "outer", etc. indicate orientations or positional relationships based on the drawings are merely for convenience in describing the present application and simplifying the description, and do not indicate or imply that the device or element in question must have a specific orientation, be configured and operated in a specific orientation, and thus should not be construed as limiting the present application.
Furthermore, the terms "first," "second," and the like, are used for descriptive purposes only and are not to be construed as indicating or implying a relative importance or implicitly indicating the number of technical features indicated. Thus, a feature defining "a first" or "a second" may explicitly or implicitly include one or more such feature. In the description of the present application, the meaning of "plurality" means at least two, for example, two, three, etc., unless specifically defined otherwise.
In the present application, unless explicitly specified and limited otherwise, the terms "mounted," "connected," "secured," and the like are to be construed broadly, and may be, for example, fixedly connected, detachably connected, or integrally formed; may be mechanically connected, may be electrically connected or may communicate with each other; can be directly connected or indirectly connected through an intermediate medium, and can be communicated with the inside of two elements or the interaction relationship of the two elements. The specific meaning of the above terms in the present application can be understood by those of ordinary skill in the art according to the specific circumstances.
While preferred embodiments of the present application have been described, additional variations and modifications in those embodiments may occur to those skilled in the art once they learn of the basic inventive concepts. It is therefore intended that the following claims be interpreted as including the preferred embodiments and all such alterations and modifications as fall within the scope of the application.
It will be apparent to those skilled in the art that various modifications and variations can be made to the present application without departing from the spirit or scope of the application. Thus, it is intended that the present application also include such modifications and alterations insofar as they come within the scope of the appended claims or the equivalents thereof.

Claims (9)

1. A method of determining coding delay, comprising:
acquiring source recording time of each frame of source video data when the source video data is input into an encoder and encoding recording time of each frame when the encoder outputs encoded video data; the source video data is video data recorded according to a source recording sequence, wherein the source recording sequence comprises a source key frame and a source synchronous frame;
acquiring an encoded key frame from the encoded video data; the coding key frame refers to a frame with obviously changed scene compared with the previous frame in the coded video;
determining a target coding position of a coding synchronous frame corresponding to the coding key frame according to a first position of the coding key frame in a coding recording sequence, a second position of the source key frame in the source recording sequence, the type of the coding key frame and the coding video data; wherein the coding synchronization frame refers to an intra-frame reference coding frame nearest to the coding key frame; the corresponding relation between the coding synchronous frame and the coding key frame means that the coding synchronous frame and the source synchronous frame respectively correspond to the output time of a frame from the encoder and the input time of the encoder;
Determining a target source position of a source synchronous frame corresponding to the source key frame according to the target coding position of the coding synchronous frame and the source video data;
and determining the coding delay time of the coder according to the coding recording time corresponding to the target coding position and the source recording time corresponding to the target source position of the source synchronous frame.
2. The method for determining coding delay of claim 1, wherein said obtaining a coding recording time per frame when the encoder outputs coded video data comprises:
acquiring a user datagram protocol packet and a protocol packet recording time of the user datagram protocol packet for each user datagram protocol packet output by the encoder;
obtaining each transport stream packet contained in the user datagram protocol packet;
and if the transport stream packet comprises a packaging basic code stream packet head, determining the recording time of the protocol packet as the encoding recording time of one frame.
3. The method of determining coding delay of claim 2, wherein after obtaining each transport stream packet contained in the user datagram protocol packet, the method further comprises:
Determining a program mapping table of a designated program according to the program association table;
and determining the video transport stream packet number of the video of the appointed program according to the program mapping table.
4. The method of determining an encoding delay of claim 1, wherein the encoder outputs the encoded video data to a decoder, the method further comprising, after the obtaining a source recording time per frame of source video data at the time of input to the encoder and an encoding recording time per frame of encoded video data at the time of output of the encoder:
acquiring decoding recording time of each frame when the decoder outputs the decoded video data;
after determining the target coding position of the coding synchronous frame corresponding to the coding key frame and the target source position of the source synchronous frame corresponding to the source key frame according to the first position of the coding key frame in the coding recording sequence, the second position of the source key frame in the source recording sequence, the type of the coding key frame and the coding video data, the method further comprises:
determining a target decoding position of a decoding synchronous frame corresponding to the decoding key frame according to the target coding position and a third position of the decoding key frame in the decoding recording sequence;
And determining the total delay time of the encoding and decoding link consisting of the encoder and the decoder according to the decoding recording time corresponding to the target decoding position and the source recording time corresponding to the target source position.
5. The method of determining coding delay of claim 4, wherein after determining the target decoding position of the decoding synchronization frame corresponding to the decoding key frame based on the target coding position and a third position of the decoding key frame in the decoding recording sequence, the method further comprises:
and determining the decoding delay time of the decoder according to the decoding recording time corresponding to the target decoding position and the encoding recording time corresponding to the target encoding position.
6. The method for determining coding delay of claim 1, wherein determining the target coding position of the coding synchronization frame corresponding to the coding key frame based on the first position of the coding key frame in the coding recording sequence, the second position of the source key frame in the source recording sequence, and the type of the coding key frame and the coded video data comprises:
if the type of the coding key frame is determined to be an intra-frame reference coding frame, determining the first position as the target coding position, and determining the second position as a target source position of the source synchronous frame;
If the type of the coding key frame is determined not to be the intra-frame reference coding frame, determining a fourth position of the intra-frame reference coding frame closest to the first position as the target coding position; the target source position is determined from a difference between a current timestamp of the encoded key frame and a current timestamp of the nearest intra-frame reference encoded frame, the target source position of the source synchronization frame, a current timestamp clock period, and a frame rate of the source video data.
7. An apparatus for determining coding delay, comprising:
the acquisition module is used for acquiring the source recording time of each frame of the source video data when the source video data is input into the encoder and the encoding recording time of each frame when the encoder outputs the encoded video data; the source video data is video data recorded according to a source recording sequence, wherein the source recording sequence comprises a source key frame and a source synchronous frame; the method comprises the steps of,
acquiring an encoded key frame from the encoded video data; the coding key frame refers to a frame with obviously changed scene compared with the previous frame in the coded video;
the first determining module is used for determining a target coding position of a coding synchronous frame corresponding to the coding key frame according to a first position of the coding key frame in a coding recording sequence, a second position of the source key frame in the source recording sequence, the type of the coding key frame and the coding video data; wherein the coding synchronization frame refers to an intra-frame reference coding frame nearest to the coding key frame; the corresponding relation between the coding synchronous frame and the coding key frame means that the coding synchronous frame and the source synchronous frame respectively correspond to the output time of a frame from the encoder and the input time of the encoder;
A second determining module, configured to determine a target source position of a source synchronization frame corresponding to the source key frame according to the target encoding position of the encoding synchronization frame and the source video data; and
and determining the coding delay time of the coder according to the coding recording time corresponding to the target coding position and the source recording time corresponding to the target source position of the source synchronous frame.
8. An electronic device, comprising:
a memory;
a processor; and
a computer program;
wherein the computer program is stored in the memory and configured to be executed by the processor to implement the method of determining coding delay according to any of claims 1-6.
9. A computer-readable storage medium, characterized in that a computer program is stored thereon; the computer program being executed by a processor to implement the method of determining coding delay as claimed in any one of claims 1 to 6.
CN202310729514.XA 2023-06-20 2023-06-20 Method and device for determining coding delay, electronic equipment and storage medium Active CN116489342B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310729514.XA CN116489342B (en) 2023-06-20 2023-06-20 Method and device for determining coding delay, electronic equipment and storage medium


Publications (2)

Publication Number Publication Date
CN116489342A CN116489342A (en) 2023-07-25
CN116489342B true CN116489342B (en) 2023-09-15

Family

ID=87221692

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310729514.XA Active CN116489342B (en) 2023-06-20 2023-06-20 Method and device for determining coding delay, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN116489342B (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2017177675A1 (en) * 2016-04-11 2017-10-19 华为技术有限公司 Video coding method and device
CN113141522A (en) * 2020-01-17 2021-07-20 北京达佳互联信息技术有限公司 Resource transmission method, device, computer equipment and storage medium
WO2022000350A1 (en) * 2020-06-30 2022-01-06 深圳市大疆创新科技有限公司 Video transmission method, mobile platform, terminal device, video transmission system, and storage medium
WO2023010992A1 (en) * 2021-08-02 2023-02-09 腾讯科技(深圳)有限公司 Video coding method and apparatus, computer readable medium, and electronic device

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9192859B2 (en) * 2002-12-10 2015-11-24 Sony Computer Entertainment America Llc System and method for compressing video based on latency measurements and other feedback


Also Published As

Publication number Publication date
CN116489342A (en) 2023-07-25

Similar Documents

Publication Publication Date Title
US20220078491A1 (en) Transmitting method
KR101828639B1 (en) Method for synchronizing multimedia flows and corresponding device
KR100972792B1 (en) Synchronizer and synchronizing method for stereoscopic image, apparatus and method for providing stereoscopic image
KR101689616B1 (en) Method for transmitting/receiving media segment and transmitting/receiving apparatus thereof
US20100110199A1 (en) Measuring Video Quality Using Partial Decoding
CN102037728B (en) Device and method for synchronizing an interactive mark to streaming content
CN101828351B (en) Apparatus and method for storing and reading a file having a media data container and a metadata container
US20130336379A1 (en) System and Methods for Encoding Live Multimedia Content with Synchronized Resampled Audio Data
KR20120084252A (en) Receiver for receiving a plurality of transport stream, transmitter for transmitting each of transport stream, and reproducing method thereof
KR100837720B1 (en) Method and Apparatus for synchronizing data service with video service in Digital Multimedia Broadcasting and Executing Method of Data Service
JP5720051B2 (en) Method and apparatus for measuring delay variation in digital stream
KR20130032842A (en) Media data transmission apparatus and method, and media data reception apparatus and method in mmt system
CN112565224B (en) Video processing method and device
EP2429136A1 (en) Method and apparatus for carrying transport stream
KR102171652B1 (en) Transmission device, transmission method, reception device, and reception method
JP2018182677A (en) Information processing apparatus, information processing method, program, and recording medium manufacturing method
JP2010531087A (en) System and method for transmission of constant bit rate streams
JP2007522696A (en) Method and apparatus for synchronizing audio and video presentation
JP6957186B2 (en) Information processing equipment, information processing methods, programs, and recording medium manufacturing methods
CN116489342B (en) Method and device for determining coding delay, electronic equipment and storage medium
KR101748382B1 (en) Method and system for providing video streaming
KR20170083844A (en) Set-Top Box for Measuring Frame Loss in a Video Stream and Method for Operating Same
TWI762980B (en) Method for debugging digital stream and circuit system thereof
TWI713364B (en) Method for encoding raw high frame rate video via an existing hd video architecture
EP2150066A1 (en) Procedure for measuring the change channel time on digital television

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant