CN113055711A - Audio and video synchronization detection method and detection system thereof

Publication number: CN113055711A
Authority: CN (China)
Prior art keywords: audio, video, standard, signal, time stamp
Legal status: Granted, Active
Application number: CN202110198925.1A
Other languages: Chinese (zh)
Other versions: CN113055711B (en)
Inventors: 李政, 邓志明
Current Assignee: Xunlei Computer Shenzhen Co ltd
Original Assignee: Xunlei Computer Shenzhen Co ltd
Application filed by Xunlei Computer Shenzhen Co ltd
Priority to CN202110198925.1A
Publication of CN113055711A
Application granted
Publication of CN113055711B

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20 Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/23 Processing of content or additional data; Elementary server operations; Server middleware
    • H04N21/234 Processing of video elementary streams, e.g. splicing of video streams, manipulating MPEG-4 scene graphs
    • H04N21/2343 Processing of video elementary streams, e.g. splicing of video streams, manipulating MPEG-4 scene graphs involving reformatting operations of video signals for distribution or compliance with end-user requests or end-user device requirements
    • H04N21/234309 Processing of video elementary streams, e.g. splicing of video streams, manipulating MPEG-4 scene graphs involving reformatting operations of video signals for distribution or compliance with end-user requests or end-user device requirements by transcoding between formats or standards, e.g. from MPEG-2 to MPEG-4 or from Quicktime to Realvideo
    • H04N21/242 Synchronization processes, e.g. processing of PCR [Program Clock References]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43 Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/4302 Content synchronisation processes, e.g. decoder synchronisation
    • H04N21/44 Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream, rendering scenes according to MPEG-4 scene graphs
    • H04N21/4402 Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream, rendering scenes according to MPEG-4 scene graphs involving reformatting operations of video signals for household redistribution, storage or real-time display
    • H04N21/440218 Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream, rendering scenes according to MPEG-4 scene graphs involving reformatting operations of video signals for household redistribution, storage or real-time display by transcoding between formats or standards, e.g. from MPEG-2 to MPEG-4
    • H04N21/80 Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
    • H04N21/85 Assembly of content; Generation of multimedia applications
    • H04N21/854 Content authoring
    • H04N21/8547 Content authoring involving timestamps for synchronizing content

Abstract

The application discloses an audio and video synchronization detection method and a detection system thereof. The method comprises the following steps: encoding a standard audio and video by using the code to be detected to form the audio and video to be detected, wherein the standard audio and video comprises standard jump time stamps at which a periodic level conversion occurs along the time axis direction; decoding the audio signal and the video signal of the audio and video to be detected respectively, and detecting the jump time stamp of the audio signal and the jump time stamp of the video signal; and comparing the jump time stamp of the audio signal and the jump time stamp of the video signal with the standard jump time stamp respectively to judge whether the audio signal and the video signal are synchronous. With this method, whether the encoded audio signal and the encoded video signal of the audio and video to be detected are synchronous can be accurately detected.

Description

Audio and video synchronization detection method and detection system thereof
Technical Field
The invention relates to the technical field of audio and video synchronization detection, in particular to an audio and video synchronization detection method and an audio and video synchronization detection system.
Background
When a terminal decoder uses a soft decoding scheme to decode and render video stream data and audio stream data independently, the audio and the video may fall out of synchronization during decoding and rendering. For example, when the audio and video decoder software has a defect, the audio decoding and rendering time may differ from the video decoding and rendering time, so that the audio and the video are no longer synchronous.
Existing audio and video synchronization detection techniques mainly fall into two types. The first is manual: an operator watches the video at the receiving end, clips it, and listens to the accompanying audio to conclude whether the audio and the video are synchronous. This method consumes labor and has low accuracy. The second uses precision instruments: a dual-trace storage oscilloscope stores the waveforms of the audio signal and the video signal and the time difference is read from the scale, or a professional millisecond meter is used for the measurement. The hardware cost is high, and the measurement precision is limited by the precision of the instrument and by errors caused by human factors.
Disclosure of Invention
The technical problem mainly solved by the application is to provide an audio and video synchronization detection method and system to solve the problem that the elongation/shortening of the audio and video cannot be accurately detected in the prior art.
In order to solve the above problem, the present application provides an audio and video synchronization detection method, including: encoding a standard audio and video by using the code to be detected to form the audio and video to be detected, wherein the standard audio and video comprises standard jump time stamps at which a periodic level conversion occurs along the time axis direction; decoding the audio signal and the video signal of the audio and video to be detected respectively, so as to detect the jump time stamp of the audio signal and the jump time stamp of the video signal; and comparing the jump time stamp of the audio signal and the jump time stamp of the video signal with the standard jump time stamp respectively to judge whether the audio signal and the video signal are synchronous.
The step of comparing the jump time stamp of the audio signal and the jump time stamp of the video signal with the standard jump time stamp to judge whether the audio signal and the video signal are synchronous comprises: judging whether the jump time stamp of the audio signal jumps synchronously with the standard jump time stamp, and judging whether the jump time stamp of the video signal jumps synchronously with the standard jump time stamp; if both jump synchronously with the standard jump time stamp, the audio signal is synchronous with the video signal, and the audio and video to be detected is synchronous with the standard audio and video.
The step of comparing the jump time stamp of the audio signal and the jump time stamp of the video signal with the standard jump time stamp to judge whether the audio signal and the video signal are synchronous may alternatively comprise: calculating the difference between the jump time stamp of the audio signal and the standard jump time stamp, and the difference between the jump time stamp of the video signal and the standard jump time stamp; and judging whether the two differences are the same; if so, the audio signal and the video signal are synchronous.
Before the step of encoding the standard audio and video by using the code to be detected to form the audio and video to be detected, the method further comprises: generating a standard audio and video whose audio and video undergo periodic level conversion along the time axis direction.
The step of generating the standard audio and video comprises: constructing audio PCM data and performing periodic level conversion on the audio PCM data along the time axis direction; and constructing video YUV data and performing periodic level conversion on the video YUV data, synchronously with the audio PCM data, along the time axis direction.
The step of decoding the audio signal and the video signal of the audio and video to be detected respectively and detecting the jump time stamp of the audio signal and the jump time stamp of the video signal comprises: demultiplexing the audio and video to be detected to separate its encoded audio data and encoded video data, and decoding the encoded data to obtain the audio PCM data and the video YUV data of the audio and video to be detected; and analyzing the audio PCM data and the video YUV data, and recording the jump time stamp of the audio PCM data and the jump time stamp of the video YUV data. The step of comparing the jump time stamp of the audio signal and the jump time stamp of the video signal with the standard jump time stamp respectively to judge whether the audio signal and the video signal are synchronous comprises: calculating the difference between the jump time stamp of the audio PCM data and the standard jump time stamp, and the difference between the jump time stamp of the video YUV data and the standard jump time stamp; and judging whether the two differences are the same; if so, the audio signal and the video signal are synchronous.
Before the step of calculating the difference between the jump time stamp of the audio PCM data and the standard jump time stamp and the difference between the jump time stamp of the video YUV data and the standard jump time stamp, the method further comprises: filtering the audio PCM data.
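The application does not name the filter to be used. As one possible illustration only, a sliding median filter removes isolated spikes from decoded PCM data while keeping level transitions sharp, which suits later edge detection on a two-level signal. The Python sketch below is not taken from the application; the function name `median_filter` and the window size are assumptions made for this example.

```python
from statistics import median


def median_filter(samples, window=5):
    """Return a copy of `samples` smoothed by a sliding median.

    `window` is a hypothetical odd window length; near the ends of the
    sequence the window is simply truncated.
    """
    half = window // 2
    return [
        median(samples[max(0, n - half): n + half + 1])
        for n in range(len(samples))
    ]


# A single spike inside the low level is removed, and the real
# low-to-high transition stays at the same sample index.
noisy = [-100, -100, 900, -100, -100, 100, 100, 100, 100]
clean = median_filter(noisy, window=3)
```

A median is used here instead of a moving average because an average smears the edge over several samples, which would blur the jump time stamp being measured.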
The application also provides an audio and video synchronization detection system, comprising: an encoding unit, configured to encode the standard audio and video by using the code to be detected to form the audio and video to be detected, wherein the standard audio and video comprises standard jump time stamps at which a periodic level conversion occurs along the time axis direction; an acquisition unit, configured to acquire the audio signal and the video signal of the audio and video to be detected, decode them, and acquire the jump time stamp of the decoded audio signal and the jump time stamp of the decoded video signal; and a judging unit, configured to compare the jump time stamp of the audio signal and the jump time stamp of the video signal with the standard jump time stamp respectively, so as to judge whether the audio signal and the video signal are synchronous.
The audio and video synchronization detection system further comprises: a calculating unit, configured to calculate the difference between the jump time stamp of the audio signal and the standard jump time stamp, and the difference between the jump time stamp of the video signal and the standard jump time stamp. The judging unit is further configured to judge whether the two differences are the same; if so, the audio signal and the video signal are synchronous.
The audio and video synchronization detection system further comprises: a standard audio and video generating unit, configured to generate a standard audio and video whose audio data and video data undergo periodic level conversion along the time axis direction.
The beneficial effect of the present application is as follows: the standard audio and video is encoded by using the code to be detected to form the audio and video to be detected; the audio and video to be detected is then decoded, its audio signal and video signal are analyzed, and the jump time stamp of the audio signal and the jump time stamp of the video signal are detected, so as to judge whether the audio signal and the video signal of the audio and video to be detected are synchronous.
Drawings
In order to illustrate the technical solutions in the embodiments of the present application more clearly, the drawings used in the description of the embodiments are briefly introduced below. The drawings described below show only some embodiments of the present application, and those skilled in the art can obtain other drawings from them without creative effort.
Fig. 1 is a schematic flow diagram of an embodiment of the audio and video synchronization detection method according to the present application;
Fig. 2 is a schematic structural diagram of an embodiment of the standard audio and video in the present application;
Fig. 3 is a schematic flow diagram of another embodiment of the audio and video synchronization detection method according to the present application;
Fig. 4 is a schematic structural diagram of an embodiment of the audio and video synchronization detection system according to the present application.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
The terminology used in the embodiments of the present application is for the purpose of describing particular embodiments only and is not intended to limit the application. As used in the embodiments of this application and the appended claims, the singular forms "a", "an", and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. "Plural" generally means at least two, but does not exclude the case of at least one.
It should be understood that the term "and/or" as used herein merely describes an association between associated objects and means that three relationships may exist. For example, A and/or B may mean: A exists alone, A and B exist simultaneously, or B exists alone. In addition, the character "/" herein generally indicates that the associated objects before and after it are in an "or" relationship.
It should be understood that the terms "comprises," "comprising," or any other variation thereof, as used herein, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising … …" does not exclude the presence of other identical elements in a process, method, article, or apparatus that comprises the element.
AAC is short for Advanced Audio Coding and is an MPEG-2 based audio coding technique.
AVC is short for Advanced Video Coding, also called H.264.
PCM (pulse code modulation) data is the binary representation of a sampled audio waveform signal.
YUV is a color encoding method with three components: "Y" represents luminance (the gray value), and "U" and "V" represent chrominance, which describes the color and saturation of the image.
The present application provides an audio and video synchronization detection method. Referring to Fig. 1, which is a schematic flow diagram of an embodiment of the method, the method includes the following steps:
Step S11: encode the standard audio and video by using the code to be detected to form the audio and video to be detected.
The standard audio and video comprises standard jump time stamps at which a periodic level conversion occurs along the time axis direction.
In this embodiment, encoding the standard audio and video by using the code to be detected means encoding the video signal and the audio signal of the standard audio and video separately. During this encoding the audio signal and the video signal may become unsynchronized, for example because the audio signal or the video signal is stretched or shortened. The main purpose of the present application is to detect whether the audio and the video are still synchronous after the standard audio and video has been encoded by using the code to be detected.
To this end, step S11 may be preceded by generating a standard audio and video whose audio and video change periodically and synchronously. The standard audio and video comprises a periodically converted standard audio signal and a periodically converted standard video signal. Specifically, in a preferred embodiment, audio PCM data is constructed and subjected to periodic level conversion along the time axis direction to obtain the standard audio data. For example, for 16-bit signed audio PCM samples, whose values lie in the range [-32768, 32767], a preferred embodiment uses -16383 as the low level and 16383 as the high level. Video YUV data is then constructed and converted periodically, synchronously with the audio PCM data, along the time axis direction to obtain the standard video data. In one embodiment, the video data shows different pictures while the audio PCM data is at the low level and at the high level. For ease of detection, the video YUV data may use solid-color pictures of different colors. The constructed audio PCM data and video YUV data together form the standard audio and video.
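As a non-limiting illustration, the construction described above may be sketched in Python as follows. Only the two PCM levels come from the embodiment. The function names, the sample rate, the frame rate, the 16x16 I420 frame layout, the dark and bright pictures, and the reading of the 10-second value as the time between two consecutive level switches are all assumptions made for this sketch.

```python
from array import array

LOW_LEVEL, HIGH_LEVEL = -16383, 16383  # levels given in the preferred embodiment


def make_standard_pcm(duration_s, sample_rate=8000, switch_interval_s=10):
    """Build 16-bit signed PCM that alternates LOW/HIGH every switch_interval_s."""
    samples_per_level = int(switch_interval_s * sample_rate)
    pcm = array("h")
    for n in range(int(duration_s * sample_rate)):
        is_high = (n // samples_per_level) % 2 == 1
        pcm.append(HIGH_LEVEL if is_high else LOW_LEVEL)
    return pcm


def make_standard_frames(duration_s, fps=25, switch_interval_s=10,
                         width=16, height=16):
    """Build solid-color I420 frames that switch picture together with the PCM."""
    chroma = bytes([128]) * (width * height // 2)      # neutral U and V planes
    dark = bytes([16]) * (width * height) + chroma     # hypothetical "low" picture
    bright = bytes([235]) * (width * height) + chroma  # hypothetical "high" picture
    frames_per_level = int(switch_interval_s * fps)
    return [
        bright if (k // frames_per_level) % 2 == 1 else dark
        for k in range(int(duration_s * fps))
    ]
```

Because both generators derive the level from the same `switch_interval_s`, the audio level and the picture change at the same instants, which is the property the standard audio and video needs.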
In a preferred embodiment, the switching period of the high and low levels of the audio PCM data and the switching period of the pictures of the video YUV data are both 10 seconds. In this embodiment, the audio signal and the video signal are compared only at rising edges or falling edges. A rising edge is the moment at which the signal level changes from the low level to the high level, and a falling edge is the moment at which the signal level changes from the high level to the low level. A 10-second period is chosen for three reasons: first, it is convenient for observation; second, it prevents the difference between the audio data and the video data from overflowing within one period, which can happen when the audio and the video are asynchronous and the period is too short; third, a longer period such as 10 seconds reduces the amount of computation.
In this embodiment, the jump time stamp of a signal is the time stamp of a rising edge, the time stamp of a falling edge, or both.
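The definition of rising and falling edges can be illustrated with a short Python sketch that scans PCM samples and records a time stamp whenever the level crosses a threshold. The function name `pcm_jump_timestamps`, the zero threshold, and the conversion of a sample index to seconds by dividing by the sample rate are assumptions made for this example and are not prescribed by the application.

```python
def pcm_jump_timestamps(samples, sample_rate, threshold=0):
    """Return (rising, falling): lists of jump time stamps in seconds.

    A sample above `threshold` counts as the high level, otherwise low.
    """
    rising, falling = [], []
    if len(samples) == 0:
        return rising, falling
    prev_high = samples[0] > threshold
    for n in range(1, len(samples)):
        high = samples[n] > threshold
        if high and not prev_high:
            rising.append(n / sample_rate)    # low -> high: rising edge
        elif prev_high and not high:
            falling.append(n / sample_rate)   # high -> low: falling edge
        prev_high = high
    return rising, falling


# One second low, one second high, one second low at 100 samples per second.
demo = [-16383] * 100 + [16383] * 100 + [-16383] * 100
rising_edges, falling_edges = pcm_jump_timestamps(demo, sample_rate=100)
```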
In this embodiment, the standard audio and video includes standard audio data and standard video data. Fig. 2 is a schematic structural diagram of an embodiment of the generated standard audio and video. As shown in Fig. 2, the rising and falling edges of the standard audio signal occur at the same moments as those of the standard video signal.
In step S11, after the standard audio and video is generated, it is encoded by using the code to be detected to obtain the audio and video to be detected. Specifically, the standard audio in the standard audio and video is compressed by AAC encoding to form the audio stream (audio data) to be detected, and the standard video is compressed by AVC encoding to form the video stream (video data) to be detected. AAC encoding and AVC encoding both offer a high compression rate, which reduces the space occupied by the encoded audio and video to be detected, and both are well supported by software and hardware encoders.
In this embodiment, the encoded audio and video to be detected is placed in a container according to a set rule for distribution; in one embodiment, the container uses the widely supported mp4 format. Note that a media container format is different from an encoding format: the container usually serves as the carrier of encoded data, organizing data of at least one encoding format into a specific layout and providing some auxiliary functions. mp4, also known as MPEG-4 Part 14, is a multimedia container format.
Step S12: decode the audio signal and the video signal of the audio and video to be detected respectively, and detect the jump time stamp of the audio signal and the jump time stamp of the video signal.
Separating the audio and video to be detected is also called demultiplexing. Specifically, the audio signal and the video signal of the audio and video to be detected are demultiplexed from the mp4 container and decoded, and are then analyzed separately.
In this step, after the audio and video to be detected is decoded, the audio signal and the video signal to be detected are extracted, and their jump time stamps (that is, the time stamps of rising edges or of falling edges) are detected. Specifically, decoding the mp4 container includes separating the video stream and the audio stream from the mp4 container and decoding them to obtain the audio PCM data to be detected and the video YUV data to be detected. Note that a stream is a way of carrying media data; there are generally five kinds of stream: audio, video, subtitle, attachment, and data.
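For the video side, one way to detect a jump in decoded YUV data is to watch the mean luminance of each frame, since the standard video uses solid-color pictures. The Python sketch below assumes that each decoded frame is available as a pair of a presentation time in seconds and the bytes of its Y plane; the function name `video_jump_timestamps`, this input shape, and the luminance threshold of 128 are assumptions for illustration and are not specified in the application.

```python
def video_jump_timestamps(frames, luma_threshold=128):
    """Return a list of (time_s, kind) jumps, kind being 'rising' or 'falling'.

    `frames` is an iterable of (time_s, y_plane_bytes) pairs in display order.
    A frame whose mean luminance exceeds `luma_threshold` counts as "high".
    """
    jumps = []
    prev_bright = None
    for time_s, y_plane in frames:
        bright = sum(y_plane) / len(y_plane) > luma_threshold
        if prev_bright is not None and bright != prev_bright:
            jumps.append((time_s, "rising" if bright else "falling"))
        prev_bright = bright
    return jumps


# 30 frames at 10 frames per second: dark, then bright, then dark again.
demo_frames = [
    (k / 10, bytes([235 if (k // 10) % 2 else 16]) * 64)
    for k in range(30)
]
video_jumps = video_jump_timestamps(demo_frames)
```

Averaging over the whole Y plane, rather than sampling one pixel, keeps the decision stable when lossy encoding adds small per-pixel errors to a solid-color picture.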
In one embodiment, the audio signal and the video signal to be detected may be those produced when a terminal device plays a video file. Taking a live video stream as an example, a streaming media format such as FLV or fMP4 is generally used; it carries video encoded with, for example, AVC and audio encoded with Advanced Audio Coding (AAC), and the audio signal and the video signal to be detected are obtained by demultiplexing and decoding the FLV or fMP4 stream. In this embodiment, playing the audio and video file constitutes the decoding process.
Step S13: compare the jump time stamp of the audio signal and the jump time stamp of the video signal with the standard jump time stamp respectively to judge whether the audio signal and the video signal are synchronous.
The jump time stamps of the decoded audio signal and video signal to be detected are compared with the standard jump time stamps of the standard audio and video formed in step S11 to analyze whether the audio signal and the video signal are synchronous. Specifically, the audio signal to be detected is compared with the standard audio to judge whether its jump time stamp is the same as the standard jump time stamp of the standard audio. If they are the same, the audio signal of the audio and video to be detected is synchronous with the standard audio signal of the standard audio and video; if they differ, it is not. Likewise, the video signal of the audio and video to be detected is compared with the standard video to judge whether its jump time stamp is the same as the standard jump time stamp of the standard video. If they are the same, the video signal of the audio and video to be detected is synchronous with the standard video signal of the standard audio and video; if they differ, it is not.
Specifically, the time interval between one jump time stamp and the next jump time stamp of the audio signal of the audio and video to be detected is compared with the corresponding interval of the audio signal of the standard audio and video. If the interval equals the standard interval duration of the standard audio and video, the audio signal of the audio and video to be detected is synchronous with the audio signal of the standard audio and video, which indicates that it is normal and has not been stretched or compressed. The standard interval duration of the standard audio and video is the time interval from one jump time stamp to the next jump time stamp of its audio signal or video signal, that is, half a cycle of the standard audio and video. Here a jump time stamp may mark either a rising edge or a falling edge. Similarly, the time interval between one jump time stamp and the next jump time stamp of the video signal of the audio and video to be detected is compared with the standard interval duration of the standard audio and video; if they are the same, the video signal of the audio and video to be detected is synchronous with the video signal of the standard audio and video.
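The interval comparison described above may be sketched in Python as follows. The application compares intervals for equality; the tolerance used here is an assumption added for the sketch, on the expectation that time stamps recovered from decoded data rarely match to the last digit. The function name `check_intervals` is likewise hypothetical.

```python
def check_intervals(jumps, standard_interval_s, tolerance_s=0.02):
    """Compare the spacing of consecutive jump time stamps with the standard.

    Returns (in_sync, deviations), where each deviation is the measured
    interval minus the standard interval: positive means the signal was
    stretched over that interval, negative means it was shortened.
    """
    deviations = [
        (later - earlier) - standard_interval_s
        for earlier, later in zip(jumps, jumps[1:])
    ]
    in_sync = all(abs(d) <= tolerance_s for d in deviations)
    return in_sync, deviations


# The last interval is 10.5 s instead of 10 s, so it was stretched by 0.5 s.
stretched = check_intervals([10.0, 20.0, 30.5], standard_interval_s=10.0)
```

The same function can be applied once to the audio jump time stamps and once to the video jump time stamps, mirroring the two comparisons in the text.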
If the audio signal of the audio and video to be detected is synchronous with the audio signal of the standard audio and video, and the video signal of the audio and video to be detected is synchronous with the video signal of the standard audio and video, then the audio signal and the video signal of the audio and video to be detected are synchronous with each other, and neither has been lengthened or shortened. If either the video signal or the audio signal of the audio and video to be detected is not synchronous with the standard audio and video, the video signal and the audio signal of the audio and video to be detected are not synchronous.
In another embodiment, the method further comprises: comparing the audio signal of the audio and video to be detected with the standard audio and calculating the difference between the jump time stamp of the audio signal and the standard jump time stamp of the standard audio; comparing the video signal of the audio and video to be detected with the standard video and calculating the difference between the jump time stamp of the video signal and the standard jump time stamp of the standard video; and judging whether the two differences are the same. If they are the same, the audio signal and the video signal of the audio and video to be detected are synchronous. If a difference is not zero, the number of seconds by which the audio and video to be detected has been lengthened or shortened can be calculated from it. In this embodiment, the jump time stamp refers to the time interval between one jump time point and the next jump time point: the jump time stamp of the audio signal or video signal to be detected is the interval between one of its jump time points (a rising edge or a falling edge) and the next, and the standard jump time stamp is the interval between one jump time point and the next of the audio signal or video signal of the standard audio and video.
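The difference comparison of this embodiment may be sketched in Python as follows. The sketch assumes that the audio, video, and standard jump values have already been paired index by index (no missed or extra edges), and it adds a tolerance that the application does not mention. The function name `compare_differences` and the tolerance value are hypothetical.

```python
def compare_differences(audio_jumps, video_jumps, standard_jumps,
                        tolerance_s=0.04):
    """Compare audio and video against the standard, jump by jump.

    Returns (in_sync, audio_diffs, video_diffs). A diff is measured minus
    standard; audio and video count as synchronous when their diffs agree
    within `tolerance_s` at every jump.
    """
    audio_diffs = [a - s for a, s in zip(audio_jumps, standard_jumps)]
    video_diffs = [v - s for v, s in zip(video_jumps, standard_jumps)]
    in_sync = all(
        abs(a - v) <= tolerance_s
        for a, v in zip(audio_diffs, video_diffs)
    )
    return in_sync, audio_diffs, video_diffs


standard = [10.0, 20.0, 30.0]
# Audio and video are both 0.5 s late: the differences match, so they
# are synchronous with each other although both differ from the standard.
both_late = compare_differences([10.5, 20.5, 30.5], [10.5, 20.5, 30.5], standard)
# Only the video drifts: the differences diverge, so they are not synchronous.
video_drift = compare_differences([10.0, 20.0, 30.0], [10.0, 20.25, 30.5], standard)
```

Returning the two difference lists alongside the verdict makes it possible to report by how many seconds each signal was lengthened or shortened, as the embodiment describes.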
The beneficial effect of this embodiment is as follows: the audio signal of the audio/video to be detected is compared with the standard audio of the standard audio/video, and the video signal of the audio/video to be detected is compared with the standard video of the standard audio/video. This determines whether the audio/video to be detected is synchronous with the standard audio/video or has been lengthened or shortened, and in turn whether its audio signal is synchronous with its video signal.
The present application further provides another audio and video synchronization detection method. Please refer to fig. 3, which is a schematic flow diagram of another embodiment of the audio and video synchronization detection method of the present application. As shown in fig. 3, the method includes:
Step S31: encoding the standard audio/video with the encoding to be tested to form the audio/video to be detected.
The standard audio/video comprises standard transition timestamps at which periodic level conversion occurs along the time axis direction.
Specifically, before the standard audio/video is encoded with the encoding to be tested, standard audio/video with periodic level conversion is generated. The standard audio/video comprises standard audio data and standard video data, which are obtained by constructing audio PCM data and video YUV data that undergo synchronous periodic level conversion along the time axis direction. The standard audio/video is generated as follows: audio PCM data is constructed and subjected to periodic level conversion along the time axis direction to obtain the standard audio data; video YUV data is constructed and subjected to periodic level conversion along the time axis direction, synchronously with the audio PCM data, to obtain the standard video data; the standard audio data and the standard video data together form the standard audio/video. In one embodiment, 10 seconds is selected as one period of the level conversion; in other embodiments, 20 seconds or 30 seconds may be selected, which is not limited herein. This embodiment selects a relatively long period for three reasons: first, it makes observation easier; second, when the audio and video are asynchronous, it prevents the difference between their transition timestamps from exceeding one period, which could happen if the period were too short; third, synchronization detection only needs to compare and calculate at the rising and falling edges, which reduces the amount of calculation needed to judge whether the audio and video are synchronous.
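A minimal sketch of how such standard data might be constructed is given below. The 10-second period follows the embodiment above; the sample rate, frame rate, signal levels, 50% duty cycle, and all function names are assumptions chosen for illustration, since the application does not specify them.

```python
PERIOD_S = 10.0                   # one level-conversion period (10-second embodiment)
SAMPLE_RATE = 8000                # assumed audio sample rate in Hz
FRAME_RATE = 25                   # assumed video frame rate in frames per second
HIGH_PCM, LOW_PCM = 16000, 0      # assumed 16-bit PCM levels
HIGH_LUMA, LOW_LUMA = 235, 16     # assumed 8-bit Y (luma) levels


def is_high(t):
    """High during the first half of each period, low during the second half."""
    return (t % PERIOD_S) < PERIOD_S / 2


def standard_audio(duration_s):
    """One PCM value per audio sample."""
    return [HIGH_PCM if is_high(n / SAMPLE_RATE) else LOW_PCM
            for n in range(int(duration_s * SAMPLE_RATE))]


def standard_video(duration_s):
    """One luma level per frame; a real frame would fill its Y plane with it."""
    return [HIGH_LUMA if is_high(n / FRAME_RATE) else LOW_LUMA
            for n in range(int(duration_s * FRAME_RATE))]


def standard_transition_timestamps(duration_s):
    """Times in seconds at which the level changes, excluding the start."""
    half = PERIOD_S / 2
    count = int(duration_s / half)
    return [k * half for k in range(1, count + 1) if k * half < duration_s]
```

Because both generators share `is_high`, the audio and video levels change at the same instants, which is the synchronous property the standard audio/video needs.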
Encoding the standard audio/video with the encoding to be tested comprises: encoding the standard audio data with AAC to obtain the audio to be detected, encoding the standard video data with AVC to obtain the video to be detected, and multiplexing the encoded audio and video to be detected.
In this embodiment, the standard audio data and the standard video data are encoded and then placed into a container according to a set rule for transmission. In one embodiment, the mp4 container format is used: the standard audio data and the standard video data are encoded to obtain the audio/video to be detected, which is then placed into an mp4 container according to the rules of the mp4 format.
Step S32: demultiplexing the audio/video to be detected to separate its encoded audio data and encoded video data, and decoding the encoded data to obtain the audio PCM data and video YUV data of the audio/video to be detected.
The audio data and video data of the audio/video to be detected are parsed out of the mp4 container, and the resulting audio PCM data and video YUV data are then analyzed separately.
In an embodiment, the process of playing the audio/video file is a decoding process, and the decoding method is not limited herein.
Step S33: analyzing the audio PCM data and the video YUV data, and recording the transition timestamps of the audio PCM data and the transition timestamps of the video YUV data.
Specifically, the transition timestamps of the audio PCM data of the audio/video to be detected are compared with the standard transition timestamps of the audio PCM data of the standard audio/video, and the transition timestamps of the video YUV data of the audio/video to be detected are compared with the standard transition timestamps of the video YUV data of the standard audio/video. The transition timestamps comprise rising edge timestamps and falling edge timestamps.
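Recording rising edge and falling edge timestamps from decoded data could look like the sketch below. It assumes each audio sample or video frame has already been reduced to a single level value (for example, the mean luma of a frame); the function name and the threshold are hypothetical and not taken from the application.

```python
def find_transitions(levels, rate, threshold):
    """Return (timestamp_s, kind) for every crossing of `threshold`.

    `levels` holds one value per audio sample or per video frame, and
    `rate` is the number of samples or frames per second.
    """
    if not levels:
        return []
    transitions = []
    previous_high = levels[0] > threshold
    for index in range(1, len(levels)):
        current_high = levels[index] > threshold
        if current_high != previous_high:
            kind = "rising" if current_high else "falling"
            transitions.append((index / rate, kind))
        previous_high = current_high
    return transitions


# Hypothetical per-frame luma levels at 25 fps: 5 s high, 5 s low, 2 s high.
frame_levels = [235] * 125 + [16] * 125 + [235] * 50
video_transitions = find_transitions(frame_levels, rate=25, threshold=128)
```

Here `video_transitions` holds a falling edge at 5.0 s and a rising edge at 10.0 s; the same function can be applied to audio PCM samples with the audio sample rate.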
Step S34: calculating the difference between the transition timestamp of the audio PCM data and the standard transition timestamp, and the difference between the transition timestamp of the video YUV data and the standard transition timestamp.
Step S35: judging whether the difference between the transition timestamp of the audio PCM data and the standard transition timestamp is the same as the difference between the transition timestamp of the video YUV data and the standard transition timestamp.
If the transition timestamps of the audio PCM data are the same as the standard transition timestamps, the audio signal of the audio/video to be detected is synchronous with the standard audio of the standard audio/video; if the transition timestamps of the video YUV data are also the same as the standard transition timestamps, the video signal of the audio/video to be detected is synchronous with the standard video of the standard audio/video, and the audio signal of the audio/video to be detected is synchronous with its video signal. Specifically, it is judged whether the timestamp interval from a rising edge to the following falling edge of the audio PCM data is the same as the corresponding rising-edge-to-falling-edge interval of the standard audio/video, or whether the timestamp interval from a falling edge to the following rising edge of the audio PCM data is the same as the corresponding falling-edge-to-rising-edge interval of the standard audio/video; in either case, if the intervals are the same, the audio signal of the audio/video to be detected is synchronous with the audio signal of the standard audio/video. The video YUV data is judged in the same way: if its rising-edge-to-falling-edge interval, or its falling-edge-to-rising-edge interval, is the same as the corresponding interval of the standard audio/video, the video signal of the audio/video to be detected is synchronous with the video signal of the standard audio/video.
If the audio signal and the video signal of the audio/video to be detected are both synchronous with the audio signal and the video signal of the standard audio/video, the audio/video to be detected is synchronous with the standard audio/video, has not been compressed or lengthened, and has a normal duration.
If the transition timestamps of the audio PCM data differ from the standard transition timestamps, it is judged whether the difference between the transition timestamp of the audio PCM data and the standard transition timestamp is the same as the difference between the transition timestamp of the video YUV data and the standard transition timestamp.
Step S36: if the differences are the same, the audio signal and the video signal are synchronous.
Step S37: if the differences are not the same, the audio signal and the video signal are not synchronous.
In this embodiment, whether the audio signal and the video signal of the audio/video to be detected are synchronous is judged as follows: if the difference between the transition timestamp of the audio PCM data of the audio/video to be detected and the standard transition timestamp is the same as the difference between the transition timestamp of the video YUV data of the audio/video to be detected and the standard transition timestamp, the audio signal and the video signal are synchronous; if the differences are not the same, the audio signal and the video signal are asynchronous.
Specifically, the transition timestamp of the audio PCM data of the audio/video to be detected is compared with the standard transition timestamp of the audio PCM data of the standard audio/video and their difference is calculated, and the transition timestamp of the video YUV data of the audio/video to be detected is compared with the standard transition timestamp of the video YUV data of the standard audio/video and their difference is calculated.
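Steps S34 to S37 can be sketched as below. The function names, the example timestamps, and the `tolerance` value (one frame at an assumed 25 fps) are illustrative assumptions; the application itself only requires that the two differences be "the same".

```python
def timestamp_differences(measured, standard):
    """Step S34: measured minus standard transition timestamp, edge by edge."""
    return [m - s for m, s in zip(measured, standard)]


def audio_video_synchronous(audio_ts, video_ts, standard_ts, tolerance=0.04):
    """Steps S35 to S37: synchronous when the two differences match at every edge.

    `tolerance` (seconds) is an assumed allowance of one frame at 25 fps.
    """
    audio_diff = timestamp_differences(audio_ts, standard_ts)
    video_diff = timestamp_differences(video_ts, standard_ts)
    return all(abs(a - v) <= tolerance for a, v in zip(audio_diff, video_diff))


standard_ts = [5.0, 10.0, 15.0]

# Both streams are shifted by the same 0.2 s: still synchronous with each other.
shifted_together = audio_video_synchronous(
    [5.2, 10.2, 15.2], [5.2, 10.2, 15.2], standard_ts)

# Only the audio is shifted: the audio and video are out of sync.
audio_only_shifted = audio_video_synchronous(
    [5.2, 10.2, 15.2], [5.0, 10.0, 15.0], standard_ts)
```

The first case shows why the differences are compared with each other rather than with zero: a common offset relative to the standard does not by itself indicate that the audio and video have drifted apart.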
In this embodiment, calculating the difference between the transition timestamp of the audio PCM data and the standard transition timestamp and the difference between the transition timestamp of the video YUV data and the standard transition timestamp further comprises filtering the audio PCM data. Specifically, because encoding introduces errors, the audio PCM data is filtered to remove the small-range abrupt changes that encoding introduces.
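The application does not name the filter. As one possible choice, a sliding median filter removes isolated spikes while preserving genuine level changes; the sketch below assumes that choice, and the window size and sample values are hypothetical.

```python
from statistics import median_low


def median_filter(samples, window=5):
    """Replace each sample with the median of the samples around it.

    Isolated spikes shorter than half the window are removed, while a
    genuine level change keeps its position.
    """
    half = window // 2
    return [median_low(samples[max(0, i - half): i + half + 1])
            for i in range(len(samples))]


# Hypothetical decoded PCM: a stray spike at index 3, the true rising edge
# at index 7, and a stray dip at index 11.
noisy = [0, 0, 0, 16000, 0, 0, 0,
         16000, 16000, 16000, 16000, 0, 16000, 16000, 16000]
cleaned = median_filter(noisy)
```

After filtering, `cleaned` contains a single rising edge at index 7, so edge detection reports one transition instead of five.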
In an embodiment, the transition timestamps of the audio signal are compared with the standard transition timestamps by analyzing whether the difference between the first and second rising edge timestamps of the audio signal is the same as the difference between the first and second standard rising edge timestamps of the standard audio/video, or whether the difference between the first and second falling edge timestamps of the audio signal is the same as the difference between the first and second standard falling edge timestamps of the standard audio/video. If the differences are the same, the audio signal is synchronous with the standard audio. Whether the video signal is synchronous with the standard video is judged by the same method. If the video signal is also synchronous, the audio signal of the audio/video to be detected is synchronous with its video signal, and neither signal has been lengthened or compressed.
In another embodiment, the transition timestamps of the audio signal are compared with the standard transition timestamps of the standard audio, where the timestamps record time information. If the transition timestamp of the audio signal is the same as the standard transition timestamp of the standard audio, the audio signal is synchronous with the standard audio. The transition timestamp of the video signal is likewise compared with the standard transition timestamp of the standard video; if they are the same, the video signal is synchronous with the standard video. In that case, the audio signal and the video signal of the audio/video to be detected are synchronous with each other and with the standard audio and standard video of the standard audio/video, and neither signal is delayed.
The beneficial effect of this embodiment is as follows: the transition timestamp of the audio PCM data is compared with the standard transition timestamp, and the transition timestamp of the video YUV data is compared with the standard transition timestamp. This determines whether the audio/video to be detected is synchronous with the standard audio/video or has been lengthened or shortened, and in turn whether its audio signal and video signal are synchronous.
Please refer to fig. 4, which is a schematic diagram of the framework of an embodiment of an audio and video synchronization detection system 40 of the present application. As shown in fig. 4, the audio and video synchronization detection system 40 includes:
An encoding unit 41, configured to encode the standard audio/video to form the audio/video to be detected. Specifically, the standard audio data and the standard video data of the standard audio/video are each encoded with the encoding to be tested to obtain the audio signal and the video signal of the audio/video to be detected.
An obtaining unit 42, coupled to the encoding unit 41 and configured to obtain the audio signal and the video signal of the audio/video to be detected, decode them, and obtain the transition timestamp of the decoded audio signal and the transition timestamp of the decoded video signal. Specifically, the audio signal and the video signal to be detected are separated from the container and decoded by playing them, and the transition timestamp of the audio signal and the transition timestamp of the video signal are recorded during playback. The obtaining unit 42 is further configured to obtain the standard transition timestamps of the standard audio/video.
A judging unit 43, coupled to the obtaining unit 42 and configured to compare the transition timestamp of the audio signal and the transition timestamp of the video signal respectively with the standard transition timestamp to judge whether the audio signal and the video signal are synchronous. Specifically, the transition timestamp of the audio signal of the audio/video to be detected is compared with the standard transition timestamp, and if they are the same, the audio signal of the audio/video to be detected is synchronous with the standard audio/video; the transition timestamp of the video signal of the audio/video to be detected is compared with the standard transition timestamp, and if they are the same, the video signal of the audio/video to be detected is synchronous with the standard audio/video; if both the video signal and the audio signal are synchronous with the standard audio/video, the audio/video to be detected is synchronous with the standard audio/video, and its audio signal and video signal are synchronous with each other.
The audio and video synchronization detection system 40 further includes a calculating unit 44, coupled to the judging unit 43 and configured to calculate the difference between the transition timestamp of the audio signal and the standard transition timestamp and the difference between the transition timestamp of the video signal and the standard transition timestamp. The judging unit 43 is configured to judge whether these two differences are the same; if they are the same, it judges that the audio signal and the video signal are synchronous, and if they are not the same, it judges that the audio signal and the video signal are asynchronous.
The audio and video synchronization detection system 40 further includes a standard audio and video generation unit 411, coupled to the encoding unit 41 and configured to form standard audio/video with periodic level conversion along the time axis direction, where the standard audio/video includes standard transition timestamps. Specifically, audio PCM data and video YUV data are constructed for the audio/video to be detected, and synchronous periodic level conversion is performed on the audio PCM data and the video YUV data along the time axis direction. The timestamps at which the periodic level conversion of the audio PCM data and the video YUV data occurs are the standard transition timestamps. In one embodiment, the period of the standard transition timestamps is 10 seconds; in other embodiments, the period may also be 20 seconds or 30 seconds, which is not limited herein.
The beneficial effect of this embodiment is as follows: the audio signal of the audio/video to be detected is compared with the standard audio of the standard audio/video, and the video signal of the audio/video to be detected is compared with the standard video of the standard audio/video. This determines whether the audio/video to be detected is synchronous with the standard audio/video, and in turn whether its audio signal and video signal are synchronous.
The above description merely illustrates embodiments of the present application and is not intended to limit its scope. Any equivalent structure or equivalent process transformation made using the contents of the specification and drawings of the present application, or any direct or indirect application in other related technical fields, is likewise included in the scope of protection of the present application.

Claims (10)

1. An audio and video synchronization detection method, characterized by comprising:
encoding standard audio/video with the encoding to be tested to form the audio/video to be detected, wherein the standard audio/video comprises a standard transition timestamp at which periodic level conversion occurs along the time axis direction;
decoding the audio signal and the video signal of the audio/video to be detected respectively, and detecting a transition timestamp of the audio signal and a transition timestamp of the video signal;
comparing the transition timestamp of the audio signal and the transition timestamp of the video signal respectively with the standard transition timestamp to judge whether the audio signal and the video signal are synchronous.
2. The audio-video synchronization detection method according to claim 1, wherein the step of comparing the transition timestamp of the audio signal and the transition timestamp of the video signal with the standard transition timestamp to determine whether the audio signal and the video signal are synchronized comprises:
judging whether the transition timestamp of the audio signal is synchronous with the standard transition timestamp, and judging whether the transition timestamp of the video signal is synchronous with the standard transition timestamp;
if the transition timestamp of the audio signal is synchronous with the standard transition timestamp and the transition timestamp of the video signal is synchronous with the standard transition timestamp, determining that the audio signal is synchronous with the video signal and that the audio/video to be detected is synchronous with the standard audio/video.
3. The audio-video synchronization detection method according to claim 1, wherein the step of comparing the transition timestamp of the audio signal and the transition timestamp of the video signal with the standard transition timestamp to determine whether the audio signal and the video signal are synchronized comprises:
calculating the difference between the transition timestamp of the audio signal and the standard transition timestamp, and the difference between the transition timestamp of the video signal and the standard transition timestamp;
judging whether the difference between the transition timestamp of the audio signal and the standard transition timestamp is the same as the difference between the transition timestamp of the video signal and the standard transition timestamp, and if so, determining that the audio signal and the video signal are synchronous.
4. The audio and video synchronization detection method according to claim 1, wherein before the step of encoding the standard audio/video with the encoding to be tested to form the audio/video to be detected, the method further comprises:
generating the standard audio/video, in which the audio and the video undergo periodic level conversion along the time axis direction.
5. The audio and video synchronization detection method according to claim 4, wherein the step of generating the standard audio/video, in which the audio and the video undergo periodic level conversion along the time axis direction, comprises:
constructing audio PCM data, and performing periodic level conversion on the audio PCM data along the time axis direction;
constructing video YUV data, and performing the periodic level conversion on the video YUV data along the time axis direction synchronously with the audio PCM data.
6. The method for detecting audio and video synchronization according to claim 1, wherein the steps of decoding the audio signal and the video signal of the audio and video to be detected, respectively, and detecting the transition timestamp of the audio signal and the transition timestamp of the video signal comprise:
demultiplexing the audio and video to be detected, separating to obtain coded data of the audio and video to be detected, and decoding the data of the audio and video to be detected to obtain audio PCM data and video YUV data of the audio and video to be detected;
analyzing the audio PCM data and the video YUV data, and recording a jump time stamp of the audio PCM data and a jump time stamp of the video YUV data;
the step of comparing the transition timestamp of the audio signal and the transition timestamp of the video signal with the standard transition timestamp respectively to determine whether the audio signal and the video signal are synchronous comprises:
calculating the difference between the transition timestamp of the audio PCM data and the standard transition timestamp and the difference between the transition timestamp of the video YUV data and the standard transition timestamp, and judging whether the two differences are the same;
if the differences are the same, determining that the audio signal and the video signal are synchronous.
7. The method for detecting audio and video synchronization according to claim 6, wherein the step of calculating the difference between the transition timestamp of the audio PCM data and the standard transition timestamp and the difference between the transition timestamp of the video YUV data and the standard transition timestamp to determine whether the difference between the transition timestamp of the audio PCM data and the standard transition timestamp and the difference between the transition timestamp of the video YUV data and the standard transition timestamp are the same further comprises:
filtering the audio PCM data.
8. An audio-video synchronization detection system, comprising:
an encoding unit, configured to encode standard audio/video with the encoding to be tested to form the audio/video to be detected, wherein the standard audio/video comprises a standard transition timestamp at which periodic level conversion occurs along the time axis direction;
an obtaining unit, configured to obtain the audio signal and the video signal of the audio/video to be detected, decode the audio signal and the video signal, and obtain a transition timestamp of the decoded audio signal and a transition timestamp of the decoded video signal;
a judging unit, configured to compare the transition timestamp of the audio signal and the transition timestamp of the video signal respectively with the standard transition timestamp to judge whether the audio signal and the video signal are synchronous.
9. The audio-video synchronization detection system according to claim 8, further comprising:
a calculating unit for calculating a difference between a transition timestamp of the audio signal and the standard transition timestamp, and calculating a difference between a transition timestamp of the video signal and the standard transition timestamp;
the judging unit is further configured to judge whether the difference between the transition timestamp of the audio signal and the standard transition timestamp is the same as the difference between the transition timestamp of the video signal and the standard transition timestamp, and if so, to determine that the audio signal and the video signal are synchronous.
10. The audio-video synchronization detection system according to claim 9, further comprising:
a standard audio/video generating unit, configured to generate the standard audio/video, in which the audio data and the video data undergo periodic level conversion along the time axis direction.
CN202110198925.1A 2021-02-22 2021-02-22 Audio and video synchronous detection method and detection system thereof Active CN113055711B (en)

Publications (2)

Publication Number Publication Date
CN113055711A true CN113055711A (en) 2021-06-29
CN113055711B CN113055711B (en) 2023-10-17

Family

ID=76509387

Country Status (1)

Country Link
CN (1) CN113055711B (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH08140054A (en) * 1994-11-10 1996-05-31 Matsushita Electric Ind Co Ltd Audio signal and video signal synchronous reproduction method
CN101742357A (en) * 2009-12-29 2010-06-16 北京牡丹电子集团有限责任公司 Method for measuring audio/video synchronous error of digital television device
CN103237255A (en) * 2013-04-24 2013-08-07 南京龙渊微电子科技有限公司 Multi-thread audio and video synchronization control method and system
CN103327368A (en) * 2012-03-25 2013-09-25 联发科技股份有限公司 Method for performing multimedia playback control and associated apparatus
CN103888813A (en) * 2012-12-21 2014-06-25 北京计算机技术及应用研究所 Audio and video synchronization realization method and system

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116437134A (en) * 2023-06-13 2023-07-14 中国人民解放军军事科学院系统工程研究院 Method and device for detecting audio and video synchronicity
CN116437134B (en) * 2023-06-13 2023-09-22 中国人民解放军军事科学院系统工程研究院 Method and device for detecting audio and video synchronicity

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant