CN112188259B - Method and device for audio and video synchronization test and correction and electronic equipment


Info

Publication number
CN112188259B
CN112188259B (application CN202011052394.7A)
Authority
CN
China
Prior art keywords
audio
video
frame
video frame
sequence
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202011052394.7A
Other languages
Chinese (zh)
Other versions
CN112188259A (en)
Inventor
王文峰 (Wang Wenfeng)
Current Assignee
Beijing Dajia Internet Information Technology Co Ltd
Original Assignee
Beijing Dajia Internet Information Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Beijing Dajia Internet Information Technology Co Ltd
Priority claimed from CN202011052394.7A
Publication of CN112188259A
Application granted
Publication of CN112188259B
Legal status: Active

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43 Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/4305 Synchronising client clock from received content stream, e.g. locking decoder clock with encoder clock, extraction of the PCR packets
    • H04N21/4307 Synchronising the rendering of multiple content streams or additional data on devices, e.g. synchronisation of audio on a mobile phone with the video output on the TV screen
    • H04N21/4394 Processing of audio elementary streams involving operations for analysing the audio stream, e.g. detecting features or characteristics in audio streams
    • H04N21/44008 Processing of video elementary streams involving operations for analysing video streams, e.g. detecting features or characteristics in the video stream
    • H04N21/8547 Content authoring involving timestamps for synchronizing content

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Computer Security & Cryptography (AREA)
  • Television Signal Processing For Recording (AREA)

Abstract

The disclosure relates to a method, an apparatus, and an electronic device for audio and video synchronization testing and correction. The method includes: capturing a specific video frame sequence and an audio frame sequence played at a predetermined frame rate, where the video frames correspond one-to-one to the audio frames, the video frames differ from one another, and the audio frames differ from one another; each video frame in the specific video frame sequence is played simultaneously with its corresponding audio frame in the audio frame sequence, and the play duration of each video frame is set according to the predetermined frame rate so that at least one video frame of the sequence can be captured at that rate; performing image analysis on the captured video frames to determine a first timestamp of a captured video frame and of the audio frame corresponding to it, and performing sound analysis on the captured audio frames to determine a second timestamp of the audio frame corresponding to the captured video frame; and performing audio and video synchronization correction according to the first timestamp and the second timestamp.

Description

Method and device for audio and video synchronization test and correction and electronic equipment
Technical Field
The present disclosure relates to the field of signal processing, and in particular, to a method and an apparatus for audio and video synchronization testing and correction, and an electronic device.
Background
In the related art, even when the same recording software is used, the audio and video capture delays differ from one device model to another. Conventionally, audio and video synchronization correction for recording software is performed separately for each device model during the development stage (for example, the development stage of the recording software). However, when general-purpose software must be adapted to many models, having an engineer manually tune appropriate parameters for each model to verify audio/video synchronization consumes a large amount of development time, and such a correction method tends to introduce human error, which can result in a poor audio/video synchronization correction effect.
Disclosure of Invention
The present disclosure provides a method, an apparatus, and an electronic device for audio and video synchronization testing and correction, so as to at least solve the problems in the related art that audio and video synchronization correction consumes a large amount of development time and that its correction effect is not good enough; the disclosure is not, however, required to solve any of the above problems. The technical solution of the disclosure is as follows:
According to a first aspect of the embodiments of the present disclosure, there is provided a method for audio-video synchronization correction, including: capturing a specific video frame sequence and an audio frame sequence played at a predetermined frame rate, where the video frames correspond one-to-one to the audio frames, the video frames differ from one another, and the audio frames differ from one another; each video frame in the specific video frame sequence is played simultaneously with its corresponding audio frame in the audio frame sequence, and the play duration of each video frame is set according to the predetermined frame rate so that at least one video frame of the sequence can be captured at that rate; performing image analysis on the captured video frames to determine a first timestamp of a captured video frame and of its corresponding audio frame, and performing sound analysis on the captured audio frames to determine a second timestamp of the audio frame corresponding to the captured video frame; and performing audio-video synchronization correction according to the first timestamp and the second timestamp.
Optionally, the play duration of each video frame is set such that the remainder obtained by dividing it by the reciprocal of the predetermined frame rate (i.e., the frame interval) takes, across the sequence, every integer value smaller than that reciprocal.
Optionally, the video frames in the specific video frame sequence are color images each having a different color value, and performing image analysis on the captured video frames includes analyzing the color values of the captured video frames.
Optionally, performing audio-video synchronization correction according to the first timestamp and the second timestamp includes calculating and saving the difference between the first timestamp and the second timestamp, or a statistic derived from that difference, as a check value for audio and video synchronization.
Optionally, the method for audio-video synchronization correction further includes: in response to a recording request from a user, capturing any audio and video at the predetermined frame rate, and synchronizing the captured audio and video according to the saved check value.
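The check-value step can be sketched as follows (a hypothetical illustration with invented names and numbers; the patent prescribes no code). First timestamps come from image analysis of captured video frames, second timestamps from sound analysis of the corresponding audio frames; a statistic over the per-pair differences is saved as the check value and later used to shift audio timestamps when ordinary recordings are synchronized.

```python
# Hypothetical sketch of the check-value computation and its application.
from statistics import mean

def compute_check_value(first_ts, second_ts):
    # mean over several captured pairs smooths jitter in any single capture
    return mean(v - a for v, a in zip(first_ts, second_ts))

def apply_check_value(audio_ts, check_value):
    # shift audio timestamps onto the video timeline using the saved value
    return [t + check_value for t in audio_ts]

check = compute_check_value([100.0, 140.0, 180.0], [112.0, 150.0, 194.0])
assert check == -12.0
assert apply_check_value([0.0, 40.0], check) == [-12.0, 28.0]
```

Averaging over several captured pairs, rather than using a single difference, is one plausible choice of the "statistic related to the difference" mentioned above.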
Optionally, the play duration of each video frame and each audio frame is less than or equal to the screen refresh interval.
According to a second aspect of the embodiments of the present disclosure, there is provided a method for audio and video synchronization testing, including: acquiring a specific video frame sequence and an audio frame sequence, where the video frames correspond one-to-one to the audio frames, the video frames differ from one another, and the audio frames differ from one another; and playing the specific video frame sequence and the audio frame sequence, where each video frame in the specific video frame sequence is played simultaneously with its corresponding audio frame in the audio frame sequence, and the play duration of each video frame is set according to a predetermined frame rate so that at least one video frame of the sequence can be captured at that rate.
Optionally, the play duration of each video frame is set such that the remainder obtained by dividing it by the reciprocal of the predetermined frame rate (i.e., the frame interval) takes, across the sequence, every integer value smaller than that reciprocal.
Optionally, the video frames in the specific video frame sequence are color images each having a different color value.
Optionally, the play duration of each video frame and each audio frame is less than or equal to the screen refresh interval.
Optionally, the method for audio-video synchronization testing further includes: before playing the specific video frame sequence and the audio frame sequence, setting the playback mode of the specific video frame sequence and the audio frame sequence according to the predetermined frame rate.
According to a third aspect of the embodiments of the present disclosure, there is provided an apparatus for audio-video synchronization correction, including: a capturing unit configured to capture a specific video frame sequence and an audio frame sequence played at a predetermined frame rate, where the video frames correspond one-to-one to the audio frames, the video frames differ from one another, and the audio frames differ from one another, where each video frame in the specific video frame sequence is played simultaneously with its corresponding audio frame and the play duration of each video frame is set according to the predetermined frame rate so that at least one video frame of the sequence can be captured at that rate; an analysis unit configured to perform image analysis on the captured video frames to determine a first timestamp of a captured video frame and of its corresponding audio frame, and to determine a second timestamp of the audio frame corresponding to the captured video frame by performing sound analysis on the captured audio frames; and a correction unit configured to perform audio-video synchronization correction according to the first timestamp and the second timestamp.
Optionally, the play duration of each video frame is set such that the remainder obtained by dividing it by the reciprocal of the predetermined frame rate (i.e., the frame interval) takes, across the sequence, every integer value smaller than that reciprocal.
Optionally, the video frames in the specific video frame sequence are color images each having a different color value, and performing image analysis on the captured video frames includes analyzing the color values of the captured video frames.
Optionally, the correction unit is configured to calculate and save the difference between the first timestamp and the second timestamp, or a statistic derived from that difference, as a check value for audio and video synchronization.
Optionally, the capturing unit is further configured to capture any audio and video at the predetermined frame rate in response to a recording request of a user, and the correcting unit is further configured to synchronize the captured audio and video according to the saved check value.
According to a fourth aspect of the embodiments of the present disclosure, there is provided an apparatus for audio and video synchronization testing, including: an acquisition unit configured to acquire a specific video frame sequence and an audio frame sequence, where the video frames correspond one-to-one to the audio frames, the video frames differ from one another, and the audio frames differ from one another; and a playing unit configured to play the specific video frame sequence and the audio frame sequence, where each video frame in the specific video frame sequence is played simultaneously with its corresponding audio frame in the audio frame sequence, and the play duration of each video frame is set according to a predetermined frame rate so that at least one video frame of the sequence can be captured at that rate.
Optionally, the play duration of each video frame is set such that the remainder obtained by dividing it by the reciprocal of the predetermined frame rate (i.e., the frame interval) takes, across the sequence, every integer value smaller than that reciprocal.
Optionally, the video frames in the specific video frame sequence are color images each having a different color value.
Optionally, the play duration of each video frame and each audio frame is less than or equal to the screen refresh interval.
Optionally, the apparatus further includes a setting unit configured to set the playback mode of the specific video frame sequence and the audio frame sequence according to the predetermined frame rate before they are played.
According to a fifth aspect of embodiments of the present disclosure, there is provided an electronic device comprising at least one processor; at least one memory storing computer-executable instructions, wherein the computer-executable instructions, when executed by the at least one processor, cause the at least one processor to perform a method for audiovisual synchronization testing or a method for audiovisual synchronization correction as described above.
According to a sixth aspect of the embodiments of the present disclosure, there is provided a system for audio-video synchronization correction, including: an audio-video synchronization test device configured to acquire a specific video frame sequence and an audio frame sequence, where the video frames correspond one-to-one to the audio frames, the video frames differ from one another, and the audio frames differ from one another, and to play the specific video frame sequence and the audio frame sequence, where each video frame in the specific video frame sequence is played simultaneously with its corresponding audio frame and the play duration of each video frame is set according to a predetermined frame rate so that at least one video frame of the sequence can be captured at that rate; and an audio-video synchronization correction device configured to capture the specific video frame sequence and the audio frame sequence played at the predetermined frame rate, perform image analysis on the captured video frames to determine a first timestamp of a captured video frame and of its corresponding audio frame, determine a second timestamp of the audio frame corresponding to the captured video frame by performing sound analysis on the captured audio frames, and perform audio-video synchronization correction according to the first timestamp and the second timestamp.
According to a seventh aspect of embodiments of the present disclosure, there is provided a computer-readable storage medium storing instructions, which when executed by at least one processor, cause the at least one processor to execute the method for audiovisual synchronization test or the method for audiovisual synchronization correction as described above.
According to an eighth aspect of embodiments of the present disclosure, there is provided a computer program product, instructions of which are executed by at least one processor in an electronic device to perform the method for audiovisual synchronization test or the method for audiovisual synchronization correction as described above.
The technical solution provided by the embodiments of the present disclosure brings at least the following beneficial effects. Because the playback of the specific video frame sequence and audio frame sequence is set with the capture frame rate in mind, at least one video frame of the sequence can be captured during audio-video synchronization correction, and the correction is then performed through image analysis and sound analysis. A user therefore does not need to spend time adjusting the video capture settings to ensure that a suitable video frame is captured, and any user (for example, a user of any device model) can have audio and video synchronization corrected automatically simply by capturing the specific video frame sequence and audio frame sequence; no engineer has to perform the correction by tuning parameters during the development stage, which saves a large amount of labor and time. In addition, because no manual parameter tuning is involved, the correction is not affected by human error, and a better audio and video synchronization effect can be achieved.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the disclosure.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate exemplary embodiments consistent with the present disclosure and, together with the description, serve to explain the principles of the disclosure and are not to be construed as limiting the disclosure.
FIG. 1 is an exemplary system architecture diagram in which exemplary embodiments of the present disclosure may be applied;
fig. 2 is a flow chart illustrating a method for audio video synchronization testing according to an exemplary embodiment of the present disclosure;
fig. 3 is a flow chart illustrating a method for audio video synchronization correction according to an exemplary embodiment of the present disclosure;
fig. 4 is a block diagram illustrating an apparatus for audio video synchronization testing according to an exemplary embodiment of the present disclosure;
fig. 5 is a block diagram illustrating an apparatus for audio video synchronization correction according to an exemplary embodiment of the present disclosure;
fig. 6 is a block diagram illustrating a system for audio video synchronization correction according to an exemplary embodiment of the present disclosure;
fig. 7 is a block diagram illustrating an electronic device according to an exemplary embodiment of the present disclosure.
Detailed Description
In order to make the technical solutions of the present disclosure better understood by those of ordinary skill in the art, the technical solutions in the embodiments of the present disclosure will be clearly and completely described below with reference to the accompanying drawings.
It should be noted that the terms "first," "second," and the like in the description and claims of the present disclosure and in the above-described drawings are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used is interchangeable under appropriate circumstances such that the embodiments of the disclosure described herein are capable of operation in sequences other than those illustrated or otherwise described herein. The embodiments described in the following examples do not represent all embodiments consistent with the present disclosure. Rather, they are merely examples of apparatus and methods consistent with certain aspects of the present disclosure, as detailed in the appended claims.
Herein, the expression "at least one of the items" covers three parallel cases: "any one of the items", "a combination of any plurality of the items", and "all of the items". For example, "including at least one of A and B" covers three parallel cases: (1) including A; (2) including B; (3) including A and B. Likewise, "performing at least one of step one and step two" covers three parallel cases: (1) performing step one; (2) performing step two; (3) performing step one and step two.
As mentioned in the background section, in conventional audio/video correction for recording software, audio/video synchronization correction is performed for each device model during the development stage (e.g., the development stage of the recording software). For example, a video frame and an audio frame may be played, the time difference between capturing the audio frame and the video frame may be manually adjusted by an engineer, and the recording software may then be corrected according to the adjusted time difference. Specifically, suppose a video frame A and an audio frame B are both sent at time point a, but the recording software does not record them at the same time point: it records video frame A at time b and audio frame B at time c, with b not equal to c, so there is a time difference. When b - c > d, where d is the duration beyond which a person clearly notices the discrepancy, the audio and video in the recorded result are perceived as out of sync. An engineer then needs to adjust a parameter s so that |b - c - s| < t, where t is a duration short enough that a person does not notice it; that is, the residual recording time difference between audio and video becomes so small that no asynchrony is perceived. However, when general-purpose software must be adapted to many models, having an engineer manually tune appropriate parameters for each model to verify audio/video synchronization inevitably consumes a large amount of development time, and such a correction method tends to introduce human error, which can result in a poor audio/video synchronization correction effect.
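The manual tuning described above can be illustrated with a small check. The symbols b, c, s, and t follow the text (b: video-frame record time, c: audio-frame record time, s: correction parameter, t: perceptibility threshold); the millisecond values below are invented for the example.

```python
# Illustration of the manual correction described in the text; values are
# made up, and the 40 ms threshold is an assumption, not a figure from the
# patent.

def is_synchronized(b: float, c: float, s: float, t: float) -> bool:
    """True when the corrected recording skew |b - c - s| is imperceptible."""
    return abs(b - c - s) < t

# a raw skew of 90 ms is noticeable against an assumed 40 ms threshold ...
assert not is_synchronized(b=1090.0, c=1000.0, s=0.0, t=40.0)
# ... but tuning s to 90 ms removes the residual skew
assert is_synchronized(b=1090.0, c=1000.0, s=90.0, t=40.0)
```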
In view of the above, the present disclosure provides a method and an apparatus capable of automatically performing audio and video synchronization correction, which not only saves labor and time costs, but also can obtain a better audio and video synchronization effect.
Fig. 1 illustrates an exemplary system architecture 100 in which exemplary embodiments of the present disclosure may be applied.
As shown in fig. 1, the system architecture 100 may include terminal devices 101, 102, 103, a network 104, and a server 105. The network 104 serves as a medium for providing communication links between the terminal devices 101, 102, 103 and the server 105, and may include various connection types, such as wired links, wireless communication links, or fiber optic cables. A user may use the terminal devices 101, 102, 103 to interact with the server 105 via the network 104 to receive or send messages (e.g., an audio-video data upload request or an audio-video data acquisition request). Various communication client applications, such as a video recording application, an audio playing application, an instant messaging tool, a mailbox client, or social platform software, may be installed on the terminal devices 101, 102, and 103. The terminal devices 101, 102, and 103 may be hardware or software. When they are hardware, they may be various electronic devices that have a display screen and can play and record audio and video, including but not limited to smartphones, tablet computers, laptop computers, and desktop computers. When they are software, they may be installed in the electronic devices listed above and may be implemented as multiple pieces of software or software modules (for example, to provide distributed services) or as a single piece of software or software module. No specific limitation is imposed here.
The terminal devices 101, 102, 103 may be equipped with an image capturing device (e.g., a camera) to capture video data. In practice, the smallest visual unit of a video is a frame; each frame is a static image, and temporally successive frames are composited together to form a moving video. Further, the terminal devices 101, 102, 103 may also be equipped with a component (e.g., a speaker) that converts an electric signal into sound for playback, and with a device (e.g., a microphone) that converts an analog audio signal into a digital audio signal to pick up sound.
The terminal devices 101, 102, 103 may collect video data by using an image collecting device installed thereon, and may play audio data by using an audio processing component that supports audio playing and is installed thereon. Moreover, the terminal devices 101, 102, and 103 may perform processing such as timestamp calculation on the acquired audio and video data, and may store the processing result.
The server 105 may be a server providing various services, such as a background server supporting the video recording applications installed on the terminal devices 101, 102, 103. The background server can analyze and store received data such as audio-video data upload requests, and can also receive audio-video data acquisition requests sent by the terminal devices 101, 102, 103 and feed back the requested audio-video data to them.
The server may be hardware or software. When the server is hardware, it may be implemented as a distributed server cluster formed by multiple servers, or may be implemented as a single server. When the server is software, it may be implemented as multiple pieces of software or software modules (e.g., to provide distributed services), or as a single piece of software or software module. And is not particularly limited herein.
It should be noted that the method for testing and correcting audio and video synchronization provided in the embodiment of the present application is generally executed by the terminal devices 101, 102, and 103, and accordingly, the apparatus for testing and correcting audio and video synchronization is generally disposed in the terminal devices 101, 102, and 103.
It should be understood that the number of terminal devices, networks, and servers in fig. 1 is merely illustrative. There may be any number of terminal devices, networks, and servers, as desired for implementation, and the disclosure is not limited thereto.
FIG. 2 is a flow diagram illustrating a method for audio video synchronization testing in accordance with an exemplary embodiment.
in step S201, a specific video frame sequence and audio frame sequence are acquired. Here, each video frame and each audio frame in the specific video frame sequence and audio frame sequence may have a one-to-one correspondence, each video frame being different from each other, and each audio frame being different from each other. As an example, the specific video frame sequence and audio frame sequence may be acquired from an external device or server, or may also be acquired from a storage unit of a main body performing the method shown in fig. 2, and the present disclosure does not limit the manner of acquisition.
In step S202, a specific sequence of video frames and a sequence of audio frames are played. For example, a particular sequence of video frames and a sequence of audio frames may be played in response to a user's request for play. Specifically, each video frame in the specific video frame sequence and an audio frame in the audio frame sequence corresponding to the video frame may be played simultaneously, and the playing time of each video frame may be set according to a predetermined frame rate, so that at least one video frame in the specific video frame sequence can be captured according to the predetermined frame rate.
Here, the predetermined frame rate may be the frame rate at which the recording software records video, for example 25 frames/second, but is not limited thereto. Alternatively, the predetermined frame rate may be set in response to user input, and the playback manner of the specific video frame sequence and audio frame sequence may be set according to it. For example, an interface (e.g., a graphical user interface) may be provided that allows a user to input the predetermined frame rate, or information about the predetermined frame rate may be received from another device (e.g., a device on which the user has installed the recording software). The present disclosure does not limit the manner of obtaining the predetermined frame rate.
According to an exemplary embodiment of the present disclosure, the video frames in the specific video frame sequence may be set to differ from one another, the audio frames in the audio frame sequence may differ from one another, and each video frame in the specific video frame sequence may be played simultaneously with the audio frame corresponding to it in the audio frame sequence. As an example, in order to facilitate image analysis during subsequent audio-video synchronization correction and reduce the amount of computation, each video frame may be a solid-color image with its own distinct color value (hereinafter referred to as a "color image"), for example red, orange, yellow, green, blue, or violet. However, the video frames are not limited to color images; they may be any images that differ from one another. As an example, each audio frame may be the sound of a particular musical note, such as the solfège syllables do, re, mi, fa, sol, la, and si, but is not limited thereto. As described above, each video frame in the specific video frame sequence has a corresponding audio frame in the audio frame sequence, and the corresponding video frame and audio frame are played simultaneously. For example, the red color image may correspond to, and be played simultaneously with, the audio frame that sounds do. Furthermore, the playing time of each video frame may be set according to the predetermined frame rate, such that at least one video frame of the specific video frame sequence can be captured at the predetermined frame rate.
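To make the color/note pairing concrete, here is a minimal Python sketch of such a test sequence. The particular colors, note names, and frequencies are illustrative assumptions, not values fixed by the embodiment:

```python
# Illustrative sketch: pair each solid-color video frame with one note's sound.
# The colors and frequencies below are assumptions for demonstration only.

# Solfege syllables mapped to approximate fundamental frequencies (Hz, C major).
NOTE_FREQS = {"do": 261.63, "re": 293.66, "mi": 329.63,
              "fa": 349.23, "sol": 392.00, "la": 440.00, "si": 493.88}

# Solid-color video frames as RGB triples, one per audio note.
COLORS = {"do": (255, 0, 0),     # red
          "re": (255, 165, 0),   # orange
          "mi": (255, 255, 0),   # yellow
          "fa": (0, 128, 0),     # green
          "sol": (0, 0, 255),    # blue
          "la": (75, 0, 130),    # indigo
          "si": (238, 130, 238)}  # violet

def build_test_sequence():
    """Return (video_frame_color, audio_frame_frequency) pairs; each pair
    is played simultaneously, giving the one-to-one correspondence above."""
    return [(COLORS[n], NOTE_FREQS[n]) for n in NOTE_FREQS]
```

Any distinct colors and sounds would do; what matters is that the pairing is one-to-one so a captured frame uniquely identifies its audio counterpart.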
In the present disclosure, because the playback manner of the specific video frame sequence takes into account the frame rate at which the recording software subsequently records video, at least one video frame can always be captured when the played sequence is captured at the predetermined frame rate, and the captured video frame can conveniently be used for audio-video synchronization correction. A user therefore does not need to spend time adjusting capture settings to obtain a suitable video frame; any user (for example, a user of any device model) can capture the specific video frame sequence and audio frame sequence with the recording software and thereby have audio-video synchronization correction performed on that software automatically, so engineers no longer need to perform audio-video synchronization correction of the recording software for every device model during development.
According to an exemplary embodiment, the playing time of each video frame may be set such that the remainders of the playing times divided by the reciprocal of the predetermined frame rate include all integers smaller than that reciprocal. For convenience of description, setting the playing time of each video frame according to the predetermined frame rate is described below using a predetermined frame rate of 25 frames/second (i.e., one frame is captured every 40 ms; the reciprocal of the frame rate is 40 ms/frame). In this case, the specific video frame sequence may be set to include 40 different color images (alternatively, more than 40 color images may be used), with the first color image played at 0 ms, the second at 0 + 40 × 1 + 1 = 41 ms, the third at 0 + 40 × 2 + 2 = 82 ms, the fourth at 0 + 40 × 3 + 3 = 123 ms, and so on, up to the fortieth at 0 + 40 × 39 + 39 = 1599 ms. Optionally, during the intervals between the color images, a pure black image, some other image distinguishable from the color images, or no image at all may be displayed. The purpose of this playback scheme is to ensure that the remainders of the playing times of the respective colors divided by 40 are 0, 1, 2, 3, …, 38, 39, respectively, i.e., include all integers smaller than 40, which guarantees that at least one video frame is captured when the specific video frame sequence is captured at a frame rate of 25 frames/second. For example, if capture starts at 0 ms, at least the first color image (played at 0 ms) is captured, while if capture starts at 1 ms, at least the second color image (played at 41 ms) is captured, and so on.
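The timing scheme above can be sketched and checked in a few lines of Python; the helper name `play_times` is ours, not from the embodiment:

```python
def play_times(frame_rate=25, n_frames=None):
    """Play time (ms) of the k-th color image: k * interval + k, so that
    the remainders modulo the capture interval cover 0..interval-1."""
    interval = 1000 // frame_rate          # 40 ms at 25 frames/s
    n = n_frames or interval               # need at least `interval` frames
    return [k * interval + k for k in range(n)]

times = play_times()                       # [0, 41, 82, 123, ..., 1599]
interval = 40

# The remainders cover every integer below the interval...
assert sorted(t % interval for t in times) == list(range(interval))

# ...so a capture starting at any millisecond offset lands on some frame:
for offset in range(interval):
    assert any(t >= offset and (t - offset) % interval == 0 for t in times)
```

The second assertion is exactly the guarantee the embodiment relies on: whatever millisecond the recording software starts capturing at, one of the 40 play times is a multiple of 40 ms later.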
In addition, the specific video frame sequence and audio frame sequence may be played in a loop, so that multiple color images can be captured. As described above, because there is a certain interval between consecutive video frames and between consecutive audio frames (41 ms in the above example), playback inaccuracies caused by screen refresh or by overly long audio frames are avoided.
According to an exemplary embodiment, the playback duration of each video frame and each audio frame may be set as short as possible. As an example, the playback duration of each video frame and each audio frame may be less than or equal to the screen refresh interval. This ensures that at least one video frame of the specific video frame sequence can still be captured even when capture does not begin exactly at one of the integer millisecond offsets covered by the remainders.
According to the above method for audio-video synchronization testing, users of different device models can conveniently perform the audio-video synchronization test directly with the played specific video frame sequence and audio frame sequence, and audio-video synchronization correction can then be performed on that basis, sparing engineers from performing audio-video synchronization testing and correction for the recording software of each device model during development.
Optionally, the method shown in fig. 2 may further include the following step (not shown): setting the playback manner of the specific video frame sequence and audio frame sequence according to the predetermined frame rate. It should be noted, however, that this setting operation may instead be performed on another device; the method for audio-video synchronization testing may simply play the specific video frame sequence and audio frame sequence whose playback manner has been set elsewhere, without necessarily including the setting operation itself. Optionally, the method shown in fig. 2 may further include other operations, such as storing or processing the video frame sequence and audio frame sequence, which the present disclosure does not limit.

Fig. 3 is a flowchart illustrating a method for audio-video synchronization correction according to an exemplary embodiment of the present disclosure.
In step S301, the specific video frame sequence and audio frame sequence being played are captured at a predetermined frame rate. Here, as mentioned in the description of fig. 2 above, the video frames and audio frames in the specific video frame sequence and audio frame sequence are in one-to-one correspondence, with the video frames differing from one another and the audio frames differing from one another. In addition, each video frame in the specific video frame sequence is played simultaneously with the audio frame corresponding to it in the audio frame sequence, and the playing time of each video frame is set according to the predetermined frame rate so that at least one video frame in the specific video frame sequence can be captured at the predetermined frame rate. As an example, the playing times of the video frames are set such that their remainders when divided by the reciprocal of the predetermined frame rate include all integers smaller than that reciprocal. In addition, the playback duration of each video frame and each audio frame is less than or equal to the screen refresh interval. The playback manner of the specific video frame sequence and audio frame sequence was described above with reference to fig. 2 and is not repeated here.
In step S302, a first timestamp of the captured video frame and the audio frame corresponding to that video frame in the audio frame sequence are determined by performing image analysis on the captured video frame, and a second timestamp of the audio frame corresponding to the captured video frame is determined by performing sound analysis on the captured audio frames. When video frames and audio frames are captured, their capture times can be recorded. The capture time of a video frame or audio frame may be the system timestamp at the moment it was captured. Generally, a timestamp is a sequence of characters that uniquely identifies a moment in time.
As described above, because the playing time of each video frame is set according to the predetermined frame rate so that at least one video frame in the specific video frame sequence can be captured at that rate, capturing the played sequences at the predetermined frame rate in step S301 necessarily captures at least one video frame of the specific video frame sequence; and because the audio is captured continuously, the audio frame corresponding to that video frame is necessarily captured as well. According to an exemplary embodiment, the video frames of the specific video frame sequence may be color images each having a different color value. In this case, the image analysis of a captured video frame includes analyzing its color value to determine which frame of the specific video frame sequence it is, and thereby determining the first timestamp of the captured video frame and the audio frame corresponding to it in the audio frame sequence. Alternatively, the video frames in the specific video frame sequence may not be simple solid-color images but more complex images that differ from one another; for example, each image may contain different objects. In that case, the image analysis of the captured video frame may include object recognition or image feature matching. It should be noted that the present disclosure does not limit the manner of image analysis; different image analysis methods may be adopted for different kinds of video frames, as long as it can be determined which frame of the specific video frame sequence the captured video frame is.
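As one possible realization of the color-value analysis, a captured frame can be matched to the nearest color in the known palette; this nearest-color sketch is our own illustration, not the embodiment's prescribed method:

```python
def identify_frame(captured_rgb, palette):
    """Return the index of the palette color nearest to the captured frame's
    average color, by squared Euclidean distance in RGB space. Tolerates the
    small color shifts introduced by screen rendering and capture."""
    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(range(len(palette)), key=lambda i: dist(captured_rgb, palette[i]))

palette = [(255, 0, 0), (255, 165, 0), (255, 255, 0), (0, 128, 0)]
# A slightly noisy capture of the yellow frame still maps to index 2:
assert identify_frame((250, 248, 10), palette) == 2
```

Because the frame index directly determines which note was playing at the same instant, this lookup is all the "image analysis" that solid-color frames require.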
After the audio frame corresponding to the captured video frame is determined, the second timestamp of that audio frame may be determined by performing sound analysis on the captured audio frames. For convenience of description, still taking the playback manner described with reference to fig. 2 as an example, suppose color analysis determines that the captured video frame is the yellow color image. Because each video frame has a corresponding audio frame in the audio frame sequence and the corresponding video frame and audio frame are played simultaneously, the audio frame corresponding to the yellow image can be determined. Assuming that audio frame is the sound corresponding to mi, the sound analysis may, for example, perform spectral analysis on the captured audio to locate the moment at which that note was recorded, thereby determining the timestamp corresponding to mi. It should also be noted that the present disclosure does not limit the sound analysis method; different sound analysis methods may be adopted for different kinds of audio frames.
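One common way to realize such spectral analysis is to take the dominant FFT peak of a captured audio segment and match it against the known note frequencies. The sketch below uses NumPy; the function name and note table are our own illustrative assumptions:

```python
import numpy as np

def identify_note(samples, sample_rate, note_freqs):
    """Find the dominant frequency of a captured audio segment via FFT and
    return the name of the closest known note."""
    spectrum = np.abs(np.fft.rfft(samples))
    freqs = np.fft.rfftfreq(len(samples), d=1.0 / sample_rate)
    peak = freqs[np.argmax(spectrum[1:]) + 1]   # skip the DC bin
    return min(note_freqs, key=lambda n: abs(note_freqs[n] - peak))

# A pure 330 Hz test tone should be identified as "mi" (~329.63 Hz):
sr = 8000
t = np.arange(sr // 10) / sr                    # 100 ms of audio
tone = np.sin(2 * np.pi * 330 * t)
notes = {"do": 261.63, "mi": 329.63, "sol": 392.00}
assert identify_note(tone, sr, notes) == "mi"
```

Scanning the captured audio in short windows with this classifier yields the moment the target note first appears, i.e., the second timestamp.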
In step S303, audio-video synchronization correction is performed according to the first timestamp and the second timestamp. Specifically, in step S303, the difference between the first timestamp and the second timestamp, or a statistic of such differences, may be calculated and saved as a check value for audio-video synchronization. For example, when only one video frame of the specific video frame sequence is captured, the difference between the first timestamp and the second timestamp may be calculated and saved as the check value. Suppose the captured video frame is the yellow color image with timestamp video_time, and the audio frame corresponding to it is the sound corresponding to mi with timestamp audio_time; the check value is then sync_time = video_time - audio_time, and it may be stored locally. As another example, when multiple video frames of the specific video frame sequence are captured, a statistic (e.g., mean, weighted mean, or median) of the differences between the first timestamp of each captured video frame and the second timestamp of its corresponding audio frame may be calculated as the check value, which yields a more robust result. The check value is thus the difference between the recording times of a video frame and its corresponding audio frame; it can be stored locally and used in the user's subsequent recordings.
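The check-value computation can be sketched as follows; the single-frame and multi-frame cases described above both reduce to averaging the timestamp differences (the function name is illustrative, and the mean is just one of the statistics the embodiment permits):

```python
def check_value(pairs):
    """Audio-video sync check value from (video_timestamp, audio_timestamp)
    pairs of captured frames: the difference for a single pair, or the mean
    difference when several frames were captured."""
    diffs = [video_ts - audio_ts for video_ts, audio_ts in pairs]
    return sum(diffs) / len(diffs)

# Single captured frame: the check value is simply video_time - audio_time.
assert check_value([(1000, 970)]) == 30.0
# Several captured frames: averaging smooths out per-frame measurement noise.
assert check_value([(1000, 970), (2041, 2013), (3082, 3050)]) == 30.0
```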
Optionally, the method for audio-video synchronization correction of the present disclosure may further include the following step (not shown): in response to a user's recording request, capturing arbitrary audio and video at the predetermined frame rate and synchronizing the captured audio and video according to the saved check value. For example, during recording, the previously saved check value for audio-video synchronization correction can be retrieved and subtracted from the timestamp of each captured video frame, yielding timestamps that bring audio and video into synchronization and thus a better synchronized recording result.
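Applying the stored check value during a later recording can be sketched as follows (function and variable names are illustrative):

```python
def correct_video_timestamps(video_timestamps, sync_check_value):
    """Subtract the stored check value from each captured video frame's
    timestamp so that the video track lines up with the audio track."""
    return [ts - sync_check_value for ts in video_timestamps]

# With a stored check value of 30 ms, video frames captured at 40 ms
# intervals are shifted back by 30 ms to match the audio:
assert correct_video_timestamps([1030, 1070, 1110], 30) == [1000, 1040, 1080]
```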
With the above method for audio-video synchronization correction, users can perform the correction themselves, so engineers no longer need to perform audio-video synchronization correction of the recording software for each device model during development. Because manual parameter tuning by engineers is eliminated, a more objective and accurate correction result can be obtained.
Fig. 4 is a block diagram illustrating an apparatus for audio-video synchronization test (hereinafter, simply referred to as "audio-video synchronization test apparatus" for convenience of description) according to an exemplary embodiment of the present disclosure.
Referring to fig. 4, the av synchronization testing apparatus 400 may include an acquisition unit 401 and a play unit 402. The acquisition unit 401 may acquire a specific video frame sequence and audio frame sequence. Here, each video frame and each audio frame in the specific video frame sequence and audio frame sequence have a one-to-one correspondence, and each video frame is different from each other and each audio frame is different from each other. The playing unit 402 can play a specific video frame sequence and audio frame sequence. Specifically, each video frame in the specific video frame sequence and an audio frame corresponding to the video frame in the audio frame sequence can be played simultaneously, and the playing time of each video frame is set according to a predetermined frame rate, so that at least one video frame in the specific video frame sequence can be captured according to the predetermined frame rate. The captured at least one video frame may be used for audio video synchronization testing and audio video synchronization correction. Optionally, the av synchronization testing apparatus 400 may further include a setting unit (not shown). The setting unit may set a play manner of the specific video frame sequence and the audio frame sequence according to a predetermined frame rate. The predetermined frame rate, the setting of the playing modes of the specific video frame sequence and the audio frame sequence, and the like have been described in the description related to fig. 2, and are not described again here.
It should be noted that the operation performed by the setting unit may be performed by another device, that is, after the specific playing mode of the video frame sequence and the audio frame sequence is set by the other device, the playing unit in the av synchronization testing apparatus 400 plays the specific video frame sequence and the audio frame sequence after acquiring the specific video frame sequence and the audio frame sequence. Optionally, the audio/video synchronization testing apparatus 400 may further include other units, such as a storage unit, a data quantity unit, and the like.
In addition, since the audio-video synchronization testing method shown in fig. 2 can be executed by the audio-video synchronization testing apparatus 400 shown in fig. 4, any relevant details related to the operations executed by the units in fig. 4 can be referred to the corresponding description of fig. 2, and are not described herein again.
Fig. 5 is a block diagram illustrating an apparatus for av synchronization correction (hereinafter, simply referred to as "av synchronization correction apparatus" for convenience of description) according to an exemplary embodiment of the present disclosure.
Referring to fig. 5, the av synchronization correcting apparatus 500 may include a capturing unit 501, an analyzing unit 502, and a correcting unit 503. Specifically, the capturing unit 501 may capture a specific video frame sequence and an audio frame sequence played back at a predetermined frame rate. According to an exemplary embodiment, each video frame and each audio frame in the particular sequence of video frames and the sequence of audio frames have a one-to-one correspondence, and each video frame is different from each other and each audio frame is different from each other. Furthermore, each video frame in the specific video frame sequence is played simultaneously with an audio frame corresponding to the video frame in the audio frame sequence, and the playing time of each video frame is set according to the predetermined frame rate, so that at least one video frame in the specific video frame sequence can be captured at the predetermined frame rate. The analysis unit 502 may perform image analysis on the captured video frames to determine first timestamps of the captured video frames and audio frames of the sequence of audio frames corresponding to the captured video frames, and may determine second timestamps of the audio frames corresponding to the captured video frames by performing sound analysis on the captured audio frames. The correction unit 503 may perform av synchronization correction according to the first time stamp and the second time stamp.
Since the av synchronization correction method shown in fig. 3 can be executed by the av synchronization correction apparatus 500 shown in fig. 5, and the capturing unit 501, the analyzing unit 502, and the correcting unit 503 can respectively execute operations corresponding to step 301, step 302, and step 303 in fig. 3, any relevant details related to the operations executed by the units in fig. 5 can be referred to the corresponding description related to fig. 3, and are not repeated here.
Further, it should be noted that although the audio-video synchronization testing apparatus 400 and the audio-video synchronization correcting apparatus 500 are described above as being divided into units that perform the corresponding processes, it is clear to those skilled in the art that these processes may also be performed without any specific division into units, or without explicit boundaries between them.
Fig. 6 is a block diagram illustrating a system for av synchronization correction (hereinafter, referred to as an "av synchronization correction system" for convenience of description) according to an exemplary embodiment of the present disclosure.
Referring to fig. 6, the av synchronization correction system 600 may include an av synchronization testing means 601 and an av synchronization correction means 602. Specifically, the av synchronization testing apparatus 601 may first acquire a specific video frame sequence and an audio frame sequence. Here, each video frame and each audio frame in the specific video frame sequence and audio frame sequence have a one-to-one correspondence, and each video frame is different from each other and each audio frame is different from each other. Subsequently, the av synchronization test apparatus 601 may play a specific video frame sequence and an audio frame sequence, wherein each video frame in the specific video frame sequence is played simultaneously with an audio frame corresponding to the video frame in the audio frame sequence, and the playing time of each video frame is set according to a predetermined frame rate, so that at least one video frame in the specific video frame sequence can be captured according to the predetermined frame rate. The av synchronization correction apparatus 602 may capture the specific video frame sequence and the audio frame sequence played at the predetermined frame rate, perform image analysis on the captured video frames to determine a first time stamp of the captured video frame and an audio frame corresponding to the captured video frame in the audio frame sequence, determine a second time stamp of the audio frame corresponding to the captured video frame by performing sound analysis on the captured audio frame, and perform av synchronization correction according to the first time stamp and the second time stamp.
In addition, the audio-video synchronization testing apparatus 601 and the audio-video synchronization correcting apparatus 602 may be implemented in hardware, or entirely in software through computer programs or instructions. For example, the audio-video synchronization testing apparatus 601 may be implemented as audio-video synchronization test software capable of playing audio and video, and the audio-video synchronization correcting apparatus 602 may be implemented as recording software having both audio-video recording and audio-video synchronization correction functions. An example scenario applying the audio-video synchronization correction system 600 of the present disclosure is briefly described below in conjunction with fig. 6. Suppose a user finds that the recording software installed on their terminal suffers from audio-video desynchronization and wishes to perform audio-video synchronization correction on it; the user can then download the audio-video synchronization test software. As described above, the audio-video synchronization test software plays the specific video frame sequence and audio frame sequence in the specific manner, and the user only needs to capture the played sequences with the recording software for audio-video synchronization correction to be performed automatically. For example, as described with reference to fig. 3 and 4, the recording software may calculate a check value for audio-video synchronization correction and save it locally. Later, when the user records audio and video, the recording software can retrieve the check value locally and use it to keep the recorded audio and video synchronized.
Further, as an example, the audio video synchronization test software may default to playing a specific video frame sequence and audio frame sequence set according to a frame rate used when a video is recorded by general recording software. Optionally, the audio video synchronization test software may also provide a user interface to allow the user to input the frame rate at which their recording software records the video, and then play back a specific sequence of video frames and a specific sequence of audio frames set according to the frame rate input by the user. In addition, optionally, the recording software may provide a user interface to receive an audio/video synchronization correction request from a user, and in addition, may provide a user interface to allow the user to download audio/video synchronization test software.
The above is only an exemplary application scenario of the audio and video synchronization correction system 600, and various modifications may exist in specific applications, which is not limited in this application.
Fig. 7 is a block diagram of an electronic device 700 according to an embodiment of the present disclosure. The electronic device 700 may include at least one memory 701 and at least one processor 702, the at least one memory 701 storing a set of computer-executable instructions that, when executed by the at least one processor 702, perform a method for audio-video synchronization testing or a method for audio-video synchronization correction according to an embodiment of the present disclosure.
By way of example, the electronic device may be a PC, a tablet device, a personal digital assistant, a smartphone, or any other device capable of executing the above set of instructions. The electronic device need not be a single device; it may be any collection of devices or circuits capable of executing the above instructions (or instruction sets) individually or jointly. The electronic device may also be part of an integrated control system or system manager, or may be configured as a portable electronic device that interfaces locally or remotely (e.g., via wireless transmission).
In an electronic device, a processor may include a Central Processing Unit (CPU), a Graphics Processing Unit (GPU), a programmable logic device, a special-purpose processor system, a microcontroller, or a microprocessor. By way of example, and not limitation, processors may also include analog processors, digital processors, microprocessors, multi-core processors, processor arrays, network processors, and the like.
The processor may execute instructions or code stored in the memory, which may also store data. The instructions and data may also be transmitted or received over a network via a network interface device, which may employ any known transmission protocol.
The memory may be integral to the processor, e.g., RAM or flash memory disposed within an integrated circuit microprocessor or the like. Further, the memory may comprise a stand-alone device, such as an external disk drive, storage array, or other storage device usable by any database system. The memory and the processor may be operatively coupled or may communicate with each other, such as through an I/O port, a network connection, etc., so that the processor can read files stored in the memory.
In addition, the electronic device may also include a video display (such as a liquid crystal display) and a user interaction interface (such as a keyboard, mouse, touch input device, etc.). All components of the electronic device may be connected to each other via a bus and/or a network.
According to an embodiment of the present disclosure, there may also be provided a computer-readable storage medium storing instructions that, when executed by at least one processor, cause the at least one processor to perform a method for audio-video synchronization testing or a method for audio-video synchronization correction according to an exemplary embodiment of the present disclosure. Examples of the computer-readable storage medium here include: read-only memory (ROM), programmable read-only memory (PROM), electrically erasable programmable read-only memory (EEPROM), random-access memory (RAM), dynamic random-access memory (DRAM), static random-access memory (SRAM), flash memory, non-volatile memory, CD-ROM, CD-R, CD+R, CD-RW, CD+RW, DVD-ROM, DVD-R, DVD+R, DVD-RW, DVD+RW, DVD-RAM, BD-ROM, BD-R, BD-R LTH, BD-RE, Blu-ray or other optical disc storage, hard disk drive (HDD), solid-state drive (SSD), card-type memory (such as a multimedia card, a Secure Digital (SD) card, or an eXtreme Digital (XD) card), magnetic tape, floppy disk, magneto-optical data storage devices, optical data storage devices, magnetic data storage devices, and any other device configured to store a computer program and any associated data, data files, and data structures in a non-transitory manner and to provide them to a processor or computer so that the processor or computer can execute the computer program.
The computer program in the computer-readable storage medium described above can be run in an environment deployed in a computer device, such as a client, a host, a proxy appliance, a server, or the like, and further, in one example, the computer program and any associated data, data files, and data structures are distributed across networked computer systems such that the computer program and any associated data, data files, and data structures are stored, accessed, and executed in a distributed fashion by one or more processors or computers.
According to an embodiment of the present disclosure, there may also be provided a computer program product, instructions of which are executable by at least one processor in an electronic device to implement a method for audio video synchronization test or a method for audio video synchronization correction according to an exemplary embodiment of the present disclosure.
Other embodiments of the disclosure will be apparent to those skilled in the art from consideration of the specification and practice of the disclosure disclosed herein. This application is intended to cover any variations, uses, or adaptations of the disclosure following, in general, the principles of the disclosure and including such departures from the present disclosure as come within known or customary practice within the art to which the disclosure pertains. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the disclosure being indicated by the following claims.
It will be understood that the present disclosure is not limited to the precise arrangements described above and shown in the drawings and that various modifications and changes may be made without departing from the scope thereof. The scope of the present disclosure is limited only by the appended claims.

Claims (21)

1. A method for audio-video synchronization correction, comprising:
capturing a specific video frame sequence and an audio frame sequence that are played at a predetermined frame rate, wherein the video frames in the specific video frame sequence correspond one-to-one with the audio frames in the audio frame sequence, the video frames are different from one another, and the audio frames are different from one another, wherein each video frame in the specific video frame sequence is played simultaneously with its corresponding audio frame in the audio frame sequence, and the playing time of each video frame is set according to the predetermined frame rate so that at least one video frame in the specific video frame sequence can be captured at the predetermined frame rate;
performing image analysis on the captured video frames to determine first timestamps of the captured video frames and of the audio frames in the audio frame sequence corresponding to the captured video frames, and performing sound analysis on the captured audio frames to determine second timestamps of the audio frames corresponding to the captured video frames; and
performing audio-video synchronization correction according to the first timestamps and the second timestamps,
wherein the playing times of the video frames are set such that the remainders obtained by dividing the playing times by the reciprocal of the predetermined frame rate cover all integers smaller than the reciprocal.
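One way to read the remainder condition in claim 1 is that the scheduled play times of the video frames, taken modulo the capture period (the reciprocal of the predetermined frame rate), cover every integer offset smaller than the period, so a capture taken once per period always lands inside some frame. The following is a minimal Python sketch of one such schedule; the function name, the period+1 spacing construction, and the millisecond units are illustrative assumptions, not taken from the patent:

```python
def schedule_play_times(num_frames: int, frame_rate_hz: int) -> list[int]:
    """Schedule frame play times (ms) so that their remainders modulo the
    capture period cover 0, 1, ..., period-1, as required by claim 1."""
    period_ms = round(1000 / frame_rate_hz)  # reciprocal of the frame rate, in ms
    # Spacing frames (period + 1) ms apart makes t_i mod period_ms equal
    # i mod period_ms, which cycles through every residue class in turn.
    return [i * (period_ms + 1) for i in range(num_frames)]
```

For a 50 Hz capture rate the period is 20 ms, and the first 20 scheduled times hit every remainder 0 through 19 exactly once, so a periodic screen capture cannot fall between frames for the whole cycle.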
2. The method of claim 1, wherein the video frames in the particular sequence of video frames are color images each having a different color value,
wherein performing image analysis on the captured video frames comprises: analyzing the color values of the captured video frames.
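The color images of claim 2 can act as machine-readable frame labels: if each frame is filled with a unique color that encodes its index, image analysis of a captured frame reduces to decoding the color and looking up the frame's scheduled (first) timestamp. A hypothetical sketch, assuming a 24-bit RGB encoding and a precomputed table of scheduled play times (all names and the encoding scheme are illustrative):

```python
def index_to_color(index: int) -> tuple[int, int, int]:
    # Pack a frame index into a 24-bit RGB triple: one distinct color per frame.
    return ((index >> 16) & 0xFF, (index >> 8) & 0xFF, index & 0xFF)

def color_to_index(rgb: tuple[int, int, int]) -> int:
    # Inverse of index_to_color: recover the frame index from the color.
    r, g, b = rgb
    return (r << 16) | (g << 8) | b

def first_timestamp(rgb: tuple[int, int, int], play_times: list[int]) -> int:
    # The "first timestamp" of a captured frame is its scheduled play time,
    # found via the frame index decoded from the frame's solid color.
    return play_times[color_to_index(rgb)]
```

Because the mapping is exact rather than learned, the decoded index is robust as long as the capture preserves color values losslessly.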
3. The method of claim 2, wherein performing audio-video synchronization correction based on the first timestamp and the second timestamp comprises:
calculating and saving, as a check value for audio-video synchronization, the difference between the first timestamp and the second timestamp or a statistic related to the difference.
4. The method of claim 3, further comprising:
in response to a recording request from a user, capturing any audio and video at the predetermined frame rate, and synchronizing the captured audio and video according to the saved check value.
5. The method of claim 1, wherein the playback duration of each video frame and each audio frame is less than or equal to a screen refresh interval.
6. A method for audio-video synchronization testing, comprising:
acquiring a specific video frame sequence and an audio frame sequence, wherein the video frames in the specific video frame sequence correspond one-to-one with the audio frames in the audio frame sequence, the video frames are different from one another, and the audio frames are different from one another;
playing the specific video frame sequence and the audio frame sequence, wherein each video frame in the specific video frame sequence is played simultaneously with its corresponding audio frame in the audio frame sequence, and the playing time of each video frame is set according to a predetermined frame rate so that at least one video frame in the specific video frame sequence can be captured at the predetermined frame rate,
wherein the playing times of the video frames are set such that the remainders obtained by dividing the playing times by the reciprocal of the predetermined frame rate cover all integers smaller than the reciprocal.
7. The method of claim 6, wherein the video frames in the particular sequence of video frames are color images each having a different color value.
8. The method of claim 6, wherein the playback duration of each video frame and each audio frame is less than or equal to a screen refresh interval.
9. The method of any one of claims 6 to 8, further comprising: before playing the specific video frame sequence and the audio frame sequence, setting the playing modes of the specific video frame sequence and the audio frame sequence according to the predetermined frame rate.
10. An apparatus for audio-video synchronization correction, comprising:
a capturing unit configured to capture a specific video frame sequence and an audio frame sequence that are played at a predetermined frame rate, wherein the video frames in the specific video frame sequence correspond one-to-one with the audio frames in the audio frame sequence, the video frames are different from one another, and the audio frames are different from one another, wherein each video frame in the specific video frame sequence is played simultaneously with its corresponding audio frame in the audio frame sequence, and the playing time of each video frame is set according to the predetermined frame rate so that at least one video frame in the specific video frame sequence can be captured at the predetermined frame rate;
an analysis unit configured to perform image analysis on the captured video frames to determine first timestamps of the captured video frames and of the audio frames in the audio frame sequence corresponding to the captured video frames, and to perform sound analysis on the captured audio frames to determine second timestamps of the audio frames corresponding to the captured video frames;
a correction unit configured to perform audio-video synchronization correction according to the first time stamp and the second time stamp,
wherein the playing times of the video frames are set such that the remainders obtained by dividing the playing times by the reciprocal of the predetermined frame rate cover all integers smaller than the reciprocal.
11. The apparatus of claim 10, wherein the video frames in the particular sequence of video frames are color images respectively having different color values,
wherein performing image analysis on the captured video frame comprises: color values of the captured video frame are analyzed.
12. The apparatus according to claim 11, wherein the correction unit is configured to calculate and save a difference value of the first time stamp and the second time stamp or a statistical value related to the difference value as the check value for audio-video synchronization.
13. The apparatus of claim 12, wherein the capturing unit is further configured to capture any audio video at the predetermined frame rate in response to a recording request by a user, and the correction unit is further configured to synchronize the captured audio video according to the saved check value.
14. The apparatus of claim 10, wherein a playback duration of each video frame and each audio frame is less than or equal to a screen refresh interval.
15. An apparatus for audio video synchronization testing, comprising:
an acquisition unit configured to acquire a specific video frame sequence and an audio frame sequence, wherein the video frames in the specific video frame sequence correspond one-to-one with the audio frames in the audio frame sequence, the video frames are different from one another, and the audio frames are different from one another;
a playing unit configured to play the specific video frame sequence and the audio frame sequence, wherein each video frame in the specific video frame sequence is played simultaneously with its corresponding audio frame in the audio frame sequence, and the playing time of each video frame is set according to a predetermined frame rate so that at least one video frame in the specific video frame sequence can be captured at the predetermined frame rate,
wherein the playing times of the video frames are set such that the remainders obtained by dividing the playing times by the reciprocal of the predetermined frame rate cover all integers smaller than the reciprocal.
16. The apparatus of claim 15, wherein the video frames in the particular sequence of video frames are color images each having a different color value.
17. The apparatus of claim 15, wherein a playback duration of each video frame and each audio frame is less than or equal to a screen refresh interval.
18. The apparatus of any one of claims 15 to 17, further comprising: a setting unit configured to set the playing modes of the specific video frame sequence and the audio frame sequence according to the predetermined frame rate before the specific video frame sequence and the audio frame sequence are played.
19. An electronic device, comprising:
at least one processor;
at least one memory storing computer-executable instructions,
wherein the computer-executable instructions, when executed by the at least one processor, cause the at least one processor to perform the method of any one of claims 1 to 9.
20. A system for audio-video synchronization correction, comprising: an audio-video synchronization test device configured to: acquire a specific video frame sequence and an audio frame sequence, wherein the video frames in the specific video frame sequence correspond one-to-one with the audio frames in the audio frame sequence, the video frames are different from one another, and the audio frames are different from one another; and play the specific video frame sequence and the audio frame sequence, wherein each video frame in the specific video frame sequence is played simultaneously with its corresponding audio frame in the audio frame sequence, and the playing time of each video frame is set according to a predetermined frame rate so that at least one video frame in the specific video frame sequence can be captured at the predetermined frame rate; and
an audio-video synchronization correction device configured to: capture the specific video frame sequence and the audio frame sequence played at the predetermined frame rate, perform image analysis on the captured video frames to determine first timestamps of the captured video frames and of the audio frames in the audio frame sequence corresponding to the captured video frames, perform sound analysis on the captured audio frames to determine second timestamps of the audio frames corresponding to the captured video frames, and perform audio-video synchronization correction according to the first timestamps and the second timestamps,
wherein the playing times of the video frames are set such that the remainders obtained by dividing the playing times by the reciprocal of the predetermined frame rate cover all integers smaller than the reciprocal.
21. A computer-readable storage medium storing instructions that, when executed by at least one processor, cause the at least one processor to perform the method of any one of claims 1 to 9.
CN202011052394.7A 2020-09-29 2020-09-29 Method and device for audio and video synchronization test and correction and electronic equipment Active CN112188259B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011052394.7A CN112188259B (en) 2020-09-29 2020-09-29 Method and device for audio and video synchronization test and correction and electronic equipment


Publications (2)

Publication Number Publication Date
CN112188259A CN112188259A (en) 2021-01-05
CN112188259B (en) 2022-09-23

Family

ID=73945977

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011052394.7A Active CN112188259B (en) 2020-09-29 2020-09-29 Method and device for audio and video synchronization test and correction and electronic equipment

Country Status (1)

Country Link
CN (1) CN112188259B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116230003B (en) * 2023-03-09 2024-04-26 北京安捷智合科技有限公司 Audio and video synchronization method and system based on artificial intelligence
CN117061730A (en) * 2023-10-11 2023-11-14 天津华来科技股份有限公司 Method for testing sound and picture synchronization performance of intelligent camera

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101616331A (en) * 2009-07-27 2009-12-30 北京汉邦高科数字技术有限公司 A kind of method that video frame rate and audio-visual synchronization performance are tested
CN103167342A (en) * 2013-03-29 2013-06-19 天脉聚源(北京)传媒科技有限公司 Audio and video synchronous processing device and method
CN106385628A (en) * 2016-09-23 2017-02-08 努比亚技术有限公司 Apparatus and method for analyzing audio and video asynchronization
CN111050023A (en) * 2019-12-17 2020-04-21 深圳追一科技有限公司 Video detection method and device, terminal equipment and storage medium
CN111464256A (en) * 2020-04-14 2020-07-28 北京百度网讯科技有限公司 Time stamp correction method and device, electronic equipment and storage medium

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7948558B2 (en) * 2006-09-29 2011-05-24 The Directv Group, Inc. Audio video timing measurement and synchronization



Similar Documents

Publication Publication Date Title
AU2017200865B2 (en) Methods and apparatus for an embedded appliance
CN112188259B (en) Method and device for audio and video synchronization test and correction and electronic equipment
CN110585702B (en) Sound and picture synchronous data processing method, device, equipment and medium
CN112040225B (en) Playing delay difference measuring method, device, equipment, system and storage medium
CN110335590B (en) Voice recognition test method, device and system
CN106791821A (en) Play appraisal procedure and device
CN111026595B (en) Multi-screen system synchronous display testing method, device and equipment
CN102148963A (en) Method and system for facing digital high-definition network video monitoring based on cloud storage
CN110582016A (en) video information display method, device, server and storage medium
CN114155852A (en) Voice processing method and device, electronic equipment and storage medium
CN109600571B (en) Multimedia resource transmission test system and multimedia resource transmission test method
CN110415318B (en) Image processing method and device
CN115623252B (en) Method, device and storage medium for controlling restarting and stream pushing of automatic detection of online examination
CN107995538B (en) Video annotation method and system
CN114466145B (en) Video processing method, device, equipment and storage medium
CN114245036B (en) Video production method and device
CN113194270B (en) Video processing method and device, electronic equipment and storage medium
CN114299089A (en) Image processing method, image processing device, electronic equipment and storage medium
CN114500767B (en) Input video source adjusting method and device, video input card and video processing equipment
CN114071127A (en) Live video delay testing method and device, storage medium and electronic equipment
CN111683215B (en) Video playback method and device, electronic equipment and computer readable storage medium
CN114866829A (en) Synchronous playing control method and device
CN110753261B (en) Audio and video time delay testing method and device, computer equipment and storage medium
US11930299B2 (en) Measuring audio and video latencies in virtual desktop environments
CN113741842B (en) Screen refresh delay determination method and device, storage medium and electronic equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant