CN111787365A - Multi-channel audio and video synchronization method and device - Google Patents

Multi-channel audio and video synchronization method and device

Info

Publication number
CN111787365A
CN111787365A CN202010692102.XA CN202010692102A
Authority
CN
China
Prior art keywords
audio
video
frame
path
time stamp
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010692102.XA
Other languages
Chinese (zh)
Inventor
吴广礼
黄新明
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ysten Technology Co ltd
Original Assignee
Ysten Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ysten Technology Co ltd filed Critical Ysten Technology Co ltd
Priority to CN202010692102.XA priority Critical patent/CN111787365A/en
Publication of CN111787365A publication Critical patent/CN111787365A/en
Pending legal-status Critical Current

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/80Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
    • H04N21/85Assembly of content; Generation of multimedia applications
    • H04N21/854Content authoring
    • H04N21/8547Content authoring involving timestamps for synchronizing content
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/23Processing of content or additional data; Elementary server operations; Server middleware
    • H04N21/236Assembling of a multiplex stream, e.g. transport stream, by combining a video stream with other content or additional data, e.g. inserting a URL [Uniform Resource Locator] into a video stream, multiplexing software data into a video stream; Remultiplexing of multiplex streams; Insertion of stuffing bits into the multiplex stream, e.g. to obtain a constant bit-rate; Assembling of a packetised elementary stream
    • H04N21/23602Multiplexing isochronously with the video sync, e.g. according to bit-parallel or bit-serial interface formats, as SDI
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/4302Content synchronisation processes, e.g. decoder synchronisation
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/47End-user applications
    • H04N21/478Supplemental services, e.g. displaying phone caller identification, shopping application
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/47End-user applications
    • H04N21/478Supplemental services, e.g. displaying phone caller identification, shopping application
    • H04N21/4781Games

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Computer Security & Cryptography (AREA)
  • Two-Way Televisions, Distribution Of Moving Picture Or The Like (AREA)

Abstract

The invention provides a multi-channel audio and video synchronization method that solves the problem of pictures and audio falling out of sync when multiple audio/video streams are mixed. The method comprises the following steps: receiving multiple channels of audio/video data, recording the time at which the first frame of each channel arrives at the server, and obtaining the original display timestamp of each frame in each channel; according to the original display timestamp of the first frame in each channel, updating that first frame's display timestamp to the time at which it arrived at the server; updating the display timestamps of the non-first frames in each channel according to their original display timestamps, the original display timestamp of the channel's first frame, and the time at which the first frame arrived at the server; and mixing and outputting the audio/video data whose display timestamps have been updated, so that the receiving terminal plays the data according to the updated display timestamps. This solves the problem of desynchronization after mixing. A corresponding apparatus, device and medium are also provided.

Description

Multi-channel audio and video synchronization method and device
Technical Field
The invention belongs to the technical field of streaming media processing, and particularly relates to a multi-channel audio and video synchronization method, a synchronization device, a computer readable medium and electronic equipment.
Background
Internet-based real-time audio and video applications, such as online classrooms and live game streaming, are attracting attention as a new requirement. During network transmission of real-time audio and video, the complex network environment introduces delay, and the internal clocks of the various clients may be inconsistent, so the multiple audio/video media streams end up out of sync after being received and processed at the server side.
At present, a common technique is to synchronize using the timestamps in the RTP (Real-time Transport Protocol) packets of the real-time audio/video streams. However, if the clients' clocks are not synchronized, mixing the multiple streams fails; that scheme requires clock synchronization across the server and all client devices.
Disclosure of Invention
To overcome the above drawbacks of the prior art, the present invention provides an audio/video data synchronization scheme based on a unified server time, avoiding the mixing failures caused by non-unified audio/video clocks. In a first aspect, an embodiment of the present invention provides a multi-channel audio and video synchronization method comprising the following steps:
S110, receiving multiple channels of audio/video data, recording the time at which the first frame of each channel of audio/video arrives at the server, and obtaining the original display timestamp of each frame in each channel of audio/video;
S120, according to the original display timestamp of the first frame in each channel of audio/video, updating the display timestamp of that first frame to the time at which it arrived at the server;
S130, updating the display timestamps of the non-first frames in each channel of audio/video according to the original display timestamps of those frames, the original display timestamp of the channel's first frame, and the time at which the first frame arrived at the server;
and S140, mixing and outputting the channels of audio/video data whose display timestamps have been updated, so that the receiving terminal plays the audio/video data according to the updated display timestamps.
Further, in step S130, the display timestamp of a non-first frame in each channel of audio/video is obtained by adding the original display timestamp of the current non-first frame to the time at which the channel's first frame arrived at the server, and then subtracting the original display timestamp of the channel's first frame.
Further, the original display timestamps of the multiple channels of audio/video data are generated according to their respective independent clocks.
Further, the step S110 includes: receiving the multiple channels of audio/video data, decoding them, obtaining the time at which the first frame of each channel arrives at the server, and obtaining the original display timestamp of each frame in each channel.
Furthermore, the multi-channel audio and video synchronization method runs in a streaming media server, and the time at which the first frame of each channel of audio/video reaches the server is the time at which that frame reaches the streaming media server.
Further, the step S140 includes: combining the channels of audio/video data whose display timestamps have been updated into one data stream for output.
Further, the step in which the receiving terminal plays the audio/video data according to the updated display timestamps includes: playing the audio/video data according to each frame's display timestamp and the time base.
In a second aspect, an embodiment of the present invention provides a multi-channel audio and video synchronization apparatus, including a decoding module, a first frame synchronization module, a non-first frame synchronization module, and a mixed flow module;
the decoding module is used for receiving the multiple channels of audio/video data, recording the time at which the first frame of each channel arrives at the server, and obtaining the original display timestamp of each frame in each channel;
the first frame synchronization module is used for updating, according to the original display timestamp of the first frame in each channel, that first frame's display timestamp to the time at which it arrived at the server;
the non-first frame synchronization module is used for updating the display timestamps of the non-first frames in each channel according to their original display timestamps, the original display timestamp of the channel's first frame, and the time at which the first frame arrived at the server;
the mixed flow module is used for mixing and outputting the channels of audio/video data whose display timestamps have been updated, so that the receiving terminal plays the audio/video data according to the updated display timestamps.
In a third aspect of the present invention, there is provided an electronic device comprising:
one or more processors;
a storage device having one or more programs stored thereon which,
when executed by the one or more processors, cause the one or more processors to implement any of the methods described above.
In a fourth aspect of the invention, a computer-readable medium is provided, on which a computer program is stored, wherein the program, when executed by a processor, implements any of the methods described above.
The multi-channel audio and video synchronization method of the invention can synchronize multiple audio/video channels in real time and tolerates clock differences between the clients and the server. For each stream, the server achieves synchronization with only one decode and one encode, consuming no additional server resources. During stream pushing, it effectively solves the picture/audio desynchronization in the mixed output caused by network jitter and delayed arrival of data packets at the server.
Drawings
The features and advantages of the present invention will be more clearly understood by reference to the accompanying drawings, which are illustrative and not to be construed as limiting the invention in any way, and in which:
fig. 1 is a schematic diagram of a system architecture in which a multi-channel audio and video synchronization method and apparatus according to some embodiments of the present invention operate;
fig. 2 is a schematic flow diagram of a multi-channel audio and video synchronization method in some embodiments of the invention;
fig. 3 is a schematic diagram of multiple audio and video streams in a multiple audio and video synchronization method according to some embodiments of the present invention;
fig. 4 is a schematic flow chart of a multi-channel audio and video synchronization method according to another embodiment of the present invention;
fig. 5 is a system schematic diagram of a multi-channel audio and video synchronization device implemented based on the multi-channel audio and video synchronization method in the above-mentioned drawings in some embodiments of the present invention;
fig. 6 is a schematic structural diagram of a computer system on which the multi-channel audio and video synchronization method or apparatus according to some embodiments of the present invention runs.
Detailed Description
In order that the above objects, features and advantages of the present invention can be more clearly understood, a more particular description of the invention will be rendered by reference to the appended drawings. It should be noted that the embodiments and features of the embodiments of the present application may be combined with each other without conflict.
In the following description, numerous specific details are set forth in order to provide a thorough understanding of the present invention, however, the present invention may be practiced in other ways than those specifically described herein, and therefore the scope of the present invention is not limited by the specific embodiments disclosed below.
Fig. 1 illustrates an exemplary system architecture 100 to which embodiments of the multi-channel audio and video synchronization method or apparatus of the present application may be applied.
As shown in fig. 1, the system architecture 100 may include terminal devices 101, 102, 103, a network 104, and a server 105. The network 104 serves as a medium for providing communication links between the terminal devices 101, 102, 103 and the server 105. Network 104 may include various connection types, such as wired, wireless communication links, or fiber optic cables, to name a few.
A user may use terminal devices 101, 102, 103 to interact with a server 105 over a network 104 to receive or transmit data (e.g., video), etc. The terminal devices 101, 102, 103 may have various communication client applications installed thereon, such as video playing software, video processing applications, web browser applications, shopping applications, search applications, instant messaging tools, mailbox clients, social platform software, and the like.
The terminal apparatuses 101, 102, and 103 may be hardware or software. When the terminal devices 101, 102, 103 are hardware, they may be various electronic devices having a display screen and supporting data transmission, including but not limited to smart phones, tablet computers, laptop portable computers, desktop computers, and the like. When the terminal apparatuses 101, 102, 103 are software, they can be installed in the electronic apparatuses listed above. It may be implemented as multiple pieces of software or software modules (e.g., software or software modules used to provide distributed services) or as a single piece of software or software module. And is not particularly limited herein.
The server 105 may be a server providing various services, such as a background server supporting the videos displayed on the terminal devices 101, 102, 103. The background server may analyze and otherwise process received data, such as an audio/video processing request, and feed back the processing result (for example, mixed-stream audio/video data obtained by mixing the audio and video) to an electronic device (for example, a terminal device) communicatively connected to it.
It should be noted that the multi-channel audio and video synchronization method provided in the embodiment of the present application may be executed by the server 105, and accordingly, the multi-channel audio and video synchronization apparatus may be disposed in the server 105. In addition, the multi-channel audio and video synchronization method provided by the embodiment of the present application may also be executed by the terminal devices 101, 102, and 103, and accordingly, the multi-channel audio and video synchronization apparatus may also be disposed in the terminal devices 101, 102, and 103.
The server may be hardware or software. When the server is hardware, it may be implemented as a distributed server cluster formed by multiple servers, or may be implemented as a single server. When the server is software, it may be implemented as multiple pieces of software or software modules (e.g., software or software modules used to provide distributed services), or as a single piece of software or software module. And is not particularly limited herein.
It should be understood that the numbers of terminal devices, networks, and servers in fig. 1 are merely illustrative; there may be any number of each, as required by the implementation. When the electronic device on which the multi-channel audio and video synchronization method runs does not need to exchange data with other electronic devices, the system architecture may include only that electronic device (such as the terminal device 101, 102, 103 or the server 105).
Fig. 2 shows a general flow of a multi-channel audio-video synchronization algorithm according to an embodiment of the present invention, and fig. 3 is a schematic diagram of a multi-channel real-time audio-video time and frame relationship.
The invention will now be described in further detail with reference to the accompanying drawings. Before the specific implementation, some basic concepts are defined:
(1) Server time: the time of the streaming media processing server, used as the standard against which the multiple audio/video streams are synchronized;
(2) PTS (Presentation Time Stamp): a display timestamp telling the player at what time to display the data of a frame;
(3) Mixing: the streaming media processing server combines multiple audio/video media streams into one output stream (or a small number of output streams).
The flow for realizing multi-channel real-time audio and video synchronization is shown in fig. 2. First, the streaming media processing server receives and decodes each original audio/video stream and records the pts of each frame. For the first frame of a stream, it records the current server time T1, saves the frame's pts as firstpts, and modifies the first frame's pts to T1. (For audio and video data, the display timestamp is generated when the data itself is generated.)
Each non-first frame has its pts modified by the following formula:
pts = T1 + (pts - firstpts)
Here pts - firstpts is the offset of the current frame from the first frame, from which the server computes a timestamp expressed in server time. This guarantees that, for every audio/video stream, the pts of each frame increases monotonically with the server time as the common reference.
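As a minimal sketch of the rebasing rule above (illustrative Python, not part of the patent; the class and variable names are assumptions), each stream keeps its own T1 and firstpts and rewrites every incoming pts into server time:

```python
# Illustrative sketch of the pts rebasing rule: first frame -> T1,
# non-first frame -> T1 + (pts - firstpts). All names are hypothetical.

class StreamRebaser:
    def __init__(self):
        self.t1 = None         # server time when this stream's first frame arrived
        self.first_pts = None  # original pts of this stream's first frame

    def rebase(self, pts, server_now):
        if self.first_pts is None:
            # First frame of the stream: remember firstpts, set pts := T1.
            self.first_pts = pts
            self.t1 = server_now
            return self.t1
        # Non-first frame: keep its offset from the first frame,
        # re-expressed relative to server time.
        return self.t1 + (pts - self.first_pts)
```

For example, a stream whose first frame carries pts 5000 and reaches the server at time 90000 gets first-frame pts 90000; a later frame with original pts 8600 becomes 90000 + (8600 - 5000) = 93600.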
As shown in fig. 3, Stream2 and Stream3 start pushing to the media processing server later than Stream1, yet the pts of all three streams are synchronized. Suppose stream1 comes from terminal C1: it produces its first frame at terminal time tc1, and that frame reaches the server at time T1. Stream1 generates its second packet at terminal time tc2, so the second packet is assigned pts T1 + (tc2 - tc1), which converts terminal time into server time. Stream2 and stream3 complete the same conversion, so the pts of all three streams are based on server time, and mixing yields three synchronized streams.
In the embodiment of the present invention, the player plays according to a time base (time_base). If 1 second is divided into 90000 equal parts, each tick is 1/90000 second, and time_base is {1, 90000}. The value of pts is a count of time_base ticks; it is neither the server time nor the terminal time. When the player decodes and plays, it records the first pts value (pts_00); the display time of any later picture, in seconds, is then (pts_now - pts_00)/90000, where pts_now is the timestamp of the currently played frame. In the embodiment of the invention, the time bases of all terminals in the system can be required to be identical.
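The pts-to-seconds conversion described above can be sketched as follows (illustrative Python; the function name is an assumption, and the {1, 90000} time base is taken from the example rather than mandated by the patent):

```python
# Convert a pts difference into seconds using the player's time base.
# With time_base = {1, 90000}, each pts tick is 1/90000 of a second.

TIME_BASE = (1, 90000)  # (numerator, denominator)

def display_time_seconds(pts_now, pts_00, time_base=TIME_BASE):
    """Seconds after the first frame (pts_00) at which the current frame
    (pts_now) should be displayed: (pts_now - pts_00) * num / den."""
    num, den = time_base
    return (pts_now - pts_00) * num / den
```

For instance, a frame whose pts lies 180000 ticks after the first frame is displayed 2 seconds later.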
As also shown in fig. 4, some embodiments of the present invention provide a multi-channel audio and video synchronization method, including the following steps:
S110, receiving multiple channels of audio/video data, recording the time at which the first frame of each channel of audio/video arrives at the server, and obtaining the original display timestamp of each frame in each channel of audio/video;
S120, according to the original display timestamp of the first frame in each channel of audio/video, updating the display timestamp of that first frame to the time at which it arrived at the server;
S130, updating the display timestamps of the non-first frames in each channel of audio/video according to the original display timestamps of those frames, the original display timestamp of the channel's first frame, and the time at which the first frame arrived at the server;
and S140, mixing and outputting the channels of audio/video data whose display timestamps have been updated, so that the receiving terminal plays the audio/video data according to the updated display timestamps.
In the embodiment of the invention, the display timestamps of all frames in every channel of audio/video data are uniformly defined in terms of server time, so the timestamps increase monotonically; after the channels are mixed, the output is unaffected by network jitter, and pictures and audio play in sync.
The display timestamp handling in the embodiment of the invention is highly real-time. For a given audio/video channel, once the first frame is obtained, the time at which it reached the server and its original display timestamp are recorded; for every subsequent frame, obtaining its original display timestamp suffices to compute the updated timestamp, which increases with the first frame's server-arrival time as the base while the time differences between frames are preserved. The processing may be single-threaded: each channel carries an identifier, and the channel's first-frame arrival time and original display timestamp are stored in memory under that identifier; whenever a new video frame arrives, its identifier is read and used to look up the corresponding first-frame arrival time and original display timestamp, from which the updated display timestamp is computed. Alternatively, in a multithreaded mode, one thread may be started per channel, suspended while no frame of that channel is pending and resumed when one arrives.
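The single-threaded, identifier-keyed bookkeeping described above might look like the following sketch (illustrative Python; the dictionary layout and function names are assumptions), which also orders the rebased frames for mixed output:

```python
# Per-stream state keyed by the stream identifier: the first-frame server
# arrival time (T1) and original first-frame pts (firstpts) are stored once
# and looked up for every later frame of the same stream.

state = {}  # stream_id -> (t1, first_pts)

def rebase_frame(stream_id, pts, server_now):
    if stream_id not in state:
        state[stream_id] = (server_now, pts)  # record T1 and firstpts
        return server_now                     # first frame: pts := T1
    t1, first_pts = state[stream_id]
    return t1 + (pts - first_pts)             # non-first frame

def mix(frames):
    """frames: (stream_id, original_pts, server_arrival_time) tuples in
    arrival order. Returns (stream_id, updated_pts) pairs ordered by the
    updated pts, ready for mixed output on a single server-time axis."""
    rebased = [(sid, rebase_frame(sid, pts, arrival)) for sid, pts, arrival in frames]
    return sorted(rebased, key=lambda f: f[1])
```

Frames from streams that started at different terminal times thus interleave on one monotonically increasing server-time axis, matching the mixed-stream behavior of fig. 3.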
It should be noted that, in the embodiment of the present invention, mixed audio/video data can be output as soon as its timestamps have been updated; it is not necessary to wait until every channel has been fully processed. Instead, processed data can be output directly in a real-time or near-real-time manner. Provided the server's processing capacity and the network's transmission capacity are sufficient, the audio/video playback at the receiving terminal remains continuous.
Further, in step S130, the display timestamp of a non-first frame in each channel of audio/video is obtained by adding the original display timestamp of the current non-first frame to the time at which the channel's first frame arrived at the server, and then subtracting the original display timestamp of the channel's first frame.
Further, the original display timestamps of the multiple channels of audio/video data are generated according to their respective independent clocks.
Further, the step S110 includes: receiving the multiple channels of audio/video data, decoding them, obtaining the time at which the first frame of each channel arrives at the server, and obtaining the original display timestamp of each frame in each channel.
Furthermore, the multi-channel audio and video synchronization method runs in a streaming media server, and the time at which the first frame of each channel of audio/video reaches the server is the time at which that frame reaches the streaming media server.
Further, the step S140 includes: combining the channels of audio/video data whose display timestamps have been updated into one data stream for output. The combined data stream in the embodiment of the invention can be output in real time: each frame of data can be processed and forwarded as soon as it is acquired, without waiting. The data of all channels are merged into one output path, while each frame retains the identifier of its original channel; after receiving a data frame, the receiving terminal uses the identifier to render it in the corresponding display area.
Further, the step in which the receiving terminal plays the audio/video data according to the updated display timestamps includes: playing the audio/video data according to each frame's display timestamp and the time base.
The method for synchronizing multiple channels of real-time audio and video in the embodiment of the invention obtains the time at which the first frame of each stream reaches the server, obtains the pts of each stream's first frame, obtains the pts of each stream's non-first frames and rewrites them into server time, and then encodes and mixes the multiple streams into one output. It synchronizes the multiple channels in real time and tolerates clock differences between clients and the server; for each stream, the server achieves synchronization with only one decode and one encode, consuming no additional server resources; and during stream pushing it effectively solves the picture/audio desynchronization in the mixed output caused by network jitter and delayed packet arrival at the server.
As shown in fig. 5, according to the foregoing method embodiment, an embodiment of the present invention further provides a multi-channel audio/video synchronization apparatus 100, which includes a decoding module 110, a first frame synchronization module 120, a non-first frame synchronization module 130, and a mixed flow module 140;
the decoding module 110 is configured to receive multiple paths of audio/video data, record time when a first frame of each path of audio/video reaches a server, and obtain an original display timestamp of each frame in each path of audio/video;
the first frame synchronization module 120 is configured to update the display timestamp of the first frame in each path of audio and video as the time when the first frame arrives at the server according to the original display timestamp of the first frame in each path of audio and video;
the non-first frame synchronization module 130 is configured to update the display time stamp of the non-first frame in each path of audio and video according to the original display time stamp of the non-first frame in each path of audio and video, the original display time stamp of the first frame in each path of audio and video, and the time when the first frame arrives at the server;
the mixed flow module 140 is configured to mix and output each path of audio/video data with the updated display timestamp, so that the receiving terminal plays the audio/video data according to the updated display timestamp.
The specific execution steps of the above modules are described in detail in the corresponding steps of the multi-path audio and video synchronization method, and are not described in detail herein.
Referring now to FIG. 6, a block diagram of a computer system 800 suitable for use in implementing the control device of an embodiment of the present application is shown. The control device shown in fig. 6 is only an example, and should not bring any limitation to the functions and the scope of use of the embodiments of the present application.
As shown in fig. 6, the computer system 800 includes a Central Processing Unit (CPU)801 that can perform various appropriate actions and processes in accordance with a program stored in a Read Only Memory (ROM)802 or a program loaded from a storage section 808 into a Random Access Memory (RAM) 803. In the RAM 803, various programs and data necessary for the operation of the system 800 are also stored. The CPU 801, ROM 802, and RAM 803 are connected to each other via a bus 804. An input/output (I/O) interface 805 is also connected to bus 804.
The following components are connected to the I/O interface 805: an input portion 806 including a keyboard, a mouse, and the like; an output portion 807 including a display such as a cathode ray tube (CRT) or liquid crystal display (LCD), and a speaker; a storage portion 808 including a hard disk and the like; and a communication portion 809 including a network interface card such as a LAN card or a modem. The communication portion 809 performs communication processing via a network such as the Internet. A drive 810 is also connected to the I/O interface 805 as necessary. A removable medium 811, such as a magnetic disk, an optical disk, a magneto-optical disk, or a semiconductor memory, is mounted on the drive 810 as necessary, so that a computer program read from it is installed into the storage portion 808 as needed.
In particular, according to an embodiment of the present disclosure, the processes described above with reference to the flowcharts may be implemented as computer software programs. For example, embodiments of the present disclosure include a computer program product comprising a computer program embodied on a computer readable medium, the computer program comprising program code for performing the method illustrated in the flow chart. In such an embodiment, the computer program can be downloaded and installed from a network through the communication section 809 and/or installed from the removable medium 811. The computer program performs the above-described functions defined in the method of the present application when executed by the Central Processing Unit (CPU) 801.
It should be noted that the computer readable medium described herein can be a computer readable signal medium or a computer readable storage medium or any combination of the two. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples of the computer readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the present application, a computer readable storage medium may be any tangible medium that can contain or store a program for use by or in connection with an instruction execution system, apparatus, or device. A computer readable signal medium, in contrast, may include a propagated data signal with computer readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electromagnetic, optical, or any suitable combination thereof. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: wireless, wire, fiber optic cable, RF, and the like, or any suitable combination of the foregoing.
Computer program code for carrying out operations of the present application may be written in any combination of one or more programming languages, including object-oriented programming languages such as Python, Java, Smalltalk, and C++, as well as conventional procedural programming languages such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on a remote computer or server. In the latter case, the remote computer may be connected to the user's computer through any type of network, including a local area network (LAN) or a wide area network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet service provider).
The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present application. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The units described in the embodiments of the present application may be implemented by software or hardware. The described units may also be provided in a processor and may be described as: a processor including an acquisition unit, a segmentation unit, a determination unit, and a selection unit. The names of these units do not in some cases constitute a limitation on the units themselves; for example, the acquisition unit may also be described as "a unit that acquires a picture to be processed".
As another aspect, the present application also provides a computer-readable medium, which may be contained in the electronic device described in the above embodiments, or may exist separately without being assembled into the electronic device. The computer-readable medium carries one or more programs which, when executed by the electronic device, cause the electronic device to: receive multi-channel audio and video data, record the time at which the first frame of each path of audio and video reaches the server, and acquire the original display timestamp of each frame in each path of audio and video; update the display time stamp of the first frame in each path of audio and video to the time at which that first frame reached the server; update the display time stamp of each non-first frame in each path of audio and video according to the original display time stamp of the non-first frame, the original display time stamp of the first frame of that path, and the time at which the first frame reached the server; and mix and output the audio and video data with the updated display time stamps, so that the receiving terminal plays the audio and video data according to the updated display time stamps.
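The timestamp rebasing described above — pin each path's first frame to its server arrival time, then offset every later frame by its original distance from that first frame — can be sketched as follows. The `Stream` class, the `rebase_pts` function, and the millisecond units are illustrative assumptions, not part of the patent.

```python
import time
from dataclasses import dataclass


@dataclass
class Stream:
    """Per-path state: server arrival time and original PTS of the first frame."""
    first_arrival_ms: int = -1
    first_orig_pts: int = -1


def rebase_pts(stream: Stream, orig_pts: int, now_ms=None) -> int:
    """Rewrite one frame's display timestamp onto the server clock.

    First frame of a path:  new PTS = arrival time at the server.
    Every later frame:      new PTS = arrival + (orig_pts - first_orig_pts),
    so frames within a path keep their original spacing while all paths
    share the server's clock.
    """
    if stream.first_arrival_ms < 0:  # first frame of this path
        stream.first_arrival_ms = int(time.time() * 1000) if now_ms is None else now_ms
        stream.first_orig_pts = orig_pts
        return stream.first_arrival_ms
    return stream.first_arrival_ms + (orig_pts - stream.first_orig_pts)
```

Because every path is rebased onto the same server clock, frames from independently clocked sources that arrive together are stamped together, which is what allows the mixed output to play back in sync.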
The above description is only a preferred embodiment of the application and is illustrative of the principles of the technology employed. It will be appreciated by those skilled in the art that the scope of the invention herein disclosed is not limited to the particular combination of features described above, but also encompasses other arrangements formed by any combination of the above features or their equivalents without departing from the spirit of the invention. For example, the above features may be replaced with (but not limited to) features having similar functions disclosed in the present application.

Claims (10)

1. A multi-channel audio and video synchronization method is characterized by comprising the following steps:
s110, receiving multi-channel audio and video data, recording the time of the first frame of each channel of audio and video to reach a server, and acquiring the original display timestamp of each frame in each channel of audio and video;
s120, updating the display time stamp of the first frame in each path of audio and video to be the time when the first frame reaches the server according to the original display time stamp of the first frame in each path of audio and video;
s130, updating the display time stamp of the non-first frame in each path of audio and video according to the original display time stamp of the non-first frame in each path of audio and video, the original display time stamp of the first frame in each path of audio and video and the arrival time of the first frame at a server;
S140, mixing and outputting the audio and video data with the updated display time stamps, so that the receiving terminal plays the audio and video data according to the updated display time stamps.
2. The multi-channel audio and video synchronization method according to claim 1, wherein in step S130 the updated display time stamp of a non-first frame in each path of audio and video is: the original display time stamp of the current non-first frame, plus the time at which the first frame of that path of audio and video arrived at the server, minus the original display time stamp of the first frame of that path.
3. The multi-channel audio-video synchronization method according to claim 1 or 2, wherein the original display time stamps of the multi-channel audio-video data are formed according to respective independent clocks.
4. The multi-channel audio and video synchronization method according to claim 1 or 2, wherein the step S110 comprises: receiving multi-path audio and video data, decoding the multi-path audio and video data, acquiring the time at which the first frame of each path of audio and video reaches the server, and acquiring the original display timestamp of each frame in each path of audio and video.
5. The multi-channel audio and video synchronization method according to claim 1 or 2, wherein the method runs on a streaming media server, and the time at which the first frame of each path of audio and video reaches the server is the time at which that first frame reaches the streaming media server.
6. The multi-channel audio and video synchronization method according to claim 1 or 2, wherein the step S140 comprises: synthesizing the paths of audio and video data with the updated display timestamps into one output data stream.
7. The multi-channel audio and video synchronization method according to claim 1 or 2, wherein the receiving terminal playing the audio and video data according to the updated display timestamp comprises: playing the audio and video data according to the display time stamp and the time base of each frame.
8. A multipath audio and video synchronization device is characterized by comprising a decoding module, a first frame synchronization module, a non-first frame synchronization module and a mixed flow module;
the decoding module is used for receiving multi-channel audio and video data, recording the time of the first frame of each channel of audio and video to reach the server, and acquiring the original display timestamp of each frame in each channel of audio and video;
the first frame synchronization module is used for updating the display time stamp of the first frame in each path of audio and video to be the time when the first frame reaches the server according to the original display time stamp of the first frame in each path of audio and video;
the non-first frame synchronization module is used for updating the display time stamp of the non-first frame in each path of audio and video according to the original display time stamp of the non-first frame in each path of audio and video, the original display time stamp of the first frame in each path of audio and video and the arrival time of the first frame at the server;
the mixed flow module is used for mixing and outputting each path of audio and video data with the updated display timestamp so that the receiving terminal can play the audio and video data according to the updated display timestamp.
9. An electronic device, comprising:
one or more processors;
a storage device having one or more programs stored thereon,
wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method of any one of claims 1-7.
10. A computer-readable medium, on which a computer program is stored, wherein the program, when executed by a processor, implements the method of any one of claims 1-7.
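Claim 7 has the receiving terminal schedule each frame from its display time stamp together with a time base. A minimal sketch of that conversion, assuming the 90 kHz time base common in MPEG streams (the time base and the `pts_to_seconds` name are illustrative, not specified by the patent):

```python
from fractions import Fraction


def pts_to_seconds(pts: int, time_base: Fraction) -> float:
    """Convert a display timestamp expressed in time-base units to seconds,
    the value a player compares against its clock when rendering a frame."""
    return float(pts * time_base)


# With a 1/90000 time base, a PTS of 90000 marks the one-second point:
one_second = pts_to_seconds(90000, Fraction(1, 90000))  # 1.0
```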
CN202010692102.XA 2020-07-17 2020-07-17 Multi-channel audio and video synchronization method and device Pending CN111787365A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010692102.XA CN111787365A (en) 2020-07-17 2020-07-17 Multi-channel audio and video synchronization method and device

Publications (1)

Publication Number Publication Date
CN111787365A true CN111787365A (en) 2020-10-16

Family

ID=72763401

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010692102.XA Pending CN111787365A (en) 2020-07-17 2020-07-17 Multi-channel audio and video synchronization method and device

Country Status (1)

Country Link
CN (1) CN111787365A (en)

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112351294A (en) * 2020-10-27 2021-02-09 广州赞赏信息科技有限公司 Method and system for frame synchronization among multiple machine positions of cloud director
CN112492357A (en) * 2020-11-13 2021-03-12 北京安博盛赢教育科技有限责任公司 Method, device, medium and electronic equipment for processing multiple video streams
CN112511768A (en) * 2020-11-27 2021-03-16 上海网达软件股份有限公司 Multi-picture synthesis method, device, equipment and storage medium
CN113766215A (en) * 2021-09-07 2021-12-07 中电科航空电子有限公司 Airborne passenger cabin passenger broadcasting synchronous testing method and system
CN114007108A (en) * 2021-10-28 2022-02-01 广州华多网络科技有限公司 Audio stream mixing control method, device, equipment, medium and product
CN114302169A (en) * 2021-12-24 2022-04-08 威创集团股份有限公司 Picture synchronous recording method, device, system and computer storage medium
CN114339353A (en) * 2021-12-31 2022-04-12 晶晨半导体科技(北京)有限公司 Audio and video synchronization method and device, electronic equipment and computer readable storage medium
CN115334322A (en) * 2022-10-17 2022-11-11 腾讯科技(深圳)有限公司 Video frame synchronization method, terminal, server, electronic device and storage medium

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102595202A (en) * 2011-01-11 2012-07-18 中兴通讯股份有限公司 Method, device and system for synchronizing multiple media streams
CN105430537A (en) * 2015-11-27 2016-03-23 刘军 Method and server for synthesis of multiple paths of data, and music teaching system
CN107071509A (en) * 2017-05-18 2017-08-18 北京大生在线科技有限公司 The live video precise synchronization method of multichannel
CN108156509A (en) * 2017-12-28 2018-06-12 新华三云计算技术有限公司 Video broadcasting method, device and user terminal



Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20201016