WO2016068760A1 - Synchronization of video streams - Google Patents

Synchronization of video streams

Info

Publication number
WO2016068760A1
Authority
WO
WIPO (PCT)
Prior art keywords
video
user device
server system
peer
synchronization server
Prior art date
Application number
PCT/SE2014/051263
Other languages
English (en)
Inventor
Heidi-Maria BACK
Le Wang
Miljenko OPSENICA
Tomas Mecklin
Original Assignee
Telefonaktiebolaget L M Ericsson (Publ)
Priority date
Filing date
Publication date
Application filed by Telefonaktiebolaget L M Ericsson (Publ) filed Critical Telefonaktiebolaget L M Ericsson (Publ)
Priority to PCT/SE2014/051263 priority Critical patent/WO2016068760A1/fr
Publication of WO2016068760A1 publication Critical patent/WO2016068760A1/fr

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/23Processing of content or additional data; Elementary server operations; Server middleware
    • H04N21/236Assembling of a multiplex stream, e.g. transport stream, by combining a video stream with other content or additional data, e.g. inserting a URL [Uniform Resource Locator] into a video stream, multiplexing software data into a video stream; Remultiplexing of multiplex streams; Insertion of stuffing bits into the multiplex stream, e.g. to obtain a constant bit-rate; Assembling of a packetised elementary stream
    • H04N21/2365Multiplexing of several video streams
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/21Server components or server architectures
    • H04N21/218Source of audio or video content, e.g. local disk arrays
    • H04N21/2187Live feed
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/21Server components or server architectures
    • H04N21/222Secondary servers, e.g. proxy server, cable television Head-end
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/25Management operations performed by the server for facilitating the content distribution or administrating data related to end-users or client devices, e.g. end-user or client device authentication, learning user preferences for recommending movies
    • H04N21/266Channel or content management, e.g. generation and management of keys and entitlement messages in a conditional access system, merging a VOD unicast channel into a multicast channel
    • H04N21/2665Gathering content from different sources, e.g. Internet and satellite
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/80Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
    • H04N21/83Generation or processing of protective or descriptive data associated with content; Content structuring
    • H04N21/835Generation of protective data, e.g. certificates
    • H04N21/8358Generation of protective data, e.g. certificates involving watermark

Definitions

  • the present embodiments generally relate to video stream synchronization, and in particular to synchronization of video streams originating from multiple user devices recording a scene.
  • the emerging applications allow users to produce videos collaboratively using multiple mobile cameras, in a manner similar to how professional live TV is produced.
  • the scenario includes three user roles, namely producers, directors and consumers.
  • the producers are users with user devices 1, 2, 3 who collaboratively record and stream video feeds, for example in a stadium, to application servers or a server system 10.
  • a mixed view of video feeds enables the directors to conduct video direction and rich-content assertion.
  • the consumers are thus able to watch a live broadcast of the event from different viewpoints based on the directors' selection, rather than only the few options provided by traditional TV broadcasting.
  • in Fig. 2, illustrating video streams 21, 22, 23 from user devices 1, 2, 3, the marked video frames 31, 32, 33 are taken by the cameras of the user devices 1, 2, 3 at the same time. Due to network delay, the times at which the marked video frames 31, 32, 33 arrive at the server system 10 differ. Thus, one of the most important requirements of social video streaming is adequate synchronization, so that the video streams are aligned with each other.
  • multi-producer video filming thus presents an asynchrony problem that has to be solved.
  • Various techniques for achieving synchronization among video streams have been proposed in the art.
  • clock synchronization is used. Synchronization offsets are calculated using timestamps generated by the cameras' internal clocks on the user devices 1, 2, 3.
  • This solution is one of the most processing-efficient methods.
  • some user devices 1, 2, 3, however, do not have an internal high-resolution clock.
  • clock drift and skew may cause the user devices 1, 2, 3 to fall out of synchronization.
  • the solution requires all the user devices 1, 2, 3 to synchronize with a centralized Network Time Protocol (NTP) server.
  • the transmission delay between each user device 1, 2, 3 and the server system 10 would also vary, especially when the wireless network is highly congested. Hence, this solution is not practicable for achieving video stream synchronization involving multiple user devices 1, 2, 3 in a typical use case.
  • audio fingerprints are extracted from audio streams and compared to find a match among all the audio streams when multiple cameras are recording the same event. By comparing the occurrence of similar sound matches, it may be possible to calculate the synchronization offset.
  • this solution requires all the user devices 1, 2, 3 to be close enough to the event, since the speed of sound is much slower than the speed of light. When watching a sports game in a large stadium, the sound recorded by a user device 1, 2, 3 that is closer to the sound source could be up to one second ahead of the sound recorded by another user device 1, 2, 3. Furthermore, the noise generated by the crowd decreases the accuracy of finding suitable audio fingerprints. This means that audio fingerprinting will generally not be reliable enough to achieve video stream synchronization involving multiple user devices 1, 2, 3.
  • a further solution involves analyzing the incoming video streams, and monitoring the sequence of video frames for the occurrence of at least one of a plurality of different types of visual events.
  • the occurrence of a selected visual event should be detected among all the video streams and taken as a marker to synchronize all video streams.
  • this solution requires all user devices 1, 2, 3 to record at least one common visual event in order to find the marker among all the video streams from each user device 1, 2, 3. If the user devices 1, 2, 3 are focusing on different parts of the event, there is no way for this solution to identify the marker.
  • US 2011/0043691 discloses a method for synchronizing at least two video streams originating from at least two cameras having a common visual field. This solution requires studying trajectories of objects of a scene. It is not adapted for a situation where multiple users are filming at the same time but at different parts of an event.
  • An aspect of the embodiments relates to a video synchronization method comprising, for each user device of multiple user devices, receiving a video stream of encoded video frames over a wireless media channel from the user device.
  • the method also comprises transmitting, to the user device and over a wireless peer-to-peer channel, a timestamp generated based on a current system time.
  • the method further comprises receiving a frame fingerprint and the timestamp from the user device over the wireless peer-to-peer channel.
  • the method additionally comprises determining an estimated capture time of a video frame, used to generate the received frame fingerprint, based on the timestamp and a current system time.
  • the method also comprises decoding the video stream to get decoded video frames.
  • the method further comprises comparing the received frame fingerprint with a respective frame fingerprint generated for the decoded video frames.
  • the method further comprises assigning the estimated capture time to a decoded video frame based on the comparison.
  • the method also comprises time aligning video streams from the multiple user devices based on the assigned estimated capture times.
  • Another aspect of the embodiments relates to a method for enabling video synchronization.
  • the method comprises a user device transmitting a video stream of encoded video frames to a video synchronization server system over a wireless media channel.
  • the method also comprises the user device generating a frame fingerprint of a current video frame upon reception of a timestamp from the video synchronization server system over a wireless peer-to-peer channel.
  • the method further comprises the user device transmitting the frame fingerprint and the timestamp to the video synchronization server system over the wireless peer-to-peer channel.
  • a further aspect of the embodiments relates to a video synchronization server system.
  • the video synchronization server system is configured to receive a video stream of encoded video frames over a wireless media channel from each user device of multiple user devices.
  • the video synchronization server system is also configured to transmit, to each user device of the multiple user devices and over a wireless peer-to-peer channel, a timestamp generated based on a current system time.
  • the video synchronization server system is further configured to receive a frame fingerprint and the timestamp from each user device of the multiple user devices over the wireless peer-to-peer channel.
  • the video synchronization server system is additionally configured to determine, for each user device of the multiple user devices, an estimated capture time of a video frame, used to generate the received frame fingerprint, based on the timestamp and a current system time.
  • the video synchronization server system is also configured to decode, for each user device of the multiple user devices, the video stream to get decoded video frames.
  • the video synchronization server system is configured to compare, for each user device of the multiple user devices, the received frame fingerprint with a respective frame fingerprint generated for the decoded video frames.
  • the video synchronization server system is further configured to assign, for each user device of the multiple user devices, the estimated capture time to a decoded video frame based on the comparison.
  • the video synchronization server system is also configured to time align video streams from the multiple user devices based on the assigned estimated capture times.
  • a video synchronization server system comprising a timestamp generator for generating, for each user device of multiple user devices, a timestamp based on a current system time.
  • the timestamp is output for transmission to the user device over a wireless peer-to-peer channel.
  • the video synchronization server system also comprises a time estimator for determining, for each user device of the multiple user devices, an estimated capture time of a video frame, used to generate a frame fingerprint, received from the user device with a timestamp over the wireless peer-to-peer channel, based on the timestamp and a current system time.
  • the video synchronization server system further comprises a decoder for decoding, for each user device of the multiple user devices, a video stream of encoded video frames, received from the user device over a wireless media channel, to get decoded video frames.
  • the video synchronization server system additionally comprises a comparator for comparing, for each user device of the multiple user devices, the received frame fingerprint with a respective frame fingerprint generated for the decoded video frames.
  • the video synchronization server system further comprises an assigning unit for assigning, for each user device of the multiple user devices, the estimated capture time to a decoded video frame based on the comparison.
  • the video synchronization server system additionally comprises a time aligner for time aligning video streams from the multiple user devices based on the assigned estimated capture times.
  • a further aspect of the embodiments relates to a user device that is configured to transmit a video stream of encoded video frames to a video synchronization server system over a wireless media channel.
  • the user device is also configured to generate a frame fingerprint of a current video frame upon reception of a timestamp from the video synchronization server system over a wireless peer-to-peer channel.
  • the user device is further configured to transmit the frame fingerprint and the timestamp to the video synchronization server system over the wireless peer-to-peer channel.
  • a user device comprising an encoder for generating a video stream of encoded video frames for transmission to a video synchronization server system over a wireless media channel.
  • the user device also comprises a fingerprint generator for generating a frame fingerprint of a current video frame upon reception of a timestamp from the video synchronization server system over a wireless peer-to-peer channel.
  • the user device further comprises an associating unit for associating the timestamp with the frame fingerprint for transmission to the video synchronization server system over the wireless peer-to-peer channel.
  • a further aspect of the embodiments relates to a computer program comprising instructions, which when executed by a processor, cause the processor to generate, for each user device of multiple user devices, a timestamp based on a current system time.
  • the timestamp is output for transmission to the user device over a wireless peer-to-peer channel.
  • the processor is also caused to determine, for each user device of the multiple user devices, an estimated capture time of a video frame, used to generate a frame fingerprint, received from the user device with the timestamp over the wireless peer-to-peer channel, based on the timestamp and a current system time.
  • the processor is further caused to decode, for each user device of the multiple user devices, a video stream of encoded video frames, received from the user device over a wireless media channel, to get decoded video frames.
  • the processor is additionally caused to compare, for each user device of the multiple user devices, the received frame fingerprint with a respective frame fingerprint generated for the decoded video frames and to assign, for each user device of the multiple user devices, the estimated capture time to a decoded video frame based on the comparison.
  • the processor is further caused to time align video streams from the multiple user devices based on the assigned estimated capture times.
  • Another aspect of the embodiments relates to a computer program comprising instructions, which when executed by a processor, cause the processor to generate a video stream of encoded video frames for transmission to a video synchronization server system over a wireless media channel.
  • the processor is also caused to generate a frame fingerprint of a current video frame upon reception of a timestamp from the video synchronization server system over a wireless peer-to-peer channel.
  • the processor is further caused to associate the timestamp with the frame fingerprint for transmission to the video synchronization server system over the wireless peer-to-peer channel.
  • a related aspect of the embodiments defines a carrier comprising a computer program as defined above.
  • the carrier is one of an electronic signal, an optical signal, an electromagnetic signal, a magnetic signal, an electric signal, a radio signal, a microwave signal, or a computer-readable storage medium.
  • the present embodiments address the problem that video frames originating from different user devices recording a scene are out of synchronization, for instance in social media environments.
  • the embodiments achieve a reliable and implementation-friendly, i.e. low-complexity, solution for synchronizing video streams from multiple user devices.
  • the solution does not require installation of any proprietary applications on the user devices and is applicable to all kinds of social events.
  • Fig. 1 illustrates social video streaming of a sports event;
  • Fig. 2 schematically illustrates lack of synchronization of video streams sent from multiple user devices;
  • Fig. 3 is a flow chart illustrating a method for enabling video synchronization according to an embodiment;
  • Fig. 4 is a flow chart illustrating additional, optional steps of the method illustrated in Fig. 3;
  • Fig. 5 is a flow chart illustrating an embodiment of the fingerprint generating step illustrated in Fig. 3;
  • Fig. 6 is a flow chart illustrating an additional, optional step of the method illustrated in Fig. 4;
  • Fig. 7 is a flow chart illustrating a video synchronization method according to an embodiment;
  • Fig. 8 is a flow chart illustrating an embodiment of the time determining step illustrated in Fig. 7;
  • Fig. 9 is a flow chart illustrating additional, optional steps of the method illustrated in Fig. 7;
  • Fig. 10 is a flow chart illustrating an embodiment of the comparing step illustrated in Fig. 7;
  • Fig. 11 schematically illustrates a system comprising a user device and a video synchronization server system, and the operation flow for achieving synchronization of video streams according to an embodiment;
  • Fig. 12 schematically illustrates a block diagram of a user device according to an embodiment;
  • Fig. 13 schematically illustrates a block diagram of a user device according to another embodiment;
  • Fig. 14 schematically illustrates a block diagram of a user device according to a further embodiment;
  • Fig. 15 schematically illustrates a block diagram of a video synchronization server system according to an embodiment;
  • Fig. 16 schematically illustrates a block diagram of a video synchronization server system according to another embodiment;
  • Fig. 17 schematically illustrates a block diagram of a video synchronization server system according to a further embodiment;
  • Fig. 18 schematically illustrates a computer program implementation according to an embodiment.
  • the present embodiments generally relate to video stream synchronization, and in particular to synchronization of video streams originating from multiple user devices recording a scene.
  • the embodiments thereby enable video frame synchronization for video streaming of multiple user devices, for instance, in connection with a social event, such as a game or concert.
  • a video frame is used to denote a picture or image of a video stream.
  • a video frame could alternatively be denoted a (video) picture or (video) image in the art.
  • a video frame is encoded according to a video coding standard to get an encoded video frame, such as an intra-coded (I) frame or picture, or an inter-coded (P or B) frame or picture.
  • Fig. 7 is a flow chart illustrating a video synchronization method according to an embodiment.
  • the steps S40 to S46 as shown in the figure are performed for each user device of multiple user devices.
  • Step S40 comprises receiving a video stream of encoded video frames over a wireless media channel from the user device.
  • a next step S41 comprises transmitting, to the user device and over a wireless peer-to-peer (P2P) channel, a timestamp generated based on a current system time.
  • a frame fingerprint and the timestamp are received from the user device over the wireless peer-to-peer channel in step S42.
  • a next step S43 comprises determining an estimated capture time of a video frame, used to generate the received frame fingerprint, based on the timestamp and a current system time.
  • the method also comprises decoding the video stream in step S44 to get decoded video frames.
  • the received frame fingerprint is compared with a respective frame fingerprint generated for the decoded video frames in step S45.
  • Step S46 then comprises assigning the estimated capture time to a decoded video frame based on the comparison.
  • the video synchronization method further comprises time aligning video streams from the multiple user devices in step S47 based on the assigned estimated capture times.
  • synchronization of video streams originating from different user devices is achieved by generating and transmitting timestamps to the user devices on wireless peer-to-peer channels running in parallel to the wireless media channels used by the user devices to transmit video streams of encoded video frames.
  • the timestamps trigger the user devices to generate a frame fingerprint of a current video frame and return the frame fingerprint with the associated timestamp on the peer-to-peer channel.
  • the timestamp enables estimation of the capture time of the video frame at the user device.
  • the decoded video frames obtained by decoding the video stream are then used to generate frame fingerprints that are compared to the frame fingerprint received over the peer-to-peer channel.
  • the estimated capture time can be assigned to the decoded video frame used to generate the matching frame fingerprint.
  • a correct time expressed in the system time is obtained for the position within the video stream corresponding to this decoded video frame.
  • the video streams can be time aligned so that video frames recorded at the same time at the different user devices will be time aligned.
  • the embodiments thereby achieve synchronization of video streams from multiple sources by using a feedback channel, i.e. the peer-to-peer channel, to send timestamps and frame fingerprints from the user devices for calculating video frame arrival time of each video stream.
  • the peer-to-peer channel can thereby be used to send timestamps, which are embedded along with frame fingerprints of sampled pictures or frames of recorded video into a packet on the user device.
  • the timestamp is used to calculate when the sampled picture or frame was taken.
  • by identifying the fingerprinted frame in the received video stream, it is possible to determine when the video frame was taken in local time, i.e. system time.
  • all the sampled video frames can be time stamped in the system time when they arrive on the wireless media channel. These video frames can then be used as pointers to align all the video streams.
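  • as a rough code sketch of this idea (all names and types below are illustrative assumptions, as is the constant frame rate), a single decoded video frame with a known capture time per stream is enough to place every frame of every stream on a common system-time axis:

```typescript
// Hypothetical representation of a stream anchored by one matched frame
// (steps S43/S46): frame index and estimated capture time in system time.
interface AnchoredStream {
  streamId: string;
  anchorFrameIndex: number;    // decoded frame that matched the fingerprint
  anchorCaptureTimeMs: number; // its estimated capture time, in system time
  frameIntervalMs: number;     // e.g. 1000 / 30 for a 30 fps stream
}

// Capture time of any frame of a stream, derived from its single anchor.
function frameTimeMs(s: AnchoredStream, frameIndex: number): number {
  return s.anchorCaptureTimeMs + (frameIndex - s.anchorFrameIndex) * s.frameIntervalMs;
}

// For a target system time, find in each stream the frame captured closest
// to that time, i.e. the frames to present together when time aligning.
function alignedFrameIndices(streams: AnchoredStream[], targetTimeMs: number): Map<string, number> {
  const indices = new Map<string, number>();
  for (const s of streams) {
    indices.set(
      s.streamId,
      Math.round(s.anchorFrameIndex + (targetTimeMs - s.anchorCaptureTimeMs) / s.frameIntervalMs),
    );
  }
  return indices;
}
```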
  • each user device transmits a video stream, or bitstream, of encoded video frames over a respective wireless media channel.
  • as the camera of, or connected to, the user device records the scene, the user device encodes the frames or pictures from the camera and transmits them over the wireless media channel to a video synchronization server system.
  • Each user device has, in addition, a wireless peer-to-peer channel established with the video synchronization server system to receive timestamps and to return the timestamps together with generated frame fingerprints.
  • the synchronization server system can determine an estimated capture time of the video frame used to generate the frame fingerprint based on the timestamp received together with the frame fingerprint and the current system time when the frame fingerprint and the timestamp were received at the video synchronization server system.
  • the video synchronization server system decodes the encoded video frames of the video streams and generates respective frame fingerprints for the decoded video frames.
  • the video synchronization server system thereby performs the same, or at least a similar, fingerprinting procedure on the decoded video frames as the user device performed on the current video frame when it received the timestamp from the video synchronization server system.
  • the received frame fingerprint is then compared to the generated frame fingerprints in order to identify a generated frame fingerprint that matches, i.e. is sufficiently equal to, the received frame fingerprint.
  • the video synchronization server system will then find, based on the comparison, the position within the video stream at which the user device received the timestamp sent over the wireless peer-to-peer channel.
  • the estimated capture time, determined in system time based on the timestamp and the arrival time of the frame fingerprint and timestamp at the video synchronization server system, can thereby be assigned to one of the decoded video frames.
  • as a result, the video synchronization server system knows the correct time of at least one decoded video frame in the video stream.
  • the video synchronization server system can use the determined estimated capture times to correctly time align decoded frames between the video streams in order to achieve video synchronization. As a consequence, video frames captured at the same time at the different user devices become time aligned at the video synchronization server system.
  • System time as used herein represents and denotes the time as recorded by the video synchronization server system, typically using an internal clock of the video synchronization server system. However, it could be possible to use another time reference than an internal clock as long as one and the same time reference is used by the video synchronization server system to generate timestamps and record current times of arrival of video frame and of the timestamp and the frame fingerprint.
  • Fig. 8 is a flow chart illustrating a particular embodiment of determining estimated capture time in Fig. 7.
  • the method continues from step S42 of Fig. 7.
  • the following step S50 comprises estimating a one-way transmission delay based on the timestamp and a reception time, in the system time, of the received frame fingerprint and the timestamp.
  • a next step S51 comprises calculating the estimated capture time based on the one-way transmission delay and a reception time, in the system time, of a video frame used to generate the frame fingerprint.
  • the method then continues to step S44 of Fig. 7.
  • the timestamp and the reception time, i.e. the time at which the frame fingerprint and the timestamp were received on the wireless peer-to-peer channel, expressed in the system time, are used to estimate a one-way transmission delay.
  • This one-way transmission delay represents the transmission delay from transmission of a data packet from the user device until the data packet is received at the video synchronization server system.
  • the one-way transmission delay can be estimated by comparing the received timestamp with the current system time at which the timestamp and the frame fingerprint were received on the wireless peer-to-peer channel.
  • the one-way delay could be estimated as disclosed in [1].
  • each video frame received on the wireless media channel (or, more correctly, each data packet carrying the video frames) is timestamped with the current reception time in the system time reference.
  • the relevant reception time used in the calculation of step S51 is the reception time of the video frame that generates a frame fingerprint that matches the received frame fingerprint obtained from the user device over the wireless peer-to-peer channel. Hence, this relevant video frame was the one that resulted in the best match in the comparison of step S45.
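  • a minimal sketch of steps S50 and S51 (the halving of the round-trip time assumes symmetric up- and downlink delays, which the embodiments do not mandate; [1] describes an alternative estimate, and all names here are illustrative):

```typescript
// All times are in milliseconds of the server's system time.
function estimateOneWayDelayMs(timestampMs: number, fingerprintReceptionMs: number): number {
  // The timestamp travelled server -> device and the fingerprint device -> server,
  // so the elapsed time covers roughly one round trip (fingerprint generation
  // time on the device is assumed negligible); halve it for the one-way delay.
  return (fingerprintReceptionMs - timestampMs) / 2;
}

function estimateCaptureTimeMs(oneWayDelayMs: number, frameReceptionMs: number): number {
  // Step S51: the matching video frame arrived on the media channel at
  // frameReceptionMs and was captured roughly one one-way delay earlier.
  return frameReceptionMs - oneWayDelayMs;
}
```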
  • step S41 of Fig. 7 comprises periodically transmitting, to the user device and over the wireless peer-to-peer channel, a timestamp generated based on a current system time.
  • timestamps are periodically generated for the user devices and transmitted thereto on the wireless peer-to-peer channels.
  • a same periodicity could be used for all user devices.
  • the timestamp representing the current system time is generated and sent to all user devices over the respective wireless peer-to-peer channels.
  • the periodicity is configurable by the video synchronization server system.
  • the generation of timestamps could be performed on request of the video synchronization server system, such as when an increase in the accuracy of estimating one-way transmission delays is needed.
  • the embodiments are, however, not limited to periodic transmission of timestamps.
  • the periodicity of transmitting timestamps, or the occasions at which timestamp transmissions are scheduled, could be individually determined for the user devices and hence be different for different user devices.
  • the scheduling of timestamp transmissions could be adapted to the particular network conditions experienced by the individual user devices. In such a case, the scheduling could be adapted to a trend in the one-way transmission delays estimated for the user devices.
  • Fig. 9 is a flow chart illustrating an embodiment of such adaptation in scheduling transmissions of timestamps. The method continues from step S47 in Fig. 7.
  • a next step S60 comprises storing the one-way transmission delay estimated for the current user device. Any trend in the one-way transmission delay is then determined in step S61 and used in step S62 to schedule transmission of timestamps. The method then continues to step S40 of Fig. 7, where the scheduling is used to determine when timestamps are to be sent to the user device.
  • information on estimated one-way transmission delays is stored at the video synchronization server system.
  • the video synchronization server system will, over time, have access to multiple one-way transmission delays estimated for a given user device. It is then possible to determine any trend in the one-way transmission delay, such as fairly constant one-way transmission delay, increasing one-way transmission delay, decreasing one-way transmission delay or fluctuating one-way transmission delay.
  • the transmission occasions for timestamps to the particular user device can then be scheduled in step S62 based on the determined trend in one-way transmission delay.
  • the scheduling could, in one approach, simply consist of i) keeping the current periodicity of timestamp transmissions, ii) reducing the time period between transmission occasions, or iii) increasing the time period between transmission occasions. More elaborate scheduling based on the trend determined in step S61, including usage of non-periodic timestamp transmissions, is possible and within the scope of the embodiments, as sketched below.
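  • a minimal sketch of such trend-based scheduling (the window, the thresholds and the period bounds below are invented for illustration):

```typescript
// The trend (step S61) is taken as the average change per sample over a window
// of recent one-way delay estimates stored for one user device (step S60).
function nextTimestampPeriodMs(recentDelaysMs: number[], currentPeriodMs: number): number {
  const n = recentDelaysMs.length;
  if (n < 2) return currentPeriodMs;
  const slopeMsPerSample = (recentDelaysMs[n - 1] - recentDelaysMs[0]) / (n - 1);
  const MIN_PERIOD_MS = 1_000;
  const MAX_PERIOD_MS = 30_000;
  if (Math.abs(slopeMsPerSample) < 1) {
    return currentPeriodMs;                              // i) fairly constant delay: keep periodicity
  }
  if (slopeMsPerSample > 0) {
    return Math.max(MIN_PERIOD_MS, currentPeriodMs / 2); // ii) increasing delay: sample more often
  }
  return Math.min(MAX_PERIOD_MS, currentPeriodMs * 2);   // iii) decreasing delay: sample less often
}
```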
  • Fig. 10 is a flow chart illustrating an embodiment of comparing fingerprints in Fig. 7.
  • the method continues from step S44 in Fig. 7.
  • a next step S70 comprises calculating a respective difference metric between the received frame fingerprint and the respective frame fingerprint generated for the decoded video frames.
  • the following step S71 comprises selecting a decoded video frame that results in a difference metric representing a difference between the received frame fingerprint and a frame fingerprint generated for the decoded video frame that is smaller than a threshold difference.
  • the video synchronization server system generates a respective frame fingerprint for each decoded video frame output from the decoding process in step S44 of Fig. 7. Each such frame fingerprint is then compared to the frame fingerprint received in step S42 of Fig. 7.
  • the video synchronization server system determines whether the difference between the generated frame fingerprint and the received frame fingerprint is smaller than a threshold difference. If the difference is smaller than the threshold difference, the video synchronization server system assumes that the generated frame fingerprint and the received frame fingerprint are the same or, more correctly, that the decoded video frame used to generate the frame fingerprint at the video synchronization server system represents the same picture or frame as the video frame used to generate the received frame fingerprint at the user device.
  • a low value of the difference metric represents a small difference between the frame fingerprints and a high value of the difference metric represents a large difference between the frame fingerprints.
  • difference metrics that can be used include sum of absolute differences (SAD) or sum of squared differences (SSD) between corresponding pixels or sample values in the frame fingerprints.
  • after step S71, the method continues to step S46 of Fig. 7, in which the estimated capture time is assigned to the selected decoded video frame.
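  • a minimal sketch of steps S70 and S71, using the sum of absolute differences as the difference metric (representing fingerprints as numeric vectors and returning the best match below the threshold are illustrative choices):

```typescript
// Fingerprints are represented as numeric vectors of equal length,
// e.g. normalized luminosity histograms.
function sad(a: number[], b: number[]): number {
  let sum = 0;
  for (let i = 0; i < a.length; i++) {
    sum += Math.abs(a[i] - b[i]); // sum of absolute differences
  }
  return sum;
}

// Step S71: select the decoded video frame whose generated fingerprint differs
// from the received fingerprint by less than the threshold; the best such
// match is returned, or -1 if no difference metric is below the threshold.
function selectMatchingFrame(received: number[], generated: number[][], threshold: number): number {
  let bestIndex = -1;
  let bestDiff = threshold;
  for (let i = 0; i < generated.length; i++) {
    const diff = sad(received, generated[i]); // step S70: respective difference metric
    if (diff < bestDiff) {
      bestDiff = diff;
      bestIndex = i;
    }
  }
  return bestIndex;
}
```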
  • step S47 of Fig. 7 comprises time aligning the video streams from the multiple user devices based on the assigned estimated capture times so that video frames in the video streams having a same capture time are time aligned.
  • Fig. 3 is a flow chart illustrating a method for enabling video synchronization.
  • the method is performed by a user device communicating with a video synchronization server system.
  • the method comprises a user device transmitting, in step S1, a video stream of encoded video frames to a video synchronization server system over a wireless media channel.
  • a next step S2 comprises the user device generating a frame fingerprint of a current video frame upon reception of a timestamp from the video synchronization server system over a wireless peer-to-peer channel.
  • the next step S3 comprises the user device transmitting the frame fingerprint and the timestamp to the video synchronization server system over the wireless peer-to-peer channel.
  • Step S1 of Fig. 3 is preferably performed throughout the communication session.
  • Steps S2 and S3 are, however, performed each time the user device receives a timestamp from the video synchronization server system over the wireless peer-to-peer channel, as schematically illustrated by the dashed line in the figure.
  • Fig. 4 is a flow chart illustrating additional, optional steps of the method for enabling video synchronization in Fig. 3.
  • the method starts in step S10, which comprises the user device recording a scene with a camera of or connected to the user device to produce video frames.
  • the user device then generates, in step S11, the video stream by encoding the video frames.
  • the method continues to step S1 of Fig. 3.
  • the user device therefore comprises a camera or is at least connected, wirelessly or by a wired connection, to a camera that is used to record a scene.
  • the video frames output from the camera are encoded to generate the video stream that is transmitted to the video synchronization server system over the media channel in step S1 of Fig. 3.
  • Fig. 5 is a flow chart illustrating an embodiment of the fingerprint generating step of Fig. 3.
  • the method continues from step S1 in Fig. 3.
  • a next step S20 comprises the user device providing a video frame of a current scene recorded by the camera upon reception of the timestamp from the video synchronization server system over the wireless peer-to-peer channel.
  • the user device then generates the frame fingerprint of the video frame in step S21.
  • the method continues to step S3 of Fig. 3, where the generated frame fingerprint is transmitted together with the timestamp to the video synchronization server system over the wireless peer-to-peer channel.
  • the user device fetches or retrieves a current picture or video frame as output from the camera when it receives the timestamp over the wireless peer-to-peer channel.
  • This current picture or video frame thereby represents a snapshot of the current scene at the moment when the user device received the timestamp.
  • This current picture or video frame is then processed to generate a frame fingerprint that is transmitted together with the received timestamp to the video synchronization server system.
  • step S2 of Fig. 3, step S21 of Fig. 5 and step S45 of Fig. 7 can be performed according to various techniques traditionally employed for generating fingerprints of images, pictures and/or video frames.
  • the generation of the frame fingerprint can be performed in various ways. For instance, a luminosity histogram of the original video frame can be used as the frame fingerprint (see the sketch further below).
  • Further alternatives include using a Radon transformation of the video frame to produce a normative mapping of the frame data as the frame fingerprint, taking a Haar wavelet of the video frame, etc.
  • Other examples of generating frame fingerprints include image hashing, such as disclosed in [2-7].
  • the particular technique used to generate frame fingerprints is not essential to the embodiments, and various prior art fingerprinting techniques can be used. It is, though, preferred that the fingerprinting is not too computationally complex, so that battery-driven user devices can generate frame fingerprints.
  • the video synchronization server system preferably generates a respective frame fingerprint for all or at least some of the decoded video frames. Hence, the generation of frame fingerprints should preferably be sufficiently fast to enable generation of such fingerprints in real-time, or at least near real-time, as decoded video frames are output from the video decoder.
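  • as a concrete illustration of the luminosity-histogram option mentioned above (a minimal sketch; the bin count, the Rec. 601 luma weights and the normalization are implementation choices, not taken from the description):

```typescript
// ImageData is the standard browser type holding RGBA pixel data, e.g. as
// obtained by drawing a video frame onto a canvas.
function luminosityHistogram(frame: ImageData, bins = 32): number[] {
  const hist = new Array<number>(bins).fill(0);
  const data = frame.data; // RGBA, 4 bytes per pixel
  for (let i = 0; i < data.length; i += 4) {
    // Rec. 601 luma approximation from the R, G and B samples (0..255).
    const luma = 0.299 * data[i] + 0.587 * data[i + 1] + 0.114 * data[i + 2];
    hist[Math.min(bins - 1, Math.floor((luma / 256) * bins))]++;
  }
  const pixels = frame.width * frame.height;
  return hist.map(count => count / pixels); // normalize so resolution does not matter
}
```

  • the same function can be run on both sides: by the user device on the snapshot taken when the timestamp arrives, and by the video synchronization server system on each decoded video frame.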
  • the methods of the embodiments can advantageously be implemented using Web Real-Time Communication (WebRTC).
  • WebRTC is an application programming interface (API) that supports browser-to-browser applications for, among other things, voice calling, video chat and peer-to-peer file sharing without plugins. WebRTC is thereby a secure and reliable solution for transmitting video and audio streams from user devices to backend servers, such as the video synchronization server system.
  • WebRTC is gaining wide support, for instance from Firefox and Chrome on Android.
  • the present embodiments can thereby be used to achieve live delivery of video streams from WebRTC-capable user devices to the video synchronization server system where video streams from different WebRTC-capable user devices are time aligned and synchronized.
  • Fig. 6 is a flow chart illustrating an additional, optional step of the method when implemented using WebRTC.
  • the method starts in step S30, which comprises the user device initiating a browser-based application service to activate the WebRTC getUserMedia API to access the camera of, or connected to, the user device.
  • the browser-based application service also activates a WebRTC MediaStream API to transmit the video stream to the video synchronization server system over the wireless media channel using Real-time Transport Protocol (RTP).
  • the browser-based application service further activates a WebRTC DataChannel API to establish the wireless peer-to-peer channel with the video synchronization server system.
  • the method then continues to step S10 of Fig. 4.
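  • a browser-side sketch of these three activations (signaling with the video synchronization server system is elided; luminosityHistogram is the hypothetical fingerprint helper sketched above, and all other names are illustrative):

```typescript
// The fingerprint helper sketched earlier; any fingerprinting function works.
declare function luminosityHistogram(frame: ImageData, bins?: number): number[];

async function startProducer(pc: RTCPeerConnection, videoEl: HTMLVideoElement) {
  // getUserMedia: access the camera of, or connected to, the user device.
  const stream = await navigator.mediaDevices.getUserMedia({ video: true });
  videoEl.srcObject = stream;

  // MediaStream: attach the camera track to the peer connection, which
  // streams the encoded video frames over RTP on the wireless media channel.
  stream.getVideoTracks().forEach(track => pc.addTrack(track, stream));

  // DataChannel: the wireless peer-to-peer channel for timestamps and
  // frame fingerprints.
  const channel = pc.createDataChannel('sync');
  channel.onmessage = (event) => {
    const { timestamp } = JSON.parse(event.data);
    // Steps S20/S21: snapshot the current frame and fingerprint it.
    const canvas = document.createElement('canvas');
    canvas.width = videoEl.videoWidth;
    canvas.height = videoEl.videoHeight;
    const ctx = canvas.getContext('2d')!;
    ctx.drawImage(videoEl, 0, 0);
    const fingerprint = luminosityHistogram(ctx.getImageData(0, 0, canvas.width, canvas.height));
    // Step S3: return the fingerprint together with the unchanged timestamp.
    channel.send(JSON.stringify({ timestamp, fingerprint }));
  };
}
```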
  • while WebRTC is a suitable technology for implementing the video synchronization, the embodiments are not limited thereto.
  • Fig. 11 schematically illustrates a system comprising a user device 1 and a video synchronization server system 10 and the operation flow in order to achieve synchronization of video streams according to an embodiment.
  • the user device 1 opens a Web application service that calls the WebRTC getUserMedia API to access the camera 5 of the user device 1 and uses the WebRTC MediaStream API to stream encoded video frames to the video synchronization server system 10 via RTP/RTP Control Protocol (RTCP) over the wireless media channel 40.
  • the Web application service uses WebRTC DataChannel API to create a data channel 45, i.e. the wireless peer-to-peer channel 45, to the video synchronization server system 10.
  • the video synchronization server system 10 generates a timestamp and sends it to the user device 1, such as periodically, over the data channel 45.
  • upon receiving the timestamp, the user device 1 takes a screenshot or snapshot of the current scene 7 and generates a frame fingerprint (denoted image fingerprint in the figure) of the screenshot or snapshot. The user device 1 then sends the frame fingerprint together with the timestamp to the video synchronization server system 10 through the data channel 45.
  • the video synchronization server system 10 retrieves the frame fingerprint and the timestamp. By comparing the timestamp with the current system time, the video synchronization server system 10 is able to accurately estimate the one-way delay. Thus, the time when the screenshot or snapshot was taken can be derived. Meanwhile the video synchronization server system 10 decodes the received video stream. The video synchronization server system 10 produces a frame fingerprint for each decoded video frame. By comparing the produced frame fingerprints with the received frame fingerprint, the video synchronization server system 10 can determine when a video frame producing a frame fingerprint that matches the received frame fingerprint was captured on the user device 1. The video synchronization server system 10 performs this operation for each received video stream. Once the timestamp of each video stream is derived, the video synchronization server system is able to align them with the system time to achieve full video synchronization.
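  • the server side of this flow could look as follows (a minimal sketch per received data-channel packet, reusing the hypothetical helpers from the earlier sketches; the threshold value is arbitrary):

```typescript
// Hypothetical helpers sketched earlier in this description.
declare function estimateOneWayDelayMs(timestampMs: number, receptionMs: number): number;
declare function estimateCaptureTimeMs(oneWayDelayMs: number, frameReceptionMs: number): number;
declare function selectMatchingFrame(received: number[], generated: number[][], threshold: number): number;

interface DecodedFrame {
  fingerprint: number[];   // generated by the server for each decoded frame
  receptionTimeMs: number; // system time of arrival on the media channel 40
  captureTimeMs?: number;  // assigned once the fingerprints match
}

// Handle one fingerprint/timestamp packet received on the data channel 45.
function onFingerprintMessage(
  frames: DecodedFrame[],
  message: { timestamp: number; fingerprint: number[] },
  nowMs: number, // system time at reception of the packet
): void {
  const delayMs = estimateOneWayDelayMs(message.timestamp, nowMs);
  const i = selectMatchingFrame(message.fingerprint, frames.map(f => f.fingerprint), 0.1);
  if (i >= 0) {
    frames[i].captureTimeMs = estimateCaptureTimeMs(delayMs, frames[i].receptionTimeMs);
  }
}
```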
  • the video synchronization server system 10 could be a backend server capable of communicating with user devices 1 over the wireless media channel 40 and the wireless peer-to-peer channel 45, such as using WebRTC communication technology.
  • the video synchronization server system 10 could alternatively be implemented as a group or cluster of multiple, i.e. at least two, backend servers that are interconnected by wired or wireless connections.
  • the multiple backend servers could be locally arranged at the video synchronization service provider or be distributed among multiple locations.
  • cloud-based implementations of the video synchronization server system 10 are possible and within the scope of the embodiments.
  • a further aspect of the embodiments relates to a video synchronization server system.
  • the video synchronization server system is configured to receive a video stream of encoded video frames over a wireless media channel from each user device of multiple user devices.
  • the video synchronization server system is also configured to transmit, to each user device of the multiple user devices and over a wireless peer-to-peer channel, a timestamp generated based on a current system time.
  • the video synchronization server system is further configured to receive a frame fingerprint and the timestamp from each user device of the multiple user devices over the wireless peer-to-peer channel.
  • the video synchronization server system is additionally configured to determine, for each user device of the multiple user devices, an estimated capture time of a video frame, used to generate the received frame fingerprint, based on the timestamp and a current system time.
  • the video synchronization server system is also configured to decode, for each user device of the multiple user devices, the video stream to get decoded video frames.
  • the video synchronization server system is configured to compare, for each user device of the multiple user devices, the received frame fingerprint with a respective frame fingerprint generated for the decoded video frames.
  • the video synchronization server system is further configured to assign, for each user device of the multiple user devices, the estimated capture time to a decoded video frame based on the comparison.
  • the video synchronization server system is also configured to time align video streams from the multiple user devices based on the assigned estimated capture times.
  • the video synchronization server system is configured to estimate, for each user device of the multiple user devices, a one-way transmission delay based on the timestamp and a reception time of the received frame fingerprint and the timestamp in the system time.
  • the video synchronization server system is also configured to calculate, for each user device of the multiple user devices, the estimated capture time based on the one-way transmission delay and a reception time of a video frame used to generate the frame fingerprint in the system time.
  • the video synchronization server system is configured to periodically transmit, to each user device of the multiple user devices and over the wireless peer-to-peer channel, a timestamp generated based on a current system time.
  • the video synchronization server system is configured to store, for each user device of the multiple user devices, the one-way transmission delay.
  • the video synchronization server system is also configured to determine, for each user device of the multiple user devices, any trend in one-way transmission delay.
  • the video synchronization server system is further configured to schedule, for each user device of the multiple user devices, transmission of timestamps based on the trend in one-way transmission delay.
  • the video synchronization server system is configured to calculate, for each user device of the multiple user devices, a respective difference metric between the received frame fingerprint and the respective frame fingerprint generated for the decoded video frames.
  • the video synchronization server system is also configured to select, for each user device of the multiple user devices, a decoded video frame that results in a difference metric representing a difference between the received frame fingerprint and a frame fingerprint generated for the decoded video frame that is lower than a threshold difference.
  • the video synchronization server system is configured to time align the video stream from the multiple user devices based on the assigned estimated capture times so that video frames in the video streams having a same capture time are time aligned.
  • embodiments may be implemented in hardware, or in software for execution by suitable processing circuitry, or a combination thereof.
  • Fig. 15 illustrates a particular hardware implementation of the video synchronization server system 400.
  • the video synchronization server system 400 comprises a receiver 410 configured to receive the video stream over the wireless media channel and receive the frame fingerprint and the timestamp over the wireless peer-to-peer channel.
  • the video synchronization server system 400 also comprises a transmitter 410 configured to transmit the timestamp over the wireless peer-to-peer channel.
  • a time estimator 420 of the video synchronization server system 400 is configured to determine the estimated capture time.
  • the video synchronization server system 400 further comprises a decoder 430 configured to decode the video stream and a comparator 440 configured to compare the received frame fingerprint with the respective frame fingerprint.
  • An assigning unit 450 of the video synchronization server system 400 is configured to assign the estimated capture time to the decoded video frame and a time aligner 460 is configured to time align the video streams.
  • the receiver and transmitter have been exemplified by a transceiver (TX/RX) 410.
  • the video synchronization server system 400 could comprise a dedicated receiver 410 and a dedicated transmitter 410, or a first receiver used for reception of encoded video frames on the wireless media channel and a second receiver used for reception of timestamps and frame fingerprints on the wireless peer-to-peer channel in addition to the transmitter 410 or multiple transmitters.
  • the receiver 410 is preferably connected to the decoder 430 and the comparator 440 for forwarding received encoded video frames thereto, and to the time estimator 420 for forwarding the received timestamp thereto.
  • the comparator 440 is connected to the decoder 430 for receiving decoded video frames therefrom and to the assigning unit 450 for forwarding a selected decoded video frame thereto.
  • the assigning unit 450 is connected to the time estimator 420 for receiving the estimated capture time therefrom and to the time aligner 460 for forwarding a decoded video frame with assigned estimated capture time thereto.
  • the time aligner 460 is also connected to the decoder 430 for receiving the decoded video frames to be time aligned therefrom.
  • the comparator 440 of the video synchronization server system 400 is configured to generate frame fingerprints of the decoded video frames output from the decoder 430.
  • the video synchronization server system 400 comprises a fingerprint generator (not illustrated) configured to generate frame fingerprints of the decoded video frames output from the decoder 430.
  • the figure also shows a system clock 15 used as internal time reference by the video synchronization server system 400. This system clock 15 is preferably used to generate timestamps and to record reception times of encoded video frames and of the frame fingerprints in the system time.
  • At least some of the steps, functions, procedures, modules and/or blocks described herein may be implemented in software such as a computer program for execution by suitable processing circuitry such as one or more processors or processing units.
  • processing circuitry includes, but is not limited to, one or more microprocessors, one or more Digital Signal Processors (DSPs), one or more Central Processing Units (CPUs), video acceleration hardware, and/or any suitable programmable logic circuitry such as one or more Field Programmable Gate Arrays (FPGAs), or one or more Programmable Logic Controllers (PLCs).
  • DSPs Digital Signal Processors
  • CPUs Central Processing Units
  • FPGAs Field Programmable Gate Arrays
  • PLCs Programmable Logic Controllers
  • the video synchronization server system 500 comprises a processor 510 and a memory 520 comprising instructions executable by the processor 510.
  • the processor 510 is operative to cause a receiver 530 to receive the video stream over the wireless media channel and receive the frame fingerprint and the timestamp over the wireless peer-to-peer channel.
  • the processor 510 is also operative to cause a transmitter 530 to transmit the timestamp over the wireless peer-to-peer channel.
  • the processor 510 is further operative to determine the estimated capture time, to decode the video stream and to compare the received frame fingerprint with the respective frame fingerprint.
  • the processor 510 is additionally operative to assign the estimated capture time to the decoded video frame and to time align the video streams.
  • the processor 510 is operative, when executing the instructions stored in the memory 520, to perform the above described operations.
  • the processor 510 is thereby interconnected to the memory 520 to enable normal software execution.
  • Fig. 16 illustrates the video synchronization server system 500 as comprising a transceiver 530.
  • the video synchronization server system 500 could instead comprise one or more receivers and one or more transmitters.
  • Fig. 18 is, in an embodiment, a schematic block diagram illustrating an example of a video synchronization server system 700 comprising a processor 710, an associated memory 720 and communication circuitry 730.
  • a computer program 740 is loaded into the memory 720 for execution by processing circuitry including one or more processors 710.
  • the processor 710 and memory 720 are interconnected to each other to enable normal software execution.
  • the communication circuitry 730 is also interconnected to the processor 710 and/or the memory 720 to enable input and/or output of encoded video frames, timestamps and frame fingerprints.
  • the term 'processor' should be interpreted in a general sense as any system or device capable of executing program code or computer program instructions to perform a particular processing, determining or computing task.
  • the processing circuitry including one or more processors is thus configured to perform, when executing the computer program, well-defined processing tasks such as those described herein.
  • the processing circuitry does not have to be dedicated to only execute the above-described steps, functions, procedures and/or blocks, but may also execute other tasks.
  • the computer program 740 comprises instructions, which when executed by the processor 710, cause the processor 710 to generate, for each user device of multiple user devices, a timestamp based on a current system time. The timestamp is output for transmission to the user device over a wireless peer-to-peer channel.
  • the processor 710 is also caused to determine, for each user device of the multiple user devices, an estimated capture time of a video frame, used to generate a frame fingerprint, received from the user device with the timestamp over the wireless peer-to-peer channel, based on the timestamp and a current system time.
  • the processor 710 is further caused to decode, for each user device of the multiple user devices, a video stream of encoded video frames, received from the user device over a wireless media channel, to get decoded video frames.
  • the processor 710 is additionally caused to compare, for each user device of the multiple user devices, the received frame fingerprint with a respective frame fingerprint generated for the decoded video frames and to assign, for each user device of the multiple user devices, the estimated capture time to a decoded video frame based on the comparison.
  • the processor 710 is further caused to time align video streams from the multiple user devices based on the assigned estimated capture times.
  • the proposed technology also provides a carrier 750 comprising the computer program 740.
  • the carrier 750 is one of an electronic signal, an optical signal, an electromagnetic signal, a magnetic signal, an electric signal, a radio signal, a microwave signal, or a computer-readable storage medium 750.
  • the software or computer program 740 may be realized as a computer program product, which is normally carried or stored on a computer-readable medium 750, preferably a non-volatile computer-readable storage medium 750.
  • the computer-readable medium 750 may include one or more removable or non-removable memory devices including, but not limited to, a Read-Only Memory (ROM), a Random Access Memory (RAM), a Compact Disc (CD), a Digital Versatile Disc (DVD), a Blu-ray disc, a Universal Serial Bus (USB) memory, a Hard Disk Drive (HDD) storage device, a flash memory, a magnetic tape, or any other conventional memory device.
  • the computer program 740 may thus be loaded into the operating memory of a computer or equivalent processing device, represented by the video synchronization server system 700 in Fig. 18, for execution by the processor 710 thereof.
  • a corresponding video synchronization server system may be defined as a group of function modules, where each step performed by the processor corresponds to a function module.
  • the function modules are implemented as a computer program running on the processor.
  • the video synchronization server system may alternatively be defined as a group of function modules, where the function modules are implemented as a computer program running on at least one processor.
  • the computer program residing in memory may thus be organized as appropriate function modules configured to perform, when executed by the processor, at least part of the steps and/or tasks described herein.
  • An example of such function modules is illustrated in Fig. 17, a schematic block diagram of a video synchronization server system 600 with function modules.
  • the video synchronization server system 600 comprises a timestamp generator 610 for generating, for each user device of multiple user devices, a timestamp based on a current system time. The timestamp is output for transmission to the user device over a wireless peer-to-peer channel.
  • the video synchronization server system 600 also comprises a time estimator 620 for determining, for each user device of the multiple user devices, an estimated capture time of a video frame, used to generate a frame fingerprint, received from the user device with a timestamp over the wireless peer-to-peer channel, based on the timestamp and a current system time.
  • the video synchronization server system 600 further comprises a decoder 630 for decoding, for each user device of the multiple user devices, a video stream of encoded video frames, received from the user device over a wireless media channel, to get decoded video frames.
  • the video synchronization server system 600 additionally comprises a comparator 640 for comparing, for each user device of the multiple user devices, the received frame fingerprint with a respective frame fingerprint generated for the decoded video frames.
  • the video synchronization server system 600 further comprises an assigning unit 650 for assigning, for each user device of the multiple user devices, the estimated capture time to a decoded video frame based on the comparison.
  • the video synchronization server system 600 additionally comprises a time aligner 660 for time aligning video streams from the multiple user devices based on the assigned estimated capture times.
  • the video synchronization server system 600 also comprises a receiver 670 for receiving the video streams over the wireless media channel from each user device of the multiple user devices and for receiving the frame fingerprint and the timestamp from each user device of the multiple user devices over the wireless peer-to-peer channel.
  • the video synchronization server system 600 also comprises a transmitter 670 for transmitting, to each user device of the multiple user devices, the timestamp over the wireless peer-to-peer channel.
  • the receiver and transmitter can be implemented as a transceiver or as one or more receivers and one or more transmitters.
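Taken together, the comparator 640, assigning unit 650 and time aligner 660 suggest a loop like the sketch below. The bit-string fingerprints, the Hamming-distance matching rule and the computeFingerprint() helper are assumptions for illustration; the patent leaves the fingerprint algorithm and the matching criterion open.

```typescript
// Sketch only: match a reported fingerprint against fingerprints of
// recently decoded frames, tag the best match with the estimated capture
// time, then derive per-stream offsets from the tagged times.
interface DecodedFrame {
  index: number;
  captureTime?: number; // assigned once a fingerprint match is found
}

// Assumed helper: fingerprint of a decoded frame as a bit string.
declare function computeFingerprint(frame: DecodedFrame): string;

function hammingDistance(a: string, b: string): number {
  let d = Math.abs(a.length - b.length);
  for (let i = 0; i < Math.min(a.length, b.length); i++) {
    if (a[i] !== b[i]) d++;
  }
  return d;
}

function assignCaptureTime(
  frames: DecodedFrame[],
  reportedFingerprint: string,
  estimatedCaptureTime: number
): void {
  let best: DecodedFrame | undefined;
  let bestDistance = Infinity;
  for (const frame of frames) {
    const d = hammingDistance(computeFingerprint(frame), reportedFingerprint);
    if (d < bestDistance) {
      bestDistance = d;
      best = frame;
    }
  }
  if (best !== undefined) best.captureTime = estimatedCaptureTime;
}

// Time alignment (simplified): shift each stream so its tagged frame
// lines up with the earliest tagged capture time across all streams.
function streamOffsets(taggedTimes: number[]): number[] {
  const reference = Math.min(...taggedTimes);
  return taggedTimes.map((t) => t - reference);
}
```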
  • Yet another aspect of the embodiments relates to a user device that is configured to transmit a video stream of encoded video frames to a video synchronization server system over a wireless media channel.
  • the user device is also configured to generate a frame fingerprint of a current video frame upon reception of a timestamp from the video synchronization server system over a wireless peer-to-peer channel.
  • the user device is further configured to transmit the frame fingerprint and the timestamp to the video synchronization server system over the wireless peer-to-peer channel.
  • the user device is configured to record a scene with a camera of or connected to the user device to produce video frames.
  • the user device is also configured to generate the video stream by encoding the video frames.
  • the user device is configured to provide a video frame of a current scene recorded by the camera upon reception of the timestamp from the video synchronization server system over the wireless peer-to-peer channel.
  • the user device is also configured to generate the frame fingerprint of the video frame.
  • the user device is configured to initiate a browser-based application service to activate a WebRTC getUserMedia API to access the camera.
  • the browser-based application service also activates a WebRTC MediaStream API to transmit the video stream to the video synchronization server system over the wireless media channel using RTP.
  • the browser-based application service further activates a WebRTC DataChannel API to establish the wireless peer-to-peer channel with the video synchronization server system.
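A browser-side sketch of how these three APIs might be wired together follows. The signaling (offer/answer) exchange is omitted, and the 'sync' channel label and the onTimestamp() hook are assumptions rather than anything the patent prescribes.

```typescript
// Assumed hook: fingerprint the current frame and report it back
// (sketched further below).
declare function onTimestamp(timestamp: number): void;

async function startClient(pc: RTCPeerConnection, video: HTMLVideoElement) {
  // getUserMedia API: access the camera.
  const stream = await navigator.mediaDevices.getUserMedia({ video: true });
  video.srcObject = stream; // local preview, also the fingerprint source

  // MediaStream API: hand the camera track to the peer connection, which
  // carries it to the video synchronization server system as RTP media.
  stream.getVideoTracks().forEach((track) => pc.addTrack(track, stream));

  // DataChannel API: the wireless peer-to-peer channel used for the
  // timestamp / fingerprint exchange.
  const channel = pc.createDataChannel('sync');
  channel.onmessage = (event) => onTimestamp(Number(event.data));
}
```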
  • Fig. 12 is a schematic block diagram of a hardware implementation of the user device 100.
  • the user device 100 comprises a transmitter 110 configured to transmit the video stream to the video synchronization server system over the wireless media channel and transmit the frame fingerprint and the timestamp to the video synchronization server system over the wireless peer-to-peer channel.
  • the user device 100 also comprises a receiver 110 configured to receive the timestamp from the video synchronization server system over the wireless peer-to-peer channel.
  • the user device 100 further comprises a fingerprint generator 120 configured to generate the frame fingerprint.
  • the user device 100 optionally comprises a camera 5 used to record a scene and output video frames.
  • the camera 5 does not necessarily have to be part of the user device 100 but could, alternatively, be connected thereto through a wired or wireless connection.
  • An encoder (not illustrated), such as implemented in the camera 5 or in the user device 100, is used to encode the video frames to get encoded video frames of the video stream.
  • the fingerprint generator 120 is preferably connected to the receiver 110 to get a notification of generating a frame fingerprint upon reception of a timestamp by the receiver 110.
  • the fingerprint generator 120 is preferably also connected to the camera 5 to retrieve a current video frame or picture therefrom to generate the frame fingerprint. This frame fingerprint is forwarded to the connected transmitter 110 for transmission together with the timestamp to the video synchronization server system.
  • the receiver and transmitter can be implemented as a transceiver 110 or as one or more receivers and one or more transmitters.
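As a concrete stand-in for the fingerprint generator 120, the browser sketch below computes an 8x8 average hash of the current camera frame via a canvas. The patent does not prescribe this algorithm (the robust image hashes it cites are alternatives), so treat it as one simple possibility.

```typescript
// Sketch only: 8x8 average-hash fingerprint of the current video frame.
function averageHash(video: HTMLVideoElement): string {
  const size = 8;
  const canvas = document.createElement('canvas');
  canvas.width = size;
  canvas.height = size;
  const ctx = canvas.getContext('2d')!;
  ctx.drawImage(video, 0, 0, size, size); // downscale the current frame

  const { data } = ctx.getImageData(0, 0, size, size);
  const gray: number[] = [];
  for (let i = 0; i < data.length; i += 4) {
    // Luminance from the RGBA samples.
    gray.push(0.299 * data[i] + 0.587 * data[i + 1] + 0.114 * data[i + 2]);
  }
  const mean = gray.reduce((a, b) => a + b, 0) / gray.length;

  // One bit per pixel: 1 if brighter than the mean, else 0.
  return gray.map((g) => (g > mean ? '1' : '0')).join('');
}
```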
  • the user device 200 comprises a processor 210 and a memory 220 comprising instructions executable by the processor 210.
  • the processor 210 is operative to cause a transmitter 230 to transmit the video stream to the video synchronization server system over the wireless media channel and to transmit the frame fingerprint and the timestamp to the video synchronization server system over the wireless peer-to-peer channel.
  • the processor 210 is also operative to generate the frame fingerprint.
  • the receiver and transmitter can be implemented as a transceiver 230 or as one or more receivers and one or more transmitters.
  • the processor 210 is operative, when executing the instructions stored in the memory 220, to perform the above described operations.
  • the processor 210 is thereby interconnected to the memory 220 to enable normal software execution.
  • the user device 200 may optionally also comprise a camera 5 configured to record a scene to produce video frames.
  • Fig. 18 is, in an embodiment, a schematic block diagram illustrating an example of a user device 700 comprising a processor 710, an associated memory 720 and a communication circuitry 730.
  • a computer program 740 which is loaded into the memory 720 for execution by processing circuitry including one or more processors 710.
  • the processor 710 and memory 720 are interconnected to each other to enable normal software execution.
  • a communication circuitry 730 is also interconnected to the processor 710 and/or the memory 720 to enable input and/or output of encoded video frames, timestamps and frame fingerprints.
  • the computer program 740 comprises instructions, which when executed by the processor 710, cause the processor 710 to generate a video stream of encoded video frames for transmission to a video synchronization server system over a wireless media channel.
  • the processor 710 is also caused to generate a frame fingerprint of a current video frame upon reception of a timestamp from the video synchronization server system over a wireless peer-to-peer channel.
  • the processor 710 is further caused to associate the timestamp with the frame fingerprint for transmission to the video synchronization server system over the wireless peer-to-peer channel.
  • the computer program 740 may be comprised in the previously described carrier 750.
  • Associating the timestamp with the frame fingerprint may, for instance, be achieved by including the timestamp and the frame fingerprint in a data packet that is transmitted over the wireless peer-to-peer channel to the video synchronization server system.
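For instance, under the assumption of a JSON message format (the field names are illustrative, not from the patent):

```typescript
// Sketch only: one data packet carries both values, so the server can
// pair the echoed timestamp with the fingerprint it belongs to.
function reportFingerprint(
  channel: RTCDataChannel,
  timestamp: number,
  fingerprint: string
): void {
  channel.send(JSON.stringify({ timestamp, fingerprint }));
}
```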
  • the computer program residing in memory may thus be organized as appropriate function modules configured to perform, when executed by the processor, at least part of the steps and/or tasks described herein.
  • An example of such function modules is illustrated in Fig. 14, a schematic block diagram of a user device 300 with function modules.
  • the user device 300 comprises an encoder 310 for generating a video stream of encoded video frames for transmission to a video synchronization server system over a wireless media channel.
  • the user device 300 also comprises a fingerprint generator 320 for generating a frame fingerprint of a current video frame upon reception of a timestamp from the video synchronization server system over a wireless peer-to-peer channel.
  • the user device 300 further comprises an associating unit 330 for associating the timestamp with the frame fingerprint for transmission to the video synchronization server system over the wireless peer-to-peer channel.
  • the user device 300 also comprises a transmitter 340 for transmitting the video stream to the video synchronization server system over the wireless media channel and for transmitting the timestamp and the frame fingerprint to the video synchronization server system over the wireless peer-to-peer channel.
  • the user device 300 preferably further comprises a receiver 340 for receiving the timestamp from the video synchronization server system over the wireless peer-to-peer channel.
  • the transmitter and receiver could be implemented as a transceiver 340 or as one or more transmitters and one or more receivers.
  • the user device is preferably in the form of a mobile or portable user device, such as a mobile telephone, a smart phone, a tablet, a laptop, a video camera with wireless communication circuitry, etc.

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Databases & Information Systems (AREA)
  • Computer Security & Cryptography (AREA)
  • Physics & Mathematics (AREA)
  • Astronomy & Astrophysics (AREA)
  • General Physics & Mathematics (AREA)
  • Two-Way Televisions, Distribution Of Moving Picture Or The Like (AREA)

Abstract

According to the invention, video synchronization is achieved by transmitting timestamps to user devices (1, 2, 3), which return the timestamps together with frame fingerprints. The user devices (1, 2, 3) also transmit video streams (21, 22, 23), which are decoded to obtain decoded video frames (31, 32, 33). The received frame fingerprints are compared with frame fingerprints generated for the decoded video frames (31, 32, 33) in order to find a match. The decoded video frames (31, 32, 33) that produced a respective frame fingerprint match are assigned a respective estimated capture time determined on the basis of the timestamps and current system times. The video streams (21, 22, 23) from the different user devices (1, 2, 3) are time aligned on the basis of the estimated capture times assigned to the decoded video frames (31, 32, 33).
PCT/SE2014/051263 2014-10-27 2014-10-27 Synchronisation de flux vidéo WO2016068760A1 (fr)

Priority Applications (1)

Application Number Priority Date Filing Date Title
PCT/SE2014/051263 WO2016068760A1 (fr) 2014-10-27 2014-10-27 Synchronisation de flux vidéo

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/SE2014/051263 WO2016068760A1 (fr) 2014-10-27 2014-10-27 Synchronisation de flux vidéo

Publications (1)

Publication Number Publication Date
WO2016068760A1 true WO2016068760A1 (fr) 2016-05-06

Family

ID=51900956

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/SE2014/051263 WO2016068760A1 (fr) 2014-10-27 2014-10-27 Synchronisation de flux vidéo

Country Status (1)

Country Link
WO (1) WO2016068760A1 (fr)

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20080122986A1 (en) * 2006-09-19 2008-05-29 Florian Diederichsen Method and system for live video production over a packeted network
US20110043691A1 (en) 2007-10-05 2011-02-24 Vincent Guitteny Method for synchronizing video streams
US20130304243A1 (en) * 2012-05-09 2013-11-14 Vyclone, Inc Method for synchronizing disparate content files
EP2670157A2 (fr) * 2012-06-01 2013-12-04 Koninklijke KPN N.V. Synchronisation de supports inter-destination à base d'empreintes digitales

Non-Patent Citations (8)

* Cited by examiner, † Cited by third party
Title
HERNANDEZ ET AL.: "Robust Image Hashing Using Image Normalization and SVD Decomposition", IEEE 54TH INTERNATIONAL MIDWEST SYMPOSIUM ON CIRCUITS AND SYSTEMS (MWSCAS), 2011, pages 1 - 4
JIN-HEE CHOI; CHUCK YOO: "Analytic End-to-End Estimation for the One-Way Delay and Its Variation", CONSUMER COMMUNICATIONS AND NETWORKING CONFERENCE, 2005, pages 527 - 532
LIU; XIAO: "A Robust Image Hashing Algorithm Resistant Against Geometrical Attacks", RADIOENGINEERING, vol. 22, no. 4, 2013, pages 1072 - 1081
PICKET, SIMPLE IMAGE HASHING WITH PYTHON, 2013, Retrieved from the Internet <URL:https://blog.safaribooksonline.com/2013/11/26/image-hashing-with-python>
SAMINATHAN ET AL.: "Robust and secure image hashing", IEEE TRANSACTIONS ON INFORMATION FORENSICS AND SECURITY, vol. 1, no. 2, 2006, pages 215 - 230
SHEPHERD D ET AL: "Extending OSI to support synchronization required by multimedia applications", COMPUTER COMMUNICATIONS, ELSEVIER SCIENCE PUBLISHERS BV, AMSTERDAM, NL, vol. 13, no. 7, 1 September 1990 (1990-09-01), pages 399 - 406, XP024227243, ISSN: 0140-3664, [retrieved on 19900901], DOI: 10.1016/0140-3664(90)90159-E *
VENKATESAN: "Robust image hashing", 2000, MICROSOFT RESEARCH PUBLICATIONS
ZHAO ET AL.: "A Robust Image Hashing Method Based on Zernike Moments", JOURNAL OF COMPUTATIONAL INFORMATION SYSTEMS, vol. 6, no. 3, 2010, pages 717 - 725

Cited By (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10250941B2 (en) 2016-12-13 2019-04-02 Nbcuniversal Media, Llc System and method for mapping affiliated graphs using video fingerprints
CN111183650A (zh) * 2018-07-16 2020-05-19 格雷斯诺特公司 动态控制指纹识别速率以促进媒体内容的时间精确修订
CN111183650B (zh) * 2018-07-16 2021-10-29 六科股份有限公司 动态控制指纹识别速率以促进媒体内容的时间精确修订
US11503362B2 (en) 2018-07-16 2022-11-15 Roku, Inc. Dynamic control of fingerprinting rate to facilitate time-accurate revision of media content
WO2021137252A1 (fr) * 2019-12-31 2021-07-08 Sling Media Pvt Ltd. Mode de latence faible dynamique pour un système de production de vidéo numérique
US11784839B2 (en) 2019-12-31 2023-10-10 Dish Network Technologies India Private Limited Dynamic low latency mode for a digital video production system
US20210274231A1 (en) * 2020-02-27 2021-09-02 Ssimwave Inc. Real-time latency measurement of video streams
US11638051B2 (en) * 2020-02-27 2023-04-25 Ssimwave, Inc. Real-time latency measurement of video streams
CN112533075A (zh) * 2020-11-24 2021-03-19 湖南傲英创视信息科技有限公司 视频处理方法、装置及系统
CN114302169A (zh) * 2021-12-24 2022-04-08 威创集团股份有限公司 一种画面同步录制方法、装置、系统及计算机存储介质

Similar Documents

Publication Publication Date Title
US20210195275A1 (en) Video stream synchronization
US11290770B2 (en) Dynamic control of fingerprinting rate to facilitate time-accurate revision of media content
US11627351B2 (en) Synchronizing playback of segmented video content across multiple video playback devices
US11063999B2 (en) Distributed fragment timestamp synchronization
US10516757B2 (en) Server-side scheduling for media transmissions
US9319738B2 (en) Multiplexing, synchronizing, and assembling multiple audio/video (A/V) streams in a media gateway
CN107211078B (zh) 基于vlc的视频帧同步
WO2016068760A1 (fr) Synchronisation de flux vidéo
US9100461B2 (en) Automatically publishing streams to multiple destinations
US20150113576A1 (en) Method and apparatus for ip video signal synchronization
US20140365685A1 (en) Method, System, Capturing Device and Synchronization Server for Enabling Synchronization of Rendering of Multiple Content Parts, Using a Reference Rendering Timeline
US20170048291A1 (en) Synchronising playing of streaming content on plural streaming clients
US20230328308A1 (en) Synchronization of multiple content streams
US11943125B2 (en) Discontinuity detection in transport streams
WO2023170679A1 (fr) Synchronisation de multiples flux de contenu

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 14799242

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 14799242

Country of ref document: EP

Kind code of ref document: A1