CN117376622A - Audio sharing synchronization method and system - Google Patents

Audio sharing synchronization method and system

Info

Publication number
CN117376622A
CN117376622A
Authority
CN
China
Prior art keywords
playing end
playing
play
data
audio
Prior art date
Legal status
Pending
Application number
CN202210768625.7A
Other languages
Chinese (zh)
Inventor
文志平
裘昊
沈德欢
杨阳
李祖金
Current Assignee
Hangzhou Arcvideo Technology Co ltd
Original Assignee
Hangzhou Arcvideo Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hangzhou Arcvideo Technology Co ltd filed Critical Hangzhou Arcvideo Technology Co ltd
Priority to CN202210768625.7A
Publication of CN117376622A
Legal status: Pending

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/4302Content synchronisation processes, e.g. decoder synchronisation
    • H04N21/4307Synchronising the rendering of multiple content streams or additional data on devices, e.g. synchronisation of audio on a mobile phone with the video output on the TV screen
    • H04N21/43076Synchronising the rendering of multiple content streams or additional data on devices, e.g. synchronisation of audio on a mobile phone with the video output on the TV screen of the same content streams on multiple devices, e.g. when family members are watching the same movie on different devices
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/439Processing of audio elementary streams
    • H04N21/4392Processing of audio elementary streams involving audio buffer management
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/439Processing of audio elementary streams
    • H04N21/4394Processing of audio elementary streams involving operations for analysing the audio stream, e.g. detecting features or characteristics in audio streams

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Two-Way Televisions, Distribution Of Moving Picture Or The Like (AREA)

Abstract

The invention discloses an audio sharing synchronization method and system. The audio sharing synchronization system comprises a master playing end and at least one slave playing end, wherein the master playing end is the device among the multi-screen devices that requests audio data from a local file or a media content service platform, and a slave playing end is a device that receives the shared audio data. A data forwarding service shares and transmits the audio data between the master playing end and the slave playing ends, and a signaling forwarding service transmits signaling messages between the master playing end and the slave playing end devices. The master playing end further comprises a master-end data acquisition and analysis module, a master-end play control module, and a master-end signaling transceiver module; each slave playing end further comprises a slave-end data receiving module, a slave-end data acquisition and analysis module, a slave-end play control module, a slave-end signaling transceiver module, and a slave-end synchronization control module. The system realizes audio sharing and millisecond-level synchronization of media content playback in multi-screen fusion scenarios.

Description

Audio sharing synchronization method and system
Technical Field
The invention belongs to the technical field of audio and video playing, and particularly relates to an audio sharing synchronization method and system.
Background
With the development of technology, multimedia application scenarios evolve day by day, spanning the traditional television screen, mobile terminals, outdoor large screens, and the multi-screen fusion of the Internet of Everything. Network communication technology has advanced alongside new application scenarios such as home theater and in-cabin entertainment. Multi-screen fusion within an independent space creates a natural demand for multi-screen simulcast in video playback; in a vehicle smart cockpit, for example, several screen devices at different positions can share audio and video content such as movies, TV series, variety shows, and music, enhancing the riding experience.
Most media content carries both video and audio, and sharing it between multiple devices requires sharing and synchronizing the video data and the audio data. Audio synchronization is the difficult part of sharing multimedia content among multi-screen devices. In the prior art, audio devices are synchronized by having each device request the same source audio content from a server and then use a common clock server's time as the reference time, consulting that time during playback. Although this method achieves playback synchronization, it performs poorly when: a. network jitter makes the data received by the devices arrive faster or slower; b. the network delays of the devices differ substantially; c. the devices' own hardware audio playback delays are inconsistent. In addition, for online media content the multimedia content service platform charges by traffic or by member account, so having multiple devices each pull the same stream from the platform wastes traffic and increases cost. The common screen sharing of the prior art, i.e., screen casting, is two-screen sharing between a small screen and a large screen, for example a mobile terminal casting video content to a smart television, and this approach cannot achieve sharing among multi-screen devices.
Disclosure of Invention
Aiming at the above shortcomings of the prior art and the characteristics of multi-screen fusion scenarios such as the smart cockpit, the invention provides an audio sharing synchronization method and system for realizing audio sharing and millisecond-level synchronization of media content playback in multi-screen fusion scenarios.
In order to solve the technical problems, the invention adopts the following technical scheme:
an aspect of the present invention provides an audio sharing synchronization method, which is applicable to sharing audio content between multi-screen devices, including:
setting the device among the multi-screen devices that is responsible for requesting audio data from a local file or a media content service platform as the master playing end, and the other devices that receive the shared audio data as slave playing ends;
setting up a data forwarding service and a signaling forwarding service, wherein the data forwarding service shares and transmits audio data between the master playing end and the slave playing ends, and the signaling forwarding service transmits signaling messages between the master playing end and the slave playing end devices;
the master playing end starts to play a local audio/video file or a network streaming-media video from a media content platform, acquires audio data from the master-end data acquisition and analysis module, and inputs the audio data into a pre-cache pool while simultaneously copying the data, transmitting the copy to the data forwarding service, and sending it to the slave playing end;
the master playing end requests data from the pre-cache pool and starts outputting data for playback once the data in the pre-cache pool exceeds a minimum cache threshold;
once the master playing end starts playing, it sends a synchronization reference timestamp to the slave playing end at a certain frequency;
after the slave playing end device starts, it enters a listening state, waiting for audio data and signaling messages;
upon receiving the synchronization reference time signaling message for the first time, the slave playing end initializes a timer using the synchronization reference time and the network transmission delay, the timer's starting time being the sum of the reference time and the network transmission delay; during its playback the slave playing end periodically receives the synchronization reference timestamp sent by the master playing end, and each time a synchronization reference time is received, it updates the timer with the latest synchronization reference time and the network transmission delay;
after receiving the audio data, the slave playing end continuously feeds it to the decoder in the slave-end play control module for decoding; the decoder outputs decoded audio frame data, and the play control module calls the synchronization control module to judge the synchronization state of the decoded audio frame data. If it judges the frames to be synchronized, the audio frame data is sent directly to the device for sound playback; if the slave playing end is ahead, slow-play synchronization logic based on a rate-multiple algorithm is enabled; and if the slave playing end is behind, fast-play synchronization logic based on the rate-multiple algorithm is enabled.
In one possible design, if the slave playing end is ahead, the slow-play synchronization logic is enabled, comprising: setting a first slow-play speed and a second slow-play speed, both smaller than the normal play rate; slow-playing first at the smaller first slow-play speed; switching to the larger second slow-play speed when the playback lead time difference falls to 1-2 times the maximum synchronization error time; and judging the synchronization processing finished when the playback lead time difference is smaller than the minimum synchronization error time, entering the synchronized state and resuming playback at normal speed.
In one possible design, if the slave playing end is behind, the fast-play synchronization logic is enabled, comprising: setting a first fast-play speed and a second fast-play speed, both larger than the normal play rate; fast-playing first at the larger first fast-play speed; switching to the smaller second fast-play speed when the playback lag time difference falls to 1-2 times the maximum synchronization error time; and judging the synchronization processing finished when the playback lag time difference is smaller than the minimum synchronization error time, entering the synchronized state and resuming playback at normal speed.
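The two-tier speed selection in the two designs above can be sketched as follows. This is a minimal illustration rather than the patent's implementation; the function name, the concrete rate values (0.75/0.9/1.1/1.25), and the default thresholds are all assumed for the example:

```python
def select_rate(offset_ms: float,
                tol_max_ms: float = 100.0,
                tol_min_ms: float = 20.0) -> float:
    """offset_ms > 0: slave is ahead (slow down); offset_ms < 0: behind (speed up)."""
    assert tol_max_ms > tol_min_ms
    mag = abs(offset_ms)
    if mag < tol_min_ms:
        return 1.0                      # within minimum error: normal speed
    gentle = mag <= 2 * tol_max_ms      # within 1-2x the max error: gentler tier
    if offset_ms > 0:                   # ahead -> slow play, rates < 1.0
        return 0.9 if gentle else 0.75
    else:                               # behind -> fast play, rates > 1.0
        return 1.1 if gentle else 1.25
```

A large offset picks the aggressive rate; as the offset shrinks into the 1-2x band the gentler rate takes over, and below the minimum error playback returns to normal speed.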
In one possible design, after the slave playing end obtains the first frame of decoded data, if the timer has not started, it waits for the arrival of the first synchronization reference timestamp and the start of the timer.
In one possible design, the synchronization reference timestamp is the latest audio playback timestamp of the master playing end's audio data; the master playing end sends the latest audio playback timestamp to the signaling forwarding service, and the signaling forwarding service sends the received timestamp to all slave playing ends at the same time.
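A minimal sketch of this fan-out, with hypothetical class and method names (`SignalingForwardingService`, `publish_ref_time`); the patent does not specify an API, only that the identical timestamp reaches every slave:

```python
class SignalingForwardingService:
    """Hypothetical stand-in for the signaling forwarding service."""
    def __init__(self):
        self.slaves = []                  # registered slave callbacks

    def register(self, on_ref_time):
        self.slaves.append(on_ref_time)

    def publish_ref_time(self, pts_ms: int):
        # The identical timestamp value is delivered to every slave.
        for deliver in self.slaves:
            deliver(pts_ms)

received = []
svc = SignalingForwardingService()
svc.register(lambda pts: received.append(("slave1", pts)))
svc.register(lambda pts: received.append(("slave2", pts)))
svc.publish_ref_time(48_000)              # master's latest audio PTS, in ms
```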
In yet another aspect, an embodiment of the present invention provides an audio sharing synchronization system, including a master playing end and at least one slave playing end,
wherein the master playing end is the device among the multi-screen devices that requests audio data from a local file or a media content service platform, and the slave playing end is another device that receives the shared audio data; the data forwarding service shares and transmits the audio data between the master playing end and the slave playing end, and the signaling forwarding service transmits signaling messages between the master playing end and the slave playing end devices;
the main playing end further comprises a main playing end data acquisition and analysis module, a main playing end playing control module and a main playing end signaling receiving and transmitting module; the slave playing end further comprises a slave playing end data receiving module, a slave playing end data obtaining and analyzing module, a slave playing end playing control module, a slave playing end signaling receiving and transmitting module and a slave playing end synchronous control module;
the master playing end starts to play a local audio/video file or a network streaming-media video from a media content platform, acquires audio data from the master-end data acquisition and analysis module, and inputs the audio data into the pre-cache pool of the play control module; it requests data from the pre-cache pool and starts outputting data for playback once the data in the pre-cache pool exceeds the minimum cache threshold, while simultaneously transmitting a copy of the data to the data forwarding service for sending to the slave playing end;
after the slave playing end device starts, it enters a listening state, waiting for audio data and signaling messages;
once the master playing end starts playing, it sends a synchronization reference timestamp to the slave playing end at a certain frequency; after the slave playing end receives the synchronization reference time signaling message for the first time, its synchronization control module initializes a timer using the synchronization reference time and the network transmission delay, the timer's starting time being the sum of the reference time and the network transmission delay; during its playback the slave playing end periodically receives the synchronization reference timestamp sent by the master playing end, and each time a synchronization reference time is received, it updates the timer with the latest synchronization reference time and the network transmission delay;
the audio data received by the slave playing end is decoded by the decoder in the slave-end play control module, which outputs decoded audio frame data; the play control module calls the slave-end synchronization control module to judge the synchronization state of the decoded audio frame data. If it judges the frames to be synchronized, the audio frame data is sent directly to the device for sound playback; if the slave playing end is ahead, slow-play synchronization logic based on a rate-multiple algorithm is enabled; and if the slave playing end is behind, fast-play synchronization logic based on the rate-multiple algorithm is enabled.
In one possible design, if the slave playing end is ahead, the slow-play synchronization logic is enabled, comprising: setting a first slow-play speed and a second slow-play speed, both smaller than the normal play rate; slow-playing first at the smaller first slow-play speed; switching to the larger second slow-play speed when the playback lead time difference falls to 1-2 times the maximum synchronization error time; and judging the synchronization processing finished when the playback lead time difference is smaller than the minimum synchronization error time, entering the synchronized state and resuming playback at normal speed.
In one possible design, if the slave playing end is behind, the fast-play synchronization logic is enabled, comprising: setting a first fast-play speed and a second fast-play speed, both larger than the normal play rate; fast-playing first at the larger first fast-play speed; switching to the smaller second fast-play speed when the playback lag time difference falls to 1-2 times the maximum synchronization error time; and judging the synchronization processing finished when the playback lag time difference is smaller than the minimum synchronization error time, entering the synchronized state and resuming playback at normal speed.
In one possible design, after the slave playing end obtains the first frame of decoded data, if the timer has not started, it waits for the arrival of the first synchronization reference timestamp and the start of the timer.
In one possible design, the synchronization reference timestamp is the latest audio playback timestamp of the master playing end's audio data; the master playing end sends the latest audio playback timestamp to the signaling forwarding service, and the signaling forwarding service sends the received timestamp to all slave playing ends at the same time.
The invention has the following beneficial effects:
(1) The master playing end uses a pre-cache pool mechanism, which not only effectively offsets the influence of network jitter but also absorbs the differences between the decoders of different devices and their inconsistent decoding efficiency;
(2) The presentation timestamp (PTS) built into the audio data is used directly, without any third-party time system, so no dedicated time server is required and a unified time base is still guaranteed;
(3) A device hardware delay parameter, Device_Latency, is introduced into the synchronization control, making the synchronization compatible across devices of different hardware capabilities and more accurate;
(4) The synchronization process introduces a maximum synchronization error threshold, Tolerance_TimeMax, and a minimum synchronization error threshold, Tolerance_TimeMin; Tolerance_TimeMax is used to judge whether playback is in the synchronized state, and Tolerance_TimeMin is used to eliminate the frequent transitions between the synchronized and unsynchronized states that occur at the boundary of the synchronization judgment.
Drawings
Fig. 1 is a schematic structural diagram of an audio sharing synchronization system according to an embodiment of the present invention.
Detailed Description
The following description of the embodiments of the present invention will be made clearly and fully with reference to the accompanying drawings, in which it is evident that the embodiments described are some, but not all embodiments of the invention. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
The embodiment of the invention provides an audio sharing synchronization method, which is suitable for sharing audio content among multi-screen devices and comprises the following steps:
setting the device among the multi-screen devices that is responsible for requesting audio data from a local file or a media content service platform as the master playing end, and the other devices that receive the shared audio data as slave playing ends; that is, there is only one master playing end, the number of slave playing ends is at least one, and the master playing end and the slave playing ends are in a one-to-many relationship;
setting up a data forwarding service and a signaling forwarding service, wherein the data forwarding service shares and transmits audio data between the master playing end and the slave playing ends, and the signaling forwarding service transmits signaling messages between the master playing end and the slave playing end devices; the signaling messages include, but are not limited to, playback operation instructions such as selecting other devices, pause (Pause), and seek (Seek), and synchronized playback time information, and the signaling forwarding service transmits text information of small data size.
the master playing end starts to play a local audio/video file or a network streaming-media video from a media content platform, acquires audio data from the master-end data acquisition and analysis module, and inputs the audio data into a pre-cache pool while simultaneously copying the data, transmitting the copy to the data forwarding service, and sending it to the slave playing end;
the master playing end requests data from the pre-cache pool and starts outputting data for playback once the data in the pre-cache pool exceeds a minimum cache threshold;
after the slave playing end device starts, it enters a listening state, waiting for audio data and signaling messages;
once the master playing end starts playing, it sends a synchronization reference timestamp to the slave playing end at a certain frequency; upon receiving the synchronization reference time signaling message for the first time, the slave playing end initializes a timer using the synchronization reference time and the network transmission delay, the timer's starting time being the sum of the reference time and the network transmission delay; during its playback the slave playing end periodically receives the synchronization reference timestamp sent by the master playing end, and each time a synchronization reference time is received, it updates the timer with the latest synchronization reference time and the network transmission delay;
after receiving the audio data, the slave playing end continuously feeds it to the decoder in the slave-end play control module for decoding; the decoder outputs decoded audio frame data, and the play control module calls the synchronization control module to judge the synchronization state of the decoded audio frame data. If it judges the frames to be synchronized, the audio frame data is sent directly to the device for sound playback; if the slave playing end is ahead, slow-play synchronization logic based on a rate-multiple algorithm is enabled; and if the slave playing end is behind, fast-play synchronization logic based on the rate-multiple algorithm is enabled.
In an embodiment of the present invention, when the master playing end plays, the pre-cache pool may set a minimum cache threshold Cache_Min and a maximum cache capacity Cache_Capacity. Playback starts only after the data in the pre-cache pool exceeds Cache_Min; at least Cache_Min of data is kept in the pool throughout playback, and all remaining data in the pool is output when the data source finishes. Whenever a frame of data is input into the pre-cache pool, a copy is simultaneously transmitted to the data forwarding service and sent to the slave playing end. Both the pre-cache pool capacity Cache_Capacity and the minimum cache threshold Cache_Min are configurable. The purpose of the Cache_Min threshold is to ensure that the master and slave ends hold a certain amount of redundant data with which to handle synchronization and network jitter. The master playing end does not play the pre-cached data before it reaches Cache_Min, but that data is forwarded to the slave playing end immediately. When the master playing end's pre-cached data exceeds Cache_Min, it starts playing; by then each slave playing end has also received roughly Cache_Min of data, which is the redundant data just described, available for handling synchronization and network jitter.
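The pre-cache pool behaviour described above can be sketched roughly as follows; the class name and the stand-in `forwarded` list are illustrative assumptions, not the patent's code:

```python
from collections import deque

class PreCachePool:
    """Illustrative pre-cache pool: local playback is gated on Cache_Min,
    but every frame is forwarded to the slaves immediately on input."""
    def __init__(self, cache_min=5, cache_capacity=50):
        self.cache_min = cache_min
        self.cache_capacity = cache_capacity
        self.pool = deque()
        self.forwarded = []               # stand-in for the data forwarding service

    def push(self, frame):
        if len(self.pool) < self.cache_capacity:
            self.pool.append(frame)
        self.forwarded.append(frame)      # slaves get the copy right away

    def pop_for_playback(self):
        # Release data only while more than Cache_Min frames are buffered,
        # keeping Cache_Min frames of redundancy against jitter.
        if len(self.pool) > self.cache_min:
            return self.pool.popleft()
        return None

pool = PreCachePool(cache_min=5)
for i in range(6):
    pool.push(i)                          # frames 0..5 cached and forwarded
first = pool.pop_for_playback()           # 6 frames > Cache_Min: frame 0 plays
```

Note that all six frames appear in `forwarded` even though only one has been released for local playback, mirroring the redundancy argument above.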
For example, suppose the master playing end's Cache_Min is 5 frames. When the master playing end has cached the 6th frame, it starts playing the 1st frame; at this point the slave playing end has already received 5 frames and is waiting. If network jitter delays the first synchronization timestamp, then once the network recovers the slave playing end may receive the first, second, and subsequent synchronization timestamps in quick succession; by that time the master playing end may already be playing the 4th frame, and the slave playing end can start playing directly from the 4th frame.
In the prior art, when the playback progress of the master and slave playing ends diverges during audio playback, the usual remedy is to wait when ahead and to drop frames to catch up when behind. But audio is characterized by its continuity, and waiting or frame-dropping breaks that continuity, making the audio sound intermittent and choppy. The audio sharing synchronization method of this embodiment directly uses the presentation timestamp (Presentation Time Stamp, PTS) built into the audio data, without any third-party time system, so millisecond-level audio synchronization of the same media content across multi-screen devices is achieved without relying on a dedicated time server. Only the master playing end requests data from the content platform, and the audio data is forwarded among the screens through the built-in data forwarding service, greatly reducing the traffic drawn from the content platform. The audio synchronization processing adopts a rate-multiple algorithm, preserving the continuity of the audio; by setting rate values at multiple levels, synchronization can be made smooth and imperceptible.
In an embodiment of the present invention, the synchronization reference timestamp is neither the system time of the master playing end nor the server time of a time server, but the audio presentation timestamp (Presentation Time Stamp, PTS) of the audio data. The master playing end sends the latest audio playback timestamp to the signaling forwarding service, and the signaling forwarding service sends the received timestamp to all slave playing ends at the same time, ensuring that the timestamp every slave playing end receives is identical each time.
In an embodiment of the present invention, after the slave playing end receives the audio data, the data is sent to the decoder for decoding, and when the slave playing end receives the first synchronization timestamp it starts a timer used for playback synchronization control. Define the timestamp of an audio frame as Frame_Time, the time of the timer as Clock_Time, the acceptable synchronization error range as Tolerance_Time (settable according to actual requirements), the network transmission delay as Network_Latency, the synchronization timestamp as Ref_Time, and the device hardware delay as Device_Latency. The device hardware delay is the time consumed from the input of decoded audio data to the sound being played from the device's sound card. The synchronization logic proceeds as follows:
(1) After the slave playing end obtains the first frame of decoded data, if the timer has not started, it waits for the arrival of the first synchronization reference timestamp and the start of the timer;
(2) When the timer starts, it is initialized with the reference time Ref_Time and Network_Latency, the starting time being Start_Time = Ref_Time + Network_Latency. During playback the master playing end sends the latest reference time to the slave playing ends at a certain frequency; when a slave playing end receives a new reference time Ref_Time_New, it immediately updates the timer's clock to Update_Time = Ref_Time_New + Network_Latency, so that the timestamps of all slave playing ends stay consistent with that of the master playing end and master and slave play against the same time base. The frequency at which the master playing end sends the updated reference time can be set by the user according to the actual situation, for example every 500 milliseconds.
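One possible sketch of the slave-side timer, assuming it is realized by anchoring the reference PTS to a local monotonic clock (an implementation choice the patent does not mandate); times are in milliseconds except the monotonic clock, which is in seconds:

```python
import time

class SyncTimer:
    """Slave-side timer: tracks the master's playback timestamp by anchoring
    the last reference PTS (plus network delay) to a local monotonic clock."""
    def __init__(self):
        self.base_pts = None              # Ref_Time + Network_Latency, in ms
        self.base_mono = None             # local monotonic seconds at update

    def update(self, ref_time_ms, network_latency_ms, now=None):
        # Start_Time / Update_Time = Ref_Time + Network_Latency
        self.base_pts = ref_time_ms + network_latency_ms
        self.base_mono = time.monotonic() if now is None else now

    def clock_time(self, now=None):
        # Current estimate of Clock_Time, in ms.
        t = time.monotonic() if now is None else now
        return self.base_pts + (t - self.base_mono) * 1000.0
```

Between reference messages the timer advances with the local clock; each new Ref_Time simply re-anchors it, which is how all slaves stay aligned with the master's time base.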
(3) The upper device synchronization threshold is A_Time = Clock_Time + Tolerance_Time + Device_Latency. When Frame_Time > A_Time, the current audio frame leads the audio being played by the master playing end, and the playback rate must be slowed until resynchronization;
(4) The lower device synchronization threshold is B_Time = Clock_Time - Tolerance_Time + Device_Latency. When Frame_Time < B_Time, the current audio frame lags the audio being played by the master playing end, and the playback rate must be accelerated until resynchronization;
(5) When B_Time < Frame_Time < A_Time, i.e., Frame_Time lies between B_Time and A_Time, the current audio frame and the audio frame being played by the master playing end are within the allowable synchronization error range and in the synchronized state; the audio then only needs to be played at the normal frame rate.
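Steps (3)-(5) amount to a three-way classification that could be written as follows, using the patent's own variable names (all values in milliseconds; the function name and return labels are assumptions):

```python
def sync_state(frame_time, clock_time, tolerance_time, device_latency):
    """Classify one decoded audio frame against the timer (all in ms)."""
    a_time = clock_time + tolerance_time + device_latency  # upper threshold
    b_time = clock_time - tolerance_time + device_latency  # lower threshold
    if frame_time > a_time:
        return "ahead"        # slow the playback rate until resynchronized
    if frame_time < b_time:
        return "behind"       # speed up playback until resynchronized
    return "in_sync"          # play at the normal frame rate
```

Because Device_Latency shifts both thresholds equally, a device with slower hardware output is compared against a correspondingly later window, which is what makes the scheme compatible across devices of different hardware capabilities.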
Further, in an embodiment of the present invention, the following strategy is adopted to implement the synchronization processing for the ahead and behind cases:
First, the synchronization error threshold is refined into a maximum synchronization error threshold Tolerance_TimeMax and a minimum synchronization error threshold Tolerance_TimeMin, with Tolerance_TimeMax > Tolerance_TimeMin. Tolerance_TimeMax is used to judge whether playback is in the synchronized state, while Tolerance_TimeMin is used to judge, during synchronization processing, whether synchronization has been regained so that the processing can end. This arrangement prevents the borderline case, possible with only one threshold, of frequent transitions between the synchronized and unsynchronized states. Neither threshold is fixed; both can be configured according to the actual situation, as long as Tolerance_TimeMax > Tolerance_TimeMin holds.
Second, to handle playback that is ahead or behind while preserving the continuity of the audio, an audio speed-change (time-stretching) algorithm is adopted: it changes the playback speed of the sound without altering the sampling rate, fundamental frequency, or formants of the voice, achieving speed change without pitch change. In addition, several speed values of different levels can be defined according to the severity of the desynchronization, making the synchronization processing smooth and barely perceptible to the user; the speed value is denoted Speed_Factor.
When playback is ahead, Frame_Time > A0_Time, where A0_Time = Clock_Time + Tolerance_TimeMax + Device_Latency. A Speed_Factor < 1.0 is then configured, and synchronization is achieved by slowing the playback rate. When, during synchronization processing, Frame_Time < A1_Time is reached, where A1_Time = Clock_Time + Tolerance_TimeMin + Device_Latency, the synchronized state is judged to have been regained, the slow-play synchronization processing ends, and playback resumes at the normal rate with Speed_Factor = 1.0.
When playback is behind, Frame_Time < B0_Time, where B0_Time = Clock_Time - Tolerance_TimeMax + Device_Latency. A Speed_Factor > 1.0 is then configured, and synchronization is achieved by speeding up the playback rate. When, during synchronization processing, Frame_Time > B1_Time is reached, where B1_Time = Clock_Time - Tolerance_TimeMin + Device_Latency, the synchronized state is judged to have been regained, the fast-play synchronization processing ends, and playback resumes at the normal rate with Speed_Factor = 1.0.
During synchronization processing, a multi-level Speed_Factor can be customized according to the actual severity of the desynchronization, so that the correction proceeds in steps, playback sounds smoother while it runs, and the user experience is better.
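The two-threshold, multi-level strategy above can be sketched as follows (Python; the particular speed values and the boundary at which the milder level takes over are assumptions for illustration, not values fixed by the text):

```python
class SyncController:
    """Two-threshold (hysteresis) speed selection as described above.
    Tolerance_TimeMax gates entry into a correction; Tolerance_TimeMin gates
    exit. The level boundary (tol_max) and the speed values are assumptions."""

    def __init__(self, tol_max=100, tol_min=50):
        self.tol_max = tol_max      # Tolerance_TimeMax, ms
        self.tol_min = tol_min      # Tolerance_TimeMin, ms
        self.correcting = False     # True while a speed correction is running

    def speed_factor(self, diff_ms):
        """diff_ms = Frame_Time - (Clock_Time + Device_Latency); > 0 means ahead."""
        mag = abs(diff_ms)
        if not self.correcting:
            if mag <= self.tol_max:
                return 1.0                      # in sync: normal rate
            self.correcting = True              # crossed Tolerance_TimeMax
        elif mag < self.tol_min:
            self.correcting = False             # resynchronized: end correction
            return 1.0
        if diff_ms > 0:                          # ahead: slow play, strong then mild
            return 0.5 if mag > self.tol_max else 0.8
        return 1.5 if mag > self.tol_max else 1.2  # behind: fast play
```

Because the exit threshold (Tolerance_TimeMin) is below the entry threshold (Tolerance_TimeMax), an error hovering around a single boundary cannot flip the state back and forth.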
To make the relationships between the time settings clearer to a person skilled in the art, the audio sharing synchronization method according to an embodiment of the present invention includes the following concrete implementation steps:
in this implementation, both the data forwarding service and the signaling forwarding service use standard socket protocols for sending and listening; Tolerance_TimeMax = 100 milliseconds and Tolerance_TimeMin = 50 milliseconds; the pre-cache pool capacity is 50 frames with a minimum cache threshold Cache_Min = 10; the hardware latency of the slave playing end device is 80 milliseconds; and the audio speed-change algorithm is a time-domain companding algorithm;
the master playing end starts playing a local audio/video file or a network streaming-media video of the media content platform, obtains audio data from the data acquisition and parsing module, inputs the audio data into the pre-cache pool, and at the same time copies each frame to the data forwarding service;
the master playing end requests data from the pre-cache pool; it outputs no data until the pool holds 10 frames, and begins outputting data for playback once the pool holds more than 10 frames;
after a slave playing end device starts, it enters a listening state, waiting for audio data and signaling messages;
after the slave playing end receives the synchronization reference time signaling message for the first time, a timer is initialized with the synchronization reference time Ref_Time and the network transmission latency, the starting time of the timer being Start_Time = Ref_Time + Network_Latency.
after the slave playing end receives audio data, it continuously feeds the data to a decoder for decoding; the decoder outputs decoded audio frame data, and the playback control module calls the synchronization control module to judge the synchronization state of each decoded frame, the judgment using parameters such as Tolerance_TimeMax, the timer time, the network transmission latency, and the device latency;
if a frame is judged in sync, it is sent directly to the device for sound playback;
if the slave playing end is 200 milliseconds ahead, the slow-play synchronization logic is started with two levels, Speed_Factor = 0.5 and Speed_Factor = 0.8: playback first slows to 0.5x speed; when the playback lead shrinks to 100 milliseconds it switches to 0.8x speed; and when the lead falls below Tolerance_TimeMin, the synchronization processing is judged complete, the synchronized state is entered, and Speed_Factor = 1.0 restores normal-speed playback;
if the slave playing end is 300 milliseconds behind, the fast-play synchronization logic is started with two levels, Speed_Factor = 1.2 and Speed_Factor = 1.5: playback first speeds up to 1.5x speed; when the playback lag shrinks to 150 milliseconds it switches to 1.2x speed; and when the lag falls below Tolerance_TimeMin, the synchronization processing is judged complete, the synchronized state is entered, and Speed_Factor = 1.0 restores normal-speed playback;
during its playback, the slave playing end receives a synchronization reference time at regular intervals and updates the timer with the latest synchronization reference time and the network transmission latency, the updated time of the timer being Update_Time = Ref_Time_New + Network_Latency;
finally, the master and slave playing ends are closed and playback ends.
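The timer handling in the steps above can be sketched as follows (Python; extrapolating between reference updates with a local monotonic clock is an assumption, since the text only specifies the initialization and update formulas):

```python
import time

class SlaveTimer:
    """Slave-side playback clock driven by the master's PTS-based reference
    timestamps (Ref_Time) plus the measured network latency. Names follow
    the text; all times are in milliseconds."""

    def __init__(self, network_latency_ms):
        self.network_latency = network_latency_ms
        self.base = None        # last Ref_Time + Network_Latency
        self.base_at = None     # local monotonic time when it was applied

    def on_reference(self, ref_time_ms, now_ms=None):
        # Start_Time / Update_Time = Ref_Time + Network_Latency
        self.base = ref_time_ms + self.network_latency
        self.base_at = time.monotonic() * 1000 if now_ms is None else now_ms

    def clock_time(self, now_ms=None):
        if self.base is None:
            return None         # timer not started: wait for first Ref_Time
        now = time.monotonic() * 1000 if now_ms is None else now_ms
        return self.base + (now - self.base_at)
```

Every reference message re-anchors the clock, so drift between the master's and a slave's time base is bounded by the update interval.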
Referring to FIG. 1, an audio sharing synchronization system according to an embodiment of the present invention includes a master playing end and at least one slave playing end,
wherein the master playing end is the device among the multi-screen devices that requests audio data from a local file or a media content service platform, and the slave playing ends are the other devices that receive the shared audio data; a data forwarding service is used to share and transmit the audio data between the master and slave playing ends, and a signaling forwarding service is used to transmit signaling messages between the master and slave playing end devices;
the master playing end further comprises a master playing end data acquisition and parsing module, a master playing end playback control module, and a master playing end signaling transceiver module; each slave playing end further comprises a slave playing end data receiving module, a slave playing end data acquisition and parsing module, a slave playing end playback control module, a slave playing end signaling transceiver module, and a slave playing end synchronization control module;
the master playing end starts playing a local audio/video file or a network streaming-media video of the media content platform, obtains audio data from the master playing end data acquisition and parsing module, and inputs it into the pre-cache pool of the playback control module; data is requested from the pre-cache pool, and once the data in the pool exceeds the minimum cache threshold, data is output for playback while a copy of each frame is passed to the data forwarding service and sent to the slave playing ends;
after a slave playing end device starts, it enters a listening state, waiting for audio data and signaling messages;
once the master playing end starts playing, it sends a synchronization reference timestamp to the slave playing ends at a certain frequency; after a slave playing end receives the synchronization reference time signaling message for the first time, its synchronization control module initializes a timer with the synchronization reference time and the network transmission latency, the starting time of the timer being the sum of the two; during its playback, the slave playing end keeps receiving the reference timestamps sent by the master playing end at regular intervals and, on each receipt, updates the timer with the latest synchronization reference time and the network transmission latency;
the decoder in the slave playing end playback control module decodes the received data and outputs decoded audio frame data; the slave playing end synchronization control module is called to judge the synchronization state of each decoded frame, and if the frame is judged in sync, it is sent directly to the device for sound playback; if the slave playing end is ahead, the slow-play synchronization logic of the speed-change algorithm is started; and if the slave playing end is behind, the fast-play synchronization logic of the speed-change algorithm is started.
In an embodiment of the present invention, the pre-cache pool at the master playing end can be configured with a minimum cache threshold Cache_Min and a maximum capacity Cache_Capacity. Playback starts only after the data in the pool exceeds Cache_Min, and during playback the pool always retains at least Cache_Min frames; when the data source finishes playing, everything remaining in the pool is output. Each frame input to the pool is simultaneously copied to the data forwarding service and sent to the slave playing ends. Both Cache_Capacity and Cache_Min are configurable. The purpose of the Cache_Min threshold is to ensure that the master and slave ends keep a certain amount of redundant data with which to handle synchronization and network jitter.
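A minimal sketch of such a pre-cache pool (Python; the class and method names are ours, and frame objects are treated as opaque):

```python
from collections import deque

class PreCachePool:
    """Pre-cache pool with a minimum threshold (Cache_Min) and a maximum
    capacity (Cache_Capacity), as described above."""

    def __init__(self, cache_min=10, cache_capacity=50):
        self.cache_min = cache_min
        self.frames = deque(maxlen=cache_capacity)
        self.source_ended = False   # set True when the data source is exhausted

    def push(self, frame, forward):
        self.frames.append(frame)
        forward(frame)   # each input frame is also copied to the forwarding service

    def pop(self):
        """Return a frame for playback, or None while priming the pool.
        Frames are released only while more than Cache_Min are buffered,
        except at end of stream, when everything remaining is drained."""
        if self.source_ended or len(self.frames) > self.cache_min:
            return self.frames.popleft() if self.frames else None
        return None
```

Keeping at least Cache_Min frames in reserve is what gives the master and slave ends the slack needed to absorb speed corrections and network jitter.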
In the prior art, when the playback progress of the master and slave playing ends diverges during audio playback, the usual remedy is to wait when ahead and to drop frames to catch up when behind. Audio, however, is characterized by its continuity, and both waiting and frame-dropping break that continuity, making the audio sound choppy and uneven. With the audio sharing synchronization method provided by this embodiment, the presentation timestamp (PTS) built into the audio data is used directly, without any third-party time system, so millisecond-level audio synchronization of the same media content across multi-screen devices is achieved without depending on a dedicated time server. Only the master playing end requests data from the content platform, and the audio data is forwarded between screens by the built-in data forwarding service, greatly reducing the traffic drawn from the content platform. The synchronization processing uses a speed-change algorithm, which preserves the continuity of the audio, and by configuring several speed levels the correction can be made smooth and imperceptible.
In an embodiment of the present invention, the synchronization reference timestamp is neither the system time of the master playing end nor the server time of a time server, but the presentation timestamp (PTS) of the audio data: the master playing end sends its latest audio presentation timestamp to the signaling forwarding service, and the signaling forwarding service delivers the received timestamp to all slave playing ends at the same time, ensuring that the timestamps received by all slave playing ends are identical on every update.
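The fan-out behavior of the signaling forwarding service can be illustrated as follows (Python; the callback-based interface is an assumption made for the sketch):

```python
class SignalingForwarder:
    """Minimal sketch of the signaling forwarding service: the master pushes
    its latest audio PTS, and the identical value is fanned out to every
    registered slave playing end."""

    def __init__(self):
        self.slaves = []

    def register(self, on_ref_time):
        self.slaves.append(on_ref_time)   # callback invoked per reference update

    def publish(self, pts_ms):
        for deliver in self.slaves:
            deliver(pts_ms)               # all slaves receive the same timestamp
```

Because every slave receives the same PTS value per update, differences between slave clocks reduce to differences in network latency, which each slave compensates for separately.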
In an embodiment of the present invention, after a slave playing end receives audio data it feeds the data to a decoder for decoding, and after the slave playing end receives the first synchronization timestamp a timer is started for playback synchronization control. Define the timestamp of an audio frame as Frame_Time, the time of the timer as Clock_Time, the acceptable synchronization error range as Tolerance_Time (settable according to actual requirements), the network transmission latency as Network_Latency, the synchronization timestamp as Ref_Time, and the device hardware latency as Device_Latency, where the device hardware latency is the time consumed from the input of decoded audio data until sound plays from the device's sound card. The synchronization logic proceeds as follows:
(1) After the slave playing end obtains the first frame of decoded data, if the timer has not yet started, it waits for the first synchronization reference timestamp to arrive and then starts the timer;
(2) When the timer starts, it is initialized with the reference time Ref_Time and the network transmission latency Network_Latency, so the timer's starting time is Start_Time = Ref_Time + Network_Latency. During playback, the master playing end sends the latest reference time to the slave playing ends at a certain frequency; whenever a slave playing end receives a new reference time, it immediately updates the timer's clock to Update_Time = Ref_Time_New + Network_Latency, so that the timestamps of all slave playing ends stay consistent with the master playing end's timestamp and master and slaves play on the same time base.
(3) The upper device synchronization threshold is A_Time = Clock_Time + Tolerance_Time + Device_Latency. When Frame_Time > A_Time, the current audio frame is ahead of the audio the master playing end is playing, and the playback rate must be slowed down until resynchronization;
(4) The lower device synchronization threshold is B_Time = Clock_Time - Tolerance_Time + Device_Latency. When Frame_Time < B_Time, the current audio frame is behind the audio the master playing end is playing, and the playback rate must be sped up until resynchronization;
(5) When B_Time < Frame_Time < A_Time, that is, Frame_Time lies between B_Time and A_Time, the current audio frame is within the allowable synchronization error of the frame the master playing end is playing and is in the synchronized state; the audio then only needs to be played at the normal frame rate.
Further, in an embodiment of the present invention, the following strategy is adopted to implement the synchronization processing for the ahead and behind cases:
First, the synchronization error threshold is refined into a maximum synchronization error threshold Tolerance_TimeMax and a minimum synchronization error threshold Tolerance_TimeMin, with Tolerance_TimeMax > Tolerance_TimeMin. Tolerance_TimeMax is used to judge whether playback is in the synchronized state, while Tolerance_TimeMin is used to judge, during synchronization processing, whether synchronization has been regained so that the processing can end. This arrangement prevents the borderline case, possible with only one threshold, of frequent transitions between the synchronized and unsynchronized states. Neither threshold is fixed; both can be configured according to the actual situation, as long as Tolerance_TimeMax > Tolerance_TimeMin holds.
Second, to handle playback that is ahead or behind while preserving the continuity of the audio, an audio speed-change (time-stretching) algorithm is adopted: it changes the playback speed of the sound without altering the sampling rate, fundamental frequency, or formants of the voice, achieving speed change without pitch change. In addition, several speed values of different levels can be defined according to the severity of the desynchronization, making the synchronization processing smooth and barely perceptible to the user; the speed value is denoted Speed_Factor.
When playback is ahead, Frame_Time > A0_Time, where A0_Time = Clock_Time + Tolerance_TimeMax + Device_Latency. A Speed_Factor < 1.0 is then configured, and synchronization is achieved by slowing the playback rate. When, during synchronization processing, Frame_Time < A1_Time is reached, where A1_Time = Clock_Time + Tolerance_TimeMin + Device_Latency, the synchronized state is judged to have been regained, the slow-play synchronization processing ends, and playback resumes at the normal rate with Speed_Factor = 1.0.
When playback is behind, Frame_Time < B0_Time, where B0_Time = Clock_Time - Tolerance_TimeMax + Device_Latency. A Speed_Factor > 1.0 is then configured, and synchronization is achieved by speeding up the playback rate. When, during synchronization processing, Frame_Time > B1_Time is reached, where B1_Time = Clock_Time - Tolerance_TimeMin + Device_Latency, the synchronized state is judged to have been regained, the fast-play synchronization processing ends, and playback resumes at the normal rate with Speed_Factor = 1.0.
During synchronization processing, a multi-level Speed_Factor can be customized according to the actual severity of the desynchronization, so that the correction proceeds in steps, playback sounds smoother while it runs, and the user experience is better.
The master playing end data acquisition and parsing module has two functions: first, it downloads data from the media server; second, it parses the audio data's packet format, container (encapsulation) layer format, and encoding format, finally outputting the not-yet-decoded audio data frame by frame. The slave playing end data acquisition and parsing module only needs to parse the packet format, container layer format, and encoding format of the audio data, likewise outputting the not-yet-decoded audio data frame by frame.
It should be understood that the exemplary embodiments described herein are illustrative and not limiting. Although one or more embodiments of the present invention have been described with reference to the accompanying drawings, it will be understood by those of ordinary skill in the art that various changes in form and details may be made therein without departing from the spirit and scope of the present invention as defined by the following claims.

Claims (10)

1. An audio sharing synchronization method, which is suitable for sharing audio content between multi-screen devices, is characterized by comprising the following steps:
setting, among the multi-screen devices, the device responsible for requesting audio data from a local file or a media content service platform as the master playing end, and the other devices that receive the shared audio data as slave playing ends;
setting a data forwarding service and a signaling forwarding service, wherein the data forwarding service is used for sharing and transmitting audio data between the master playing end and the slave playing ends, and the signaling forwarding service is used for transmitting signaling messages between the master and slave playing end devices;
starting, at the master playing end, playback of a local audio/video file or a network streaming-media video of a media content platform, obtaining audio data from a data acquisition and parsing module of the master playing end, inputting the audio data into a pre-cache pool, and at the same time copying the data to the data forwarding service for transmission to the slave playing ends;
requesting data from the pre-cache pool, and starting to output data for playback when the data in the pre-cache pool exceeds a minimum cache threshold;
after a slave playing end device starts, entering a listening state, waiting for audio data and signaling messages;
after the master playing end starts playing, sending a synchronization reference timestamp to the slave playing ends at a certain frequency; after a slave playing end receives the synchronization reference time signaling message for the first time, initializing a timer with the synchronization reference time and the network transmission latency, the starting time of the timer being the sum of the reference time and the network transmission latency; the slave playing end receiving, during its playback, the synchronization reference timestamps sent by the master playing end at regular intervals and, on each receipt, updating the timer with the latest synchronization reference time and the network transmission latency;
after the slave playing end receives the audio data, continuously feeding it to a decoder in the playback control module of the slave playing end for decoding, the decoder outputting decoded audio frame data, and the playback control module calling a synchronization control module to judge the synchronization state of the decoded audio frame data; if judged in sync, sending the frame directly to the device for sound playback; if the slave playing end is ahead, starting the slow-play synchronization logic of a speed-change algorithm; and if the slave playing end is behind, starting the fast-play synchronization logic of the speed-change algorithm.
2. The audio sharing synchronization method of claim 1, wherein starting the slow-play synchronization logic if the slave playing end is ahead comprises: setting a first slow-play speed and a second slow-play speed that are both smaller than the normal playback multiple; playing slowly first at the smaller first slow-play speed; switching to the larger second slow-play speed when the playback lead time difference is 1-2 times the maximum synchronization error time; and, when the playback lead time difference falls below the minimum synchronization error time, judging the synchronization processing complete, entering the synchronized state, and resuming playback at the normal speed.
3. The audio sharing synchronization method of claim 1, wherein starting the fast-play synchronization logic if the slave playing end is behind comprises: setting a first fast-play speed and a second fast-play speed that are both larger than the normal playback multiple; playing fast first at the larger first fast-play speed; switching to the smaller second fast-play speed when the playback lag time difference is 1-2 times the maximum synchronization error time; and, when the playback lag time difference falls below the minimum synchronization error time, judging the synchronization processing complete, entering the synchronized state, and resuming playback at the normal speed.
4. The audio sharing synchronization method of claim 1, wherein, after the slave playing end obtains the first frame of decoded data, if the timer has not started, the slave playing end waits for the first synchronization reference timestamp to arrive and then starts the timer.
5. The audio sharing synchronization method of claim 1, wherein the synchronization reference timestamp is the latest audio presentation timestamp of the audio data at the master playing end; the master playing end sends the latest audio presentation timestamp to the signaling forwarding service, and the signaling forwarding service delivers the received timestamp to all slave playing ends at the same time.
6. An audio sharing synchronization system is characterized by comprising a master playing end and at least one slave playing end,
wherein the master playing end is the device among the multi-screen devices that requests audio data from a local file or a media content service platform, and the slave playing ends are the other devices that receive the shared audio data; a data forwarding service is used to share and transmit the audio data between the master and slave playing ends, and a signaling forwarding service is used to transmit signaling messages between the master and slave playing end devices;
the master playing end further comprises a master playing end data acquisition and parsing module, a master playing end playback control module, and a master playing end signaling transceiver module; each slave playing end further comprises a slave playing end data receiving module, a slave playing end data acquisition and parsing module, a slave playing end playback control module, a slave playing end signaling transceiver module, and a slave playing end synchronization control module;
the master playing end starts playing a local audio/video file or a network streaming-media video of a media content platform, obtains audio data from the master playing end data acquisition and parsing module, and inputs it into the pre-cache pool of the playback control module; data is requested from the pre-cache pool, and when the data in the pool exceeds a minimum cache threshold, data is output for playback while a copy is passed to the data forwarding service and sent to the slave playing ends; after a slave playing end device starts, it enters a listening state, waiting for audio data and signaling messages;
after the master playing end starts playing, it sends a synchronization reference timestamp to the slave playing ends at a certain frequency; after a slave playing end receives the synchronization reference time signaling message for the first time, its synchronization control module initializes a timer with the synchronization reference time and the network transmission latency, the starting time of the timer being the sum of the two; during its playback, the slave playing end receives the synchronization reference timestamps sent by the master playing end at regular intervals and, on each receipt, updates the timer with the latest synchronization reference time and the network transmission latency;
the decoder in the slave playing end playback control module decodes the received data and outputs decoded audio frame data; the slave playing end synchronization control module is called to judge the synchronization state of the decoded audio frame data, and if judged in sync, the frame is sent directly to the device for sound playback; if the slave playing end is ahead, the slow-play synchronization logic of a speed-change algorithm is started; and if the slave playing end is behind, the fast-play synchronization logic of the speed-change algorithm is started.
7. The audio sharing synchronization system of claim 6, wherein starting the slow-play synchronization logic if the slave playing end is ahead comprises: setting a first slow-play speed and a second slow-play speed that are both smaller than the normal playback multiple; playing slowly first at the smaller first slow-play speed; switching to the larger second slow-play speed when the playback lead time difference is 1-2 times the maximum synchronization error time; and, when the playback lead time difference falls below the minimum synchronization error time, judging the synchronization processing complete, entering the synchronized state, and resuming playback at the normal speed.
8. The audio sharing synchronization system of claim 6, wherein starting the fast-play synchronization logic if the slave playing end is behind comprises: setting a first fast-play speed and a second fast-play speed that are both larger than the normal playback multiple; playing fast first at the larger first fast-play speed; switching to the smaller second fast-play speed when the playback lag time difference is 1-2 times the maximum synchronization error time; and, when the playback lag time difference falls below the minimum synchronization error time, judging the synchronization processing complete, entering the synchronized state, and resuming playback at the normal speed.
9. The audio sharing synchronization system of claim 6, wherein, after the slave playing end obtains the first frame of decoded data, if the timer has not started, the slave playing end waits for the first synchronization reference timestamp to arrive and then starts the timer.
10. The audio sharing synchronization system of claim 6, wherein the synchronization reference timestamp is the latest audio presentation timestamp of the audio data at the master playing end; the master playing end sends the latest audio presentation timestamp to the signaling forwarding service, and the signaling forwarding service delivers the received timestamp to all slave playing ends at the same time.
CN202210768625.7A 2022-06-30 2022-06-30 Audio sharing synchronization method and system Pending CN117376622A (en)


Publications (1)

Publication Number Publication Date
CN117376622A 2024-01-09



Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination