WO2022142481A1 - Method for processing audio and video data, live broadcast device, electronic device and storage medium - Google Patents

Method for processing audio and video data, live broadcast device, electronic device and storage medium

Info

Publication number
WO2022142481A1
WO2022142481A1 (application PCT/CN2021/118485, CN2021118485W)
Authority
WO
WIPO (PCT)
Prior art keywords
frame
audio
module
timestamp
type
Prior art date
Application number
PCT/CN2021/118485
Other languages
English (en)
French (fr)
Inventor
程文波
葛天杰
张�林
孟环宇
贾宇宁
王康茂
尹洪福
阎云逸
Original Assignee
杭州星犀科技有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority claimed from CN202011637767.7A external-priority patent/CN112822505B/zh
Priority claimed from CN202120826728.5U external-priority patent/CN215072677U/zh
Priority claimed from CN202110611537.1A external-priority patent/CN113055718B/zh
Priority claimed from CN202110643677.7A external-priority patent/CN113365094A/zh
Application filed by 杭州星犀科技有限公司
Priority to CN202180087403.2A priority Critical patent/CN116762344A/zh
Publication of WO2022142481A1 publication Critical patent/WO2022142481A1/zh
Priority to US18/345,209 priority patent/US20230345089A1/en

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/80Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
    • H04N21/85Assembly of content; Generation of multimedia applications
    • H04N21/854Content authoring
    • H04N21/8547Content authoring involving timestamps for synchronizing content
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/21Server components or server architectures
    • H04N21/218Source of audio or video content, e.g. local disk arrays
    • H04N21/2187Live feed
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/23Processing of content or additional data; Elementary server operations; Server middleware
    • H04N21/236Assembling of a multiplex stream, e.g. transport stream, by combining a video stream with other content or additional data, e.g. inserting a URL [Uniform Resource Locator] into a video stream, multiplexing software data into a video stream; Remultiplexing of multiplex streams; Insertion of stuffing bits into the multiplex stream, e.g. to obtain a constant bit-rate; Assembling of a packetised elementary stream
    • H04N21/2368Multiplexing of audio and video streams
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/23Processing of content or additional data; Elementary server operations; Server middleware
    • H04N21/238Interfacing the downstream path of the transmission network, e.g. adapting the transmission rate of a video stream to network bandwidth; Processing of multiplex streams
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/27Server based end-user applications
    • H04N21/274Storing end-user multimedia data in response to end-user request, e.g. network recorder
    • H04N21/2743Video hosting of uploaded data from client
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/41Structure of client; Structure of client peripherals
    • H04N21/422Input-only peripherals, i.e. input devices connected to specially adapted client devices, e.g. global positioning system [GPS]
    • H04N21/42203Input-only peripherals, i.e. input devices connected to specially adapted client devices, e.g. global positioning system [GPS] sound input device, e.g. microphone
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/4302Content synchronisation processes, e.g. decoder synchronisation
    • H04N21/4307Synchronising the rendering of multiple content streams or additional data on devices, e.g. synchronisation of audio on a mobile phone with the video output on the TV screen
    • H04N21/43072Synchronising the rendering of multiple content streams or additional data on devices, e.g. synchronisation of audio on a mobile phone with the video output on the TV screen of multiple content streams on the same device
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/434Disassembling of a multiplex stream, e.g. demultiplexing audio and video streams, extraction of additional data from a video stream; Remultiplexing of multiplex streams; Extraction or processing of SI; Disassembling of packetised elementary stream
    • H04N21/4341Demultiplexing of audio and video streams
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/436Interfacing a local distribution network, e.g. communicating with another STB or one or more peripheral devices inside the home
    • H04N21/4363Adapting the video stream to a specific local network, e.g. a Bluetooth® network
    • H04N21/43632Adapting the video stream to a specific local network, e.g. a Bluetooth® network involving a wired protocol, e.g. IEEE 1394
    • H04N21/43635HDMI
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/44Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs
    • H04N21/4402Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs involving reformatting operations of video signals for household redistribution, storage or real-time display
    • H04N21/440281Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs involving reformatting operations of video signals for household redistribution, storage or real-time display by altering the temporal resolution, e.g. by frame skipping
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/60Network structure or processes for video distribution between server and client or between remote clients; Control signalling between clients, server and network components; Transmission of management data between server and client, e.g. sending from server to client commands for recording incoming content stream; Communication details between server and client 
    • H04N21/63Control signaling related to video distribution between client, server and network components; Network processes for video distribution between server and clients or between remote clients, e.g. transmitting basic layer and enhancement layers over different transmission paths, setting up a peer-to-peer communication via Internet between remote STB's; Communication protocols; Addressing
    • H04N21/647Control signaling between network components and server or clients; Network processes for video distribution between server and clients, e.g. controlling the quality of the video stream, by dropping packets, protecting content from unauthorised alteration within the network, monitoring of network load, bridging between two different networks, e.g. between IP and wireless
    • H04N21/64746Control signals issued by the network directed to the server or the client
    • H04N21/64761Control signals issued by the network directed to the server or the client directed to the server
    • H04N21/64769Control signals issued by the network directed to the server or the client directed to the server for rate control

Definitions

  • the present application relates to the technical field of media streaming, and in particular, to a method for processing audio and video data, a live broadcast device, an electronic device, and a storage medium.
  • In order to transmit multimedia data in real time over the Internet, the multimedia data must first be streamed.
  • Streaming performs the necessary encapsulation processing on the multimedia data and packages the audio and video data into RTP (Real-time Transport Protocol) packets, so as to realize streaming media transmission of the multimedia data.
  • Live video is broadcast in real time using the Internet and streaming media technology.
  • Video has gradually become the mainstream form of expression on the Internet because it combines rich elements such as images, text and sound.
  • Internet live broadcast adopts real-time streaming transmission technology.
  • The host starts the live broadcast, encodes and compresses the live content, and transmits it to the website server.
  • This process is called “pushing the stream”, that is, the video content is pushed to the server.
  • When users watch the live broadcast, they pull the live content directly from the website server.
  • This process is called “pulling the stream”.
  • After the corresponding media stream is obtained by pulling, it can be decoded and played locally.
  • The decoding and playback process depends on the timestamps carried by the audio and video frames in the media stream; if the original acquisition is not uniform or the timestamp sequence output by the encoder is uneven, the timestamps carried by the audio and video frames in the media stream will be non-uniform, resulting in abnormal playback on the player side.
  • An embodiment of the present application provides a method for processing audio and video data, the method comprising:
  • the media stream is an audio and video stream
  • the audio and video stream includes a video stream and an audio stream
  • before updating the current media frame timestamp to the sum of the previous media frame timestamp and the standard media frame interval, the method further includes:
  • performing forward compensation or backward compensation on the updated timestamp of the current media frame according to the compensation coefficient includes:
  • the forward compensation uses the sum of the updated current media frame timestamp and the compensation coefficient as the current media frame target timestamp;
  • the reverse compensation uses the difference between the updated current media frame timestamp and the compensation coefficient as the current media frame target timestamp.
  • the method further includes:
  • the previous media frame timestamp is updated according to the current media frame target timestamp, and the updated previous media frame timestamp is used as the previous media frame timestamp for the next media frame.
  • obtaining the upper and lower limit range includes obtaining the upper and lower limit range according to the standard media frame interval and fluctuation upper and lower limit coefficients, wherein the fluctuation upper and lower limit coefficients are smaller than the fluctuation range that can be tolerated by the decoder at the playback end.
  • obtaining the standard media frame interval of the media stream includes:
  • if the media stream is a video stream, a standard video frame interval is obtained according to the frame rate of the video stream, and the standard video frame interval is used as the standard media frame interval;
  • if the media stream is an audio stream, a standard audio frame interval is obtained according to the sampling rate of the audio stream and the actual number of audio sampling points per frame, and the standard audio frame interval is used as the standard media frame interval.
  • the method further includes:
  • the frame loss operation is performed.
  • the type frame includes at least a first type frame and a second type frame
  • the frame dropping operation includes:
  • the frames of the second type in the queue are discarded in descending order of time stamps.
  • the type frame includes at least a first type frame and a second type frame
  • secondary weights are established for the frames of the second type according to their order of importance
  • the frame dropping operation includes:
  • the frames of the second type in the queue are discarded in ascending order of secondary weight.
  • the method further includes:
  • the maximum time interval difference between the timestamps of two frames of the dropped type remaining in the queue is repeatedly calculated and compared with the frame loss judgment threshold corresponding to that type of frame, until the maximum time interval difference between the timestamps of two frames of that type in the queue is not greater than the frame loss judgment threshold corresponding to that type of frame, at which point the frame dropping operation is stopped.
  • the method further includes:
  • the stacking ratio is the ratio of the maximum time interval difference between the timestamps of two frames of a given type in the queue to the frame loss judgment threshold of that type of frame;
  • the maximum time interval difference between the timestamps of two frames of the dropped type remaining in the queue is repeatedly calculated.
  • the frame dropping operation is stopped.
  • the method further includes:
  • the server creates push tasks for each target live broadcast platform
  • the server distributes the user's audio and video data to multiple target live broadcast platforms.
  • the method further includes:
  • the server generates live broadcast platform configuration data matching the corresponding live broadcast platform demand information in response to the user's configuration instruction based on the interaction information.
  • sending live broadcast platform configuration interaction information to the user includes:
  • the server responds to the user's configuration instruction based on the interaction information, including:
  • when the binding account information request is authorized, receiving user selection data and sending configuration data to the target live broadcast platform, where the configuration data includes privacy setting indication information and audio and video release setting information;
  • the server completes the setting according to the user's selection data and stores the user's configuration data for the target live broadcast platform.
  • the method further includes:
  • the server receives and stores the address of the live broadcast room created by the streaming task.
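  • To make the distribution flow above more concrete, the following is a purely illustrative sketch of the kind of per-platform push-task configuration a server might keep; all field names are hypothetical assumptions and not part of the patent:

```python
# Purely illustrative: one possible shape for the per-platform push-task
# configuration described above (all field names are hypothetical).
from dataclasses import dataclass, field

@dataclass
class PlatformPushTask:
    platform_name: str                                     # target live broadcast platform
    bound_account: str                                     # account authorized by the user
    privacy_setting: str                                   # privacy setting indication information
    release_settings: dict = field(default_factory=dict)   # audio and video release settings
    live_room_url: str = ""                                # address of the created live broadcast room

# The server would create one such push task per target platform and
# distribute the user's audio and video data to each of them.
```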
  • An embodiment of the present application further provides a live broadcast device; the live broadcast device includes an audio processing module, a device interface module and a processor module; the audio processing module includes an audio input interface and an audio processing chip; the audio input interface is used for connecting a microphone; the audio processing chip is respectively connected to the audio input interface, the device interface module and the processor module; the audio processing chip performs noise reduction and/or sound mixing processing on the audio data input through the audio input interface and/or the device interface module, and transmits the processed audio data to the processor module;
  • the processor module includes a time stamp homogenization processing unit, the time stamp homogenization processing unit includes an acquisition module, a judgment module, a compensation module, an adjustment module and an output module, the acquisition module is connected with the judgment module, and the The judgment module is connected with the adjustment module, the adjustment module is connected with the compensation module, and the compensation module is connected with the output module;
  • the obtaining module is used to obtain a media stream, wherein the media stream is an audio and video stream;
  • the judging module is used to obtain the difference between the timestamp of the current media frame and the timestamp of the previous media frame as well as the upper and lower limit range of the difference, and to determine whether the difference is within the upper and lower limit range; if the judgment result is yes, the output module outputs the current media frame timestamp as the current media frame target timestamp; if the judgment result is no, the compensation module is used to obtain the standard media frame interval of the media stream, and the adjustment module is configured to update the current media frame timestamp to the sum of the previous media frame timestamp and the standard media frame interval;
  • the judging module is further configured to judge whether the difference is greater than the standard media frame interval; if the judgment result is yes, the compensation module performs forward compensation on the updated current media frame timestamp according to the compensation coefficient; if the judgment result is no, the compensation module performs reverse compensation on the updated current media frame timestamp according to the compensation coefficient;
  • the output module is configured to output the timestamp after the forward compensation or the backward compensation as the target timestamp of the current media frame.
  • the device interface module includes an HDMI interface module and/or a USB interface module, wherein the HDMI interface module includes at least one HDMI input interface, the USB interface module includes at least one USB interface, and the HDMI input interface and the USB interface are respectively connected to the audio processing chip.
  • the HDMI interface module further includes at least one first format converter; the first format converter connects the HDMI input interface and the processor module, converts the data input through the HDMI input interface from HDMI format to MIPI format, and transmits the MIPI-format data to the processor module, wherein the data input through the HDMI input interface includes video data and/or audio data.
  • the USB interface module includes a first USB interface and a second USB interface; the first USB interface is connected to the audio processing chip through the processor module and is used to input audio data to the audio processing chip; the second USB interface is connected to the processor module for system debugging.
  • the processor module includes a USB port
  • multiple first USB interfaces are provided
  • the USB interface module further includes an interface expander, one end of which is connected to the USB port and the other end of which is connected to the plurality of first USB interfaces.
  • the audio input interface includes an active input interface and a passive input interface, wherein the active input interface is used for connecting an active microphone, and the passive input interface is used for connecting a passive microphone .
  • the audio processing module further includes an audio output interface, the audio output interface is connected to the audio processing chip, and is used for outputting the processed audio data.
  • the live broadcast apparatus further includes a display module, the display module includes a display screen and a second format converter, the second format converter is connected to the processor module and the display screen, and
  • the processor module outputs data in MIPI format
  • the second format converter converts the data in MIPI format into LVDS format
  • the display screen displays the LVDS-format data, wherein the MIPI-format data output by the processor module includes video data.
  • the display screen includes a touch screen
  • the USB interface module includes a third USB interface
  • the third USB interface connects the interface expander and the touch screen.
  • the live broadcast device further includes a data output module; the data output module includes a third format converter and an HDMI output interface; the third format converter is connected to the processor module and the HDMI output interface, converts the data output by the processor module from MIPI format to HDMI format, and transmits the HDMI-format data to the HDMI output interface, wherein the data output by the processor module includes video data and audio data.
  • the processor module further includes an audio and video frame dropping unit, and the audio and video frame dropping unit includes:
  • a determination module used to determine the weight coefficient corresponding to each type of frame in the audio and video stream
  • the calculation module is used to calculate the frame loss judgment threshold corresponding to each type of frame according to the weight coefficient of each type of frame and the queue capacity of the queue;
  • the frame dropping module is used to execute the frame dropping operation if, at the sending moment of any type of frame, the maximum time interval difference between the timestamps of two frames of that type in the queue is greater than the frame dropping judgment threshold corresponding to that type of frame.
  • the processor module further includes an audio and video frame dropping unit, and the audio and video frame dropping unit includes:
  • an external dynamic parameter setter, used to set the weight coefficients of audio frames and video frames and to set the frame loss judgment threshold parameter;
  • the parameter collector is used to collect parameters related to frame loss judgment, including weight coefficient, queue capacity, and frame loss judgment threshold parameters;
  • the parameter calculator is used to obtain the frame loss judgment thresholds of the various types of frames from the collected parameters according to the calculation rules;
  • the frame loss judger is used to first obtain the frame loss judgment threshold of this type of frame, then calculate the maximum time interval difference between the timestamps of two frames of this type in the queue, and compare it with the frame loss judgment threshold to make the judgment;
  • the frame drop executor is used to drop frames of this type in the queue in descending order according to the timestamps when the frame drop determiner determines that the frame drop operation is performed.
  • a frame loss determiner which repeatedly calculates the maximum time interval difference between the timestamps of the current two frames in the lost type frames in the queue, and performs frame loss determination.
  • Embodiments of the present application further provide an electronic device, including a memory and a processor, where a computer program is stored in the memory, and the processor is configured to run the computer program to execute any of the methods described above.
  • An embodiment of the present application further provides a storage medium, where a computer program is stored in the storage medium, wherein the computer program is configured to execute any of the above methods when running.
  • The media frames obtained through the system API may not be evenly spaced: there may be queue buffers inside the encoder, and the encoding of media frames also introduces time differences, so if timestamps are not marked at collection and the timestamp sequence output by the encoder is used instead, the sequence may be uneven; alternatively, when the collected data is obtained by further decoding audio files, the decoding time of each frame may be uneven, resulting in uneven acquisition; all of these make the timestamps carried by the audio and video frames in the media stream non-uniform, resulting in abnormal playback on the player side.
  • the current media frame timestamp is updated to the sum of the previous media frame timestamp and the standard media frame interval.
  • the frame interval between the updated current media frame timestamp and the next media frame timestamp will increase, so the updated current media frame timestamp is forward compensated according to the compensation coefficient.
  • the difference between the target timestamp of the media frame and the timestamp of the previous media frame is within the fluctuation range of the standard media frame interval, and the frame interval between the updated current media frame timestamp and the next media frame timestamp is also reduced.
  • the timestamp of each media frame is corrected and compensated, which not only solves the problem that non-uniform timestamps carried by the audio and video frames in the media stream cause abnormal playback at the playback end, but also balances the accumulated error through forward compensation and reverse compensation, preventing the accumulated error from growing larger and larger while the timestamp sequence is adjusted; this improves the compatibility of audio and video and has standardization value.
  • FIG. 1 is a flowchart of a method for processing audio and video data according to Embodiment 1 of the present application;
  • FIG. 2 is a flowchart of another method for processing audio and video data according to Embodiment 1 of the present application;
  • FIG. 3 is a structural block diagram of a time stamp homogenization processing unit according to Embodiment 1 of the present application.
  • FIG. 4 is a schematic diagram of a frame loss process according to Embodiment 2 of the present application.
  • FIG. 5 is a schematic flow chart of the configuration flow of audio and video streaming data according to Embodiment 3 of the present application.
  • FIG. 6 is a schematic diagram of an application environment of a live broadcast device according to Embodiment 4 of the present application.
  • FIG. 7 is a schematic diagram of a first live broadcast apparatus according to Embodiment 4 of the present application.
  • FIG. 8 is a schematic diagram of a second live broadcast apparatus according to Embodiment 4 of the present application.
  • FIG. 9 is a schematic diagram of a third live broadcast apparatus according to Embodiment 4 of the present application.
  • FIG. 10 is a schematic diagram of a fourth live broadcast apparatus according to Embodiment 4 of the present application.
  • FIG. 11 is a schematic diagram of a fifth live broadcast apparatus according to Embodiment 4 of the present application.
  • FIG. 12 is a schematic diagram of a sixth live broadcast apparatus according to Embodiment 4 of the present application.
  • FIG. 13 is a schematic diagram of a seventh live broadcast apparatus according to Embodiment 4 of the present application.
  • FIG. 14 is a schematic diagram of an internal structure of an electronic device according to Embodiment 5 of the present application.
  • FIG. 1 is a flowchart of a method for processing audio and video data according to Embodiment 1 of the present application. As shown in FIG. 1 , the method includes the following steps:
  • Step S101: acquiring a media stream, wherein the media stream is an audio and video stream, the audio and video stream includes a video stream and an audio stream, and streaming media refers to media that is played over the Internet using streaming transmission;
  • Step S102: obtaining the difference between the timestamp of the current media frame and the timestamp of the previous media frame in the media stream, as well as the upper and lower limit range of the difference, and judging whether the difference is within the upper and lower limit range; in this embodiment, after the media stream is obtained, each media frame is accompanied by both the timestamp of the collection time and the timestamp marked after the media frame data is encoded;
  • the timestamp used in this application may be either the timestamp of the collection time or the timestamp marked after encoding;
  • Step S103: if the judgment result is yes, outputting the current media frame timestamp as the current media frame target timestamp; if the judgment result is no, obtaining the standard media frame interval of the media stream and updating the current media frame timestamp to the sum of the previous media frame timestamp and the standard media frame interval;
  • in this embodiment, if the difference is within the upper and lower limit range, the frame interval between the current media frame timestamp and the previous media frame timestamp is considered to meet the requirements, so the current media frame timestamp does not need to be corrected and is output as the current media frame target timestamp; when the difference is outside the upper and lower limit range, playback at the playback end will be abnormal after decoding, so the current media frame timestamp is updated to the sum of the previous media frame timestamp and the standard media frame interval;
  • Step S104: judging whether the difference is greater than the standard media frame interval; if the judgment result is yes, forward compensation is performed on the updated current media frame timestamp according to the compensation coefficient; if the judgment result is no, reverse compensation is performed on the updated current media frame timestamp according to the compensation coefficient; the timestamp after forward compensation or reverse compensation is output as the current media frame target timestamp.
  • in this embodiment, the current media frame timestamp is updated to the sum of the previous media frame timestamp and the standard media frame interval, and it is then determined whether the difference is greater than the standard media frame interval;
  • if the difference is greater than the standard media frame interval, the frame interval between the updated current media frame timestamp and the next media frame timestamp will increase, so the updated current media frame timestamp is forward compensated according to the compensation coefficient;
  • after forward compensation, the difference between the current media frame target timestamp and the previous media frame timestamp is still within the upper and lower limit range, that is, within the fluctuation range of the standard media frame interval, while the frame interval between the updated current media frame timestamp and the next media frame timestamp is reduced;
  • if the difference is less than the standard media frame interval, after updating the current media frame timestamp, the frame interval between the updated current media frame timestamp and the next media frame timestamp will be reduced, so the updated current media frame timestamp is reversely compensated according to the compensation coefficient;
  • after reverse compensation, the difference between the current media frame target timestamp and the previous media frame timestamp is still within the fluctuation range of the standard media frame interval, while the frame interval between the updated current media frame timestamp and the next media frame timestamp is increased; correcting the current media frame timestamp and then applying forward or reverse compensation balances the accumulated error; considering that too large a compensation coefficient would further increase the error while too small a compensation coefficient would limit the compensation capacity, the compensation coefficient can be set to 0.1 times the standard media frame interval.
  • obtaining the upper and lower limit ranges includes obtaining the upper and lower limit ranges according to the standard media frame interval and the fluctuation upper and lower limit coefficients, wherein the fluctuation upper and lower limit coefficients are smaller than the fluctuation range that can be tolerated by the decoder at the playback end.
  • the fluctuation upper and lower limit coefficients are generally required to be smaller than the fluctuation range that can be tolerated by the decoder at the playback end; if the fluctuation upper limit coefficient is set to 1.05 and the fluctuation lower limit coefficient is set to 0.8, the upper limit threshold is 1.05 times the standard media frame interval and the lower limit threshold is 0.8 times the standard media frame interval.
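  • For illustration only, a minimal sketch of how the upper and lower limit thresholds could be computed from the standard media frame interval and the example fluctuation coefficients above (the function and variable names are not from the patent):

```python
# Illustrative sketch: derive the upper and lower limit thresholds from the
# standard media frame interval and the fluctuation coefficients (1.05 and 0.8
# are the example values given above; the names are hypothetical).
def limit_thresholds(standard_interval_ms: float,
                     upper_coef: float = 1.05,
                     lower_coef: float = 0.8) -> tuple[float, float]:
    low_threshold = lower_coef * standard_interval_ms
    high_threshold = upper_coef * standard_interval_ms
    return low_threshold, high_threshold

# e.g. for a 23.2 ms standard audio frame interval this gives roughly
# (18.6 ms, 24.4 ms).
```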
  • the upper limit coefficient of the fluctuation range is set to 1.05 here, which is far less than the 1.5 upper limit that the h5 player can tolerate, and it is deliberately not set to that maximum: a smaller upper limit improves the uniformity of the output, but it means more of the original points need to be adjusted and also increases the possibility of forced synchronization exceeding the maximum error allowable coefficient.
  • a larger upper limit (still less than 1.5 times the standard audio frame interval, the maximum the player can tolerate) slightly reduces the output uniformity, but fewer collected points need to be adjusted, and the possibility of the difference exceeding the maximum error allowable coefficient is also reduced.
  • the fluctuation upper and lower limit coefficients can be set as required, as long as the set coefficients are smaller than the fluctuation range that can be tolerated by the decoder at the playback end; exemplarily, the fluctuation upper limit coefficient × the standard media frame interval + the compensation coefficient can be kept less than the upper limit coefficient defined by the playback kernel × the standard media frame interval.
  • before updating the current media frame timestamp to the sum of the previous media frame timestamp and the standard media frame interval, it is judged whether the difference is greater than the maximum error allowable coefficient; if the judgment result is yes, the current media frame timestamp is output as the current media frame target timestamp; if the judgment result is no, the current media frame timestamp is updated to the sum of the previous media frame timestamp and the standard media frame interval, where the maximum error allowable coefficient is n times the standard media frame interval and n is a value greater than 1;
  • in this embodiment, n is greater than the fluctuation upper limit coefficient, and when the upper limit of the fluctuation range is changed, the value of n can be set dynamically; before updating the current media frame timestamp to the sum of the previous media frame timestamp and the standard media frame interval, it is first determined whether the difference is too large: when the difference is greater than n times the standard media frame interval, the current media frame timestamp is not corrected and is directly output as the current media frame target timestamp, because such a large deviation between the current media frame timestamp and the previous media frame timestamp usually means there was an interruption while the acquisition end was collecting the media stream data; exemplarily, the acquisition end allows audio source switching.
  • performing forward compensation or backward compensation on the updated timestamp of the current media frame according to the compensation coefficient includes:
  • the calibration process does not necessarily correct the frame interval of each frame to the standard media frame interval.
  • the upper limit of the fluctuation range that the h5 player can tolerate is 1.5 times the standard audio frame interval.
  • for example, if the standard audio frame interval is 22.5 milliseconds, then after the first audio frame is played, playback remains normal as long as the frame interval between the second and the first audio frame timestamps is within 1.5 * 22.5 milliseconds; therefore the current media frame timestamp is updated to the sum of the previous media frame timestamp and the standard media frame interval;
  • under forward compensation, the current media frame target timestamp is the sum of the updated current media frame timestamp and the compensation coefficient, so that the player can play normally while the frame interval between the updated current media frame timestamp and the next media frame timestamp is reduced and the accumulated error is balanced; under reverse compensation, the current media frame target timestamp is the difference between the updated current media frame timestamp and the compensation coefficient, so that the player can play normally while the frame interval between the updated current media frame timestamp and the next media frame timestamp is increased and the accumulated error is balanced; introducing a compensation coefficient to balance the accumulated error prevents it from accumulating more and more.
  • the previous media frame timestamp is updated according to the current media frame target timestamp, and the updated previous media frame timestamp is used as the previous media frame timestamp for the next media frame.
  • after the current media frame target timestamp is output, the correction of the current media frame timestamp is complete;
  • when the next media frame timestamp is corrected, the frame interval between the next media frame timestamp and the current media frame target timestamp is required, so the previous media frame timestamp is updated according to the current media frame target timestamp, and the updated previous media frame timestamp is used as the previous media frame timestamp for the next media frame.
  • exemplarily, a certain audio frame timestamp sequence is pts1, pts2, pts3 and pts4; the current media frame timestamp is pts2 and the previous media frame timestamp prevpts is pts1; if the difference between pts2 and pts1 is greater than the standard audio frame interval, pts2 is corrected and compensated to obtain the current media frame target timestamp PTS2.
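  • As a concrete illustration with invented numbers (not taken from the patent): suppose the standard audio frame interval is 23.2 ms, the compensation coefficient is 0.1 × 23.2 ≈ 2.3 ms, prevpts = pts1 = 1000 ms and pts2 = 1040 ms; the difference of 40 ms lies outside the upper limit (1.05 × 23.2 ≈ 24.4 ms) but, assuming it does not exceed the maximum error allowable coefficient n × 23.2 ms, pts2 is updated to 1000 + 23.2 = 1023.2 ms and, because the difference is greater than the standard interval, forward compensated to PTS2 = 1023.2 + 2.3 ≈ 1025.5 ms; prevpts then becomes PTS2 when pts3 is corrected.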
  • obtaining the standard media frame interval of the media stream includes: if the media stream is a video stream, obtaining the standard video frame interval according to the frame rate of the video stream and taking it as the standard media frame interval; exemplarily, if the frame rate of the video stream is 30 fps, the standard video frame interval is 1/30 × 1000, in milliseconds; if the media stream is an audio stream, obtaining the standard audio frame interval according to the sampling rate of the audio stream and the actual number of sampling points per audio frame, and taking it as the standard media frame interval.
  • exemplarily, if the sampling rate of the audio stream is 44100 Hz and the actual number of audio sampling points read per frame at the acquisition end is 1024, the standard audio frame interval is 1024/44100 × 1000 ≈ 23.2, in milliseconds.
  • after the standard media frame interval is calculated in this embodiment, the media frame timestamp is corrected using this standard media frame interval.
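  • The interval calculations above can be sketched as follows (illustrative only; the function names are not from the patent):

```python
# Illustrative sketch of the standard media frame interval calculation
# described above (the function names are hypothetical).
def standard_video_interval_ms(fps: float) -> float:
    # e.g. 30 fps -> 1/30 * 1000 ≈ 33.33 milliseconds
    return 1.0 / fps * 1000.0

def standard_audio_interval_ms(sample_rate_hz: int, samples_per_frame: int) -> float:
    # e.g. 44100 Hz and 1024 samples per frame -> 1024/44100 * 1000 ≈ 23.22 milliseconds
    return samples_per_frame / sample_rate_hz * 1000.0
```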
  • FIG. 2 is a flowchart of another method for processing audio and video data according to Embodiment 1 of the present application. As shown in FIG. 2 , taking the media stream as an audio stream as an example, the method includes the following steps :
  • Step S201: the encoder outputs the audio frame timestamp pts, where the output audio frame timestamp may be the timestamp of the audio frame collection time or the system timestamp marked after the audio frame data is encoded; since the timestamp of the collection time is more accurate than the timestamp marked after encoding, it is recommended to use the timestamp of the audio frame collection time as pts for correction;
  • Step S203: judge whether diff < lowThreshold;
  • Step S205: judge whether diff < n × SAMPLE_DURATION, that is, judge whether the difference diff is less than n times the standard audio frame interval SAMPLE_DURATION; if the judgment result is yes, jump to step S206; if the judgment result is no, jump to step S204;
  • Step S207: judge whether diff > SAMPLE_DURATION; if the judgment result is yes, jump to step S208; if the judgment result is no, jump to step S209;
  • Step S212 output curpts.
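  • Putting the steps above together, the following is a minimal, non-authoritative sketch of the timestamp correction flow; the unnumbered steps of FIG. 2 (S202, S204, S206 and S208 to S211) are filled in from the surrounding description, so their mapping onto code is an assumption, and the default value of n is an arbitrary example:

```python
# Illustrative sketch of the timestamp homogenization flow described above.
# The mapping of the unnumbered steps (S202, S204, S206, S208-S211) onto code
# is an assumption based on the surrounding description; variable names follow
# the text (diff, lowThreshold, SAMPLE_DURATION, n, curpts, prevpts), and the
# default n = 2.0 is an arbitrary example (the text only requires n > 1 and
# n greater than the fluctuation upper limit coefficient).
def correct_timestamp(pts: float, prevpts: float, sample_duration: float,
                      upper_coef: float = 1.05, lower_coef: float = 0.8,
                      n: float = 2.0) -> float:
    diff = pts - prevpts                           # assumed S202: current frame interval
    low_threshold = lower_coef * sample_duration
    high_threshold = upper_coef * sample_duration
    compensation = 0.1 * sample_duration           # compensation coefficient (0.1 x interval)

    # assumed S202/S203: difference within the tolerated fluctuation range -> no correction
    if low_threshold <= diff <= high_threshold:
        return pts

    # S205: deviation exceeds the maximum error allowable coefficient
    # (e.g. an interruption or audio source switch) -> output pts unchanged (assumed S204)
    if diff > n * sample_duration:
        return pts

    # assumed S206: pull the timestamp back onto the standard frame grid
    curpts = prevpts + sample_duration

    # S207 -> S208/S209: forward or reverse compensation to balance the accumulated error
    if diff > sample_duration:
        curpts += compensation                     # forward compensation
    else:
        curpts -= compensation                     # reverse compensation
    return curpts                                  # S212: output curpts

# The caller then updates prevpts to the returned target timestamp before the
# next media frame is corrected, as described above.
```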
  • FIG. 3 is a structural block diagram of a time stamp uniformity processing unit according to Embodiment 1 of the present application.
  • the timestamp uniformity processing unit includes an acquisition module 31, a judgment module 32, an adjustment module 33, a compensation module 34 and an output module 35; the acquisition module 31 is connected to the judgment module 32, the judgment module 32 is connected to the adjustment module 33, the adjustment module 33 is connected to the compensation module 34, and the compensation module 34 is connected to the output module 35.
  • the acquisition module 31 is used to acquire a media stream, wherein the media stream is an audio and video stream, and the audio and video stream includes a video stream and an audio stream;
  • the judgment module 32 is used to acquire the difference between the current media frame timestamp and the previous media frame timestamp as well as the upper and lower limit range of the difference, and to determine whether the difference is within the upper and lower limit range; if the judgment result is yes, the output module 35 outputs the current media frame timestamp as the current media frame target timestamp.
  • if the judgment result is no, the compensation module 34 obtains the standard media frame interval of the media stream, and the adjustment module 33 updates the current media frame timestamp to the sum of the previous media frame timestamp and the standard media frame interval; the judgment module 32 judges whether the difference is greater than the standard media frame interval; if the judgment result is yes, the compensation module 34 performs forward compensation on the updated current media frame timestamp according to the compensation coefficient, and if the judgment result is no, the compensation module 34 performs reverse compensation on the updated current media frame timestamp according to the compensation coefficient; the output module 35 outputs the timestamp after forward or reverse compensation as the current media frame target timestamp. This solves the problem that uneven timestamps carried by the audio and video frames in the media stream cause abnormal playback at the playback end, and also balances the accumulated error through forward and reverse compensation, preventing the accumulated error from growing, improving the compatibility of audio and video, and having standardization value.
  • Embodiment 1 solves the problem that non-uniform timestamps carried by the audio and video frames in the media stream cause abnormal playback at the playback end, and guarantees normal playback. Building on Embodiment 1, this embodiment further considers that under unsatisfactory network conditions the live video picture may freeze, resulting in a poor experience for the audience; in the prior art, frame dropping is generally applied to the audio and video data to improve the viewing experience on the audience side, but the traditional frame dropping strategy is relatively simple and coarse, which may greatly affect video quality. Therefore, the method for processing audio and video data provided in this embodiment may further include an audio and video frame dropping process before or after the process of Embodiment 1.
  • the audio and video frame dropping process includes the following steps:
  • Step 1 Determine the weight coefficient corresponding to each type of frame in the audio and video stream
  • Step 2 According to the weight coefficient of each type of frame and the queue capacity of the corresponding queue, calculate the frame loss judgment threshold of each type of frame as the basis for the frame loss judgment;
  • Step 3 At the sending moment of any type of frame, if the maximum time interval difference between the time stamps of the two frames of the type of frame in the queue is greater than the frame loss judgment threshold corresponding to the type of frame, the frame loss operation is performed.
  • the type frame includes at least the first type frame and the second type frame, and there are two frame dropping operation methods.
  • the frames of the second type in the queue are discarded in descending order of time stamps.
  • secondary weights are established for the frames of the second type according to their order of importance
  • the frames of the second type in the queue are discarded in ascending order of secondary weight.
  • the above-mentioned types of frames include at least a first type of frame and a second type of frame.
  • the first type of frame is designed as an audio frame
  • the second type of frame is designed as a video frame.
  • the weight coefficient of the audio frame is greater than the weight coefficient of the video frame;
  • the first type frame is designed as a video frame
  • the second type frame is designed as an encoded frame, and the encoded frames are specifically divided into P frames, I frames and B frames;
  • the weight coefficient of the I frame is larger than the weight coefficient of the P frame.
  • the P frames in each group of GOP images can be further sorted by importance; for example, a secondary weight is established and the P frames are discarded in ascending order of secondary weight, so that frame dropping is more refined.
  • the frame loss judgment threshold is designed with reference to the designed weight coefficient, the queue capacity of the queue, and a frame loss judgment threshold parameter, and the frame loss judgment threshold of each type of frame is calculated as the basis for the frame loss judgment.
  • the frame loss judgment threshold may be obtained by multiplying the above weight coefficient, queue capacity, and frame loss judgment threshold parameters.
  • the maximum time interval difference between the timestamps of two frames of this type may be the difference between the timestamp of the frame of this type at the rear of the queue and the timestamp of the frame of this type at the front of the queue, or it may be the timestamp difference between frames of this type at other positions in the queue, designed and obtained according to the actual situation.
  • the frame dropping operation for this type of frame is stopped once the maximum time interval difference between the timestamps of two frames of this type in the queue is no longer greater than the frame loss judgment threshold corresponding to this type of frame.
  • the frame dropping operation is first performed on the type of frame with the lowest weight coefficient, until the maximum time interval difference between the timestamps of two frames of the dropped type remaining in the queue is not greater than the frame loss judgment threshold corresponding to that type of frame; if the network is still congested at this point, the frame dropping operation is performed on the type of frame with the next lowest weight coefficient; in this way, taking the weight coefficient of the frame type as the first priority condition and the frame loss judgment threshold corresponding to each type of frame as the second priority condition when judging and performing frame dropping reduces the impact of frame loss on video quality.
  • for example, if the type frames include P frames and I frames, where the weight coefficient of the I frame is greater than that of the P frame, then in the case of network congestion the frame loss judgment and the frame dropping operation are first applied to the P frames; only when the P frames satisfy the condition that the maximum time interval difference between the timestamps of two frames is not greater than the frame loss judgment threshold corresponding to the P frame, and the network is still congested, is the frame loss judgment and frame dropping operation applied to the I frames, until the I frames satisfy the condition that the maximum time interval difference between the timestamps of two frames is not greater than the frame loss judgment threshold corresponding to the I frame.
  • the frame drop judgment introduces a reset window height, with two kinds of application logic: a fixed reset window height and a dynamically adjusted reset window height.
  • with a fixed reset window height, a reset window height is simply introduced, and the timestamp difference is compared with the frame drop judgment threshold until the timestamp difference is less than the difference between the frame drop judgment threshold and the reset window height, at which point the sending operation instruction for this type of frame is obtained.
  • with dynamic adjustment, the reset window height is adjusted according to the actual relationship between the maximum time interval difference between the timestamps of two frames of this type in the queue and the frame loss judgment threshold, until the maximum time interval difference is less than the difference between the frame drop judgment threshold and the reset window height, at which point the sending operation instruction for this type of frame is obtained;
  • a set of decision logic is designed, but is not limited to the following: the reset window height is dynamically adjusted with the stacking ratio, where the stacking ratio is the ratio of the maximum time interval difference between the timestamps of two frames of this type in the queue to the frame loss judgment threshold;
  • the specific decision logic is as follows:
  • when the stacking ratio is not greater than 1, the reset window height is 0;
  • when the stacking ratio is greater than 1 and the excess part is between N times and N+1 times the frame loss step coefficient, the reset window height is N+1 times the frame loss step coefficient, where N = 0, 1, 2, ....
  • Audio and video streaming media transmission mainly includes audio stream and video stream.
  • the audio stream mainly includes audio frames.
  • the commonly used encoding method for video streams is H.264, which mainly includes P frames, I frames, and B frames.
  • audio frames and video frames are unified into a frame weight table and given different weight coefficients; empirically, the audio frame is given a higher weight coefficient because the human ear is extremely sensitive to interruptions in the audio stream and the amount of audio packet data is small; the I frame, as a key frame, can be decoded independently and serves as the decoding reference for P frames and B frames, so its importance is relatively high and it is also given a high weight coefficient.
  • a good frame weight table reference is shown in Table 1:
  • the invention uses the frame loss judgment threshold as the frame loss judgment basis, and can describe the network congestion situation more directly, accurately and sensitively.
  • the frame loss judgment threshold T is designed considering the frame weight coefficient a, the queue capacity n (n is usually ≤ 200), and the frame loss judgment threshold parameter p.
  • the empirical value of the frame loss judgment threshold parameter p is usually about 0.002.
  • the frame weight table is updated as shown in Table 2 below:
  • the buffer queue can take the form of, but is not limited to, data structures such as arrays, lists, queues and linked lists, usually first-in-first-out (FIFO); in this way, audio frames and video frames can each be evaluated separately every time a frame loss determination is made.
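  • As a rough sketch of the threshold calculation T = a × n × p described above (the weight value used below is a hypothetical placeholder, since Tables 1 and 2 are not reproduced here):

```python
# Illustrative only: the frame loss judgment threshold T = a * n * p described
# above. The weight value in the example is a hypothetical placeholder, since
# Tables 1 and 2 are not reproduced here.
def frame_loss_threshold(weight_coef: float, queue_capacity: int,
                         threshold_param: float = 0.002) -> float:
    return weight_coef * queue_capacity * threshold_param

# e.g. a hypothetical audio weight of 50 with a queue capacity of 200 gives
# T = 50 * 200 * 0.002 = 20; T is then compared with the timestamp span S of
# that frame type in the queue.
```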
  • FIG. 4 is a schematic diagram of the frame loss process according to Embodiment 2 of the present application; as shown in FIG. 4, at the moment of sending any frame, the frame loss judgment policy for this type of frame is executed first;
  • the specific decision logic is as follows:
  • the total duration S is calculated as follows: find the timestamp F1 of the frame of this type at the front of the queue and the timestamp F2 of the frame of this type at the back of the queue, and calculate the time interval between the two frame timestamps, that is, S = F2 - F1;
  • M is the height of the reset window, and the size of M directly reflects the number of dropped frames.
  • M depends to a certain extent on the ratio of S to T, that is, the stacking ratio Q, calculated as Q = S / T.
  • the frame loss step coefficient step is now introduced to dynamically adjust the size of M.
  • the calculated total duration S is compared with the frame loss judgment threshold T; if S ≥ T, the frame dropping operation is performed by discarding frames of this type from the back of the queue one by one, recomputing S after each drop and comparing it with T again until S satisfies S < T - M; an example mapping from the stacking ratio to the reset window height, which is illustrative rather than limiting, is given in Table 3 (a code sketch follows the table):

    Stacking ratio Q          | Reset window height M
    --------------------------|-----------------------------------
    Q ≤ 1                     | M = 0, no frames dropped
    1 < Q ≤ 1 + step          | M = step, drop frames down to (1 - M)
    1 + step < Q ≤ 1 + 2*step | M = 2*step, drop frames down to (1 - M)
    and so on                 | ...
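Putting the pieces together, a minimal sketch of the send-time check of FIG. 4 might look like the following. It assumes a FIFO per frame type whose elements carry a `timestamp` in seconds and reuses the `reset_window_height` helper sketched earlier; the names are illustrative rather than taken from the specification.

```python
from collections import deque

def check_and_drop(queue: deque, threshold: float, step: float) -> None:
    """At the sending moment of one frame type, drop frames if S >= T."""
    if len(queue) < 2:
        return
    total_duration = queue[-1].timestamp - queue[0].timestamp  # S = F2 - F1
    if total_duration < threshold:                             # S < T: just send
        return
    stacking_ratio = total_duration / threshold                # Q = S / T
    margin = reset_window_height(stacking_ratio, step)         # M from Table 3
    # Discard frames from the back of the queue (the ones that entered last),
    # recomputing S after every drop, until S < T - M.
    while len(queue) >= 2:
        total_duration = queue[-1].timestamp - queue[0].timestamp
        if total_duration < threshold - margin:
            break
        queue.pop()
```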
  • a frame weight coefficient table is used to describe the importance of audio and video frames and the frame dropping priority, and to calculate the frame loss tolerance threshold; quantization coefficients such as the frame weight coefficient, the frame loss judgment threshold parameter p, the frame loss step coefficient step, the reset window height M and the frame loss judgment threshold T are adopted to describe the frame dropping operation precisely;
  • when dropping frames, the number of frames to drop refers to the stacking ratio, which measures that number more accurately and matches the network state to the frame dropping operation well, giving good adaptivity to different degrees of congestion: the more serious the network congestion, the larger the number of dropped frames, and the lighter the congestion, the smaller the number of dropped frames;
  • a reset window is used when the frame dropping operation is performed, so that a certain margin is left after each round of dropping, which greatly reduces repeated frame dropping;
  • the frame weight and frame loss judgment threshold parameters can be dynamically adjusted, and the algorithm has better adaptability.
  • the beneficial effect of this embodiment is that weights are designed for the different frame types in the audio and video streams: frames with a lower weight are discarded first, and second-level weights can further be established for the second type of frame so that dropping can be more fine-grained; alternatively, frames with larger timestamps in the queue (those that entered the queue later) are discarded first.
  • because the reset window height is added to the frame loss judgment, the frame dropping jitter that occurs when the buffered duration is near the threshold critical point is largely eliminated, and a single round of dropping can basically cover one period of network fluctuation.
  • because the number of dropped frames follows the stacking ratio, the network state and the frame dropping operation are well matched and the number of frames to drop is measured more accurately; generally, the more serious the network congestion, the larger the number of dropped frames, and the lighter the congestion, the smaller the number of dropped frames.
  • this embodiment further provides an audio and video frame dropping arrangement, which includes an encoder output unit, a frame receiving unit, an audio and video frame dropping unit and a sending unit electrically connected in sequence, wherein the audio and video frame dropping unit includes a determination module, a calculation module and a frame dropping module.
  • the determination module is used to determine the weight coefficient corresponding to each frame type in the audio and video stream; the calculation module is used to calculate the frame loss judgment threshold corresponding to each frame type from the weight coefficient of each frame type and the queue capacity of the queue; the frame dropping module is used, at the sending moment of any frame type, to perform the frame dropping operation if the maximum time interval difference between the timestamps of two frames of this type in the queue is greater than the frame loss judgment threshold corresponding to this frame type.
  • an audio and video frame dropping unit includes an external dynamic parameter setter, a parameter collector, a parameter calculator, a frame dropping determiner and a frame dropping executor.
  • the external dynamic parameter setter is used to set the weight coefficient of the audio frame and the video frame, and to set the frame loss judgment threshold parameter;
  • the parameter collector is used to collect parameters related to frame loss judgment, including weight coefficient, queue capacity, and frame loss judgment threshold parameters;
  • the parameter calculator is used to obtain the frame loss judgment thresholds of various types of frames according to the collected parameters according to the calculation rules;
  • the frame dropping determiner is used to first look up the frame loss judgment threshold of this frame type, then calculate the maximum time interval difference between the timestamps of two frames of this type in the queue, and compare the maximum time interval difference with the frame loss judgment threshold according to the frame dropping judgment principle;
  • the frame dropping executor is used, when the frame dropping determiner decides that the frame dropping operation should be performed, to discard frames of this type in the queue in descending order of timestamp; every time a frame of this type is discarded, the result is fed back to the parameter calculator and the frame dropping determiner, which repeatedly calculate the maximum time interval difference between the timestamps of the current two frames of the dropped type in the queue and perform the frame dropping determination again.
  • this embodiment further includes a method for processing audio and video streaming data, which aims to implement streaming from one host to multiple platforms.
  • the processing flow of the audio and video streaming data includes the following steps:
  • Step 1: obtain the audio and video data uploaded by the user, wherein the audio and video data is transmitted in the form of an audio and video stream, the audio and video data carries the user's bound account information, and the playback parameters of that account have already been configured for multiple target live broadcast platforms;
  • Step 2: the server creates a streaming task for each target live broadcast platform;
  • Step 3: under the user's bound account, the server distributes the user's audio and video data to the multiple target live broadcast platforms.
  • in this way, the live broadcast configuration is completed by configuring, on one platform, the playback parameters of the user's bound account for multiple target live broadcast platforms; the server creates a push task for each target live broadcast platform and distributes the user's audio and video data to the multiple target live broadcast platforms (a server-side sketch is given after this item), which satisfies the requirement for one host to broadcast live on multiple platforms at the same time and overcomes the technical defect that one-click multi-platform streaming cannot be realized in the prior art.
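On the server side, the per-platform push tasks of steps 2 and 3 could be organized roughly as below. This is only a schematic sketch: the pusher callable and the data shapes are assumptions rather than part of the specification.

```python
import threading

def create_push_tasks(av_stream, bound_account, platform_configs, push_to_platform):
    """Create one independent push task per target live broadcast platform.

    `platform_configs` maps a platform identifier to the push address and
    playback parameters already configured under the user's bound account.
    `push_to_platform` is a callable that pushes the stream to one platform
    (for example over RTMP).  Each task runs on its own worker, so a failure
    on one platform does not interrupt the live broadcast rooms of the others.
    """
    tasks = []
    for platform, config in platform_configs.items():
        worker = threading.Thread(
            target=push_to_platform,
            args=(av_stream, bound_account, config),
            name=f"push-{platform}",
            daemon=True,
        )
        worker.start()
        tasks.append(worker)
    return tasks
```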
  • in step 1, on the client of the live broadcast all-in-one machine, the user's account needs to be authorized and bound with the multiple target live broadcast platforms through the client; the process preferably includes the following steps:
  • the user clicks the add-account button;
  • the browser is invoked to open the login authorization page of the target platform;
  • the user logs in to the target platform and completes the authorization;
  • the back-end cloud server establishes a link between the account of the smart terminal where the client is located and the account of the target live broadcast platform;
  • the client receives the newly added binding information, and the binding is completed.
  • the above process first obtains platform account information, live streaming and other permissions through the target platform's open interface. Users can log in to the target platform, bind the platform account to the local account, authorize the local account to operate the platform account, and perform live streaming and other operations under the platform account. After the binding is completed, the server will record the one-to-many binding relationship between the user (local account) logged in the device and the account of the target platform (third-party live broadcast platform), and store it in the server database persistently.
  • the process of providing users with interactive information for configuring live broadcast parameters on multiple platforms includes the following steps:
  • the configuration interaction information may be displayed to the user through a pop-up interactive interface of the live broadcast integrated machine, or may be prompted to the user in a push manner, which is not limited.
  • the server generates live broadcast platform configuration data matching the corresponding live broadcast platform demand information in response to the user's configuration instruction based on the interaction information.
  • after the user uploads the audio and video data, the data preparation for streaming can be completed through the following steps: collect the audio and video data through one or more audio and video capture devices; encode and encapsulate the audio and video data in preparation for streaming to the multiple target live broadcast platforms.
  • the above steps can be completed by the live broadcast all-in-one machine itself, or in cooperation with the server, which is not limited here.
  • in step 2, after the audio and video data arrives at the server, multiple tasks are created according to the number of target platforms held by the server; each task is a production line responsible for stream distribution, so that it independently completes pushing the audio and video data to the corresponding target live broadcast platform and live broadcast room.
  • in step 3, after the configuration of the live broadcast room of a target live broadcast platform is completed, audio and video signals over an interoperable protocol, such as RTMP, can be pushed and the broadcast started; once the push stream arrives, the live broadcast can run with the configured information.
  • for example, live broadcast platform A starts at 8:00 p.m. and the broadcast is visible to friends only, while live broadcast platform B is fully public and starts at 9:00 p.m.; the live broadcast rooms do not affect each other, so even if one of the push tasks fails, the live broadcast rooms of the other live broadcast platforms can continue the live broadcast.
  • audio and video data can be collected through built-in Camera or external HDMI or USB video capture devices.
  • the data is encapsulated into the format specified by the RTMP protocol through the transcoding operation.
  • using the audio homogenization algorithm, the audio data is aligned with the video data before the video data is sent, to ensure audio and video synchronization.
  • since the cloud server already holds the local account, the multiple target platform accounts bound to the local account, the streaming addresses of the target live broadcast platforms and the permission to stream, the prepared audio and video data can be pushed to the back-end cloud server: the user clicks the start-live button, the streaming address of the cloud server is obtained, and streaming begins.
  • the open-source library librtmp can be selected as the RTMP data transmission tool; in that case librtmp needs to be cross-compiled and ported to the live broadcast all-in-one machine.
  • FIG. 5 is a schematic diagram of the configuration flow of audio and video streaming data according to Embodiment 3 of the present application. As shown in FIG. 5, the flow in which the client sends the live platform configuration interaction information to the user can be implemented through the following steps:
  • S801: send the bound account information for the target live broadcast platform to the user;
  • the server responds to the user's configuration instruction based on the interaction information, including:
  • S802: when the bound account information request is authorized, the client receives the user selection data and sends configuration data to the target live broadcast platform, where the configuration data includes privacy setting indication information and audio and video release setting information;
  • the user selection data may be a button or a start/stop selection, or any other form that reflects the user's personalized needs during configuration; the client receives the selection data, interprets and packages the personalized configuration data locally, and sends it to the server.
  • S803: the server completes the setting according to the user selection data and stores the user's configuration data for the target live broadcast platform.
  • the device converts the user's operation interaction information into a communication message: the two parameters, the live broadcast platform unique identifier publishId and the permission level privacy, are passed to the back-end server, which then applies the privacy option configured by the user to the live broadcast platform of target A (an illustrative message shape is shown below).
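For illustration only, the communication message mentioned above could look like the following; the two field names publishId and privacy come from the text, while the concrete values and the overall structure are assumptions.

```python
# Assumed shape of the privacy-configuration message sent to the back-end server.
config_message = {
    "publishId": "platform-A-unique-id",  # unique identifier of the target live platform
    "privacy": "friends_only",            # permission level selected by the user
}
```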
  • the interactive interface includes "video area”, and related setting functions such as “timeline”, “fan circle”, “group”, “public” and “only visible to friends”.
  • the server receives and stores the addresses of the live broadcast rooms created by the streaming task respectively.
  • the client, or the server through the client, sends the live platform configuration interaction information to the user, which includes:
  • the client sends the configuration interaction interface for the target live broadcast platform to the user;
  • responding to the user's configuration instruction based on the interaction information includes: when the bound account information request is authorized, receiving the user selection data and sending configuration data to the target live broadcast platform, where the configuration data includes privacy setting indication information and audio and video release setting information;
  • completing the setting according to the user selection data and storing the user's configuration data for the target live broadcast platform.
  • this embodiment considers that, in the process of live broadcasting, it is often necessary to connect multiple audio signals, but in the miniaturized live broadcast devices of the related art the audio data input to the device is processed by the device's processor module, which results in a large amount of computation on the processor module, low operating efficiency, a tendency for the live broadcast to freeze and, consequently, reduced quality of the content presented in the live broadcast.
  • FIG. 6 is a schematic diagram of the application environment of the live broadcast device according to Embodiment 4 of the present application.
  • the live broadcast device 12 integrates the functions of a director station, a hard disk video recorder, an encoder, a capture card and other live broadcast equipment in one device, and can perform multi-channel video capture, decoding, encoding, streaming and other live broadcast data processing operations for live broadcast equipment 11 such as high-definition cameras, microphones and webcams.
  • when the user is broadcasting live, the live broadcast equipment 11 is connected to the live broadcast device 12, the live broadcast device 12 is connected to a remote server 13 through the network, the live broadcast device 12 pushes the processed data to the server 13, and the server 13 forwards the data to multiple live broadcast platforms;
  • viewers can watch the live broadcast on the various live broadcast platforms through viewing devices 14 such as tablets, mobile phones and computers.
  • in the related art, the audio data input to a miniaturized live broadcast device is processed by the processor module of the device, which leads to a large amount of computation on the processor module and low operating efficiency, making freezes in the live broadcast likely and affecting the quality of the content presented in the live broadcast.
  • FIG. 7 is a schematic diagram of the first live broadcast apparatus according to Embodiment 4 of the present application.
  • the live broadcast apparatus 12 includes an audio processing module 21, a device interface module 22 and a processor module 23.
  • the device interface module 22 can be used to connect live broadcast equipment such as high-definition cameras and webcams.
  • the audio processing module 21 includes an audio input interface 211 and an audio processing chip 212.
  • the audio input interface 211 can be used to connect a microphone.
  • the audio processing chip 212 is respectively connected to the audio input interface 211, the device interface module 22 and the processor module 23; the audio processing chip 212 performs noise reduction and/or mixing processing on the audio data input from the audio input interface 211 and/or the device interface module 22, and transmits the processed audio data to the processor module 23.
  • optionally, the model of the audio processing chip 212 may be AK7735.
  • by providing the audio processing chip 212 and connecting it to the multi-channel audio input ends, which include the audio input interface 211 and the device interface module 22, the noise reduction and/or mixing of the input audio data can be performed inside the audio processing chip 212, so the processor module 23 no longer needs to perform audio processing while handling the live broadcast data.
  • compared with the related art, in which the audio data of a miniaturized live broadcast device is processed by the processor module 23 and the processor module 23 therefore carries a large amount of computation, the live broadcast device 12 of this embodiment improves the operating efficiency of the processor module 23, which helps reduce freezes in the live broadcast and thereby improves the quality of the content presented in the live broadcast.
  • the audio input interface 211 includes an active input interface 2111 (also called a Line In interface) and a passive input interface 2112 (also called a Mic In interface), wherein the active input interface 2111 is used to connect active microphones and the passive input interface 2112 is used to connect passive microphones; by providing the active input interface 2111 and the passive input interface 2112, the live broadcast device 12 supports both active microphone input and passive microphone input for different types of input audio, giving good applicability.
  • the audio processing module 21 also includes an audio output interface 213 (also called an Audio out interface); the audio output interface 213 is connected to the audio processing chip 212 and is used to output the audio data processed by the audio processing chip 212 to devices such as headphones.
  • FIG. 8 is a schematic diagram of a second live broadcast apparatus according to Embodiment 4.
  • the device interface module 22 includes an HDMI interface module 31 and a USB interface module 32; the High-Definition Multimedia Interface (HDMI) is a fully digital video and audio transmission interface that can carry uncompressed audio and video signals, while the Universal Serial Bus (USB) is a serial bus standard and input/output interface specification widely used in information and communication products such as personal computers and mobile devices, and extended to other related fields such as photographic equipment, digital TV (set-top boxes) and game consoles.
  • the HDMI interface module 31 includes a plurality of HDMI input interfaces 311 and a plurality of first format converters 312, and the plurality of HDMI input interfaces 311 are respectively connected to the audio processing chip 212; the plurality of HDMI input interfaces 311 and the plurality of first format converters 312 are connected in one-to-one correspondence, one end of each first format converter 312 being connected to an HDMI input interface 311 and the other end to the processor module 23.
  • by providing multiple HDMI input interfaces 311, the live broadcast device 12 supports multi-video access, meeting the need of some users for multi-channel video access during live broadcast; by providing the first format converter 312, the input data can be converted from HDMI format to MIPI format, so the live broadcast device 12 can be adapted to the cameras and SLR cameras commonly available on the market, which solves the poor compatibility of portable encoders in the related art and improves the applicability of the live broadcast device 12.
  • optionally, the chip model of the first format converter 312 may be a Longxun LT6911 HDMI-to-MIPI bridge chip.
  • the HDMI input interface 311 can be connected to live broadcast equipment such as a high-definition camera; the data input through the HDMI input interface 311 includes video data and/or audio data, the first format converter 312 converts the video data and/or audio data input from the HDMI input interface 311 from HDMI format to MIPI format and transmits the video data and/or audio data in MIPI format to the processor module 23, and the processor module 23 processes the video data after receiving it.
  • optionally, the processor module 23 may be a Quectel SC66 smart module.
  • the Quectel SC66 smart module integrates a Qualcomm Snapdragon 8-core processor and a Qualcomm Adreno 512 graphics processing unit (GPU), and supports decoding and encoding of multiple channels of video data at up to 1080P per channel.
  • the device interface module 22 may also include only the HDMI interface module 31 , or only the USB interface module 32 .
  • FIG. 9 is a schematic diagram of a third live broadcast device according to Embodiment 4 of the present application.
  • the USB interface module 32 includes a first USB interface 41, a second USB interface 42 and a third USB interface 43, and the processor module 23 includes a USB port, wherein the first USB interface 41 is connected to the USB port of the processor module 23 and inputs audio data to the audio processing chip 212 through the processor module 23.
  • optionally, as shown in FIG. 9, the USB interface module 32 may also include an interface expander 44, one end of the interface expander 44 being connected to the USB port and the other end being connected to a plurality of first USB interfaces 41 and the third USB interface 43.
  • by providing the interface expander 44, a single USB port can be expanded into a plurality of first USB interfaces 41, so that the live broadcast device 12 supports multi-device access; for example, devices with USB Type-A physical interfaces such as a mouse, a keyboard and a camera can be connected to the plurality of first USB interfaces 41.
  • through the interface expander 44, the third USB interface 43 can also be integrated on the USB port of the processor module 23 and can be used to connect a touch screen; the chip model of the interface expander 44 may be LAN9514.
  • the second USB interface 42 is connected to the processor module 23 for system debugging; the second USB interface 42 is not open to users.
  • FIG. 10 is a schematic diagram of a fourth live broadcast device according to Embodiment 4 of the present application.
  • the live broadcast device 12 further includes a display module 50.
  • the display module 50 includes a display screen 51 and a second format converter 52; one end of the second format converter 52 is connected to the processor module 23 and the other end is connected to the display screen 51, the processor module 23 outputs video data in MIPI format, the second format converter 52 converts the video data from MIPI format into LVDS format, and the display screen 51 displays the video data in LVDS format.
  • optionally, the chip model of the second format converter 52 may be a Longxun LT9211 MIPI-to-LVDS bridge chip.
  • by providing the display module 50, the live broadcast device 12 supports display screens 51 with LVDS interfaces of different sizes and specifications, and during live broadcast the user can watch the video picture in real time on the display screen 51 with the LVDS interface.
  • the display screen 51 includes a touch screen 511
  • the third USB interface 43 is connected to the touch screen 511, so that the touch signal captured by the touch screen 511 can be transmitted to the processor module 23 through the third USB interface 43, and the processor module 23 can respond to the touch signal.
  • FIG. 11 is a schematic diagram of a fifth live broadcast apparatus according to Embodiment 4 of the present application.
  • the live broadcast apparatus further includes a data output module 60.
  • the data output module 60 includes a third format converter 61 and an HDMI output interface 62; one end of the third format converter 61 is connected to the processor module 23 and the other end is connected to the HDMI output interface 62; the third format converter 61 converts the video data and audio data output by the processor module 23 from MIPI format into HDMI format and transmits the video data and audio data in HDMI format to the HDMI output interface 62.
  • optionally, the chip model of the third format converter 61 may be a Longxun LT9611 MIPI-to-HDMI bridge chip; during live broadcast, the user can connect the HDMI output interface to a display with an HDMI interface, so that the video picture can be watched in real time on that display.
  • FIG. 12 is a schematic diagram of a sixth live broadcast apparatus according to Embodiment 4 of the present application.
  • the live broadcast apparatus 12 further includes a network module 70, which supports WIFI, wired network and 4G network connections among other networking methods, so that the live broadcast device 12 can work under a wired or wireless network; the network module 70 is connected to the processor module 23 and is used to push the video data or audio data processed by the processor module 23 to the server, so that the server can forward the video data or audio data to multiple webcast platforms.
  • FIG. 13 is a schematic diagram of a seventh live broadcast apparatus according to Embodiment 4 of the present application.
  • the audio processing chip 212 includes an I2S1 port, an I2S2 port, an I2S3 port, an AIN1 port, an AIN2 port, an I2C port and an AOUT1 port.
  • the processor module 23 includes MIPI CSI1 port, MIPI CSI2 port, I2S port, I2C port, USIM port, USB3.0 port, POWER CORE port, LCD MIPI port, USB2.0 port and MIPI DSI port
  • the second format converter 52 includes an LDVS1 port and an LDVS2 port
  • the display screen 51 includes a TP touch screen port
  • the interface expander 44 includes a USB0 port, a USB1 port, a USB2 port, a USB3 port and a PHY port
  • the live broadcast device 12 also includes a SIM interface 81, a power input interface 82, a power conversion chip 83 and a network port 84; the connections between the components of the live broadcast device 12, or between the ports of the components, are shown in FIG. 13.
  • the SIM interface 81 can be connected to a SIM card, and the SIM interface 81 is connected to the USIM port of the processor module 23; the power input interface 82 can be connected to a power supply, and the power conversion chip 83 is connected to the power input interface 82 and to the POWER CORE port of the processor module 23 and is used for power voltage conversion.
  • optionally, the model of the power conversion chip 83 may be RT7295; the RT7295 chip converts the 12 V voltage input at the power input interface into the 3.9 V voltage matching the processor module 23 and transmits the 3.9 V voltage to the processor module 23; the network port 84 is connected to the interface expander 44 and is used to connect a network cable.
  • the live broadcast device 12 provided in this embodiment can realize multi-channel video capture by providing multiple HDMI input interfaces 311 and multiple first USB interfaces 41; by providing the audio processing module 21 and connecting it to the device interface module 22 and the processor module 23, noise reduction and/or mixing of the input audio data can be realized; by providing the processor module 23 and connecting it to the audio processing module 21 and the device interface module 22, decoding and encoding of the input video data and audio data can be realized; by providing the display module 50, the video picture can be viewed in real time; by providing the data output module 60, the data format of the video data and audio data can be converted and the data output; and by providing the network module 70, the video data and audio data can be pushed over the network.
  • therefore, the live broadcast device 12 integrates multi-channel video capture, decoding, encoding and streaming functions in one device; when using it, the user does not need additional equipment such as a director station, a hard disk video recorder, an encoder or a capture card, which makes live broadcasting more convenient and helps reduce the cost of live broadcast.
  • in summary, the live broadcast device includes an audio processing module, a device interface module and a processor module; the audio processing module includes an audio input interface and an audio processing chip; the audio input interface is used to connect a microphone; the audio processing chip is respectively connected to the audio input interface, the device interface module and the processor module; the audio processing chip performs noise reduction and/or mixing processing on the audio data input from the audio input interface and/or the device interface module and transmits the processed audio data to the processor module, which solves the problem in the related art that the processor module of the live broadcast device has low operating efficiency and the quality of the content presented in the live broadcast suffers, and improves the viewing experience of the audience.
  • the processor module in the above-mentioned live broadcast device may further include the timestamp homogenization processing unit of Embodiment 1 and/or the audio and video frame dropping unit of Embodiment 2, so as to realize timestamp homogenization and/or audio and video frame dropping.
  • a system for processing audio and video data may be provided.
  • the system includes the above-mentioned live broadcast device and may also include the server of Embodiment 3; in the scenario of Embodiment 3, the client may also be used as the above-mentioned live broadcast apparatus, that is to say, its processor module can implement the method for processing audio and video streaming data of Embodiment 3.
  • the above-mentioned modules may be functional modules or program modules, and may be implemented by software or hardware.
  • the above-mentioned modules may be located in the same processor; or the above-mentioned modules may also be located in different processors in any combination.
  • the embodiments of the present application may provide a storage medium for implementation.
  • a computer program is stored on the storage medium; when the computer program is executed by the processor, any one of the audio and video data processing methods in the foregoing embodiments is implemented.
  • FIG. 14 is a schematic diagram of the internal structure of the electronic device according to Embodiment 5 of the present application.
  • an electronic device is provided; the electronic device may be a server, and its internal structure may be as shown in FIG. 14.
  • the electronic device includes a processor, a network interface, an internal memory, and a non-volatile memory connected by an internal bus, wherein the non-volatile memory stores an operating system, a computer program, and a database.
  • the processor is used to provide computing and control capabilities
  • the network interface is used to communicate with external terminals through a network connection
  • the internal memory is used to provide an environment for the operation of the operating system and the computer program; when the computer program is executed by the processor, a method for processing audio and video data is implemented; the database is used to store data.
  • FIG. 14 is only a block diagram of part of the structure related to the solution of the present application and does not constitute a limitation on the electronic device to which the solution of the present application is applied; a specific electronic device may include more or fewer components than shown in the figure, or combine certain components, or have a different arrangement of components.

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Computer Security & Cryptography (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Databases & Information Systems (AREA)
  • Two-Way Televisions, Distribution Of Moving Picture Or The Like (AREA)

Abstract

The present application relates to a timestamp homogenization method, a live broadcast apparatus, an electronic device and a storage medium. A media stream is obtained, and the difference between the timestamp of the current media frame and the timestamp of the previous media frame in the media stream is obtained, together with the upper and lower bound range of the difference. If the difference is within the range, the current media frame timestamp is output; if not, the standard media frame interval of the media stream is obtained and the current media frame timestamp is updated to the sum of the previous media frame timestamp and the standard media frame interval. When the difference is greater than the standard media frame interval, forward compensation is applied to the updated current media frame timestamp according to a compensation coefficient; when it is smaller, reverse compensation is applied according to the compensation coefficient. The compensated target timestamp of the current media frame is output, which solves the problem that the timestamps carried by audio and video frames are uneven and playback on the player side is abnormal, and the accumulated error is balanced through forward and reverse compensation so that it does not grow as it accumulates.
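As a hedged illustration of the timestamp homogenization summarized above, the Python sketch below processes one media frame. The fluctuation bound coefficients 0.8 and 1.05 and the compensation of 0.1 times the standard interval follow the description; the concrete value chosen for the maximum error allowance multiple n is only an assumed example.

```python
def homogenize_timestamp(pts, prev_pts, std_interval,
                         low_coeff=0.8, high_coeff=1.05, n=4.0):
    """One homogenization step; timestamps and intervals are in milliseconds.

    std_interval is the standard media frame interval, e.g. 1000 / fps for a
    video stream or samples_per_frame / sample_rate * 1000 for an audio stream.
    Returns (target_pts, new_prev_pts).
    """
    compensation = 0.1 * std_interval
    diff = pts - prev_pts
    # Within the tolerated fluctuation range: output the timestamp unchanged.
    if low_coeff * std_interval <= diff <= high_coeff * std_interval:
        return pts, pts
    # Beyond the maximum error allowance (e.g. after an audio source switch):
    # also output unchanged, sacrificing one point of uniformity.
    if diff > n * std_interval:
        return pts, pts
    # Otherwise snap to prev + standard interval, then compensate to balance
    # the accumulated error (forward if diff was too large, reverse if too small).
    cur = prev_pts + std_interval
    cur = cur + compensation if diff > std_interval else cur - compensation
    return cur, cur
```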

Description

音视频数据的处理方法、直播装置、电子设备和存储介质 技术领域
本申请涉及媒体流技术领域,特别是涉及音视频数据的处理方法、直播装置、电子设备和存储介质。
背景技术
多媒体数据要在因特网上进行实时的传输,必须先对多媒体数据进行流化处理,流化处理的过程是对多媒体数据进行必要的封装处理,把音视频数据打包成能进行流传输的RTP(Real-time Transport Protocol,实时传输协议)数据包,从而实现多媒体数据的流媒体传输。
视频直播就是利用互联网及流媒体技术进行直播的,视频因融合了图像、文字、声音等丰富元素,声形并茂,效果极佳,逐渐成为互联网的主流表达方式。
互联网直播采用实时流式传输技术,首先,主播开启直播,将直播内容编码压缩后,传输至网站服务器,这一过程被称为“推流”,即将视频内容推给服务器,直播内容传输至网站服务器后,用户观看直播时,会直接从网站服务器拉取直播内容,这一过程被称为“拉流”。拉流获得相应媒体流后可在本地进行解码播放,解码播放的过程会依赖媒体流中音视频帧携带的时间戳,在原始采集均匀性不佳或编码器输出的时间戳序列不均匀等情况下,会使媒体流中的音视频帧携带的时间戳不均匀,导致播放端播放异常。
发明内容
本申请实施例提供一种音视频数据的处理方法,所述方法包括:
获取媒体流,其中,所述媒体流为音视频流,所述音视频流包括视频流和音频流;
获取所述媒体流中当前媒体帧时间戳和上一媒体帧时间戳的差值,以及获取所述差值的上下限范围,并判断所述差值是否在所述上下限范围内;
若所述判断结果为是,则输出所述当前媒体帧时间戳作为当前媒体帧目标时间戳,若所述判断结果为否,则获取所述媒体流的标准媒体帧间隔,将所述当前媒体帧时间戳更新为所述上一媒体帧时间戳和所述标准媒体帧间隔之和;
判断所述差值是否大于所述标准媒体帧间隔,若所述判断结果为是,则根据补偿系数对更新后的所述当前媒体帧时间戳进行正向补偿,若所述判断结果为否,则根据补偿系数对更新后的所述当前媒体帧时间戳进行反向补偿,输出所述正向补偿或所述反向补偿后的时间戳作为所述当前媒体帧目标时间戳。
在其中一些实施例中,将所述当前媒体帧时间戳更新为所述上一媒体帧时间戳和所述标准媒体帧间隔之和之前,所述方法还包括:
判断所述差值是否大于最大误差允许系数,若所述判断结果为是,则输出所述当前媒体帧时间戳作为当前媒体帧目标时间戳,若所述判断结果为否,则将所述当前媒体帧时间戳更新为所述上一媒体帧时间戳和所述标准媒体帧间隔之和,其中,所述最大误差允许系数为n倍的所述标准媒体帧间隔,n为大于1的数值。
在其中一些实施例中,根据补偿系数对更新后的所述当前媒体帧时间戳进行正向补偿或反向补偿包括:
所述正向补偿为将更新后的所述当前媒体帧时间戳与所述补偿系数之和作为所述当前媒体帧目标时间戳;
所述反向补偿为将更新后的所述当前媒体帧时间戳与所述补偿系数之差作为所述当前媒体帧目标时间戳。
在其中一些实施例中,所述根据补偿系数对更新后的所述当前媒体帧时间戳进行正向补偿或反向补偿之后,所述方法还包括:
根据所述当前媒体帧目标时间戳更新上一媒体帧时间戳,更新后的上一媒体帧时间戳作为下一媒体帧时间戳的上一媒体帧时间戳。
在其中一些实施例中,获取所述上下限范围包括,根据所述标准媒体帧间隔和波动上下限系数,获取所述上下限范围,其中,所述波动上下限系数小于播放端解码器能容忍的波动范围。
在其中一些实施例中,获取所述媒体流的标准媒体帧间隔包括:
若所述媒体流为视频流,则根据所述视频流的帧率,获取标准视频帧间隔,所述标准视频帧间隔作为所述标准媒体帧间隔;
若所述媒体流为音频流,则根据所述音频流的采样率和每帧音频实际采样点数,获取标准音频帧间隔,所述标准音频帧间隔作为所述标准媒体帧间隔。
在其中一些实施例中,所述方法还包括:
确定音视频流中的每个类型帧对应的权重系数;
根据每个类型帧的权重系数、队列的队列容量,计算出每个类型帧对应的丢帧判断阈值;
在任一类型帧的发送时刻,若队列中该类型帧中两帧时间戳的最大时间间隔差值大于该类型帧对应的丢帧判断阈值,则执行丢帧操作。
在其中一些实施例中,所述类型帧至少包括第一类型帧以及第二类型帧,所述丢帧操作包括:
若所述第一类型帧的权重系数大于所述第二类型帧的权重系数,则将队列中第二类型帧按照时间戳由大到小进行依次丢弃。
在其中一些实施例中,所述类型帧至少包括第一类型帧以及第二类型帧,第二类型帧根据重要程度排序设立二级权重,所述丢帧操作包括:
若所述第一类型帧的权重系数大于所述第二类型帧的权重系数,则将队列中第二类型帧按照二级权重由小到大进行依次丢弃。
在其中一些实施例中,所述方法还包括:
在每执行一次丢帧操作之后,重复计算队列中所丢类型帧中当前两帧时间戳的最大时间间隔差值,再与该类型帧对应的丢帧判断阈值进行比较,直至队列中该类型帧中两帧时间戳的最大时间间隔差值不大于该类型帧对应的丢帧判断阈值时停止丢帧操作。
在其中一些实施例中,所述方法还包括:
计算队列中每个类型帧的堆积比,所述堆积比为任一类型帧中当前两帧时间戳的最大时间间隔差值与该类型帧丢帧判断阈值的比值;
根据堆积比与回置窗口高度之间预设的对应关系确定每个类型帧对应的回置窗口高度;
在每执行一次丢帧操作之后,重复计算队列中所丢类型帧中当前两帧时间戳的最大时间间隔差值,若所述最大时间间隔差值小于该类型帧对应的丢帧判断阈值与回置窗口高度的差值时,则停止丢帧操作。
在其中一些实施例中,所述方法还包括:
获取用户上传的音视频数据,其中,所述音视频数据以音视频流的形式传输,且所述音视频数据携带用户绑定账号信息,且该账号已针对多个目标直播平台的播放参数完成配置;
由服务器分别针对各目标直播平台创建推流任务;
在该用户绑定账号下,由所述服务器向多个目标直播平台分发该用户的音视频数据。
在其中一些实施例中,所述方法还包括:
获取用户绑定账号信息及多个目标直播平台的推流需求信息;
在该用户绑定账号下,向用户发送直播平台配置交互信息;
由服务器响应该用户基于所述交互信息的配置指示,生成匹配对应直播平台需求信息的直播平台配置数据。
在其中一些实施例中,向用户发送直播平台配置交互信息,包括:
向用户发送针对目标直播平台的绑定账号信息;
由服务器响应该用户基于所述交互信息的配置指示,包括:
当所述绑定账号信息请求授权通过时,接收用户选择数据并向目标直播平台发送配置数据,所述配置数据包括隐私设置指示信息和音视频发布设置信息;
由服务器依据所述用户选择数据完成设置及存储该用户针对目标直播平台的配置数据。
在其中一些实施例中,所述方法还包括:
由服务器分别接收推流任务创建的直播间地址并存储。
本申请实施例还提供一种直播装置,所述直播装置包括音频处理模块、设备接口模块和处理器模块,所述音频处理模块包括音频输入接口和音频处理芯片,所述音频输入接口用于连接麦克风,所述音频处理芯片分别连接所述音频输入接口、所述设备接口模块和所述处理器模块,所述音频处理芯片对所述音频输入接口和/或所述设备接口模块输入的音频数据进行降噪和/或混音处理,并将处理后的所述音频数据传送至所述处理器模块;
所述处理器模块包括时间戳均匀化处理单元,所述时间戳均匀化处理单元包括获取模块、判断模块、补偿模块、调整模块和输出模块,所述获取模块与所述判断模块连接,所述判断模块与所述调整模块连接,所述调整模块与所述补偿模块连接,所述补偿模块与所述输出模块连接;
所述获取模块用于获取媒体流,其中,所述媒体流为音视频流;
所述判断模块用于获取当前媒体帧时间戳和上一媒体帧时间戳的差值和所述差值的上下限范围,并判断所述差值是否在所述上下限范围内,若所述判断结果为是,则所述输出模块输出所述当前媒体帧时间戳作为当前媒体帧目标时间戳,若所述判断结果为否,则所述补偿模块用于获取所述媒体流的标准媒体帧间隔,所述调整模块用于将所述当前媒体帧时间戳更新为所述上一媒体帧时间戳和所述标准媒体帧间隔之和;
所述判断模块还用于判断所述差值是否大于所述标准媒体帧间隔,若所述判断结果为是,则所述补偿模块根据补偿系数对更新后的所述当前媒体帧时间戳进行正向补偿,若所述判断结果为否,则所述补偿模块根据补偿系数对更新后的所述当前媒体帧时间戳进行反向补偿;
所述输出模块用于输出所述正向补偿或所述反向补偿后的时间戳作为所述当前媒体帧目标时间戳。
在其中一些实施例中,所述设备接口模块包括HDMI接口模块和/或USB接口模块,其中,所述HDMI接口模块包括至少一个HDMI输入接口,所述USB接口模块包括至少一个USB接口,所述HDMI输入接口和所述USB接口分别连接所述音频处理芯片。
在其中一些实施例中,所述HDMI接口模块还包括至少一个第一格式转换器,所述第一格式转换器连接所述HDMI输入接口和所述处理器模块,所述第一格式转换器将所述HDMI输入接口输入的数据由HDMI格式转换成MIPI格式,并将所述MIPI格式的数据传送至所述处理器模块,其中,所述HDMI输入接口输入的所述数据包括视频数据和/或所述音频数据。
在其中一些实施例中,所述USB接口模块包括第一USB接口和第二USB接口,所述第一USB接口通过所述处理器模块连接所述音频处理芯片,并用于向所述音频处理芯片输入所述音频数据;所述第二USB接口连接所述处理器模块,用于系统调试。
在其中一些实施例中,所述处理器模块包括USB端口,所述第一USB接口设为多个,所述USB接口模块还包括接口拓展器,所述接口拓展器一端连接所述USB端口,所述接口拓展器另一端连接多个所述第一USB接口。
在其中一些实施例中,所述音频输入接口包括有源输入接口和无源输入接口,其中,所述有源输入接口用于连接有源麦克风,所述无源输入接口用于连接无源麦克风。
在其中一些实施例中,所述音频处理模块还包括音频输出接口,所述音频输出接口连接所述音频处理芯片,并用于输出所述处理后的所述音频数据。
在其中一些实施例中,所述直播装置还包括显示模块,所述显示模块包括显示屏和第二格式转换器,所述第二格式转换器连接所述处理器模块和所述显示屏,所述处理器模块输出MIPI格式的数据,所述第二格式转换器将所述MIPI格式的数据转换成LVDS格式,所述显 示屏显示所述LVDS格式的数据,其中,所述处理器模块输出的MIPI格式数据包括视频数据。
在其中一些实施例中,所述显示屏包括触摸屏,所述USB接口模块包括第三USB接口,所述第三USB接口连接所述接口拓展器和所述触摸屏。
在其中一些实施例中,所述直播装置还包括数据输出模块,所述数据输出模块包括第三格式转换器和HDMI输出接口,所述第三格式转换器连接所述处理器模块和所述HDMI输出接口,所述第三格式转换器将所述处理器模块输出的数据由MIPI格式转换成HDMI格式,并将所述HDMI格式的数据传送至所述HDMI输出接口,其中,所述处理器模块输出的所述数据包括视频数据和所述音频数据。
在其中一些实施例中,所述处理器模块还包括音视频丢帧单元,所述音视频丢帧单元包括:
确定模块,用于确定音视频流中的每个类型帧对应的权重系数;
计算模块,用于根据每个类型帧的权重系数、队列的队列容量,计算出每个类型帧对应的丢帧判断阈值;
丢帧模块,用于在任一类型帧的发送时刻,若队列中该类型帧中两帧时间戳的最大时间间隔差值大于该类型帧对应的丢帧判断阈值,则执行丢帧操作。
在其中一些实施例中,所述处理器模块还包括音视频丢帧单元,所述音视频丢帧单元包括:
外部动态参数设置器,用于设置音频帧和视频帧的权重系数、以及设置丢帧判断阈值参数;
参数收集器,用于将丢帧判断相关的参数进行收集,参数包括权重系数、队列容量、丢帧判断阈值参数;
参数计算器,用于将收集到的参数根据计算规则得到各类型帧的丢帧判断阈值;
丢帧判定器,用于先寻找该类型帧丢帧判断阈值,再计算队列中该类型帧中两帧时间戳的最大时间间隔差值,根据丢帧判定原则对最大时间间隔差值与丢帧判断阈值进行比较判断;
丢帧执行器,用于当丢帧判定器判断出执行丢帧操作,则将队列中该类型帧按照时间戳由大到小进行依次丢弃,每丢弃一次该类型帧将反馈给参数计算器和丢帧判定器,重复计算队列中所丢类型帧中当前两帧时间戳的最大时间间隔差值并进行丢帧判定。
本申请实施例还提供一种电子设备,包括存储器和处理器,所述存储器中存储有计算机程序,所述处理器被设置为运行所述计算机程序以执行上述任一项所述的方法。
本申请实施例还提供一种存储介质,所述存储介质中存储有计算机程序,其中,所述计算机程序被设置为运行时执行上述任一项所述的方法。
相比于相关技术,在原始采集均匀性不佳或编码器输出的时间戳序列不均匀等情况下,即采集端为android系统,程序定时调用audiorecord接口,通过系统api获取的媒体帧可能存在不均匀、编码器内部可能存在队列缓冲区,媒体帧的编码处理过程也有时间差异,如果采集未标记时间戳,采用编码器输出的时间戳序列,可能存在不均匀,或者采集的时间戳是通过解码音频文件进一步得到的,解码每一帧的过程可能时间不均匀,进而导致采集不均匀等情况下,会使媒体流中的音视频帧携带的时间戳不均匀,导致播放端播放异常的问题。本申请实施例提供的音视频数据的处理方法中,根据当前媒体帧时间戳和上一媒体帧时间戳的差值判断是否需要对当前媒体帧时间戳进行校正,当差值在上下限范围内,则认为当前媒体帧时间戳和上一媒体帧时间戳的帧间隔符合要求,当差值在上下限范围外,则将当前媒体帧时间戳更新为上一媒体帧时间戳和标准媒体帧间隔之和,更新当前媒体帧时间戳后,判断更新前当前媒体帧时间戳和上一媒体帧时间戳的差值是否大于标准媒体帧间隔,若差值大于标准媒体帧间隔,更新当前媒体帧时间戳后,更新后的当前媒体帧时间戳和下一媒体帧时间戳的帧间隔会增大,故根据补偿系数对更新后的当前媒体帧时间戳进行正向补偿,经过正向补偿后,当前媒体帧目标时间戳与上一媒体帧时间戳的差值在标准媒体帧间隔的波动范围内,同时还缩小了更新后的当前媒体帧时间戳和下一媒体帧时间戳的帧间隔,对需要校正的每一媒 体帧时间戳经过校正和补偿的处理,不仅解决了媒体流中的音视频帧携带的时间戳不均匀,导致播放端播放异常的问题,还通过正向补偿和反向补偿平衡累计误差,防止调整时间戳序列过程中累计误差越积累越大,提升了音视频的兼容性,有标准化意义。
附图说明
此处所说明的附图用来提供对本申请的进一步理解,构成本申请的一部分,本申请的示意性实施例及其说明用于解释本申请,并不构成对本申请的不当限定。在附图中:
图1是根据本申请实施例1的一种音视频数据的处理方法的流程图;
图2是根据本申请实施例1的另一种音视频数据的处理方法的流程图;
图3是根据本申请实施例1的时间戳均匀化处理单元的结构框图;
图4是根据本申请实施例2的丢帧过程示意图;
图5是根据本申请实施例3的音视频推流数据的配置流程示意图;
图6是根据本申请实施例4的直播装置的应用环境示意图;
图7是根据本申请实施例4的第一种直播装置的示意图;
图8是根据本申请实施例4的第二种直播装置的示意图;
图9是根据本申请实施例4的第三种直播装置的示意图;
图10是根据本申请实施例4的第四种直播装置的示意图;
图11是根据本申请实施例4的第五种直播装置的示意图;
图12是根据本申请实施例4的第六种直播装置的示意图;
图13是根据本申请实施例4的第七种直播装置的示意图;
图14是根据本申请实施例5的电子设备的内部结构示意图。
具体实施方式
实施例1
本实施例提供了一种音视频数据的处理方法,图1是根据本申请实施例1的一种音视频数据的处理方法的流程图,如图1所示,该方法包括如下步骤:
步骤S101,获取媒体流,其中,媒体流为音视频流,音视频流包括视频流和音频流,媒体流是采用流式传输的方式使得流式媒体在互联网上播放的技术;
步骤S102,获取媒体流中当前媒体帧时间戳和上一媒体帧时间戳的差值,以及获取差值的上下限范围,并判断差值是否在上下限范围内;本实施例中,获取媒体流后,每一媒体帧数据都会附带媒体帧采集时刻的时间戳和媒体帧数据编码后标记的时间戳,本申请采用的时间戳可以为采集时刻的时间戳,也可以为编码后标记的时间戳;
步骤S103,若判断结果为是,则输出当前媒体帧时间戳作为当前媒体帧目标时间戳,若判断结果为否,则获取媒体流的标准媒体帧间隔,将当前媒体帧时间戳更新为上一媒体帧时间戳和标准媒体帧间隔之和;本实施例中,差值在上下限范围内,则认为当前媒体帧时间戳和上一媒体帧时间戳的帧间隔符合要求,无需对当前媒体帧时间戳进行校正,输出当前媒体帧时间戳作为当前媒体帧目标时间戳,当差值在上下限范围外,则会导致播放端解码后播放异常,故将当前媒体帧时间戳更新为上一媒体帧时间戳和标准媒体帧间隔之和;
步骤S104,判断差值是否大于标准媒体帧间隔,若判断结果为是,则根据补偿系数对更新后的当前媒体帧时间戳进行正向补偿,若判断结果为否,则根据补偿系数对更新后的当前媒体帧时间戳进行反向补偿,输出正向补偿或反向补偿后的时间戳作为当前媒体帧目标时间戳。
本实施例中,当差值在上下限范围外,则将当前媒体帧时间戳更新为上一媒体帧时间戳和标准媒体帧间隔之和之后,判断差值是否大于标准媒体帧间隔,若差值大于标准媒体帧间隔,更新当前媒体帧时间戳后,更新后的当前媒体帧时间戳和下一媒体帧时间戳的帧间隔会增大,故根据补偿系数对更新后的当前媒体帧时间戳进行正向补偿,经过正向补偿后,当前 媒体帧目标时间戳与上一媒体帧时间戳的差值还在上下限范围内,即在标准媒体帧间隔的波动范围内,并且同时缩小了更新后的当前媒体帧时间戳和下一媒体帧时间戳的帧间隔;若差值小于标准媒体帧间隔,更新当前媒体帧时间戳后,更新后的当前媒体帧时间戳和下一媒体帧时间戳的帧间隔会减小,故根据补偿系数对更新后的当前媒体帧时间戳进行反向补偿,经过反向补偿后,当前媒体帧目标时间戳与上一媒体帧时间戳的差值还在标准媒体帧间隔的波动范围内,同时还增大了更新后的当前媒体帧时间戳和下一媒体帧时间戳的帧间隔;对当前媒体帧时间戳进行校正后,通过正向补偿和反向补偿平衡累计误差,其中,考虑到补偿系数太大会进一步增大误差,太小的补偿能力有限,故补偿系数可设置为0.1倍的标准媒体帧间隔。
在其中一些实施例中,获取上下限范围包括,根据标准媒体帧间隔和波动上下限系数,获取上下限范围,其中,波动上下限系数小于播放端解码器能容忍的波动范围。本实施例中,波动上下限系数一般要求小于播放端解码器能容忍的波动范围,将波动上限系数设置为1.05,波动下限系数设置为0.8,则上限范围阈值为波动上限系数1.05乘于标准媒体帧间隔,下限范围阈值为波动下限系数0.8乘于标准媒体帧间隔,本申请将波动范围上限设置为1.05,远远小于h5播放器能容忍的波动范围上限1.5,不设置为最大上限,是因为较小的上限会使输出的均匀度提升,但是会导致需要调整的原始点变多,也会增大超过最大误差允许系数强制同步的可能性。较大的上限(小于播放器能容忍的最大上限1.5倍标准音频帧间隔)会使输出均匀性略微下降,但是需要调整的采集点会变少,也会减少差值超过最大误差允许系数,不进行校正的可能性,故可根据需求设定波动上下限系数,只要设定的波动上下限系数小于播放端解码器能容忍的波动范围即可,示例性的,波动上限系数×标准媒体帧间隔+补偿系数小于播放内核限定的上限系数×标准媒体帧间隔即可。
在其中一些实施例中,将当前媒体帧时间戳更新为上一媒体帧时间戳和标准媒体帧间隔之和之前,判断差值是否大于最大误差允许系数,若判断结果为是,则输出当前媒体帧时间戳作为当前媒体帧目标时间戳,若判断结果为否,则将当前媒体帧时间戳更新为上一媒体帧时间戳和标准媒体帧间隔之和,其中,最大误差允许系数为n倍的标准媒体帧间隔,n为大于1的数值。本实施例中,n为大于1的数值,并且n大于波动范围上限,波动范围上限改变时,可动态设置n的取值,在将当前媒体帧时间戳更新为上一媒体帧时间戳和标准媒体帧间隔之和之前,先判断差值是否有太大偏差,当差值大于n倍的标准媒体帧间隔时,不对当前媒体帧时间戳进行校正,直接输出当前媒体帧时间戳作为当前媒体帧目标时间戳,这是因为当差值大于n倍的标准媒体帧间隔时,意味着当前媒体帧时间戳和上一媒体帧时间戳有了较大的偏差,该偏差可能是采集端在采集媒体流数据时出现了中断情况,示例性的,采集端允许音频源切换,切换的过程会有一定的时间间隙,该时间间隙内没有音频帧产生,当音频帧恢复时,当前音频帧时间戳和上一音频时间戳的帧间隔会非常大,如果对当前音频帧时间戳进行校正,会导致从切换时刻开始的后续音频帧都需要经过较大幅度的调整,故直接输出当前媒体帧时间戳作为当前媒体帧目标时间戳,不进行校正,牺牲一个点的均匀性来消除这个大的差值对一大段时间戳序列的影响。
在其中一些实施例中,根据补偿系数对更新后的当前媒体帧时间戳进行正向补偿或反向补偿包括:
正向补偿为将更新后的当前媒体帧时间戳与补偿系数之和作为当前媒体帧目标时间戳;反向补偿为将更新后的当前媒体帧时间戳与补偿系数之差作为当前媒体帧目标时间戳。本实施例中,校正过程并不一定要将每帧的帧间隔校正为标准媒体帧间隔,例如,h5播放器能容忍的波动范围上限为1.5倍的标准音频帧间隔,当标准音频帧间隔为22.5毫秒,那么播放第一帧音频后,第二帧音频时间戳与第一帧音频时间戳的帧间隔只要在1.5*22.5毫秒内就能正常播放,故将当前媒体帧时间戳更新为上一媒体帧时间戳和标准媒体帧间隔之和之后,根据补偿系数对更新后的当前媒体帧时间戳进行正向补偿或反向补偿,正向补偿下,当前媒体帧目标时间戳为更新后的当前媒体帧时间戳与补偿系数之和,使播放器能正常播放的情况下, 缩小更新后的当前媒体帧时间戳和下一媒体帧时间戳的帧间隔,平衡累计误差,反向补偿下,当前媒体帧目标时间戳为更新后的当前媒体帧时间戳与补偿系数之差,使播放器能正常播放的情况下,增大更新后的当前媒体帧时间戳和下一媒体帧时间戳的帧间隔,平衡累计误差,通过引入补偿系数以平衡累计误差,防止累计误差越积累越大。
在其中一些实施例中,根据补偿系数对更新后的当前媒体帧时间戳进行正向补偿或反向补偿之后,根据当前媒体帧目标时间戳更新上一媒体帧时间戳,更新后的上一媒体帧时间戳作为下一媒体帧时间戳的上一媒体帧时间戳。本实施例中,根据补偿系数对更新后的当前媒体帧时间戳进行正向补偿或反向补偿之后,对当前媒体帧时间戳的校正就已经完成了,此时,判断是否需要对下一媒体帧时间戳进行校正,需求出下一媒体帧时间戳和当前媒体帧目标时间戳的帧间隔,故根据当前媒体帧目标时间戳更新上一媒体帧时间戳,更新后的上一媒体帧时间戳作为下一媒体帧时间戳的上一媒体帧时间戳,示例性的,某个音频帧时间戳序列为pts1、pts2、pts3和pts4,当前媒体帧时间戳为pts2,上一媒体帧时间戳prevpts=pts1,pts2与pts1差值大于标准音频帧间隔,对pts2进行校正和补偿,得到当前媒体帧目标时间戳PTS2,若pts3与PTS2差值也大于标准音频帧间隔,需对pts3进行校正和补偿,根据当前媒体帧目标时间戳更新上一媒体帧时间戳,即此时prevpts=PTS2,更新后的上一媒体帧时间戳prevpts=PTS2作为下一媒体帧时间戳pts3的上一媒体帧时间戳,通过pts3-PTS2求出pts3与上一媒体帧时间戳的差值。
在其中一些实施例中,获取媒体流的标准媒体帧间隔包括:若媒体流为视频流,则根据视频流的帧率,获取标准视频帧间隔,标准视频帧间隔作为标准媒体帧间隔;示例性的,视频流的帧率为30fps,则标准视频帧间隔为1/30×1000,单位为毫秒;若媒体流为音频流,则根据音频流的采样率和每帧音频实际采样点数,获取标准音频帧间隔,标准音频帧间隔作为标准媒体帧间隔。示例性的,音频流的采样率为44100HZ,采集端每帧读取音频实际采样点数为1024,则标准音频帧间隔为1024/44100×1000,单位为毫秒,通过本实施例计算出标准媒体帧间隔后,通过该标准媒体帧间隔对媒体帧时间戳进行校正。
在其中一些实施例中,图2是根据本申请实施例1的另一种音视频数据的处理方法的流程图,如图2所示,以媒体流为音频流为例,该方法包括如下步骤:
步骤S201,编码器输出音频帧时间戳pts,其中,输出的音频帧时间戳可以是音频帧采集时刻的时间戳,也可以是音频帧数据编码后标记的系统时间戳,采集时刻的时间戳相对于编码后标记的时间戳更准确,故pts推荐采用音频帧采集时刻的时间戳进行校正;
步骤S202,更新diff=pts-prevPts,即差值diff=当前音频帧时间戳pts-上一音频帧时间戳prevPts;
步骤S203,判断diff<lowThreshold||diff>highThreshold?即差值diff小于下限lowThreshold或差值diff大于上限highThreshold,则说明差值在上下限范围外,判断差值diff是否在上下限范围外,若判断结果为是,则跳转到步骤S205,若判断结果为否,则跳转到步骤S204;
步骤S204,curpts=pts,即差值diff在上下限范围内,无需校正,直接输出当前音频帧时间戳pts,或差值diff大于或等于n倍的标准音频帧间隔,无需校正,直接输出当前音频帧时间戳pts;
步骤S205,判断diff<n×SAMPLE_DURATION?即判断差值diff是否小于n倍的标准音频帧间隔SAMPLE_DURATION,若判断结果为是,则为正向补偿,跳转到步骤S206,若判断结果为否,则为反向补偿,跳转到步骤S204;
步骤S206,curpts=prevPts+SAMPLE_DURATION,即将当前音频帧时间戳curpts更新为上一音频帧时间戳prevPts与标准音频帧间隔之和;
步骤S207,判断diff<SAMPLE_DURATION?若判断结果为是,则跳转到步骤S208,若判断结果为否,则跳转到步骤S209;
步骤S208,normalAdujst=COMPENSATE,即正向补偿时,补偿系数normalAdujst为正 的COMPENSATE;
步骤S209,normalAdujst=-COMPENSATE,即反向补偿时,补偿系数normalAdujst为负的COMPENSATE;
步骤S210,curpts=curpts+normalAdujst,即将当前音频帧目标时间戳为更新后的当前音频帧时间戳与补偿系数之和;
步骤S211,prevPts=curpts,即根据当前音频帧时间戳更新上一音频帧时间戳,作为下一音频帧时间戳的上一音频帧时间戳;
步骤S212,输出curpts。
本实施例还提供了一种时间戳均匀化处理单元。图3是根据本申请实施例1的时间戳均匀化处理单元的结构框图,如图3所示,该时间戳均匀化处理单元包括获取模块31、判断模块32、调整模块33、补偿模块34和输出模块35,并且,获取模块31与判断模块32连接,判断模块32与调整模块33连接,调整模块33与补偿模块34连接、补偿模块34与输出模块35连接。
获取模块31,用于获取媒体流,其中,媒体流为音视频流,音视频流包括视频流和音频流;判断模块32,用于获取当前媒体帧时间戳和上一媒体帧时间戳的差值和差值的上下限范围,并判断差值是否在上下限范围内,若判断结果为是,则输出模块35输出当前媒体帧时间戳作为当前媒体帧目标时间戳,若判断结果为否,则补偿模块34获取媒体流的标准媒体帧间隔,调整模块33将当前媒体帧时间戳更新为上一媒体帧时间戳和标准媒体帧间隔之和;判断模块32判断差值是否大于标准媒体帧间隔,若判断结果为是,则补偿模块34根据补偿系数对更新后的当前媒体帧时间戳进行正向补偿,若判断结果为否,则补偿模块34根据补偿系数对更新后的当前媒体帧时间戳进行反向补偿;输出模块35输出正向补偿或反向补偿后的时间戳作为当前媒体帧时间戳,解决了媒体流中的音视频帧携带的时间戳不均匀,导致播放端播放异常的问题,还通过正向补偿和反向补偿平衡累计误差,防止累计误差越积累越大,提升了音视频的兼容性,有标准化意义。
实施例2
根据实施例1的技术方案,解决了媒体流中的音视频帧携带的时间戳不均匀,导致播放端播放异常的问题,保障了播放端的正常播放。因此,基于上述实施例1,本实施例进一步考虑到在网络条件不理想的情况下,视频直播画面可能会出现卡顿,造成观众的体验不佳;在现有技术中,为了改善观众端的视频质量,一般采取对音视频数据进行丢帧处理,但是传统的丢帧策略比较单一笼统,可能对视频质量影响较大。因此,本实施例提供的音视频数据的处理方法可以在实施例1的流程之前或之后,还包括音视频丢帧流程。
在其中一些实施例中,音视频丢帧流程包括以下步骤:
步骤一:确定音视频流中的每个类型帧对应的权重系数;
步骤二:根据每个类型帧的权重系数、对应队列的队列容量,计算出作为丢帧判断依据的每个类型帧的丢帧判断阈值;
步骤三:在任一类型帧的发送时刻,若队列中该类型帧中两帧时间戳的最大时间间隔差值大于该类型帧对应的丢帧判断阈值,则执行丢帧操作。
下文对每个步骤进行详细说明:
步骤一中,类型帧至少包括第一类型帧和第二类型帧,存在着两种丢帧操作方法。
丢帧操作方法一:
若所述第一类型帧的权重系数大于所述第二类型帧的权重系数,则将队列中第二类型帧按照时间戳由大到小进行依次丢弃。
丢帧操作方法二:
第二类型帧根据重要程度排序设立二级权重;
若所述第一类型帧的权重系数大于所述第二类型帧的权重系数,则将队列中第二类型帧按照二级权重由小到大进行依次丢弃。
上述类型帧至少包括第一类型帧以及第二类型帧,在现有的设计中,但不局限于此种设计方式,例如,第一类型帧设计为音频帧、第二类型帧设计为视频帧,音频帧的权重系数大于视频帧的权重系数;又例如,第一类型帧设计为视频帧、第二类型帧设计为编码帧,编码帧具体分为P帧、I帧、B帧,I帧的权重系数大于P帧的权重系数。进一步地,每一组GOP图像中的P帧还可以再进行重要程度排序,如设立二级权重,按照二级权重由小到大进行依次丢弃P帧,使丢帧更加精细化。
步骤二中,丢帧判断阈值的设计参考了设计的权重系数,队列的队列容量,计算出作为丢帧判断依据的每个类型帧的丢帧判断阈值,丢帧判断阈值计算的同时会加入丢帧判断阈值参数。
可选的,丢帧判断阈值可通过上述权重系数、队列容量、丢帧判断阈值参数三者的乘积获得。
步骤三中,上述该类型帧中两帧时间戳的最大时间间隔差值可为最靠后的该类型帧时间戳与最靠前的该类型帧时间戳的差值,也可为队列中靠后位置的该类型帧的时间戳差值,即可为队列中该类型帧不同位置的时间戳差值,具体根据实际情况设计得到。
在每进行一次丢帧操作之后,重复计算队列中所丢类型帧中当前两帧时间戳的最大时间间隔差值,再与丢帧判断阈值进行比较,判定是得到该类型帧发送操作指令或该类型帧丢帧操作指令,直至队列中该类型帧中两帧时间戳的最大时间间隔差值不大于该类型帧对应的丢帧判断阈值时停止丢帧操作。
值得说明的是,在执行丢帧过程中,先对权重系数最低的类型帧进行丢帧操作,直至所丢类型帧中当前两帧时间戳的最大时间间隔差值不大于该类型帧对应的丢帧判断阈值。若此时网络仍然存在拥塞情况,则对权重系数次低的类型帧进行丢帧操作。这样,以帧类型的权重系数为第一优先条件、各类型帧对应的丢帧判断阈值为第二优先条件进行丢帧判断并执行丢帧操作,可以减少丢帧对视频质量的影响。
示例地,在本实施例中,类型帧包括P帧、I帧,其中I帧的权重系数大于P帧的权重系数,则在网络拥塞的情况下,可以先对P帧进行丢帧判断并执行丢帧操作,在P帧满足两帧时间戳的最大时间间隔差值不大于P帧对应的丢帧判断阈值,且网络仍然存在拥塞情况时,才对I帧进行丢帧判断并执行丢帧操作,直至I帧满足两帧时间戳的最大时间间隔差值不大于I帧对应的丢帧判断阈值。
另外,为应对网络波动的情况,及位于阈值临界点附近时的丢帧抖动现象,丢帧判断引入了回置窗口高度,由此延伸出两种回置窗口高度应用逻辑,分别为回置窗口高度固定和动态调整。
回置窗口高度固定:单纯引入一个回置窗口高度,时间戳差值与丢帧判断阈值进行比较,直到时间戳差值小于丢帧判断阈值与回置窗口高度的差值得到对应类型帧发送操作指令。
回置窗口高度动态调整:回置窗口高度可根据队列中该类型帧中两帧时间戳的最大时间间隔差值与丢帧判断阈值的实际情况进行动态调整,直到队列中该类型帧中两帧时间戳的最大时间间隔差值小于丢帧判断阈值与回置窗口高度的差值得到对应类型帧发送操作指令;
目前设计出一套判定逻辑,但不局限于此,即回置窗口高度随着堆积比动态调整,堆积比为队列中该类型帧中两帧时间戳的最大时间间隔差值与丢帧判断阈值的比值;具体的判定逻辑如下:
当堆积比小于等于1时,回置窗口高度为0;
当堆积比大于1、且多出部分介于N倍丢帧阶梯系数至N+1倍丢帧阶梯系数之间,则回置窗口高度为N+1倍丢帧阶梯系数,N取0,1,2,……。
根据上述描述设计逻辑和描述内容,具体的音视频帧的设计过程如下:
1、权重表设计
音视频流媒体发送主要包含音频流与视频流,音频流主要包含音频帧,视频流常用的编码方式为H.264,主要包含P帧、I帧、B帧。
本方案将音频帧与视频帧统一纳入帧权重表,给定不同的权重系数;按照经验,音频帧由于人耳对于断续的音频流极为敏感、包数据量较小等特点,给予较高的权重系数;I帧作为关键帧,能独立解码,且作为P帧、B帧的解码参考,重要程度也相对较高,也给予较高的权重系数,按照经验得出弱网下推流效果较好的帧权重表参考如表1所示:
帧类型 帧名称 权重系数a(0-10)
音频 音频帧 8
视频 I帧 6
视频 P帧 3
表1
2、丢帧判断阈值的确定
本发明使用丢帧判断阈值作为丢帧判断依据,对网络拥塞情况描述更加直白、准确、灵敏。
丢帧判断阈值T的设计考虑帧权重系数a、队列容量n(n通常≥200)、丢帧判断阈值参数p,设计计算公式如下:T=p*a*n。
丢帧判断阈值参数p的经验值通常为0.002左右,此时,帧权重表更新如下述表2所示:
帧类型 帧名称 权重系数a(0-10) 丢帧判断阈值(T)
音频 音频帧 8 3.6
视频 I帧 6 2.4
视频 P帧 3 1.2
表2
3、缓冲队列
基于本方案的丢帧策略,音频帧往往重要度较高、不容易被丢弃,设计两个队列容器,一个作为音频帧发送缓冲,一个作为视频帧发送缓冲,如此可以大大降低丢帧判断算法的计算量。
缓冲队列可以采取的形式包括但不限于数组、列表、队列、链表等数据结构,通常采用先进先出的FIFO;如此,在每次计算丢帧判定时,可以将音频帧与视频帧分开计算。
4、丢帧判断与丢帧操作,图4是根据本申请实施例2的丢帧过程示意图,如图4所示,在任一帧的发送时刻,先进行该类型帧的丢帧判定策略执行,具体的判定逻辑如下:
1、根据该帧类型,在表中寻找丢帧判断阈值T;
2、根据音视频帧类型,统计对应队列中该类型帧的总时长S;
总时长S的计算方法为:寻找队列中最靠前的该类型时间戳F1与队列中最靠后的该类型时间戳F2,计算两帧时间戳的时间间隔Δ值即为S,S=F2-F1;
3、将计算所得的总时长S与丢帧判断阈值T做比较,若S≥T,则发生丢帧操作,丢帧的执行逻辑为按照队列中该类型帧的时间顺序从后往前进行依次丢弃,每丢弃一次重复计算当前总时长S,再和丢帧判断阈值T进行比较,直到S满足S<T-M。
其中,M为回置窗口高度,M的大小直接反映了该丢帧的数量,同时,M在一定程度上取决于S与T的比值,即堆积比Q,堆积比计算方法为Q=S/T。
现引入丢帧阶梯系数step,用于动态调整M的大小,以下举例但不限定,见表3:
堆积比Q 回置窗口高度M
≤1 M=0,不丢帧
1<Q≤1+step M=step,丢帧到(1-M)
1+step<Q≤1+2*step M=2*step,丢帧到(1-M)
依次类推 ……
表3
基于上述本申请内容,创新点可总结如下:
1、采用帧权重系数表来描述音视频帧的重要程度、丢帧优先级、计算丢帧容限度阈值, 采取了帧权重系数、丢帧判断阈值参数p、丢帧阶梯系数step、回置窗口高度M、丢帧判断阈值T等量化系数来精确描述丢帧操作;
2、基于本方案的丢帧策略,音频帧往往重要度较高、不容易被丢弃,设计两个队列容器,一个作为音频帧发送缓冲,一个作为视频帧发送缓冲,如此可以大大降低丢帧判断算法的计算量;
3、使用丢帧判断阈值作为丢帧判断依据,对网络拥塞情况描述更加直白、准确、灵敏,且丢帧时,能立即刷新当前总时长并与丢帧判断阈值再次比较,控制反应速度极快;
4、丢帧时,丢帧数量参照堆积比,能更加准确的衡量该丢帧数量的大小,将网络的状态与丢帧操作进行了很好的匹配,对于不同网络的拥塞程度具备较好的自适应性;网络拥塞情况越严重,丢帧数量增大,网络拥塞较轻,丢帧数量减小;
5、基于丢帧判断阈值,执行丢帧操作时使用回置窗口的设计,每次丢帧后留有一定余量,能大大减少丢帧操作反复进行的情况;
6、帧权重、丢帧判断阈值参数能动态调整,算法具备较好的适应性。
本实施例对比于现有技术的有益效果为:对音视频帧中的不同类型帧进行权重设计,依据上述丢帧方法的弃帧顺序逻辑,权重越低的帧越先被丢弃,并且可进一步对第二类型帧设立二级权重,丢帧的时候能够做到更加精细化;或者队列中时间戳越大的帧(后入队列的)会被先丢弃;其中,丢帧判断加入回置窗口高度,整体丢帧操作执行来看,由于增加了回置窗口,丢帧时间位于阈值临界点附近时的丢帧抖动现象能被大大消除,应对网络波动的情况,基本上一次丢帧就能覆盖一次网络波动时间;并且从丢帧数量与网络波动的匹配关系来看,由于丢帧时丢帧数量参照堆积比,即回置窗口高度取决于堆积比,将网络的状态与丢帧操作进行了很好的匹配,使得该丢帧数量的大小衡量更加精确,总体上呈现网络拥塞情况越严重,丢帧数量增大;网络拥塞较轻,丢帧数量减小的状态。
本实施例中,还提供一种音视频丢帧单元,该音视频丢帧单元包括依次电连接的编码器输出单元、帧接收单元、音视频丢帧单元和发送单元,其中,音视频丢帧单元包括确定模块、计算模块和丢帧模块,确定模块用于确定音视频流中的每个类型帧对应的权重系数;计算模块用于根据每个类型帧的权重系数、队列的队列容量,计算出每个类型帧对应的丢帧判断阈值;丢帧模块用于在任一类型帧的发送时刻,若队列中该类型帧中两帧时间戳的最大时间间隔差值大于该类型帧对应的丢帧判断阈值,则执行丢帧操作。
在其中一个实施例中,还提供了一种音视频丢帧单元,该音视频丢帧单元包括外部动态参数设置器、参数收集器、参数计算器、丢帧判定器和丢帧执行器。
外部动态参数设置器用于设置音频帧和视频帧的权重系数、以及设置丢帧判断阈值参数;
参数收集器用于将丢帧判断相关的参数进行收集,参数包括权重系数、队列容量、丢帧判断阈值参数;
参数计算器用于将收集到的参数根据计算规则得到各类型帧的丢帧判断阈值;
丢帧判定器用于先寻找该类型帧丢帧判断阈值,再计算队列中该类型帧中两帧时间戳的最大时间间隔差值,根据丢帧判定原则对最大时间间隔差值与丢帧判断阈值进行比较判断;
丢帧执行器用于当丢帧判定器判断出进行丢帧操作,则将队列中该类型帧按照时间戳由大到小进行依次丢弃,每丢弃一次该类型帧将反馈给参数计算器和丢帧判定器,重复计算队列中所丢类型帧中当前两帧时间戳的最大时间间隔差值并进行丢帧判定。
实施例3
基于上述任一项实施例,本实施例还包括音视频推流数据的处理方法,旨在实现一个主播对多平台的推流。该音视频推流数据的处理流程包括以下步骤:
步骤1:获取用户上传的音视频数据,其中,所述音视频数据以音视频流的形式传输,且所述音视频数据携带用户绑定账号信息,且该账号已针对多个目标直播平台的播放参数完成配置;
步骤2:由服务器分别针对各目标直播平台创建推流任务;
步骤3:在该用户绑定账号下,由所述服务器向多个目标直播平台分发该用户的音视频数据。
因此,通过在一个平台上将绑定的用户绑定账号针对多个目标直播平台的播放参数完成基于直播的配置;由服务器分别针对各目标直播平台创建推流任务,向多个目标直播平台分发该用户的音视频数据,满足了针对一个主播在多个平台同时直播的需求,克服了现有技术中无法实现一键多平台推流的技术缺陷。下文对每个步骤进行详细说明:
步骤1中,在直播一体机的客户端,用户的账号需要通过客户端与多个目标直播平台进行授权绑定,其流程优选包括以下步骤:
(1)用户点击添加账号按钮;
(2)通过浏览器调起目标平台登录授权网页;
(3)用户登录目标平台完成授权;
(4)后端云服务器建立本客户端所在智能端的账号与目标直播平台的账号之间的链接;
(5)客户端接收到新增绑定信息,绑定完成。
以上流程首先通过目标平台开放接口,获取平台账号信息,直播推流等权限。用户可以以登录到目标平台,并将平台账号与本机账号绑定,授权本机账号可操作平台账号,在平台账号下进行直播推流等操作。绑定完成后服务器将会记录设备登录的用户(本机账号)与目标平台(第三方直播平台)账号的一对多的绑定关系,持久化存储在服务器数据库中。
提供给用户完成多个平台的直播参数配置交互信息的流程包括以下步骤:
(1)在该用户绑定账号下,向用户发送直播平台配置交互信息;
所述配置交互信息可以通过直播一体机的弹窗交互界面向用户展示,也可以以推送方式向用户提示,并不局限。
(2)由服务器响应该用户基于所述交互信息的配置指示,生成匹配对应直播平台需求信息的直播平台配置数据。
上述方法作为示例在本方案中列举,所述多个目标直播平台并不作为局限本发明范围的条件。
用户上传音视频数据后,可通过如下步骤完成推流的数据准备:
(1)通过音视频采集设备采集音视频数据;
(2)通过对所述音视频数据进行编码及封装,以准备进行针对多个目标直播平台的推流。
其中,包括一路或者多路音视频采集设备,该音视频采集设备可以是外接的摄像头/麦克风,也可以是具备音视频采集功能的录播一体机、直播一体机,还可以是软件采集模块,如虚拟摄像头、虚拟麦克风等。示例地,以上步骤可通过直播一体机本机完成,也可以通过与服务器配合的方式完成,并不局限。
步骤2中,音视频数据到达服务器之后,根据服务器中持有的目标平台数量创建多个任务。每个任务都是-条负责将流分发产线,以单独完成将音视频数据推送到对应目标直播平台及直播间。
步骤3中,目标直播平台直播间在配置完成后,可供协议互通的音视频信号推流开播,比如RTMP协议,当推流-旦到达就可配置信息进行直播,直播平台A,晚8点开播,仅好友可见;直播平台B为全部公开,且开播时间为晚9点,每个直播间互不影响,即使其中一个推流任务执行失败,其他直播平台的直播间也可以继续直播。
对于音视频的同时播放,可以通过如内置Camera或者外接HDMI、USB视频采集设备采集音视频数据。通过编码转码操作将数据封装成RTMP协议规定格式。利用音频均匀化算法,在视频数据发送之前,将音频数据与视频数据进行对齐操作来保证音画同步。由于云服务器中已经持有本机账号,本机账号绑定的多个目标平台账号,目标直播平台的推流地址以及可完成推流的权限。那么就可以将准备好的音视频数据推流到后端云服务器,通过点击开始直播按钮获取云服务器的推流地址,开始推流。此过程中可优选开源技术方案librtmp作为RTMP数据传输工具,并且需要将librtmp交叉编译并移植到该直播一体机中。
图5是根据本申请实施例3的音视频推流数据的配置流程示意图,如图5所示,针对客户端向用户发送直播平台配置交互信息的流程,具体可通过如下步骤实现:
S801:向用户发送针对目标直播平台的绑定账号信息;
由服务器响应该用户基于所述交互信息的配置指示,包括:
S802:当所述绑定账号信息请求授权通过时,客户端接收用户选择数据并向目标直播平台发送配置数据,所述配置数据包括隐私设置指示信息和音视频发布设置信息;
需要说明的是,所述用户选择数据可以是按钮或者启停选择,或是其他可以反映用户在配置过程中的个性化需求的形式,客户端接收选择数据,在本地进行个性化配置数据的解读和打包,发往服务器。
S803:由服务器依据所述用户选择数据完成设置及存储该用户针对目标直播平台的配置数据。
例如,用户需要设置直播平台A直播观看权限的隐私设置为仅好友可见,就可通过点击已绑定完整的直播平台列表中目标直播平台右侧的箭头图标进入设置页,进入页面可以看到三种隐私权限交互信息,可以是操作按钮形式,点击“仅好友可见”按钮,点击完成执行设置操作。随后设备会将用户操作交互信息转换为通信消息:直播平台唯-标识符publishId、权限等级privacy2个参数传递给后端服务器即可将用户配置的隐私选项设置到目标A的直播平台。
在其中一些实施例中,交互界面包括“视频区域”,以及“时间线”、“粉丝圈”、“小组”、“公开”和“仅朋友可见”等相关设置功能。
作为优选,由服务器分别接收推流任务创建的直播间地址并存储。
由客户端或者由服务器通过客户端向用户发送直播平台配置交互信息,包括:
由客户端向用户发送针对目标直播平台的配置交互界面;
响应该用户基于所述交互信息的配置指示,包括:
当所述绑定账号信息请求授权通过时,接收用户选择数据并向目标直播平台发送配置数据,所述配置数据包括隐私设置指示信息和音视频发布设置信息;
依据所述用户选择数据完成设置及存储该用户针对目标直播平台的配置数据。
实施例4
本实施例考虑到,在直播的过程中,经常需要接入多路音频信号进行直播,但在相关技术中的小型化的直播装置中,输入直播装置的音频数据是通过直播装置的处理器模块进行处理,导致处理器模块运算量大,造成处理器模块运行效率低,容易造成直播卡顿现象,影响直播呈现内容的质量。
本实施例提供的直播装置,可以应用于如图6所示的应用环境中,图6是根据本申请实施例4的直播装置的应用环境示意图,如图6所示,该直播装置12集成导播台、硬盘录像机、编码器、采集卡等多种直播装置的功能于一体,针对高清摄像机、麦克风、摄像头等多种直播设备11,可以进行多路视频采集、解码、编码、推流等直播数据处理操作;用户在直播时,将直播设备11连上该直播装置12,直播装置12通过网络连接远端的服务器13,该直播装置12将处理后的数据推送至服务器13,服务器13将数据转发至多个直播平台,观众可以通过平板、手机、电脑等观看设备14在各个直播平台上观看直播。
在相关技术中,小型化的直播装置输入的音频数据是通过直播装置的处理器模块进行处理,导致处理器模块运算量大,造成处理器模块运行效率低,从而容易造成直播卡顿现象,影响直播呈现内容的质量。
本实施例提供了一种直播装置,图7是根据本申请实施例4的第一种直播装置的示意图,如图7所示,该直播装置12包括音频处理模块21、设备接口模块22和处理器模块23,其中,设备接口模块22可以用于连接高清摄像机、摄像头等直播设备,音频处理模块21包括音频输入接口211和音频处理芯片212,音频输入接口211可以用于连接麦克风,音频处理芯片212分别连接音频输入接口211、设备接口模块22和处理器模块23,音频处理芯片212对音 频输入接口211和/或设备接口模块22输入的音频数据进行降噪和/或混音处理,并将处理后的音频数据传送至处理器模块23,可选的,该音频处理芯片212的型号可以是AK7735。
通过设置音频处理芯片212,并将音频处理芯片212分别与多路音频输入端相连接,该多路音频输入端包括音频输入接口211和设备接口模块22,在音频处理芯片212内可以进行对音频输入接口211和/或设备接口模块22输入的音频数据的降噪和/或混音处理,音频处理芯片212将处理后的音频数据传送至处理器模块23,处理器模块23在处理直播数据的过程中无需再进行音频处理工作,相对于现有技术中小型化的直播装置的音频数据是通过处理器模块23进行处理,导致处理器模块23运算量大的问题,本实施例中的直播装置12提高了处理器模块23运行效率,有利于减少直播卡顿现象,从而提高直播呈现内容的质量。
另外,通过在音频处理芯片212内对音频输入接口211和/或设备接口模块22输入的音频数据进行降噪和/或混音处理,也可以实现该直播装置12对应的用户界面的音量调节、音频源切换、混音等功能。
可选的,如图7所示,音频输入接口211包括有源输入接口2111(或者称为Line In接口)和无源输入接口2112(或者称为Mic In接口),其中,有源输入接口2111用于连接有源麦克风,无源输入接口2112用于连接无源麦克风;通过设置有源输入接口2111和无源输入接口2112,针对不同类型的输入音频,该直播装置12支持有源麦克风输入和无源麦克风输入,适用性好。
可选的,如图7所示,音频处理模块21还包括音频输出接口213(或者称为Audio out接口),音频输出接口213连接音频处理芯片212,并用于将音频处理芯片212处理后的音频数据输出至耳机等设备。
在其中一些实施例中,图8是根据本实施例4的第二种直播装置的示意图,如图8所示,该设备接口模块22包括HDMI接口模块31和USB接口模块32,其中,高清多媒体接口(HighDefinition Multimedia Interface,简称HDMI)是一种全数字化视频和声音发送接口,可以发送未压缩的音频及视频信号;通用串行总线(Universal Serial Bus,简称USB)是一种串口总线标准,也是一种输入输出接口的技术规范,被广泛地应用于个人电脑和移动设备等信息通讯产品,并扩展至摄影器材、数字电视(机顶盒)、游戏机等其它相关领域。
进一步的,HDMI接口模块31包括多个HDMI输入接口311和多个第一格式转换器312,多个HDMI输入接口311分别连接音频处理芯片212;多个HDMI输入接口311和多个第一格式转换器312一一对应连接,并且第一格式转换器312的一端连接HDMI输入接口311,另一端连接处理器模块23;通过将HDMI输入接口311设置成多个,该直播装置12支持多视频接入,从而满足部分用户直播时的多路视频接入的需求;通过设置该第一格式转换器312,可以将输入的数据由HDMI格式转换成MIPI格式,因此该直播装置12可以适配市面上通用的摄像机和单反相机,解决了相关技术中的便携式编码器兼容性差的问题,提高了直播装置12的适用性,可选的,第一格式转换器312的芯片型号可以是龙讯LT6911HDMI转MIPI桥接芯片。
如图8所示,HDMI输入接口311可以外接高清摄像机等直播设备,HDMI输入接口311输入的数据包括视频数据和/或音频数据,第一格式转换器312将HDMI输入接口311输入的视频数据和/或音频数据由HDMI格式转换成MIPI格式,并将MIPI格式的视频数据和/或音频数据传送至处理器模块23,处理器模块23接收到视频数据后对视频数据进行处理,可选的,该处理器模块23可以是移远SC66智能模块,移远SC66智能模块集成了高通骁龙8核处理器与高通Adreno 512图形处理器(Graphic Processing Unit,简称GPU),支持多路且每路高达1080P格式的视频数据的解码和编码处理。
在其他一些实施例中,该设备接口模块22也可以只包括HDMI接口模块31,或者只包括USB接口模块32。
在其中一些实施例中,图9是根据本申请实施例4的第三种直播装置的示意图,如图9所示,USB接口模块32包括第一USB接口41、第二USB接口42和第三USB接口43,处 理器模块23包括USB端口,其中,第一USB接口41连接处理器模块23的USB端口,并通过该处理器模块23向音频处理芯片212输入音频数据,可选的,如图9所示,USB接口模块32还可以包括接口拓展器44,该接口拓展器44一端连接USB端口,另一端连接多个第一USB接口41和第三USB接口43,通过设置接口拓展器44,可以将单个USB端口拓展成多个第一USB接口41,使直播装置12支持多设备接入,例如,该直播装置12可以在多个第一USB接口41上分别接入USB A型的物理接口的鼠标、键盘、摄像头等设备,同时,通过设置接口拓展器44,还可以将第三USB接口43也集成在处理器模块23的USB端口,该第三USB接口43可以用于连接触摸屏,其中,该接口拓展器44的芯片型号可以是LAN9514;第二USB接口42连接处理器模块23,用于系统调试,该第二USB接口42不对用户开放。
在其中一些实施例中,图10是根据本申请实施例4的第四种直播装置的示意图,如图10所示,直播装置12还包括显示模块50,显示模块50包括显示屏51和第二格式转换器52,第二格式转换器52的一端连接处理器模块23,第二格式转换器52的另一端连接显示屏51,处理器模块23输出MIPI格式的视频数据,第二格式转换器52将MIPI格式的视频数据转换成LVDS格式,显示屏51显示LVDS格式的视频数据,可选的,该第二格式转换器52的芯片型号可以是龙讯LT9211 MIPI转LVDS桥接芯片;通过设置显示模块50,该直播装置12支持搭载不同尺寸规格的LVDS接口的显示屏51,用户在直播时,可以通过该LVDS接口的显示屏51实时观看视频画面。
可选的,如图10所示,显示屏51包括触摸屏511,第三USB接口43连接该触摸屏511,从而该触摸屏511捕捉到的触摸信号可以通过该第三USB接口43传输至处理器模块23,使得处理器模块23能够对触摸信号作出响应。
在其中一些实施例中,图11是根据本申请实施例4的第五种直播装置的示意图,如图11所示,直播装置还包括数据输出模块60,数据输出模块60包括第三格式转换器61和HDMI输出接口62,第三格式转换器61的一端连接处理器模块23,第三格式转换器61的另一端连接HDMI输出接口62,第三格式转换器61将处理器模块23输出的视频数据和音频数据由MIPI格式转换成HDMI格式,并将HDMI格式的视频数据和音频数据传送至HDMI输出接口62,可选的,该第三格式转换器61的芯片型号可以是龙讯LT9611 MIPI转HDMI桥接芯片;用户在直播时,可以将HDMI输出接口与HDMI接口的显示器相连接,从而可以在该HDMI接口的显示器上实时观看视频画面。
在其中一些实施例中,图12是根据本申请实施例4的第六种直播装置的示意图,如图12所示,直播装置12还包括网络模块70,该网络模块可以实现WIFI连接、有线网络连接和4G网络连接等多种联网方式,使该直播装置12支持在有线网络或者无线网络下工作,该网络模块70连接处理器模块23,用于将处理器模块23处理后的视频数据或音频数据推送到服务器,从而该服务器可以将视频数据或音频数据转发至多个网络直播平台。
In some of these embodiments, FIG. 13 is a schematic diagram of a seventh live streaming apparatus according to Embodiment 4 of the present application. As shown in FIG. 13, the audio processing chip 212 provides I2S1, I2S2, I2S3, AIN1, AIN2, I2C, and AOUT1 ports; the processor module 23 provides MIPI CSI1, MIPI CSI2, I2S, I2C, USIM, USB 3.0, POWER CORE, LCD MIPI, USB 2.0, and MIPI DSI ports; the second format converter 52 provides LVDS1 and LVDS2 ports; the display screen 51 provides a TP (touch panel) port; and the port expander 44 provides USB0, USB1, USB2, USB3, and PHY ports. The live streaming apparatus 12 further includes a SIM interface 81, a power input interface 82, a power conversion chip 83, and a network port 84. The connections between the components of the live streaming apparatus 12, or between their ports, are as shown in FIG. 13.
As shown in FIG. 13, the SIM interface 81 accepts a SIM card and is connected to the USIM port of the processor module 23. The power input interface 82 accepts a power supply, and the power conversion chip 83 is connected between the power input interface 82 and the POWER CORE port of the processor module 23 for supply-voltage conversion. Optionally, the power conversion chip 83 may be an RT7295, which converts the 12 V input at the power input interface into the 3.9 V required by the processor module 23 and delivers the 3.9 V to it. The network port 84 is connected to the port expander 44 and is used to attach a network cable.
The live streaming apparatus 12 provided in this embodiment achieves multi-channel video capture by providing multiple HDMI input ports 311 and multiple first USB ports 41; achieves noise reduction and/or mixing of the input audio data by providing the audio processing module 21 and connecting it to the device interface module 22 and the processor module 23; achieves decoding and encoding of the input video and audio data by providing the processor module 23 and connecting it to the audio processing module 21 and the device interface module 22; enables real-time viewing of the video feed by providing the display module 50; converts the data format of the video and audio data and outputs the data by providing the data output module 60; and pushes the video and audio data over the network by providing the network module 70. The apparatus therefore integrates multi-channel video capture, decoding, encoding, and stream pushing in a single unit; users do not need additional equipment such as a video switcher, digital video recorder, encoder, or capture card, which makes live streaming more convenient and helps lower its cost.
Because the live streaming apparatus includes an audio processing module, a device interface module, and a processor module, where the audio processing module includes an audio input interface for connecting a microphone and an audio processing chip connected to the audio input interface, the device interface module, and the processor module, and the audio processing chip performs noise reduction and/or mixing on the audio data received from the audio input interface and/or the device interface module and transmits the processed audio data to the processor module, the problem in the related art that the processor module of a live streaming apparatus runs inefficiently and degrades the quality of the presented stream is solved, and the viewing experience of the audience is improved.
It should be noted that the processor module of the above live streaming apparatus may further include the timestamp uniformization unit of Embodiment 1 and/or the audio/video frame-dropping unit of Embodiment 2, so as to perform timestamp uniformization and/or audio/video frame dropping.
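For orientation, the following is a minimal sketch of what that timestamp uniformization unit does, following the procedure recited in claims 1 to 6 below; the fluctuation bound, compensation coefficient, and maximum-error factor are illustrative values chosen here, not values fixed by the application, and timestamps are assumed to be in milliseconds.

```python
def uniformize_timestamp(
    cur_ts: float,
    prev_ts: float,
    standard_interval: float,
    fluctuation: float = 5.0,      # claim 5: must stay below the playback decoder's tolerance
    compensation: float = 1.0,     # compensation coefficient of claim 3
    max_error_factor: float = 3.0, # the "n > 1" factor of claim 2
) -> float:
    """Return the target timestamp of the current media frame (claims 1-5)."""
    diff = cur_ts - prev_ts
    lower = standard_interval - fluctuation
    upper = standard_interval + fluctuation

    # Within the allowed fluctuation range: keep the original timestamp.
    if lower <= diff <= upper:
        return cur_ts

    # Claim 2: a gap larger than n standard intervals is passed through unchanged.
    if diff > max_error_factor * standard_interval:
        return cur_ts

    # Otherwise rebase onto the previous timestamp plus one standard interval ...
    target = prev_ts + standard_interval
    # ... and compensate in the direction of the original deviation (claim 3).
    if diff > standard_interval:
        target += compensation   # forward compensation
    else:
        target -= compensation   # backward compensation
    return target


# Claim 6: the standard interval comes from the frame rate (video stream) or from
# the sample rate and the actual samples per frame (audio stream).
def standard_video_interval(fps: float) -> float:
    return 1000.0 / fps

def standard_audio_interval(sample_rate: int, samples_per_frame: int) -> float:
    return 1000.0 * samples_per_frame / sample_rate
```

Per claim 4, the returned target timestamp then replaces the previous-frame timestamp before the next media frame is handled.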
In some of these embodiments, an audio/video data processing system may be provided. The system includes the above live streaming apparatus and may also include the server of Embodiment 3; in the scenario of Embodiment 3, the client may likewise serve as the above live streaming apparatus, that is, its processor module may implement the method of Embodiment 3 for processing audio/video push-stream data.
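For that push-stream processing (claims 12 to 15 below), the server-side behaviour reduces to creating one push task per target platform configured under the user's bound account and fanning the uploaded stream out to all of them. The sketch below illustrates only that control flow; the class names, fields, and byte-chunk interface are assumptions, not an API defined by the application.

```python
from dataclasses import dataclass, field

@dataclass
class PlatformConfig:
    name: str          # target live streaming platform
    ingest_url: str    # push address configured under the bound account
    room_url: str = "" # live-room address returned when the task is created (claim 15)

@dataclass
class PushTask:
    platform: PlatformConfig

    def send(self, chunk: bytes) -> None:
        # Placeholder: a real task would push the chunk to platform.ingest_url.
        pass

@dataclass
class BoundAccount:
    user_id: str
    platforms: list = field(default_factory=list)  # list of PlatformConfig

def create_push_tasks(account: BoundAccount) -> list:
    """Claim 12: the server creates one push task per configured target platform."""
    return [PushTask(p) for p in account.platforms]

def distribute(tasks: list, av_stream) -> None:
    """Claim 12: fan the user's audio/video data out to every target platform."""
    for chunk in av_stream:
        for task in tasks:
            task.send(chunk)
```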
Each of the above modules may be a functional module or a program module and may be implemented either in software or in hardware. For modules implemented in hardware, the modules may reside in the same processor, or they may be distributed over different processors in any combination.
Embodiment 5
In combination with the audio/video data processing methods of the above embodiments, an embodiment of the present application may provide a storage medium. A computer program is stored on the storage medium; when executed by a processor, the computer program implements any one of the audio/video data processing methods of the above embodiments.
An embodiment of the present application further provides an electronic device. FIG. 14 is a schematic diagram of the internal structure of the electronic device according to Embodiment 5 of the present application. As shown in FIG. 14, an electronic device is provided; the electronic device may be a server, and its internal structure may be as shown in FIG. 14. The electronic device includes a processor, a network interface, an internal memory, and a non-volatile memory connected through an internal bus, where the non-volatile memory stores an operating system, a computer program, and a database. The processor provides computing and control capability, the network interface communicates with external terminals over a network connection, the internal memory provides an environment for running the operating system and the computer program, the computer program is executed by the processor to implement an audio/video data processing method, and the database stores data.
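Among the processing methods such a device can execute is the audio/video frame-dropping step of Embodiment 2 (claims 7 to 11 below): each frame type is assigned a weight coefficient, a drop threshold is derived from that weight and the queue capacity, and frames of a type are discarded while the timestamp span queued for that type exceeds its threshold. The sketch below assumes the threshold is simply weight times queue capacity and drops the lower-weight type newest-first, which is one reading of claims 7, 8, and 10; the application leaves the exact threshold formula open.

```python
from collections import deque
from dataclasses import dataclass

@dataclass
class Frame:
    kind: str   # e.g. "audio" or "video"
    ts: float   # timestamp in milliseconds

def drop_threshold(weight: float, queue_capacity: int) -> float:
    # Assumed formula: claim 7 only says the threshold is derived from the
    # weight coefficient and the queue capacity.
    return weight * queue_capacity

def ts_span(queue: deque, kind: str) -> float:
    """Maximum timestamp gap between frames of one type currently in the queue."""
    stamps = [f.ts for f in queue if f.kind == kind]
    return (max(stamps) - min(stamps)) if len(stamps) >= 2 else 0.0

def drop_frames(queue: deque, kind: str, threshold: float) -> None:
    """Claims 7, 8, 10: while the queued span of `kind` exceeds its threshold,
    discard that type's frames starting from the largest timestamp."""
    while ts_span(queue, kind) > threshold:
        newest = max((f for f in queue if f.kind == kind), key=lambda f: f.ts)
        queue.remove(newest)

# Illustrative use: audio weighted higher than video, so video is the type that
# gets dropped when its queued timestamp span grows too large.
queue: deque = deque(maxlen=256)
weights = {"audio": 1.0, "video": 0.5}
video_threshold = drop_threshold(weights["video"], queue.maxlen)
drop_frames(queue, "video", video_threshold)
```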
Those skilled in the art will understand that the structure shown in FIG. 14 is merely a block diagram of part of the structure relevant to the solution of the present application and does not limit the electronic device to which the solution is applied; a specific electronic device may include more or fewer components than shown in the figure, combine certain components, or have a different arrangement of components.
Those skilled in the art should understand that the technical features of the above embodiments can be combined arbitrarily. For brevity, not all possible combinations of the technical features in the above embodiments have been described; nevertheless, as long as a combination of these technical features involves no contradiction, it should be considered to fall within the scope of this specification.

Claims (29)

  1. A method for processing audio and video data, characterized in that the method comprises:
    acquiring a media stream, wherein the media stream is an audio/video stream, and the audio/video stream comprises a video stream and an audio stream;
    acquiring a difference between a timestamp of a current media frame and a timestamp of a previous media frame in the media stream, acquiring an upper-and-lower bound range for the difference, and determining whether the difference is within the upper-and-lower bound range;
    if the determination result is yes, outputting the timestamp of the current media frame as a target timestamp of the current media frame; if the determination result is no, acquiring a standard media frame interval of the media stream and updating the timestamp of the current media frame to the sum of the timestamp of the previous media frame and the standard media frame interval;
    determining whether the difference is greater than the standard media frame interval; if the determination result is yes, applying forward compensation to the updated timestamp of the current media frame according to a compensation coefficient; if the determination result is no, applying backward compensation to the updated timestamp of the current media frame according to the compensation coefficient; and outputting the forward-compensated or backward-compensated timestamp as the target timestamp of the current media frame.
  2. The method according to claim 1, characterized in that before updating the timestamp of the current media frame to the sum of the timestamp of the previous media frame and the standard media frame interval, the method further comprises:
    determining whether the difference is greater than a maximum allowable error coefficient; if the determination result is yes, outputting the timestamp of the current media frame as the target timestamp of the current media frame; if the determination result is no, updating the timestamp of the current media frame to the sum of the timestamp of the previous media frame and the standard media frame interval, wherein the maximum allowable error coefficient is n times the standard media frame interval, and n is a value greater than 1.
  3. The method according to claim 1, characterized in that applying forward compensation or backward compensation to the updated timestamp of the current media frame according to the compensation coefficient comprises:
    the forward compensation is taking the sum of the updated timestamp of the current media frame and the compensation coefficient as the target timestamp of the current media frame;
    the backward compensation is taking the difference between the updated timestamp of the current media frame and the compensation coefficient as the target timestamp of the current media frame.
  4. The method according to claim 1 or 3, characterized in that after applying forward compensation or backward compensation to the updated timestamp of the current media frame according to the compensation coefficient, the method further comprises:
    updating the previous-media-frame timestamp according to the target timestamp of the current media frame, the updated previous-media-frame timestamp serving as the previous-media-frame timestamp for the timestamp of the next media frame.
  5. The method according to claim 1, characterized in that acquiring the upper-and-lower bound range comprises acquiring the upper-and-lower bound range according to the standard media frame interval and upper and lower fluctuation coefficients, wherein the fluctuation coefficients are smaller than the fluctuation range that a decoder at the playback end can tolerate.
  6. The method according to claim 1, characterized in that acquiring the standard media frame interval of the media stream comprises:
    if the media stream is a video stream, acquiring a standard video frame interval according to the frame rate of the video stream, the standard video frame interval serving as the standard media frame interval;
    if the media stream is an audio stream, acquiring a standard audio frame interval according to the sample rate of the audio stream and the actual number of samples per audio frame, the standard audio frame interval serving as the standard media frame interval.
  7. The method according to claim 1, characterized in that the method further comprises:
    determining a weight coefficient corresponding to each frame type in the audio/video stream;
    calculating, according to the weight coefficient of each frame type and the queue capacity of a queue, a frame-drop decision threshold corresponding to each frame type;
    at the sending moment of any frame type, if the maximum timestamp interval between two frames of that type in the queue is greater than the frame-drop decision threshold corresponding to that type, performing a frame-drop operation.
  8. The method according to claim 7, characterized in that the frame types comprise at least a first frame type and a second frame type, and the frame-drop operation comprises:
    if the weight coefficient of the first frame type is greater than the weight coefficient of the second frame type, discarding the second-type frames in the queue one by one in descending order of timestamp.
  9. The method according to claim 7, characterized in that the frame types comprise at least a first frame type and a second frame type, the second frame type is given secondary weights ranked by importance, and the frame-drop operation comprises:
    if the weight coefficient of the first frame type is greater than the weight coefficient of the second frame type, discarding the second-type frames in the queue one by one in ascending order of secondary weight.
  10. The method according to any one of claims 7 to 9, characterized in that the method further comprises:
    after each frame-drop operation, recalculating the maximum timestamp interval between the current two frames of the dropped frame type in the queue and comparing it with the frame-drop decision threshold corresponding to that type, and stopping the frame-drop operation when the maximum timestamp interval between two frames of that type in the queue is no longer greater than the frame-drop decision threshold corresponding to that type.
  11. The method according to any one of claims 7 to 9, characterized in that the method further comprises:
    calculating a backlog ratio for each frame type in the queue, the backlog ratio being the ratio of the maximum timestamp interval between the current two frames of a frame type to the frame-drop decision threshold of that type;
    determining a setback window height corresponding to each frame type according to a preset correspondence between the backlog ratio and the setback window height;
    after each frame-drop operation, recalculating the maximum timestamp interval between the current two frames of the dropped frame type in the queue, and stopping the frame-drop operation if the maximum timestamp interval is smaller than the difference between the frame-drop decision threshold corresponding to that type and the setback window height.
  12. The method according to claim 1, characterized in that the method further comprises:
    acquiring audio/video data uploaded by a user, wherein the audio/video data is transmitted in the form of an audio/video stream and carries the user's bound-account information, and the account has been configured with the playback parameters of multiple target live streaming platforms;
    creating, by a server, a push task for each target live streaming platform;
    distributing, by the server under the user's bound account, the user's audio/video data to the multiple target live streaming platforms.
  13. The method according to claim 12, characterized in that the method further comprises:
    acquiring the user's bound-account information and the push requirement information of the multiple target live streaming platforms;
    sending live-platform configuration interaction information to the user under the user's bound account;
    generating, by the server in response to the user's configuration instruction based on the interaction information, live-platform configuration data matching the requirement information of the corresponding live streaming platform.
  14. The method according to claim 13, characterized in that sending live-platform configuration interaction information to the user comprises:
    sending the user the bound-account information for the target live streaming platform;
    and responding, by the server, to the user's configuration instruction based on the interaction information comprises:
    when the authorization request of the bound-account information is approved, receiving user selection data and sending configuration data to the target live streaming platform, the configuration data including privacy setting indication information and audio/video publishing setting information;
    completing, by the server, the settings according to the user selection data and storing the user's configuration data for the target live streaming platform.
  15. The method according to claim 12, characterized in that the method further comprises:
    receiving, by the server, the live-room addresses created by the respective push tasks and storing them.
  16. A live streaming apparatus, characterized in that the live streaming apparatus comprises an audio processing module, a device interface module, and a processor module, the audio processing module comprises an audio input interface and an audio processing chip, the audio input interface is used to connect a microphone, the audio processing chip is connected to the audio input interface, the device interface module, and the processor module respectively, and the audio processing chip performs noise reduction and/or mixing on the audio data received from the audio input interface and/or the device interface module and transmits the processed audio data to the processor module;
    the processor module comprises a timestamp uniformization unit, the timestamp uniformization unit comprises an acquisition module, a determination module, a compensation module, an adjustment module, and an output module, the acquisition module is connected to the determination module, the determination module is connected to the adjustment module, the adjustment module is connected to the compensation module, and the compensation module is connected to the output module;
    the acquisition module is configured to acquire a media stream, wherein the media stream is an audio/video stream;
    the determination module is configured to acquire a difference between a timestamp of a current media frame and a timestamp of a previous media frame, together with an upper-and-lower bound range for the difference, and to determine whether the difference is within the upper-and-lower bound range; if the determination result is yes, the output module outputs the timestamp of the current media frame as a target timestamp of the current media frame; if the determination result is no, the compensation module is configured to acquire a standard media frame interval of the media stream, and the adjustment module is configured to update the timestamp of the current media frame to the sum of the timestamp of the previous media frame and the standard media frame interval;
    the determination module is further configured to determine whether the difference is greater than the standard media frame interval; if the determination result is yes, the compensation module applies forward compensation to the updated timestamp of the current media frame according to a compensation coefficient; if the determination result is no, the compensation module applies backward compensation to the updated timestamp of the current media frame according to the compensation coefficient;
    the output module is configured to output the forward-compensated or backward-compensated timestamp as the target timestamp of the current media frame.
  17. The live streaming apparatus according to claim 16, characterized in that the device interface module comprises an HDMI interface module and/or a USB interface module, wherein the HDMI interface module comprises at least one HDMI input port, the USB interface module comprises at least one USB port, and the HDMI input port and the USB port are each connected to the audio processing chip.
  18. The live streaming apparatus according to claim 17, characterized in that the HDMI interface module further comprises at least one first format converter, the first format converter is connected to the HDMI input port and the processor module, and the first format converter converts the data received at the HDMI input port from HDMI format to MIPI format and transmits the MIPI-format data to the processor module, wherein the data received at the HDMI input port comprises video data and/or the audio data.
  19. The live streaming apparatus according to claim 17, characterized in that the USB interface module comprises a first USB port and a second USB port, the first USB port is connected to the audio processing chip through the processor module and is used to feed the audio data to the audio processing chip, and the second USB port is connected to the processor module and is used for system debugging.
  20. The live streaming apparatus according to claim 19, characterized in that the processor module comprises a USB port, the first USB port is provided in plurality, the USB interface module further comprises a port expander, one end of the port expander is connected to the USB port, and the other end of the port expander is connected to the multiple first USB ports.
  21. The live streaming apparatus according to claim 16, characterized in that the audio input interface comprises an active input interface and a passive input interface, wherein the active input interface is used to connect an active microphone and the passive input interface is used to connect a passive microphone.
  22. The live streaming apparatus according to claim 16, characterized in that the audio processing module further comprises an audio output interface, and the audio output interface is connected to the audio processing chip and is used to output the processed audio data.
  23. The live streaming apparatus according to claim 20, characterized in that the live streaming apparatus further comprises a display module, the display module comprises a display screen and a second format converter, the second format converter is connected to the processor module and the display screen, the processor module outputs data in MIPI format, the second format converter converts the MIPI-format data to LVDS format, and the display screen displays the LVDS-format data, wherein the MIPI-format data output by the processor module comprises video data.
  24. The live streaming apparatus according to claim 23, characterized in that the display screen comprises a touch screen, the USB interface module comprises a third USB port, and the third USB port is connected to the port expander and the touch screen.
  25. The live streaming apparatus according to claim 16, characterized in that the live streaming apparatus further comprises a data output module, the data output module comprises a third format converter and an HDMI output port, the third format converter is connected to the processor module and the HDMI output port, and the third format converter converts the data output by the processor module from MIPI format to HDMI format and transmits the HDMI-format data to the HDMI output port, wherein the data output by the processor module comprises video data and the audio data.
  26. The live streaming apparatus according to claim 16, characterized in that the processor module further comprises an audio/video frame-dropping unit, and the audio/video frame-dropping unit comprises:
    a determination module configured to determine a weight coefficient corresponding to each frame type in the audio/video stream;
    a calculation module configured to calculate, according to the weight coefficient of each frame type and the queue capacity of a queue, a frame-drop decision threshold corresponding to each frame type;
    a frame-dropping module configured to perform, at the sending moment of any frame type, a frame-drop operation if the maximum timestamp interval between two frames of that type in the queue is greater than the frame-drop decision threshold corresponding to that type.
  27. The live streaming apparatus according to claim 16, characterized in that the processor module further comprises an audio/video frame-dropping unit, and the audio/video frame-dropping unit comprises:
    an external dynamic parameter setter configured to set the weight coefficients of audio frames and video frames and to set frame-drop decision threshold parameters;
    a parameter collector configured to collect the parameters relevant to the frame-drop decision, the parameters including the weight coefficients, the queue capacity, and the frame-drop decision threshold parameters;
    a parameter calculator configured to derive the frame-drop decision threshold of each frame type from the collected parameters according to a calculation rule;
    a frame-drop decider configured to first look up the frame-drop decision threshold of the frame type, then calculate the maximum timestamp interval between two frames of that type in the queue, and compare the maximum timestamp interval with the frame-drop decision threshold according to the frame-drop decision principle;
    a frame-drop executor configured to, when the frame-drop decider determines that a frame-drop operation is to be performed, discard the frames of that type in the queue one by one in descending order of timestamp, and to feed back to the parameter calculator and the frame-drop decider after each discard of that frame type, so that the maximum timestamp interval between the current two frames of the dropped frame type in the queue is recalculated and the frame-drop decision is repeated.
  28. An electronic device comprising a memory and a processor, characterized in that a computer program is stored in the memory, and the processor is configured to run the computer program to perform the method according to any one of claims 1 to 15.
  29. A storage medium, characterized in that a computer program is stored in the storage medium, wherein the computer program is configured to perform, when run, the method according to any one of claims 1 to 15.
PCT/CN2021/118485 2020-12-31 2021-09-15 音视频数据的处理方法、直播装置、电子设备和存储介质 WO2022142481A1 (zh)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN202180087403.2A CN116762344A (zh) 2020-12-31 2021-09-15 音视频数据的处理方法、直播装置、电子设备和存储介质
US18/345,209 US20230345089A1 (en) 2020-12-31 2023-06-30 Audio and Video Data Processing Method, Live Streaming Apparatus, Electronic Device, and Storage Medium

Applications Claiming Priority (8)

Application Number Priority Date Filing Date Title
CN202011637767.7 2020-12-31
CN202011637767.7A CN112822505B (zh) 2020-12-31 2020-12-31 音视频丢帧方法、装置、系统、存储介质和计算机设备
CN202120826728.5U CN215072677U (zh) 2021-04-21 2021-04-21 直播装置
CN202120826728.5 2021-04-21
CN202110611537.1 2021-06-02
CN202110611537.1A CN113055718B (zh) 2021-06-02 2021-06-02 时间戳均匀化处理的方法、系统、电子装置和存储介质
CN202110643677.7A CN113365094A (zh) 2021-06-09 2021-06-09 基于直播的推流数据处理方法、计算设备和存储介质
CN202110643677.7 2021-06-09

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US18/345,209 Continuation US20230345089A1 (en) 2020-12-31 2023-06-30 Audio and Video Data Processing Method, Live Streaming Apparatus, Electronic Device, and Storage Medium

Publications (1)

Publication Number Publication Date
WO2022142481A1 (zh)

Family

ID=82258980

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/118485 WO2022142481A1 (zh) 2020-12-31 2021-09-15 音视频数据的处理方法、直播装置、电子设备和存储介质

Country Status (2)

Country Link
US (1) US20230345089A1 (zh)
WO (1) WO2022142481A1 (zh)

Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2000216813A (ja) * 1999-01-20 2000-08-04 Nippon Telegr & Teleph Corp <Ntt> 誤り補償方法、並びに該方法を用いた誤り補償装置
CN104620595A (zh) * 2012-10-11 2015-05-13 坦戈迈公司 主动式视频帧丢弃
CN108156500A (zh) * 2017-12-29 2018-06-12 珠海全志科技股份有限公司 多媒体数据时间修正方法、计算机装置、计算机可读存储介质
CN109413469A (zh) * 2018-08-31 2019-03-01 北京潘达互娱科技有限公司 一种直播连麦延迟控制方法、装置、电子设备及存储介质
CN109089130A (zh) * 2018-09-18 2018-12-25 网宿科技股份有限公司 一种调整直播视频的时间戳的方法和装置
CN111464256A (zh) * 2020-04-14 2020-07-28 北京百度网讯科技有限公司 时间戳的校正方法、装置、电子设备和存储介质
CN113055718A (zh) * 2021-06-02 2021-06-29 杭州星犀科技有限公司 时间戳均匀化处理的方法、系统、电子装置和存储介质

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115499673A (zh) * 2022-08-30 2022-12-20 深圳市思为软件技术有限公司 一种直播方法及装置
CN115499673B (zh) * 2022-08-30 2023-10-20 深圳市思为软件技术有限公司 一种直播方法及装置

Also Published As

Publication number Publication date
US20230345089A1 (en) 2023-10-26


Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application (Ref document number: 21913228; Country of ref document: EP; Kind code of ref document: A1)
WWE Wipo information: entry into national phase (Ref document number: 202180087403.2; Country of ref document: CN)
NENP Non-entry into the national phase (Ref country code: DE)
122 Ep: pct application non-entry in european phase (Ref document number: 21913228; Country of ref document: EP; Kind code of ref document: A1)