WO2019218415A1 - Transcoding method and device for audio and video streams - Google Patents

Transcoding method and device for audio and video streams

Info

Publication number
WO2019218415A1
WO2019218415A1 PCT/CN2018/091207 CN2018091207W WO2019218415A1 WO 2019218415 A1 WO2019218415 A1 WO 2019218415A1 CN 2018091207 W CN2018091207 W CN 2018091207W WO 2019218415 A1 WO2019218415 A1 WO 2019218415A1
Authority
WO
WIPO (PCT)
Prior art keywords
video
audio
frame rate
decoder
source stream
Prior art date
Application number
PCT/CN2018/091207
Other languages
English (en)
French (fr)
Inventor
荆睿
马良
吕士表
Original Assignee
网宿科技股份有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 网宿科技股份有限公司
Priority to US16/339,244 (US20210360314A1)
Priority to EP18899005.5A (EP3588959A4)
Publication of WO2019218415A1

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43 Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/44 Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs
    • H04N21/4402 Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs involving reformatting operations of video signals for household redistribution, storage or real-time display
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L69/00 Network arrangements, protocols or services independent of the application payload and not provided for in the other groups of this subclass
    • H04L69/22 Parsing or analysis of headers
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20 Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/23 Processing of content or additional data; Elementary server operations; Server middleware
    • H04N21/233 Processing of audio elementary streams
    • H04N21/2335 Processing of audio elementary streams involving reformatting operations of audio signals, e.g. by converting from one coding standard to another
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20 Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/23 Processing of content or additional data; Elementary server operations; Server middleware
    • H04N21/234 Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs
    • H04N21/2343 Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs involving reformatting operations of video signals for distribution or compliance with end-user requests or end-user device requirements
    • H04N21/234309 Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs involving reformatting operations of video signals for distribution or compliance with end-user requests or end-user device requirements by transcoding between formats or standards, e.g. from MPEG-2 to MPEG-4 or from Quicktime to Realvideo
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43 Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/434 Disassembling of a multiplex stream, e.g. demultiplexing audio and video streams, extraction of additional data from a video stream; Remultiplexing of multiplex streams; Extraction or processing of SI; Disassembling of packetised elementary stream
    • H04N21/4343 Extraction or processing of packetized elementary streams [PES]
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43 Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/438 Interfacing the downstream path of the transmission network originating from a server, e.g. retrieving encoded video stream packets from an IP network
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43 Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/439 Processing of audio elementary streams
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43 Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/439 Processing of audio elementary streams
    • H04N21/4398 Processing of audio elementary streams involving reformatting operations of audio signals
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43 Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/44 Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs
    • H04N21/4402 Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs involving reformatting operations of video signals for household redistribution, storage or real-time display
    • H04N21/440218 Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs involving reformatting operations of video signals for household redistribution, storage or real-time display by transcoding between formats or standards, e.g. from MPEG-2 to MPEG-4
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/40 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using video transcoding, i.e. partial or full decoding of a coded input stream followed by re-encoding of the decoded output stream

Definitions

  • The present invention relates to the field of audio and video processing technologies, and in particular to a transcoding method and device for audio and video streams.
  • Because audio and video come in many formats, the audio and video that different users upload to the Internet are not in a uniform format.
  • A client that plays audio and video may not be able to handle every format. Therefore, before audio and video are provided to the client, they are usually converted into a format the client supports, so that the client can play the received audio and video normally.
  • Transcoding audio and video currently includes, in general, protocol parsing, stream information parsing, decoding, and encoding (see FIG. 1).
  • The protocol parsing process takes time to identify the encapsulation format of the audio and video source stream.
  • In addition, it takes considerable time to determine the parameters of the audio and video source.
  • For example, when the ffmpeg transcoding process parses the stream information of an audio and video source stream in the FLV encapsulation format, it usually needs to obtain at least 40 frames of video data from the source stream before it can identify the stream's frame rate. Loading these 40 frames of video data seriously reduces the efficiency of the whole transcoding process.
  • The purpose of the present application is to provide a transcoding method and device for audio and video streams that can improve the transcoding speed.
  • One aspect of the present application provides a transcoding method for an audio and video stream, the method comprising: acquiring an audio and video source stream from a source server and, when transcoding the audio and video source stream, specifying a preset encapsulation format as the encapsulation format of the audio and video source stream; parsing header file data of the audio and video source stream to obtain configuration information of the audio and video source stream, and initializing a video decoder and an audio decoder respectively according to the configuration information; and decoding the audio and video source stream with the initialized video decoder and audio decoder, re-encoding the decoded audio and video data into a target audio and video stream, and pushing the target audio and video stream to a live broadcast server.
  • Another aspect of the present application provides a transcoding device for audio and video streams, the device including:
  • an encapsulation format specifying unit, configured to acquire an audio and video source stream from a source server and, when transcoding the audio and video source stream, specify a preset encapsulation format as the encapsulation format of the audio and video source stream;
  • a decoder initialization unit, configured to parse header file data of the audio and video source stream to obtain configuration information of the audio and video source stream, and to initialize a video decoder and an audio decoder respectively according to the configuration information; and
  • a re-encoding unit, configured to decode the audio and video source stream with the initialized video decoder and audio decoder, re-encode the decoded audio and video data into a target audio and video stream, and push the target audio and video stream to a live broadcast server.
  • Another aspect of the present application provides a transcoding device for audio and video streams, the device comprising a memory and a processor, the memory being used to store a computer program which, when executed by the processor, implements the above method.
  • The technical solution provided by the present application can, in the protocol parsing stage of the transcoding process, directly specify the preset encapsulation format as the encapsulation format of the audio and video source stream, without having to work out the encapsulation format from the data of the source stream, so the protocol parsing process can be omitted.
  • In the stream information parsing stage, there is no need to wait for multiple frames of the audio and video source stream to load; the header file data of the source stream is parsed directly.
  • The header file data may include the configuration parameters of the audio and of the video, so the wait for loading multi-frame data can be omitted.
  • If the frame rate cannot be parsed from the header file data, the decoding frame rate of the video decoder can be set to a default frame rate value, which avoids a decoding abnormality caused by a missing decoding frame rate and further improves transcoding efficiency.
  • FIG. 1 is a schematic diagram of a transcoding process in the prior art.
  • FIG. 2 is a flowchart of a transcoding method of an audio and video stream in an embodiment of the present invention.
  • FIG. 3 is a flowchart of a transcoding method including a frame rate verification process according to an embodiment of the present invention.
  • FIG. 4 is a schematic diagram of functional modules of a transcoding device for audio and video streams in an embodiment of the present invention.
  • FIG. 5 is a schematic structural diagram of a transcoding device for audio and video streams in an embodiment of the present invention.
  • FIG. 6 is a schematic structural diagram of a computer terminal according to an embodiment of the present invention.
  • The present application provides a transcoding method for audio and video streams. The method can be applied to a transcoding device or a transcoding process.
  • The method may include the following steps.
  • S1: Acquire an audio and video source stream from the source server and, when transcoding the audio and video source stream, specify a preset encapsulation format as the encapsulation format of the audio and video source stream.
  • The source server may be a server that stores an original audio and video stream; this original stream is the audio and video source stream described above.
  • The audio and video source stream can be converted by a transcoding device or a transcoding process into a format that the client can recognize.
  • The audio and video source stream may be transcoded according to the normal transcoding flow, with only some stages of that flow optimized.
  • In the normal protocol parsing stage, the data of the audio and video source stream generally has to be parsed to determine the stream's encapsulation format.
  • Here, that parsing may be omitted and the preset encapsulation format specified directly as the encapsulation format of the audio and video source stream.
  • The preset encapsulation format may be known before the audio and video source stream is transcoded. In practice, the encapsulation format of the source stream can be determined from the suffix (extension) of the file.
  • The suffix of the audio and video source stream can therefore be identified, and the encapsulation format it denotes used as the preset encapsulation format.
  • The source server may be a server in a source node of a CDN (Content Delivery Network).
  • The CDN operator can store only audio and video source streams of one encapsulation format on a given source server, or can set up several storage areas on one source server, each storing only source streams of one encapsulation format. In this way, a correspondence is established between source server and encapsulation format, or between storage area and encapsulation format.
  • The source server or storage area in which the audio and video source stream is located can then be identified, which yields the encapsulation format of that source stream.
  • The encapsulation format obtained in this way can be used as the preset encapsulation format.
  • Thus, by the time the transcoding device or transcoding process acquires the audio and video source stream, the stream's encapsulation format is already known, so the protocol parsing stage is not needed and transcoding time is saved.
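
The two ways of knowing the encapsulation format in advance (file suffix, or a dedicated source server or storage area) can be sketched roughly as follows. This is a minimal illustration, not the patent's implementation: the lookup tables, the storage-area names, and the function `preset_format` are hypothetical.

```python
from pathlib import PurePosixPath
from typing import Optional
from urllib.parse import urlparse

# Hypothetical lookup tables; a real deployment would define its own.
SUFFIX_TO_FORMAT = {".flv": "flv", ".mp4": "mp4", ".ts": "mpegts"}
STORAGE_AREA_TO_FORMAT = {"origin-a/flv": "flv", "origin-a/ts": "mpegts"}


def preset_format(source_url: str, storage_area: Optional[str] = None) -> Optional[str]:
    """Return the encapsulation format known in advance, or None if unknown."""
    # Option 1: the storage area (or source server) holds only one format.
    if storage_area is not None and storage_area in STORAGE_AREA_TO_FORMAT:
        return STORAGE_AREA_TO_FORMAT[storage_area]
    # Option 2: fall back to the file suffix of the source stream.
    suffix = PurePosixPath(urlparse(source_url).path).suffix.lower()
    return SUFFIX_TO_FORMAT.get(suffix)
```

A caller would pass the returned format to the transcoder as a fixed input format; a `None` result means the normal protocol parsing stage is still required.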
  • S3: Parse the header file data of the audio and video source stream to obtain configuration information of the audio and video source stream, and initialize a video decoder and an audio decoder respectively according to the configuration information.
  • The header file data of the audio and video source stream can be parsed directly.
  • The header file data may be, for example, AVC (Advanced Video Coding) header data or AAC (Advanced Audio Coding) header data.
  • The configuration information contained in the header file data differs according to the type of header.
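
As a rough illustration of telling the two header types apart in an FLV stream, the sketch below inspects the first two bytes of a tag payload. The field layout (tag types 8 and 9, sound format 10 for AAC, codec ID 7 for AVC, packet type 0 for a sequence header) comes from the FLV file format rather than from this patent, and the function name `header_kind` is hypothetical.

```python
from typing import Optional

FLV_TAG_AUDIO = 8        # FLV tag type for audio
FLV_TAG_VIDEO = 9        # FLV tag type for video
SOUND_FORMAT_AAC = 10    # upper 4 bits of the first audio payload byte
CODEC_ID_AVC = 7         # lower 4 bits of the first video payload byte


def header_kind(tag_type: int, payload: bytes) -> Optional[str]:
    """Return 'aac_header', 'avc_header', or None for ordinary media tags."""
    if len(payload) < 2:
        return None
    if tag_type == FLV_TAG_AUDIO:
        sound_format = payload[0] >> 4
        # AACPacketType 0 marks the AAC sequence header.
        if sound_format == SOUND_FORMAT_AAC and payload[1] == 0:
            return "aac_header"
    elif tag_type == FLV_TAG_VIDEO:
        codec_id = payload[0] & 0x0F
        # AVCPacketType 0 marks the AVC sequence header.
        if codec_id == CODEC_ID_AVC and payload[1] == 0:
            return "avc_header"
    return None
```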
  • If the current header file data is audio header file data (an AAC header), the audio sampling rate and the number of audio channels may be extracted from it, and the extracted audio sampling rate and number of audio channels used as the configuration information of the audio data in the audio and video source stream.
  • The audio header file data may also include a variety of other parameters, and more configuration information may be parsed out as actual needs require.
  • The audio sampling rate and the number of audio channels are mentioned only for convenience in describing the technical solution of the present application; this does not mean the solution applies only to these two items of configuration information.
  • The audio decoder can be initialized using the extracted configuration information.
  • For example, the decoding sampling rate of the audio decoder can be set to the audio sampling rate extracted from the audio header file data, so that the audio decoder can decode normally.
  • Other settings of the audio decoder can also be made during the initialization process.
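
To make the audio case concrete, the sketch below reads the sampling rate and channel configuration from the first two bytes of an AAC header, assuming the header carries an MPEG-4 AudioSpecificConfig (5-bit object type, 4-bit sampling frequency index, 4-bit channel configuration). That layout is background knowledge, not something the text above specifies; the names `AudioConfig` and `parse_aac_header` are hypothetical, and the escape codes for extended object types and explicit frequencies are deliberately not handled.

```python
from typing import NamedTuple

# Sampling frequencies indexed by the 4-bit samplingFrequencyIndex field.
SAMPLE_RATES = [96000, 88200, 64000, 48000, 44100, 32000,
                24000, 22050, 16000, 12000, 11025, 8000, 7350]


class AudioConfig(NamedTuple):
    sample_rate: int
    channel_config: int  # equals the channel count for values 1 to 6


def parse_aac_header(asc: bytes) -> AudioConfig:
    """Read sample rate and channel configuration from the first two bytes."""
    if len(asc) < 2:
        raise ValueError("AAC header too short")
    bits = (asc[0] << 8) | asc[1]
    freq_index = (bits >> 7) & 0x0F      # 4 bits after the 5-bit object type
    channel_config = (bits >> 3) & 0x0F  # the following 4 bits
    if freq_index >= len(SAMPLE_RATES):
        raise ValueError("unsupported sampling frequency index")
    return AudioConfig(SAMPLE_RATES[freq_index], channel_config)
```

The returned values would then be used to set the decoding sampling rate and channel layout of the audio decoder.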
  • If the current header file data is video header file data (an AVC header), the video frame resolution, the frame rate, and the video format may be extracted from it and used as the configuration information of the video data in the audio and video source stream.
  • The video decoder can be initialized using the extracted configuration information.
  • For example, the decoding frame rate of the video decoder may be set to the frame rate extracted from the video header file data, and the decoding resolution of the video decoder may be set to the extracted video frame resolution, so that the video decoder can perform the subsequent decoding normally.
  • Other settings of the video decoder can also be made during the initialization process.
  • In some cases, the frame rate cannot be extracted when the video header file data is parsed.
  • In that case, the decoding frame rate of the video decoder may be set to a default frame rate. The default frame rate may be a frame rate with relatively high compatibility, which can be determined from historical transcoding records. Then, even when the frame rate cannot be extracted from the video header file data, the subsequent video decoding can proceed normally.
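
The fallback to a default frame rate can be sketched as below. The default of 25 frames per second, the `VideoDecoderConfig` structure, and the function `init_video_decoder` are all hypothetical; the text only says that the default is a widely compatible value derived from historical transcoding records.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical default; the text derives it from historical transcoding records.
DEFAULT_FRAME_RATE = 25.0


@dataclass
class VideoDecoderConfig:
    width: int
    height: int
    frame_rate: float
    used_default_rate: bool


def init_video_decoder(width: int, height: int,
                       header_frame_rate: Optional[float]) -> VideoDecoderConfig:
    """Build decoder settings, falling back to the default frame rate."""
    if header_frame_rate is not None and header_frame_rate > 0:
        return VideoDecoderConfig(width, height, header_frame_rate, False)
    # No usable frame rate in the header: decode with the default value.
    return VideoDecoderConfig(width, height, DEFAULT_FRAME_RATE, True)
```

Recording whether the default was used is a design choice of this sketch; it makes it easy to see later which streams relied on the fallback.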
  • A frame rate verification thread may be started at the stage of initializing the video decoder.
  • The frame rate verification thread can work asynchronously and run at the same time as the stream information parsing stage of the transcoding process, which saves transcoding time.
  • The frame rate verification thread may count the number of video frames acquired from the audio and video source stream within a specified duration. A standard frame rate corresponding to the audio and video source stream may then be calculated from the specified duration and the number of video frames acquired.
  • For example, the frame rate verification thread can read N video frames over a period of time and record the timestamps of the first and the last of these N video frames.
  • The difference between the two recorded timestamps represents the duration of the N video frames.
  • In practice, the unit of this difference may not be the unit that the frame rate calculation needs.
  • For example, the timestamp difference is in milliseconds, while the frame rate is usually calculated per second. The difference therefore has to be converted into the unit used in the frame rate calculation, so that a correct frame rate value is obtained.
  • The frame rate verification thread can then determine whether the calculated standard frame rate is consistent with the decoding frame rate currently set in the video decoder. If not, the frame rate parsed from the video header file data may be incorrect. In that case, the decoding frame rate currently set in the video decoder may be changed to the standard frame rate, so that the decoding frame rate in the video decoder matches the actual frame rate of the audio and video source stream and video decoding proceeds smoothly.
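
A minimal sketch of such a verification thread follows, assuming frame timestamps in milliseconds have already been collected. The `DecoderState` holder, the tolerance of 0.5 frames per second, and all function names are hypothetical. Note one deliberate choice: the sketch divides the number of frame intervals (N - 1) by the elapsed time between the first and last timestamps, whereas the text describes the timestamp difference as the duration of the N frames.

```python
import threading
from typing import List


class DecoderState:
    """Hypothetical holder for the decoder's current decoding frame rate."""

    def __init__(self, frame_rate: float) -> None:
        self.frame_rate = frame_rate
        self.lock = threading.Lock()


def standard_frame_rate(timestamps_ms: List[int]) -> float:
    """Frame rate derived from the first and last timestamps of N frames."""
    if len(timestamps_ms) < 2:
        raise ValueError("need at least two frames")
    elapsed_s = (timestamps_ms[-1] - timestamps_ms[0]) / 1000.0  # ms to s
    if elapsed_s <= 0:
        raise ValueError("timestamps must increase")
    return (len(timestamps_ms) - 1) / elapsed_s  # frame intervals per second


def verify_frame_rate(state: DecoderState, timestamps_ms: List[int],
                      tolerance: float = 0.5) -> None:
    """Replace the decoder's frame rate if it disagrees with the measured one."""
    measured = standard_frame_rate(timestamps_ms)
    with state.lock:
        if abs(measured - state.frame_rate) > tolerance:
            state.frame_rate = measured


def start_verification(state: DecoderState,
                       timestamps_ms: List[int]) -> threading.Thread:
    """Run the check asynchronously so it does not block decoder setup."""
    thread = threading.Thread(target=verify_frame_rate,
                              args=(state, timestamps_ms), daemon=True)
    thread.start()
    return thread
```

For instance, 26 timestamps spaced 40 ms apart span exactly one second and give 25 frames per second, so a decoder initialized at 30 would be corrected to 25.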
  • S5: Decode the audio and video source stream using the initialized video decoder and audio decoder, re-encode the decoded audio and video data into a target audio and video stream, and push the target audio and video stream to the live broadcast server.
  • The audio and video source stream can be decoded by the initialized video decoder and audio decoder to obtain decoded audio and video data. The decoded audio and video data can then undergo filter processing and encoding processing in turn, giving a target audio and video stream that the client supports.
  • The target audio and video stream can be pushed to the live broadcast server. The live broadcast server may be a streaming media server that provides audio and video streams to users, and clients can subsequently obtain the transcoded target audio and video stream directly from it.
  • The encoder used in re-encoding may be an X264 encoder.
  • A target parameter for reducing delay may be set in the encoder, and the decoded audio and video data re-encoded by the encoder in which that parameter is set.
  • The target parameter for reducing delay can be the tune zerolatency parameter.
  • With this parameter set, the X264 encoder has a lower delay when encoding the decoded audio and video data, which improves the overall speed of the transcoding process.
  • In a specific application scenario, an audio and video source stream whose encapsulation format has been determined to be FLV can be obtained from the source server by the ffmpeg transcoding process.
  • When the ffmpeg transcoding process transcodes the audio and video source stream, the encapsulation format can be specified as FLV, which eliminates the protocol parsing time.
  • On receiving the AAC header or the AVC header of the audio and video source stream, the ffmpeg transcoding process can parse out parameters such as the audio sampling rate, the number of audio channels, the video frame resolution, the video frame rate, and the video format. The wait for loading at least 40 frames of data is thereby avoided.
  • The audio decoder and the video decoder can then each be initialized using this configuration information.
  • If the frame rate cannot be parsed, the decoding frame rate of the video decoder can be set to a default value to avoid a transcoding abnormality.
  • A frame rate verification thread may be started asynchronously. It counts the number of video frames acquired from the audio and video source stream within a specified duration. The number of acquired video frames is then divided by the specified duration to give the number of frames transmitted per unit time, which is the actual frame rate of the audio and video source stream.
  • The frame rate verification thread can determine whether this calculated standard frame rate is consistent with the decoding frame rate currently set in the video decoder. If not, the frame rate parsed from the video header file data may be incorrect. In that case, the decoding frame rate currently set in the video decoder may be changed to the standard frame rate, so that the decoding frame rate in the video decoder matches the actual frame rate of the audio and video source stream and video decoding proceeds smoothly.
  • The decoded audio and video data can be encoded by the X264 encoder with the delay-reducing tune zerolatency parameter set, to obtain the target audio and video stream that the client supports.
  • The target audio and video stream can be pushed to the streaming media server, from which clients can subsequently obtain the transcoded target audio and video stream.
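
Two parts of this scenario can be approximated with the stock ffmpeg command line: forcing the input format with `-f flv` before `-i`, and enabling the x264 `zerolatency` tune. The sketch below only builds such a command as an argument list. It does not reproduce the header-only stream parsing or the frame rate verification thread, which the text describes as changes inside the transcoding process itself. The URLs and the function name `build_ffmpeg_command` are hypothetical.

```python
from typing import List


def build_ffmpeg_command(source_url: str, push_url: str) -> List[str]:
    """Assemble an ffmpeg invocation with a preset input format and low-latency x264."""
    return [
        "ffmpeg",
        "-f", "flv",             # force the input format instead of probing it
        "-i", source_url,
        "-c:v", "libx264",       # re-encode video with the x264 encoder
        "-tune", "zerolatency",  # x264 tune that reduces encoder delay
        "-c:a", "aac",           # re-encode audio as AAC
        "-f", "flv",             # output container for pushing to the live server
        push_url,
    ]
```

The list can be handed to `subprocess.run` on a machine with ffmpeg installed; the choice of AAC audio and an FLV output is an assumption of this sketch.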
  • The present application further provides a transcoding device for audio and video streams, the device including:
  • an encapsulation format specifying unit, configured to acquire an audio and video source stream from the source server and, when transcoding the audio and video source stream, specify a preset encapsulation format as the encapsulation format of the audio and video source stream;
  • a decoder initialization unit, configured to parse header file data of the audio and video source stream to obtain configuration information of the audio and video source stream, and to initialize a video decoder and an audio decoder respectively according to the configuration information; and
  • a re-encoding unit, configured to decode the audio and video source stream using the initialized video decoder and audio decoder, re-encode the decoded audio and video data into a target audio and video stream, and push the target audio and video stream to the live broadcast server.
  • The decoder initialization unit includes:
  • a video configuration information extracting module, configured to, if the current header file data is video header file data, extract the video frame resolution, the frame rate, and the video format from the video header file data, and use the extracted video frame resolution, frame rate, and video format as configuration information of the video data in the audio and video source stream.
  • The decoder initialization unit further includes:
  • a frame rate setting module, configured to set the decoding frame rate of the video decoder to the frame rate extracted from the video header file data and, if no frame rate can be extracted from the video header file data, to set the decoding frame rate of the video decoder to the default frame rate.
  • The device further includes:
  • a frame rate checking module, configured to count, while the video decoder and the audio decoder are being initialized according to the configuration information, the number of video frames acquired from the audio and video source stream within a specified duration;
  • a standard frame rate calculation module, configured to calculate a standard frame rate corresponding to the audio and video source stream according to the specified duration and the number of video frames acquired; and
  • a frame rate update module, configured to determine whether the standard frame rate is consistent with the decoding frame rate currently set in the video decoder and, if not, to change the decoding frame rate currently set in the video decoder to the standard frame rate.
  • The re-encoding unit includes:
  • a parameter setting module, configured to set a target parameter for reducing delay in the encoder, and to re-encode the decoded audio and video data using the encoder in which the target parameter is set.
  • The present application further provides a transcoding device for audio and video streams, the device including a memory and a processor, the memory being used to store a computer program; when the computer program is executed by the processor, the above transcoding method for audio and video streams is implemented.
  • Computer terminal 10 may include one or more (only one of which is shown) processor 102 (processor 102 may include, but is not limited to, a processing device such as a microprocessor MCU or a programmable logic device FPGA), for storing data.
  • processor 102 may include, but is not limited to, a processing device such as a microprocessor MCU or a programmable logic device FPGA), for storing data.
  • FIG. 6 is merely illustrative and does not limit the structure of the above electronic device.
  • computer terminal 10 may also include more or fewer components than shown in FIG. 6, or have a different configuration than that shown in FIG.
  • the above-described transcoding method of audio and video streams may be stored as a computer program in the above-described memory 104, and the memory 104 may be coupled to the processor 102, and then the processor 102 executes the memory 104.
  • the steps in the transcoding method of the audio and video stream described above can be implemented.
  • the memory 104 can be used to store software programs and modules of application software, and the processor 102 executes various functional applications and data processing by running software programs and modules stored in the memory 104.
  • Memory 104 may include high speed random access memory, and may also include non-volatile memory such as one or more magnetic storage devices, flash memory, or other non-volatile solid state memory.
  • memory 104 may further include memory remotely located relative to processor 102, which may be coupled to computer terminal 10 via a network. Examples of such networks include, but are not limited to, the Internet, intranets, local area networks, mobile communication networks, and combinations thereof.
  • Transmission device 106 is for receiving or transmitting data via a network.
  • the network specific examples described above may include a wireless network provided by a communication provider of the computer terminal 10.
  • the transmission device 106 includes a Network Interface Controller (NIC) that can be connected to other network devices through a base station to communicate with the Internet.
  • the transmission device 106 can be a Radio Frequency (RF) module for communicating with the Internet wirelessly.
  • In the protocol parsing stage of the transcoding process, the technical solution provided by the present application can directly designate a preset encapsulation format as the encapsulation format of the audio and video source stream, without parsing the encapsulation format from the data of the source stream, so that the protocol parsing process can be omitted.
  • In the stream information parsing stage, it is not necessary to wait for multiple frames of the audio and video source stream to load; instead, the header data of the source stream is parsed directly.
  • the header file data may include configuration parameters of the audio and configuration parameters of the video. In this way, the process of waiting to load multi-frame data can be omitted.
  • When a frame rate is configured for the video decoder, if no frame rate can be parsed from the header data, the decoding frame rate of the video decoder can be set to a default frame rate value, which avoids decoding abnormalities caused by a missing decoding frame rate and further improves transcoding efficiency.

Landscapes

  • Engineering & Computer Science (AREA)
  • Signal Processing (AREA)
  • Multimedia (AREA)
  • Computer Security & Cryptography (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)
  • Two-Way Televisions, Distribution Of Moving Picture Or The Like (AREA)

Abstract

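The present invention discloses a transcoding method and device for an audio/video stream. The method includes: acquiring an audio/video source stream from a source server and, when transcoding the audio/video source stream, designating a preset encapsulation format as the encapsulation format of the audio/video source stream; parsing header data of the audio/video source stream to obtain configuration information of the audio/video source stream, and initializing a video decoder and an audio decoder respectively according to the configuration information; and decoding the audio/video source stream with the initialized video decoder and audio decoder, re-encoding the decoded audio/video data into a target audio/video stream, and pushing the target audio/video stream to a live streaming server. The transcoding method and device provided by the present application can increase transcoding speed.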

Description

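Transcoding method and device for audio/video stream
Technical Field
The present invention relates to the field of audio and video processing, and in particular to a transcoding method and device for an audio/video stream.
Background
With the continuous development of Internet technology, the dominant content on the Internet has gradually shifted from text and pictures to audio and video. Because audio and video come in many formats, the audio and video that different users upload to the Internet are not uniform in format. A client that plays audio and video may not be able to handle every format. Therefore, before audio and video are provided to a client, their format is usually converted into one that the client supports, so that the client can play the received audio and video normally.
Referring to FIG. 1, transcoding audio and video currently usually involves protocol parsing, stream information parsing, decoding, and encoding. The protocol parsing stage spends time identifying the encapsulation format of the audio/video source stream. In addition, the stream information parsing stage takes considerable time to determine the parameters of the audio/video source stream. For example, when an ffmpeg transcoding process performs stream information parsing on an audio/video source stream in the FLV encapsulation format, it usually needs to obtain at least 40 frames of video data from the source stream before it can identify the frame rate of the source stream. Loading these 40 frames of video data seriously reduces the efficiency of the whole transcoding process.
It can be seen that the transcoding process in the prior art wastes considerable time at multiple stages, which makes transcoding slow and forces users to wait longer.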
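Summary
The purpose of the present application is to provide a transcoding method and device for an audio/video stream that can increase transcoding speed.
To achieve the above purpose, one aspect of the present application provides a transcoding method for an audio/video stream, the method including: acquiring an audio/video source stream from a source server and, when transcoding the audio/video source stream, designating a preset encapsulation format as the encapsulation format of the audio/video source stream; parsing header data of the audio/video source stream to obtain configuration information of the audio/video source stream, and initializing a video decoder and an audio decoder respectively according to the configuration information; and decoding the audio/video source stream with the initialized video decoder and audio decoder, re-encoding the decoded audio/video data into a target audio/video stream, and pushing the target audio/video stream to a live streaming server.
To achieve the above purpose, another aspect of the present application provides a transcoding device for an audio/video stream, the device including: an encapsulation format designation unit, configured to acquire an audio/video source stream from a source server and, when transcoding the audio/video source stream, designate a preset encapsulation format as the encapsulation format of the audio/video source stream; a decoder initialization unit, configured to parse header data of the audio/video source stream to obtain configuration information of the audio/video source stream, and to initialize a video decoder and an audio decoder respectively according to the configuration information; and a re-encoding unit, configured to decode the audio/video source stream with the initialized video decoder and audio decoder, re-encode the decoded audio/video data into a target audio/video stream, and push the target audio/video stream to a live streaming server.
To achieve the above purpose, another aspect of the present application provides a transcoding device for an audio/video stream, the device including a memory and a processor, the memory being configured to store a computer program which, when executed by the processor, implements the above method.
As can be seen from the above, in the protocol parsing stage of the transcoding process, the technical solution provided by the present application can directly designate a preset encapsulation format as the encapsulation format of the audio/video source stream, without parsing the encapsulation format from the data of the source stream, so that the protocol parsing process can be omitted. In addition, in the stream information parsing stage, there is no need to wait for multiple frames of the source stream to load; instead, the header data of the source stream is parsed directly. The header data may include audio configuration parameters and video configuration parameters. In this way, the process of waiting for multiple frames of data to load can be omitted. Further, when a frame rate is configured for the video decoder, if no frame rate can be parsed from the header data, the decoding frame rate of the video decoder can be set to a default frame rate value, which avoids decoding abnormalities caused by a missing decoding frame rate and further improves transcoding efficiency. In summary, the technical solution provided by the present application optimizes the transcoding process of the prior art by omitting several of its time-consuming steps, thereby increasing the speed of the whole transcoding stage.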
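Brief Description of the Drawings
To describe the technical solutions in the embodiments of the present invention more clearly, the drawings needed for describing the embodiments are briefly introduced below. Obviously, the drawings described below show only some embodiments of the present invention, and a person of ordinary skill in the art can derive other drawings from them without creative effort.
FIG. 1 is a schematic diagram of a transcoding process in the prior art;
FIG. 2 is a flowchart of a transcoding method for an audio/video stream in an embodiment of the present invention;
FIG. 3 is a flowchart of a transcoding method including a frame rate check process in an embodiment of the present invention;
FIG. 4 is a schematic diagram of the functional modules of a transcoding device for an audio/video stream in an embodiment of the present invention;
FIG. 5 is a schematic structural diagram of a transcoding device for an audio/video stream in an embodiment of the present invention;
FIG. 6 is a schematic structural diagram of a computer terminal in an embodiment of the present invention.
Detailed Description
To make the objectives, technical solutions, and advantages of the present invention clearer, the embodiments of the present invention are described in further detail below with reference to the drawings.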
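Embodiment 1
The present application provides a transcoding method for an audio/video stream. The transcoding method may be applied in a transcoding device or a transcoding process. Referring to FIG. 2, the method may include the following steps.
S1: Acquire an audio/video source stream from a source server and, when transcoding the audio/video source stream, designate a preset encapsulation format as the encapsulation format of the audio/video source stream.
In this embodiment, the source server may be a server that stores an original audio/video stream, and the original audio/video stream may be the above audio/video source stream. Before the audio/video source stream is provided to a client for playback, a transcoding device or transcoding process may convert it into a format that the client can recognize.
In this embodiment, after the audio/video source stream is acquired, it may be transcoded following the normal transcoding flow, except that some steps of the flow may be optimized. Specifically, when transcoding of the audio/video source stream begins, the data of the source stream usually has to be parsed to determine its encapsulation format. In this embodiment, this parsing may be skipped, and a preset encapsulation format is directly designated as the encapsulation format of the audio/video source stream. The preset encapsulation format may be known in advance, before the source stream is transcoded. In practice, the encapsulation format of an audio/video source stream can usually be determined from the file suffix: after the source stream is acquired from the source server, its suffix can be identified, and the encapsulation format indicated by the suffix is used as the preset encapsulation format. In another application scenario, the source server may be a server in a source node of a CDN (Content Delivery Network). When managing the servers in the source node, the CDN operator may store only audio/video source streams of a single encapsulation format on a given source server, or may set up several storage areas on a given source server, each of which stores only audio/video source streams of a single encapsulation format. In this way, a correspondence can be established between source servers and encapsulation formats, or between storage areas and encapsulation formats. When an audio/video source stream is acquired, the source server or storage area in which it resides can be identified, which reveals the corresponding encapsulation format. The encapsulation format learned in this way can then be used as the preset encapsulation format.
As described above, the transcoding device or transcoding process already knows the encapsulation format of the audio/video source stream when it acquires the stream, so the protocol parsing stage is unnecessary, which saves transcoding time.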
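The two ways of learning the encapsulation format in advance described for step S1 can be sketched as a simple lookup. This is only an illustrative sketch: the function name `preset_container_format`, the two tables, and the host name `flv-origin.example.com` are hypothetical and do not come from the application, which only states that such a correspondence may be established.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse

# Hypothetical tables. The application only says that a suffix, a source
# server, or a storage area may correspond to one encapsulation format.
SUFFIX_TO_FORMAT = {".flv": "flv", ".mp4": "mp4", ".ts": "mpegts"}
ORIGIN_TO_FORMAT = {"flv-origin.example.com": "flv"}


def preset_container_format(source_url):
    """Return the encapsulation format known in advance, or None if unknown."""
    parsed = urlparse(source_url)
    # First try the file suffix of the stream path.
    suffix = PurePosixPath(parsed.path).suffix.lower()
    if suffix in SUFFIX_TO_FORMAT:
        return SUFFIX_TO_FORMAT[suffix]
    # Otherwise fall back to the server the stream was fetched from.
    return ORIGIN_TO_FORMAT.get(parsed.hostname)
```

When the function returns `None`, a transcoding process would presumably have to fall back to ordinary format probing; that fallback is an assumption and is not described in the application.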
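S3: Parse the header data of the audio/video source stream to obtain configuration information of the audio/video source stream, and initialize a video decoder and an audio decoder respectively according to the configuration information.
In this embodiment, after the encapsulation format of the audio/video source stream has been set, the configuration information of the source stream needs to be identified so that the video decoder and the audio decoder can be initialized with it. In the prior art, multiple frames of the source stream are usually loaded and then analyzed to obtain the configuration information. However, waiting for multiple frames to load takes too long. In this embodiment, the header data of the audio/video source stream may be parsed directly. The header data may be, for example, AVC (Advanced Video Coding) header data or AAC (Advanced Audio Coding) header data. Different header data may contain different configuration information. Specifically, if the current header data is audio header data (AAC header), the audio sampling rate and the number of audio channels may be extracted from the audio header data and used as the configuration information of the audio data in the audio/video source stream. Of course, in practice the audio header data may contain many other parameters, which are not listed here one by one; more configuration information may be parsed according to actual needs. The audio sampling rate and the number of audio channels are mentioned only to make the technical solution of the present application easier to describe, and do not mean that the solution applies only to these two items of configuration information. A person skilled in the art should understand that, given the essence of the technical solution of the present application, extracting further configuration information from the audio header data also falls within the scope of protection of the present application.
In this embodiment, after the configuration information has been extracted from the audio header data, it may be used to initialize the audio decoder. For example, the decoding sampling rate of the audio decoder may be set to the audio sampling rate extracted from the audio header data, so that the audio decoder can decode normally. Of course, in practice, if the extracted configuration information contains other parameters, the audio decoder may be configured with them as well during initialization.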
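Extracting the sampling rate and channel count from an AAC header can be illustrated as follows. The sketch assumes that the AAC header carries an AudioSpecificConfig as laid out in ISO/IEC 14496-3 (5-bit object type, 4-bit sampling frequency index, 4-bit channel configuration); the application itself does not spell out this layout. The function name is hypothetical, and extension fields such as SBR signalling are deliberately ignored.

```python
# Sampling frequency table indexed by samplingFrequencyIndex (ISO/IEC 14496-3).
SAMPLING_FREQUENCIES = [96000, 88200, 64000, 48000, 44100, 32000, 24000,
                        22050, 16000, 12000, 11025, 8000, 7350]
# channelConfiguration 1..6 equals the channel count, 7 means 8 channels.
CHANNEL_COUNTS = {1: 1, 2: 2, 3: 3, 4: 4, 5: 5, 6: 6, 7: 8}


def parse_audio_specific_config(data):
    """Read object type, sampling rate and channel count from the config bytes."""
    if len(data) < 2:
        raise ValueError("AudioSpecificConfig needs at least 2 bytes")
    bits = int.from_bytes(data, "big")
    total = len(data) * 8
    pos = 0

    def read(n):
        # Read the next n bits, most significant bit first.
        nonlocal pos
        if pos + n > total:
            raise ValueError("AudioSpecificConfig is truncated")
        value = (bits >> (total - pos - n)) & ((1 << n) - 1)
        pos += n
        return value

    object_type = read(5)
    if object_type == 31:              # escape value: extended object type
        object_type = 32 + read(6)
    index = read(4)
    if index == 15:                    # explicit 24-bit sampling frequency
        sample_rate = read(24)
    elif index < len(SAMPLING_FREQUENCIES):
        sample_rate = SAMPLING_FREQUENCIES[index]
    else:
        raise ValueError("reserved sampling frequency index")
    channel_config = read(4)
    return {
        "object_type": object_type,
        "sample_rate": sample_rate,
        # None when the layout is defined elsewhere (configuration 0).
        "channels": CHANNEL_COUNTS.get(channel_config),
    }
```

For instance, the two bytes `0x12 0x10` decode to AAC-LC at 44100 Hz with 2 channels, which are the values an audio decoder would be initialized with.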
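In this embodiment, if the current header data is video header data (AVC header), the video frame resolution, the frame rate, and the video format may be extracted from the video header data and used as the configuration information of the video data in the audio/video source stream. Of course, as stated above, a person skilled in the art should understand that, given the essence of the technical solution of the present application, extracting further configuration information from the video header data also falls within the scope of protection of the present application.
In this embodiment, after the configuration information has been extracted from the video header data, it may be used to initialize the video decoder. For example, the decoding frame rate of the video decoder may be set to the frame rate extracted from the video header data, and the decoding resolution of the decoder may be set to the above video frame resolution. The video decoder can then carry out the subsequent decoding normally. Of course, in practice, if the extracted configuration information contains other parameters, the video decoder may be configured with them as well during initialization.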
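Deciding whether the current header data is an audio header or a video header can be sketched for the FLV case used in the application scenario. The sketch assumes the tag layout of the FLV file format specification: tag type 8 is audio and 9 is video, AAC is sound format 10, AVC is codec ID 7, and a packet type byte of 0 marks a sequence header. The function name and return labels are hypothetical.

```python
FLV_TAG_AUDIO = 8
FLV_TAG_VIDEO = 9


def classify_flv_tag(tag_type, body):
    """Return 'aac_header', 'avc_header', or 'other' for one FLV tag body."""
    if len(body) < 2:
        return "other"
    # Audio: high 4 bits of byte 0 are the sound format (10 = AAC),
    # byte 1 is the AAC packet type (0 = sequence header).
    if tag_type == FLV_TAG_AUDIO and (body[0] >> 4) == 10 and body[1] == 0:
        return "aac_header"
    # Video: low 4 bits of byte 0 are the codec ID (7 = AVC),
    # byte 1 is the AVC packet type (0 = sequence header).
    if tag_type == FLV_TAG_VIDEO and (body[0] & 0x0F) == 7 and body[1] == 0:
        return "avc_header"
    return "other"
```

A transcoding process following this approach could hand tags labelled `aac_header` to an audio configuration parser and tags labelled `avc_header` to a video configuration parser, and treat every other tag as ordinary media data.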
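In one embodiment, it is taken into account that a frame rate sometimes cannot be extracted when the video header data is parsed. If the video decoder is then left without a configured frame rate, decoding becomes abnormal and the transcoding process cannot continue. To overcome this defect, in this embodiment, if no frame rate can be extracted from the video header data, the decoding frame rate of the video decoder may be set to a default frame rate. The default frame rate may be a frame rate with high compatibility, and may be derived from historical transcoding records. Thus, even if no frame rate can be extracted from the video header data, the subsequent video decoding can still proceed normally.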
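The fallback to a default frame rate can be sketched in a few lines. The value 25.0 is purely an assumed placeholder: the application only says that the default may be a highly compatible frame rate derived from historical transcoding records. The function name and the `frame_rate` key are likewise hypothetical.

```python
DEFAULT_FRAME_RATE = 25.0  # assumed placeholder, not a value from the application


def choose_decode_frame_rate(video_config, default=DEFAULT_FRAME_RATE):
    """Use the frame rate parsed from the video header, else the default."""
    rate = video_config.get("frame_rate")
    if rate is None or rate <= 0:
        # Nothing usable was parsed from the header: avoid an unset decoder.
        return default
    return float(rate)
```

With this helper, `choose_decode_frame_rate({"frame_rate": 30})` yields the parsed rate, while a configuration without a usable frame rate yields the default, so the decoder is never initialized without a frame rate.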
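Referring to FIG. 3, in one embodiment, to ensure that the frame rate initially set in the video decoder matches the actual frame rate of the audio/video source stream, a frame rate check thread may be started at the stage of initializing the video decoder. The frame rate check thread may work asynchronously and run in parallel with the stream information parsing stage of the transcoding process, which saves transcoding time. Specifically, the frame rate check thread may count the number of video frames obtained from the audio/video source stream within a specified duration. A standard frame rate corresponding to the audio/video source stream may then be calculated from the specified duration and the number of video frames obtained. For example, the frame rate check thread may read N video frames over a period of time and record the timestamps of the first and the last of these N video frames. The difference between the two recorded timestamps represents the duration covered by the N video frames. Of course, in practice, the unit of the timestamp difference may not match the unit required for calculating a frame rate. For example, the timestamp difference may be in milliseconds, whereas a frame rate is usually calculated per second. The timestamp difference may therefore be converted into the unit used in the frame rate calculation, so that a correct frame rate value is obtained. The frame rate check thread may then determine whether the calculated standard frame rate is consistent with the decoding frame rate currently set in the video decoder. If they are inconsistent, the frame rate parsed from the video header data may be wrong. In that case, the decoding frame rate currently set in the video decoder may be changed to the standard frame rate, so that the decoding frame rate of the video decoder matches the actual frame rate of the audio/video source stream and video decoding can proceed smoothly.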
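The frame rate check can be sketched as a background thread that measures the frame rate from millisecond timestamps and corrects the decoder setting when it disagrees. Several details here are assumptions rather than parts of the application: the names `DecoderSettings`, `check_frame_rate`, and `start_frame_rate_check`, the 0.5 fps tolerance used to decide whether two rates are "consistent", and the use of N - 1 frame intervals between the first and last timestamps.

```python
import threading


def measured_frame_rate(timestamps_ms):
    """Frame rate in frames per second from millisecond frame timestamps."""
    if len(timestamps_ms) < 2:
        return None
    span_ms = timestamps_ms[-1] - timestamps_ms[0]
    if span_ms <= 0:
        return None
    # N frames span N - 1 intervals; convert milliseconds to seconds.
    return (len(timestamps_ms) - 1) * 1000.0 / span_ms


class DecoderSettings:
    """Hypothetical holder for the frame rate configured in the video decoder."""

    def __init__(self, frame_rate):
        self.frame_rate = frame_rate
        self.lock = threading.Lock()


def check_frame_rate(settings, timestamps_ms, tolerance=0.5):
    """Overwrite the configured frame rate if it disagrees with the measurement."""
    measured = measured_frame_rate(timestamps_ms)
    if measured is None:
        return
    with settings.lock:
        if abs(settings.frame_rate - measured) > tolerance:
            settings.frame_rate = measured


def start_frame_rate_check(settings, timestamps_ms):
    """Run the check asynchronously so it does not block decoder setup."""
    worker = threading.Thread(
        target=check_frame_rate, args=(settings, list(timestamps_ms)), daemon=True
    )
    worker.start()
    return worker
```

For example, five frames stamped 0, 40, 80, 120, and 160 ms cover four intervals in 0.16 s, giving 25 frames per second; a decoder initialized at 30 would be corrected to 25.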
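S5: Decode the audio/video source stream with the initialized video decoder and audio decoder, re-encode the decoded audio/video data into a target audio/video stream, and push the target audio/video stream to a live streaming server.
In this embodiment, once the video decoder and the audio decoder have been initialized, they may be used to decode the audio/video source stream to obtain decoded audio/video data. The decoded audio/video data may then undergo filter processing and encoding processing in turn, yielding a target audio/video stream that the client supports. The target audio/video stream may be pushed to a live streaming server, which may be a streaming media server that provides audio/video streams to users; clients can subsequently obtain the transcoded target audio/video stream directly from the live streaming server.
In one embodiment, considering that the audio/video format commonly used on clients is the X264 format, the encoder used for re-encoding may be an X264 encoder. To increase encoding speed, a target parameter for reducing latency may be set in the encoder, and the encoder configured with the target parameter is used to re-encode the decoded audio/video data. For example, the target parameter for reducing latency may be the tune zerolatency parameter. With this parameter set, the X264 encoder encodes the decoded audio/video data with lower latency, which increases the overall speed of the transcoding process.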
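In a specific application scenario, an ffmpeg transcoding process may acquire from the source server an audio/video source stream whose encapsulation format is known to be FLV. When transcoding the source stream, the ffmpeg transcoding process may designate FLV as its encapsulation format, which saves the time of protocol parsing. The ffmpeg transcoding process then receives the AAC header or AVC header of the audio/video source stream and can parse out the parameters they contain, such as the audio sampling rate, the number of audio channels, the video frame resolution, the video frame rate, and the video format. In this way, the time spent waiting for at least 40 frames of data to load can be saved. After the configuration information has been extracted from the AAC header and the AVC header, it may be used to initialize the audio decoder and the video decoder respectively. During initialization, if the configuration information does not contain a video frame rate, the decoding frame rate of the video decoder may be set to a default value to avoid transcoding abnormalities. At the same time as initialization, a frame rate check thread may be started asynchronously. The frame rate check thread may count the number of video frames obtained from the audio/video source stream within a specified duration. The number of video frames obtained may then be divided by the specified duration to give the number of frames transmitted per unit time, which may be taken as the actual frame rate of the audio/video source stream, that is, the standard frame rate. The frame rate check thread may then determine whether the calculated standard frame rate is consistent with the decoding frame rate currently set in the video decoder. If they are inconsistent, the frame rate parsed from the video header data may be wrong. In that case, the decoding frame rate currently set in the video decoder may be changed to the standard frame rate, so that the decoding frame rate of the video decoder matches the actual frame rate of the audio/video source stream and video decoding can proceed smoothly. After decoding is completed, an X264 encoder configured with the latency-reducing tune zerolatency parameter may be used to encode the decoded audio/video data, yielding a target audio/video stream that the client supports. The target audio/video stream may be pushed to a streaming media server, from which clients can subsequently obtain the transcoded target audio/video stream.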
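Two parts of this scenario, forcing the input encapsulation format and enabling the low-latency tune, map onto standard ffmpeg command-line options and can be sketched as an argument list. The function name and the URLs are hypothetical, and the choice of AAC audio and FLV output for the push is an assumption. Header-only parsing, the default frame rate, and the frame rate check thread are changes inside the transcoding process and are not expressed by these options.

```python
def build_ffmpeg_command(source_url, push_url, container="flv"):
    """Build an ffmpeg argument list for a low-latency FLV-to-RTMP transcode."""
    return [
        "ffmpeg",
        "-f", container,         # force the input format instead of probing it
        "-i", source_url,
        "-c:v", "libx264",       # re-encode video with the x264 encoder
        "-tune", "zerolatency",  # x264 tuning that reduces encoder latency
        "-c:a", "aac",           # assumed audio codec for the target stream
        "-f", "flv",             # assumed output format for an RTMP push
        push_url,
    ]
```

The resulting list could be passed to `subprocess.run` on a machine where ffmpeg with libx264 is installed; placing `-f` before `-i` applies it to the input, while the second `-f` applies to the output.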
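Embodiment 2
Referring to FIG. 4, the present application further provides a transcoding device for an audio/video stream, the device including:
an encapsulation format designation unit, configured to acquire an audio/video source stream from a source server and, when transcoding the audio/video source stream, designate a preset encapsulation format as the encapsulation format of the audio/video source stream;
a decoder initialization unit, configured to parse header data of the audio/video source stream to obtain configuration information of the audio/video source stream, and to initialize a video decoder and an audio decoder respectively according to the configuration information;
a re-encoding unit, configured to decode the audio/video source stream with the initialized video decoder and audio decoder, re-encode the decoded audio/video data into a target audio/video stream, and push the target audio/video stream to a live streaming server.
In one embodiment, the decoder initialization unit includes:
a video configuration information extraction module, configured to, if the current header data is video header data, extract the video frame resolution, the frame rate, and the video format from the video header data, and use the extracted video frame resolution, frame rate, and video format as the configuration information of the video data in the audio/video source stream.
In one embodiment, the decoder initialization unit further includes:
a frame rate setting module, configured to set the decoding frame rate of the video decoder to the frame rate extracted from the video header data and, if no frame rate can be extracted from the video header data, to set the decoding frame rate of the video decoder to a default frame rate.
In one embodiment, the device further includes:
a frame rate check module, configured to count, when the video decoder and the audio decoder are initialized respectively according to the configuration information, the number of video frames obtained from the audio/video source stream within a specified duration;
a standard frame rate calculation module, configured to calculate a standard frame rate corresponding to the audio/video source stream from the specified duration and the number of video frames obtained;
a frame rate update module, configured to determine whether the standard frame rate is consistent with the decoding frame rate currently set in the video decoder and, if not, to change the decoding frame rate currently set in the video decoder to the standard frame rate.
In one embodiment, the re-encoding unit includes:
a parameter setting module, configured to set a target parameter for reducing latency in the encoder, and to re-encode the decoded audio/video data with the encoder configured with the target parameter.
Referring to FIG. 5, the present application further provides a transcoding device for an audio/video stream, the device including a memory and a processor, the memory being configured to store a computer program which, when executed by the processor, implements the above transcoding method for an audio/video stream.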
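Referring to FIG. 6, in the present application, the technical solutions of the above embodiments may be applied to the computer terminal 10 shown in FIG. 6. The computer terminal 10 may include one or more processors 102 (only one is shown in the figure; the processor 102 may include, but is not limited to, a processing device such as a microprocessor (MCU) or a programmable logic device (FPGA)), a memory 104 for storing data, and a transmission device 106 for communication functions. A person of ordinary skill in the art will understand that the structure shown in FIG. 6 is merely illustrative and does not limit the structure of the above electronic device. For example, the computer terminal 10 may include more or fewer components than shown in FIG. 6, or have a configuration different from that shown in FIG. 6.
Specifically, in the present application, the above transcoding method for an audio/video stream may be stored as a computer program in the memory 104, and the memory 104 may be coupled to the processor 102. When the processor 102 executes the computer program in the memory 104, the steps of the above transcoding method for an audio/video stream are implemented.
The memory 104 may be used to store software programs and modules of application software. The processor 102 executes various functional applications and data processing by running the software programs and modules stored in the memory 104. The memory 104 may include high-speed random access memory, and may also include non-volatile memory, such as one or more magnetic storage devices, flash memory, or other non-volatile solid-state memory. In some examples, the memory 104 may further include memory located remotely from the processor 102, and such remote memory may be connected to the computer terminal 10 through a network. Examples of such networks include, but are not limited to, the Internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The transmission device 106 is used to receive or send data via a network. Specific examples of the above network may include a wireless network provided by the communication provider of the computer terminal 10. In one example, the transmission device 106 includes a network adapter (Network Interface Controller, NIC), which can be connected to other network devices through a base station so as to communicate with the Internet. In one example, the transmission device 106 may be a Radio Frequency (RF) module, which is used to communicate with the Internet wirelessly.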
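As can be seen from the above, in the protocol parsing stage of the transcoding process, the technical solution provided by the present application can directly designate a preset encapsulation format as the encapsulation format of the audio/video source stream, without parsing the encapsulation format from the data of the source stream, so that the protocol parsing process can be omitted. In addition, in the stream information parsing stage, there is no need to wait for multiple frames of the source stream to load; instead, the header data of the source stream is parsed directly. The header data may include audio configuration parameters and video configuration parameters. In this way, the process of waiting for multiple frames of data to load can be omitted. Further, when a frame rate is configured for the video decoder, if no frame rate can be parsed from the header data, the decoding frame rate of the video decoder can be set to a default frame rate value, which avoids decoding abnormalities caused by a missing decoding frame rate and further improves transcoding efficiency. In summary, the technical solution provided by the present application optimizes the transcoding process of the prior art by omitting several of its time-consuming steps, thereby increasing the speed of the whole transcoding stage.
From the description of the above embodiments, a person skilled in the art can clearly understand that each embodiment may be implemented by software together with a necessary general-purpose hardware platform, and of course may also be implemented by hardware. Based on this understanding, the above technical solutions, in essence or in the part that contributes to the prior art, may be embodied in the form of a software product. The computer software product may be stored in a computer-readable storage medium, such as a ROM/RAM, a magnetic disk, or an optical disc, and includes several instructions that cause a computer device (which may be a personal computer, a server, a network device, or the like) to execute the methods described in the embodiments or in certain parts of the embodiments.
The above are merely preferred embodiments of the present invention and are not intended to limit the present invention. Any modification, equivalent replacement, or improvement made within the spirit and principles of the present invention shall fall within the scope of protection of the present invention.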

Claims (13)

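  1. A transcoding method for an audio/video stream, characterized in that the method comprises:
    acquiring an audio/video source stream from a source server and, when transcoding the audio/video source stream, designating a preset encapsulation format as the encapsulation format of the audio/video source stream;
    parsing header data of the audio/video source stream to obtain configuration information of the audio/video source stream, and initializing a video decoder and an audio decoder respectively according to the configuration information;
    decoding the audio/video source stream with the initialized video decoder and audio decoder, re-encoding the decoded audio/video data into a target audio/video stream, and pushing the target audio/video stream to a live streaming server.
  2. The method according to claim 1, characterized in that parsing the header data of the audio/video source stream to obtain the configuration information of the audio/video source stream comprises:
    if the current header data is audio header data, extracting an audio sampling rate and a number of audio channels from the audio header data, and using the extracted audio sampling rate and number of audio channels as the configuration information of the audio data in the audio/video source stream.
  3. The method according to claim 2, characterized in that initializing the audio decoder according to the configuration information comprises:
    setting the decoding sampling rate of the audio decoder to the audio sampling rate extracted from the audio header data.
  4. The method according to claim 1, characterized in that parsing the header data of the audio/video source stream to obtain the configuration information of the audio/video source stream comprises:
    if the current header data is video header data, extracting a video frame resolution, a frame rate, and a video format from the video header data, and using the extracted video frame resolution, frame rate, and video format as the configuration information of the video data in the audio/video source stream.
  5. The method according to claim 4, characterized in that initializing the video decoder according to the configuration information comprises:
    setting the decoding frame rate of the video decoder to the frame rate extracted from the video header data;
    if no frame rate can be extracted from the video header data, setting the decoding frame rate of the video decoder to a default frame rate.
  6. The method according to claim 1, characterized in that the method further comprises:
    when initializing the video decoder and the audio decoder respectively according to the configuration information, counting the number of video frames obtained from the audio/video source stream within a specified duration;
    calculating a standard frame rate corresponding to the audio/video source stream from the specified duration and the number of video frames obtained;
    determining whether the standard frame rate is consistent with the decoding frame rate currently set in the video decoder and, if not, changing the decoding frame rate currently set in the video decoder to the standard frame rate.
  7. The method according to claim 1, characterized in that, when the decoded audio/video data is re-encoded into the target audio/video stream, the method further comprises:
    setting a target parameter for reducing latency in an encoder, and re-encoding the decoded audio/video data with the encoder configured with the target parameter.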
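  8. A transcoding device for an audio/video stream, characterized in that the device comprises:
    an encapsulation format designation unit, configured to acquire an audio/video source stream from a source server and, when transcoding the audio/video source stream, designate a preset encapsulation format as the encapsulation format of the audio/video source stream;
    a decoder initialization unit, configured to parse header data of the audio/video source stream to obtain configuration information of the audio/video source stream, and to initialize a video decoder and an audio decoder respectively according to the configuration information;
    a re-encoding unit, configured to decode the audio/video source stream with the initialized video decoder and audio decoder, re-encode the decoded audio/video data into a target audio/video stream, and push the target audio/video stream to a live streaming server.
  9. The device according to claim 8, characterized in that the decoder initialization unit comprises:
    a video configuration information extraction module, configured to, if the current header data is video header data, extract a video frame resolution, a frame rate, and a video format from the video header data, and use the extracted video frame resolution, frame rate, and video format as the configuration information of the video data in the audio/video source stream.
  10. The device according to claim 9, characterized in that the decoder initialization unit further comprises:
    a frame rate setting module, configured to set the decoding frame rate of the video decoder to the frame rate extracted from the video header data and, if no frame rate can be extracted from the video header data, to set the decoding frame rate of the video decoder to a default frame rate.
  11. The device according to claim 8, characterized in that the device further comprises:
    a frame rate check module, configured to count, when the video decoder and the audio decoder are initialized respectively according to the configuration information, the number of video frames obtained from the audio/video source stream within a specified duration;
    a standard frame rate calculation module, configured to calculate a standard frame rate corresponding to the audio/video source stream from the specified duration and the number of video frames obtained;
    a frame rate update module, configured to determine whether the standard frame rate is consistent with the decoding frame rate currently set in the video decoder and, if not, to change the decoding frame rate currently set in the video decoder to the standard frame rate.
  12. The device according to claim 8, characterized in that the re-encoding unit comprises:
    a parameter setting module, configured to set a target parameter for reducing latency in an encoder, and to re-encode the decoded audio/video data with the encoder configured with the target parameter.
  13. A transcoding device for an audio/video stream, characterized in that the device comprises a memory and a processor, the memory being configured to store a computer program which, when executed by the processor, implements the method according to any one of claims 1 to 7.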
PCT/CN2018/091207 2018-05-18 2018-06-14 Transcoding method and device for audio/video stream WO2019218415A1 (zh)

Priority Applications (2)

Application Number Priority Date Filing Date Title
US16/339,244 US20210360314A1 (en) 2018-05-18 2018-06-14 Transcoding method and device for audio/video stream
EP18899005.5A EP3588959A4 (en) 2018-05-18 2018-06-14 METHOD AND DEVICE FOR TRANSCODING FOR AN AUDIO / VIDEO stream

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201810481764.5A CN108712654B (zh) 2018-05-18 2018-05-18 Transcoding method and device for audio/video stream
CN201810481764.5 2018-05-18

Publications (1)

Publication Number Publication Date
WO2019218415A1 true WO2019218415A1 (zh) 2019-11-21

Family

ID=63869125

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2018/091207 WO2019218415A1 (zh) 2018-05-18 2018-06-14 Transcoding method and device for audio/video stream

Country Status (4)

Country Link
US (1) US20210360314A1 (zh)
EP (1) EP3588959A4 (zh)
CN (1) CN108712654B (zh)
WO (1) WO2019218415A1 (zh)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
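CN113727113A (zh) * 2020-05-26 2021-11-30 网易(杭州)网络有限公司 Video decoding method, stream pushing method and system
CN114339316A (zh) * 2022-01-11 2022-04-12 北京易智时代数字科技有限公司 Video stream encoding processing method based on live video
CN115396725A (zh) * 2022-08-25 2022-11-25 深圳市新龙鹏科技有限公司 IT6616-based network stream pushing control method, apparatus, device and storage medium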

Families Citing this family (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
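CN109218430A (zh) * 2018-09-26 2019-01-15 深圳市网心科技有限公司 Video file transmission method and system, electronic device and storage medium
WO2020107353A1 (zh) * 2018-11-29 2020-06-04 深圳市欢太科技有限公司 Video decoding method and apparatus, electronic device, and computer-readable storage medium
CN109714628B (zh) * 2018-12-29 2021-08-03 广州方硅信息技术有限公司 Method, apparatus, device, storage medium and system for playing audio and video
CN111901661B (zh) * 2020-07-30 2022-05-24 海信视像科技股份有限公司 Video recording method, playback method and display device
CN111954027B (zh) * 2020-08-06 2022-07-08 浩联时代(北京)科技有限公司 Streaming media data transcoding method and apparatus, computing device and readable storage medium
CN113852850B (zh) * 2020-11-24 2024-01-09 广东朝歌智慧互联科技有限公司 Audio/video stream playback apparatus
CN112866727B (zh) * 2020-12-23 2024-03-01 贵阳叁玖互联网医疗有限公司 Streaming media live broadcast method and system capable of receiving third-party pushed streams
CN112995714A (zh) * 2021-04-08 2021-06-18 天津天地伟业智能安全防范科技有限公司 Method and apparatus for converting a private video stream into an RTMP standard stream
CN113965776B (zh) * 2021-10-20 2022-07-05 江下信息科技(惠州)有限公司 Multi-mode high-speed audio/video format conversion method and system
CN114900507A (zh) * 2022-04-29 2022-08-12 阿里巴巴(中国)有限公司 RTC audio data processing method, apparatus, device and storage medium
CN115225928B (zh) * 2022-05-11 2023-07-25 北京广播电视台 Multi-type audio/video mixed broadcasting system and method
CN115379248B (zh) * 2022-07-14 2023-12-12 百果园技术(新加坡)有限公司 Video source stream replacement method, system, device and storage medium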

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102055966A (zh) * 2009-11-04 2011-05-11 腾讯科技(深圳)有限公司 Media file compression method and system
CN103248950A (zh) * 2013-04-28 2013-08-14 天脉聚源(北京)传媒科技有限公司 System and method for customizing video frame rate
CN103686210A (zh) * 2013-12-17 2014-03-26 广东威创视讯科技股份有限公司 Real-time audio/video transcoding method and system
CN105657524A (zh) * 2016-01-13 2016-06-08 上海视云网络科技有限公司 Method for seamless switching between videos
CN105847957A (zh) * 2016-05-27 2016-08-10 天脉聚源(北京)传媒科技有限公司 Mobile terminal-based live broadcast method and apparatus

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103200425B (zh) * 2013-03-29 2016-04-06 天脉聚源(北京)传媒科技有限公司 Multimedia processing apparatus and method
CN107295317A (zh) * 2017-08-25 2017-10-24 四川长虹电器股份有限公司 Real-time transmission method for audio/video streams of a mobile device

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
See also references of EP3588959A4 *

Also Published As

Publication number Publication date
CN108712654A (zh) 2018-10-26
US20210360314A1 (en) 2021-11-18
EP3588959A1 (en) 2020-01-01
CN108712654B (zh) 2020-04-14
EP3588959A4 (en) 2020-01-22

Legal Events

Date Code Title Description
ENP Entry into the national phase

Ref document number: 2018899005

Country of ref document: EP

Effective date: 20190715

121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 18899005

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE