WO2024098836A1 - Video alignment method and apparatus - Google Patents

Video alignment method and apparatus

Info

Publication number
WO2024098836A1
Authority
WO
WIPO (PCT)
Prior art keywords
video
transcoded
source
sei information
sei
Application number
PCT/CN2023/108948
Other languages
French (fr)
Chinese (zh)
Inventor
冯宇飞
汤然
郑龙
Original Assignee
上海哔哩哔哩科技有限公司
Application filed by 上海哔哩哔哩科技有限公司
Publication of WO2024098836A1

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/40: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using video transcoding, i.e. partial or full decoding of a coded input stream followed by re-encoding of the decoded output stream
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/44: Decoders specially adapted therefor, e.g. video decoders which are asymmetric with respect to the encoder

Definitions

  • the present disclosure relates to the field of live broadcast technology, and in particular to a video alignment method and device.
  • the host's source video can be transcoded to obtain transcoded videos, which can be pushed to users for viewing.
  • For transcoded videos, quality assessment is required to ensure that their quality does not affect the viewing experience of users. Quality can be assessed by playing the same picture of the source video and the transcoded video and comparing the two visually. However, transcoding introduces delay, so the pictures of the transcoded video and the source video are not aligned. A method for video alignment is therefore needed.
  • the embodiments of the present disclosure are proposed to provide a video alignment method and device that overcome the above problems or at least partially solve the above problems.
  • a video alignment method, which includes:
  • acquiring supplemental enhancement information (SEI) in a source video;
  • transcoding the source video, and copying and writing the SEI information to obtain a transcoded video;
  • aligning the source video and the transcoded video according to the same SEI information in the source video and the transcoded video.
  • a video alignment device comprising:
  • an acquisition module, adapted to acquire supplemental enhancement information (SEI) in a source video;
  • a copy-write module, adapted to transcode the source video and to copy and write the SEI information to obtain a transcoded video;
  • an alignment module, adapted to align the source video and the transcoded video according to the same SEI information in the source video and the transcoded video.
  • a computing device, including a processor, a memory, a communication interface and a communication bus, wherein the processor, the memory and the communication interface communicate with each other via the communication bus;
  • the memory is used to store at least one executable instruction, and the executable instruction enables the processor to execute operations corresponding to the above-mentioned video alignment method.
  • a non-volatile computer-readable storage medium in which at least one executable instruction is stored, and the executable instruction enables a processor to perform operations corresponding to the above-mentioned video alignment method.
  • a computer program product which includes a computer program stored on the above-mentioned non-volatile computer-readable storage medium.
  • the SEI information of the source video is copied and written into the transcoded video, that is, the SEI information in the obtained transcoded video is the same as that in the source video. According to the same SEI information in the source video and the transcoded video, the source video and the transcoded video can be aligned, and there is no need to decode the video images of the source video and the transcoded video, which saves resources and greatly improves the processing speed of video alignment.
  • FIG. 1 shows a flow chart of a video alignment method according to an embodiment of the present disclosure;
  • FIG. 2 shows a flow chart of a video alignment method according to another embodiment of the present disclosure;
  • FIG. 3 shows a schematic structural diagram of a video alignment device according to an embodiment of the present disclosure;
  • FIG. 4 shows a schematic structural diagram of a computing device according to an embodiment of the present disclosure.
  • Transcoding: re-encoding audio and video;
  • Video alignment: the content of the video played before and after transcoding is the same;
  • Video codec: a program or device that can compress or decompress digital video;
  • Bitstream: a data structure that describes the properties of a video frame;
  • ffprobe: collects information from multimedia streams and prints it in a human- and machine-readable form;
  • ffmpeg: an open source computer program that can be used to record, convert, and stream digital audio and video;
  • Supplemental Enhancement Information (SEI): provides a way to add additional information to the video stream;
  • PTS (Presentation Time Stamp): the display timestamp, which tells the player when to play this frame of data.
  • FIG. 1 shows a flow chart of a video alignment method according to an embodiment of the present disclosure. As shown in FIG. 1 , the method includes the following steps:
  • Step S101: obtain supplemental enhancement information (SEI) in a source video.
  • SEI is supplemental enhancement information, which can be inserted into the audio and video stream to convey additional information.
  • SEI information includes payloadType, which defines the SEI message type; payloadSize, which is the size of the SEI message payload; and uuid_iso_iec_11578, a 16-byte identifier after which the custom content is written (that is, the content starts at byte offset 16 of the payload).
  • the content can be customized.
  • SEI information is used to align the video.
  • the content can use data such as tags that are convenient for comparison. The video alignment is completed based on the comparison results.
  • the specific content can be set according to the implementation situation and is not limited here.
  • SEI information is not a mandatory option in the decoding process, but can be used for fault tolerance and error correction in the decoding process, and can be integrated into the video bitstream.
  • SEI information can be inserted in the video generation and video transmission process, and can be transmitted together with the video through the transmission link, and SEI information can be obtained without decoding, which reduces resource consumption and has a faster processing speed.
  • For example, ffmpeg is used to decapsulate the source video and obtain the SEI information in it. The SEI information is identified according to payloadType, payloadSize, etc., and the content that follows uuid_iso_iec_11578 is read.
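  • The field layout described in Step S101 can be illustrated with a minimal sketch. It assumes the SEI message is of the user_data_unregistered kind (payloadType 5 in H.264, the type that carries uuid_iso_iec_11578), and that the NAL header and emulation prevention bytes have already been removed. The names `SeiMessage` and `parse_user_data_unregistered` are hypothetical and are not part of the disclosure or of ffmpeg.

```python
from __future__ import annotations

import uuid
from dataclasses import dataclass


@dataclass
class SeiMessage:
    payload_type: int
    payload_size: int
    uuid_iso_iec_11578: uuid.UUID
    content: bytes  # custom content that follows the 16-byte UUID


def _read_ff_coded(buf: bytes, pos: int) -> tuple[int, int]:
    """Read payloadType or payloadSize: every 0xFF byte adds 255, the last byte adds its value."""
    value = 0
    while buf[pos] == 0xFF:
        value += 255
        pos += 1
    return value + buf[pos], pos + 1


def parse_user_data_unregistered(sei_rbsp: bytes) -> SeiMessage:
    """Parse the first SEI message of an SEI RBSP (assumption: it is user_data_unregistered)."""
    payload_type, pos = _read_ff_coded(sei_rbsp, 0)
    payload_size, pos = _read_ff_coded(sei_rbsp, pos)
    if payload_type != 5:
        raise ValueError(f"not a user_data_unregistered SEI message: type {payload_type}")
    payload = sei_rbsp[pos:pos + payload_size]
    if len(payload) < 16:
        raise ValueError("payload too short to hold uuid_iso_iec_11578")
    return SeiMessage(payload_type, payload_size, uuid.UUID(bytes=payload[:16]), payload[16:])
```

  • Only the bitstream bytes are inspected here; no picture data is decoded, which is the property the method relies on.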
  • Step S102: transcode the source video, and copy and write the SEI information to obtain a transcoded video.
  • During transcoding, the source video needs to be decoded first and then re-encoded.
  • Decoding and encoding can be performed with a video codec or similar tool, so that the result is convenient for users to stream and watch.
  • Because the video is decoded and re-encoded, the SEI information in the final transcoded video would otherwise be inconsistent with the SEI information in the source video. Therefore, this embodiment first obtains the SEI information in the source video, and then transcodes the source video. After the transcoding process, the obtained SEI information is copied and written to the transcoded video, so that the SEI information in the final transcoded video is consistent with the SEI information in the source video.
  • Step S103: align the source video and the transcoded video according to the same SEI information in the source video and the transcoded video.
  • The SEI information in the source video and the transcoded video can be compared, for example by decapsulating both videos to obtain the SEI information of each. Since SEI information is bound to video frames, the SEI information of the source video and of the transcoded video can be compared frame by frame to find the video frames that carry the same SEI information, that is, the video frames with the same playback content in the source video and the transcoded video. The source video and the transcoded video are then aligned on these video frames, so that they play the same content.
  • the SEI information of the source video is copied and written into the transcoded video, that is, the SEI information in the obtained transcoded video is the same as that in the source video. According to the same SEI information in the source video and the transcoded video, the source video and the transcoded video can be aligned, and there is no need to decode the video images of the source video and the transcoded video, which saves resources and greatly improves the processing speed of video alignment.
  • FIG. 2 shows a flow chart of a video alignment method according to another embodiment of the present disclosure. As shown in FIG. 2, the method includes the following steps:
  • Step S201: add SEI information to the video frames in the source video.
  • SEI information can be added to the source video of the live broadcast.
  • the SEI information is added before the source video is transcoded, such as when the source video of the anchor is uploaded to the server.
  • the source video with added SEI information is transmitted to the edge computing node for transcoding, or transmitted to the edge computing node and SEI information is added before transcoding, etc., which is not limited here.
  • The added SEI information corresponds to the video frames in the source video. Specifically, SEI information is added for each video frame in the source video, and the SEI information is set in an incremental manner.
  • For example, the content contained in uuid_iso_iec_11578 can be an SEI serial number sei_idx: in the first frame sei_idx is set to 1, in the second frame to 2, in the third frame to 3, and so on.
  • Each video frame thus has its own SEI serial number, and the increasing order of the SEI serial numbers matches the order of the video frames. Alternatively, the content contained in uuid_iso_iec_11578 may be the PTS (Presentation Time Stamp), which tells the video player the specific time to play the corresponding video frame.
  • PTS is related to the order of video frames in the source video. The PTS time of the earlier video frame is earlier than the PTS time of the later video frame. The PTS in each video frame will also show an increasing trend according to the order of the video frames.
  • the specific value of PTS can be set by the encoder when generating the source video, such as selecting a reference clock, the time on the reference clock is linearly increasing, and the encoder timestamps each video frame according to the time on the reference clock.
  • the timestamp is PTS.
  • The above is an example, and the PTS can also be set in other ways, which is not limited here. Alternatively, the content contained in uuid_iso_iec_11578 can be an identification code, such as a QR code or another identification code.
  • The identification code can be generated by converting an increasing serial number or sequence number, with a different identification code set for each video frame. The identification codes of the SEI information therefore also increase with the order of the video frames, and can effectively distinguish each video frame.
  • the above SEI information is an example, and the specific settings can be set according to the implementation situation, which will not be explained in detail here.
  • SEI information belongs to the bitstream: it is additional information added to the video bitstream.
  • SEI information is therefore added at the bitstream level. For example, ffmpeg is used to extract the H.264 bitstream from the source video, and the h264_metadata bitstream filter is used to add SEI information.
  • the above is an example, and the specific technical means used to add SEI information are not limited here.
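  • As a rough illustration of the ffmpeg route, the sketch below only assembles a command line that uses the `sei_user_data` option of the `h264_metadata` bitstream filter, whose argument has the form UUID+string. The file names, the UUID and the helper name `build_add_sei_command` are placeholders. Note the limitation: this option attaches one fixed string, so by itself it does not produce the per-frame incrementing values described above; it shows the general mechanism only.

```python
def build_add_sei_command(src: str, dst: str, sei_uuid: str, text: str) -> list:
    """Assemble an ffmpeg argument list that copies the video stream and adds SEI user data.

    Assumption: `text` contains no characters that need escaping in an ffmpeg filter argument.
    """
    bsf = f"h264_metadata=sei_user_data={sei_uuid}+{text}"
    return [
        "ffmpeg",
        "-i", src,        # source container, e.g. a recorded live stream
        "-c:v", "copy",   # no decoding or re-encoding, the bitstream is only rewritten
        "-an",            # drop audio so that only the H.264 stream is written
        "-bsf:v", bsf,    # bitstream filter that inserts the SEI message
        dst,
    ]
```

  • The list can be passed to `subprocess.run(cmd, check=True)` on a machine where ffmpeg is installed.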
  • ffmpeg can be used to detect whether SEI information already exists in the source video. If it already exists, there is no need to add it again to avoid repeated settings.
  • ffprobe can be used to view various information of the audio and video files, such as the encapsulation format, audio/video stream information, data packet information, etc. By viewing the structure avPacket that stores compressed encoded data in the data packet information, it can be checked whether the corresponding SEI information is set for each video frame in the source video to avoid repeated settings.
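  • The text performs this check with ffmpeg or ffprobe. As a swapped-in, self-contained alternative, the sketch below scans a raw H.264 Annex B byte stream for NAL units of type 6 (SEI) without decoding any pictures. It assumes an Annex B elementary stream with 00 00 01 start codes, and it reports only that SEI NAL units exist, not which payload they carry. The function names are hypothetical.

```python
from typing import Iterator

START_CODE = b"\x00\x00\x01"
NAL_TYPE_SEI = 6  # H.264 nal_unit_type for SEI


def iter_nal_units(annexb: bytes) -> Iterator[bytes]:
    """Yield each NAL unit (header byte included) found between Annex B start codes."""
    pos = annexb.find(START_CODE)
    while pos != -1:
        begin = pos + len(START_CODE)
        nxt = annexb.find(START_CODE, begin)
        end = nxt if nxt != -1 else len(annexb)
        # A 4-byte start code leaves one zero byte behind the previous unit; strip it.
        nal = annexb[begin:end].rstrip(b"\x00")
        if nal:
            yield nal
        pos = nxt


def count_sei_nal_units(annexb: bytes) -> int:
    """Count NAL units whose nal_unit_type (low 5 bits of the header byte) is SEI."""
    return sum(1 for nal in iter_nal_units(annexb) if nal[0] & 0x1F == NAL_TYPE_SEI)


def has_sei(annexb: bytes) -> bool:
    """Return True if the stream already contains at least one SEI NAL unit."""
    return count_sei_nal_units(annexb) > 0
```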
  • Step S202: acquire SEI information in the source video.
  • Before the source video with added SEI information is transmitted to the edge computing node for transcoding, the source video is decapsulated, for example with the ffmpeg tool, to obtain the structure avPacket that stores the compressed encoded data of the source video. The SEI information is set in the avPacket, so the SEI information of the source video can be obtained by parsing the structure avPacket.
  • the source video is decapsulated to obtain its SEI information, and no decoding or re-encoding is performed on it, which can reduce resource consumption and has a faster processing speed.
  • the source video is only decapsulated without any other processing, and does not involve modification of the specific video content in the source video, ensuring that the source video is still the original video content.
  • Step S203 storing the SEI information into a cache based on the decoding timestamp of the structure.
  • The transcoding process decodes the source video and re-encodes it.
  • As a result, the SEI information in the transcoded video is modified and becomes inconsistent with the SEI information of the source video, which makes it impossible to align the videos directly. The SEI information obtained from the source video therefore needs to be stored first, so that it can later be copied and written directly into the transcoded video. For example, if the SEI information is stored in a cache, it can be fetched from the cache after the transcoding process for copying and writing.
  • SEI information corresponds to video frames, and multiple video frames correspond to multiple SEI information.
  • SEI information can be stored according to the DTS (Decoding Time Stamp) in avPacket.
  • The DTS tells the player when to decode the data of a video frame, and corresponds to that video frame.
  • Storing SEI information according to DTS makes it possible, when the source video is transcoded and its video frames are decoded according to DTS, to obtain the SEI information corresponding to each DTS and write it accurately.
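  • Step S203 amounts to a lookup table keyed by DTS. A minimal sketch, assuming integer DTS values and raw SEI bytes per frame; the class name `SeiCache` and its methods are hypothetical:

```python
from typing import Dict, Optional


class SeiCache:
    """Holds the SEI bytes of each source frame, keyed by that frame's decoding timestamp."""

    def __init__(self) -> None:
        self._by_dts: Dict[int, bytes] = {}

    def put(self, dts: int, sei: bytes) -> None:
        """Store the SEI bytes found in the packet with this DTS."""
        self._by_dts[dts] = sei

    def take(self, dts: int) -> Optional[bytes]:
        """Return and remove the SEI bytes for this DTS, or None if nothing was stored."""
        return self._by_dts.pop(dts, None)

    def __len__(self) -> int:
        return len(self._by_dts)
```

  • Removing an entry when it is read keeps the cache bounded for a long-running live stream; whether the real system evicts entries this way is not stated in the text.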
  • Step S204: transcode the source video, obtain SEI information according to the decoding timestamp of the structure, and copy and write the SEI information to obtain a transcoded video.
  • The source video is transcoded, for example by decoding the source video and re-encoding each video frame according to its decoding timestamp DTS. The SEI information corresponding to each video frame is obtained from the cache according to the decoding timestamp DTS of the structure, and is copied directly into the transcoded video.
  • the SEI information in the transcoded video is a copy of the SEI information of the source video, which ensures that the SEI information in the transcoded video is consistent with the SEI information of the source video.
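  • The copy-and-write step can be sketched with a stand-in for the encoder. Everything here is illustrative: `Packet` is a simplified substitute for avPacket, `reencode` is a caller-supplied placeholder for the real decode and re-encode path, and the sketch assumes the re-encoded packet keeps the DTS of the source packet it came from.

```python
from dataclasses import dataclass, replace
from typing import Callable, Dict, List, Optional


@dataclass(frozen=True)
class Packet:
    dts: int
    data: bytes
    sei: Optional[bytes] = None


def transcode_with_sei_copy(
    source: List[Packet], reencode: Callable[[Packet], Packet]
) -> List[Packet]:
    """Re-encode every packet, then write back the cached source SEI that matches its DTS."""
    # Cache the source SEI by DTS before transcoding (Step S203).
    cache: Dict[int, bytes] = {p.dts: p.sei for p in source if p.sei is not None}
    transcoded = []
    for packet in source:
        encoded = reencode(packet)  # the encoder output carries no source SEI
        transcoded.append(replace(encoded, sei=cache.get(encoded.dts)))
    return transcoded
```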
  • Step S205 align the source video and the transcoded video according to the same SEI information in the source video and the transcoded video.
  • the video frames with the same SEI information are aligned to achieve video alignment of the source video and the transcoded video.
  • the source video and the transcoded video are decapsulated, such as using ffmpeg to decapsulate the source video and the transcoded video, and the SEI information of the source video and the SEI information of the transcoded video are obtained respectively.
  • the SEI information of the source video and the SEI information of the transcoded video are compared to determine the first video frame of the source video and the second video frame of the transcoded video corresponding to the same SEI information.
  • the SEI sequence number of the SEI information of each video frame in the source video is compared with the SEI sequence number of the SEI information of each video frame in the transcoded video to determine the first video frame of the source video and the second video frame of the transcoded video with the same SEI sequence number. For example, if the SEI sequence number in the first frame of the source video is 1 and the SEI sequence number in the first frame of the transcoded video is also 1, the first video frame of the source video is the first frame of the source video, and the second video frame of the transcoded video is the first frame of the transcoded video.
  • Alternatively, the display timestamp of the SEI information of each video frame in the source video is compared with the display timestamp of the SEI information of each video frame in the transcoded video. When the display timestamp in a video frame of the source video is the same as the display timestamp in a video frame of the transcoded video, these frames are determined as the first video frame of the source video and the second video frame of the transcoded video with the same display timestamp.
  • Alternatively, the identification code of the SEI information of each video frame in the source video is compared with the identification code of the SEI information of each video frame in the transcoded video. When the identification code in a video frame of the source video is the same as the identification code in a video frame of the transcoded video, these frames are determined as the first video frame of the source video and the second video frame of the transcoded video with the same identification code.
  • the number of the first video frames of the source video and the number of the second video frames of the transcoded video are the same. For example, after comparison, it is determined that the number of the first video frames of the source video is 20, and the number of the second video frames of the transcoded video is also 20.
  • The first video frames of the source video and the second video frames of the transcoded video with the same SEI information can each be sorted in the order of video playing, for example in the order of the PTS display timestamps. The first video frames of the source video include the first frame, the second frame, the third frame..., and the second video frames of the transcoded video include the first frame, the second frame, the third frame...
  • A one-to-one correspondence is then established between each first video frame and second video frame: the first frame of the source video corresponds to the first frame of the transcoded video, the second frame of the source video corresponds to the second frame of the transcoded video, the third frame of the source video corresponds to the third frame of the transcoded video, and so on.
  • the first video frame and the second video frame with a corresponding relationship can be played at the same time to complete the video alignment, and the pictures of the two video frames can be visually observed to perform quality assessment, etc.
  • the corresponding first video frame and the second video frame are stored according to the corresponding relationship to complete the video alignment of the source video and the transcoded video.
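  • The matching in Step S205 can be sketched as a join on the SEI value. The sketch assumes each frame carries one comparable SEI value (a serial number, a PTS or an identification code) and that values are unique within a video; `align_by_sei` is a hypothetical name.

```python
from typing import Hashable, List, Sequence, Tuple


def align_by_sei(
    source_sei: Sequence[Hashable], transcoded_sei: Sequence[Hashable]
) -> List[Tuple[int, int]]:
    """Pair frame indices that carry the same SEI value.

    source_sei[i] is the SEI value of source frame i, transcoded_sei[j] that of
    transcoded frame j. The result lists (source_index, transcoded_index) pairs
    in source playback order; frames without a counterpart are skipped.
    """
    position = {sei: j for j, sei in enumerate(transcoded_sei)}
    return [(i, position[sei]) for i, sei in enumerate(source_sei) if sei in position]
```

  • For example, `align_by_sei([1, 2, 3, 4, 5], [3, 4, 5, 6])` pairs source frames 2, 3 and 4 with transcoded frames 0, 1 and 2, which corresponds to a transcoded stream that starts two frames late. Both sides of the result have the same number of frames, as the text requires.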
  • By only decapsulating, the processing speed can be increased by about 50 times compared with the decoding operation, and the CPU occupancy is reduced from 1764.14% for decoding to 3.14%, greatly reducing resource consumption.
  • These figures vary slightly with device performance.
  • decoding is not required, which can greatly improve the processing speed and reduce the decoding pressure.
  • The SEI information acquired from the source video is first stored in the cache. During transcoding, this SEI information is copied and written into the transcoded video, so that the SEI information in the transcoded video finally obtained is consistent with the SEI information in the source video. The SEI information in the source video and the transcoded video can then be compared to determine the first video frame of the source video and the second video frame of the transcoded video corresponding to the same SEI information, and these frames are played and stored at the same time to complete the video alignment of the source video and the transcoded video.
  • the source video and the transcoded video are decapsulated to obtain the SEI information for comparison, and there is no need to decode the video screens of the source video and the transcoded video, which improves the processing speed and reduces the occupation of resources.
  • FIG. 3 shows a schematic diagram of the structure of a video alignment device provided by an embodiment of the present disclosure. As shown in FIG. 3, the device comprises:
  • the acquisition module 310 is adapted to acquire supplemental enhancement information (SEI) in a source video;
  • the copy-write module 320 is adapted to perform transcoding processing on the source video and copy-write the SEI information to obtain a transcoded video;
  • the alignment module 330 is adapted to perform video alignment on the source video and the transcoded video according to the same SEI information in the source video and the transcoded video.
  • the acquisition module 310 is further adapted to:
  • the device further comprises:
  • the cache module 340 is adapted to store the SEI information into a cache based on the decoding timestamp of the structure.
  • the copy-write module 320 is further adapted to:
  • the SEI information is copied and written into the transcoded video to obtain a transcoded video.
  • the device further comprises:
  • the adding module 350 is adapted to add SEI information to the video frames in the source video.
  • the SEI information is set in an incremental manner.
  • the SEI information includes a SEI sequence number, a display timestamp and/or an identification code.
  • the alignment module 330 is further adapted to:
  • the first video frame of the source video and the second video frame of the transcoded video are played and/or stored simultaneously to complete the video alignment of the source video and the transcoded video.
  • the alignment module 330 is further adapted to:
  • the identification code of the source video and the identification code of the transcoded video are compared to determine a first video frame of the source video and a second video frame of the transcoded video corresponding to the same identification code.
  • the number of first video frames of the source video is the same as the number of second video frames of the transcoded video.
  • the number of the first video frames is multiple; the number of the second video frames is multiple;
  • the alignment module 330 is further adapted to:
  • the first video frame and the second video frame having a corresponding relationship are played and/or stored simultaneously to complete the video alignment of the source video and the transcoded video.
  • the SEI information of the source video is copied and written into the transcoded video, that is, the SEI information in the obtained transcoded video is the same as that in the source video. According to the same SEI information in the source video and the transcoded video, the source video and the transcoded video can be aligned, and there is no need to decode the video images of the source video and the transcoded video, which saves resources and greatly improves the processing speed of video alignment.
  • the present disclosure also provides a non-volatile computer-readable storage medium, which stores at least one executable instruction, and the executable instruction can execute the video alignment method in any of the above method embodiments.
  • FIG. 4 shows a schematic diagram of the structure of a computing device according to an embodiment of the present disclosure.
  • the specific embodiment of the present disclosure does not limit the specific implementation of the computing device.
  • the computing device may include: a processor 402, a communication interface 404, a memory 406, and a communication bus 408.
  • the processor 402 , the communication interface 404 , and the memory 406 communicate with each other via a communication bus 408 .
  • the communication interface 404 is used to communicate with other devices such as clients or other servers.
  • Processor 402 is used to execute program 410, and can specifically execute the relevant steps in the above-mentioned video alignment method embodiments.
  • the program 410 may include program codes, which include computer operation instructions.
  • Processor 402 may be a central processing unit (CPU), or an application specific integrated circuit (ASIC), or one or more integrated circuits configured to implement the present disclosure.
  • the one or more processors included in the computing device may be processors of the same type, such as one or more CPUs; or may be processors of different types, such as one or more CPUs and one or more ASICs.
  • Memory 406 is used to store program 410.
  • Memory 406 may include high-speed RAM memory, and may also include non-volatile memory, such as at least one disk storage.
  • the program 410 can be specifically used to enable the processor 402 to perform the video alignment method in any of the above method embodiments.
  • the specific implementation of each step in the program 410 can refer to the corresponding descriptions in the corresponding steps and units in the above video alignment embodiments, which will not be repeated here.
  • Those skilled in the art can clearly understand that for the convenience and simplicity of description, the specific working process of the above-described devices and modules can refer to the corresponding process description in the above-mentioned method embodiments, which will not be repeated here.
  • modules in the device of the embodiment may be adaptively changed and arranged in one or more devices different from the embodiment.
  • the modules or units or components in the embodiments may be combined into one module or unit or component, and they may further be divided into multiple sub-modules or sub-units or sub-components.
  • the various component embodiments of the present disclosure can be implemented in hardware, or in software modules running on one or more processors, or in a combination thereof. It should be understood by those skilled in the art that a microprocessor or digital signal processor (DSP) can be used in practice to implement some or all functions of some or all components of the present disclosure.
  • the present disclosure can also be implemented as a device or apparatus program (e.g., computer program and computer program product) for executing a part or all of the methods described herein.
  • Such a program implementing the present disclosure can be stored on a computer-readable medium, or can have the form of one or more signals. Such a signal can be downloaded from an Internet website, or provided on a carrier signal, or provided in any other form.

Landscapes

  • Engineering & Computer Science
  • Multimedia
  • Signal Processing
  • Two-Way Televisions, Distribution Of Moving Picture Or The Like

Abstract

Disclosed in the present disclosure is a video alignment method and apparatus. The method comprises: acquiring supplemental enhancement information (SEI) information in a source video; transcoding the source video, and copying and writing the SEI information to obtain a transcoded video; and, according to the same SEI information in the source video and the transcoded video, performing video alignment on the source video and the transcoded video. The SEI information of the source video is copied and written into the transcoded video, such that the SEI information in the obtained transcoded video and the SEI information in the source video are the same SEI information.

Description

Video alignment method and device
CROSS-REFERENCE TO RELATED APPLICATIONS
This application claims priority to the Chinese patent application filed with the China Patent Office on November 11, 2022, with application number 2022114114271 and title "Video Alignment Method and Device", the entire contents of which are incorporated by reference into this application.
Technical Field
The present disclosure relates to the field of live broadcast technology, and in particular to a video alignment method and device.
Background
In the live broadcast business, to make it easier for users to watch videos and to reduce video freezes, the host's source video can be transcoded to obtain a transcoded video, which is pushed to users for viewing.
For transcoded videos, quality assessment is required to ensure that their quality does not affect the viewing experience of users. Quality can be assessed by playing the same picture of the source video and the transcoded video and comparing the two visually. However, transcoding introduces delay, so the pictures of the transcoded video and the source video are not aligned. A method for video alignment is therefore needed.
Summary of the invention
In view of the above problems, the embodiments of the present disclosure are proposed to provide a video alignment method and device that overcome the above problems or at least partially solve the above problems.
根据本公开实施例的第一方面,提供了一种视频对齐方法,其包括:According to a first aspect of an embodiment of the present disclosure, a video alignment method is provided, which includes:
获取源视频中的补充增强信息SEI信息;Obtaining supplemental enhancement information SEI information in the source video;
将源视频进行转码处理,并将SEI信息复制写入,得到转码视频;The source video is transcoded and the SEI information is copied and written to obtain a transcoded video;
根据源视频及转码视频中的相同SEI信息,将源视频及转码视频进行视频对齐。The source video and the transcoded video are aligned according to the same SEI information in the source video and the transcoded video.
根据本公开实施例的第二方面,提供了一种视频对齐装置,其包括:According to a second aspect of an embodiment of the present disclosure, a video alignment device is provided, comprising:
获取模块,适于获取源视频中的补充增强信息SEI信息;An acquisition module, adapted to acquire supplementary enhancement information SEI information in a source video;
复制写入模块,适于将源视频进行转码处理,并将SEI信息复制写入,得到转码视频;The copy-write module is suitable for transcoding the source video and copying and writing the SEI information to obtain the transcoded video;
对齐模块,适于根据源视频及转码视频中的相同SEI信息,将源视频及转码视频进行视频对齐。The alignment module is adapted to align the source video and the transcoded video according to the same SEI information in the source video and the transcoded video.
根据本公开实施例的第三方面,提供了一种计算设备,包括:处理器、 存储器、通信接口和通信总线,处理器、存储器和通信接口通过通信总线完成相互间的通信;According to a third aspect of an embodiment of the present disclosure, there is provided a computing device, including: a processor, The memory, the communication interface and the communication bus, the processor, the memory and the communication interface communicate with each other via the communication bus;
存储器用于存放至少一可执行指令,可执行指令使处理器执行上述视频对齐方法对应的操作。The memory is used to store at least one executable instruction, and the executable instruction enables the processor to execute operations corresponding to the above-mentioned video alignment method.
根据本公开实施例的第四方面,提供了一种非易失性计算机可读存储介质,该非易失性计算机可读存储介质中存储有至少一可执行指令,可执行指令使处理器执行如上述视频对齐方法对应的操作。According to a fourth aspect of an embodiment of the present disclosure, a non-volatile computer-readable storage medium is provided, in which at least one executable instruction is stored, and the executable instruction enables a processor to perform operations corresponding to the above-mentioned video alignment method.
根据本公开实施例的第五方面,提供了一种计算机程序产品,该计算机程序产品包括存储在上述非易失性计算机可读存储介质上的计算程序。According to a fifth aspect of an embodiment of the present disclosure, a computer program product is provided, which includes a computer program stored on the above-mentioned non-volatile computer-readable storage medium.
根据本公开的提供的视频对齐方法及装置,将源视频的SEI信息复制写入至转码视频中,即得到的转码视频与源视频中的SEI信息为同一SEI信息。根据源视频及转码视频中的相同SEI信息,可以将源视频及转码视频进行视频对齐,且无需对源视频及转码视频的视频画面进行解码,节省资源,视频对齐的处理速度也大幅提升。According to the video alignment method and device provided by the present disclosure, the SEI information of the source video is copied and written into the transcoded video, that is, the SEI information in the obtained transcoded video is the same as that in the source video. According to the same SEI information in the source video and the transcoded video, the source video and the transcoded video can be aligned, and there is no need to decode the video images of the source video and the transcoded video, which saves resources and greatly improves the processing speed of video alignment.
上述说明仅是本公开技术方案的概述,为了能够更清楚了解本公开的技术手段,而可依照说明书的内容予以实施,并且为了让本公开的上述和其它目的、特征和优点能够更明显易懂,以下特举本公开的具体实施方式。The above description is only an overview of the technical solution of the present disclosure. In order to more clearly understand the technical means of the present disclosure, it can be implemented according to the contents of the specification. In order to make the above and other purposes, features and advantages of the present disclosure more obvious and easy to understand, the specific implementation methods of the present disclosure are listed below.
Brief Description of the Drawings
Various other advantages and benefits will become clear to those of ordinary skill in the art from the detailed description of the preferred embodiments below. The drawings serve only to illustrate the preferred embodiments and are not to be considered as limiting the present disclosure. Throughout the drawings, the same reference symbols denote the same components. In the drawings:
FIG. 1 shows a flow chart of a video alignment method according to an embodiment of the present disclosure;
FIG. 2 shows a flow chart of a video alignment method according to another embodiment of the present disclosure;
FIG. 3 shows a schematic structural diagram of a video alignment apparatus according to an embodiment of the present disclosure; and
FIG. 4 shows a schematic structural diagram of a computing device according to an embodiment of the present disclosure.
Preferred Embodiments of the Present Disclosure
Exemplary embodiments of the present disclosure are described in more detail below with reference to the drawings. Although the drawings show exemplary embodiments of the present disclosure, it should be understood that the present disclosure can be implemented in various forms and should not be limited by the embodiments set forth here. Rather, these embodiments are provided so that the present disclosure can be understood more thoroughly and its scope can be conveyed completely to those skilled in the art.
First, the terms used in one or more embodiments of the present disclosure are explained.
Transcoding: re-encoding audio and video;
Video alignment: making the content played from a video before and after transcoding the same;
Video codec: a program or device that can compress or decompress digital video;
Bitstream: a data structure that describes the properties of video frames;
ffprobe: a tool that gathers information from multimedia streams and prints it in a human- and machine-readable form;
ffmpeg: an open-source computer program that can record and convert digital audio and video and turn them into streams;
SEI (Supplemental Enhancement Information): supplemental enhancement information, which gives users a way to add extra information to a video bitstream;
PTS (Presentation Time Stamp): the display timestamp, which tells the player when to play the data of a given frame.
FIG. 1 shows a flow chart of a video alignment method according to an embodiment of the present disclosure. As shown in FIG. 1, the method includes the following steps:
Step S101: acquire the supplemental enhancement information (SEI) information in the source video.
SEI information is supplemental enhancement information that can be inserted into an audio/video stream to carry extra information. SEI information includes, for example, payloadType, which defines the type of the SEI message, and payloadSize, which is the size of the SEI message. In the uuid_iso_iec_11578 payload, custom content is written starting at byte 16, that is, after the 16-byte UUID. In this embodiment the SEI information is used to align videos, so the content can be data that is easy to compare, such as a label, and the alignment is completed according to the comparison result. The specific content can be chosen according to the implementation and is not limited here.
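The field layout described above can be illustrated with a minimal sketch. It assumes the H.264 user_data_unregistered message (payloadType 5), in which payloadType and payloadSize are each coded as a run of 0xFF bytes followed by one final byte. The function names and the example UUID are hypothetical, and the NAL unit header, emulation-prevention bytes and trailing bits that a real bitstream needs are deliberately left out.

```python
import uuid

USER_DATA_UNREGISTERED = 5  # payloadType of user_data_unregistered in H.264


def _encode_ff(value: int) -> bytes:
    """Code a payloadType/payloadSize value as 0xFF bytes plus one final byte."""
    return b"\xff" * (value // 255) + bytes([value % 255])


def _decode_ff(buf: bytes, pos: int):
    """Inverse of _encode_ff; returns (value, next position)."""
    value = 0
    while buf[pos] == 0xFF:
        value += 255
        pos += 1
    return value + buf[pos], pos + 1


def build_sei_message(tag_uuid: uuid.UUID, content: bytes) -> bytes:
    """payloadType, payloadSize, then the 16-byte UUID followed by custom content."""
    payload = tag_uuid.bytes + content
    return _encode_ff(USER_DATA_UNREGISTERED) + _encode_ff(len(payload)) + payload


def parse_sei_message(msg: bytes):
    """Return (payloadType, UUID, custom content read from byte 16 onward)."""
    payload_type, pos = _decode_ff(msg, 0)
    payload_size, pos = _decode_ff(msg, pos)
    payload = msg[pos:pos + payload_size]
    return payload_type, uuid.UUID(bytes=payload[:16]), payload[16:]
```

Any content that is easy to compare can go after the UUID, for example a label such as `b"sei_idx=1"`.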
Further, SEI information is not required by the decoding process. It can be used for fault tolerance and error correction during decoding, and it can be embedded in the video bitstream. In other words, SEI information can be inserted during video generation or during video transmission, it travels with the video over the transmission link, and it can be read without decoding, which reduces resource consumption and speeds up processing. Because of these properties, once the edge computing node has received the source video and before it transcodes it, the SEI information in the source video can be obtained directly and quickly. For example, ffmpeg is used to decapsulate the source video and obtain the SEI information in it; the SEI information is identified by payloadType, payloadSize and so on, and the content contained in uuid_iso_iec_11578 is read.
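To show why no decoding is needed, here is a minimal sketch that picks SEI NAL units out of an H.264 elementary stream by looking only at start codes and the NAL header byte. It assumes Annex B framing (00 00 01 start codes) and H.264's nal_unit_type 6 for SEI; containers such as MP4 or FLV use length-prefixed NAL units instead, and the embodiment itself relies on ffmpeg for decapsulation. The function names are hypothetical.

```python
import re

H264_NAL_SEI = 6  # nal_unit_type of an SEI NAL unit in H.264
_START_CODE = re.compile(b"\x00\x00\x01")


def split_nal_units(stream: bytes) -> list:
    """Split an Annex B byte stream into NAL units without decoding any picture."""
    starts = [m.end() for m in _START_CODE.finditer(stream)]
    units = []
    for i, begin in enumerate(starts):
        end = starts[i + 1] - 3 if i + 1 < len(starts) else len(stream)
        # Trailing zero bytes belong to the next (4-byte) start code, not to the unit.
        unit = stream[begin:end].rstrip(b"\x00")
        if unit:
            units.append(unit)
    return units


def find_sei_units(stream: bytes) -> list:
    """Keep only the units whose header byte carries nal_unit_type 6."""
    return [u for u in split_nal_units(stream) if (u[0] & 0x1F) == H264_NAL_SEI]
```

Only the one-byte NAL header is inspected for each unit, so the cost is a linear scan of the bytes rather than a picture decode.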
Step S102: transcode the source video, and copy and write the SEI information, to obtain a transcoded video.
Transcoding the source video requires decoding it first and then re-encoding it. This can be done with, for example, a video codec, so that users can pull the stream and watch it. During transcoding, however, the SEI information is modified, so the SEI information in the resulting transcoded video would no longer match the SEI information in the source video. In this embodiment, therefore, the SEI information in the source video is obtained first and the source video is then transcoded; after transcoding, the obtained SEI information is copied and written into the transcoded output, so that the SEI information in the final transcoded video stays consistent with the SEI information in the source video.
Step S103: perform video alignment on the source video and the transcoded video according to the same SEI information in the source video and the transcoded video.
Once the transcoded video has been obtained, the SEI information in the source video and in the transcoded video can be compared, for example by decapsulating both videos to obtain the SEI information of each. Because SEI information is bound to video frames, the SEI information of the source video and that of the transcoded video can be compared frame by frame to find the video frames that carry the same SEI information, that is, the frames in the source video and the transcoded video that have the same playback content. The source video and the transcoded video are then aligned on these matching frames, so that both play the same content.
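The frame-by-frame comparison can be sketched as follows. This is a minimal illustration, assuming each video has already been reduced to a list of per-frame SEI values in frame order; the function name is hypothetical.

```python
def first_common_frame(source_sei, transcoded_sei):
    """Return (source index, transcoded index) of the first frame pair with equal SEI.

    Returns None when the two lists share no SEI value.
    """
    position = {}
    for i, label in enumerate(source_sei):
        position.setdefault(label, i)  # remember the first frame carrying each label
    for j, label in enumerate(transcoded_sei):
        if label in position:
            return position[label], j
    return None
```

For example, if the transcoded stream was captured two frames late, `first_common_frame([1, 2, 3, 4, 5], [3, 4, 5, 6])` gives `(2, 0)`: frame 2 of the source lines up with frame 0 of the transcoded video.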
According to the video alignment method provided by the present disclosure, the SEI information of the source video is copied and written into the transcoded video, so the SEI information in the resulting transcoded video is the same as the SEI information in the source video. Based on this shared SEI information, the source video and the transcoded video can be aligned without decoding the pictures of either video, which saves resources and greatly increases the speed of video alignment.
FIG. 2 shows a flow chart of a video alignment method according to an embodiment of the present disclosure. As shown in FIG. 2, the method includes the following steps:
Step S201: add SEI information to the video frames in the source video.
SEI information can be added to the source video of a live broadcast. It is added before the source video is transcoded. For example, it may be added when the host's source video is uploaded to the server, and the source video with the added SEI information is then transmitted to the edge computing node for transcoding; or the source video may be transmitted to the edge computing node and the SEI information added there before transcoding. This is not limited here.
The added SEI information corresponds to the video frames in the source video. Specifically, each video frame in the source video is given its own SEI information, in one-to-one correspondence. The SEI information is set in an incrementing manner. The content contained in uuid_iso_iec_11578 may be an SEI sequence number sei_idx: sei_idx is set to 1 in frame 1, to 2 in frame 2, to 3 in frame 3, and so on. The SEI sequence number effectively distinguishes the video frames, and its increasing order matches the order of the frames. Alternatively, the content contained in uuid_iso_iec_11578 may be a PTS (Presentation Time Stamp, display timestamp). The PTS tells the video player when to play the corresponding video frame and is tied to the order of the frames in the source video: an earlier frame has an earlier PTS than a later frame, so the PTS values also increase with frame order. The specific PTS values may, for example, be generated by the encoder when the source video is produced: a reference clock whose time increases linearly is selected, and the encoder stamps each video frame with the time on that clock; this timestamp is the PTS. This is only an example, and the PTS may also be set in other ways, which is not limited here. Alternatively, the content contained in uuid_iso_iec_11578 may be an identification code, such as a QR code or another kind of identification code. The identification code can be generated by converting an incrementing serial number or sequence number, with a different identification code set for each video frame, so that the identification codes in the SEI information also follow an increasing trend that matches the frame order and effectively distinguishes the frames. The SEI information above is given as examples; the specific settings depend on the implementation and are not elaborated here.
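The incrementing labels can be sketched as below. The sketch assumes a constant frame rate and a 90 kHz time base for the PTS; both values, like the function name and the dictionary keys other than sei_idx, are illustrative assumptions rather than anything fixed by the embodiment.

```python
from fractions import Fraction


def frame_labels(frame_count, fps=Fraction(30), time_base=Fraction(1, 90000)):
    """Build one label per frame: a 1-based sei_idx and a PTS in time_base ticks."""
    labels = []
    for n in range(frame_count):
        # The reference clock advances linearly: frame n is shown at n / fps seconds.
        pts = int(Fraction(n) / fps / time_base)
        labels.append({"sei_idx": n + 1, "pts": pts})
    return labels
```

Both fields increase strictly with frame order, so either one is enough to tell frames apart and to recover their order.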
SEI information belongs to the bitstream: it is extra information added to the video bitstream, and it is itself carried as bitstream data. To add it, ffmpeg can be used, for example, to extract the H.264 bitstream from the source video, and the h264_metadata bitstream filter can then be used to add the SEI information. This is only an example; the specific technical means used to add the SEI information are not limited here.
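As a rough illustration of the h264_metadata route, the helper below only builds an ffmpeg argument list and does not run it. It relies on the filter's sei_user_data option, which takes a value of the form UUID+string; the helper name, file names and UUID are hypothetical. Because the option takes one fixed string, it does not by itself produce per-frame incrementing values, which would need additional tooling.

```python
def h264_sei_command(src, dst, tag_uuid, text):
    """Build (but do not run) an ffmpeg command that stream-copies src to dst
    while the h264_metadata bitstream filter inserts unregistered user-data SEI."""
    # text must not contain characters that are special in filter arguments, such as ':'.
    bsf = f"h264_metadata=sei_user_data={tag_uuid}+{text}"
    return ["ffmpeg", "-i", src, "-c", "copy", "-bsf:v", bsf, dst]
```

The list can be passed to `subprocess.run` as-is; since no shell is involved, the `+` separator needs no quoting.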
Further, before SEI information is added to the video frames of the source video, ffmpeg can be used to check whether the source video already contains SEI information. If it does, nothing needs to be added again, which avoids setting it twice. For example, the source video is decapsulated with ffmpeg, and ffprobe is used to view information about the audio/video file, such as the container format, the audio/video stream information and the packet information. By inspecting the structure avPacket, which stores the compressed encoded data in the packet information, it can be determined whether each video frame in the source video already has corresponding SEI information.
Step S202: acquire the SEI information in the source video.
Before the source video with the added SEI information is handed to the edge computing node for transcoding, the source video is decapsulated, for example with the ffmpeg tool, to obtain the structure avPacket that stores the compressed encoded data of the source video. The SEI information is located in avPacket, so parsing the structure yields the SEI information of the source video.
Here the source video is only decapsulated to obtain its SEI information; it is not decoded or re-encoded, which reduces resource consumption and is fast. Since decapsulation is the only operation performed, the video content of the source video is not modified, and the source video remains the original video.
Step S203: store the SEI information in a cache based on the decoding timestamp of the structure.
When the source video is later transcoded, it is decoded and re-encoded. In this process the SEI information in the transcoded video is modified and no longer matches the SEI information of the source video, so the videos cannot be aligned directly. The SEI information obtained from the source video therefore needs to be stored first, so that it can later be copied directly into the transcoded video. For example, if the SEI information is stored in a cache, it can be read from the cache after transcoding and copied in.
Further, SEI information corresponds to video frames, and multiple video frames correspond to multiple pieces of SEI information, so when copying it is also necessary to determine which video frame each piece is written to. For storage, the SEI information can be keyed by the DTS (Decoding Time Stamp) in avPacket. The DTS tells the player when to decode the data of a video frame and thus corresponds to that frame. Storing the SEI information by DTS makes it easy, when the source video is transcoded, to decode a video frame according to its DTS, fetch the SEI information for that DTS, and write it to the right place.
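A DTS-keyed cache of this kind can be sketched in a few lines. The class and method names are hypothetical, and the sketch keeps everything in memory; it only illustrates the lookup structure, not how the SEI bytes are extracted from avPacket.

```python
class SeiCache:
    """Stores SEI payloads keyed by the DTS of the packet they came from."""

    def __init__(self):
        self._by_dts = {}

    def store(self, dts, sei):
        # A packet may carry more than one SEI message, so keep a list per DTS.
        self._by_dts.setdefault(dts, []).append(sei)

    def take(self, dts):
        """Return and remove the SEI payloads for this DTS (empty list if none)."""
        return self._by_dts.pop(dts, [])

    def __len__(self):
        return len(self._by_dts)
```

Removing entries as they are consumed keeps the cache bounded for a long-running live stream.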
Step S204: transcode the source video, fetch the SEI information according to the decoding timestamp of the structure, and copy and write the SEI information, to obtain a transcoded video.
The source video is transcoded: for example, each video frame is decoded according to its DTS and then re-encoded. At the same time, the SEI information corresponding to each video frame is fetched from the cache according to the decoding timestamp (DTS) of the structure and copied directly into the transcoded output, giving the transcoded video. Because the SEI information in the transcoded video is a copy of the SEI information of the source video, the two are guaranteed to be consistent.
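The copy-back step can be illustrated with a toy pipeline. Everything here is a stand-in: `Packet` is a hypothetical simplification of avPacket, and `fake_transcode` merely simulates a decode and re-encode that loses the original SEI. The sketch also assumes the transcoder keeps each frame's DTS unchanged, which is what makes the DTS usable as the lookup key.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Packet:
    dts: int
    data: bytes
    sei: tuple = ()  # SEI payloads attached to this frame


def fake_transcode(packet):
    """Stand-in for decode + re-encode: new payload, original SEI dropped."""
    return Packet(dts=packet.dts, data=b"re-encoded:" + packet.data)


def transcode_with_sei(source_packets):
    """Cache SEI by DTS, transcode each packet, then write the cached SEI back."""
    cache = {p.dts: p.sei for p in source_packets}
    output = []
    for packet in source_packets:
        transcoded = fake_transcode(packet)
        output.append(replace(transcoded, sei=cache.get(transcoded.dts, ())))
    return output
```

After the call, every output packet carries exactly the SEI of the source packet with the same DTS, even though its encoded data has changed.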
Step S205: perform video alignment on the source video and the transcoded video according to the same SEI information in the source video and the transcoded video.
During video alignment, the video frames that carry the same SEI information in the source video and in the transcoded video are aligned with each other, which aligns the two videos. Specifically, the source video and the transcoded video are decapsulated, for example with ffmpeg, to obtain the SEI information of the source video and the SEI information of the transcoded video. The two sets of SEI information are compared to determine, for each shared piece of SEI information, the corresponding first video frame of the source video and second video frame of the transcoded video. If the SEI information is an SEI sequence number, the SEI sequence numbers of the frames in the source video are compared with those of the frames in the transcoded video, and the first video frame of the source video and the second video frame of the transcoded video that have the same SEI sequence number are determined. For example, if the SEI sequence number in frame 1 of the source video is 1 and the SEI sequence number in frame 1 of the transcoded video is also 1, then the first video frame is frame 1 of the source video and the second video frame is frame 1 of the transcoded video. Alternatively, if the SEI information is a display timestamp, the display timestamps in the SEI information of the frames in the source video are compared with those of the frames in the transcoded video; when a frame of the source video and a frame of the transcoded video have the same display timestamp, they are determined to be the first video frame and the second video frame respectively. Alternatively, the identification codes in the SEI information of the frames in the source video are compared with those of the frames in the transcoded video; when a frame of the source video and a frame of the transcoded video have the same identification code, they are determined to be the first video frame and the second video frame respectively. Further, the number of first video frames of the source video equals the number of second video frames of the transcoded video: for example, if the comparison finds 20 first video frames in the source video, it also finds 20 second video frames in the transcoded video.
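The comparison works the same way whether the SEI value is a sequence number, a display timestamp or an identification code, as the following minimal sketch shows. It assumes each video has been reduced to a list of (frame index, SEI value) pairs and that SEI values are unique within a video; the function name is hypothetical.

```python
def match_frames(source, transcoded):
    """Pair frames that carry the same SEI value.

    source, transcoded: lists of (frame_index, sei_value).
    Returns a list of (first_video_frame_index, second_video_frame_index).
    """
    by_value = {value: index for index, value in source}
    return [(by_value[value], index)
            for index, value in transcoded
            if value in by_value]
```

Each matched SEI value contributes exactly one frame from each video, which is why the number of first video frames always equals the number of second video frames.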
When there are multiple first video frames and second video frames, and the first video frames of the source video and the second video frames of the transcoded video are to be played or stored together, the frames with matching SEI information can first be sorted in playback order. For example, the first video frames of the source video include frame 1, frame 2, frame 3 and so on, and the second video frames of the transcoded video include frame 1, frame 2, frame 3 and so on; each set is sorted in its own playback order, for example by PTS, and a one-to-one correspondence is established between the first video frames and the second video frames: frame 1 of the source video corresponds to frame 1 of the transcoded video, frame 2 to frame 2, frame 3 to frame 3, and so on. During playback, each first video frame is played at the same time as its corresponding second video frame, which completes the video alignment; the two pictures can then be compared visually for quality assessment and similar purposes. For storage, each first video frame is stored together with its corresponding second video frame according to the correspondence, which completes the alignment of the source video and the transcoded video.
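The sort-then-pair step can be sketched as follows, assuming each matched frame is represented as a dictionary with a "pts" and a "sei" entry. The function name and the dictionary layout are hypothetical.

```python
def pair_in_playback_order(first_frames, second_frames):
    """Sort each set of matched frames by PTS and pair them one-to-one."""
    first_sorted = sorted(first_frames, key=lambda f: f["pts"])
    second_sorted = sorted(second_frames, key=lambda f: f["pts"])
    if len(first_sorted) != len(second_sorted):
        raise ValueError("matched frame sets must have the same size")
    pairs = list(zip(first_sorted, second_sorted))
    for first, second in pairs:
        # Sanity check: paired frames must carry the same SEI information.
        if first["sei"] != second["sei"]:
            raise ValueError("frames paired by order carry different SEI")
    return pairs
```

The returned pairs can be handed to a side-by-side player or written out together, one pair at a time.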
With the above processing, taking a 2 s video as an example, decapsulating alone is about 50 times faster than decoding, and CPU usage drops from 1764.14% when decoding to 3.14%, which greatly reduces resource consumption. In practice the figures vary somewhat with device performance. Because this embodiment does not need to decode in order to align the videos, it substantially increases processing speed and relieves decoding load.
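As a quick check on the figures above, the CPU-usage reduction can be expressed as a ratio. This uses only the two percentages quoted in the text; reading a value above 100% as load summed over several cores is an assumption.

```latex
\[
\frac{1764.14\%}{3.14\%} \approx 562
\]
```

On these figures, decapsulation uses roughly 1/562 of the CPU that decoding does, while the reported speed-up for the 2 s clip is about 50 times.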
According to the video alignment method provided by the present disclosure, the SEI information obtained from the source video is first stored in a cache and, during transcoding, is copied and written into the transcoded video, so that the SEI information in the final transcoded video is consistent with the SEI information in the source video. The SEI information in the source video and in the transcoded video can then be compared to determine, for each shared piece of SEI information, the corresponding first video frame of the source video and second video frame of the transcoded video, and these frames are played or stored together, which completes the alignment of the source video and the transcoded video. Throughout this process the source video and the transcoded video are only decapsulated to obtain the SEI information for comparison; their pictures do not need to be decoded, which increases processing speed and reduces resource usage.
图3示出了本公开一实施例提供的视频对齐装置的结构示意图。如图3 所示,该装置包括:FIG3 shows a schematic diagram of the structure of a video alignment device provided by an embodiment of the present disclosure. As shown, the device comprises:
获取模块310,适于获取源视频中的补充增强信息SEI信息;An acquisition module 310 is adapted to acquire supplemental enhancement information SEI information in a source video;
复制写入模块320,适于将源视频进行转码处理,并将SEI信息复制写入,得到转码视频;The copy-write module 320 is adapted to perform transcoding processing on the source video and copy-write the SEI information to obtain a transcoded video;
对齐模块330,适于根据源视频及转码视频中的相同SEI信息,将源视频及转码视频进行视频对齐。The alignment module 330 is adapted to perform video alignment on the source video and the transcoded video according to the same SEI information in the source video and the transcoded video.
可选地,获取模块310进一步适于:Optionally, the acquisition module 310 is further adapted to:
对源视频进行解封装,得到存储压缩编码数据的结构体;Decapsulate the source video to obtain a structure storing compressed coded data;
解析结构体,得到源视频的SEI信息。Parse the structure to obtain the SEI information of the source video.
可选地,装置还包括:Optionally, the device further comprises:
缓存模块340,适于基于结构体的解码时间戳将SEI信息存储至缓存。The cache module 340 is adapted to store the SEI information into a cache based on the decoding timestamp of the structure.
可选地,复制写入模块320进一步适于:Optionally, the copy-write module 320 is further adapted to:
将源视频进行转码处理;Transcode the source video;
从缓存中根据结构体的解码时间戳获取对应的SEI信息;Get the corresponding SEI information from the cache according to the decoding timestamp of the structure;
将SEI信息复制写入转码处理后的视频,得到转码视频。The SEI information is copied and written into the transcoded video to obtain a transcoded video.
Optionally, the device further comprises:
an adding module 350, adapted to add SEI information to the video frames in the source video.
Optionally, the SEI information is set incrementally.
Optionally, the SEI information comprises an SEI sequence number, a display timestamp, and/or an identification code.
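As an illustration of incrementally assigned SEI information, the sketch below generates one record per frame with an increasing sequence number, the frame's display timestamp, and an identification code. The record layout, the `pack`/`unpack` byte format, and the `stream_id`-based identification code are assumptions; the disclosure does not specify how these fields are encoded.

```python
import struct
from dataclasses import dataclass
from typing import Iterable, List

_HEADER = ">IqH"  # seq (uint32), pts (int64), ident length (uint16)


@dataclass(frozen=True)
class SeiInfo:
    seq: int    # SEI sequence number, incremented per frame
    pts: int    # display (presentation) timestamp
    ident: str  # identification code (format is hypothetical)

    def pack(self) -> bytes:
        """Serialize to bytes suitable for an SEI payload body."""
        ident = self.ident.encode("utf-8")
        return struct.pack(_HEADER, self.seq, self.pts, len(ident)) + ident

    @classmethod
    def unpack(cls, raw: bytes) -> "SeiInfo":
        """Inverse of pack()."""
        seq, pts, length = struct.unpack_from(_HEADER, raw, 0)
        offset = struct.calcsize(_HEADER)
        return cls(seq, pts, raw[offset:offset + length].decode("utf-8"))


def make_sei_infos(
    pts_list: Iterable[int], stream_id: str, start: int = 0
) -> List[SeiInfo]:
    """Assign an incrementing sequence number to each frame timestamp."""
    return [
        SeiInfo(seq=start + i, pts=pts, ident=f"{stream_id}-{start + i}")
        for i, pts in enumerate(pts_list)
    ]
```

An incrementing value makes every frame's SEI distinct within the stream, which is what allows a later exact-match comparison to identify a single frame.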
Optionally, the alignment module 330 is further adapted to:
decapsulate the source video and the transcoded video to obtain the SEI information of the source video and the SEI information of the transcoded video, respectively;
compare the SEI information of the source video with the SEI information of the transcoded video to determine the first video frame of the source video and the second video frame of the transcoded video that correspond to the same SEI information; and
play and/or store the first video frame of the source video and the second video frame of the transcoded video simultaneously, completing the video alignment of the source video and the transcoded video.
Optionally, the alignment module 330 is further adapted to:
compare the SEI sequence numbers of the source video with the SEI sequence numbers of the transcoded video to determine the first video frame of the source video and the second video frame of the transcoded video that correspond to the same SEI sequence number;
or,
compare the display timestamps of the source video with the display timestamps of the transcoded video to determine the first video frame of the source video and the second video frame of the transcoded video that correspond to the same display timestamp;
or,
compare the identification codes of the source video with the identification codes of the transcoded video to determine the first video frame of the source video and the second video frame of the transcoded video that correspond to the same identification code.
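The three alternatives above differ only in which SEI field is compared, so they can be sketched with one function parameterized by the field name. The dictionary keys `seq`, `pts`, and `ident` are hypothetical names for the sequence number, display timestamp, and identification code, and the sketch assumes each value is unique within a stream.

```python
from typing import Dict, Hashable, List, Mapping, Sequence, Tuple


def match_frames(
    source: Sequence[Mapping[str, Hashable]],
    transcoded: Sequence[Mapping[str, Hashable]],
    key: str = "seq",
) -> List[Tuple[int, int]]:
    """Return (source_index, transcoded_index) pairs whose SEI field is equal."""
    index: Dict[Hashable, int] = {}
    for i, sei in enumerate(source):
        index.setdefault(sei[key], i)  # keep the first frame for each value
    pairs = []
    for j, sei in enumerate(transcoded):
        i = index.get(sei[key])
        if i is not None:
            pairs.append((i, j))
    return pairs
```

For example, if the transcoded stream starts one frame later than the source, the first pair returned is `(1, 0)`, which gives the offset at which simultaneous playback or storage should begin.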
Optionally, the number of first video frames of the source video is the same as the number of second video frames of the transcoded video.
Optionally, there are multiple first video frames and multiple second video frames;
the alignment module 330 is further adapted to:
sort the first video frames of the source video and the second video frames of the transcoded video that have the same SEI information in playback order, and establish a one-to-one correspondence between them; and
play and/or store each corresponding first video frame and second video frame simultaneously, completing the video alignment of the source video and the transcoded video.
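When several frames share the same SEI information, the sort-and-pair step can be sketched as below. The `Frame` type and the use of the presentation timestamp as the playback order are assumptions. Skipping groups whose frame counts differ is a design choice of this sketch, motivated by the optional condition that the two counts are equal; the disclosure does not say how unequal counts are handled.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, NamedTuple, Tuple


class Frame(NamedTuple):
    sei: str  # SEI value, possibly shared by several frames
    pts: int  # presentation timestamp, used as playback order


def _group_by_sei(frames: Iterable[Frame]) -> Dict[str, List[Frame]]:
    """Group frames by SEI value, each group sorted in playback order."""
    groups: Dict[str, List[Frame]] = defaultdict(list)
    for frame in frames:
        groups[frame.sei].append(frame)
    for group in groups.values():
        group.sort(key=lambda f: f.pts)
    return groups


def pair_frames(
    source: Iterable[Frame], transcoded: Iterable[Frame]
) -> List[Tuple[Frame, Frame]]:
    """Pair frames with equal SEI one-to-one, in playback order."""
    src = _group_by_sei(source)
    dst = _group_by_sei(transcoded)
    pairs: List[Tuple[Frame, Frame]] = []
    for sei, src_frames in src.items():
        dst_frames = dst.get(sei, [])
        if len(src_frames) == len(dst_frames):  # unequal groups are skipped
            pairs.extend(zip(src_frames, dst_frames))
    pairs.sort(key=lambda pair: pair[0].pts)
    return pairs
```

Each returned pair names one source frame and one transcoded frame that should be played or stored together.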
For the description of each of the above modules, refer to the corresponding description in the method embodiments; it is not repeated here.
According to the video alignment device provided by the present disclosure, the SEI information of the source video is copied and written into the transcoded video, so that the transcoded video and the source video carry the same SEI information. Based on this shared SEI information, the source video and the transcoded video can be aligned without decoding the video pictures of either, which saves resources and greatly increases the processing speed of video alignment.
The present disclosure further provides a non-volatile computer-readable storage medium storing at least one executable instruction, the executable instruction being capable of performing the video alignment method of any of the above method embodiments.
FIG. 4 shows a schematic structural diagram of a computing device according to an embodiment of the present disclosure. The specific embodiments of the present disclosure do not limit the specific implementation of the computing device.
As shown in FIG. 4, the computing device may comprise a processor 402, a communications interface 404, a memory 406, and a communication bus 408.
Wherein:
The processor 402, the communications interface 404, and the memory 406 communicate with one another via the communication bus 408.
The communications interface 404 is used to communicate with network elements of other devices, such as clients or other servers.
The processor 402 is used to execute a program 410, and may specifically perform the relevant steps of the above video alignment method embodiments.
Specifically, the program 410 may comprise program code, and the program code comprises computer operation instructions.
The processor 402 may be a central processing unit (CPU), an application-specific integrated circuit (ASIC), or one or more integrated circuits configured to implement the present disclosure. The one or more processors included in the computing device may be of the same type, such as one or more CPUs, or of different types, such as one or more CPUs and one or more ASICs.
The memory 406 is used to store the program 410. The memory 406 may comprise high-speed RAM, and may also comprise non-volatile memory, such as at least one disk storage device.
The program 410 may specifically be used to cause the processor 402 to perform the video alignment method of any of the above method embodiments. For the specific implementation of each step in the program 410, refer to the descriptions of the corresponding steps and units in the above video alignment embodiments; they are not repeated here. Those skilled in the art will clearly understand that, for convenience and brevity of description, the specific working processes of the devices and modules described above can be found in the corresponding process descriptions of the foregoing method embodiments, and are not repeated here.
The algorithms or displays provided herein are not inherently related to any particular computer, virtual system, or other device. Various general-purpose systems may also be used with the teachings herein. From the above description, the structure required to construct such systems is apparent. Moreover, the present disclosure is not directed to any particular programming language. It should be understood that the content of the present disclosure described herein may be implemented in various programming languages, and that the above description of a specific language is given in order to disclose the preferred embodiment of the present disclosure.
Numerous specific details are set forth in the description provided herein. It is understood, however, that embodiments of the present disclosure may be practiced without these specific details. In some instances, well-known methods, structures, and techniques are not shown in detail so as not to obscure the understanding of this description.
Similarly, it should be understood that, in order to streamline the present disclosure and aid the understanding of one or more of the various inventive aspects, the features of the present disclosure are sometimes grouped together in a single embodiment, figure, or description thereof in the above description of exemplary embodiments. This manner of disclosure, however, should not be interpreted as reflecting an intention that the claimed disclosure requires more features than are expressly recited in each claim. Rather, as the following claims reflect, inventive aspects lie in fewer than all features of a single embodiment disclosed above. Accordingly, the claims following the detailed description are hereby expressly incorporated into the detailed description, with each claim standing on its own as a separate embodiment of the present disclosure.
Those skilled in the art will understand that the modules in the device of an embodiment may be adaptively changed and arranged in one or more devices different from that embodiment. The modules, units, or components of the embodiments may be combined into a single module, unit, or component, and may furthermore be divided into multiple sub-modules, sub-units, or sub-components. Except where at least some of such features and/or processes or units are mutually exclusive, all features disclosed in this specification (including the accompanying claims, abstract, and drawings), and all processes or units of any method or device so disclosed, may be combined in any combination. Unless expressly stated otherwise, each feature disclosed in this specification (including the accompanying claims, abstract, and drawings) may be replaced by an alternative feature serving the same, an equivalent, or a similar purpose.
In addition, those skilled in the art will understand that, although some embodiments herein include certain features included in other embodiments and not others, combinations of features of different embodiments are meant to fall within the scope of the present disclosure and to form different embodiments. For example, in the following claims, any of the claimed embodiments may be used in any combination.
The various component embodiments of the present disclosure may be implemented in hardware, in software modules running on one or more processors, or in a combination thereof. Those skilled in the art will understand that a microprocessor or a digital signal processor (DSP) may be used in practice to implement some or all of the functions of some or all of the components according to the present disclosure. The present disclosure may also be implemented as a device or apparatus program (for example, a computer program or a computer program product) for performing part or all of the methods described herein. Such a program implementing the present disclosure may be stored on a computer-readable medium, or may take the form of one or more signals. Such signals may be downloaded from an Internet website, provided on a carrier signal, or provided in any other form.
It should be noted that the above embodiments illustrate rather than limit the present disclosure, and that those skilled in the art may design alternative embodiments without departing from the scope of the appended claims. In the claims, any reference sign placed between parentheses shall not be construed as limiting the claim. The word "comprising" does not exclude the presence of elements or steps not listed in a claim. The word "a" or "an" preceding an element does not exclude the presence of a plurality of such elements. The present disclosure may be implemented by means of hardware comprising several distinct elements, and by means of a suitably programmed computer. In a unit claim enumerating several means, several of these means may be embodied by one and the same item of hardware. The use of the words first, second, third, and so on does not indicate any order; these words may be interpreted as names. Unless otherwise specified, the steps in the above embodiments should not be understood as limiting the order of execution.

Claims (15)

  1. A video alignment method, comprising:
    acquiring supplemental enhancement information (SEI) from a source video;
    transcoding the source video, and copying and writing the SEI information, to obtain a transcoded video; and
    aligning the source video and the transcoded video according to the same SEI information in the source video and the transcoded video.
  2. The method according to claim 1, wherein acquiring the SEI information from the source video further comprises:
    decapsulating the source video to obtain a structure storing compressed encoded data; and
    parsing the structure to obtain the SEI information of the source video.
  3. The method according to claim 1, wherein, after acquiring the SEI information from the source video, the method further comprises:
    storing the SEI information in a cache based on a decoding timestamp of the structure.
  4. The method according to claim 3, wherein transcoding the source video, and copying and writing the SEI information, to obtain the transcoded video further comprises:
    transcoding the source video;
    obtaining the corresponding SEI information from the cache according to the decoding timestamp of the structure; and
    copying and writing the SEI information into the video resulting from the transcoding, to obtain the transcoded video.
  5. The method according to claim 1, wherein the method further comprises:
    adding SEI information to video frames in the source video.
  6. The method according to claim 5, wherein the SEI information is set incrementally.
  7. The method according to claim 5, wherein the SEI information comprises an SEI sequence number, a display timestamp, and/or an identification code.
  8. The method according to any one of claims 1-7, wherein aligning the source video and the transcoded video according to the same SEI information in the source video and the transcoded video further comprises:
    decapsulating the source video and the transcoded video to obtain the SEI information of the source video and the SEI information of the transcoded video, respectively;
    comparing the SEI information of the source video with the SEI information of the transcoded video to determine a first video frame of the source video and a second video frame of the transcoded video that correspond to the same SEI information; and
    playing and/or storing the first video frame of the source video and the second video frame of the transcoded video simultaneously, completing the video alignment of the source video and the transcoded video.
  9. The method according to claim 8, wherein comparing the SEI information of the source video with the SEI information of the transcoded video to determine the first video frame of the source video and the second video frame of the transcoded video that correspond to the same SEI information further comprises:
    comparing the SEI sequence numbers of the source video with the SEI sequence numbers of the transcoded video to determine the first video frame of the source video and the second video frame of the transcoded video that correspond to the same SEI sequence number;
    or,
    comparing the display timestamps of the source video with the display timestamps of the transcoded video to determine the first video frame of the source video and the second video frame of the transcoded video that correspond to the same display timestamp;
    or,
    comparing the identification codes of the source video with the identification codes of the transcoded video to determine the first video frame of the source video and the second video frame of the transcoded video that correspond to the same identification code.
  10. The method according to claim 8 or 9, wherein the number of first video frames of the source video is the same as the number of second video frames of the transcoded video.
  11. The method according to any one of claims 8-10, wherein there are multiple first video frames and multiple second video frames;
    and wherein playing and/or storing the first video frame of the source video and the second video frame of the transcoded video simultaneously, completing the video alignment of the source video and the transcoded video, further comprises:
    sorting the first video frames of the source video and the second video frames of the transcoded video that have the same SEI information in playback order, and establishing a one-to-one correspondence between them; and
    playing and/or storing each corresponding first video frame and second video frame simultaneously, completing the video alignment of the source video and the transcoded video.
  12. A video alignment device, comprising:
    an acquisition module, adapted to acquire supplemental enhancement information (SEI) from a source video;
    a copy-write module, adapted to transcode the source video, and to copy and write the SEI information, to obtain a transcoded video; and
    an alignment module, adapted to align the source video and the transcoded video according to the same SEI information in the source video and the transcoded video.
  13. A computing device, comprising a processor, a memory, a communications interface, and a communication bus, wherein the processor, the memory, and the communications interface communicate with one another via the communication bus;
    the memory is used to store at least one executable instruction, and the executable instruction causes the processor to perform operations corresponding to the video alignment method according to any one of claims 1-11.
  14. A non-volatile computer-readable storage medium storing at least one executable instruction, wherein the executable instruction causes a processor to perform operations corresponding to the video alignment method according to any one of claims 1-11.
  15. A computer program product, comprising a computer program stored on a non-volatile computer-readable storage medium, wherein the computer program comprises program instructions which, when executed by a processor, cause the processor to perform operations corresponding to the video alignment method according to any one of claims 1-11.
PCT/CN2023/108948 2022-11-11 2023-07-24 Video alignment method and apparatus WO2024098836A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202211411427.1A CN115802054A (en) 2022-11-11 2022-11-11 Video alignment method and device
CN202211411427.1 2022-11-11

Publications (1)

Publication Number Publication Date
WO2024098836A1 true WO2024098836A1 (en) 2024-05-16

Family

ID=85436927

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2023/108948 WO2024098836A1 (en) 2022-11-11 2023-07-24 Video alignment method and apparatus

Country Status (2)

Country Link
CN (1) CN115802054A (en)
WO (1) WO2024098836A1 (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115802054A (en) * 2022-11-11 2023-03-14 上海哔哩哔哩科技有限公司 Video alignment method and device

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20140140417A1 (en) * 2012-11-16 2014-05-22 Gary K. Shaffer System and method for providing alignment of multiple transcoders for adaptive bitrate streaming in a network environment
US20160191961A1 (en) * 2014-12-31 2016-06-30 Imagine Communications Corp. Fragmented video transcoding systems and methods
CN107995155A (en) * 2017-10-11 2018-05-04 上海聚力传媒技术有限公司 Video data encoding, decoding, methods of exhibiting, video system and storage medium
CN110213615A (en) * 2018-04-04 2019-09-06 腾讯科技(深圳)有限公司 Video transcoding method, device, server and storage medium
CN110401850A (en) * 2019-07-30 2019-11-01 网宿科技股份有限公司 A kind of method and apparatus of the customized SEI of transparent transmission
CN113905257A (en) * 2021-09-29 2022-01-07 北京字节跳动网络技术有限公司 Video code rate switching method and device, electronic equipment and storage medium
CN115134622A (en) * 2022-06-29 2022-09-30 北京奇艺世纪科技有限公司 Video data alignment method, device, equipment and storage medium
CN115802054A (en) * 2022-11-11 2023-03-14 上海哔哩哔哩科技有限公司 Video alignment method and device

Also Published As

Publication number Publication date
CN115802054A (en) 2023-03-14

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 23887532

Country of ref document: EP

Kind code of ref document: A1