WO2020119670A1 - Video transcoding method and device - Google Patents

Video transcoding method and device

Info

Publication number
WO2020119670A1
Authority
WO
WIPO (PCT)
Prior art keywords
path
video
transcoding
path information
source video
Prior art date
Application number
PCT/CN2019/124232
Other languages
French (fr)
Chinese (zh)
Inventor
李庆文
Original Assignee
阿里巴巴集团控股有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 阿里巴巴集团控股有限公司
Publication of WO2020119670A1

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/40 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using video transcoding, i.e. partial or full decoding of a coded input stream followed by re-encoding of the decoded output stream
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20 Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/23 Processing of content or additional data; Elementary server operations; Server middleware
    • H04N21/234 Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs
    • H04N21/2343 Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs involving reformatting operations of video signals for distribution or compliance with end-user requests or end-user device requirements
    • H04N21/234309 Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs involving reformatting operations of video signals for distribution or compliance with end-user requests or end-user device requirements by transcoding between formats or standards, e.g. from MPEG-2 to MPEG-4 or from Quicktime to Realvideo
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43 Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/44 Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs
    • H04N21/4402 Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs involving reformatting operations of video signals for household redistribution, storage or real-time display
    • H04N21/440218 Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs involving reformatting operations of video signals for household redistribution, storage or real-time display by transcoding between formats or standards, e.g. from MPEG-2 to MPEG-4
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/50 Allocation of resources, e.g. of the central processing unit [CPU]

Definitions

  • This application relates to the field of Internet technology, in particular to a video transcoding method and device.
  • In order to provide users with videos of different picture quality, a video playback platform usually needs to transcode the source video to generate multiple videos with different resolutions and different bit rates.
  • The purpose of the embodiments of the present application is to provide a video transcoding method and device that can improve the efficiency of video transcoding.
  • An embodiment of the present application provides a video transcoding method. The method includes: acquiring a source video; determining path information of transcoding from the source video to a target video, wherein the path information includes a transcoding path and a transcoding method between the nodes in the transcoding path; and transcoding the source video based on the determined path information.
  • The embodiments of the present application also provide a video transcoding device.
  • The device includes: a video acquisition unit for acquiring a source video; a path determination unit for determining path information of transcoding from the source video to a target video, wherein the path information includes a transcoding path and a transcoding method between the nodes in the transcoding path; and a transcoding unit for transcoding the source video based on the obtained path information.
  • The embodiments of the present application also provide a video transcoding device.
  • The device includes a memory and a processor.
  • The memory is used to store a computer program.
  • When the computer program is executed by the processor, the foregoing video transcoding method is implemented.
  • The technical solution provided in this application can determine the path information of transcoding from the source video to the target video.
  • The path information includes a transcoding path and a transcoding method between nodes in the transcoding path.
  • The source video and the intermediate nodes in the transcoding path can be transcoded in turn according to the transcoding path, using the transcoding method between the nodes in the transcoding path, to output the target video.
  • The entire transcoding process can be completed on one transcoding machine, which reduces uploads to and reads from an external storage platform, thereby reducing the time for video transcoding and improving its efficiency.
  • FIG. 1 is a schematic diagram of a video transcoding method in an embodiment of this application.
  • FIG. 2 is a schematic diagram of a directed acyclic transcoding architecture in an embodiment of this application.
  • FIG. 3 is a schematic structural diagram of a video transcoding device according to an embodiment of the present application.
  • FIG. 4 is a schematic structural diagram of another video transcoding device in an embodiment of the present application.
  • The present application provides a video transcoding method, which can be applied to a terminal device having an image processing function.
  • The terminal device may be, for example, a desktop computer, a notebook computer, a tablet computer, a workstation, or the like.
  • The method can also be applied to a business server of a video playing website.
  • The business server may be an independent server or a server cluster composed of multiple servers.
  • The video transcoding method provided in this application includes the following steps.
  • Acquiring the source video may include reading the source video according to a provided storage path, or receiving the source video from another terminal device.
  • S13: Determine path information of transcoding from the source video to the target video, wherein the path information includes a transcoding path and a transcoding method between nodes in the transcoding path.
  • Transcoding from the source video to the target video may require transcoding between multiple nodes.
  • For example, in a transcoding process that involves frame rate conversion (FRC), two intermediate nodes need to be passed through.
  • The entire transcoding process can be divided into four levels: the first level, with the source video as the root node; the second level, with the high-frame-rate source video as the child node; the third level, with the video of the specified resolution as the third-level node; and the fourth level, with the target video of the specified resolution and the specified video format as the leaf node.
  • The output of the second-level node depends on the first-level node, the output of the third-level node depends on the second-level node, and the output of the fourth-level node depends on the third-level node.
  • The path information of transcoding from the source video to the target video may be determined first.
  • The path information may include a transcoding path and a transcoding method between the nodes in the transcoding path. In this way, based on the determined path information, the transcoding process of the source video can be carried out in one terminal device to obtain the target video.
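The four-level chain described above can be sketched as a small data structure. This is only an illustration under stated assumptions: the names `PathInfo`, `run_path`, and the three method functions are hypothetical, and a video is modeled as a plain dictionary of parameters rather than real media data.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# A "video" is modeled as a dict of parameters (assumption for illustration);
# a real implementation would invoke an encoder instead of editing a dict.
Video = Dict[str, object]
Method = Callable[[Video], Video]


@dataclass
class PathInfo:
    nodes: List[str]       # node names from source to target, in order
    methods: List[Method]  # methods[i] transcodes nodes[i] into nodes[i + 1]


def run_path(source: Video, info: PathInfo) -> Video:
    """Apply each transcoding method in path order and return the target video."""
    if len(info.methods) != len(info.nodes) - 1:
        raise ValueError("need exactly one method per edge of the path")
    current = source
    for method in info.methods:
        current = method(current)
    return current


# Hypothetical methods for the four-level example: raise the frame rate,
# change the resolution, then change the video format.
def to_high_frame_rate(v: Video) -> Video:
    return {**v, "fps": 60}


def to_resolution(v: Video) -> Video:
    return {**v, "height": 720}


def to_format(v: Video) -> Video:
    return {**v, "format": "mp4"}


info = PathInfo(
    nodes=["source", "high_fps", "resized", "target"],
    methods=[to_high_frame_rate, to_resolution, to_format],
)
target = run_path({"fps": 25, "height": 1080, "format": "flv"}, info)
```

Each node's output depends only on the node one level above it, which is why a simple loop over the methods is enough for a single path.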
  • The transcoding process from the source video to the target video may be a multi-layer dependent transcoding process that already exists in practical applications. In that case, the dependencies between the transcoding tasks can be obtained from the multiple existing separate transcoding tasks, and the inputs and outputs of these transcoding tasks can be used as the nodes of a transcoding path.
  • In this way, a transcoding path for transcoding from the source video to a target video, and the nodes in that transcoding path, can be obtained.
  • The transcoding method between nodes can also be obtained directly from each separate transcoding task. In this way, the path information of transcoding from the source video to the target video can be determined.
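One way to picture deriving a path from existing separate tasks is a search over their input/output dependencies. The sketch below assumes each task is a hypothetical `(input, output, method_name)` tuple and uses breadth-first search; the text does not say which search strategy is used, so this choice is an assumption.

```python
from collections import deque
from typing import Dict, List, Optional, Tuple

# Each existing task is (input node, output node, method name); all names
# here are hypothetical and only serve the illustration.
Task = Tuple[str, str, str]


def find_path(tasks: List[Task], source: str, target: str) -> Optional[List[Task]]:
    """Chain existing tasks from source to target using breadth-first search."""
    outgoing: Dict[str, List[Task]] = {}
    for task in tasks:
        outgoing.setdefault(task[0], []).append(task)

    queue = deque([source])
    came_from: Dict[str, Task] = {}
    visited = {source}
    while queue:
        node = queue.popleft()
        if node == target:
            break
        for task in outgoing.get(node, []):
            nxt = task[1]
            if nxt not in visited:
                visited.add(nxt)
                came_from[nxt] = task
                queue.append(nxt)

    if target not in visited:
        return None  # no chain of existing tasks reaches the target

    # Walk back from the target to recover the ordered list of tasks.
    path: List[Task] = []
    node = target
    while node != source:
        task = came_from[node]
        path.append(task)
        node = task[0]
    path.reverse()
    return path


existing_tasks = [
    ("source", "high_fps", "frame_rate_conversion"),
    ("high_fps", "resized", "scale"),
    ("resized", "target", "remux"),
    ("source", "thumbnail", "snapshot"),
]
path = find_path(existing_tasks, "source", "target")
```

The returned list carries both the transcoding path (the chained nodes) and the transcoding method between each pair of nodes (the third tuple element).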
  • The transcoding method corresponds to the video transcoding parameters required for transcoding one video into another, and the values of these transcoding parameters may be determined from the values of the video parameters and audio parameters of the two videos.
  • The transcoding parameters may include, for example, fidelity, resolution, and transmission bit rate. After these transcoding parameters are set, the video can be transcoded to obtain a transcoded video that conforms to them.
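As a rough illustration of deriving transcoding parameters from the parameters of two videos, the sketch below simply keeps the target value of every parameter that differs from the source. The function name and the parameter keys are assumptions; the text does not specify how the values are computed.

```python
from typing import Dict


def transcoding_parameters(
    source: Dict[str, object], target: Dict[str, object]
) -> Dict[str, object]:
    """Return the target value of every parameter that differs from the source."""
    return {
        name: value
        for name, value in target.items()
        if source.get(name) != value
    }


# Hypothetical video and audio parameters of the two videos.
source_params = {"width": 1920, "height": 1080, "bitrate_kbps": 8000, "audio_codec": "aac"}
target_params = {"width": 1280, "height": 720, "bitrate_kbps": 3000, "audio_codec": "aac"}
params = transcoding_parameters(source_params, target_params)
```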
  • The path information corresponding to the video group composed of the source video and the target video may be identified by a support vector machine (SVM).
  • The source video and the target video may be used as the path start node and the path end node in the path recognition model, respectively.
  • A training sample set may be obtained in advance and used to train the path recognition model, so that the path recognition model can recognize the path information corresponding to an input video group.
  • The training sample set may include sample video groups whose corresponding transcoding path conforms to the path information and sample video groups whose corresponding transcoding path does not conform to the path information.
  • A sample video group may include sample videos corresponding to the path start node and the path end node, respectively.
  • The sample video groups in the training sample set may be sequentially input into the path recognition model.
  • An initial neural network can be constructed in the path recognition model, with initial prediction parameters preset in the neural network. After the input sample video group is processed with the initial prediction parameters, a prediction result for the sample video group can be obtained; the prediction result characterizes whether the transcoding path corresponding to the sample video group conforms to the path information.
  • The path recognition model may first extract a first feature vector corresponding to the parameter information of the source video and a second feature vector corresponding to the parameter information of the target video.
  • Elements of the first feature vector may be parameter values of various parameters of the source video, for example values of video parameters or audio parameters; video parameters may include video resolution, video bit rate, video frame rate, video format, and so on.
  • Elements of the second feature vector may be parameter values of various parameters of the target video.
  • The path recognition model can read the value of each parameter of the source video corresponding to the path start node in the sample video group, and the value of each parameter of the target video corresponding to the path end node, and form the first feature vector and the second feature vector from these values in the order in which they are read.
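The feature extraction described above can be sketched as follows. The parameter order, the key names, and the numeric encoding of the video format are all assumptions made for the example; the text only says that parameter values are read in order to form the two vectors.

```python
from typing import Dict, List, Tuple

# Assumed reading order and an assumed numeric code per video format.
PARAM_ORDER = ["width", "height", "bitrate_kbps", "fps", "format"]
FORMAT_CODES = {"flv": 0.0, "mp4": 1.0, "mkv": 2.0}


def feature_vector(video: Dict[str, object]) -> List[float]:
    """Read parameter values in a fixed order and return them as a vector."""
    vector: List[float] = []
    for name in PARAM_ORDER:
        value = video[name]
        if name == "format":
            vector.append(FORMAT_CODES[str(value)])
        else:
            vector.append(float(value))
    return vector


def vector_group(
    source: Dict[str, object], target: Dict[str, object]
) -> Tuple[List[float], List[float]]:
    """First vector describes the path start node, second the path end node."""
    return feature_vector(source), feature_vector(target)


source_video = {"width": 1920, "height": 1080, "bitrate_kbps": 8000, "fps": 25, "format": "flv"}
target_video = {"width": 1280, "height": 720, "bitrate_kbps": 3000, "fps": 25, "format": "mp4"}
first_vector, second_vector = vector_group(source_video, target_video)
```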
  • The neural network may be, for example, a convolutional neural network (CNN).
  • After the data of the input sample video group is processed by the neural network, a probability value vector of the sample video group can be obtained.
  • The probability value vector may include the probability value for the specified path information.
  • The probability value vector may include two probability values, which respectively represent the probability that the transcoding path conforms to the specified path information and the probability that it does not conform to the specified path information.
  • For example, a probability value vector of (0.4, 0.8) can be obtained through the path recognition model, where 0.4 is the probability that the transcoding path conforms to the specified path information and 0.8 is the probability that it does not.
  • Because the initial prediction parameters in the path recognition model may not be set accurately enough, the probability results predicted by the path recognition model may be inconsistent with the actual situation.
  • For example, suppose the input above is a sample video group whose transcoding path conforms to the specified path information, but in the obtained probability value vector the probability that the transcoding path conforms to the specified path information is only 0.4, while the probability that it does not conform is 0.8. This indicates that the prediction result is incorrect.
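The check in this example can be written out in a few lines. The decision rule used here (the larger of the two probabilities wins) is an assumption; the text only states that (0.4, 0.8) is an incorrect prediction for a conforming sample.

```python
from typing import Tuple


def conforms(probabilities: Tuple[float, float]) -> bool:
    """Predict 'conforms' when the first probability exceeds the second (assumed rule)."""
    p_conforms, p_does_not_conform = probabilities
    return p_conforms > p_does_not_conform


def prediction_is_correct(probabilities: Tuple[float, float], sample_conforms: bool) -> bool:
    """Compare the predicted class with the known label of the sample video group."""
    return conforms(probabilities) == sample_conforms


# The sample conforms to the specified path information, but the model says otherwise.
correct = prediction_is_correct((0.4, 0.8), sample_conforms=True)
```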
  • In this case, the initial prediction parameters in the path recognition model may be adjusted according to the difference between the prediction result and the correct result.
  • The sample video group may have a theoretical probability value result.
  • For example, for a sample video group whose transcoding path conforms to the specified path information, the theoretical probability value result may be (1, 0), where 1 represents the probability that the transcoding path conforms to the specified path information.
  • The predicted probability value result can be subtracted from the theoretical probability value result to obtain the difference between the two, and the difference can then be used to adjust the initial prediction parameters of the neural network, so that when the sample video group is processed again with the adjusted prediction parameters, the obtained prediction result is consistent with the correct result. In this way, after training on a large number of training samples, the path recognition model can distinguish whether the transcoding path corresponding to a sample video group conforms to the specified transcoding path, and thereby identify the path information that matches the actual transcoding path of the video group.
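The adjustment step can be illustrated with a deliberately small stand-in for the neural network: a single layer with two independent sigmoid outputs, updated by the difference between the theoretical and predicted values. The class name, learning rate, and feature values are assumptions, and this is not the CNN mentioned in the text.

```python
import math
from typing import List, Sequence, Tuple


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


class TinyPathModel:
    """Single-layer stand-in for the path recognition model (illustration only)."""

    def __init__(self, n_features: int) -> None:
        # Initial prediction parameters: all zeros, so both outputs start at 0.5.
        self.weights: List[List[float]] = [[0.0] * n_features for _ in range(2)]
        self.bias: List[float] = [0.0, 0.0]

    def predict(self, x: Sequence[float]) -> Tuple[float, ...]:
        """Return (probability of conforming, probability of not conforming)."""
        return tuple(
            sigmoid(sum(w * xi for w, xi in zip(row, x)) + b)
            for row, b in zip(self.weights, self.bias)
        )

    def adjust(self, x: Sequence[float], theoretical: Sequence[float], lr: float = 0.5) -> None:
        """Move the parameters along (theoretical - predicted) for each output."""
        predicted = self.predict(x)
        for k in range(2):
            diff = theoretical[k] - predicted[k]
            for i, xi in enumerate(x):
                self.weights[k][i] += lr * diff * xi
            self.bias[k] += lr * diff


model = TinyPathModel(n_features=2)
x = [1.0, 0.5]                      # hypothetical (already scaled) features
before = model.predict(x)
for _ in range(200):
    model.adjust(x, (1.0, 0.0))     # theoretical result for a conforming sample
after = model.predict(x)
```

After repeated adjustments the first output moves toward 1 and the second toward 0, which mirrors the statement that the re-processed sample should agree with the correct result.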
  • After the path information from the source video to the target video to be output is determined, the transcoding process of the source video can be carried out in one terminal device based on the determined path information to obtain the target video.
  • The source video may be transcoded according to the transcoding path included in the path information, using the transcoding method between nodes in the transcoding path, so that the target video may be obtained.
  • For example, the transcoding path includes four nodes, which are, in transcoding order, a root node, a child node, a third-level node, and a fourth-level node.
  • The root node is the source video, and the fourth-level node is the target video.
  • The video to be output often includes at least two kinds of target videos, so that multiple transcoding paths for these target videos appear.
  • When these transcoding paths overlap, the source video can first be transcoded according to the overlapping path to obtain an intermediate node, and the intermediate node can then be transcoded according to the non-overlapping paths in each transcoding path.
  • For example, first path information of transcoding from the source video to a first target video and second path information of transcoding from the source video to a second target video are determined.
  • If the first path information and the second path information have an overlapping path, the source video may first be transcoded according to the overlapping path to obtain an intermediate node, and the intermediate node is then transcoded according to the non-overlapping paths in the first path information and the second path information, respectively.
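Splitting two transcoding paths into their overlapping and non-overlapping parts amounts to finding a common prefix. In this sketch a path is a list of node names starting at the source video; the function name and the node names are hypothetical.

```python
from typing import List, Tuple


def split_overlap(
    first: List[str], second: List[str]
) -> Tuple[List[str], List[str], List[str]]:
    """Return (overlapping prefix, rest of first path, rest of second path)."""
    shared: List[str] = []
    for a, b in zip(first, second):
        if a != b:
            break
        shared.append(a)
    n = len(shared)
    return shared, first[n:], second[n:]


first_path = ["source", "high_fps", "720p", "720p_mp4"]
second_path = ["source", "high_fps", "480p", "480p_mp4"]
shared, first_rest, second_rest = split_overlap(first_path, second_path)
# The last node of `shared` is the intermediate node that is transcoded once
# and then reused as the input for both non-overlapping remainders.
```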
  • A transcoding structure in the form of a directed acyclic graph (DAG) containing these nodes can be constructed according to the upper-lower dependencies between the nodes in the path information for these target videos.
  • For example, when the source video is transcoded into target videos with Dolby sound effects, a variety of target videos with different Dolby sound effects need to be output.
  • The transcoding paths in the path information corresponding to these target videos include partially overlapping paths.
  • The corresponding DAG transcoding structure can be constructed according to the upper-lower dependencies between the nodes in the path information so as to merge these overlapping paths, and the complex transcoding process that outputs multiple target videos can then be completed directly in one terminal device according to this DAG transcoding structure.
  • As shown in the directed acyclic transcoding architecture of FIG. 2, the paths corresponding to the nodes Node11 with Dolby audio 11, Node12 with Dolby audio 12, and Node13 with Dolby audio 13 all include the path from the root node root to Node1 with Dolby audio 1. The overlapping partial paths can therefore be merged, so that the terminal device first transcodes the source video according to the overlapping path to obtain the intermediate node Node1, and then transcodes Node1 according to the respective non-overlapping paths to obtain the nodes Node11, Node12, and Node13.
  • The paths of Node21 with Dolby audio 21 and Node22 with Dolby audio 22 can also be merged where they overlap in the same manner, so that the transcoding paths of a DAG transcoding structure can be constructed.
  • The root node root is the first level, the nodes Node1 and Node2 form the second level, and the nodes Node11, Node12, Node13, Node21, and Node22 form the third level.
  • With the first level as input, the second level is output; with the second level as input, the third level can be output.
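The merged structure and its levels can be sketched by folding the individual paths into a parent-to-children map and then walking it level by level. The node names follow the example above; the function names are hypothetical, and a node's level is taken as its shortest distance from the root, which is an assumption that holds for this tree-shaped example.

```python
from typing import Dict, List


def build_dag(paths: List[List[str]]) -> Dict[str, List[str]]:
    """Merge paths into a parent -> children map; shared edges are stored once."""
    children: Dict[str, List[str]] = {}
    for path in paths:
        for parent, child in zip(path, path[1:]):
            kids = children.setdefault(parent, [])
            if child not in kids:
                kids.append(child)
            children.setdefault(child, [])
    return children


def levels(children: Dict[str, List[str]], root: str) -> List[List[str]]:
    """Group nodes by distance from the root, one list per level."""
    result: List[List[str]] = []
    current = [root]
    seen = {root}
    while current:
        result.append(current)
        following: List[str] = []
        for node in current:
            for child in children.get(node, []):
                if child not in seen:
                    seen.add(child)
                    following.append(child)
        current = following
    return result


paths = [
    ["root", "Node1", "Node11"],
    ["root", "Node1", "Node12"],
    ["root", "Node1", "Node13"],
    ["root", "Node2", "Node21"],
    ["root", "Node2", "Node22"],
]
dag = build_dag(paths)
dag_levels = levels(dag, "root")
```

Adding a further path such as `["root", "Node3"]` widens a level, and adding `["root", "Node2", "Node21", "Node211"]` deepens the structure, without changing the code.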
  • The DAG transcoding structure can be extended horizontally and vertically as transcoding service requirements increase.
  • For example, as shown in the dashed box in FIG. 2, to meet added transcoding service requirements for outputting videos with Dolby audio 211, Dolby audio 212, Dolby audio 213, Dolby audio 221, and Dolby audio 311, the node Node3 can be added horizontally, and the nodes Node211, Node212, Node213, and Node221 can be added vertically.
  • In this way, the transcoding paths of the DAG transcoding structure can be constructed, and complex transcoding service requirements can be met in one terminal device.
  • The intermediate nodes are stored directly on the terminal device, so there is no need to upload the videos corresponding to the intermediate nodes to an external storage device, nor to read videos from the external storage device multiple times for the subsequent transcoding process.
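The effect of keeping intermediate nodes locally can be shown by counting transcoding operations. In this sketch `local_store` is an in-memory dictionary standing in for local storage on the terminal device, and `fake_transcode` is a hypothetical placeholder that only records which node was produced.

```python
from typing import Callable, Dict, List


def transcode_all(
    source: str,
    paths: List[List[str]],
    transcode: Callable[[str, str], str],
) -> Dict[str, str]:
    """Produce every node on every path, reusing locally stored intermediates."""
    local_store: Dict[str, str] = {paths[0][0]: source}
    for path in paths:
        if path[0] not in local_store:
            raise ValueError("every path must start at the source node")
        for parent, child in zip(path, path[1:]):
            if child not in local_store:
                # The parent is read from local storage, not from external storage.
                local_store[child] = transcode(local_store[parent], child)
    return local_store


calls: List[str] = []


def fake_transcode(video: str, node: str) -> str:
    calls.append(node)
    return f"{video}->{node}"


paths = [
    ["root", "Node1", "Node11"],
    ["root", "Node1", "Node12"],
    ["root", "Node1", "Node13"],
    ["root", "Node2", "Node21"],
    ["root", "Node2", "Node22"],
]
store = transcode_all("root_video", paths, fake_transcode)
```

Running the five paths independently would need ten transcoding operations; with the shared intermediates kept locally, seven are enough.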
  • The functions implemented in the above method steps may be implemented by a computer program, and the computer program may be stored in a computer storage medium.
  • The computer storage medium may be coupled with a processor, so that the processor can read the computer program in the computer storage medium.
  • S13: Determine path information of transcoding from the source video to the target video, wherein the path information includes a transcoding path and a transcoding method between nodes in the transcoding path;
  • Transcode the source video according to the transcoding path, using the transcoding method between nodes in the transcoding path, to obtain the target video.
  • Input a video group composed of the source video and the target video into a path recognition model, and determine path information of transcoding from the source video to the target video, wherein the source video and the target video are used as the path start node and the path end node in the path recognition model, respectively.
  • Input a video group composed of the source video and the target video into a path recognition model, so as to extract, through the path recognition model, a first feature vector corresponding to the parameter information of the source video and a second feature vector corresponding to the parameter information of the target video, and determine, through the path recognition model, the predicted value corresponding to the vector group formed by the first feature vector and the second feature vector;
  • Use the path information characterized by the predicted value as the path information for transcoding from the source video to the target video.
  • The training sample set includes sample video groups whose corresponding transcoding path conforms to the path information and sample video groups whose corresponding transcoding path does not conform to the path information;
  • The sample video group includes sample videos corresponding to the path start node and the path end node, respectively;
  • This application also provides a video transcoding device, which includes:
  • a video obtaining unit 100, used to obtain the source video;
  • a path determining unit 200, configured to determine path information for transcoding from the source video to the target video, wherein the path information includes a transcoding path and a transcoding method between nodes in the transcoding path;
  • a transcoding unit 300, configured to transcode the source video based on the acquired path information.
  • The transcoding unit is further configured to transcode the source video according to the transcoding path, using the transcoding method between nodes in the transcoding path, to obtain the target video.
  • When the video to be output includes at least two target videos, the transcoding unit is further configured to, when there is an overlapping path between the first path information corresponding to the first target video and the second path information corresponding to the second target video, transcode the source video according to the overlapping path to obtain an intermediate node, and then transcode the intermediate node according to the non-overlapping paths in the first path information and the second path information, respectively.
  • The path determination unit is further configured to input a video group composed of the source video and the target video into a path recognition model, and determine path information for transcoding from the source video to the target video, wherein the source video and the target video are used as the path start node and the path end node in the path recognition model, respectively;
  • For the specific functions of each unit module of the video transcoding device provided in the embodiments of the present specification, reference may be made to the foregoing method embodiments in the present specification; the device can achieve the technical effects of the foregoing method embodiments, which will not be repeated here.
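The division into three units can be pictured as three small classes wired together. Everything below is a hypothetical sketch: the class and method names, the dictionary used as storage, and the lookup table that stands in for the path determination logic are assumptions, not the structure of the claimed device.

```python
from typing import Callable, Dict, List

Video = Dict[str, object]
Method = Callable[[Video], Video]


class VideoAcquisitionUnit:
    """Obtains the source video, here from an in-memory stand-in for storage."""

    def __init__(self, storage: Dict[str, Video]) -> None:
        self.storage = storage

    def acquire(self, storage_path: str) -> Video:
        return dict(self.storage[storage_path])


class PathDeterminationUnit:
    """Returns the ordered transcoding methods for a requested target."""

    def __init__(self, known_paths: Dict[str, List[Method]]) -> None:
        self.known_paths = known_paths

    def determine(self, target_name: str) -> List[Method]:
        return self.known_paths[target_name]


class TranscodingUnit:
    """Applies the methods of the determined path in order."""

    def transcode(self, source: Video, methods: List[Method]) -> Video:
        current = source
        for method in methods:
            current = method(current)
        return current


class VideoTranscodingDevice:
    def __init__(
        self,
        acquisition: VideoAcquisitionUnit,
        path_determination: PathDeterminationUnit,
        transcoding: TranscodingUnit,
    ) -> None:
        self.acquisition = acquisition
        self.path_determination = path_determination
        self.transcoding = transcoding

    def run(self, storage_path: str, target_name: str) -> Video:
        source = self.acquisition.acquire(storage_path)
        methods = self.path_determination.determine(target_name)
        return self.transcoding.transcode(source, methods)


device = VideoTranscodingDevice(
    VideoAcquisitionUnit({"/videos/a": {"height": 1080, "format": "flv"}}),
    PathDeterminationUnit(
        {"720p_mp4": [lambda v: {**v, "height": 720}, lambda v: {**v, "format": "mp4"}]}
    ),
    TranscodingUnit(),
)
output = device.run("/videos/a", "720p_mp4")
```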
  • The present application also provides a video transcoding device.
  • The device includes a memory and a processor.
  • The memory is used to store a computer program.
  • When the computer program is executed by the processor, the following steps are implemented:
  • S13: Determine path information of transcoding from the source video to the target video, wherein the path information includes a transcoding path and a transcoding method between nodes in the transcoding path;
  • The memory may include a physical device for storing information; usually the information is digitized and then stored on a medium using electrical, magnetic, or optical methods.
  • The memory described in this embodiment may further include: devices that use electrical energy to store information, such as RAM and ROM; devices that use magnetic energy to store information, such as hard disks, floppy disks, magnetic tapes, magnetic core memories, bubble memories, and U disks (USB flash drives); and devices that use optical methods to store information, such as CDs or DVDs.
  • There are also other types of memory, such as quantum memory and graphene memory.
  • The processor may be implemented in any suitable manner.
  • The processor may take the form of, for example, a microprocessor or processor together with a computer-readable medium storing computer-readable program code (such as software or firmware) executable by the (micro)processor, logic gates, switches, an application-specific integrated circuit (ASIC), a programmable logic controller, or an embedded microcontroller.
  • Transcode the source video according to the transcoding path, using the transcoding method between nodes in the transcoding path, to obtain the target video.
  • Input a video group composed of the source video and the target video into a path recognition model, and determine path information of transcoding from the source video to the target video, wherein the source video and the target video are used as the path start node and the path end node in the path recognition model, respectively.
  • Input a video group composed of the source video and the target video into a path recognition model, so as to extract, through the path recognition model, a first feature vector corresponding to the parameter information of the source video and a second feature vector corresponding to the parameter information of the target video, and determine, through the path recognition model, the predicted value corresponding to the vector group formed by the first feature vector and the second feature vector;
  • Use the path information characterized by the predicted value as the path information for transcoding from the source video to the target video.
  • The training sample set includes sample video groups whose corresponding transcoding path conforms to the path information and sample video groups whose corresponding transcoding path does not conform to the path information;
  • The sample video group includes sample videos corresponding to the path start node and the path end node, respectively;
  • With the technical solution provided in this application, after the source video is acquired, the path information of transcoding from the source video to the target video can be determined for the target video to be output.
  • The path information includes a transcoding path and a transcoding method between nodes in the transcoding path.
  • The source video and the intermediate nodes in the transcoding path can be transcoded in turn according to the transcoding path, using the transcoding method between the nodes in the transcoding path, to output the target video.
  • The entire transcoding process can be completed on one transcoding machine, which reduces uploads to and reads from an external storage platform, thereby reducing the time for video transcoding and improving its efficiency.
  • the improvement of a technology can be clearly distinguished from the improvement in hardware (for example, the improvement of circuit structures such as diodes, transistors, and switches) or the improvement in software (the improvement of the process flow).
  • the improvement of many methods and processes can be regarded as a direct improvement of the hardware circuit structure.
  • Designers almost always obtain the corresponding hardware circuit structure by programming the improved method flow into a hardware circuit. Therefore, it cannot be said that an improvement of a method flow cannot be realized by hardware physical modules.
  • a programmable logic device (PLD), for example a field-programmable gate array (FPGA)
  • hardware description languages (HDL): ABEL (Advanced Boolean Expression Language), AHDL (Altera Hardware Description Language), Confluence, CUPL (Cornell University Programming Language), HDCal, JHDL (Java Hardware Description Language), Lava, Lola, MyHDL, PALASM, and RHDL (Ruby Hardware Description Language)
  • in addition to implementing the video image transcoding device purely as computer-readable program code, it is entirely possible to logically program the method steps so that the video image transcoding device achieves the same functions in the form of logic gates, switches, application-specific integrated circuits, programmable logic controllers, and embedded microcontrollers. Therefore, such a video image transcoding device can be regarded as a hardware component, and the means it includes for implementing various functions can also be regarded as structures within the hardware component. The means for implementing various functions can even be regarded both as software modules implementing a method and as structures within a hardware component.
  • the present application can be implemented by means of software plus a necessary general-purpose hardware platform. Based on this understanding, the essence of the technical solution of the present application, or the part that contributes to the prior art, can be embodied in the form of a software product. The computer software product can be stored in a storage medium, such as ROM/RAM, a magnetic disk, or an optical disc, and includes several instructions that cause a computer device (which may be a personal computer, a server, a network device, etc.) to perform the methods described in the embodiments of the present application or in parts of the embodiments.
  • the present application may be described in the general context of computer-executable instructions executed by a computer, such as program modules.
  • program modules include routines, programs, objects, components, data structures, etc. that perform specific tasks or implement specific abstract data types.
  • the present application may also be practiced in distributed computing environments in which tasks are performed by remote processing devices connected through a communication network.
  • program modules may be located in local and remote computer storage media including storage devices.

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

Disclosed in embodiments of the present application are a video transcoding method and device. The method comprises: obtaining a source video; determining path information for transcoding the source video to a target video, the path information comprising a transcoding path and a transcoding mode between nodes in the transcoding path; and transcoding the source video on the basis of the determined path information. The technical solution provided by the present application can improve the video transcoding efficiency.

Description

Video transcoding method and device
This application claims priority to Chinese patent application No. 201811510054.7, filed on December 11, 2018 and entitled "A video transcoding method and device", the entire contents of which are incorporated herein by reference.
Technical field
This application relates to the field of Internet technology, and in particular to a video transcoding method and device.
Background
With the continuous development of Internet technology, more and more video playback platforms have emerged. To provide users with videos of different picture quality, a video playback platform usually needs to transcode the source video to generate multiple videos with different resolutions and bit rates.
At present, in some transcoding scenarios with multi-level dependencies, such as the production of high-frame-rate video, different transcoding tasks usually have to be executed separately on multiple transcoding machines. For example, before the source video is transcoded, it must first be converted to a high frame rate to generate an intermediate result, and that intermediate result is then transcoded to generate multiple videos with different resolutions and bit rates. In such scenarios, once the task that generates the intermediate result has completed, the intermediate result usually has to be uploaded to an external storage platform, and it then has to be read from the external storage platform again for each of the subsequent transcoding tasks that are based on it. Both the upload and the repeated reads are time-consuming, which makes video transcoding inefficient.
Therefore, there is an urgent need for a faster video transcoding method.
Summary of the invention
The purpose of the embodiments of the present application is to provide a video transcoding method and device that can improve the efficiency of video transcoding.
To achieve the above purpose, an embodiment of the present application provides a video transcoding method, the method comprising: obtaining a source video; determining path information for transcoding the source video into a target video, wherein the path information includes a transcoding path and the transcoding methods between nodes in the transcoding path; and transcoding the source video based on the obtained path information.
To achieve the above purpose, an embodiment of the present application further provides a video transcoding device, the device comprising: a video acquisition unit, configured to obtain a source video; a path determination unit, configured to determine path information for transcoding the source video into a target video, wherein the path information includes a transcoding path and the transcoding methods between nodes in the transcoding path; and a transcoding unit, configured to transcode the source video based on the obtained path information.
To achieve the above purpose, an embodiment of the present application further provides a video transcoding device comprising a memory and a processor, the memory being used to store a computer program which, when executed by the processor, implements the above video transcoding method.
As can be seen from the above, with the technical solution provided by this application, after the source video is obtained, the path information for transcoding the source video into the target video to be output can be determined. The path information includes a transcoding path and the transcoding methods between nodes in the transcoding path. The source video and the intermediate nodes in the transcoding path can then be transcoded in turn along the transcoding path, using the transcoding methods between its nodes, to output the target video. In this way, the entire transcoding process can be completed on one transcoding machine, which reduces the uploads to and reads from an external storage platform, thereby reducing the time for video transcoding and improving its efficiency.
Brief description of the drawings
To explain the embodiments of the present application or the technical solutions in the prior art more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. Obviously, the drawings described below are only some of the embodiments recorded in this application, and a person of ordinary skill in the art can obtain other drawings from them without creative effort.
FIG. 1 is a schematic diagram of the video transcoding method in an embodiment of this application;
FIG. 2 is a schematic diagram of the directed acyclic transcoding architecture in an embodiment of this application;
FIG. 3 is a schematic structural diagram of a video transcoding device in an embodiment of this application;
FIG. 4 is a schematic structural diagram of another video transcoding device in an embodiment of this application.
Detailed description
To enable those skilled in the art to better understand the technical solutions in the present application, the technical solutions in the embodiments of the present application are described clearly and completely below with reference to the drawings in those embodiments. Obviously, the described embodiments are only some of the embodiments of the present application, not all of them. All other embodiments obtained by a person of ordinary skill in the art on the basis of the embodiments in this application without creative effort shall fall within the scope of protection of this application.
The present application provides a video transcoding method, which can be applied to a terminal device with an image processing function. The terminal device may be, for example, a desktop computer, a notebook computer, a tablet computer, or a workstation. The method can also be applied to a service server of a video playback website; the service server may be a standalone server or a server cluster composed of multiple servers.
Referring to FIG. 1, the video transcoding method provided in this application includes the following steps.
S11: Obtain the source video.
In this embodiment, transcoding the source video can generate multiple videos with different resolutions and bit rates.
In this embodiment, the source video may be obtained by reading it from a provided storage path or by receiving it from another terminal device.
S13: Determine the path information for transcoding the source video into the target video, wherein the path information includes a transcoding path and the transcoding methods between nodes in the transcoding path.
In this embodiment, in practical application scenarios, transcoding the source video into the target video to be output may involve transcoding steps between multiple nodes. For example, in a transcoding scenario with multi-level dependencies, such as the production of high-frame-rate video, the source video first has to be converted into a high-frame-rate source video by frame rate conversion (FRC), which generates an intermediate result. That intermediate result is then transcoded to generate a video with a specified resolution, and finally that video is transcoded to generate the target video with the specified video format and the specified resolution. Thus, outputting the target video with the specified video format and resolution requires passing through two intermediate nodes, and the whole transcoding process can be divided into four levels: a first level whose root node is the source video, a second level whose child node is the high-frame-rate source video, a third level whose third-level node is the video with the specified resolution, and a fourth level whose leaf node is the target video with the specified resolution and video format. The output of the second-level node depends on the first-level node, the output of the third-level node depends on the second-level node, and the output of the fourth-level node depends on the third-level node. To implement such a complex, multi-level dependent transcoding process in the terminal device, the path information for transcoding the source video into the target video may be determined first. The path information may include a transcoding path and the transcoding methods between the nodes in the transcoding path. Subsequently, based on the determined path information, the source video can be transcoded within a single terminal device to obtain the target video.
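The four-level example can be written down as a small data structure. The following is only a minimal sketch: the class name `PathInfo`, the node names, and the parameter keys are hypothetical and chosen for illustration, not taken from the application.

```python
from dataclasses import dataclass


@dataclass
class PathInfo:
    """Path information: an ordered transcoding path plus per-step methods."""
    nodes: list    # node names, source video first, target video last
    methods: list  # methods[i] is the transcoding method from nodes[i] to nodes[i + 1]

    def steps(self):
        """Return (input node, output node, method) for each step of the path."""
        return list(zip(self.nodes, self.nodes[1:], self.methods))


# Hypothetical path for the high-frame-rate example: root node, two
# intermediate nodes, and the leaf node that is the target video.
path_info = PathInfo(
    nodes=["source", "source_high_fps", "video_1080p", "target_1080p_mp4"],
    methods=[{"frame_rate": 60}, {"resolution": "1920x1080"}, {"format": "mp4"}],
)

for src, dst, method in path_info.steps():
    print(f"{src} -> {dst} using {method}")
```

Keeping the nodes and the methods in two parallel lists mirrors the description, in which the path information consists of a transcoding path and the transcoding methods between its nodes.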
In one embodiment, the transcoding from the source video to the target video may be a multi-level dependent transcoding process that already exists in practice. In that case, the dependencies between the transcoding tasks can be obtained from the existing, separate transcoding tasks, and the inputs and outputs of those tasks can be used as the nodes of one transcoding path. This yields the transcoding path from the source video to the target video together with the nodes in that path. The transcoding methods between the nodes can likewise be obtained directly from the separate transcoding tasks. In this way, the path information for transcoding the source video into the target video can be determined. In this embodiment, a transcoding method corresponds to the video transcoding parameters required to transcode one video into another, and the values of these transcoding parameters can be determined from the values of the video parameters and audio parameters of the two videos. The transcoding parameters may include, for example, fidelity, resolution, and transmission bit rate. Once these transcoding parameters have been set, the video can be transcoded to obtain a transcoded video that conforms to them.
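One way to derive a path from existing, separate tasks is sketched below. It assumes that each task is recorded as a dictionary with hypothetical keys `input`, `output`, and `params`, and that each video is produced by at most one task; the application does not prescribe this representation.

```python
def path_from_tasks(tasks, source, target):
    """Chain separate transcoding tasks into one path from source to target.

    Walks backwards from the target, following each task's input/output
    dependency, then reverses the result.
    """
    by_output = {task["output"]: task for task in tasks}
    steps = []
    node = target
    while node != source:
        task = by_output.get(node)
        # Stop if nothing produces this node or if the tasks form a cycle.
        if task is None or len(steps) >= len(tasks):
            raise ValueError(f"no chain of tasks leads from {source!r} to {target!r}")
        steps.append(task)
        node = task["input"]
    steps.reverse()
    nodes = [source] + [task["output"] for task in steps]
    methods = [task["params"] for task in steps]
    return nodes, methods


# Hypothetical existing tasks, listed in no particular order.
tasks = [
    {"input": "video_1080p", "output": "target_1080p_mp4", "params": {"format": "mp4"}},
    {"input": "source", "output": "source_high_fps", "params": {"frame_rate": 60}},
    {"input": "source_high_fps", "output": "video_1080p", "params": {"resolution": "1920x1080"}},
]
nodes, methods = path_from_tasks(tasks, "source", "target_1080p_mp4")
```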
In one embodiment, the transcoding from the source video to the target video may not be found among the existing multi-level dependent transcoding processes. For this case, a deep learning method can be used in practice to build a path recognition model for identifying transcoding path information. For example, a support vector machine (SVM) may be used to identify the path information corresponding to the video group formed by the source video and the target video. The source video and the target video may serve as the path start node and the path end node in the path recognition model, respectively. Specifically, when the path recognition model is built, a training sample set can be obtained in advance and used to train the model so that it can identify the path information corresponding to an input video group. The training sample set may include sample video groups whose transcoding paths conform to the path information and sample video groups whose transcoding paths do not. A sample video group may include the sample videos corresponding to the path start node and the path end node, respectively. During training, the sample video groups in the training sample set can be input into the path recognition model one after another. An initial neural network can be built in the path recognition model, with initial prediction parameters preset in the network. After an input sample video group is processed with the initial prediction parameters, a prediction result is obtained that indicates whether the transcoding path corresponding to the sample video group conforms to the path information. Specifically, when processing a sample video group, the path recognition model may first extract a first feature vector corresponding to the parameter information of the source video and a second feature vector corresponding to the parameter information of the target video. The elements of the first feature vector may be the values of the parameters of the source video, for example the values of video parameters or audio parameters; video parameters may include video resolution, video bit rate, video frame rate, video format, and so on. Similarly, the elements of the second feature vector may be the values of the parameters of the target video. The path recognition model can thus read the value of each parameter of the source video corresponding to the path start node of the sample video group and the value of each parameter of the target video corresponding to the path end node, and assemble the values, in the order read, into the first feature vector and the second feature vector. In practice, because the number of parameters is usually large, the extracted feature vectors also have a large dimension, and processing them consumes considerable resources. In view of this, a convolutional neural network (CNN) may also be used in this embodiment to process the sample video group and obtain feature vectors of lower dimension for the subsequent recognition.
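The construction of the two feature vectors from parameter values can be illustrated as follows. This is a sketch under stated assumptions: the parameter names, their order, and the numeric codes for video formats are hypothetical, and the CNN-based dimensionality reduction mentioned above is not shown.

```python
# Fixed reading order of the parameters (hypothetical names).
PARAMETER_ORDER = ("width", "height", "bit_rate_kbps", "frame_rate", "format")

# Hypothetical numeric codes so that the video format can be a vector element.
FORMAT_CODES = {"mp4": 0.0, "flv": 1.0, "mkv": 2.0}


def feature_vector(params):
    """Assemble one video's parameter values, in reading order, into a vector."""
    values = []
    for name in PARAMETER_ORDER:
        value = params[name]
        if name == "format":
            value = FORMAT_CODES[value]
        values.append(float(value))
    return values


def vector_group(source_params, target_params):
    """Return (first feature vector, second feature vector) for a video group."""
    return feature_vector(source_params), feature_vector(target_params)


source_params = {"width": 1920, "height": 1080, "bit_rate_kbps": 8000,
                 "frame_rate": 30, "format": "mkv"}
target_params = {"width": 1280, "height": 720, "bit_rate_kbps": 2500,
                 "frame_rate": 60, "format": "mp4"}
first_vector, second_vector = vector_group(source_params, target_params)
```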
In this embodiment, after the neural network has processed the data of an input sample video group, a probability value vector for that sample video group is obtained. The probability value vector may contain probability values with respect to specified path information. It may contain two probability values, which represent the probability that the transcoding path conforms to the specified path information and the probability that it does not. For example, after a sample video group whose transcoding path conforms to the specified path information is input, the path recognition model may output a probability value vector such as (0.4, 0.8), where 0.4 is the probability that the transcoding path conforms to the specified path information and 0.8 is the probability that it does not. Because the initial prediction parameters in the path recognition model may not be set accurately enough, the predicted probabilities may not match the actual situation. In this example, the input is a sample video group whose transcoding path conforms to the specified path information, yet the predicted probability of conforming is only 0.4 while the predicted probability of not conforming is 0.8, which shows that the prediction is incorrect. In that case, the initial prediction parameters in the path recognition model can be adjusted according to the difference between the prediction result and the correct result. Specifically, each sample video group has a theoretical probability result. For example, the theoretical probability result for a transcoding path that conforms to the specified path information may be (1, 0), where 1 is the probability that the transcoding path conforms to the specified path information. The difference between the predicted probability result and the theoretical probability result can be computed and then used to adjust the initial prediction parameters of the neural network, so that when the sample video group is processed again with the adjusted prediction parameters, the prediction result agrees with the correct result. After training on a large number of training samples in this way, the path recognition model can distinguish whether the transcoding path corresponding to a sample video group conforms to a specified transcoding path, and can therefore identify the path information that matches the actual transcoding path of the video group.
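The adjustment of prediction parameters by the difference between the theoretical and the predicted probability can be sketched with a single logistic unit. This is a deliberately small stand-in for the neural network described above, not the model of the application: it assumes the two probabilities are p and 1 - p, and the feature values, learning rate, and epoch count are made up for illustration.

```python
import math


def predict(weights, bias, features):
    """Return (probability of conforming, probability of not conforming)."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    p = 1.0 / (1.0 + math.exp(-z))
    return (p, 1.0 - p)


def train(samples, epochs=200, learning_rate=0.5):
    """Adjust the parameters using (theoretical - predicted) for each sample."""
    weights = [0.0] * len(samples[0][0])  # initial prediction parameters
    bias = 0.0
    for _ in range(epochs):
        for features, conforms in samples:
            p_conform, _ = predict(weights, bias, features)
            theoretical = 1.0 if conforms else 0.0  # (1, 0) or (0, 1)
            diff = theoretical - p_conform
            weights = [w + learning_rate * diff * x
                       for w, x in zip(weights, features)]
            bias += learning_rate * diff
    return weights, bias


# Hypothetical, already normalized features of sample video groups, labelled
# by whether their transcoding path conforms to the specified path information.
samples = [
    ([1.0, 0.0], True),
    ([0.9, 0.1], True),
    ([0.0, 1.0], False),
    ([0.1, 0.9], False),
]
weights, bias = train(samples)
p_yes, p_no = predict(weights, bias, [1.0, 0.0])
```

After training, a conforming group should receive a probability above 0.5 and a non-conforming group a probability below 0.5, which corresponds to the prediction result agreeing with the correct result.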
S15: Transcode the source video based on the obtained path information.
In this embodiment, after the path information for transcoding the source video into the target video to be output has been determined, the source video can be transcoded within a single terminal device, based on the determined path information, to obtain the target video. Specifically, the source video can be transcoded along the transcoding path included in the path information, using the transcoding methods between the nodes in that path, to obtain the target video. For example, suppose the transcoding path includes four nodes which, in transcoding order, are a root node, a child node, a third-level node, and a fourth-level node, where the root node is the source video and the fourth-level node is the target video. The root node is first transcoded using the transcoding method between the root node and the child node, which yields the video corresponding to the child node. That video is then transcoded using the transcoding method between the child node and the third-level node, which yields the video corresponding to the third-level node. Finally, that video is transcoded using the transcoding method between the third-level node and the fourth-level node, which yields the target video. In this way, the entire multi-level dependent transcoding process is completed within one terminal device, which reduces the uploading of the videos corresponding to the intermediate nodes and the reading of those videos from an external storage platform, thereby reducing the time for video transcoding and improving its efficiency.
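The step-by-step transcoding along the path can be sketched as a simple loop. The function `transcode_step` below is a hypothetical stand-in for a real transcoder: it only merges the method's parameters into a dictionary that describes the video, so that the control flow can be followed without any media library.

```python
def transcode_step(video, method):
    """Hypothetical stand-in for a real transcoder.

    Returns a new video description with the method's parameters applied.
    """
    return {**video, **method}


def transcode_along_path(source_video, methods):
    """Apply the transcoding methods in path order on one machine.

    Each intermediate result is kept in local memory and fed directly into
    the next step instead of being uploaded and read back.
    """
    current = source_video
    intermediates = []
    for method in methods:
        current = transcode_step(current, method)
        intermediates.append(current)
    return current, intermediates


source_video = {"frame_rate": 30, "resolution": "3840x2160", "format": "mkv"}
methods = [{"frame_rate": 60}, {"resolution": "1920x1080"}, {"format": "mp4"}]
target_video, intermediates = transcode_along_path(source_video, methods)
```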
In this embodiment, in some complex transcoding scenarios, the videos to be output often include at least two target videos, so there are multiple transcoding paths, one for each target video. When these transcoding paths overlap, the transcoding along the overlapping part should not be repeated. It can therefore first be determined whether the transcoding paths for the target videos contain an overlapping path. If so, the source video can first be transcoded along the overlapping path to obtain an intermediate node, and the intermediate node can then be transcoded along the non-overlapping part of each transcoding path. Specifically, for two target videos, a first target video and a second target video, first path information for transcoding the source video into the first target video and second path information for transcoding the source video into the second target video are determined. If the first path information and the second path information share an overlapping path, the source video can first be transcoded along the overlapping path to obtain the intermediate node, and the intermediate node can then be transcoded along the non-overlapping paths in the first path information and in the second path information, respectively. To implement this process, a transcoding structure in the form of a directed acyclic graph (DAG) containing the nodes can be built according to the dependencies between the nodes in the path information for these target videos.
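Finding the overlapping path of two transcoding paths amounts to taking their common prefix. The sketch below assumes that a path is given as a list of node names starting at the source video and that equal node names imply equal transcoding methods; the function name is hypothetical.

```python
def split_overlap(first_path, second_path):
    """Split two paths into their overlapping prefix and the two remainders."""
    overlap = []
    for a, b in zip(first_path, second_path):
        if a != b:
            break
        overlap.append(a)
    k = len(overlap)
    return overlap, first_path[k:], second_path[k:]


first_path = ["root", "Node1", "Node11"]
second_path = ["root", "Node1", "Node12"]
overlap, first_rest, second_rest = split_overlap(first_path, second_path)
# The last node of the overlap is the intermediate node that is transcoded
# once and then reused for both non-overlapping remainders.
intermediate_node = overlap[-1]
```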
For example, in an application scenario where a source video is transcoded into target videos with Dolby sound effects, several target videos with different Dolby sound effects need to be output, and the transcoding paths in the path information corresponding to these target videos partially overlap. A corresponding DAG transcoding structure can then be constructed from the parent-child dependencies between the nodes in the path information so as to merge the overlapping paths, and the complex transcoding process that outputs multiple target videos can subsequently be completed on a single terminal device by following this DAG transcoding structure directly. As shown in FIG. 2, among the nodes of the final output target videos, the paths corresponding to node Node11 with Dolby sound effect 11, node Node12 with Dolby sound effect 12, and node Node13 with Dolby sound effect 13 all include the path from the root node root to node Node1 with Dolby sound effect 1. This overlapping part of the paths can therefore be merged: the terminal device first transcodes the source video along the overlapping path to obtain the intermediate node Node1, and then transcodes the intermediate node Node1 along each non-overlapping path to obtain nodes Node11, Node12, and Node13. Similarly, for the other final output target videos, node Node21 with Dolby sound effect 21 and node Node22 with Dolby sound effect 22, the overlapping paths can be merged in the same way, so that the transcoding paths of the DAG transcoding structure are constructed.
In FIG. 2, the root node root is the first level, nodes Node1 and Node2 form the second level, and nodes Node11, Node12, Node13, Node21, and Node22 form the third level, with dependencies between the levels: the first level serves as input to produce the second level, and the second level then serves as input to produce the third level. In this embodiment, after the DAG transcoding structure has been constructed, it can also be extended horizontally and vertically as transcoding service requirements grow. As shown by the dashed boxes in FIG. 2, to add the transcoding service requirement of outputting videos with Dolby sound effects 211, 212, 213, 221, and 311, node Node3 can be added horizontally, and nodes Node211, Node212, Node213, and Node221, as well as nodes Node31 and Node311, can be added vertically, according to the determined path information of these videos to be output. With the transcoding paths of the DAG transcoding structure constructed in this way, complex transcoding service requirements can be fulfilled on a single terminal device.
Moreover, during transcoding the intermediate nodes are stored locally on the terminal device, so the video corresponding to an intermediate node does not need to be uploaded to an external storage device, nor read back from the external storage device multiple times for subsequent transcoding.
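The merging of overlapping paths described for FIG. 2 can be sketched as follows. This is a minimal illustration, not the applicant's implementation: the function names `build_transcode_structure` and `levels` are hypothetical, each path is assumed to be a list of node names starting at `root`, and the level grouping assumes every node has a single parent, as in the figure.

```python
def build_transcode_structure(paths):
    """Merge transcoding paths that share a prefix.

    Each path is a list of node names from the root to one target video,
    e.g. ["root", "Node1", "Node11"]. Returns a dict mapping every node to
    the list of nodes that are transcoded directly from it.
    """
    children = {}
    for path in paths:
        for parent, child in zip(path, path[1:]):
            kids = children.setdefault(parent, [])
            if child not in kids:          # an overlapping edge is stored only once
                kids.append(child)
            children.setdefault(child, [])
    return children


def levels(children, root="root"):
    """Group nodes by level; each level is the input for the next one."""
    result, frontier = [], [root]
    while frontier:
        result.append(frontier)
        frontier = [kid for node in frontier for kid in children.get(node, [])]
    return result


paths = [
    ["root", "Node1", "Node11"],
    ["root", "Node1", "Node12"],
    ["root", "Node1", "Node13"],
    ["root", "Node2", "Node21"],
    ["root", "Node2", "Node22"],
]
structure = build_transcode_structure(paths)
```

Extending the structure horizontally or vertically then amounts to calling `build_transcode_structure` with additional paths such as `["root", "Node3", "Node31", "Node311"]`; edges that already exist are not duplicated.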
In this embodiment, the functions implemented in the above method steps may be implemented by a computer program, and the computer program may be stored in a computer storage medium. Specifically, the computer storage medium may be coupled to a processor, so that the processor can read the computer program from the computer storage medium. When executed by the processor, the computer program can implement the following functions:
S11: obtain a source video;
S13: determine path information for transcoding the source video into a target video, where the path information includes a transcoding path and the transcoding methods between nodes in the transcoding path;
S15: transcode the source video based on the determined path information.
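Steps S11 to S15 can be pictured with a small sketch. Everything here is an assumption made for illustration: a "video" is reduced to a dictionary of parameters, the `PathInfo` class and the `to_h265` and `to_720p` steps are hypothetical, and the codec and resolution values do not come from the application.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Video = Dict[str, object]          # stand-in for a real video: only its parameters
Method = Callable[[Video], Video]  # transcoding method between two adjacent nodes


@dataclass
class PathInfo:
    nodes: List[str]       # transcoding path, source first, target last
    methods: List[Method]  # methods[i] turns nodes[i] into nodes[i + 1]


def transcode(source: Video, info: PathInfo) -> Video:
    """S15: walk the path, applying each method to the previous result."""
    if len(info.methods) != len(info.nodes) - 1:
        raise ValueError("need exactly one method per edge of the path")
    current = source
    for method in info.methods:
        current = method(current)
    return current


def to_h265(video: Video) -> Video:
    # Hypothetical step: change the codec, leave everything else untouched.
    return {**video, "codec": "h265"}


def to_720p(video: Video) -> Video:
    # Hypothetical step: change the frame height only.
    return {**video, "height": 720}


source = {"codec": "h264", "height": 1080}                           # S11
info = PathInfo(["source", "intermediate", "target"], [to_h265, to_720p])  # S13
target = transcode(source, info)                                     # S15
```

Keeping the node list and the per-edge methods together in one object mirrors the statement that the path information includes both the transcoding path and the transcoding methods between its nodes.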
In one embodiment, when executed by the processor, the computer program further implements the following step:
Transcode the source video along the transcoding path, using the transcoding methods between nodes in the transcoding path, to obtain the target video.
In one embodiment, when the videos to be output include at least two target videos, the computer program, when executed by the processor, further implements the following step:
When the first path information corresponding to a first target video and the second path information corresponding to a second target video share an overlapping path, transcode the source video along the overlapping path to obtain an intermediate node, and then transcode the intermediate node along the non-overlapping paths in the first path information and the second path information, respectively.
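For two target videos, the overlapping-path step can be sketched as below. The helper names `split_overlap` and `transcode_pair` are hypothetical, the overlap is assumed to be a shared prefix of the two paths, and `run_edge` stands in for whatever transcoding method applies between two adjacent nodes.

```python
def split_overlap(first, second):
    """Return (overlap, first_rest, second_rest) for two node paths."""
    n = 0
    for a, b in zip(first, second):
        if a != b:
            break
        n += 1
    return first[:n], first[n:], second[n:]


def transcode_pair(source, first, second, run_edge):
    """Transcode the shared prefix once, then each non-overlapping tail."""
    overlap, rest1, rest2 = split_overlap(first, second)
    if not overlap:
        raise ValueError("both paths must start at the same source node")
    intermediate = source
    for parent, child in zip(overlap, overlap[1:]):
        intermediate = run_edge(intermediate, parent, child)
    outputs = []
    for rest in (rest1, rest2):
        video, prev = intermediate, overlap[-1]
        for node in rest:
            video = run_edge(video, prev, node)
            prev = node
        outputs.append(video)
    return outputs


calls = []


def run_edge(video, parent, child):
    # Stand-in for a real transcoding step: record the edge, extend the history.
    calls.append((parent, child))
    return video + [child]


outputs = transcode_pair(
    ["root"],
    ["root", "Node1", "Node11"],
    ["root", "Node1", "Node12"],
    run_edge,
)
```

In this run the edge from `root` to `Node1` is executed once rather than twice, which is the saving the overlapping-path merge is meant to provide.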
In one embodiment, when executed by the processor, the computer program further implements the following step:
Input a video group consisting of the source video and the target video into a path recognition model to determine the path information for transcoding the source video into the target video, where the source video and the target video serve as the path start node and the path end node, respectively, in the path recognition model.
In one embodiment, when executed by the processor, the computer program further implements the following steps:
Input a video group consisting of the source video and the target video into the path recognition model, so that the feature recognition model extracts a first feature vector corresponding to the parameter information of the source video and a second feature vector corresponding to the parameter information of the target video, and the path recognition model determines a predicted value corresponding to the vector group formed by the first feature vector and the second feature vector;
Use the path information represented by the predicted value as the path information for transcoding the source video into the target video.
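The application does not specify the model's internals, so the following is only a rough sketch of how a predicted value could select path information. The parameter names (`height`, `bitrate_kbps`), the linear scoring, the hand-set weights, and the candidate paths are all assumptions for illustration.

```python
def feature_vector(params):
    """Turn a video's parameter information into a fixed-order numeric vector."""
    return [float(params["height"]), float(params["bitrate_kbps"])]


def predict_path(source_params, target_params, weights, candidate_paths):
    """Score each candidate path for the vector group and return the best one."""
    # Vector group: first feature vector, second feature vector, plus a bias term.
    group = feature_vector(source_params) + feature_vector(target_params) + [1.0]
    scores = [sum(w * x for w, x in zip(row, group)) for row in weights]
    predicted = scores.index(max(scores))   # plays the role of the predicted value
    return candidate_paths[predicted]


candidate_paths = [
    ["source", "downscaled", "target"],   # hypothetical path with a resize step
    ["source", "target"],                 # hypothetical direct path
]
weights = [
    [1.0, 0.0, -1.0, 0.0, 0.0],   # grows with the height difference
    [0.0, 0.0, 0.0, 0.0, 1.0],    # constant score for the direct path
]

resize = predict_path(
    {"height": 1080, "bitrate_kbps": 4000},
    {"height": 720, "bitrate_kbps": 2000},
    weights,
    candidate_paths,
)
direct = predict_path(
    {"height": 720, "bitrate_kbps": 2000},
    {"height": 720, "bitrate_kbps": 2000},
    weights,
    candidate_paths,
)
```

In practice the weights would come from training rather than being written by hand; they are fixed here only so the example is deterministic.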
In one embodiment, when executed by the processor, the computer program further implements the following steps:
Obtain a training sample set, which includes sample video groups whose corresponding transcoding paths conform to the path information and sample video groups whose corresponding transcoding paths do not conform to the path information; each sample video group includes the sample videos corresponding to the path start node and the path end node, respectively;
Input the sample video groups in the training sample set into the path recognition model, which includes initial prediction parameters;
Process the input sample video group with the initial prediction parameters to obtain a prediction result for the sample video group, where the prediction result indicates whether the transcoding path corresponding to the sample video group conforms to the path information;
If the prediction result is incorrect, adjust the initial prediction parameters in the path recognition model according to the difference between the prediction result and the correct result, so that when the sample video group is processed again with the adjusted prediction parameters, the prediction result matches the correct result.
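The training procedure above can be sketched with a tiny logistic classifier that is updated only when a prediction is wrong. The model form, the learning rate, and the two-element feature vectors are assumptions; the application states only that the parameters are adjusted according to the difference between the prediction result and the correct result.

```python
import math


def predict(params, features):
    """Probability that the sample's transcoding path conforms to the path info."""
    z = sum(w * x for w, x in zip(params, features))
    return 1.0 / (1.0 + math.exp(-z))


def classify(prob):
    """Prediction result: 1 = conforms, 0 = does not conform."""
    return 1 if prob >= 0.5 else 0


def train(samples, params, rate=0.5, max_epochs=100):
    """samples: list of (features, label); params: initial prediction parameters."""
    params = list(params)
    for _ in range(max_epochs):
        mistakes = 0
        for features, label in samples:
            prob = predict(params, features)
            if classify(prob) != label:
                diff = prob - label   # difference between prediction and correct result
                params = [w - rate * diff * x for w, x in zip(params, features)]
                mistakes += 1
        if mistakes == 0:             # every sample is now predicted correctly
            break
    return params


# Hypothetical features for two sample video groups: [bias, some path-related value].
samples = [([1.0, 2.0], 1), ([1.0, -1.0], 0)]
trained = train(samples, [0.0, 0.0])
```

After training, reprocessing each sample with the adjusted parameters yields a prediction result that matches its label, which is the stopping condition the text describes.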
Referring to FIG. 3, this application also provides a video transcoding device, which includes:
a video obtaining unit 100, configured to obtain a source video;
a path determining unit 200, configured to determine path information for transcoding the source video into a target video, where the path information includes a transcoding path and the transcoding methods between nodes in the transcoding path;
a transcoding unit 300, configured to transcode the source video based on the obtained path information.
In one embodiment, the transcoding unit is further configured to transcode the source video along the transcoding path, using the transcoding methods between nodes in the transcoding path, to obtain the target video.
In one embodiment, when the videos to be output include at least two target videos,
the transcoding unit is further configured to, when the first path information corresponding to a first target video and the second path information corresponding to a second target video share an overlapping path, transcode the source video along the overlapping path to obtain an intermediate node, and then transcode the intermediate node along the non-overlapping paths in the first path information and the second path information, respectively.
In one embodiment, the path determining unit is further configured to input a video group consisting of the source video and the target video into a path recognition model to determine the path information for transcoding the source video into the target video, where the source video and the target video serve as the path start node and the path end node, respectively, in the path recognition model.
For the video transcoding device provided in the embodiments of this specification, the specific functions of each unit module can be understood by reference to the foregoing method embodiments, and the device achieves the technical effects of those method embodiments; details are not repeated here.
Referring to FIG. 4, this application also provides a video transcoding device. The device includes a memory and a processor; the memory is used to store a computer program, and when the computer program is executed by the processor, the following steps are implemented:
S11: obtain a source video;
S13: determine path information for transcoding the source video into a target video, where the path information includes a transcoding path and the transcoding methods between nodes in the transcoding path;
S15: transcode the source video based on the determined path information.
In this embodiment, the memory may include a physical device for storing information; typically the information is digitized and then stored on a medium using electrical, magnetic, or optical methods. The memory described in this embodiment may include: devices that store information electrically, such as RAM and ROM; devices that store information magnetically, such as hard disks, floppy disks, magnetic tapes, magnetic core memories, bubble memories, and USB flash drives; and devices that store information optically, such as CDs or DVDs. There are of course other kinds of memory as well, such as quantum memory and graphene memory.
In this embodiment, the processor may be implemented in any suitable manner. For example, the processor may take the form of a microprocessor or processor together with a computer-readable medium storing computer-readable program code (such as software or firmware) executable by the (micro)processor, logic gates, switches, an application-specific integrated circuit (ASIC), a programmable logic controller, an embedded microcontroller, and so on.
In one embodiment, when executed by the processor, the computer program further implements the following step:
Transcode the source video along the transcoding path, using the transcoding methods between nodes in the transcoding path, to obtain the target video.
In one embodiment, when the videos to be output include at least two target videos, the computer program, when executed by the processor, further implements the following step:
When the first path information corresponding to a first target video and the second path information corresponding to a second target video share an overlapping path, transcode the source video along the overlapping path to obtain an intermediate node, and then transcode the intermediate node along the non-overlapping paths in the first path information and the second path information, respectively.
In one embodiment, when executed by the processor, the computer program further implements the following step:
Input a video group consisting of the source video and the target video into a path recognition model to determine the path information for transcoding the source video into the target video, where the source video and the target video serve as the path start node and the path end node, respectively, in the path recognition model.
In one embodiment, when executed by the processor, the computer program further implements the following steps:
Input a video group consisting of the source video and the target video into the path recognition model, so that the feature recognition model extracts a first feature vector corresponding to the parameter information of the source video and a second feature vector corresponding to the parameter information of the target video, and the path recognition model determines a predicted value corresponding to the vector group formed by the first feature vector and the second feature vector;
Use the path information represented by the predicted value as the path information for transcoding the source video into the target video.
In one embodiment, when executed by the processor, the computer program further implements the following steps:
Obtain a training sample set, which includes sample video groups whose corresponding transcoding paths conform to the path information and sample video groups whose corresponding transcoding paths do not conform to the path information; each sample video group includes the sample videos corresponding to the path start node and the path end node, respectively;
Input the sample video groups in the training sample set into the path recognition model, which includes initial prediction parameters;
Process the input sample video group with the initial prediction parameters to obtain a prediction result for the sample video group, where the prediction result indicates whether the transcoding path corresponding to the sample video group conforms to the path information;
If the prediction result is incorrect, adjust the initial prediction parameters in the path recognition model according to the difference between the prediction result and the correct result, so that when the sample video group is processed again with the adjusted prediction parameters, the prediction result matches the correct result.
As can be seen from the above, with the technical solution provided in this application, after the source video is obtained, the path information for transcoding the source video into the target video can be determined for each target video to be output. The path information includes a transcoding path and the transcoding methods between nodes in the transcoding path. The source video and the intermediate nodes in the transcoding path can then be transcoded in turn along the transcoding path, using the transcoding methods between its nodes, to output the target video. The entire transcoding process can thus be completed on a single transcoding machine, which eliminates uploads to and reads from an external storage platform, reduces video transcoding time, and improves video transcoding efficiency.
In the 1990s, an improvement to a technology could be clearly classified as either a hardware improvement (for example, an improvement to a circuit structure such as a diode, transistor, or switch) or a software improvement (an improvement to a method flow). With the development of technology, however, many of today's method-flow improvements can be regarded as direct improvements to hardware circuit structures. Designers almost always obtain the corresponding hardware circuit structure by programming the improved method flow into a hardware circuit. It therefore cannot be said that an improvement to a method flow cannot be implemented with a hardware entity module. For example, a programmable logic device (PLD), such as a field programmable gate array (FPGA), is an integrated circuit whose logic function is determined by the user programming the device. Designers program the device themselves to "integrate" a digital system onto a single PLD, without asking a chip manufacturer to design and fabricate a dedicated integrated circuit chip. Moreover, instead of fabricating integrated circuit chips by hand, this programming is nowadays mostly done with "logic compiler" software, which is similar to the software compilers used in program development; the source code to be compiled must likewise be written in a specific programming language, called a hardware description language (HDL). There is not just one HDL but many, such as ABEL (Advanced Boolean Expression Language), AHDL (Altera Hardware Description Language), Confluence, CUPL (Cornell University Programming Language), HDCal, JHDL (Java Hardware Description Language), Lava, Lola, MyHDL, PALASM, and RHDL (Ruby Hardware Description Language); the most commonly used at present are VHDL (Very-High-Speed Integrated Circuit Hardware Description Language) and Verilog. Those skilled in the art should also understand that a hardware circuit implementing the logical method flow can easily be obtained simply by describing the method flow in one of the above hardware description languages and programming it into an integrated circuit.
Those skilled in the art also know that, in addition to implementing the video image transcoding device purely as computer-readable program code, the method steps can be logically programmed so that the video image transcoding device achieves the same functions in the form of logic gates, switches, application-specific integrated circuits, programmable logic controllers, embedded microcontrollers, and the like. Such a video image transcoding device can therefore be regarded as a hardware component, and the means included in it for implementing various functions can be regarded as structures within the hardware component. The means for implementing various functions can even be regarded both as software modules that implement the method and as structures within the hardware component.
From the description of the above embodiments, those skilled in the art can clearly understand that this application can be implemented by software together with the necessary general-purpose hardware platform. Based on this understanding, the technical solution of this application, in essence or in the part that contributes over the prior art, can be embodied in the form of a software product. The computer software product can be stored in a storage medium, such as ROM/RAM, a magnetic disk, or an optical disc, and includes several instructions that cause a computer device (which may be a personal computer, a server, a network device, or the like) to perform the methods described in the embodiments of this application or in certain parts of the embodiments.
The embodiments in this specification are described in a progressive manner. For parts that are the same or similar across embodiments, the embodiments may be referred to one another, and each embodiment focuses on its differences from the others. In particular, the embodiments of the video image transcoding device can be understood by reference to the description of the foregoing method embodiments.
This application may be described in the general context of computer-executable instructions executed by a computer, such as program modules. Generally, program modules include routines, programs, objects, components, data structures, and so on that perform specific tasks or implement specific abstract data types. This application may also be practiced in distributed computing environments, in which tasks are performed by remote processing devices connected through a communication network. In a distributed computing environment, program modules may be located in both local and remote computer storage media, including storage devices.
Although this application has been described by way of embodiments, those of ordinary skill in the art know that this application admits many variations and changes without departing from its spirit, and it is intended that the appended claims cover these variations and changes without departing from the spirit of this application.

Claims (11)

  1. A video transcoding method, characterized in that the method comprises:
    obtaining a source video;
    determining path information for transcoding the source video into a target video, wherein the path information includes a transcoding path and the transcoding methods between nodes in the transcoding path;
    transcoding the source video based on the determined path information.
  2. The method according to claim 1, wherein transcoding the source video based on the determined path information comprises:
    transcoding the source video along the transcoding path, using the transcoding methods between nodes in the transcoding path, to obtain the target video.
  3. The method according to claim 1, wherein, when the videos to be output include at least two target videos, transcoding the source video based on the determined path information comprises:
    when the first path information corresponding to a first target video and the second path information corresponding to a second target video share an overlapping path, transcoding the source video along the overlapping path to obtain an intermediate node, and then transcoding the intermediate node along the non-overlapping paths in the first path information and the second path information, respectively.
  4. The method according to claim 1, characterized in that the path information is determined as follows:
    inputting a video group consisting of the source video and the target video into a path recognition model to determine the path information for transcoding the source video into the target video, wherein the source video and the target video serve as the path start node and the path end node, respectively, in the path recognition model.
  5. The method according to claim 4, characterized in that the path information is determined as follows:
    inputting a video group consisting of the source video and the target video into the path recognition model, so that the feature recognition model extracts a first feature vector corresponding to the parameter information of the source video and a second feature vector corresponding to the parameter information of the target video, and the path recognition model determines a predicted value corresponding to the vector group formed by the first feature vector and the second feature vector;
    using the path information represented by the predicted value as the path information for transcoding the source video into the target video.
  6. The method according to claim 4, characterized in that the path recognition model is determined as follows:
    obtaining a training sample set, which includes sample video groups whose corresponding transcoding paths conform to the path information and sample video groups whose corresponding transcoding paths do not conform to the path information, each sample video group including the sample videos corresponding to the path start node and the path end node, respectively;
    inputting the sample video groups in the training sample set into the path recognition model, which includes initial prediction parameters;
    processing the input sample video group with the initial prediction parameters to obtain a prediction result for the sample video group, wherein the prediction result indicates whether the transcoding path corresponding to the sample video group conforms to the path information;
    if the prediction result is incorrect, adjusting the initial prediction parameters in the path recognition model according to the difference between the prediction result and the correct result, so that when the sample video group is processed again with the adjusted prediction parameters, the prediction result matches the correct result.
  7. A video transcoding device, characterized in that the device comprises:
    a video obtaining unit, configured to obtain a source video;
    a path determining unit, configured to determine path information for transcoding the source video into a target video, wherein the path information includes a transcoding path and the transcoding methods between nodes in the transcoding path;
    a transcoding unit, configured to transcode the source video based on the obtained path information.
  8. The device according to claim 7, characterized in that the transcoding unit is further configured to transcode the source video along the transcoding path, using the transcoding methods between nodes in the transcoding path, to obtain the target video.
  9. The device according to claim 7, characterized in that, when the videos to be output include at least two target videos,
    the transcoding unit is further configured to, when the first path information corresponding to a first target video and the second path information corresponding to a second target video share an overlapping path, transcode the source video along the overlapping path to obtain an intermediate node, and then transcode the intermediate node along the non-overlapping paths in the first path information and the second path information, respectively.
  10. The device according to claim 7, characterized in that the path determining unit is further configured to input a video group consisting of the source video and the target video into a path recognition model to determine the path information for transcoding the source video into the target video, wherein the source video and the target video serve as the path start node and the path end node, respectively, in the path recognition model.
  11. A video transcoding device, characterized in that the device comprises a memory and a processor, the memory being used to store a computer program which, when executed by the processor, implements the method according to any one of claims 1 to 6.
PCT/CN2019/124232 2018-12-11 2019-12-10 Video transcoding method and device WO2020119670A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201811510054.7 2018-12-11
CN201811510054.7A CN111314706B (en) 2018-12-11 2018-12-11 Video transcoding method and device

Publications (1)

Publication Number Publication Date
WO2020119670A1 true WO2020119670A1 (en) 2020-06-18

Family

ID=71075341

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/124232 WO2020119670A1 (en) 2018-12-11 2019-12-10 Video transcoding method and device

Country Status (2)

Country Link
CN (1) CN111314706B (en)
WO (1) WO2020119670A1 (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112653892B (en) * 2020-12-18 2024-04-23 杭州当虹科技股份有限公司 Method for realizing transcoding test evaluation by utilizing video features
CN115396683B (en) * 2022-08-22 2024-04-09 广州博冠信息科技有限公司 Video optimization processing method and device, electronic equipment and computer readable medium

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102916989A (en) * 2011-08-02 2013-02-06 腾讯科技(深圳)有限公司 Video downloading method, server and clients
CN103036942A (en) * 2011-10-08 2013-04-10 美国博通公司 Advanced content hosting
US20160034306A1 (en) * 2014-07-31 2016-02-04 Istreamplanet Co. Method and system for a graph based video streaming platform
CN106161599A (en) * 2016-06-24 2016-11-23 电子科技大学 A kind of method reducing cloud storage overall overhead when there is data dependence relation

Family Cites Families (2)

Publication number Priority date Publication date Assignee Title
CN104935955B (en) * 2015-05-29 2019-01-25 腾讯科技(北京)有限公司 Method, device, and system for transmitting a live video stream
CN107124635B (en) * 2017-06-06 2021-01-22 北京奇艺世纪科技有限公司 Video online method, video management system and live broadcast system

Non-Patent Citations (1)

Title
付眸 等 (FU, MOU ET AL.): "基于Spark Streaming的快速视频转码方法 (Fast Video Transcoding Method Based on Spark Streaming)", 计算机应用 (JOURNAL OF COMPUTER APPLICATIONS), 25 July 2018 (2018-07-25), DOI: 20200212181503Y *

Also Published As

Publication number Publication date
CN111314706A (en) 2020-06-19
CN111314706B (en) 2023-08-25

Similar Documents

Publication Publication Date Title
US11159790B2 (en) Methods, apparatuses, and systems for transcoding a video
WO2020119515A1 (en) Video transcoding method and device
US11514948B1 (en) Model-based dubbing to translate spoken audio in a video
WO2020119670A1 (en) Video transcoding method and device
WO2022150401A1 (en) Summarization of video artificial intelligence method, system, and apparatus
US10679070B1 (en) Systems and methods for a video understanding platform
US20230075893A1 (en) Speech recognition model structure including context-dependent operations independent of future data
CN110717421A (en) Video content understanding method and device based on generative adversarial network
US7830284B2 (en) Entropy encoding apparatus, entropy encoding method, and computer program
WO2024060852A1 (en) Model ownership verification method and apparatus, storage medium and electronic device
CN113409803A (en) Voice signal processing method, device, storage medium and equipment
JP2018155939A (en) Generation device, generation method and generation program
WO2022227689A1 (en) Video processing method and apparatus
KR101370290B1 (en) Method and apparatus for generating multimedia data with decoding level, and method and apparatus for reconstructing multimedia data with decoding level
KR20230124266A (en) Speech synthesis method and apparatus using adversarial learning technique
CN114157895A (en) Video processing method and device, electronic equipment and storage medium
CN113256765A (en) AI anchor video generation method and device, electronic equipment and storage medium
US20240071388A1 (en) Computer-readable recording medium storing dictionary selection program, dictionary selection method, and dictionary selection device
CN113076828B (en) Video editing method and device and model training method and device
KR102663654B1 (en) Adaptive visual speech recognition
CN114979772B (en) Decoder configuration method, decoder configuration device, medium and electronic equipment
WO2023087234A1 (en) Artificial intelligence (ai) -assisted context-aware pipeline creation
CN116185200A (en) Haptic feedback content generation method and device, electronic equipment and storage medium
CN115424184A (en) Video object segmentation method and device and electronic equipment
CN115358288A (en) Multi-modal classification model training method and device based on label constraint

Legal Events

Date Code Title Description
121 Ep: the EPO has been informed by WIPO that EP was designated in this application (Ref document number: 19895116; Country of ref document: EP; Kind code of ref document: A1)
NENP Non-entry into the national phase (Ref country code: DE)
122 Ep: PCT application non-entry in European phase (Ref document number: 19895116; Country of ref document: EP; Kind code of ref document: A1)