CN111314706A - Video transcoding method and device - Google Patents

Publication number
CN111314706A
Authority
CN
China
Prior art keywords
path
video
transcoding
path information
source video
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201811510054.7A
Other languages
Chinese (zh)
Other versions
CN111314706B (en)
Inventor
李庆文
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Alibaba Group Holding Ltd
Original Assignee
Alibaba Group Holding Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Alibaba Group Holding Ltd
Priority to CN201811510054.7A (CN111314706B)
Priority to PCT/CN2019/124232 (WO2020119670A1)
Publication of CN111314706A
Application granted
Publication of CN111314706B
Legal status: Active
Anticipated expiration

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/40 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using video transcoding, i.e. partial or full decoding of a coded input stream followed by re-encoding of the decoded output stream
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/50 Allocation of resources, e.g. of the central processing unit [CPU]
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20 Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/23 Processing of content or additional data; Elementary server operations; Server middleware
    • H04N21/234 Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs
    • H04N21/2343 Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs involving reformatting operations of video signals for distribution or compliance with end-user requests or end-user device requirements
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20 Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/23 Processing of content or additional data; Elementary server operations; Server middleware
    • H04N21/234 Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs
    • H04N21/2343 Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs involving reformatting operations of video signals for distribution or compliance with end-user requests or end-user device requirements
    • H04N21/234309 Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs involving reformatting operations of video signals for distribution or compliance with end-user requests or end-user device requirements by transcoding between formats or standards, e.g. from MPEG-2 to MPEG-4 or from Quicktime to Realvideo
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43 Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/44 Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs
    • H04N21/4402 Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs involving reformatting operations of video signals for household redistribution, storage or real-time display
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43 Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/44 Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs
    • H04N21/4402 Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs involving reformatting operations of video signals for household redistribution, storage or real-time display
    • H04N21/440218 Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs involving reformatting operations of video signals for household redistribution, storage or real-time display by transcoding between formats or standards, e.g. from MPEG-2 to MPEG-4

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

The embodiment of the application discloses a video transcoding method and a video transcoding device, wherein the method comprises the following steps: acquiring a source video; determining path information transcoded from the source video into a target video; the path information comprises a transcoding path and a transcoding mode between nodes in the transcoding path; transcoding the source video based on the determined path information. The technical scheme provided by the application can improve the efficiency of video transcoding.

Description

Video transcoding method and device
Technical Field
The present application relates to the field of internet technologies, and in particular, to a video transcoding method and apparatus.
Background
With the continuous development of internet technology, more and more video playing platforms emerge. In order to provide videos with different image qualities to users, a video playing platform generally needs to transcode a source video, so as to generate multiple videos with different resolutions and different code rates.
Currently, in some multi-level dependent transcoding scenarios, such as producing high frame rate video, different transcoding tasks are typically performed by different transcoding machines. For example, before the source video is transcoded, high frame rate conversion must first be performed on it to generate an intermediate result, and the intermediate result is then transcoded to generate multiple videos with different resolutions and different code rates. In such a scenario, once the task that generates the intermediate result has completed, the intermediate result generally has to be uploaded to an external storage platform. Each subsequent transcoding task based on the intermediate result must then read it back from the external storage platform. Both the upload and the repeated reads take time, resulting in low video transcoding efficiency.
Therefore, it is desirable to provide a faster video transcoding method.
Disclosure of Invention
The embodiment of the application aims to provide a video transcoding method and a video transcoding device, which can improve the efficiency of video transcoding.
In order to achieve the above object, an embodiment of the present application provides a video transcoding method, where the method includes: acquiring a source video; determining path information transcoded from the source video into a target video; the path information comprises a transcoding path and a transcoding mode between nodes in the transcoding path; and transcoding the source video based on the determined path information.
To achieve the above object, the present application further provides a video transcoding device, where the device includes: a video acquisition unit for acquiring a source video; a path determining unit, configured to determine path information transcoded from the source video to a target video; the path information comprises a transcoding path and a transcoding mode between nodes in the transcoding path; and a transcoding unit for transcoding the source video based on the determined path information.
In order to achieve the above object, the present application further provides a video transcoding device, which includes a memory and a processor, where the memory is used for storing a computer program, and the computer program, when executed by the processor, implements the above video transcoding method.
As can be seen from the above, according to the technical scheme provided by the application, after a source video is acquired, path information transcoded from the source video to a target video can be determined for the target video to be output. The path information comprises a transcoding path and a transcoding mode between nodes in the transcoding path. In this way, the source video and other intermediate nodes in the transcoding path can be transcoded in sequence by the transcoding mode among the nodes in the transcoding path according to the transcoding path, and the target video is output. Therefore, the whole transcoding process can be completed in one transcoding machine, the uploading process and the process of reading from an external storage platform are reduced, the video transcoding time is shortened, and the video transcoding efficiency is improved.
Drawings
In order to more clearly illustrate the embodiments of the present application or the technical solutions in the prior art, the drawings needed to be used in the description of the embodiments or the prior art will be briefly introduced below, it is obvious that the drawings in the following description are only some embodiments described in the present application, and for those skilled in the art, other drawings can be obtained according to the drawings without creative efforts.
Fig. 1 is a schematic diagram of a video transcoding method according to an embodiment of the present application;
Fig. 2 is a schematic diagram of a directed acyclic transcoding architecture according to an embodiment of the present application;
Fig. 3 is a schematic structural diagram of a video transcoding apparatus according to an embodiment of the present application;
Fig. 4 is a schematic structural diagram of another video transcoding apparatus according to an embodiment of the present application.
Detailed Description
In order to make those skilled in the art better understand the technical solutions in the present application, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only a part of the embodiments of the present application, and not all of the embodiments. All other embodiments obtained by a person of ordinary skill in the art without any inventive work based on the embodiments in the present application shall fall within the scope of protection of the present application.
The application provides a video transcoding method which can be applied to terminal equipment with an image processing function. The terminal device may be, for example, a desktop computer, a notebook computer, a tablet computer, a workstation, etc. In addition, the method can also be applied to a service server of a video playing website, and the service server can be an independent server or a server cluster consisting of a plurality of servers.
Referring to fig. 1, a video transcoding method provided in the present application includes the following steps.
S11: a source video is acquired.
In this embodiment, multiple videos with different resolutions and different code rates can be generated by transcoding the source video.
In this embodiment, the manner of acquiring the source video may include reading the source video from a storage path provided by the terminal device or receiving the source video sent by another terminal device.
S13: determining path information transcoded from the source video into a target video; the path information comprises a transcoding path and a transcoding mode between nodes in the transcoding path.
In an actual application scenario, transcoding the source video into a target video to be output may require transcoding between multiple nodes. For example, in some multi-level dependent transcoding scenarios, such as producing high frame rate video, the source video must first be converted into a high frame rate source video by Frame Rate Conversion (FRC), which generates an intermediate result. The intermediate result is then transcoded to generate a video with a specified resolution, and that video is finally transcoded to generate a target video with a specified video format and the specified resolution. Outputting this target video therefore requires two intermediate nodes, and the whole transcoding process can be divided into four levels: a first level in which the source video is the root node, a second level in which the high frame rate source video is a child node, a third level in which the video with the specified resolution is a third-level node, and a fourth level in which the target video with the specified resolution and the specified video format is a leaf node. The output of the second-level node depends on the first-level node, the output of the third-level node depends on the second-level node, and the output of the fourth-level node depends on the third-level node. To implement such a complex multi-level dependent transcoding process in the terminal device, the path information for transcoding from the source video to the target video may be determined first. The path information may include a transcoding path and a transcoding mode between nodes in the transcoding path.
In this way, the transcoding process of the source video can be subsequently implemented in a terminal device based on the determined path information to obtain the target video.
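As an illustration only, the path information described above can be sketched as a small data structure. The class names `Step` and `PathInfo`, the node names, and the keys inside each `mode` dictionary are hypothetical and do not appear in the application; the sketch merely shows an ordered transcoding path with one transcoding mode per pair of adjacent nodes.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Step:
    """One hop of the transcoding path: how to turn `src` into `dst`."""
    src: str                 # name of the input node
    dst: str                 # name of the output node
    mode: Dict[str, object]  # transcoding mode for this hop (hypothetical keys)


@dataclass
class PathInfo:
    """Path information: the ordered nodes plus the mode between neighbours."""
    nodes: List[str]
    steps: List[Step]

    def is_consistent(self) -> bool:
        # Every step must connect two consecutive nodes of the path.
        if len(self.steps) != len(self.nodes) - 1:
            return False
        pairs = zip(self.nodes, self.nodes[1:])
        return all(s.src == a and s.dst == b
                   for s, (a, b) in zip(self.steps, pairs))


# The four-level example from the text: source -> FRC -> resolution -> format.
path_info = PathInfo(
    nodes=["source", "high_frame_rate", "scaled", "target"],
    steps=[
        Step("source", "high_frame_rate", {"operation": "frc", "frame_rate": 60}),
        Step("high_frame_rate", "scaled", {"operation": "scale", "resolution": "1920x1080"}),
        Step("scaled", "target", {"operation": "convert", "format": "mp4"}),
    ],
)
```

Keeping the mode on the edge rather than on the node mirrors the wording "transcoding mode between nodes": the same intermediate video could in principle be reached by different modes from different parents.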
In one embodiment, the transcoding process for transcoding the source video into the target video may be a multi-level dependent transcoding process already existing in an actual application, so that the dependency relationship between transcoding tasks may be obtained from existing separate transcoding tasks, and the input and output of the transcoding tasks may be taken as nodes in one transcoding path. In this way, a transcoding path for transcoding the source video into the target video and nodes in the transcoding path can be obtained. The transcoding mode between each node can also be directly obtained through each separated transcoding task. In this manner, path information for transcoding from the source video to the target video may be determined. In this embodiment, the transcoding manner corresponds to a video transcoding parameter required for transcoding one video to another video, and a parameter value of the video transcoding parameter may be determined according to a parameter value of a video parameter and a parameter value of an audio parameter of the two videos. The transcoding parameters may include, for example, fidelity, resolution, transmission code rate, etc. After the transcoding parameters are set, the video can be transcoded, so that the transcoded video conforming to the transcoding parameters is obtained.
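A minimal sketch of how a transcoding path might be assembled from existing separate transcoding tasks, assuming each task is described by a hypothetical `(input, output, mode)` tuple and that every video is produced by at most one task. The function name `chain_tasks` and the sample parameter names are illustrative assumptions, not part of the application.

```python
def chain_tasks(tasks, source, target):
    """Link separate tasks into one path by following outputs back to inputs.

    `tasks` is a list of (input_name, output_name, mode) tuples. Returns the
    ordered node names and the transcoding mode between each adjacent pair.
    """
    # Map each output to the task that produces it.
    producer = {output: (input_, mode) for input_, output, mode in tasks}
    nodes, modes = [target], []
    while nodes[0] != source:
        # Stop if nothing produces the current node or if the tasks loop.
        if nodes[0] not in producer or len(nodes) > len(tasks):
            raise ValueError("no chain of tasks leads from source to target")
        input_, mode = producer[nodes[0]]
        nodes.insert(0, input_)
        modes.insert(0, mode)
    return nodes, modes


# Three independent tasks, listed in no particular order.
tasks = [
    ("hfr", "1080p", {"resolution": "1920x1080"}),
    ("source", "hfr", {"frame_rate": 60}),
    ("1080p", "target", {"format": "mp4"}),
]
nodes, modes = chain_tasks(tasks, "source", "target")
```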
In one embodiment, the transcoding process from the source video to the target video may not be found among the existing multi-level dependent transcoding processes. In that case, a machine learning method may be adopted in practice to construct a path identification model for identifying the transcoding path information. For example, the path information corresponding to the video group composed of the source video and the target video may be identified by a Support Vector Machine (SVM). The source video and the target video can be used as the path start node and the path end node, respectively, in the path identification model. Specifically, when the path identification model is constructed, a training sample set may be obtained in advance and used to train the model, so that the model can identify the path information corresponding to an input video group. The training sample set may include sample video groups whose corresponding transcoding paths conform to the path information and sample video groups whose corresponding transcoding paths do not. Each sample video group may include sample videos corresponding to the path start node and the path end node, respectively. During training, the sample video groups in the training sample set can be input into the path identification model in sequence. An initial neural network can be constructed in the path identification model, with initial prediction parameters preset in the neural network. After an input sample video group is processed with the initial prediction parameters, a prediction result for the sample video group is obtained, which indicates whether the transcoding path corresponding to the sample video group conforms to the path information.
Specifically, when the path identification model processes a sample video group, a first feature vector corresponding to parameter information of the source video and a second feature vector corresponding to parameter information of the target video may first be extracted. The elements of the first feature vector may be parameter values of the parameters of the source video, for example, video parameters or audio parameters; the video parameters may include video resolution, video bitrate, video frame rate, video format, and the like. Similarly, the elements of the second feature vector may be parameter values of the respective parameters of the target video. In this way, the path identification model may read the parameter value of each parameter of the source video corresponding to the path start node in the sample video group and the parameter value of each parameter of the target video corresponding to the path end node, and assemble them, in the order read, into the first feature vector and the second feature vector. In practical applications, the number of parameters is usually large, so the extracted feature vectors have a high dimension and processing them consumes a large amount of resources. In view of this, in the present embodiment, a Convolutional Neural Network (CNN) may further be used to process the sample video group to obtain feature vectors of lower dimension, which facilitates the subsequent identification processing.
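The feature extraction step can be sketched as follows. The parameter names in `PARAM_ORDER`, the numeric codes in `FORMAT_CODES`, and both function names are assumptions made for illustration; the application does not specify how parameter values are encoded, and the CNN-based dimension reduction is not shown.

```python
# Fixed reading order, so both vectors use the same element positions.
PARAM_ORDER = ("width", "height", "bitrate_kbps", "frame_rate", "format")

# Hypothetical numeric encoding for the non-numeric "format" parameter.
FORMAT_CODES = {"mp4": 0.0, "flv": 1.0, "ts": 2.0}


def feature_vector(params):
    """Turn one video's parameter values into a flat list of floats."""
    vector = []
    for name in PARAM_ORDER:
        value = params[name]
        if name == "format":
            vector.append(FORMAT_CODES[value])
        else:
            vector.append(float(value))
    return vector


def video_group_features(source_params, target_params):
    """Return (first feature vector, second feature vector) for a video group."""
    return feature_vector(source_params), feature_vector(target_params)


source_params = {"width": 1920, "height": 1080, "bitrate_kbps": 8000,
                 "frame_rate": 30, "format": "mp4"}
target_params = {"width": 1280, "height": 720, "bitrate_kbps": 2500,
                 "frame_rate": 60, "format": "flv"}
first, second = video_group_features(source_params, target_params)
```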
In this embodiment, after the neural network processes the data of an input sample video group, a probability value vector for the sample video group can be obtained. The probability value vector may include two probability values, which respectively represent the probability that the transcoding path conforms to the specified path information and the probability that it does not. For example, after a sample video group whose transcoding path conforms to the specified path information is input, the path identification model may output a probability value vector of (0.4, 0.8), where 0.4 represents the probability that the transcoding path conforms to the specified path information and 0.8 represents the probability that it does not. Because the initial prediction parameters in the path identification model may not be set accurately enough, the predicted probabilities may not match the actual situation, as in this example: the input conforms to the specified path information, yet the model assigns a higher probability to non-conformance, so the prediction result is incorrect. In this case, the initial prediction parameters in the path identification model may be adjusted according to the difference between the prediction result and the correct result. In particular, each sample video group may have a theoretical probability value result.
For example, the theoretical probability value result for a sample video group whose transcoding path conforms to the specified path information may be (1, 0), where 1 represents the probability that the transcoding path conforms to the specified path information. The difference between the predicted probability value result and the theoretical probability value result may then be computed and used to adjust the initial prediction parameters of the neural network, so that when the sample video group is processed again with the adjusted prediction parameters, the prediction result matches the correct result. After training on a large number of samples, the path identification model can distinguish whether the transcoding path corresponding to a video group conforms to a specified transcoding path, and can accordingly identify the path information that matches the actual transcoding path of that video group.
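With the figures given above, subtracting the theoretical result (1, 0) from the predicted result (0.4, 0.8) gives a difference of (-0.6, 0.8). The following sketch shows one way such a difference could drive a parameter adjustment. It assumes a single-layer model with two independent sigmoid outputs, which is consistent with the two probabilities in the example not summing to 1, and a plain gradient step; the application does not disclose the network architecture or the update rule, so every name and value here is hypothetical.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict(weights, bias, x):
    """Two independent outputs: P(conforms) and P(does not conform)."""
    return [sigmoid(sum(w * xi for w, xi in zip(row, x)) + b)
            for row, b in zip(weights, bias)]


def train_step(weights, bias, x, theoretical, learning_rate=0.5):
    """Adjust the parameters in place using predicted minus theoretical."""
    predicted = predict(weights, bias, x)
    diff = [p - t for p, t in zip(predicted, theoretical)]
    for k, d in enumerate(diff):
        for j, xj in enumerate(x):
            weights[k][j] -= learning_rate * d * xj
        bias[k] -= learning_rate * d
    return diff


# Hypothetical 3-element feature vector for a conforming sample video group.
x = [1.0, 0.5, -0.5]
theoretical = [1.0, 0.0]                      # the (1, 0) result from the text
weights = [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]  # initial prediction parameters
bias = [0.0, 0.0]

first_diff = train_step(weights, bias, x, theoretical)
for _ in range(200):
    train_step(weights, bias, x, theoretical)
final = predict(weights, bias, x)
```

After repeated adjustment on this single sample, `final` moves toward the theoretical result; a real training run would of course iterate over the whole training sample set rather than one group.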
S15: transcoding the source video based on the determined path information.
In this embodiment, after the path information for transcoding from the source video to the target video to be output is determined, the transcoding process for the source video may be carried out in a single terminal device based on the determined path information to obtain the target video. Specifically, the source video may be transcoded along the transcoding path included in the path information, using the transcoding mode between each pair of adjacent nodes, to obtain the target video. For example, suppose the transcoding path includes four nodes which, in transcoding order, are a root node, a child node, a third-level node, and a fourth-level node, where the root node is the source video and the fourth-level node is the target video. The root node is first transcoded using the transcoding mode between the root node and the child node to obtain the video corresponding to the child node. That video is then transcoded using the transcoding mode between the child node and the third-level node to obtain the video corresponding to the third-level node. Finally, the video corresponding to the third-level node is transcoded using the transcoding mode between the third-level node and the fourth-level node to obtain the target video. The whole multi-level dependent transcoding process is thus completed in one terminal device. The videos corresponding to intermediate nodes no longer need to be uploaded to, or read back from, an external storage platform, which shortens the video transcoding time and improves the video transcoding efficiency.
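The sequential execution described above might look like the following sketch. Here `apply_mode` is a stand-in for a real transcoder and simply rewrites a dictionary of video properties; it, `transcode_along_path`, and the property names are hypothetical. The point of the sketch is that every intermediate result stays in a local store on the same machine.

```python
def apply_mode(video, mode):
    """Stand-in for a real transcoder: copy the video and apply the mode."""
    output = dict(video)
    output.update(mode)
    return output


def transcode_along_path(source_video, nodes, modes):
    """Transcode hop by hop, keeping every intermediate result locally."""
    local_store = {nodes[0]: source_video}
    for previous, current, mode in zip(nodes, nodes[1:], modes):
        # The input of each hop is read from local storage, not uploaded.
        local_store[current] = apply_mode(local_store[previous], mode)
    return local_store[nodes[-1]], local_store


nodes = ["root", "child", "third_level", "fourth_level"]
modes = [{"frame_rate": 60}, {"resolution": "1920x1080"}, {"format": "mp4"}]
source_video = {"frame_rate": 24, "resolution": "3840x2160", "format": "mov"}

target_video, local_store = transcode_along_path(source_video, nodes, modes)
```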
In some complex transcoding scenarios, the video to be output includes at least two target videos, so there may be multiple transcoding paths, one for each target video. When these transcoding paths overlap, the transcoding in the overlapping path should not be repeated. It may therefore be determined whether an overlapping path exists among the transcoding paths for the target videos. If so, the source video may be transcoded according to the overlapping path to obtain an intermediate node, and the intermediate node may then be transcoded according to the non-overlapping path in each transcoding path. Specifically, for a first target video and a second target video, first path information for transcoding the source video into the first target video and second path information for transcoding the source video into the second target video are determined. If an overlapping path exists between the first path information and the second path information, the source video may be transcoded according to the overlapping path to obtain an intermediate node, and the intermediate node may then be transcoded according to the non-overlapping paths in the first path information and the second path information, respectively. To implement this process, a Directed Acyclic Graph (DAG) transcoding structure containing the nodes may be constructed according to the upstream and downstream dependency relationships between the nodes in the path information for the target videos.
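For the two-target case, the overlap check can be sketched as a shared-prefix comparison. This assumes that each path is a list of node names starting at the source video and that equal names imply the same video produced by the same transcoding mode; the function names are hypothetical.

```python
def overlapping_path(first_path, second_path):
    """Return the leading run of nodes that both paths share."""
    overlap = []
    for a, b in zip(first_path, second_path):
        if a != b:
            break
        overlap.append(a)
    return overlap


def split_paths(first_path, second_path):
    """Split two paths into the shared part and the two remaining parts."""
    overlap = overlapping_path(first_path, second_path)
    shared = len(overlap)
    return overlap, first_path[shared:], second_path[shared:]


first_path = ["source", "high_frame_rate", "1080p", "1080p_mp4"]
second_path = ["source", "high_frame_rate", "720p", "720p_flv"]
overlap, first_rest, second_rest = split_paths(first_path, second_path)
```

In this example the last node of `overlap` is the intermediate node: it is produced once and then used as the input for both remaining paths.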
For example, in an application scenario in which a source video is transcoded into target videos with Dolby sound effects, a plurality of target videos with different Dolby sound effects need to be output, and the transcoding paths in the path information corresponding to these target videos partially overlap. A corresponding DAG transcoding structure may then be constructed according to the upstream and downstream dependency relationships between the nodes in the path information so as to merge the overlapping paths, and the complex transcoding process that outputs the plurality of target videos may be completed in one terminal device directly according to the DAG transcoding structure. As shown in fig. 2, among the nodes of the target videos to be finally output, the paths corresponding to Node11 with Dolby sound effect 11, Node12 with Dolby sound effect 12, and Node13 with Dolby sound effect 13 all include the path from the root node to Node1 with Dolby sound effect 1. These overlapping paths may be merged, so that the terminal device transcodes the source video according to the overlapping path to obtain the intermediate node Node1, and then transcodes Node1 according to the non-overlapping paths to obtain Node11, Node12, and Node13, respectively. Similarly, for the other target video nodes to be finally output, Node21 with Dolby sound effect 21 and Node22 with Dolby sound effect 22, overlapping paths may be merged in the same manner, so that the transcoding paths of the DAG transcoding structure are constructed. In fig. 2, the root node forms the first level, Node1 and Node2 form the second level, and Node11, Node12, Node13, Node21, and Node22 form the third level, with dependency relationships between the levels: the first level is used as input to output the second level, and the second level is then used as input to output the third level.
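The merged structure of fig. 2 can be sketched as an adjacency map built from the five per-target paths, then walked level by level. The node names follow the figure as described in the text; `build_dag` and `transcode_order` are hypothetical helpers, and the actual transcoding of each edge is omitted.

```python
from collections import deque


def build_dag(paths):
    """Merge per-target paths into one parent -> children adjacency map."""
    children = {}
    for path in paths:
        for parent, child in zip(path, path[1:]):
            children.setdefault(parent, [])
            children.setdefault(child, [])
            if child not in children[parent]:  # a shared edge is stored once
                children[parent].append(child)
    return children


def transcode_order(children, root):
    """Level-by-level order in which the non-root nodes would be produced."""
    order, queue, seen = [], deque([root]), {root}
    while queue:
        node = queue.popleft()
        for child in children[node]:
            if child not in seen:
                seen.add(child)
                order.append(child)
                queue.append(child)
    return order


paths = [
    ["root", "Node1", "Node11"],
    ["root", "Node1", "Node12"],
    ["root", "Node1", "Node13"],
    ["root", "Node2", "Node21"],
    ["root", "Node2", "Node22"],
]
dag = build_dag(paths)
order = transcode_order(dag, "root")

separate_transcodes = sum(len(path) - 1 for path in paths)   # each path run alone
merged_transcodes = sum(len(kids) for kids in dag.values())  # one per DAG edge
```

Under these assumptions, running the five paths separately would take ten transcoding operations, while the merged structure needs seven, because the hops from the root to Node1 and Node2 are each performed once.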
In this embodiment, after the DAG transcoding structure is constructed, it may be expanded horizontally and vertically as transcoding service requirements are added. As shown in the dashed box of fig. 2, to meet an added transcoding service requirement of outputting videos with Dolby sound effect 211, Dolby sound effect 212, Dolby sound effect 213, Dolby sound effect 221, and Dolby sound effect 311, Node3 may be added in the horizontal direction, Node211, Node212, Node213, and Node221 may be added in the vertical direction, and Node31 and Node311 may also be added, according to the determined path information of the videos to be output. The transcoding paths of the constructed DAG transcoding structure can therefore meet complex transcoding service requirements within one terminal device. During transcoding, the intermediate nodes are stored locally in the terminal device, so the videos corresponding to the intermediate nodes need not be uploaded to an external storage device, nor read from it multiple times for subsequent transcoding.
In this embodiment, the functions implemented in the above method steps may be implemented by a computer program, and the computer program may be stored in a computer storage medium. In particular, the computer storage medium may be coupled to a processor, which may thereby read the computer program from the computer storage medium. The computer program, when executed by a processor, may perform the following functions:
S11: acquiring a source video;
S12: determining path information transcoded from the source video into a target video; the path information comprises a transcoding path and a transcoding mode between nodes in the transcoding path;
S13: transcoding the source video based on the determined path information.
In one embodiment, the computer program, when executed by the processor, further implements the steps of:
transcoding the source video along the transcoding path, using the transcoding mode between the nodes in the transcoding path, to obtain the target video.
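One possible way to represent the path information and apply it step by step is sketched below. The `PathInfo` class, the `transcode_along` function, and the two placeholder modes are hypothetical names introduced for this example; videos are modeled as strings so the sketch stays self-contained.

```python
from dataclasses import dataclass
from typing import Callable, List

@dataclass
class PathInfo:
    nodes: List[str]                     # transcoding path, e.g. root -> Node1 -> Node11
    modes: List[Callable[[str], str]]    # modes[i] transcodes nodes[i] into nodes[i + 1]

def transcode_along(source: str, info: PathInfo) -> str:
    """Apply the transcoding mode of each edge in path order."""
    if len(info.modes) != len(info.nodes) - 1:
        raise ValueError("need exactly one transcoding mode per edge")
    video = source
    for mode in info.modes:
        video = mode(video)              # the output of one node is the input of the next
    return video

# Placeholder transcoding modes; real ones would invoke an actual transcoder.
def to_node1(video: str) -> str:
    return video + "|Node1"

def to_node11(video: str) -> str:
    return video + "|Node11"

info = PathInfo(nodes=["root", "Node1", "Node11"], modes=[to_node1, to_node11])
target = transcode_along("source", info)
```

Keeping the modes alongside the nodes means the path information alone is enough to drive the whole transcoding run on one machine.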
In one embodiment, when the computer program is executed by the processor and at least two target videos are included in the video to be output, the following steps are further implemented:
when an overlapped path exists between first path information corresponding to a first target video and second path information corresponding to a second target video, transcoding the source video according to the overlapped path to obtain an intermediate node, and then transcoding the intermediate node respectively according to non-overlapped paths in the first path information and the second path information.
In one embodiment, the computer program, when executed by the processor, further implements the steps of:
inputting a video group formed by the source video and the target video into a path identification model, and determining path information for transcoding the source video into the target video; and respectively taking the source video and the target video as a path starting node and a path ending node in the path identification model.
In one embodiment, the computer program, when executed by the processor, further implements the steps of:
inputting a video group formed by the source video and the target video into a path identification model, extracting, through the path identification model, a first feature vector corresponding to parameter information of the source video and a second feature vector corresponding to parameter information of the target video, and determining, through the path identification model, a predicted value corresponding to a vector group formed by the first feature vector and the second feature vector;
and taking the path information characterized by the predicted value as the path information transcoded from the source video to the target video.
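The prediction step can be illustrated with a small sketch. It assumes, purely for illustration, that the parameter information consists of width, height, and audio channel count, that the model is a hand-set linear scorer, and that the predicted value is an index into a list of candidate paths. None of these details are taken from the application, and the names `extract_features` and `predict_path` are hypothetical.

```python
def extract_features(params):
    """Map parameter information to a feature vector (the fields are hypothetical)."""
    return [params["width"] / 1920.0,
            params["height"] / 1080.0,
            float(params["audio_channels"])]

def predict_path(source_params, target_params, weights, candidate_paths):
    first_vector = extract_features(source_params)
    second_vector = extract_features(target_params)
    vector_group = first_vector + second_vector
    # One linear score per candidate path; the index of the best score is the predicted value.
    scores = [sum(w * x for w, x in zip(row, vector_group)) for row in weights]
    predicted_value = max(range(len(scores)), key=lambda i: scores[i])
    return predicted_value, candidate_paths[predicted_value]

candidate_paths = [["root", "Node1", "Node11"], ["root", "Node2", "Node21"]]
weights = [[0, 0, 0, 0, 0, 1.0],     # hand-set for the demo, not learned
           [0, 0, 1.0, 0, 0, 0]]
source = {"width": 1920, "height": 1080, "audio_channels": 2}
target = {"width": 1920, "height": 1080, "audio_channels": 6}
value, path = predict_path(source, target, weights, candidate_paths)
```

With these hand-set weights the first candidate scores 6.0 and the second 2.0, so the predicted value is 0 and the first candidate path is returned.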
In one embodiment, the computer program, when executed by the processor, further implements the steps of:
acquiring a training sample set, wherein the training sample set comprises a sample video group of which the corresponding transcoding path conforms to the path information and a sample video group of which the corresponding transcoding path does not conform to the path information; the sample video group comprises sample videos respectively corresponding to the path starting node and the path ending node;
inputting a sample video group in the training sample set into a path identification model, wherein the path identification model comprises an initial prediction parameter;
processing the input sample video group through the initial prediction parameters to obtain a prediction result of the sample video group, wherein the prediction result is used for representing whether a transcoding path corresponding to the sample video group conforms to the path information;
if the prediction result is incorrect, adjusting the initial prediction parameters in the path identification model according to the difference between the prediction result and the correct result, so that the prediction result obtained when the sample video group is processed again with the adjusted prediction parameters is consistent with the correct result.
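The training procedure above resembles an error-driven update, which can be sketched with a perceptron-style rule. This is a stand-in technique chosen for illustration, not the training method confirmed by the application. Each sample video group is assumed to be already reduced to a two-number feature list, and a label of 1 means that the transcoding path of the group conforms to the path information.

```python
def predict(weights, bias, features):
    """Return 1 if the sample is predicted to conform to the path information, else 0."""
    score = sum(w * x for w, x in zip(weights, features)) + bias
    return 1 if score > 0 else 0

def train(samples, epochs=20, learning_rate=0.1):
    weights = [0.0] * len(samples[0][0])   # initial prediction parameters
    bias = 0.0
    for _ in range(epochs):
        for features, correct in samples:
            difference = correct - predict(weights, bias, features)
            if difference != 0:            # adjust only when the prediction is incorrect
                weights = [w + learning_rate * difference * x
                           for w, x in zip(weights, features)]
                bias += learning_rate * difference
    return weights, bias

# Hypothetical, linearly separable training sample set: (features, correct result).
samples = [
    ([1.0, 0.0], 1),
    ([0.9, 0.2], 1),
    ([0.1, 1.0], 0),
    ([0.0, 0.8], 0),
]
weights, bias = train(samples)
```

After training, processing each sample again with the adjusted parameters yields the correct result, which is the stopping condition the text describes.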
Referring to fig. 3, the present application further provides a video transcoding apparatus, including:
a video acquisition unit 100 for acquiring a source video;
a path determining unit 200, configured to determine path information transcoded from the source video to a target video; the path information comprises a transcoding path and a transcoding mode between nodes in the transcoding path;
a transcoding unit 300, configured to transcode the source video based on the obtained path information.
In an embodiment, the transcoding unit is further configured to transcode the source video along the transcoding path, using the transcoding mode between the nodes in the transcoding path, so as to obtain the target video.
In one embodiment, when at least two kinds of target videos are included in the video to be output,
the transcoding unit is further configured to, when an overlapping path exists between first path information corresponding to a first target video and second path information corresponding to a second target video, transcode the source video according to the overlapping path to obtain an intermediate node, and then transcode the intermediate node according to non-overlapping paths in the first path information and the second path information, respectively.
In one embodiment, the path determining unit is further configured to input a video group formed by the source video and the target video into a path identification model, and determine path information for transcoding the source video into the target video; and respectively taking the source video and the target video as a path starting node and a path ending node in the path identification model.
For the video transcoding device provided in the embodiments of this specification, the specific functions of each unit may be understood by reference to the foregoing method embodiments, and the technical effects of those method embodiments can likewise be achieved; details are not repeated here.
Referring to fig. 4, the present application further provides a video transcoding apparatus, the apparatus includes a memory and a processor, the memory is used for storing a computer program, and the computer program, when executed by the processor, implements the following steps:
S11: acquiring a source video;
S12: determining path information transcoded from the source video into a target video; the path information comprises a transcoding path and a transcoding mode between nodes in the transcoding path;
S13: transcoding the source video based on the determined path information.
In this embodiment, the memory may include a physical device for storing information; typically, the information is digitized and then stored in a medium using an electrical, magnetic, or optical method. The memory according to this embodiment may include: devices that store information using electrical energy, such as RAM, ROM, and USB flash drives; devices that store information using magnetic energy, such as hard disks, floppy disks, tapes, core memories, and bubble memories; and devices that store information optically, such as CDs or DVDs. Of course, there are other types of memory, such as quantum memory and graphene memory.
In this embodiment, the processor may be implemented in any suitable manner. For example, the processor may take the form of a microprocessor or processor together with a computer-readable medium that stores computer-readable program code (e.g., software or firmware) executable by the (micro)processor, logic gates, switches, an Application Specific Integrated Circuit (ASIC), a programmable logic controller, an embedded microcontroller, and so forth.
In one embodiment, the computer program, when executed by the processor, further implements the steps of:
transcoding the source video along the transcoding path, using the transcoding mode between the nodes in the transcoding path, to obtain the target video.
In one embodiment, when the computer program is executed by the processor and at least two target videos are included in the video to be output, the following steps are further implemented:
when an overlapped path exists between first path information corresponding to a first target video and second path information corresponding to a second target video, transcoding the source video according to the overlapped path to obtain an intermediate node, and then transcoding the intermediate node respectively according to non-overlapped paths in the first path information and the second path information.
In one embodiment, the computer program, when executed by the processor, further implements the steps of:
inputting a video group formed by the source video and the target video into a path identification model, and determining path information for transcoding the source video into the target video; and respectively taking the source video and the target video as a path starting node and a path ending node in the path identification model.
In one embodiment, the computer program, when executed by the processor, further implements the steps of:
inputting a video group formed by the source video and the target video into a path identification model, extracting, through the path identification model, a first feature vector corresponding to parameter information of the source video and a second feature vector corresponding to parameter information of the target video, and determining, through the path identification model, a predicted value corresponding to a vector group formed by the first feature vector and the second feature vector;
and taking the path information characterized by the predicted value as the path information transcoded from the source video to the target video.
In one embodiment, the computer program, when executed by the processor, further implements the steps of:
acquiring a training sample set, wherein the training sample set comprises a sample video group of which the corresponding transcoding path conforms to the path information and a sample video group of which the corresponding transcoding path does not conform to the path information; the sample video group comprises sample videos respectively corresponding to the path starting node and the path ending node;
inputting a sample video group in the training sample set into a path identification model, wherein the path identification model comprises an initial prediction parameter;
processing the input sample video group through the initial prediction parameters to obtain a prediction result of the sample video group, wherein the prediction result is used for representing whether a transcoding path corresponding to the sample video group conforms to the path information;
if the prediction result is incorrect, adjusting the initial prediction parameters in the path identification model according to the difference between the prediction result and the correct result, so that the prediction result obtained when the sample video group is processed again with the adjusted prediction parameters is consistent with the correct result.
As can be seen from the above, according to the technical scheme provided by the application, after the source video is obtained, for the target video to be output, the path information transcoded from the source video to the target video can be determined. The path information comprises a transcoding path and a transcoding mode between nodes in the transcoding path. In this way, the source video and other intermediate nodes in the transcoding path can be transcoded in sequence by the transcoding mode among the nodes in the transcoding path according to the transcoding path, and the target video is output. Therefore, the whole transcoding process can be completed in one transcoding machine, the uploading process and the process of reading from an external storage platform are reduced, the video transcoding time is shortened, and the video transcoding efficiency is improved.
In the 1990s, an improvement to a technology could be clearly distinguished as either an improvement in hardware (e.g., an improvement to a circuit structure such as a diode, transistor, or switch) or an improvement in software (an improvement to a method flow). However, as technology has advanced, many of today's method flow improvements can be regarded as direct improvements to hardware circuit structures. Designers almost always obtain the corresponding hardware circuit structure by programming an improved method flow into a hardware circuit. Therefore, it cannot be said that an improvement to a method flow cannot be implemented with hardware entity modules. For example, a Programmable Logic Device (PLD), such as a Field Programmable Gate Array (FPGA), is an integrated circuit whose logic functions are determined by the user programming the device. A designer programs a digital system onto a single PLD without asking a chip manufacturer to design and fabricate a dedicated integrated circuit chip. Furthermore, instead of manually fabricating integrated circuit chips, such programming is nowadays mostly implemented with "logic compiler" software, which is similar to the software compiler used in program development; the source code to be compiled must likewise be written in a specific programming language, called a Hardware Description Language (HDL). There is not just one HDL but many, such as ABEL (Advanced Boolean Expression Language), AHDL (Altera Hardware Description Language), Confluence, CUPL (Cornell University Programming Language), HDCal, JHDL (Java Hardware Description Language), Lava, Lola, MyHDL, PALASM, and RHDL (Ruby Hardware Description Language); the most commonly used at present are VHDL (Very-High-Speed Integrated Circuit Hardware Description Language) and Verilog.
It will also be apparent to those skilled in the art that a hardware circuit implementing a logical method flow can be readily obtained simply by describing the method flow in one of the above hardware description languages and programming it into an integrated circuit.
Those skilled in the art also know that, besides implementing the video transcoding apparatus purely as computer-readable program code, the method steps can be logically programmed so that the video transcoding apparatus performs the same functions in the form of logic gates, switches, application-specific integrated circuits, programmable logic controllers, embedded microcontrollers, and the like. Such a video transcoding apparatus can therefore be regarded as a hardware component, and the means included in it for implementing various functions can be regarded as structures within the hardware component. The means for implementing the functions can even be regarded both as software modules that implement the method and as structures within the hardware component.
From the above description of the embodiments, it is clear to those skilled in the art that the present application can be implemented by software plus a necessary general-purpose hardware platform. Based on this understanding, the technical solutions of the present application may be embodied, in essence or in part, in the form of a software product, which may be stored in a storage medium, such as a ROM/RAM, a magnetic disk, or an optical disk, and which includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) to execute the methods described in the embodiments, or in some parts of the embodiments, of the present application.
The embodiments in the present specification are described in a progressive manner; the same and similar parts among the embodiments can be referred to each other, and each embodiment focuses on its differences from the other embodiments. In particular, for the embodiments of the video transcoding apparatus, reference may be made to the description of the foregoing method embodiments.
The application may be described in the general context of computer-executable instructions, such as program modules, being executed by a computer. Generally, program modules include routines, programs, objects, components, data structures, etc. that perform particular tasks or implement particular abstract data types. The application may also be practiced in distributed computing environments where tasks are performed by remote processing devices that are linked through a communications network. In a distributed computing environment, program modules may be located in both local and remote computer storage media including memory storage devices.
Although the present application has been described by way of embodiments, those of ordinary skill in the art will recognize that many variations and changes are possible without departing from the spirit of the application, and the appended claims are intended to cover such variations and changes.

Claims (11)

1. A method of video transcoding, the method comprising:
acquiring a source video;
determining path information transcoded from the source video into a target video; the path information comprises a transcoding path and a transcoding mode between nodes in the transcoding path;
transcoding the source video based on the determined path information.
2. The method of claim 1, wherein transcoding the source video based on the determined path information comprises:
transcoding the source video along the transcoding path, using the transcoding mode between the nodes in the transcoding path, to obtain the target video.
3. The method of claim 1, wherein, when at least two target videos are included in the video to be output, transcoding the source video based on the determined path information comprises:
when an overlapped path exists between first path information corresponding to a first target video and second path information corresponding to a second target video, transcoding the source video according to the overlapped path to obtain an intermediate node, and then transcoding the intermediate node respectively according to non-overlapped paths in the first path information and the second path information.
4. The method of claim 1, wherein the path information is determined as follows:
inputting a video group formed by the source video and the target video into a path identification model, and determining path information for transcoding the source video into the target video; and respectively taking the source video and the target video as a path starting node and a path ending node in the path identification model.
5. The method of claim 4, wherein the path information is determined as follows:
inputting a video group formed by the source video and the target video into a path identification model, extracting, through the path identification model, a first feature vector corresponding to parameter information of the source video and a second feature vector corresponding to parameter information of the target video, and determining, through the path identification model, a predicted value corresponding to a vector group formed by the first feature vector and the second feature vector;
and taking the path information characterized by the predicted value as the path information transcoded from the source video to the target video.
6. The method of claim 4, wherein the path identification model is determined as follows:
acquiring a training sample set, wherein the training sample set comprises a sample video group of which the corresponding transcoding path conforms to the path information and a sample video group of which the corresponding transcoding path does not conform to the path information; the sample video group comprises sample videos respectively corresponding to the path starting node and the path ending node;
inputting a sample video group in the training sample set into a path identification model, wherein the path identification model comprises an initial prediction parameter;
processing the input sample video group through the initial prediction parameters to obtain a prediction result of the sample video group, wherein the prediction result is used for representing whether a transcoding path corresponding to the sample video group conforms to the path information;
if the prediction result is incorrect, adjusting the initial prediction parameters in the path identification model according to the difference between the prediction result and the correct result, so that the prediction result obtained when the sample video group is processed again with the adjusted prediction parameters is consistent with the correct result.
7. A video transcoding apparatus, the apparatus comprising:
a video acquisition unit for acquiring a source video;
a path determining unit, configured to determine path information transcoded from the source video to a target video; the path information comprises a transcoding path and a transcoding mode between nodes in the transcoding path;
a transcoding unit, configured to transcode the source video based on the determined path information.
8. The apparatus of claim 7, wherein the transcoding unit is further configured to transcode the source video along the transcoding path, using the transcoding mode between the nodes in the transcoding path, to obtain the target video.
9. The apparatus according to claim 7, wherein when at least two kinds of target videos are included in the video to be output,
the transcoding unit is further configured to, when an overlapping path exists between first path information corresponding to a first target video and second path information corresponding to a second target video, transcode the source video according to the overlapping path to obtain an intermediate node, and then transcode the intermediate node according to non-overlapping paths in the first path information and the second path information, respectively.
10. The apparatus of claim 7, wherein the path determining unit is further configured to input a video group consisting of the source video and the target video into a path identification model, and determine path information for transcoding the source video into the target video; and respectively taking the source video and the target video as a path starting node and a path ending node in the path identification model.
11. A video transcoding device, characterized in that the device comprises a memory and a processor, the memory being configured to store a computer program which, when executed by the processor, implements the method of any one of claims 1 to 6.
CN201811510054.7A 2018-12-11 2018-12-11 Video transcoding method and device Active CN111314706B (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN201811510054.7A CN111314706B (en) 2018-12-11 2018-12-11 Video transcoding method and device
PCT/CN2019/124232 WO2020119670A1 (en) 2018-12-11 2019-12-10 Video transcoding method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201811510054.7A CN111314706B (en) 2018-12-11 2018-12-11 Video transcoding method and device

Publications (2)

Publication Number Publication Date
CN111314706A true CN111314706A (en) 2020-06-19
CN111314706B CN111314706B (en) 2023-08-25

Family

ID=71075341

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811510054.7A Active CN111314706B (en) 2018-12-11 2018-12-11 Video transcoding method and device

Country Status (2)

Country Link
CN (1) CN111314706B (en)
WO (1) WO2020119670A1 (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104168488A (en) * 2014-08-29 2014-11-26 北京奇艺世纪科技有限公司 Video transcoding method and device
CN104935955A (en) * 2015-05-29 2015-09-23 腾讯科技(北京)有限公司 Live video stream transmission method, device and system
US20160034306A1 (en) * 2014-07-31 2016-02-04 Istreamplanet Co. Method and system for a graph based video streaming platform
CN107124635A (en) * 2017-06-06 2017-09-01 北京奇艺世纪科技有限公司 A kind of loading method of video, system for managing video and live broadcast system

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102916989B (en) * 2011-08-02 2018-02-13 腾讯科技(深圳)有限公司 A kind of method for downloading video, service end and client
US20130091207A1 (en) * 2011-10-08 2013-04-11 Broadcom Corporation Advanced content hosting
CN106161599A (en) * 2016-06-24 2016-11-23 电子科技大学 A kind of method reducing cloud storage overall overhead when there is data dependence relation

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112653892A (en) * 2020-12-18 2021-04-13 杭州当虹科技股份有限公司 Method for realizing transcoding test evaluation by using video characteristics
CN112653892B (en) * 2020-12-18 2024-04-23 杭州当虹科技股份有限公司 Method for realizing transcoding test evaluation by utilizing video features
CN115396683A (en) * 2022-08-22 2022-11-25 广州博冠信息科技有限公司 Video optimization processing method and device, electronic equipment and computer readable medium
CN115396683B (en) * 2022-08-22 2024-04-09 广州博冠信息科技有限公司 Video optimization processing method and device, electronic equipment and computer readable medium

Also Published As

Publication number Publication date
WO2020119670A1 (en) 2020-06-18
CN111314706B (en) 2023-08-25

Similar Documents

Publication Publication Date Title
CN104735468B (en) A kind of method and system that image is synthesized to new video based on semantic analysis
US20220215052A1 (en) Summarization of video artificial intelligence method, system, and apparatus
KR20210038467A (en) Method and apparatus for generating an event theme, device and storage medium
CN111314737B (en) Video transcoding method and device
CN114339450B (en) Video comment generation method, system, device and storage medium
CN111143551A (en) Text preprocessing method, classification method, device and equipment
CN111314706A (en) Video transcoding method and device
US20230075893A1 (en) Speech recognition model structure including context-dependent operations independent of future data
CN114881174A (en) Content classification method and device, readable storage medium and electronic equipment
CN114022955A (en) Action recognition method and device
AU2020364386B2 (en) Rare topic detection using hierarchical clustering
CN116628141B (en) Information processing method, device, equipment and storage medium
KR102371487B1 (en) Method and apparatus for learning based on data including nominal data
KR102243275B1 (en) Method, device and computer readable storage medium for automatically generating content regarding offline object
CN116151363B (en) Distributed Reinforcement Learning System
CN116975347A (en) Image generation model training method and related device
CN113726692B (en) Virtual network mapping method and device based on generation of countermeasure network
CN115600090A (en) Ownership verification method and device for model, storage medium and electronic equipment
CN113194270B (en) Video processing method and device, electronic equipment and storage medium
CN116756676A (en) Abstract generation method and related device
CN114723398A (en) Stage creative arrangement method, stage creative arrangement device and electronic equipment
CN111815638A (en) Training method of video segmentation network model, video segmentation method and related equipment
CN112035622A (en) Integrated platform and method for natural language processing
KR20200071826A (en) Method and apparatus for emotion detection in video
CN113076828B (en) Video editing method and device and model training method and device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant