CN111314706A

CN111314706A - Video transcoding method and device

Info

Publication number: CN111314706A
Application number: CN201811510054.7A
Authority: CN
Inventors: 李庆文
Original assignee: Alibaba Group Holding Ltd
Current assignee: Alibaba Group Holding Ltd
Priority date: 2018-12-11
Filing date: 2018-12-11
Publication date: 2020-06-19
Anticipated expiration: 2038-12-11
Also published as: WO2020119670A1; CN111314706B

Abstract

The embodiment of the application discloses a video transcoding method and a video transcoding device, wherein the method comprises the following steps: acquiring a source video; determining path information transcoded from the source video into a target video; the path information comprises a transcoding path and a transcoding mode between nodes in the transcoding path; transcoding the source video based on the determined path information. The technical scheme provided by the application can improve the efficiency of video transcoding.

Description

Video transcoding method and device

Technical Field

The present application relates to the field of internet technologies, and in particular, to a video transcoding method and apparatus.

Background

With the continuous development of internet technology, more and more video playing platforms emerge. In order to provide videos with different image qualities to users, a video playing platform generally needs to transcode a source video, so as to generate multiple videos with different resolutions and different code rates.

Currently, in some multi-level dependent transcoding scenarios, such as producing high frame rate video, different transcoding tasks are typically performed by separate transcoding machines. For example, before the source video is transcoded, high frame rate conversion must first be performed on it to generate an intermediate result, and the intermediate result is then transcoded to generate multiple videos with different resolutions and different code rates. In such a scenario, once the transcoding task that generates the intermediate result is completed, the intermediate result generally has to be uploaded to an external storage platform. When multiple transcoding tasks are subsequently performed based on the intermediate result, it has to be read from the external storage platform multiple times. Both the upload and the repeated reads consume time, resulting in low video transcoding efficiency.

Therefore, it is desirable to provide a faster video transcoding method.

Disclosure of Invention

The embodiment of the application aims to provide a video transcoding method and a video transcoding device, which can improve the efficiency of video transcoding.

In order to achieve the above object, an embodiment of the present application provides a video transcoding method, where the method includes: acquiring a source video; determining path information transcoded from the source video into a target video; the path information comprises a transcoding path and a transcoding mode between nodes in the transcoding path; and transcoding the source video based on the acquired path information.

To achieve the above object, the present application further provides a video transcoding device, where the device includes: a video acquisition unit for acquiring a source video; a path determining unit, configured to determine path information transcoded from the source video to a target video; the path information comprises a transcoding path and a transcoding mode between nodes in the transcoding path; and the transcoding unit is used for transcoding the source video based on the acquired path information.

In order to achieve the above object, the present application further provides a video transcoding device, which includes a memory and a processor, where the memory is used for storing a computer program, and the computer program, when executed by the processor, implements the above video transcoding method.

As can be seen from the above, according to the technical scheme provided by the application, after a source video is acquired, path information transcoded from the source video to a target video can be determined for the target video to be output. The path information comprises a transcoding path and a transcoding mode between nodes in the transcoding path. In this way, the source video and other intermediate nodes in the transcoding path can be transcoded in sequence by the transcoding mode among the nodes in the transcoding path according to the transcoding path, and the target video is output. Therefore, the whole transcoding process can be completed in one transcoding machine, the uploading process and the process of reading from an external storage platform are reduced, the video transcoding time is shortened, and the video transcoding efficiency is improved.

Drawings

In order to illustrate the embodiments of the present application or the technical solutions in the prior art more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. The drawings in the following description show only some embodiments described in the present application, and those skilled in the art can obtain other drawings from them without creative effort.

Fig. 1 is a schematic diagram of a video transcoding method according to an embodiment of the present application;

Fig. 2 is a schematic diagram of a directed acyclic transcoding architecture according to an embodiment of the present application;

Fig. 3 is a schematic structural diagram of a video transcoding apparatus according to an embodiment of the present disclosure;

Fig. 4 is a schematic structural diagram of another video transcoding apparatus according to an embodiment of the present disclosure.

Detailed Description

In order to make those skilled in the art better understand the technical solutions in the present application, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only a part of the embodiments of the present application, and not all of the embodiments. All other embodiments obtained by a person of ordinary skill in the art without any inventive work based on the embodiments in the present application shall fall within the scope of protection of the present application.

The application provides a video transcoding method which can be applied to terminal equipment with an image processing function. The terminal device may be, for example, a desktop computer, a notebook computer, a tablet computer, a workstation, etc. In addition, the method can also be applied to a service server of a video playing website, and the service server can be an independent server or a server cluster consisting of a plurality of servers.

Referring to fig. 1, a video transcoding method provided in the present application includes the following steps.

S11: a source video is acquired.

In this embodiment, multiple videos with different resolutions and different code rates can be generated by transcoding the source video.

In this embodiment, the manner of acquiring the source video may include reading the source video from a storage path provided by the terminal device or receiving the source video sent by another terminal device.

S13: determining path information transcoded from the source video into a target video; the path information comprises a transcoding path and a transcoding mode between nodes in the transcoding path.

In an actual application scenario, transcoding the source video into the target video to be output may require a transcoding process that passes through multiple nodes. For example, in some multi-level dependent transcoding scenarios, such as producing high frame rate video, the source video must first be converted into a high-frame-rate source video by Frame Rate Conversion (FRC), which generates an intermediate result. The intermediate result is then transcoded to generate a video with a specified resolution, and that video is finally transcoded to generate a target video with a specified video format and the specified resolution. Outputting this target video therefore requires two intermediate nodes, and the whole transcoding process can be divided into four levels: a first level with the source video as the root node, a second level with the high-frame-rate source video as a child node, a third level with the video of the specified resolution as a third-level node, and a fourth level with the target video of the specified resolution and video format as a leaf node. The output of the second-level node depends on the first-level node, the output of the third-level node depends on the second-level node, and the output of the fourth-level node depends on the third-level node. To implement such a complex multi-level dependent transcoding process in the terminal device, the path information for transcoding from the source video to the target video may be determined first. The path information may include a transcoding path and a transcoding mode between nodes in the transcoding path. In this way, the transcoding process of the source video can subsequently be carried out in one terminal device based on the determined path information to obtain the target video.
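The four-level example can be pictured as a small data structure. The following is only a minimal sketch under stated assumptions: the class names `TranscodeStep` and `PathInfo`, the node names, and the contents of each `mode` dictionary are hypothetical and are not defined by the application.

```python
from dataclasses import dataclass, field


@dataclass
class TranscodeStep:
    """One edge of the transcoding path: how to get from one node to the next."""
    source_node: str
    target_node: str
    mode: dict  # hypothetical transcoding parameters for this edge


@dataclass
class PathInfo:
    """Path information: an ordered transcoding path plus the mode on each edge."""
    steps: list = field(default_factory=list)

    def nodes(self):
        """Return the node names in transcoding order, root node first."""
        if not self.steps:
            return []
        return [self.steps[0].source_node] + [s.target_node for s in self.steps]


# Root node -> high-frame-rate video -> specified resolution -> target video.
path_info = PathInfo(steps=[
    TranscodeStep("source", "high_frame_rate", {"operation": "frc", "fps": 60}),
    TranscodeStep("high_frame_rate", "resized", {"operation": "scale", "height": 1080}),
    TranscodeStep("resized", "target", {"operation": "format", "container": "mp4"}),
])
```

Three steps connect four nodes, which matches the two intermediate nodes between the root node and the leaf node described above.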

In one embodiment, the transcoding process for transcoding the source video into the target video may be a multi-level dependent transcoding process already existing in an actual application, so that the dependency relationship between transcoding tasks may be obtained from existing separate transcoding tasks, and the input and output of the transcoding tasks may be taken as nodes in one transcoding path. In this way, a transcoding path for transcoding the source video into the target video and nodes in the transcoding path can be obtained. The transcoding mode between each node can also be directly obtained through each separated transcoding task. In this manner, path information for transcoding from the source video to the target video may be determined. In this embodiment, the transcoding manner corresponds to a video transcoding parameter required for transcoding one video to another video, and a parameter value of the video transcoding parameter may be determined according to a parameter value of a video parameter and a parameter value of an audio parameter of the two videos. The transcoding parameters may include, for example, fidelity, resolution, transmission code rate, etc. After the transcoding parameters are set, the video can be transcoded, so that the transcoded video conforming to the transcoding parameters is obtained.
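One way to picture deriving the path from existing separate transcoding tasks is to chain the tasks by their inputs and outputs. This is a minimal sketch, not the application's procedure: the task schema (`input`, `output`, `mode` keys), the function name `derive_path`, and the assumption that each output is produced by exactly one task are all illustrative.

```python
def derive_path(tasks, source, target):
    """Chain separate tasks into an ordered path from source to target.

    tasks: list of dicts with 'input', 'output', and 'mode' keys (hypothetical
    schema). Assumes each output is produced by exactly one task.
    """
    produced_by = {task["output"]: task for task in tasks}
    steps = []
    node = target
    # Walk the dependencies backwards from the target until the source is reached.
    while node != source:
        task = produced_by.get(node)
        if task is None:
            raise ValueError(f"no task produces {node!r}")
        steps.append((task["input"], task["output"], task["mode"]))
        node = task["input"]
        if len(steps) > len(tasks):
            raise ValueError("cyclic task dependencies")
    steps.reverse()
    return steps


existing_tasks = [
    {"input": "resized", "output": "target", "mode": {"container": "mp4"}},
    {"input": "source", "output": "high_frame_rate", "mode": {"fps": 60}},
    {"input": "high_frame_rate", "output": "resized", "mode": {"height": 1080}},
]
derived = derive_path(existing_tasks, "source", "target")
```

Each tuple in `derived` holds the two nodes of one edge and the transcoding mode between them, so the list as a whole plays the role of the path information.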

In one embodiment, the transcoding process from the source video to the target video may not be found among existing multi-level dependent transcoding processes. In that case, a machine learning method may be adopted in practice to construct a path identification model for identifying the transcoding path information. For example, the path information corresponding to the video group composed of the source video and the target video may be identified by a Support Vector Machine (SVM). The source video and the target video can be used as the path start node and the path end node in the path identification model, respectively. Specifically, when the path identification model is constructed, a training sample set may be obtained in advance and used to train the model, so that the model can identify the path information corresponding to an input video group. The training sample set may include sample video groups whose corresponding transcoding paths conform to the path information and sample video groups whose corresponding transcoding paths do not conform to the path information. Each sample video group may include sample videos corresponding to the path start node and the path end node, respectively. During training, the sample video groups in the training sample set can be input into the path identification model in sequence. An initial neural network can be constructed in the path identification model, with initial prediction parameters preset in the neural network. After an input sample video group is processed with the initial prediction parameters, a prediction result for the sample video group is obtained, which indicates whether the transcoding path corresponding to the sample video group conforms to the path information.
Specifically, when the path identification model processes a sample video group, a first feature vector corresponding to the parameter information of the source video and a second feature vector corresponding to the parameter information of the target video may first be extracted. The elements of the first feature vector may be the parameter values of the source video's parameters, for example its video parameters or audio parameters; the video parameters may include video resolution, video bitrate, video frame rate, video format, and the like. Similarly, the elements of the second feature vector may be the parameter values of the target video's parameters. The path identification model may thus read the parameter value of each parameter of the source video corresponding to the path start node in the sample video group and the parameter value of each parameter of the target video corresponding to the path end node, and assemble them into the first feature vector and the second feature vector in the order in which they are read. In practical applications, the number of parameters is usually large, so the extracted feature vectors have a high dimension and processing them consumes considerable resources. In view of this, in the present embodiment, a Convolutional Neural Network (CNN) may further be used to process the sample video group to obtain feature vectors of lower dimension, which facilitates subsequent identification.
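The extraction of the two feature vectors can be sketched as follows. The parameter names, their order, the list of video formats, and the one-hot encoding of the format are assumptions made for illustration; the application only states that parameter values are placed into the vectors in the order read.

```python
# Hypothetical, fixed reading order for the numeric video parameters.
PARAMETER_ORDER = ("width", "height", "bitrate_kbps", "frame_rate")
# Hypothetical set of container formats, encoded one-hot after the numeric values.
VIDEO_FORMATS = ("mp4", "flv", "mkv")


def feature_vector(params):
    """Turn one video's parameter values into a flat list of floats."""
    numeric = [float(params[name]) for name in PARAMETER_ORDER]
    one_hot = [1.0 if params["format"] == fmt else 0.0 for fmt in VIDEO_FORMATS]
    return numeric + one_hot


def video_group_features(source_params, target_params):
    """Return (first feature vector, second feature vector) for a video group."""
    return feature_vector(source_params), feature_vector(target_params)


source_params = {"width": 1920, "height": 1080, "bitrate_kbps": 8000,
                 "frame_rate": 30, "format": "flv"}
target_params = {"width": 1280, "height": 720, "bitrate_kbps": 2500,
                 "frame_rate": 60, "format": "mp4"}
first_vector, second_vector = video_group_features(source_params, target_params)
```

With many more parameters than shown here the vectors grow long, which is the situation in which the text suggests reducing the dimension with a CNN; that step is not shown in this sketch.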

In this embodiment, after the data of the input sample video group is processed by the neural network, a probability value vector for the sample video group is obtained. The probability value vector may contain probability values for the specified path information: two values that represent, respectively, the probability that the transcoding path conforms to the specified path information and the probability that it does not. For example, after a sample video group whose transcoding path conforms to the specified path information is input, the path identification model may output the probability value vector (0.4, 0.8), where 0.4 is the probability that the transcoding path conforms to the specified path information and 0.8 is the probability that it does not. Because the initial prediction parameters in the path identification model may not be set accurately enough, the predicted probabilities may not match the actual situation, as in this example: the input sample video group conforms to the specified path information, yet the predicted probability of conforming is only 0.4 while the predicted probability of not conforming is 0.8. In this case, the prediction result is incorrect, and the initial prediction parameters in the path identification model may be adjusted according to the difference between the prediction result and the correct result. In particular, each sample video group may have a theoretical probability value result.
For example, the theoretical probability value result for a sample video group whose transcoding path conforms to the specified path information may be (1, 0), where 1 is the probability that the transcoding path conforms to the specified path information. The difference between the predicted probability value result and the theoretical probability value result may then be computed and used to adjust the initial prediction parameters of the neural network, so that when the sample video group is processed again with the adjusted prediction parameters, the prediction result matches the correct result. After training on a large number of training samples, the path identification model can distinguish whether the transcoding path corresponding to a sample video group conforms to the specified transcoding path, and can accordingly identify the path information that matches the actual transcoding path of the sample video group.
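The adjustment of prediction parameters from the difference between predicted and theoretical probabilities can be illustrated with a deliberately small stand-in model. This sketch assumes a single linear layer followed by softmax rather than the neural network of the embodiment, and the feature values, learning rate, and function names are hypothetical; for softmax with cross-entropy loss, the gradient with respect to each output is exactly the predicted probability minus the theoretical one.

```python
import math


def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(logits)
    exps = [math.exp(value - peak) for value in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict(weights, features):
    """Return [p_conforms, p_does_not_conform] for one feature vector."""
    logits = [sum(w * x for w, x in zip(row, features)) for row in weights]
    return softmax(logits)


def train_step(weights, features, theoretical, learning_rate=0.1):
    """Adjust weights in place using predicted minus theoretical probabilities."""
    predicted = predict(weights, features)
    for k, row in enumerate(weights):
        difference = predicted[k] - theoretical[k]
        for i, x in enumerate(features):
            row[i] -= learning_rate * difference * x
    return predicted


# Initial prediction parameters: all zeros, so the first prediction is (0.5, 0.5).
weights = [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
conforming_sample = [1.0, 0.5, 1.0]  # hypothetical features; last element is a bias input
for _ in range(200):
    train_step(weights, conforming_sample, theoretical=[1.0, 0.0])
```

Because softmax is used here, the two predicted probabilities always sum to 1, and repeated adjustment moves the prediction for the conforming sample toward the theoretical result (1, 0).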

S15: transcoding the source video based on the determined path information.

In this embodiment, after the path information for transcoding from the source video to the target video to be output is determined, the transcoding process for the source video may be carried out in one terminal device based on that path information to obtain the target video. Specifically, the source video may be transcoded along the transcoding path included in the path information, using the transcoding mode between each pair of nodes in the path, so that the target video is obtained. For example, suppose the transcoding path includes four nodes, which in transcoding order are a root node, a child node, a third-level node, and a fourth-level node, where the root node is the source video and the fourth-level node is the target video. The root node is first transcoded using the transcoding mode between the root node and the child node to obtain the video corresponding to the child node. The video corresponding to the child node is then transcoded using the transcoding mode between the child node and the third-level node to obtain the video corresponding to the third-level node. Finally, the video corresponding to the third-level node is transcoded using the transcoding mode between the third-level node and the fourth-level node to obtain the target video. The whole multi-level dependent transcoding process is thus completed in one terminal device, which removes the need to upload the videos corresponding to the intermediate nodes to an external storage platform and read them back, reducing the video transcoding time and improving the video transcoding efficiency.
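The sequential execution described above can be sketched as a loop that feeds each locally stored intermediate result into the next step. This is an illustrative sketch only: `run_path` and `fake_transcode` are hypothetical names, and the placeholder transcoder writes text files instead of re-encoding video so that the example stays self-contained.

```python
import os
import tempfile


def run_path(source_file, steps, transcode, work_dir):
    """Transcode along the path, keeping every intermediate result in work_dir.

    steps: ordered list of (node_name, mode) pairs, one per non-root node.
    transcode: callable(input_file, output_file, mode), a stand-in for a real
    transcoder.
    """
    current = source_file
    for node_name, mode in steps:
        output_file = os.path.join(work_dir, node_name)
        transcode(current, output_file, mode)
        current = output_file  # the next step reads the local intermediate result
    return current


def fake_transcode(input_file, output_file, mode):
    """Placeholder that appends the mode name instead of re-encoding video."""
    with open(input_file) as src, open(output_file, "w") as dst:
        dst.write(src.read() + "|" + mode)


with tempfile.TemporaryDirectory() as work_dir:
    source_file = os.path.join(work_dir, "source")
    with open(source_file, "w") as handle:
        handle.write("source")
    target_file = run_path(
        source_file,
        [("child", "frc"), ("third_level", "scale"), ("target", "format")],
        fake_transcode,
        work_dir,
    )
    with open(target_file) as handle:
        result = handle.read()
```

No step uploads or downloads anything: every intermediate file stays in the local working directory, which is the property the embodiment relies on to save time.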

In this embodiment, in some complex transcoding scenarios, the video to be output often includes at least two target videos, so that multiple transcoding paths for the target videos may occur. When overlapping paths exist in the transcoding paths, in order to avoid repeating the transcoding process in the overlapping paths, it may be determined whether overlapping paths exist in the transcoding paths for each target video, if so, the source video may be transcoded according to the overlapping paths to obtain intermediate nodes, and then the intermediate nodes are transcoded according to non-overlapping paths in each transcoding path. Specifically, for two target videos, a first target video and a second target video, after determining first path information transcoded into the first target video from a source video and second path information transcoded into the second target video from the source video, if an overlapping path exists between the first path information corresponding to the first target video and the second path information corresponding to the second target video, the source video may be transcoded according to the overlapping path to obtain an intermediate node, and then the intermediate node is transcoded according to non-overlapping paths in the first path information and the second path information, respectively. To implement the above process, a transcoding structure of a Directed Acyclic Graph (DAG) including nodes may be constructed according to a top-bottom dependency relationship between the nodes in the path information for the target videos.
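Detecting the overlapping path between two target videos amounts to finding the shared leading portion of their transcoding paths. The sketch below assumes each path is a list of (node, mode) pairs starting from the same source video; the function name `split_overlap` and the sample node names are hypothetical.

```python
def split_overlap(first_path, second_path):
    """Return (overlap, first_rest, second_rest) for two paths from the same root."""
    shared = 0
    for first_step, second_step in zip(first_path, second_path):
        if first_step != second_step:
            break
        shared += 1
    return first_path[:shared], first_path[shared:], second_path[shared:]


# Each step is (node reached, transcoding mode used to reach it).
first_path = [("high_frame_rate", "frc"), ("hd", "scale_1080"), ("hd_mp4", "mp4")]
second_path = [("high_frame_rate", "frc"), ("sd", "scale_480"), ("sd_mp4", "mp4")]
overlap, first_rest, second_rest = split_overlap(first_path, second_path)
```

The steps in `overlap` are run once to produce the intermediate node, after which `first_rest` and `second_rest` are run separately from that intermediate node.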

For example, in an application scenario in which a source video is transcoded into target videos with Dolby sound effects, multiple target videos with different Dolby sound effects need to be output, and the transcoding paths in the path information corresponding to these target videos partially overlap. A corresponding DAG transcoding structure may then be constructed according to the upper-lower dependency relationships between the nodes in the path information so as to merge the overlapping paths, and the complex transcoding process that outputs multiple target videos can be completed in one terminal device directly according to the DAG transcoding structure. As shown in fig. 2, among the nodes of the target videos to be finally output, the paths corresponding to Node11 with Dolby sound effect 11, Node12 with Dolby sound effect 12, and Node13 with Dolby sound effect 13 all include the path from the root node to Node1 with Dolby sound effect 1. The overlapping paths may therefore be merged, so that the terminal device transcodes the source video along the overlapping path to obtain the intermediate node Node1, and then transcodes Node1 along the non-overlapping paths to obtain Node11, Node12, and Node13 respectively. Similarly, the paths of the other finally output target videos, Node21 with Dolby sound effect 21 and Node22 with Dolby sound effect 22, may be merged in the same way, so that the transcoding paths of the DAG transcoding structure are constructed. In fig. 2, the root node forms the first level, Node1 and Node2 form the second level, and Node11, Node12, Node13, Node21, and Node22 form the third level, with dependency relationships between the levels: the first level is used as input to output the second level, and the second level is then used as input to output the third level.
In this embodiment, after the DAG transcoding structure is constructed, it may be expanded horizontally and vertically as transcoding service requirements are added. As shown in the dotted box of fig. 2, to meet added requirements for outputting videos with Dolby sound effects 211, 212, 213, 221, and 311, Node3 may be added in the horizontal direction, Node211, Node212, Node213, and Node221 may be added in the vertical direction, and Node31 and Node311 may be added according to the determined path information of the videos to be output. The transcoding paths of the constructed DAG transcoding structure can therefore meet complex transcoding service requirements within one terminal device. During transcoding, the intermediate nodes are stored locally in the terminal device, so the videos corresponding to the intermediate nodes do not need to be uploaded to an external storage device, nor read from it multiple times for the subsequent transcoding process.
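The merged structure of fig. 2 can be sketched by folding the individual root-to-target paths into a parent-to-children mapping and then visiting the nodes level by level. The node names follow fig. 2, but the functions `build_dag` and `transcode_order` are hypothetical, and the traversal assumes each node has a single parent, as in that figure.

```python
from collections import deque


def build_dag(paths):
    """Merge root-to-target paths into a parent -> children mapping."""
    children = {}
    for path in paths:
        for parent, child in zip(path, path[1:]):
            siblings = children.setdefault(parent, [])
            if child not in siblings:  # an overlapping edge is stored only once
                siblings.append(child)
            children.setdefault(child, [])
    return children


def transcode_order(children, root):
    """Breadth-first order: each node appears once, after its parent."""
    order, queue, seen = [], deque([root]), {root}
    while queue:
        node = queue.popleft()
        order.append(node)
        for child in children.get(node, []):
            if child not in seen:
                seen.add(child)
                queue.append(child)
    return order


paths = [
    ["root", "Node1", "Node11"],
    ["root", "Node1", "Node12"],
    ["root", "Node1", "Node13"],
    ["root", "Node2", "Node21"],
    ["root", "Node2", "Node22"],
]
dag = build_dag(paths)
order = transcode_order(dag, "root")

# Horizontal and vertical expansion: add the path of a new output video.
extended = build_dag(paths + [["root", "Node3", "Node31", "Node311"]])
```

Node1 appears only once in `order` even though three target videos depend on it, which reflects the merging of overlapping paths; expanding the structure only requires supplying the path of each newly required video.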

In this embodiment, the functions implemented in the above method steps may be implemented by a computer program, and the computer program may be stored in a computer storage medium. In particular, the computer storage medium may be coupled to a processor, which may thereby read the computer program from the computer storage medium. The computer program, when executed by a processor, may perform the following functions:

S11: acquiring a source video;

S13: determining path information transcoded from the source video into a target video; the path information comprises a transcoding path and a transcoding mode between nodes in the transcoding path;

S15: transcoding the source video based on the determined path information.

In one embodiment, the computer program, when executed by the processor, further implements the steps of:

and transcoding the source video by a transcoding mode among nodes in the transcoding path according to the transcoding path to obtain the target video.

In one embodiment, when the computer program is executed by the processor and at least two target videos are included in the video to be output, the following steps are further implemented:

when an overlapped path exists between first path information corresponding to a first target video and second path information corresponding to a second target video, transcoding the source video according to the overlapped path to obtain an intermediate node, and then transcoding the intermediate node respectively according to non-overlapped paths in the first path information and the second path information.

In one embodiment, the computer program, when executed by the processor, further implements the following steps:

inputting a video group formed by the source video and the target video into a path identification model, and determining path information for transcoding the source video into the target video; and respectively taking the source video and the target video as a path starting node and a path ending node in the path identification model.

In one embodiment, the computer program, when executed by the processor, further implements the following steps:

inputting a video group formed by the source video and the target video into a path identification model, respectively extracting a first feature vector corresponding to parameter information of the source video and a second feature vector corresponding to parameter information of the target video through the path identification model, and determining a predicted value corresponding to a vector group formed by the first feature vector and the second feature vector through the path identification model;

and taking the path information characterized by the predicted value as the path information transcoded from the source video to the target video.

In one embodiment, the computer program, when executed by the processor, further implements the following steps:

acquiring a training sample set, wherein the training sample set comprises a sample video group of which the corresponding transcoding path conforms to the path information and a sample video group of which the corresponding transcoding path does not conform to the path information; the sample video group comprises sample videos respectively corresponding to the path starting node and the path ending node;

inputting a sample video group in the training sample set into a path identification model, wherein the path identification model comprises an initial prediction parameter;

processing the input sample video group through the initial prediction parameters to obtain a prediction result of the sample video group, wherein the prediction result is used for representing whether a transcoding path corresponding to the sample video group conforms to the path information;

if the prediction result is incorrect, adjusting the initial prediction parameters in the path recognition model according to the difference value between the prediction result and the correct result, so that the obtained prediction result is consistent with the correct result after the sample video group is processed again through the adjusted prediction parameters.

Referring to fig. 3, the present application further provides a video transcoding apparatus, including:

a video acquisition unit 100 for acquiring a source video;

a path determining unit 200, configured to determine path information transcoded from the source video to a target video; the path information comprises a transcoding path and a transcoding mode between nodes in the transcoding path;

a transcoding unit 300, configured to transcode the source video based on the obtained path information.

In an embodiment, the transcoding unit is further configured to transcode the source video according to the transcoding path by means of transcoding between nodes in the transcoding path, so as to obtain the target video.

In one embodiment, when at least two kinds of target videos are included in the video to be output,

the transcoding unit is further configured to, when an overlapping path exists between first path information corresponding to a first target video and second path information corresponding to a second target video, transcode the source video according to the overlapping path to obtain an intermediate node, and then transcode the intermediate node according to non-overlapping paths in the first path information and the second path information, respectively.

In one embodiment, the path determining unit is further configured to input a video group formed by the source video and the target video into a path identification model, and determine path information for transcoding the source video into the target video; and respectively taking the source video and the target video as a path starting node and a path ending node in the path identification model.

In the video transcoding device provided in the embodiments of the present specification, the specific functions of each unit module can be understood with reference to the foregoing method embodiments, and the technical effects of those method embodiments can be achieved; details are not repeated here.

Referring to fig. 4, the present application further provides a video transcoding apparatus, the apparatus includes a memory and a processor, the memory is used for storing a computer program, and the computer program, when executed by the processor, implements the following steps:

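S11: acquiring a source video;

S13: determining path information transcoded from the source video into a target video; the path information comprises a transcoding path and a transcoding mode between nodes in the transcoding path;

S15: transcoding the source video based on the determined path information.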

In this embodiment, the memory may include a physical device for storing information; typically, the information is digitized and then stored in a medium by electrical, magnetic, or optical means. The memory according to this embodiment may include: devices that store information using electrical energy, such as RAM and ROM; devices that store information using magnetic energy, such as hard disks, floppy disks, magnetic tapes, magnetic core memories, bubble memories, and USB disks; and devices that store information optically, such as CDs or DVDs. Of course, there are other types of memory as well, such as quantum memory and graphene memory.

In this embodiment, the processor may be implemented in any suitable manner. For example, the processor may take the form of, for example, a microprocessor or processor and a computer-readable medium that stores computer-readable program code (e.g., software or firmware) executable by the (micro) processor, logic gates, switches, an Application Specific Integrated Circuit (ASIC), a programmable logic controller, an embedded microcontroller, and so forth.

As can be seen from the above, according to the technical scheme provided by the application, after the source video is obtained, for the target video to be output, the path information transcoded from the source video to the target video can be determined. The path information comprises a transcoding path and a transcoding mode between nodes in the transcoding path. In this way, the source video and other intermediate nodes in the transcoding path can be transcoded in sequence by the transcoding mode among the nodes in the transcoding path according to the transcoding path, and the target video is output. Therefore, the whole transcoding process can be completed in one transcoding machine, the uploading process and the process of reading from an external storage platform are reduced, the video transcoding time is shortened, and the video transcoding efficiency is improved.

In the 1990s, an improvement to a technology could be clearly distinguished as either an improvement in hardware (e.g., an improvement to a circuit structure such as a diode, transistor, or switch) or an improvement in software (an improvement to a method flow). However, as technology has advanced, many of today's method-flow improvements can be regarded as direct improvements to hardware circuit structures. Designers almost always obtain the corresponding hardware circuit structure by programming an improved method flow into a hardware circuit. Thus, it cannot be said that an improvement to a method flow cannot be realized with hardware physical modules. For example, a Programmable Logic Device (PLD), such as a Field Programmable Gate Array (FPGA), is an integrated circuit whose logic functions are determined by the user programming the device. A designer programs a digital system onto a single PLD without requiring a chip manufacturer to design and fabricate an application-specific integrated circuit chip. Furthermore, instead of manually fabricating integrated circuit chips, such programming is nowadays mostly implemented with "logic compiler" software, which is similar to the software compiler used in program development; the source code to be compiled must be written in a specific programming language called a Hardware Description Language (HDL). There is not just one HDL but many, such as ABEL (Advanced Boolean Expression Language), AHDL (Altera Hardware Description Language), Confluence, CUPL (Cornell University Programming Language), HDCal, JHDL (Java Hardware Description Language), Lava, Lola, MyHDL, PALASM, and RHDL (Ruby Hardware Description Language); VHDL (Very-High-Speed Integrated Circuit Hardware Description Language) and Verilog are the most commonly used at present. It will also be apparent to those skilled in the art that a hardware circuit implementing a logical method flow can be readily obtained merely by lightly programming the method flow into an integrated circuit using one of the hardware description languages above.

Those skilled in the art also know that, instead of implementing the video image transcoding apparatus purely as computer-readable program code, it is entirely possible to logically program the method steps so that the apparatus performs the same functions in the form of logic gates, switches, application specific integrated circuits, programmable logic controllers, embedded microcontrollers, and the like. Such a video image transcoding apparatus can therefore be regarded as a hardware component, and the means it includes for implementing various functions can be regarded as structures within the hardware component. The means for implementing the functions can even be regarded both as software modules that perform the method and as structures within the hardware component.

From the above description of the embodiments, it is clear to those skilled in the art that the present application can be implemented by software plus a necessary general-purpose hardware platform. Based on this understanding, the technical solutions of the present application may be embodied, in essence or in part, in the form of a software product. The software product may be stored in a storage medium, such as a ROM/RAM, a magnetic disk, or an optical disk, and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) to execute the method described in the embodiments, or in some parts of the embodiments, of the present application.

The embodiments in the present specification are described in a progressive manner: the same and similar parts among the embodiments can be referred to each other, and each embodiment focuses on its differences from the other embodiments. In particular, for the embodiments of the video image transcoding device, reference may be made to the description of the method embodiments above.

The application may be described in the general context of computer-executable instructions, such as program modules, being executed by a computer. Generally, program modules include routines, programs, objects, components, data structures, etc. that perform particular tasks or implement particular abstract data types. The application may also be practiced in distributed computing environments where tasks are performed by remote processing devices that are linked through a communications network. In a distributed computing environment, program modules may be located in both local and remote computer storage media including memory storage devices.

Although the present application has been described by way of embodiments, those of ordinary skill in the art will recognize that it admits numerous variations and permutations that do not depart from its spirit, and the appended claims are intended to encompass such variations and permutations.

Claims

1. A method of video transcoding, the method comprising:

acquiring a source video;

determining path information transcoded from the source video into a target video; the path information comprises a transcoding path and a transcoding mode between nodes in the transcoding path;

transcoding the source video based on the determined path information.

2. The method of claim 1, transcoding the source video based on the determined path information comprising:

3. The method of claim 1, when at least two target videos are included in the video to be output, transcoding the source video based on the determined path information comprises:

4. The method of claim 1, wherein the path information is determined as follows:

5. The method of claim 4, wherein the path information is determined as follows:

6. The method of claim 4, wherein the path recognition model is determined as follows:

7. A video transcoding apparatus, the apparatus comprising:

a video acquisition unit for acquiring a source video;

a path determining unit, configured to determine path information transcoded from the source video to a target video; the path information comprises a transcoding path and a transcoding mode between nodes in the transcoding path;

and a transcoding unit, configured to transcode the source video based on the acquired path information.

8. The apparatus of claim 7, wherein the transcoding unit is further configured to transcode the source video along the transcoding path, using the transcoding mode between nodes in the transcoding path, to obtain the target video.

9. The apparatus according to claim 7, wherein when at least two kinds of target videos are included in the video to be output,

10. The apparatus of claim 7, wherein the path determining unit is further configured to input a video group consisting of the source video and the target video into a path identification model, and to determine the path information for transcoding the source video into the target video; wherein the source video and the target video serve respectively as the path start node and the path end node in the path identification model.

11. A video transcoding device, characterized in that the device comprises a memory and a processor, the memory storing a computer program which, when executed by the processor, implements the method of any of claims 1 to 6.