CN113852824A

CN113852824A - Video transcoding method and device, electronic equipment and storage medium

Info

Publication number: CN113852824A
Application number: CN202111192912.XA
Authority: CN
Inventors: 许东旭
Original assignee: Wangsu Science and Technology Co Ltd
Current assignee: Wangsu Science and Technology Co Ltd
Priority date: 2021-10-13
Filing date: 2021-10-13
Publication date: 2021-12-28

Abstract

The application relates to the technical field of communication, and discloses a video transcoding method, a video transcoding device, electronic equipment and a storage medium, wherein the method comprises the following steps: acquiring the playing time of each frame of the transcoded source stream video according to the output frame rate of the transcoded source stream video; screening out video frames which do not meet the requirement of monotonously increasing decoding time and discarding the video frames according to the playing time and the timestamp information of each video frame in the source stream video; and transcoding the source stream video according to the output frame rate, updating the time stamp of each key frame into the initial time stamp of each key frame in the source stream video, and updating the time stamps of the rest non-key frames according to the time stamp of each key frame. By aligning the key frame timestamps of the video before and after transcoding in the video transcoding process and adaptively updating the timestamps of the rest non-key frames, the transcoded source stream video can smoothly switch the definition of the video according to the aligned key frames of the timestamps.

Description

Video transcoding method and device, electronic equipment and storage medium

Technical Field

The embodiment of the application relates to the technical field of communication, in particular to a video transcoding method, a video transcoding device, electronic equipment and a storage medium.

Background

With the continuous development of communication technology and the internet, it has become normal for users to play videos or watch live broadcasts on mobile devices. Depending on factors such as network state and device limitations, a user may select or change the definition according to the actual situation when playing a video on a mobile device, and a content provider, for example a Content Delivery Network (CDN) cloud service manufacturer, needs to support flexible switching of videos between different definitions according to the user's selection.

In the field of cloud video transcoding, a content provider supports switching between definitions by transcoding the video. Because playback must start from a key frame, after a definition switch the player relocates to the key frame closest to the current moment and starts playing the switched video from that key frame. To protect the viewing experience of the user, the video must remain smooth during definition switching. A terminal device at the content provider usually calculates a target output frame rate for the transcoded video stream according to the definition after switching. When the current frame rate of the video stream differs from the target frame rate, it performs frame interpolation on the transcoded video stream at the positions with the same display timestamp, so that the output frame rate of the transcoded video stream becomes the target output frame rate, thereby realizing smooth switching of the video stream between different definitions.

However, after the frame rate of the video stream is converted according to the target output frame rate, the video frame at the time of frame interpolation in the transcoded video stream and the video frame of the original video at that time are likely not to be the same video frame. When playback resumes, the playback progress of the transcoded video then differs from the playback progress before transcoding, which degrades the user experience.

Disclosure of Invention

The embodiment of the application mainly aims to provide a video transcoding method, a video transcoding device, electronic equipment and a storage medium, and aims to align key frame timestamps of videos before and after transcoding in a video transcoding process, so that subsequent smooth switching can be performed on the definition of the videos according to the key frames after the timestamps are aligned.

In order to achieve the above object, an embodiment of the present application provides a video transcoding method, including: acquiring the playing time of each frame of the transcoded source stream video according to the output frame rate of the transcoded source stream video; screening out video frames which do not meet the requirement of monotonously increasing decoding time and discarding the video frames according to the playing time and the timestamp information of each video frame in the source stream video; and transcoding the source stream video according to the output frame rate, updating the time stamp of each key frame into the initial time stamp of each key frame in the source stream video, and updating the time stamps of the rest non-key frames according to the time stamp of each key frame.

In order to achieve the above object, an embodiment of the present application further provides a video transcoding apparatus, including: the acquisition module is used for acquiring the playing time of each frame of the transcoded source stream video according to the output frame rate of the transcoded source stream video; the screening module is used for screening out video frames which do not meet the requirement of monotonous increasing of decoding time and discarding the video frames according to the playing time and the timestamp information of each video frame in the source stream video; and the transcoding module is used for transcoding the source stream video according to the output frame rate, updating the time stamp of each key frame into the initial time stamp of each key frame in the source stream video, and updating the time stamps of the rest non-key frames according to the time stamp of each key frame.

In order to achieve the above object, an embodiment of the present application further provides an electronic device, where the electronic device includes: at least one processor; and a memory communicatively coupled to the at least one processor; wherein the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the video transcoding method as described above.

To achieve the above object, an embodiment of the present application further provides a computer-readable storage medium storing a computer program, which when executed by a processor implements the video transcoding method as described above.

According to the video transcoding method provided by the embodiment of the application, before a source stream video is transcoded, the playing time of each frame after transcoding is determined according to the output frame rate after transcoding, and video frames which do not meet the requirement of monotonically increasing decoding time are screened out according to the playing time and the timestamp information of each video frame of the source stream video, and are discarded. Discarding part of the video frames avoids the situation in which, after the timestamps of the key frames in the transcoded video are updated, the transcoded video no longer meets the condition that the decoding time of the video frames increases monotonically, so that the source stream video can be smoothly transcoded, output and played. After the source stream video is transcoded according to the output frame rate, the key frame timestamps in the transcoded video are updated to the initial timestamps of the key frames in the source stream video; resetting the key frame timestamps in this way ensures strong consistency between the key frame timestamps of the transcoded video and those of the source stream video. The timestamps of the remaining non-key frames are then updated according to the timestamps of the key frames, so that the moments corresponding to the video frame timestamps in the transcoded video are uniformly distributed on the time axis in time order, thereby ensuring the fluency of the transcoded video during playing and realizing smooth switching of the definition.

Drawings

One or more embodiments are illustrated by the corresponding figures in the drawings, which are not meant to be limiting.

Fig. 1 is a flowchart of a video transcoding method in an embodiment of the present application;

Fig. 2 is a flowchart of a video frame screening method in an embodiment of the present application;

Fig. 3 is a flowchart of a video frame dropping method in an embodiment of the present application;

Fig. 4 is a flowchart of a timestamp updating method in an embodiment of the present application;

Fig. 5 is a schematic structural diagram of a video transcoding device in another embodiment of the present application;

Fig. 6 is a schematic structural diagram of an electronic device in another embodiment of the present application.

Detailed Description

It can be known from the background art that, after the frame rate conversion is performed on the video stream according to the target output frame rate, the video frame corresponding to the time of performing the frame interpolation processing in the transcoded video stream and the video frame corresponding to the original video at this time are likely not to be the same video frame, and at this time, after the playback is resumed, an error exists between the playback progress of the transcoded video and the playback progress before the transcoding, so that the user experience is poor. Therefore, how to ensure that the time corresponding to the timestamps of the video key frames before and after transcoding is consistent, and how to recover the video playing from the same playing progress of the video after the definition switching is a problem which needs to be solved urgently.

In order to solve the above problem, an embodiment of the present application provides a video transcoding method, including: acquiring the playing time of each frame of the transcoded source stream video according to the output frame rate of the transcoded source stream video; screening out video frames which do not meet the requirement of monotonically increasing decoding time and discarding them, according to the playing time and the timestamp information of each video frame in the source stream video; and transcoding the source stream video according to the output frame rate, updating the timestamp of each key frame to the initial timestamp of that key frame in the source stream video, and updating the timestamps of the remaining non-key frames according to the timestamp of each key frame.

To make the objects, technical solutions and advantages of the embodiments of the present application clearer, the embodiments of the present application will be described in detail below with reference to the accompanying drawings. However, it will be appreciated by those of ordinary skill in the art that in the examples of the present application, numerous technical details are set forth in order to provide a better understanding of the present application. However, the technical solution claimed in the present application can be implemented without these technical details and various changes and modifications based on the following embodiments. The following embodiments are divided for convenience of description, and should not constitute any limitation to the specific implementation manner of the present application, and the embodiments may be mutually incorporated and referred to without contradiction.

The following description will be made in conjunction with some exemplary embodiments for implementation details of the video transcoding method described in the present application, and the following description is provided only for the convenience of understanding and is not necessary for implementing the present solution.

A first aspect of the embodiments of the present application provides a video transcoding method, the flow of which is shown in Fig. 1. In some embodiments, the video transcoding method is applied to electronic devices that provide videos with multiple definitions, such as a terminal of a cloud service manufacturer or a live broadcast server, and specifically includes the following steps:

Step 101, acquiring the playing time of each frame of the transcoded source stream video according to the output frame rate of the transcoded source stream video.

Specifically, when providing a video service for a user, a terminal device of a cloud service manufacturer receives the user's video playing request and determines the definition of the video to be provided according to the definition information in the request. It then determines the output frame rate of the transcoded source stream video according to that definition, and obtains the playing time of each frame of the transcoded source stream video from the output frame rate. Obtaining the output frame rate corresponding to the designated definition and determining the playing time of each frame gives the playing speed of the transcoded source stream video accurately, which facilitates the subsequent screening of video frames of the source stream video that do not meet the conditions.
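For ease of understanding only, the calculation in step 101 can be sketched as follows. The per-frame playing time is taken here as the reciprocal of the output frame rate, expressed in milliseconds. The mapping `OUTPUT_FPS_BY_DEFINITION`, its values, and the function name `frame_play_time_ms` are assumptions of this sketch and are not specified by the embodiment.

```python
from fractions import Fraction

# Hypothetical mapping from definition to output frame rate; the values are
# illustrative assumptions, not taken from the embodiment.
OUTPUT_FPS_BY_DEFINITION = {
    "sd": Fraction(25),
    "hd": Fraction(30),
    "fhd": Fraction(60),
}


def frame_play_time_ms(output_fps: Fraction) -> Fraction:
    """Playing time t of one transcoded frame, in milliseconds."""
    if output_fps <= 0:
        raise ValueError("output frame rate must be positive")
    return Fraction(1000) / output_fps


# A 25 fps output gives each transcoded frame a playing time of 40 ms.
t = frame_play_time_ms(OUTPUT_FPS_BY_DEFINITION["sd"])
print(float(t))  # 40.0
```

Exact fractions are used so that non-integer frame rates such as 30000/1001 do not accumulate rounding error when t is later multiplied by a frame count.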

Step 102, screening out video frames which do not meet the requirement of monotonically increasing decoding time and discarding them, according to the playing time and the timestamp information of each video frame in the source stream video.

Specifically, when a source stream video is generated, the terminal device of the cloud service manufacturer stores the timestamp information of each video frame in the source stream video in advance. After the output frame rate of the transcoded source stream video and the playing time of each video frame are determined, the video frames in the source stream video are screened according to the playing time and the timestamp information of each video frame in the source stream video. Video frames in the source stream video which do not meet the requirement of monotonically increasing decoding time are screened out and discarded. Selectively discarding such video frames avoids the problem that, after the key frame timestamps are updated, the decoding time of the video frames in the transcoded source stream video no longer increases monotonically and the transcoded video stutters during playing, and thus ensures the fluency of the video after definition switching.

In one example, the screening out and discarding, by the terminal device of the cloud service manufacturer, of video frames that do not satisfy a monotonically increasing decoding time according to the playing time and the timestamp information of each video frame in the source stream video includes: taking the decoding timestamp of a reference video frame in the source stream video as a reference timestamp; acquiring the frame number difference between the current video frame and the reference video frame, and the time interval between the decoding timestamp of the current video frame and the reference timestamp; detecting whether the time interval is smaller than the product of the frame number difference and the playing time; and, in the case that the time interval is smaller than the product of the frame number difference and the playing time, discarding the current video frame.

Specifically, the terminal device of the cloud service manufacturer may obtain and store the timestamp information of each video frame of the source stream video in advance, before transcoding the source stream video. For example, the timestamp information of all video frames of the source stream video is stored in queue A, and the timestamp information of each key frame is stored in queue B. Because the output frame rate of the source stream video is converted during transcoding, video frames in the source stream video need to be selectively discarded on the principle that the decoding time increases monotonically with the encoding order. When the terminal device performs video frame screening, it first calculates the playing time t of each frame of the transcoded source stream video according to the output frame rate of the transcoded source stream video. Then, the first video frame in the source stream video can be used as the reference video frame and the remaining video frames as the video frames to be screened, and screening is performed sequentially in the encoding order of the video frames. Fig. 2 shows the process of video frame screening performed on a source stream video by the terminal device of the cloud service manufacturer, including:

Step 201, reading the timestamp information of the reference video frame and obtaining the decoding timestamp of the reference video frame.

Specifically, the terminal device of the cloud service manufacturer reads the pre-stored timestamp information of the source stream video and selects one key frame from the video frames of the source stream video as the reference video frame. Then, the decoding timestamp of the reference video frame is used as the reference timestamp, and the time corresponding to the reference timestamp is recorded as firstdts.

Step 202, detecting whether the decoding time difference between the current video frame and the reference video frame meets a preset condition, and entering step 203 to discard the current video frame under the condition that the decoding time difference meets the preset condition; and in the case that the decoding time difference does not meet the preset condition, the step 204 is entered, the current video frame is reserved, and the current video frame is encoded.

Specifically, the terminal device of the cloud service manufacturer reads from queue A the timestamp information of the current video frame being screened, and records the decoding timestamp of the current video frame as currentdts. Then, the frame number difference framenum between the current video frame and the reference video frame is determined according to the encoding sequence of the current video frame and the reference video frame. Using the pre-acquired playing time t of each frame of the transcoded source stream video, the following formula is taken as the preset condition for detecting whether the current video frame meets the requirement that the decoding time increases monotonically:

currentdts - firstdts < t × framenum

That is, the decoding time difference between the current video frame and the reference video frame is obtained, and it is detected whether this difference is smaller than the product of the frame number difference between the two frames and the playing time t of each frame after transcoding. If the decoding time difference satisfies the formula, it is determined that the decoding time of the current video frame after transcoding would be later than its decoding time in the source stream video, which would delay the decoding time of each subsequent video frame; that is, after the timestamp information of the key frames is updated, the requirement that the decoding time increases monotonically could not be met. The current video frame therefore needs to be discarded, and step 203 is entered to discard it. If the decoding time difference does not satisfy the formula, it is determined that the current video frame still meets the requirement of monotonically increasing decoding time after the timestamp information of the key frames is updated, and step 204 is entered, in which the current video frame is retained and encoded. Following the principle that the time corresponding to the decoding timestamp increases monotonically with the encoding order, the playing time of each frame after transcoding and the timestamp information of each video frame are used to screen out and discard the video frames which would delay the decoding time of other video frames after transcoding, so that the decoding time of each video frame in the transcoded source stream video increases monotonically with the encoding order and the transcoded video can be played smoothly.
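As an illustration only, the preset condition of step 202 can be written as a small predicate. The function name `violates_monotonic_dts` and the use of milliseconds are assumptions of this sketch; the comparison itself is the formula given above.

```python
def violates_monotonic_dts(current_dts: float, first_dts: float,
                           frame_num: int, t: float) -> bool:
    """Preset condition of step 202: currentdts - firstdts < t * framenum.

    Returns True when the current frame should go to step 203 (discard),
    and False when it should go to step 204 (retain and encode).
    """
    return (current_dts - first_dts) < t * frame_num


# With t = 40 ms (a 25 fps output) and a frame number difference of 1:
print(violates_monotonic_dts(20, 0, 1, 40.0))  # True: 20 < 40
print(violates_monotonic_dts(40, 0, 1, 40.0))  # False: 40 is not less than 40
```

A frame whose decoding timestamp has not yet advanced by framenum output frame periods arrives too early for its slot at the output frame rate, which is why it is a candidate for discarding.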

Further, before discarding the current video frame, the method performed by the terminal device of the cloud service manufacturer further includes: detecting whether the current video frame is a key frame in the source stream video; in the case that the current video frame is a key frame in the source stream video, retaining the current video frame; and, in the case that the current video frame is not a key frame in the source stream video, discarding the current video frame. To ensure that the number, position and timestamp information of the key frames in the transcoded source stream video are consistent with those before transcoding, a video frame that is found to need discarding during screening must be checked further before it is discarded. The flow of discarding the current video frame is shown in Fig. 3, and includes:

Step 301, detecting whether the current video frame is a key frame in the source stream video. In the case that the current video frame is a key frame, it is determined that the current video frame cannot be discarded, and step 302 is entered to retain and encode the current video frame; in the case that the current video frame is not a key frame in the source stream video, it is determined that the current video frame can be discarded, and step 303 is entered to discard the current video frame. Screening then continues with the subsequent video frames. Detecting whether the video frame to be discarded is a key frame prevents key frames in the source stream video from being discarded by mistake, which would cause the transcoded source stream video to lose key frames and be unable to start playing at the positions of the discarded key frames.
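The screening of Fig. 2 combined with the key frame check of Fig. 3 can be sketched as a single loop, for ease of understanding only. The embodiment does not state precisely how framenum is counted; this sketch assumes that it is the number of frames already retained since the reference frame, that is, the output slot the current frame would occupy. The `Frame` type and the function name `screen_frames` are likewise hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Frame:
    dts: int       # decoding timestamp in the source stream, in ms
    is_key: bool   # True for a key frame


def screen_frames(frames, t):
    """Return the frames retained for encoding.

    ASSUMPTION: framenum is the count of frames retained since the reference
    frame; the embodiment leaves the exact counting rule open.
    """
    if not frames:
        return []
    reference = frames[0]          # first frame used as the reference frame
    kept = [reference]
    for frame in frames[1:]:
        frame_num = len(kept)      # assumed frame number difference
        too_early = (frame.dts - reference.dts) < t * frame_num
        if too_early and not frame.is_key:
            continue               # step 303: discard a non-key frame
        kept.append(frame)         # step 302 / step 204: retain and encode
    return kept


# 50 fps source (20 ms spacing) screened for a 25 fps output (t = 40 ms).
source = [Frame(0, True), Frame(20, False), Frame(40, False),
          Frame(60, True), Frame(80, False), Frame(100, False)]
print([f.dts for f in screen_frames(source, 40)])  # [0, 40, 60]
```

In this example the key frame at 60 ms satisfies the discard condition but is retained, so the number and position of key frames match the source stream.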

In another example, after detecting whether the decoding time interval between the current video frame and the reference video frame is less than the product of the frame number difference and the playing time, the method performed by the terminal device of the cloud service manufacturer further includes: detecting whether the time interval is greater than a first preset duration; and, in the case that the time interval is greater than the first preset duration, updating the reference video frame to the current video frame. During screening of the video frames of the source stream video, the reference video frame can be updated periodically as screening progresses, to avoid the increase in calculation amount caused by an excessively large frame number difference between the current video frame and the reference video frame. After the detection against the product of the frame number difference and the playing time, the time interval is compared with the first preset duration. If the time interval is greater than the first preset duration, it is determined that the frame number difference between the reference video frame and the current video frame is large and that the reference video frame needs to be updated: the reference video frame is updated to the current video frame, the reference timestamp is updated to the decoding timestamp of the current video frame, and screening then continues with the subsequent video frames. Updating the reference video frame according to the relation between the first preset duration and the time interval avoids the increase in calculation amount caused by an excessively large frame number difference, which improves the efficiency of video frame screening and, in turn, of video transcoding.
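A minimal sketch of the reference update follows, for illustration only. The `ScreeningState` type, the function name, the 10-second value used for the first preset duration, and the resetting of the frame counter are all assumptions of this sketch; the embodiment only states that the reference video frame and the reference timestamp are replaced by those of the current video frame.

```python
from dataclasses import dataclass


@dataclass
class ScreeningState:
    reference_dts: int               # reference timestamp (firstdts), in ms
    frames_since_reference: int = 0  # frame number difference counter


def maybe_update_reference(state: ScreeningState, current_dts: int,
                           first_preset_ms: int) -> bool:
    """Move the reference to the current frame when the DTS gap is too large.

    Returns True when the reference was updated.
    """
    if current_dts - state.reference_dts > first_preset_ms:
        state.reference_dts = current_dts
        state.frames_since_reference = 0  # assumption: the count restarts
        return True
    return False


state = ScreeningState(reference_dts=0, frames_since_reference=250)
print(maybe_update_reference(state, 9_000, 10_000))   # False
print(maybe_update_reference(state, 10_040, 10_000))  # True
print(state.reference_dts)                            # 10040
```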

Step 103, transcoding the source stream video according to the output frame rate, updating the timestamp of each key frame to the initial timestamp of that key frame in the source stream video, and updating the timestamps of the remaining non-key frames according to the timestamps of the key frames.

Specifically, after the video frames of the source stream video are screened, the terminal device of the cloud service manufacturer encodes each retained video frame through an encoder according to the output frame rate after transcoding, and each video frame is marked with a corresponding timestamp. After encoding is finished, the timestamp of each key frame of the transcoded source stream video is updated to the initial timestamp of that key frame in the source stream video, and the timestamps of the remaining transcoded non-key frames are updated according to the updated key frame timestamps. Updating each key frame timestamp to its initial timestamp keeps the key frame timestamps unchanged by transcoding, so that the key frame timestamp information of the transcoded video is consistent with that of the source stream video before transcoding. Updating the non-key frame timestamps according to the updated key frame timestamps allows the moments corresponding to the timestamps of the video frames of the transcoded source stream video to be uniformly distributed, so that the transcoded video can be played smoothly.
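The key frame part of step 103 can be sketched as follows, as an aid to understanding only. The sketch assumes that queue B holds the initial timestamps of the key frames in order and that encoded key frames arrive in the same order, which relies on key frames never being discarded during screening. The `Stamp` type and the function name `restore_key_frame` are hypothetical.

```python
from collections import deque
from dataclasses import dataclass


@dataclass(frozen=True)
class Stamp:
    dts: int  # decoding timestamp, in ms
    pts: int  # display timestamp, in ms


def restore_key_frame(encoded: Stamp, queue_b: deque):
    """Reset an encoded key frame to its initial timestamp from queue B.

    Returns (restored stamp, time error), where the time error is the
    initial timestamp minus the encoder-assigned timestamp.
    ASSUMPTION: queue B is consumed one entry per encoded key frame, in order.
    """
    initial = queue_b.popleft()
    error = Stamp(initial.dts - encoded.dts, initial.pts - encoded.pts)
    return initial, error


queue_b = deque([Stamp(0, 80), Stamp(2016, 2096)])
restore_key_frame(Stamp(0, 80), queue_b)  # first key frame, no drift
restored, error = restore_key_frame(Stamp(2000, 2080), queue_b)
print(restored)  # Stamp(dts=2016, pts=2096)
print(error)     # Stamp(dts=16, pts=16)
```

The recorded time error is what the non-key frames associated with this key frame are later shifted by.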

Further, the updated timestamps include a decoding timestamp and a display timestamp. When the terminal device of the cloud service manufacturer updates the timestamps of the video frames, it updates both the decoding timestamp and the display timestamp of each video frame. This preserves the decoding order of the video frames and avoids video frames being played out of order because of inaccurate display timestamps, further ensuring smooth video playing after definition switching.

In one example, the updating, by the terminal device of the cloud service manufacturer, of the timestamps of the remaining video frames according to the timestamp of each key frame includes: for each non-key frame, obtaining the closest target key frame; and acquiring the time error between the transcoded timestamp of the target key frame and its initial timestamp, and updating the transcoded timestamp of the non-key frame according to the time error. After encoding is completed by the encoder, each video frame is marked with a new timestamp. When updating the timestamp of each key frame, the terminal device can store the transcoded timestamp information of that key frame, so that the timestamps of the non-key frames can be updated after the key frame timestamps have been updated. To ensure that the moments corresponding to the timestamps of the video frames of the transcoded source stream video are uniformly distributed, the timestamp of each non-key frame is shifted by the same amount as the timestamp of its target key frame. That is, for each non-key frame, the target key frame closest to the non-key frame is determined, the time error between the moment corresponding to the transcoded timestamp of the target key frame and the moment corresponding to its initial timestamp is acquired, and the transcoded timestamp of the non-key frame is updated according to this time error. In this way, the timestamps of the video frames of the transcoded source stream video are distributed on the time axis as uniformly as possible, which ensures fluency when the transcoded source stream video is played.
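One possible way to find the closest target key frame is sketched below, for illustration only. The embodiment does not define how closeness is measured; this sketch assumes it is the smallest distance between encoder-assigned decoding timestamps, with ties resolved in favour of the earlier key frame. The function name `nearest_key_index` is hypothetical.

```python
from bisect import bisect_left


def nearest_key_index(key_dts, frame_dts):
    """Index of the key frame whose encoder DTS is closest to frame_dts.

    key_dts must be sorted in ascending order.
    ASSUMPTION: closeness is measured on encoder-assigned decoding
    timestamps, and a tie goes to the earlier key frame.
    """
    if not key_dts:
        raise ValueError("at least one key frame is required")
    i = bisect_left(key_dts, frame_dts)
    if i == 0:
        return 0
    if i == len(key_dts):
        return len(key_dts) - 1
    before, after = key_dts[i - 1], key_dts[i]
    return i - 1 if frame_dts - before <= after - frame_dts else i


key_dts = [0, 2000, 4000]
print(nearest_key_index(key_dts, 900))   # 0
print(nearest_key_index(key_dts, 1100))  # 1
```

If the frames are processed as a stream, as in the flow of Fig. 4, an implementation might instead use the most recently processed key frame, whose time error has already been recorded.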

Further, the updating, by the terminal device of the cloud service manufacturer, of the transcoded timestamp of each non-key frame according to the time error includes: acquiring a first moment corresponding to the transcoded timestamp of the non-key frame; generating a target timestamp according to the first moment and the time error, and updating the transcoded timestamp of the non-key frame to the target timestamp; wherein a second moment corresponding to the target timestamp is later than the first moment, and the time interval between the first moment and the second moment is the time error. For example, for the pre-stored target key frame, the moment corresponding to the transcoded decoding timestamp is dtsencoder, the moment corresponding to the transcoded display timestamp is ptsencoder, the moment corresponding to the initial decoding timestamp is dtsinitial, and the moment corresponding to the initial display timestamp is ptsinitial. The time error then consists of a decoding time error dtsoffset = dtsinitial - dtsencoder and a display time error ptsoffset = ptsinitial - ptsencoder. The first decoding moment corresponding to the decoding timestamp of the current non-key frame is dtsout, and the first display moment corresponding to its display timestamp is ptsout. The second decoding moment corresponding to the target decoding timestamp is dtstarget = dtsout + dtsoffset, and the second display moment corresponding to the target display timestamp is ptstarget = ptsout + ptsoffset. The decoding timestamp and the display timestamp of the non-key frame are updated to the target timestamps whose moments are dtstarget and ptstarget, respectively. Because the time error of the target key frame is used to update the non-key frame, the shift between the moment of the target timestamp and the moment of the current timestamp equals the time error of the target key frame timestamp, so that the moments corresponding to the timestamps of the video frames of the transcoded source stream video are uniformly distributed on the time axis.
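The offset calculation in this example can be checked with a short sketch. The function name `shift_non_key_frame` and the numeric values are illustrative assumptions; the arithmetic follows the relations dtsoffset = dtsinitial - dtsencoder, ptsoffset = ptsinitial - ptsencoder, dtstarget = dtsout + dtsoffset and ptstarget = ptsout + ptsoffset.

```python
def shift_non_key_frame(dts_out, pts_out,
                        dts_initial, dts_encoder,
                        pts_initial, pts_encoder):
    """Shift a non-key frame by the time error of its target key frame.

    Returns (dts_target, pts_target).
    """
    dts_offset = dts_initial - dts_encoder  # decoding time error
    pts_offset = pts_initial - pts_encoder  # display time error
    return dts_out + dts_offset, pts_out + pts_offset


# Target key frame: encoder stamped it 16 ms earlier than its initial timestamp.
dts_target, pts_target = shift_non_key_frame(
    dts_out=1040, pts_out=1120,
    dts_initial=1016, dts_encoder=1000,
    pts_initial=1096, pts_encoder=1080,
)
print(dts_target, pts_target)  # 1056 1136
```

Shifting the decoding and display timestamps by the same per-key-frame error keeps the spacing between the non-key frame and its target key frame exactly as the encoder produced it.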

Further, after updating the transcoded timestamp of the non-key frame to the target timestamp, the terminal device of the cloud service vendor further performs the following: acquiring a third time corresponding to the decoding timestamp of the next key frame after the non-key frame; detecting whether the second time is later than the third time; and, if the second time is later than the third time, updating the target timestamp by adjusting the time corresponding to the target timestamp forward (that is, earlier) by a second preset duration. After the timestamp of each non-key frame is updated, the time corresponding to the decoding timestamp of a non-key frame may be later than the time corresponding to the decoding timestamp of the next key frame. To ensure that the decoding time of each video frame of the transcoded source stream video increases monotonically in encoding order, the decoding time of each non-key frame therefore needs to be checked further. Specifically, the second time corresponding to the decoding timestamp of the current non-key frame and the third time corresponding to the decoding timestamp of the next key frame after it are acquired, and whether the second time is later than the third time is detected. If so, the target timestamp of the current non-key frame is updated according to a second preset duration T: the time corresponding to the updated target decoding timestamp is dts_new = dts_target - T, and the time corresponding to the updated target display timestamp is pts_new = pts_target - T, where dts_target and pts_target denote the times corresponding to the target decoding timestamp and the target display timestamp before this adjustment. The second preset duration may be set according to actual needs, for example to 16 ms, which is not limited in this embodiment.
Non-key frames whose target timestamp corresponds to a time later than the time corresponding to the updated timestamp of the next key frame are adjusted according to the second preset duration, so that the times corresponding to the timestamps of the video frames in the transcoded source stream video increase monotonically in encoding order, which ensures the fluency of the transcoded source stream video.
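The check against the next key frame can be sketched as follows. This is a minimal illustration, not the applicant's implementation: the function name `adjust_before_next_key_frame`, the millisecond units, and the sample values are assumptions, and the adjustment is applied once, as the text describes, without any assumption about whether it is repeated.

```python
def adjust_before_next_key_frame(target_dts, target_pts, next_key_dts, preset=16):
    """Move a non-key frame's target timestamps earlier by `preset`
    when its decoding time is later than that of the next key frame."""
    if target_dts > next_key_dts:          # second time later than third time
        return target_dts - preset, target_pts - preset
    return target_dts, target_pts          # already in order, leave unchanged


# Hypothetical values in milliseconds; T = 16 ms as in the example above.
late = adjust_before_next_key_frame(1010, 1050, next_key_dts=1000)
in_order = adjust_before_next_key_frame(980, 1020, next_key_dts=1000)

print(late)      # (994, 1034)
print(in_order)  # (980, 1020)
```

Both the decoding timestamp and the display timestamp are moved by the same amount, so the offset between them for that frame is unchanged.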

That is, as shown in Fig. 4, the flow of updating the timestamp of an encoded video frame includes:

Step 401, detecting whether the current video frame is a key frame. If the current video frame is a key frame, the flow proceeds to step 402 and the timestamp of the current video frame is updated to the initial timestamp; if the current video frame is not a key frame, the flow proceeds to step 403 and the timestamp of the current video frame is updated according to the time error of the timestamp of the target key frame.

Step 402, update the current video frame timestamp to the initial timestamp.

Specifically, the cloud service acquires an initial timestamp of the current key frame in the source stream video frame, updates the timestamp of the current key frame to the initial timestamp, and records a time error of a time corresponding to the current key frame timestamp.

Step 403, updating the timestamp of the current video frame according to the time error of the timestamp of the target key frame.

Specifically, when the cloud service detects that the current video frame is a non-key frame, it acquires the timestamp information of the closest target key frame and, according to the time error between the times corresponding to the target key frame's timestamp before and after the update, moves the time corresponding to the timestamp of the current video frame later by that amount to obtain the updated timestamp of the current video frame.

Step 404, detecting whether the second time corresponding to the decoding timestamp of the current video frame is later than the third time corresponding to the decoding timestamp of the next key frame. If the second time is later than the third time, the flow proceeds to step 405 and the timestamp of the current video frame is updated according to the second preset duration; if the second time is not later than the third time, the flow proceeds to step 406 and the transcoded video is output.

In addition, in this embodiment, the first time may be the time corresponding to the transcoded timestamp of any non-key frame; the corresponding second time may be the time obtained by moving the first time later by the time error; and the third time is the time corresponding to the transcoded decoding timestamp of the next key frame after that non-key frame.
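The flow of steps 401 to 406 can be sketched end to end as follows. This is an illustrative sketch under several assumptions that the text does not fix: the `Frame` type and its field names are hypothetical, the "closest target key frame" is taken to be the most recent preceding key frame in encoding order, a single time error derived from the decoding timestamp is applied to both timestamps, and the sample values are contrived so that step 405 is exercised.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Frame:
    is_key: bool
    dts: int                          # transcoded decoding timestamp (ms)
    pts: int                          # transcoded display timestamp (ms)
    source_dts: Optional[int] = None  # initial timestamps from the source
    source_pts: Optional[int] = None  # stream, known for key frames


def update_timestamps(frames: List[Frame], preset: int = 16) -> List[Frame]:
    out = []
    error = 0  # time error of the most recent key frame (assumed target)
    for i, f in enumerate(frames):
        if f.is_key:
            # Steps 401/402: restore the initial timestamp, record the error.
            error = f.source_dts - f.dts
            out.append(Frame(True, f.source_dts, f.source_pts,
                             f.source_dts, f.source_pts))
            continue
        # Step 403: shift the non-key frame by the key frame's time error.
        dts, pts = f.dts + error, f.pts + error
        # Step 404: compare with the next key frame's decoding timestamp.
        next_key = next((g for g in frames[i + 1:] if g.is_key), None)
        if next_key is not None and dts > next_key.source_dts:
            # Step 405: move earlier by the second preset duration.
            dts, pts = dts - preset, pts - preset
        out.append(Frame(False, dts, pts))
    return out  # Step 406: frames ready for output


frames = [
    Frame(True, 0, 0, 5, 5),
    Frame(False, 33, 33),
    Frame(False, 66, 66),
    Frame(True, 100, 100, 70, 70),
]
updated = update_timestamps(frames)
print([f.dts for f in updated])  # [5, 38, 55, 70]
```

In this contrived input the second non-key frame would land at 71 ms, after the next key frame at 70 ms, so it is moved earlier by 16 ms and the decoding timestamps stay monotonically increasing.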

In addition, it should be understood that the steps of the above methods are divided only for clarity of description. In implementation, several steps may be combined into one step, or one step may be split into multiple steps; as long as the same logical relationship is included, such variations fall within the protection scope of this patent. Adding insignificant modifications to an algorithm or process, or introducing insignificant design changes, without changing the core design of the algorithm or process also falls within the protection scope of this patent.

Another aspect of the embodiments of the present application relates to a video transcoding apparatus which, referring to Fig. 5, includes:

The obtaining module 501 is configured to obtain the playing time of each frame of the transcoded source stream video according to the output frame rate of the source stream video after transcoding.

The filtering module 502 is configured to filter out and discard video frames that do not satisfy the monotonically increasing decoding time according to the playing time and the timestamp information of each video frame in the source stream video.

The transcoding module 503 is configured to transcode the source stream video according to the output frame rate, update the timestamp of each key frame to the initial timestamp of each key frame in the source stream video, and update the timestamps of the remaining non-key frames according to the timestamp of each key frame.
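As one way to picture the filtering module 502, the sketch below follows the screening rule set out in claims 2 to 4. It is a hedged illustration only: the function name `screen_frames`, the 90 kHz tick units, and the sample values are assumptions, and the "frame number difference" is interpreted here as the position the current frame would take among the frames kept since the reference frame, which the text in this section does not spell out.

```python
def screen_frames(frames, frame_duration, first_preset):
    """frames: list of (dts, is_key) tuples in decoding order.
    Returns the frames that are kept for transcoding."""
    kept = []
    ref_dts = None   # decoding timestamp of the reference video frame
    count = 0        # frames kept since the reference frame (assumption)
    for dts, is_key in frames:
        if ref_dts is None:
            ref_dts, count = dts, 0
            kept.append((dts, is_key))
            continue
        interval = dts - ref_dts
        n = count + 1  # assumed frame number difference for this frame
        # Discard frames that arrive too early, unless they are key frames.
        if interval < n * frame_duration and not is_key:
            continue
        kept.append((dts, is_key))
        count = n
        # Re-base the reference once the interval exceeds the preset length.
        if interval > first_preset:
            ref_dts, count = dts, 0
    return kept


# Hypothetical 60 fps source (1500 ticks per frame at 90 kHz)
# screened for a 30 fps output (3000 ticks per frame).
source = [(0, True), (1500, False), (3000, False),
          (4500, False), (6000, False), (7500, False)]
kept = screen_frames(source, frame_duration=3000, first_preset=90000)
print([dts for dts, _ in kept])  # [0, 3000, 6000]
```

Integer ticks are used so that the per-frame playing duration is exact; any other time base would work the same way.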

It should be understood that the present embodiment is an apparatus embodiment corresponding to the method embodiment, and the present embodiment can be implemented in cooperation with the method embodiment. The related technical details mentioned in the method embodiment remain valid in this embodiment and, to reduce repetition, are not described again here. Accordingly, the related technical details mentioned in this embodiment can also be applied in the method embodiment.

It should be noted that, all the modules involved in this embodiment are logic modules, and in practical application, one logic unit may be one physical unit, may also be a part of one physical unit, and may also be implemented by a combination of multiple physical units. In addition, in order to highlight the innovative part of the present application, a unit that is not so closely related to solving the technical problem proposed by the present application is not introduced in the present embodiment, but this does not indicate that there is no other unit in the present embodiment.

Another aspect of the embodiments of the present application further provides an electronic device which, referring to Fig. 6, includes: at least one processor 601; and a memory 602 communicatively coupled to the at least one processor 601. The memory 602 stores instructions executable by the at least one processor 601, and the instructions are executed by the at least one processor 601 to enable the at least one processor 601 to execute the video transcoding method described in any of the above method embodiments.

Where the memory 602 and the processor 601 are coupled by a bus, the bus may comprise any number of interconnected buses and bridges that couple one or more of the various circuits of the processor 601 and the memory 602 together. The bus may also connect various other circuits such as peripherals, voltage regulators, power management circuits, and the like, which are well known in the art, and therefore, will not be described any further herein. A bus interface provides an interface between the bus and the transceiver. The transceiver may be one element or a plurality of elements, such as a plurality of receivers and transmitters, providing a means for communicating with various other apparatus over a transmission medium. The data processed by the processor 601 is transmitted over a wireless medium via an antenna, which further receives the data and transmits the data to the processor 601.

The processor 601 is responsible for managing the bus and general processing, and may also provide various functions including timing, peripheral interfaces, voltage regulation, power management, and other control functions. The memory 602 may be used to store data used by the processor 601 in performing operations.

Another aspect of the embodiments of the present application also provides a computer-readable storage medium storing a computer program. The computer program realizes the above-described method embodiments when executed by a processor.

That is, as can be understood by those skilled in the art, all or part of the steps of the methods in the embodiments described above may be implemented by a program instructing related hardware, where the program is stored in a storage medium and includes several instructions to enable a device (which may be a single-chip microcomputer, a chip, or the like) or a processor to execute all or part of the steps of the methods described in the embodiments of the present application. The aforementioned storage medium includes: a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, an optical disk, and various other media capable of storing program code.

It will be understood by those of ordinary skill in the art that the foregoing embodiments are specific examples for carrying out the present application, and that, in practice, various changes in form and details may be made therein without departing from the spirit and scope of the present application.

Claims

1. A method of video transcoding, comprising:

acquiring the playing time of each frame of the transcoded source stream video according to the output frame rate of the transcoded source stream video;

screening out video frames which do not meet the requirement of monotonously increasing decoding time according to the playing time and the timestamp information of each video frame in the source stream video and discarding the video frames;

transcoding the source stream video according to the output frame rate, updating the time stamp of each key frame to the initial time stamp of each key frame in the source stream video, and updating the time stamps of the rest non-key frames according to the time stamp of each key frame.

2. The method of claim 1, wherein the screening out and discarding of video frames that do not satisfy monotonically increasing decoding time according to the playing time and the timestamp information of each video frame in the source stream video comprises:

taking a decoding time stamp of a reference video frame in the source stream video as a reference time stamp;

acquiring a frame number difference between a current video frame and the reference video frame and a time interval between a decoding time stamp of the current video frame and the reference time stamp;

detecting whether the time interval is smaller than the product of the frame number difference and the playing time length;

and under the condition that the time interval is smaller than the product of the frame number difference and the playing time length, discarding the current video frame.

3. The method of video transcoding of claim 2, further comprising, prior to said dropping said current video frame:

detecting whether the current video frame is a key frame in the source stream video;

if the current video frame is a key frame in the source stream video, retaining the current video frame;

discarding the current video frame if the current video frame is not a key frame in the source stream video.

4. The method of claim 2, wherein after the detecting whether the time interval is smaller than the product of the frame number difference and the playing duration, the method further comprises:

detecting whether the time interval is greater than a first preset time length or not;

and updating the reference video frame to the current video frame when the time interval is greater than the first preset time length.

5. The method of claim 1, wherein updating the timestamps of the remaining non-key frames according to the timestamp of each of the key frames comprises:

for each non-key frame, obtaining the closest target key frame;

and acquiring a time error between the time stamp of the transcoded target key frame and the initial time stamp, and updating the time stamp of the transcoded non-key frame according to the time error.

6. A video transcoding method as claimed in claim 5 wherein said updating the transcoded timestamps of said non-key frames in accordance with said temporal error comprises:

acquiring a first moment corresponding to the time stamp after transcoding the non-key frame;

generating a target timestamp according to the first moment and the time error, and updating the timestamp after transcoding the non-key frame into the target timestamp;

and the second time corresponding to the target timestamp is later than the first time, and the time interval between the first time and the second time is the time error.

7. The method of video transcoding of claim 6, wherein after updating the timestamp of the transcoded non-key frames to the target timestamp, further comprising:

acquiring a third moment corresponding to a next key frame decoding time stamp after the non-key frame;

detecting whether the second time is later than the third time;

and under the condition that the second time is later than the third time, updating the target timestamp, and adjusting the time corresponding to the target timestamp forwards by a second preset time.

8. The video transcoding method of any of claims 1 to 7, wherein the time stamps comprise decoding time stamps and display time stamps.

9. A video transcoding apparatus, comprising:

the acquisition module is used for acquiring the playing time of each frame of the transcoded source stream video according to the output frame rate of the transcoded source stream video;

the screening module is used for screening out video frames which do not meet the requirement of monotonously increasing decoding time and discarding the video frames according to the playing time and the timestamp information of each video frame in the source stream video;

and the transcoding module is used for transcoding the source stream video according to the output frame rate, updating the time stamp of each key frame into the initial time stamp of each key frame in the source stream video, and updating the time stamps of the rest non-key frames according to the time stamp of each key frame.

10. An electronic device, comprising:

at least one processor; and

a memory communicatively coupled to the at least one processor; wherein

the memory stores instructions executable by the at least one processor to enable the at least one processor to perform a method of video transcoding as claimed in any of claims 1 to 8.

11. A computer-readable storage medium storing a computer program, wherein the computer program, when executed by a processor, implements the video transcoding method of any of claims 1 to 8.