WO2024001777A1

WO2024001777A1 - Video decoding method, virtual set-top box, physical-end set-top box and medium

Info

Publication number: WO2024001777A1
Application number: PCT/CN2023/100133
Authority: WO
Inventors: 杨洋
Original assignee: 中兴通讯股份有限公司
Priority date: 2022-06-30
Filing date: 2023-06-14
Publication date: 2024-01-04
Also published as: CN117376579A

Abstract

Provided in the present disclosure is a video decoding method, which is used for a virtual set-top box. The video decoding method comprises: continuously sending, by means of a video code stream and to a physical-end set-top box, video frames to be decoded; in ascending order, continuously adding, to a presentation time stamp (PTS) queue, PTS values of said video frames in the code stream; receiving decoding feedback which is sent by the physical-end set-top box; comparing the received decoding feedback with the smallest PTS value in the current PTS queue; according to a comparison result, determining whether the physical-end set-top box is decoded normally; and if the physical-end set-top box is decoded normally, deleting the smallest PTS value from the current PTS queue. Further provided in the present disclosure are a video decoding method for a physical-end set-top box, and a virtual set-top box, a physical-end set-top box and a computer-readable storage medium.

Description

Video decoding method, cloud set-top box, physical set-top box, and medium

Cross-references to related applications

This application claims priority from Patent Application No. 202210763777.8 submitted to the China Patent Office on June 30, 2022, the entire content of which is incorporated herein by reference.

Technical field

The present disclosure relates to, but is not limited to, the technical fields of multimedia terminals and cloud computing.

Background

As large-scale video services mature, users expect a better service experience, including a better UI (User Interface) and richer value-added services. At the same time, cloud computing and virtualization technology are developing rapidly, which has given rise to the cloud set-top box based on these technologies.

For virtual machines that do not support hardware acceleration, the decoding and rendering performed when a cloud application plays video can only be handled by the CPU (central processing unit), resulting in high virtual machine resource usage. Currently, this problem is usually addressed by separating the desktop stream from the video stream.

For example, after the cloud set-top box receives the video stream, it forwards the video stream to the physical terminal, and the physical terminal decodes and renders the video stream. However, there is a problem of low decoding efficiency in this scheme.

Summary of the invention

The present disclosure provides a video decoding method, a cloud set-top box, a physical set-top box, and a computer-readable medium.

As a first aspect of the present disclosure, a video decoding method is provided for use in a cloud set-top box. The video decoding method includes: continuously sending each video frame to be decoded to a physical set-top box in the form of a video code stream; continuously adding the presentation time stamp (PTS) values of the video frames to be decoded in the code stream to a PTS queue in ascending order; receiving decoding feedback sent by the physical set-top box; comparing the received decoding feedback with the smallest PTS value in the current PTS queue; determining, based on the comparison result, whether the physical set-top box decodes normally; and, if the physical set-top box decodes normally, deleting the smallest PTS value from the current PTS queue.

As a second aspect of the present disclosure, a video decoding method is provided for use in a physical set-top box. The video decoding method includes: receiving a video code stream; decoding the video frames to be decoded in sequence according to the PTS value of each video frame to be decoded in the video code stream, and generating corresponding decoding feedback; and sending the decoding feedback to the cloud set-top box in decoding order.

As a third aspect of the present disclosure, a cloud set-top box is provided. The cloud set-top box includes: one or more first processors; a first memory on which one or more first programs are stored, the one or more first programs, when executed by the one or more first processors, causing the one or more first processors to implement the video decoding method provided by the first aspect of the present disclosure; and one or more first I/O (input/output) interfaces, connected between the first processor and the first memory and configured to implement information interaction between the first processor and the first memory.

As a fourth aspect of the present disclosure, a physical set-top box is provided. The physical set-top box includes: one or more second processors; a second memory on which one or more second programs are stored, the one or more second programs, when executed by the one or more second processors, causing the one or more second processors to implement the video decoding method provided by the second aspect of the present disclosure; and one or more second I/O interfaces, connected between the second processor and the second memory and configured to implement information interaction between the second processor and the second memory.

As a fifth aspect of the present disclosure, a computer-readable medium is provided. An executable program is stored on the computer-readable medium. When the executable program is executed by a processor, the processor is caused to execute the video decoding method provided by the present disclosure.

Description of drawings

Figure 1 shows a cloud decoding framework;

Figure 2 is a flow chart of an implementation of the video decoding method provided by the present disclosure;

Figure 3 is a flow chart of another implementation of the video decoding method provided by the present disclosure;

Figure 4 is a flow chart of yet another implementation of the video decoding method provided by the present disclosure;

Figure 5 is a flow chart of yet another implementation of the video decoding method provided by the present disclosure;

Figure 6 is a flow chart of an implementation of step S250;

Figure 7 is a schematic diagram of an implementation of the cloud decoding framework provided by the present disclosure;

Figure 8 is a signaling diagram of the video decoding method provided by the present disclosure, in which the physical set-top box decodes successfully;

Figure 9 is a signaling diagram of the video decoding method provided by the present disclosure, in which the physical end set-top box fails to decode.

Detailed description

In order to enable those skilled in the art to better understand the technical solutions of the present disclosure, the video decoding method, cloud set-top box, physical set-top box, and computer-readable medium provided by the present disclosure will be described in detail below with reference to the accompanying drawings.

Example embodiments will be described more fully below with reference to the accompanying drawings. They may, however, be embodied in different forms and should not be construed as limited to the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete, and will allow those skilled in the art to fully understand the scope of the disclosure.

The various embodiments and features in the embodiments of the present disclosure may be combined with each other without conflict.

As used herein, the term "and/or" includes any and all combinations of one or more of the associated listed items.

The terminology used herein is used to describe particular embodiments only and is not intended to limit the disclosure. As used herein, the singular forms "a," "an" and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It will also be understood that the terms "comprising" and/or "made of", when used in this specification, specify the presence of the stated features, integers, steps, operations, elements and/or components, but do not exclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof.

Unless otherwise defined, all terms (including technical and scientific terms) used herein have the same meaning as commonly understood by one of ordinary skill in the art. It will also be understood that terms such as those defined in commonly used dictionaries should be construed as having a meaning consistent with their meaning in the context of the related art and the present disclosure, and are not to be construed as having an idealized or overly formal meaning unless expressly so defined herein.

Figure 1 shows a cloud decoding framework. As shown in Figure 1, the cloud set-top box includes a cloud player and a cloud codec component, MediaCodec. After the player intercepts the code stream, it sends the frames to be decoded to the cloud MediaCodec, and the cloud MediaCodec sends the frames to be decoded to the physical set-top box. In this framework, decoding each video frame includes the following steps: the video frame to be decoded is sent to the physical set-top box; after decoding, the physical set-top box reports the decoding success and the presentation time stamp (PTS) value of the successfully decoded video frame; and the cloud set-top box sends a rendering instruction to the physical set-top box.

Since the interface between the cloud set-top box and the physical set-top box is synchronous, the above decoding process involves three network data interactions per frame. With a network delay of, for example, 30 ms, network transmission alone takes 90 ms per video frame, so only about 11 video frames can be decoded per second. In other words, the video decoding rate is low.
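The figure of about 11 frames per second can be checked with a short calculation. The following is a minimal sketch that assumes decoding time on the device is negligible and that only the network interactions count; the function and parameter names are hypothetical and do not appear in the disclosure.

```python
def max_frames_per_second(interactions_per_frame: int, delay_ms: float) -> float:
    """Upper bound on the decode rate when every frame waits for each
    network interaction to finish before the next one starts."""
    per_frame_ms = interactions_per_frame * delay_ms
    return 1000.0 / per_frame_ms


# Three interactions per frame at 30 ms each: 90 ms of network time per frame.
synchronous_rate = max_frames_per_second(3, 30.0)
print(int(synchronous_rate))  # prints 11
```

With a single interaction per frame, the same 30 ms delay would no longer be multiplied by three, which is the motivation for the method described next.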

In view of this, as a first aspect of the present disclosure, a video decoding method is provided for use in a cloud set-top box. As shown in Figure 2, the video decoding method includes steps S110 to S160.

In step S110, each video frame to be decoded is continuously sent to the physical set-top box in the form of a video code stream.

In step S120, the presentation time stamp (PTS) values of each video frame to be decoded in the code stream are continuously added to the PTS queue in ascending order.

In step S130, decoding feedback sent by the physical end set-top box is received.

In step S140, the received decoding feedback is compared with the smallest PTS value in the current PTS queue.

In step S150, it is determined according to the comparison result whether the physical set-top box decodes normally.

In step S160, if the physical set-top box decodes normally, delete the smallest PTS value in the current PTS queue.

After the cloud set-top box intercepts the video stream, it continuously sends the video frames to be decoded to the physical set-top box, which improves the efficiency of sending frames to be decoded.

In step S120, the PTS value of the video frame to be decoded is added to the PTS queue. When the physical set-top box decodes it correctly, the minimum PTS value in the PTS queue is deleted.

Multiple PTS values in the PTS queue established by the cloud set-top box correspond to multiple video frames. After each video frame to be decoded is sent to the physical end set-top box in the form of a code stream, the physical end set-top box also decodes each received frame to be decoded in order from small to large PTS values. Every time the physical set-top box correctly decodes a video frame to be decoded, the one with the smallest PTS value in the PTS queue is deleted. In other words, the PTS queue is a dynamic queue, and new PTS values are constantly being added. As the physical set-top box continues to correctly decode the video frames to be decoded, the PTS values in the PTS queue are gradually deleted. It can be considered that the video frames corresponding to the remaining PTS values in the PTS queue are exactly the video frames received by the physical set-top box and not yet decoded. Therefore, steps S120 and S160 are equivalent to simulating the frame decoding rate in the physical set-top box.
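The dynamic PTS queue of steps S120 and S160 can be sketched as a small min-heap wrapper. This is only an illustration under the assumption that PTS values are integers; the class name `PtsQueue` and its methods are hypothetical and are not taken from the disclosure.

```python
import heapq
from typing import List, Optional


class PtsQueue:
    """Cloud-side record of frames sent but not yet confirmed as decoded."""

    def __init__(self) -> None:
        self._heap: List[int] = []

    def add(self, pts: int) -> None:
        # S120: record the PTS value of a frame as it is sent.
        heapq.heappush(self._heap, pts)

    def smallest(self) -> Optional[int]:
        # The smallest PTS value, or None when the queue is empty.
        return self._heap[0] if self._heap else None

    def delete_smallest(self) -> int:
        # S160: called once the physical set-top box has decoded normally.
        return heapq.heappop(self._heap)

    def __len__(self) -> int:
        return len(self._heap)


queue = PtsQueue()
for pts in (0, 3000, 6000):
    queue.add(pts)
queue.delete_smallest()  # the frame with PTS 0 was decoded normally
```

After the call to `delete_smallest`, the values left in the queue correspond to frames that the physical set-top box has received but not yet decoded.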

In the present disclosure, the physical set-top box does not need to interact with the cloud set-top box after receiving the video frame to be decoded, and the physical set-top box only needs to interact with the cloud set-top box once after decoding a video frame. This not only ensures correct video decoding, but also reduces the impact of network delay on video decoding, and improves the efficiency of video decoding by the physical set-top box.

It should be noted that in this disclosure, the sequence number of each step does not represent the execution order of each step. In other words, it is not necessary to add the PTS value of each frame to be decoded to the PTS queue after all the frames to be decoded are sent to the physical set-top box.

In the present disclosure, step S110 and step S120 are performed simultaneously. That is to say, every time a frame to be decoded is sent to the physical end set-top box, the PTS value of the frame to be decoded is added to the PTS queue.

After the decoding feedback sent by the physical set-top box is received, steps S110 and S120 continue to be performed until all code streams intercepted by the cloud set-top box have been sent to the physical set-top box.

In the present disclosure, steps S130 to S160 are performed sequentially. Each time a decoding feedback is received (that is, each time step S130 is executed), steps S140 to S160 are executed.

In the present disclosure, there is no special limitation on the rate at which PTS values are added to the PTS queue. As an optional implementation manner, the rate at which the PTS value is added to the PTS queue may be the same as the rate at which the cloud set-top box sends the code stream to the physical set-top box. That is to say, while sending the video frame to be decoded to the physical end set-top box, the PTS value of the video frame to be decoded can be added to the PTS queue.

Of course, the present disclosure is not limited to this, and the rate at which the PTS value is added to the PTS queue may also be different from the code stream rate. In order to ensure that the decoding rate of the physical set-top box is accurately simulated on the cloud set-top box side, the rate at which the PTS value is added to the PTS queue can be adjusted based on the decoding feedback of the physical set-top box.

In the present disclosure, the specific type of decoding feedback is not particularly limited. As an optional implementation, the decoding feedback may include the PTS value of the frame whose decoding has completed. That is to say, every time the physical set-top box completes decoding of a video frame, it sends the PTS value of that video frame to the cloud set-top box. As mentioned above, the physical set-top box decodes the video frames in ascending order of PTS. Unless something unexpected happens, the PTS value of each decoded video frame is the smallest PTS value among all video frames not yet decoded by the physical set-top box. The situation in the cloud set-top box is similar: the smallest PTS value in the PTS queue corresponds to the video frame that is about to be decoded or is being decoded in the physical set-top box.

In step S140 of comparing the received decoding feedback with the smallest PTS value in the PTS queue, the PTS value of the received decoding-completed frame can be subtracted from the smallest PTS value in the PTS queue, and the resulting difference is the comparison result. If the comparison result is less than a predetermined value, the video decoding speed of the physical set-top box as simulated in the cloud set-top box is not much different from the actual video decoding speed of the physical set-top box. In this case, it can be determined that the physical set-top box decodes normally. Correspondingly, if the comparison result is not less than the predetermined value, the simulated video decoding speed of the physical set-top box in the cloud set-top box is greater than the actual video decoding speed of the physical set-top box. In this case, the video decoding method may further include: if the comparison result is not less than the predetermined value, generating information indicating that no frame has completed decoding.

It should be pointed out that when no frame has completed decoding, the cloud set-top box does not stop sending the code stream to the physical set-top box, nor does it stop adding the corresponding PTS values to the PTS queue. Adding PTS values to the PTS queue without deleting any is equivalent to slowing down the video decoding speed of the physical set-top box as simulated in the cloud set-top box.
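The comparison of step S140 and the decision of step S150 can be illustrated as follows. This is a minimal sketch assuming integer PTS values; the names `Verdict` and `compare_feedback` are hypothetical, and the threshold is passed in rather than fixed.

```python
from enum import Enum


class Verdict(Enum):
    NORMAL = "physical set-top box decodes normally"
    NO_FRAME_DECODED = "no frame has completed decoding"


def compare_feedback(smallest_queued_pts: int,
                     feedback_pts: int,
                     predetermined_value: int) -> Verdict:
    # S140: the comparison result is the smallest queued PTS value
    # minus the PTS value reported in the decoding feedback.
    difference = smallest_queued_pts - feedback_pts
    # S150: a small difference means the simulated and actual decoding
    # speeds are close, so decoding is treated as normal.
    if difference < predetermined_value:
        return Verdict.NORMAL
    return Verdict.NO_FRAME_DECODED
```

A caller would delete the smallest PTS value from the queue only when the result is `Verdict.NORMAL`; in the other case the queue keeps growing, which is what slows down the simulated decoding speed.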

In the present disclosure, the predetermined value is not particularly limited. As an optional implementation, the predetermined value may be the time required for playing 3 to 5 video frames.
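As an illustration, the playing time of 3 to 5 frames can be converted into PTS units. The sketch below assumes a 90 kHz PTS clock, as used in MPEG transport streams, and a known frame rate; neither is stated in the disclosure, and the function name is hypothetical.

```python
def predetermined_value_ticks(frames: int,
                              frame_rate: float,
                              clock_hz: int = 90_000) -> int:
    """Playing time of `frames` video frames, expressed in PTS clock ticks.

    clock_hz defaults to 90 kHz, which is an assumption of this sketch.
    """
    return round(frames * clock_hz / frame_rate)


low = predetermined_value_ticks(3, 25.0)   # 3 frames at 25 fps
high = predetermined_value_ticks(5, 25.0)  # 5 frames at 25 fps
print(low, high)  # prints 10800 18000
```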

When the physical set-top box decodes the received video frame, the decoding may fail due to various factors (for example, physical set-top box hardware problems, data loss during stream transmission). When the physical set-top box fails to decode, a decoding failure flag will be generated and sent to the cloud set-top box as "decoding feedback".

When the decoding feedback is compared with the smallest PTS value in the PTS queue, it is easy to determine whether the decoding feedback is a decoding failure flag. In this case, as shown in Figure 3, the video decoding method further includes step S170.

In step S170, decoding exception information is generated.

After the decoding exception information is generated in step S170, the relevant technical personnel can handle it.

In the present disclosure, an abnormality of the physical set-top box can also be determined in the following manner: after the decoding feedback is received, the PTS queue is queried to determine the smallest PTS value in the PTS queue; when the PTS queue is empty, decoding exception information is generated.
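The two abnormal cases described above, a decoding failure flag and feedback arriving while the PTS queue is empty, can be sketched together. The flag value `DECODE_FAILED`, the message strings, and the function name are all hypothetical; the disclosure does not define a concrete format for the flag or for the exception information.

```python
from typing import List, Optional, Union

DECODE_FAILED = "DECODE_FAILED"  # hypothetical encoding of the failure flag


def decoding_exception(feedback: Union[int, str],
                       queued_pts: List[int]) -> Optional[str]:
    """Return decoding exception information, or None if nothing is abnormal."""
    if feedback == DECODE_FAILED:
        # S170: the physical set-top box reported that decoding failed.
        return "decoding exception: failure reported by physical set-top box"
    if not queued_pts:
        # Feedback arrived although no sent frame is awaiting confirmation.
        return "decoding exception: PTS queue is empty"
    return None
```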

After the physical end set-top box completes decoding of the video frame to be decoded, it needs to render the decoded video frame. Therefore, as shown in Figure 3, the video decoding method also includes step S180.

In step S180, if the physical set-top box decodes normally, a rendering instruction for the decoded video frame is sent to the physical set-top box.

After receiving the rendering instruction, the physical set-top box can render the corresponding decoded video frame.

As a second aspect of the present disclosure, a video decoding method is provided for use in a physical set-top box. As shown in Figure 4, the video decoding method includes steps S210 to S230.

In step S210, the video code stream is received.

In step S220, each video frame to be decoded is sequentially decoded in the order of the PTS value of each video frame to be decoded in the video code stream, and corresponding decoding feedback is generated.

In step S230, the decoding feedback is sent to the cloud set-top box according to the decoding order.

The physical set-top box cooperates with the cloud set-top box. Therefore, the video code stream received in step S210 is the video code stream sent in step S110 in the video decoding method provided by the first aspect of the present disclosure. When the physical set-top box decodes each frame to be decoded in the received code stream, it only needs to interact with the cloud set-top box once by generating decoding feedback, thus reducing the impact of network delay on the decoding process.

It should be noted that each time a video frame is decoded, a decoding feedback is generated. As an optional implementation manner, each time the physical end set-top box receives a video frame to be decoded, the video frame to be decoded is added to the decoding queue. In the decoding queue, the video frames to be decoded are arranged in ascending order according to their respective PTS values. When decoding each video frame to be decoded, decoding is also performed in ascending order of PTS values.
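The decoding queue on the physical set-top box can be sketched with a min-heap keyed on PTS. For brevity, this sketch queues all received frames before decoding, whereas a real device would receive and decode concurrently; the `Frame` layout and the `decode` callback are hypothetical stand-ins for the hardware decoder.

```python
import heapq
from typing import Callable, Iterable, Iterator, List, Tuple

Frame = Tuple[int, bytes]  # (PTS value, encoded payload); hypothetical layout


def decode_in_pts_order(received: Iterable[Frame],
                        decode: Callable[[bytes], bool]
                        ) -> Iterator[Tuple[int, bool]]:
    """Yield one (pts, succeeded) feedback item per frame, smallest PTS first."""
    decoding_queue: List[Frame] = []
    for frame in received:
        heapq.heappush(decoding_queue, frame)  # kept ordered by PTS value
    while decoding_queue:
        pts, payload = heapq.heappop(decoding_queue)
        yield pts, decode(payload)  # one decoding feedback per decoded frame


frames = [(6000, b"c"), (0, b"a"), (3000, b"b")]
feedback = list(decode_in_pts_order(frames, lambda payload: True))
```

Because the feedback is produced in the same ascending PTS order in which the cloud set-top box filled its PTS queue, each feedback item can be matched against the smallest queued value.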

Similar to the first aspect of the present disclosure, in the video decoding method provided in the second aspect, the sequence numbers of the steps are only for convenience of description and do not represent the order of execution.

That is to say, in the present disclosure, step S210, step S220 and step S230 can also be performed simultaneously. Each time a video frame to be decoded is received, the video frame to be decoded is decoded. Moreover, in the video decoding method, each time a decoding feedback is generated, the generated decoding feedback is sent to the cloud set-top box.

As an optional implementation, if the decoding is normal, for the decoded video frame, the decoding feedback includes the PTS value of the decoded video frame.

That is to say, every time the physical set-top box decodes a video frame normally, it feeds back the PTS value of the video frame to the cloud set-top box, so that the cloud set-top box can perform step S140.

As another optional implementation, the decoding feedback includes a decoding failure identification of the decoding abnormal frame. When decoding fails, a decoding failure identifier for the abnormal frame in which decoding fails is generated.

After decoding is completed, the physical set-top box needs to render the decoded video frames. As mentioned above, rendering instructions are also issued by the cloud set-top box. Correspondingly, as shown in Figure 5, the video decoding method also includes steps S240 and S250.

In step S240, a rendering instruction is received.

In step S250, the corresponding decoded frame is rendered according to the rendering instruction.

As an optional implementation, the PTS value can be carried in the rendering instruction. After the physical set-top box determines the PTS value in the rendering instruction, it renders the decoded frame corresponding to the PTS value.
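When the rendering instruction carries a PTS value, locating the frame to render reduces to a lookup by PTS. The following is a minimal sketch; the class `DecodedFrames` and its methods are hypothetical, and decoded pictures are represented as plain bytes.

```python
from typing import Dict, Optional


class DecodedFrames:
    """Decoded pictures held on the physical set-top box until rendering."""

    def __init__(self) -> None:
        self._by_pts: Dict[int, bytes] = {}

    def store(self, pts: int, picture: bytes) -> None:
        # Called after a frame has been decoded successfully.
        self._by_pts[pts] = picture

    def take(self, render_pts: int) -> Optional[bytes]:
        # Returns None when no decoded frame carries the requested PTS value.
        return self._by_pts.pop(render_pts, None)
```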

Of course, the present disclosure is not limited to this. As an optional implementation, as shown in Figure 6, step S250 may include steps S251 to S254.

In step S251, determine the PTS value of the decoded video frame corresponding to the rendering instruction.

In step S252, the PTS value of the decoded video frame is compared with the PTS value corresponding to the rendering instruction.

In step S253, if the PTS value of the decoded video frame does not exceed the PTS value corresponding to the rendering instruction, the decoded video frame is rendered after waiting for a predetermined time.

In step S254, if the PTS value of the decoded video frame is greater than the PTS value corresponding to the rendering instruction, the decoded video frame is rendered immediately.

If the PTS value of the video frame that has just been decoded does not exceed the PTS value corresponding to the rendering instruction, it means that the decoding speed of the physical set-top box is too fast and rendering cannot be performed immediately; otherwise, the image rendering frame rate would be too high. It is necessary to wait for a predetermined time before rendering the decoded frame, so as to achieve a better video display effect.

If the PTS value of the video frame that has just been decoded is greater than the PTS corresponding to the rendering instruction, rendering is performed immediately.

As a third aspect of the present disclosure, a cloud set-top box is provided. The cloud set-top box includes: one or more first processors; a first memory on which one or more first programs are stored, the one or more first programs, when executed by the one or more first processors, causing the one or more first processors to implement the video decoding method provided by the first aspect of the present disclosure; and one or more first I/O interfaces, connected between the first processor and the first memory and configured to implement information interaction between the first processor and the first memory.

In this disclosure, the first processor is a device with data processing capabilities, including but not limited to a central processing unit (CPU); the first memory is a device with data storage capabilities, including but not limited to random access memory (RAM, more specifically SDRAM, DDR, etc.), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), and flash memory (FLASH); the first I/O interface (read-write interface) is connected between the first processor and the first memory and can realize information interaction between the first processor and the first memory, and includes but is not limited to a data bus (Bus).

In this disclosure, the second processor is a device with data processing capabilities, including but not limited to a central processing unit (CPU); the second memory is a device with data storage capabilities, including but not limited to random access memory (RAM, more specifically SDRAM, DDR, etc.), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), and flash memory (FLASH); the second I/O interface (read-write interface) is connected between the second processor and the second memory and can realize information interaction between the second processor and the second memory, and includes but is not limited to a data bus (Bus).

As a sixth aspect of the present disclosure, a cloud set-top box is provided. As shown in Figure 7, the cloud set-top box includes a cloud player, a cloud MediaCodec, a PTS management component and a physical side status recording component.

The cloud player is configured to intercept the video stream and send the intercepted video stream to the cloud MediaCodec in the form of a code stream.

The cloud MediaCodec is configured to save the PTS value of each video frame to be decoded, and continuously send each video frame to be decoded to the physical set-top box in the form of a video stream.

The PTS management component is configured to continuously add the PTS values of each video frame to be decoded to the PTS queue at the first rate in order from small to large.

The physical side status recording component is configured to receive decoding feedback sent by the physical side set-top box.

The PTS management component is also configured to compare the decoding feedback with the smallest PTS value in the current PTS queue to determine whether the physical set-top box decodes normally, and delete the smallest PTS value in the current PTS queue when the physical set-top box decodes normally.

In addition, the PTS management component is configured to adjust the rate at which PTS values are added to the PTS queue. The cloud player is also configured to send rendering requests to the cloud MediaCodec. The cloud MediaCodec is also configured to send the PTS value of the video frame corresponding to the rendering request.

As a seventh aspect of the present disclosure, a physical set-top box is provided. As shown in Figure 7, the physical set-top box includes a decoding component and a rendering identifier management component.

The decoding component is configured to: receive a video code stream; decode each video frame to be decoded in sequence according to the PTS value of each video frame to be decoded in the video code stream, and generate corresponding decoding feedback; and send the decoding feedback to the cloud set-top box in decoding order.

The rendering identification management component is configured to: receive a rendering instruction; and render the corresponding decoding completed frame according to the rendering instruction.

The video decoding method provided by the present disclosure will be introduced below with reference to Figures 8 and 9. Shown in Figure 8 is the signaling diagram during normal decoding of the physical end set-top box. As shown in the figure, in this case, the video decoding method includes S101 to S117.

In S101, the cloud video application opens and creates a cloud application player to play the video source.

In S102, the cloud application player (cloud player) creates a cloud MediaCodec decoder.

In S103, Cloud MediaCodec creates a decoder on the physical set-top box and initializes the decoder.

In S104, the cloud application player sends a code stream to the cloud MediaCodec for decoding by the physical set-top box.

In S105, the cloud MediaCodec records the PTS value of the video frame to the PTS management component.

In S106, the cloud MediaCodec pushes the video frame to the physical set-top box.

In S107, the physical end set-top box adds the video frame to the decoding queue and performs decoding.

In S108, after the physical set-top box completes decoding the data, the PTS value of the decoded video frame is reported to the physical side status recording component.

In S109, the cloud application player queries the cloud MediaCodec whether data decoding is completed.

In S110a, the cloud MediaCodec queries the PTS management component whether there is a PTS value.

In S111, if there is no frame data in the PTS management component, the cloud application player is directly notified that no frame decoding is completed.

In S112, if there is frame data in the PTS management component, the minimum PTS value is compared with the PTS value fed back by the physical set-top box and recorded by the physical side status recording component.

In S113, if the minimum PTS value in the PTS management component exceeds the PTS value fed back by the physical set-top box by more than 5 frames of data, the physical set-top box is decoding slowly; the sending rate of the cloud-simulated PTS values needs to be reduced, and the cloud application player is directly notified that no frame decoding is completed.

In S114, otherwise, the PTS value of the decoded video frame is reported to the player, and the PTS value is removed from the queue of the PTS management component.
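The query flow of S110a to S114 can be condensed into one function. This is a sketch under stated assumptions: the PTS queue is a list maintained as a min-heap, the last PTS value reported by the physical set-top box is passed in directly, and the case where no feedback has arrived yet is treated as "no frame decoded", which the disclosure does not spell out. All names are hypothetical.

```python
import heapq
from typing import List, Optional


def poll_decoded_frame(pts_heap: List[int],
                       reported_pts: Optional[int],
                       max_lead: int) -> Optional[int]:
    """Return the PTS value to report to the player, or None if no frame is ready.

    max_lead is the playing time of 5 frames, in the same units as the PTS values.
    """
    if not pts_heap:
        return None  # S111: no frame data recorded
    if reported_pts is None:
        return None  # assumption: no feedback received yet
    smallest = pts_heap[0]  # S112: compare with the recorded feedback
    if smallest - reported_pts > max_lead:
        return None  # S113: physical set-top box is decoding slowly
    return heapq.heappop(pts_heap)  # S114: report and remove from the queue


pending = [0, 3600, 7200]  # a sorted list is already a valid min-heap
ready = poll_decoded_frame(pending, reported_pts=0, max_lead=18000)
```

A return value of `None` corresponds to notifying the cloud application player that no frame decoding is completed.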

In S115, the cloud application player sends a request to render the video frame.

In S116, the cloud MediaCodec sends the PTS value of the frame to be rendered to the rendering identification management component of the physical set-top box. The rendering identification management component records the current cloud rendering progress.

In S117, each time the decoder component of the physical set-top box decodes a frame, it queries the rendering identification management component for the PTS value of the frame to be rendered sent by the cloud. If the PTS value of the decoded frame is less than or equal to the PTS value of the frame to be rendered sent by the cloud, the decoded video frame is rendered immediately. Otherwise, the physical set-top box is decoding too fast and cannot render the decoded video frame immediately, because the image rendering frame rate would be too high. In that case, it waits until the rendering identification management component receives a PTS value of a frame to be rendered from the cloud, and then compares the PTS value of the frame just decoded with the PTS value of the frame to be rendered sent by the cloud. If the PTS value of the frame just decoded is less than or equal to the PTS value of the frame to be rendered sent by the cloud, the frame just decoded is rendered immediately, and decoding then continues with the next frame.
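The rendering check described in S117 can be expressed as a single predicate. The sketch below follows the S117 wording only; the function name is hypothetical, and the handling of the case where no rendering PTS has been received yet is an assumption.

```python
from typing import Optional


def should_render_now(decoded_pts: int,
                      cloud_render_pts: Optional[int]) -> bool:
    """Decide whether the frame just decoded may be rendered immediately."""
    if cloud_render_pts is None:
        # Assumption: nothing is rendered before the first rendering PTS arrives.
        return False
    # The cloud has already requested rendering up to cloud_render_pts, so any
    # decoded frame at or before that point can be shown right away.
    return decoded_pts <= cloud_render_pts
```

When the predicate is false, the decoder would hold the frame and evaluate the predicate again once the rendering identification management component records a new PTS value from the cloud.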

Figure 9 shows the signaling diagram when the physical set-top box decodes abnormally. In this case, the video decoding method includes S201 to S211.

In S201, the cloud set-top box and the physical set-top box have created a cloud decoding channel and initialized the decoder on the physical side.

In S202, the cloud application player sends the code stream to the cloud MediaCodec for decoding.

In S203, the cloud MediaCodec records the PTS value of the frame to the PTS management component.

In S204, the cloud MediaCodec pushes the frame to the physical set-top box.

In S205, the physical side adds the frame to the decoding queue and attempts to decode it, but the decoding fails.

In S206, the decoding failure status is reported to the physical-side status recording component.

In S207, the cloud application player queries the cloud MediaCodec whether data decoding is completed.

In S208, the cloud MediaCodec queries the PTS management component whether a PTS value is stored.

In S209, the PTS management component has a PTS value stored, so it queries the physical-side status recording component for the decoding result of that PTS value on the physical side, and finds that decoding on the physical side has failed.

In S210a, a decoding failure is reported to the player.

In S211, the cloud application player notifies the cloud video application, and the cloud application displays a playback failure.
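As a non-limiting illustration, the failure path of S205 to S211 may be sketched as follows. All names (`DecodeStatus`, `PhysicalStatusRecorder`, `DecodeFailedError`, `query_decoding`) are hypothetical, and keying the recorded status by PTS value is an assumption of this sketch.

```python
from collections import deque
from enum import Enum


class DecodeStatus(Enum):
    PENDING = "pending"
    FAILED = "failed"


class PhysicalStatusRecorder:
    """Illustrative stand-in for the physical-side status recording
    component. Assumption: status is keyed by the frame's PTS value."""

    def __init__(self) -> None:
        self._status = {}

    def report_failure(self, pts: int) -> None:
        """S206: the decoder reports that decoding of this frame failed."""
        self._status[pts] = DecodeStatus.FAILED

    def query(self, pts: int) -> DecodeStatus:
        return self._status.get(pts, DecodeStatus.PENDING)


class DecodeFailedError(Exception):
    """Raised toward the player when the physical side failed (S210a)."""


def query_decoding(pts_queue: deque, recorder: PhysicalStatusRecorder) -> None:
    """S207 to S209: the player asks whether decoding has completed."""
    if not pts_queue:
        return  # S208: no PTS value is stored, nothing to check
    pts = pts_queue[0]
    if recorder.query(pts) is DecodeStatus.FAILED:
        # S209: the physical side recorded a failure for this PTS value.
        raise DecodeFailedError(f"decoding failed for PTS {pts}")


queue = deque([3600])                 # S203: PTS recorded on the cloud side
recorder = PhysicalStatusRecorder()
recorder.report_failure(3600)         # S205/S206: physical-side failure

try:
    query_decoding(queue, recorder)
    playback_failed = False
except DecodeFailedError:
    playback_failed = True            # S211: the application shows a failure
```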

Those of ordinary skill in the art can understand that all or some steps, systems, and functional modules/units in the devices disclosed above can be implemented as software, firmware, hardware, and appropriate combinations thereof. In hardware implementations, the division between functional modules/units mentioned in the above description does not necessarily correspond to the division of physical components; for example, one physical component may have multiple functions, or one function or step may be performed cooperatively by several physical components. Some or all of the physical components may be implemented as software executed by a processor, such as a central processing unit, a digital signal processor, or a microprocessor, or as hardware, or as an integrated circuit, such as an application-specific integrated circuit. Such software may be distributed on computer-readable media, which may include computer storage media (or non-transitory media) and communication media (or transitory media). As is known to those of ordinary skill in the art, the term computer storage media includes volatile and non-volatile, removable and non-removable media implemented in any method or technology for storage of information, such as computer-readable instructions, data structures, program modules or other data. Computer storage media include, but are not limited to, RAM, ROM, EEPROM, flash memory or other memory technology, CD-ROM, digital versatile disk (DVD) or other optical disk storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium that can be used to store the desired information and that can be accessed by a computer. Additionally, it is known to those of ordinary skill in the art that communication media typically embody computer-readable instructions, data structures, program modules or other data in a modulated data signal such as a carrier wave or other transport mechanism, and may include any information delivery media.

Example embodiments have been disclosed herein, and although specific terms are employed, they are used and should be interpreted in a general illustrative sense only and not for purposes of limitation. In some instances, it will be apparent to those skilled in the art that features, characteristics and/or elements described in connection with a particular embodiment may be used alone, or may be used in combination with features, characteristics and/or elements described in connection with other embodiments, unless expressly stated otherwise. Accordingly, it will be understood by those skilled in the art that various changes in form and details may be made without departing from the scope of the present disclosure as set forth in the appended claims.

Claims

1. A video decoding method for a cloud set-top box, the video decoding method comprising:

continuously sending each video frame to be decoded to a physical set-top box in the form of a video code stream;

continuously adding the presentation time stamp (PTS) values of the video frames to be decoded in the code stream to a PTS queue in ascending order;

receiving decoding feedback sent by the physical set-top box;

comparing the received decoding feedback with the smallest PTS value in the current PTS queue;

determining, according to the comparison result, whether the physical set-top box is decoding normally; and

if the physical set-top box is decoding normally, deleting the smallest PTS value from the current PTS queue.
2. The video decoding method according to claim 1, wherein the decoding feedback includes the PTS value of a decoded video frame;

in the step of comparing the received decoding feedback with the smallest PTS value in the current PTS queue, the PTS value of the received decoded video frame is subtracted from the smallest PTS value in the PTS queue, and the resulting difference is the comparison result; and

if the comparison result is less than a predetermined value, it is determined that the physical set-top box is decoding normally.
3. The video decoding method according to claim 2, wherein the video decoding method further comprises:

if the comparison result is not less than the predetermined value, generating information indicating that no frame has completed decoding.
4. The video decoding method according to any one of claims 1 to 3, wherein if the decoding feedback is a decoding failure identification, the video decoding method further comprises:

generating decoding exception information.
5. The video decoding method according to any one of claims 1 to 3, wherein the video decoding method further comprises:

if the physical set-top box is decoding normally, sending a rendering instruction for the decoded video frame to the physical set-top box.
6. A video decoding method for a physical set-top box, the video decoding method comprising:

receiving a video code stream;

sequentially decoding the video frames to be decoded in the video code stream in the order of their PTS values, and generating corresponding decoding feedback; and

sending the decoding feedback to a cloud set-top box according to the decoding order.
7. The video decoding method according to claim 6, wherein if the decoding is normal, the decoding feedback for a decoded video frame includes the PTS value of the decoded video frame.
8. The video decoding method according to claim 6, wherein if the decoding is abnormal, the decoding feedback includes a decoding failure identification of the video frame whose decoding is abnormal.
9. The video decoding method according to any one of claims 6 to 8, wherein the video decoding method further comprises:

receiving a rendering instruction; and

rendering the corresponding decoded video frame according to the rendering instruction.
10. The video decoding method according to claim 9, wherein rendering the corresponding decoded video frame according to the rendering instruction comprises:

determining the PTS value of the decoded video frame corresponding to the rendering instruction;

comparing the PTS value of the decoded video frame with the PTS value corresponding to the rendering instruction;

if the PTS value of the decoded video frame does not exceed the PTS value corresponding to the rendering instruction, waiting for a predetermined time before rendering the decoded video frame; and

if the PTS value of the decoded video frame is greater than the PTS value corresponding to the rendering instruction, rendering the frame to be rendered.
11. A cloud set-top box, comprising:

one or more first processors;

a first memory having one or more first programs stored thereon, wherein when the one or more first programs are executed by the one or more first processors, the one or more first processors implement the video decoding method according to any one of claims 1 to 5; and

one or more first I/O interfaces connected between the first processor and the first memory and configured to implement information exchange between the first processor and the first memory.
12. A physical set-top box, comprising:

one or more second processors;

a second memory having one or more second programs stored thereon, wherein when the one or more second programs are executed by the one or more second processors, the one or more second processors implement the video decoding method according to any one of claims 6 to 10; and

one or more second I/O interfaces connected between the second processor and the second memory and configured to implement information exchange between the second processor and the second memory.
13. A computer-readable medium having an executable program stored thereon, wherein when the executable program is executed by a processor, the processor is caused to execute the video decoding method according to any one of claims 1 to 10.