CN110072143B - Video stream decoding method and device - Google Patents


Info

Publication number
CN110072143B
Authority
CN
China
Prior art keywords
frame
video
video stream
time
sending
Prior art date
Legal status
Active
Application number
CN201910205510.5A
Other languages
Chinese (zh)
Other versions
CN110072143A
Inventor
郭鹏
潘廷勇
韩杰
王艳辉
Current Assignee
Visionvera Information Technology Co Ltd
Original Assignee
Visionvera Information Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Visionvera Information Technology Co Ltd
Priority to CN201910205510.5A
Publication of CN110072143A
Application granted
Publication of CN110072143B
Legal status: Active
Anticipated expiration

Classifications

    • H04N21/440281 - Processing of video elementary streams involving reformatting operations that alter the temporal resolution, e.g. by frame skipping (H Electricity > H04 Electric communication technique > H04N Pictorial communication, e.g. television > H04N21/00 Selective content distribution, e.g. VOD > H04N21/40 > H04N21/43 > H04N21/44 > H04N21/4402)
    • H04N21/8547 - Content authoring involving timestamps for synchronizing content (H04N21/00 > H04N21/80 > H04N21/85 > H04N21/854)

Abstract

The embodiment of this application provides a video stream decoding method applied to a video network, where the video network includes a video receiving end equipped with a decoder. Each time the video receiving end receives a frame of the video stream, it acquires that frame's timestamp information and places the frame in a buffer queue; it determines the frame rate from the timestamp information of two adjacent frames; each time it sends a frame to the decoder, it determines a first sending interval time for the next frame from the frame rate, a preset time coefficient value, and the number of frames currently buffered in the queue; it then sends the buffered frames to the decoder in sequence at that interval. Because the sending interval between frames fluctuates only within a limited range, the delay incurred while receiving a key frame is smoothed out, and the fluency of video playback is improved.

Description

Video stream decoding method and device
Technical Field
The present application relates to the field of video networking technologies, and in particular, to a method and an apparatus for decoding a video stream.
Background
In video networking video services, video packets must be paced uniformly when sent, in order to guarantee bandwidth. Because an I frame carries a large amount of data, each I frame must be split into many network packets, and the receiving end has to loop over many packet receptions before one I frame can be reassembled. Because of the uniform pacing, a time gap exists between successive packets, so by the time reassembly completes, a large time interval has opened between the reassembled I frame and the preceding P frame. The decoder consumes the video stream frame by frame, and when the interval between frames grows large enough to be perceptible to the human eye, the decoded video shows obvious stalls.
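The stall in the Background can be made concrete with a little arithmetic. All the numbers below (frame rate, packet count, pacing gap) are illustrative assumptions, not values from the patent:

```python
# Illustrative arithmetic for the I-frame stall (numbers are assumptions):
# at 25 fps the decoder expects a new frame every 40 ms.  If an I frame is
# split into 60 network packets and uniform pacing sends packets 2 ms apart,
# the receiver needs 120 ms just to reassemble that one frame.
FPS = 25
frame_interval_ms = 1000 / FPS                 # 40 ms between frames at steady state
i_frame_packets = 60                           # hypothetical I-frame packet count
pacing_ms = 2                                  # hypothetical per-packet pacing gap
reassembly_ms = i_frame_packets * pacing_ms    # 120 ms to collect the whole I frame
stall_ms = reassembly_ms - frame_interval_ms   # 80 ms extra gap after the last P frame
print(frame_interval_ms, reassembly_ms, stall_ms)
```

An 80 ms gap is well above the threshold the human eye can notice, which is the stutter the method aims to smooth away.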
Disclosure of Invention
In view of the above problems, embodiments of the present application provide a video stream decoding method, a video stream decoding apparatus, and a corresponding computer-readable storage medium and electronic device that overcome, or at least partially solve, the above problems.
In order to solve the above technical problem, the present application provides a video stream decoding method, which is applied to a video network, where the video network includes a video receiving end, and the video receiving end is configured with a decoder, and the method includes:
the video receiving end acquires the timestamp information of one frame of video stream when receiving one frame of video stream, and sends the received one frame of video stream to a buffer queue;
the video receiving end determines a frame rate according to respective timestamp information of two adjacent frames of video streams;
when the video receiving end sends a frame of video stream to the decoder, determining a first sending interval time for sending the next frame of video stream according to the frame rate, a preset time coefficient value and the number of the frame video streams cached currently in the buffer queue;
the video receiving end sequentially sends each frame of video stream cached in the buffer queue to the decoder according to the first sending interval time; the decoder is used for decoding the video streams of the frames.
Optionally, before the step of determining, by the video receiving end, a first sending interval time for sending a next frame of video stream according to the frame rate, the number of currently buffered frame video streams in the buffer queue, and a preset time coefficient value when sending each frame of video stream to the decoder, the method further includes:
the video receiving end determines a second sending interval time according to the frame rate;
the video receiving end sends the first frame video stream received by the buffer queue firstly to a decoder;
when the video receiving end sends a frame of video stream to the decoder, the video receiving end determines a first sending interval time for sending the next frame of video stream according to the frame rate, the preset time coefficient value and the number of the frame video streams currently cached in the buffer queue, and the step includes:
and after the video receiving end separates the second sending interval time, after the second frame video stream in the buffer queue is sent to the decoder, when sending a frame video stream to the decoder, determining the first sending interval time for sending the next frame video stream according to the frame rate, the time coefficient value and the number of the frame video streams currently stored in the buffer queue.
Optionally, the step in which the video receiving end, when sending a frame of video stream to the decoder, determines a first sending interval time for sending the next frame of video stream according to the frame rate, the preset time coefficient value, and the number of frame video streams currently buffered in the buffer queue includes the following steps:
the video receiving end judges whether the number of the frame video streams currently cached in the buffer queue is larger than a preset number or not;
if so, the video receiving end determines the first sending interval time according to a preset first formula;
if not, the video receiving end determines the first sending interval time according to a preset second formula.
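The dispatch logic above can be sketched in code. The patent names a "first formula" and a "second formula" without disclosing them in this excerpt, so the two formulas below, the class name, and the default parameter values are illustrative assumptions only; the sketch simply drains the queue faster when it is deeper than the preset number:

```python
from collections import deque

class FrameDispatcher:
    """Sketch of the receiver-side pacing described above.  The concrete
    formulas are NOT given in this excerpt; those used here are assumptions."""

    def __init__(self, time_coeff=0.1, threshold=5):
        self.buffer = deque()          # buffer queue of (timestamp_ms, frame)
        self.time_coeff = time_coeff   # preset time coefficient value (assumed)
        self.threshold = threshold     # preset number of buffered frames (assumed)
        self.frame_interval_ms = None  # derived from adjacent timestamps

    def receive(self, timestamp_ms, frame):
        # Derive the frame rate from the timestamps of two adjacent frames,
        # then append the frame to the buffer queue.
        if self.buffer:
            last_ts = self.buffer[-1][0]
            if timestamp_ms > last_ts:
                self.frame_interval_ms = timestamp_ms - last_ts
        self.buffer.append((timestamp_ms, frame))

    def next_send_interval_ms(self):
        # First sending interval: the nominal frame interval, shortened when
        # the queue is deeper than the preset number ("first formula",
        # assumed), otherwise the nominal interval ("second formula", assumed).
        base = self.frame_interval_ms or 40.0   # fall back to 25 fps
        depth = len(self.buffer)
        if depth > self.threshold:
            return base * (1 - self.time_coeff * (depth - self.threshold))
        return base
```

Feeding frames 40 ms apart keeps the interval at 40 ms until the queue backs up past the threshold, at which point the interval shrinks so the backlog drains.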
In order to solve the above technical problem, the present application further provides a video stream decoding method, where the method is applied to a video network, the video network includes a video receiving end, and the video receiving end is configured with a decoder, and the method includes:
the video receiving end acquires the timestamp information of one frame of video stream when receiving one frame of video stream, and sends the received one frame of video stream to a buffer queue;
the video receiving end determines a frame rate according to respective timestamp information of two adjacent frames of video streams;
the video receiving end determines a time coefficient value according to the received timestamp information of the first frame video stream;
when the video receiving end sends a frame of video stream to the decoder, determining a first sending interval time for sending the next frame of video stream according to the frame rate, the time coefficient value and the number of the frame video streams cached currently in the buffer queue;
the video receiving end sequentially sends each frame of video stream cached in the buffer queue to the decoder according to the first sending interval time; the decoder is used for decoding the video streams of the frames.
Optionally, the step in which the video receiving end, when sending a frame of video stream to the decoder, determines a first sending interval time for sending the next frame of video stream according to the frame rate, the time coefficient value, and the number of frame video streams currently buffered in the buffer queue includes the following steps:
the video receiving end judges whether the number of the frame video streams currently cached in the buffer queue is larger than a preset number or not;
if so, the video receiving end determines the first sending interval time according to a preset first formula;
if not, the video receiving end determines the first sending interval time according to a preset second formula.
In order to solve the above technical problem, the present application further provides a video stream decoding apparatus, where the apparatus is applied to a video network, the video network includes a video receiving end, and a decoder is configured in the video receiving end; the device is located at the video receiving end and comprises:
the video stream buffering module is used for sending a received frame of video stream to a buffering queue every time the frame of video stream is received;
the time stamp information acquisition module is used for acquiring the time stamp information of the frame of video stream;
the frame rate determining module is used for determining the frame rate according to the respective timestamp information of two adjacent frames of video streams;
a first sending interval time determining module, configured to determine, when each frame of video stream is sent to the decoder, a first sending interval time for sending a next frame of video stream according to the frame rate, a preset time coefficient value, and the number of currently buffered frame video streams in the buffer queue;
a frame video stream sending module, configured to send each frame video stream buffered in the buffer queue to the decoder in sequence according to the first sending interval time; the decoder is used for decoding the video streams of the frames in sequence.
Optionally, the apparatus further comprises:
a second sending interval time determining module, configured to determine a second sending interval time according to the frame rate;
a first sending module, configured to send the first frame video stream received by the buffer queue to the decoder, and to send the second frame video stream in the buffer queue to the decoder after the second sending interval time has elapsed;
the first sending interval time determining module is configured to determine, after the first sending module sends the second frame video stream in the buffer queue to the decoder, a first sending interval time for sending a next frame video stream according to the frame rate and the number of currently stored frame video streams in the buffer queue every time the frame video stream in the buffer queue is sent to the decoder.
Optionally, the first sending interval time determining module includes:
the first calculation submodule is used for determining the first sending interval time according to a preset first formula when the number of the frame video streams cached in the buffering queue at present is greater than a preset number;
and the second calculation submodule is used for determining the first sending interval time according to a preset second formula when the number of the frame video streams currently cached in the buffering queue is less than or equal to a preset number.
In order to solve the above technical problem, the present application further provides a computer-readable storage medium having a computer program stored thereon, which when executed by a processor implements the steps in the method according to any one of claims 1 to 5.
In order to solve the above technical problem, the present application further provides an electronic device comprising a memory, a processor, and a computer program stored on the memory and executable on the processor, wherein the program, when executed by the processor, implements the steps of the method according to any one of claims 1 to 5.
Compared with the prior art, the embodiment of the application has the following advantages:
First, the embodiments of the application exploit the characteristics of the video network. When the video receiving end receives the video stream, each incoming frame is first stored in a buffer queue, and each time a buffered frame is sent for decoding, the interval before sending the next frame is determined from the number of frames currently buffered. The large delay between an I frame and a P frame caused by the uniform packet pacing is thus spread evenly across the sending intervals of the individual frames, so the receiving end feeds frames to the decoder at a smoothly varying interval, video stalls are avoided, and picture fluency is improved.
Second, because the sending interval between every two adjacent frames is determined from the frame rate, the time coefficient value, and the number of currently buffered frames, the rate at which frames are fed to the decoder stays essentially consistent with the frame rate at which the video receiving end receives them. The amount of data the sending end transmits per unit time is therefore approximately equal to the amount the receiving end passes to the decoder, keeping the video synchronized.
Drawings
FIG. 1 is a networking schematic of a video network of the present application;
FIG. 2 is a schematic diagram of a hardware architecture of a node server according to the present application;
fig. 3 is a schematic diagram of a hardware architecture of an access switch of the present application;
fig. 4 is a schematic diagram of a hardware structure of an ethernet protocol conversion gateway according to the present application;
fig. 5 is a flowchart of steps of an embodiment 1 of a video stream decoding method according to the present application;
fig. 6 is an application environment diagram of embodiment 1 of a video stream decoding method according to the present application;
fig. 7 is a flowchart illustrating steps of another embodiment of a video stream decoding method according to embodiment 1 of the present application;
fig. 8 is a block diagram of a video stream decoding apparatus embodiment 2 according to the present application;
fig. 9 is a block diagram of a video stream decoding apparatus according to embodiment 2 of the present application.
Detailed Description
In order to make the aforementioned objects, features and advantages of the present application more comprehensible, the present application is described in further detail with reference to the accompanying drawings and the detailed description.
Video networking is an important milestone in network development. It is a real-time network that can transmit high-definition video in real time, pushing many internet applications toward high-definition, face-to-face video.
Video networking uses real-time high-definition video switching technology to integrate, on one network platform, dozens of required services (video, voice, pictures, text, communication, data, and so on), such as high-definition video conferencing, video surveillance, intelligent monitoring and analysis, emergency command, digital broadcast television, time-shifted television, network teaching, live broadcast, VOD, television mail, Personal Video Recorder (PVR), intranet (self-run) channels, intelligent video broadcast control, and information distribution, delivering high-definition video through a television or a computer.
To better understand the embodiments of the present application, the video network is introduced below:
some of the technologies applied in the video networking are as follows:
network Technology (Network Technology)
The network technology of video networking improves on traditional Ethernet to handle the potentially enormous video traffic on the network. Unlike pure network Packet Switching or network Circuit Switching, video networking adopts Packet Switching to satisfy streaming requirements. It retains the flexibility, simplicity, and low cost of packet switching while providing the quality and security guarantees of circuit switching, achieving seamless network-wide switched virtual circuits and a unified data format.
Switching Technology (Switching Technology)
The video network combines the two advantages of Ethernet, asynchrony and packet switching, while eliminating Ethernet's defects under the premise of full compatibility. It offers end-to-end seamless connectivity across the whole network, connects directly to user terminals, and directly carries IP data packets. User data requires no format conversion anywhere in the network. Video networking is a higher-level form of Ethernet: a real-time exchange platform that enables network-wide, large-scale, real-time transmission of high-definition video, which the existing internet cannot achieve, pushing many network video applications toward high definition and unification.
Server Technology (Server Technology)
The server technology of the video networking and unified video platform differs from a traditional server: its streaming media transmission is built on a connection-oriented basis, its data processing capacity is independent of traffic and communication time, and a single network layer can carry both signaling and data transmission. For voice and video services, streaming media processing on the video networking and unified video platform is much simpler than general data processing, and efficiency is improved more than a hundredfold compared with a traditional server.
Storage Technology (Storage Technology)
To accommodate very large media content and very high traffic, the ultra-high-speed storage technology of the unified video platform uses an advanced real-time operating system. Program information in a server instruction is mapped to specific hard disk space, so media content no longer passes through the server but is sent directly and instantly to the user terminal, with typical user waiting time under 0.2 seconds. Optimized sector layout greatly reduces the mechanical seek motion of the hard disk head; resource consumption is only 20% of an IP internet system of the same grade, yet concurrent throughput is 3 times that of a traditional hard disk array, and overall efficiency improves more than tenfold.
Network Security Technology (Network Security Technology)
The structural design of the video network eliminates, at the structural level, the network security problems that trouble the internet, through measures such as per-use independent service permission control and complete isolation of equipment and user data. It generally needs no antivirus software or firewall, avoids hacker and virus attacks, and provides users with a structurally worry-free secure network.
Service Innovation Technology (Service Innovation Technology)
The unified video platform integrates services with transmission: whether for a single user, a private-network user, or a network aggregate, there is only a single automatic connection. The user terminal, set-top box, or PC connects directly to the unified video platform to obtain a variety of multimedia video services. The unified video platform replaces traditional, complex application programming with a menu-style configuration table, so complex applications can be realized with very little code, enabling unlimited new service innovation.
Networking of the video network is as follows:
the video network is a centralized control network structure, and the network can be a tree network, a star network, a ring network and the like, but on the basis of the centralized control node, the whole network is controlled by the centralized control node in the network.
As shown in fig. 1, the video network is divided into an access network and a metropolitan network.
The devices of the access network part can be mainly classified into 3 types: node server, access switch, terminal (including various set-top boxes, coding boards, memories, etc.). The node server is connected to an access switch, which may be connected to a plurality of terminals and may be connected to an ethernet network.
The node server is a node which plays a centralized control function in the access network and can control the access switch and the terminal. The node server can be directly connected with the access switch or directly connected with the terminal.
Similarly, devices of the metropolitan network portion may also be classified into 3 types: a metropolitan area server, a node switch and a node server. The metro server is connected to a node switch, which may be connected to a plurality of node servers.
The node server here is the same node server as in the access network part; that is, the node server belongs to both the access network part and the metropolitan area network part.
The metropolitan area server is a node which plays a centralized control function in the metropolitan area network and can control a node switch and a node server. The metropolitan area server can be directly connected with the node switch or directly connected with the node server.
Therefore, the whole video network is a network structure with layered centralized control, and the network controlled by the node server and the metropolitan area server can be in various structures such as tree, star and ring.
The access network part can form a unified video platform (the part in the dotted circle), and a plurality of unified video platforms can form a video network; each unified video platform may be interconnected via metropolitan area and wide area video networking.
Video networking device classification
1.1 devices in the video network of the embodiment of the present application can be mainly classified into 3 types: servers, switches (including ethernet gateways), terminals (including various set-top boxes, code boards, memories, etc.). The video network as a whole can be divided into a metropolitan area network (or national network, global network, etc.) and an access network.
1.2 wherein the devices of the access network part can be mainly classified into 3 types: node servers, access switches (including ethernet gateways), terminals (including various set-top boxes, code boards, memories, etc.).
The specific hardware structure of each access network device is as follows:
a node server:
as shown in fig. 2, the node server mainly includes a network interface module 201, a switching engine module 202, a CPU module 203, and a disk array module 204;
the network interface module 201, the CPU module 203, and the disk array module 204 all feed into the switching engine module 202. The switching engine module 202 looks up the incoming packet in the address table 205 to obtain its direction information, and stores the packet in the queue of the corresponding packet buffer 206 according to that direction information; if the queue of the packet buffer 206 is nearly full, the packet is discarded. The switching engine module 202 polls all packet buffer queues and forwards when the following conditions are met: 1) the port send buffer is not full; 2) the queue's packet counter is greater than zero. The disk array module 204 mainly implements control of the hard disk, including initialization, reads, and writes; the CPU module 203 is mainly responsible for protocol processing with the access switch and terminal (not shown in the figure), for configuring the address table 205 (including the downlink protocol packet address table, the uplink protocol packet address table, and the data packet address table), and for configuring the disk array module 204.
The access switch:
as shown in fig. 3, the access switch mainly includes a network interface module (a downlink network interface module 301 and an uplink network interface module 302), a switching engine module 303, and a CPU module 304;
a packet arriving from the downlink network interface module 301 (uplink data) enters the packet detection module 305. The packet detection module 305 checks whether the Destination Address (DA), Source Address (SA), packet type, and packet length of the packet meet the requirements; if so, it allocates a corresponding stream identifier (stream-id) and passes the packet to the switching engine module 303, otherwise the packet is discarded. A packet arriving from the uplink network interface module 302 (downlink data) enters the switching engine module 303, as does a data packet from the CPU module 304. The switching engine module 303 looks up the incoming packet in the address table 306 to obtain its direction information. If the packet entering the switching engine module 303 travels from a downlink network interface to an uplink network interface, it is stored in the queue of the corresponding packet buffer 307 in association with its stream-id; otherwise it is stored in the queue of the corresponding packet buffer 307 according to its direction information. In either case, if the queue of the packet buffer 307 is nearly full, the packet is discarded.
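The admission step performed by the packet detection module can be sketched as follows. The field names, the length limit, and the dictionary representation are assumptions for illustration; the patent only says the module checks DA, SA, packet type, and length, then assigns a stream-id or discards:

```python
def classify_uplink_packet(pkt: dict, next_stream_id: int):
    """Sketch of the packet detection step described above.  Field names and
    the length bound are assumptions, not values from the patent."""
    required = ("da", "sa", "ptype", "length")
    if not all(k in pkt for k in required):
        return None                      # malformed packet: discard
    if not (0 < pkt["length"] <= 1056):  # assumed maximum packet length
        return None                      # out-of-range length: discard
    pkt["stream_id"] = next_stream_id    # admit into the switching engine
    return pkt
```

A packet that passes the checks enters the switching engine tagged with its stream-id; anything else is dropped before it can occupy a buffer queue.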
The switching engine module 303 polls all packet buffer queues, which in this embodiment is divided into two cases:
if the queue is from the downlink network interface to the uplink network interface, the following conditions are met for forwarding: 1) the port send buffer is not full; 2) the queued packet counter is greater than zero; 3) obtaining a token generated by a code rate operation module;
if the queue is not from the downlink network interface to the uplink network interface, the following conditions are met for forwarding: 1) the port send buffer is not full; 2) the queue packet counter is greater than zero.
The code rate operation module 308 is configured by the CPU module 304 and generates tokens at programmable intervals for all packet buffer queues running from downlink network interfaces to uplink network interfaces, to control the rate of uplink forwarding.
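The three forwarding conditions for uplink queues, including the token granted by the code-rate module, can be sketched as a small class. The class and method names are assumptions; the logic follows the conditions listed above:

```python
from collections import deque

class UplinkQueue:
    """Sketch of the uplink forwarding check described above: a queue is
    forwarded only when (1) the port send buffer has room, (2) the queue's
    packet counter is greater than zero, and (3) the code-rate module has
    granted a token.  Names are assumptions, not from the patent."""

    def __init__(self):
        self.packets = deque()
        self.tokens = 0

    def grant_token(self):
        # Called by the (assumed) code-rate module at programmable intervals.
        self.tokens += 1

    def try_forward(self, port_send_buffer_free):
        if not port_send_buffer_free:   # condition 1: send buffer not full
            return None
        if not self.packets:            # condition 2: packet counter > 0
            return None
        if self.tokens <= 0:            # condition 3: token available
            return None
        self.tokens -= 1
        return self.packets.popleft()
```

Because tokens are issued at programmable intervals, the uplink rate is bounded regardless of how fast packets arrive from the downlink side.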
The CPU module 304 is mainly responsible for protocol processing with the node server, configuration of the address table 306, and configuration of the code rate operation module 308.
Ethernet protocol gateway:
as shown in fig. 4, the apparatus mainly includes a network interface module (a downlink network interface module 401 and an uplink network interface module 402), a switching engine module 403, a CPU module 404, a packet detection module 405, a code rate operation module 408, an address table 406, a packet buffer 407, a MAC adding module 409, and a MAC deleting module 410.
A data packet arriving from the downlink network interface module 401 enters the packet detection module 405. The packet detection module 405 checks whether the Ethernet MAC DA, Ethernet MAC SA, Ethernet length or frame type, video network destination address DA, video network source address SA, video network packet type, and packet length of the packet meet the requirements; if so, it allocates a corresponding stream identifier (stream-id), the MAC deletion module 410 strips the MAC DA, MAC SA, and length or frame type (2 bytes), and the packet enters the corresponding receive buffer; otherwise the packet is discarded;
the downlink network interface module 401 monitors the port's send buffer; if a packet is present, it obtains the Ethernet MAC DA of the corresponding terminal from the packet's destination address DA, prepends the terminal's Ethernet MAC DA, the Ethernet protocol gateway's MAC SA, and the Ethernet length or frame type, and sends the packet.
The other modules in the ethernet protocol gateway function similarly to the access switch.
A terminal:
the system mainly comprises a network interface module, a service processing module and a CPU module; for example, the set-top box mainly comprises a network interface module, a video and audio coding and decoding engine module and a CPU module; the coding board mainly comprises a network interface module, a video and audio coding engine module and a CPU module; the memory mainly comprises a network interface module, a CPU module and a disk array module.
1.3 The devices of the metropolitan area network part can mainly be classified into 3 types: the node server, the node switch, and the metropolitan area server. The node switch mainly comprises a network interface module, a switching engine module and a CPU module; the metropolitan area server mainly comprises a network interface module, a switching engine module and a CPU module.
2. Video networking packet definition
2.1 Access network packet definition
The data packet of the access network mainly comprises the following parts: destination Address (DA), Source Address (SA), reserved bytes, payload (pdu), CRC.
As shown in the following table, the data packet of the access network mainly includes the following parts:

| DA | SA | Reserved | Payload | CRC |
wherein:
the Destination Address (DA) consists of 8 bytes (byte): the first byte represents the type of the data packet (protocol packet, multicast data packet, unicast data packet, etc., with at most 256 possibilities), the second through sixth bytes are the metropolitan area network address, and the seventh and eighth bytes are the access network address;
the Source Address (SA) is also composed of 8 bytes (byte), defined as the same as the Destination Address (DA);
the reserved byte consists of 2 bytes;
the length of the payload part varies with the type of the datagram: it is 64 bytes for the various protocol packets, and 32 + 1024 = 1056 bytes for unicast packets; of course, the length is not limited to these 2 types;
the CRC consists of 4 bytes and is calculated in accordance with the standard ethernet CRC algorithm.
2.2 metropolitan area network packet definition
The topology of a metropolitan area network is a graph, and there may be 2 or even more connections between two devices; that is, there may be more than 2 connections between a node switch and a node server, and between one node switch and another. However, the metro network address of each metro network device is unique, so in order to describe the connection relationships between metro network devices accurately, a parameter is introduced in the embodiment of the present application: a label, which uniquely describes a metropolitan area network device.
In this specification, the definition of the label is similar to that of an MPLS (Multi-Protocol Label Switching) label. Assuming there are two connections between device A and device B, a packet from device A to device B has 2 labels, and a packet from device B to device A likewise has 2 labels. Labels are classified into incoming labels and outgoing labels: assuming the label of a packet entering device A (the incoming label) is 0x0000, the label of the packet when it leaves device A (the outgoing label) may become 0x0001. The network access process of the metro network is one of centralized control; that is, both address allocation and label allocation for the metro network are dominated by the metropolitan area server, while the node switches and node servers passively execute them. This differs from label allocation in MPLS, where labels are the result of mutual negotiation between the switch and the server.
As shown in the following table, the data packet of the metro network mainly includes the following parts:
| DA | SA | Reserved | Label | Payload | CRC |
Namely Destination Address (DA), Source Address (SA), Reserved bytes (Reserved), label, payload (PDU), and CRC. The format of the label may be defined as follows: the label is 32 bits, with the upper 16 bits reserved and only the lower 16 bits used; its position is between the reserved bytes and the payload of the data packet.
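As a toy illustration of the label mechanism described above (the table contents and function name here are hypothetical, not from the patent), a node device might swap an incoming label for an outgoing label like this:

```python
# Hypothetical per-device label-swap table: incoming label -> outgoing label,
# mirroring the in-label 0x0000 / out-label 0x0001 example above.
LABEL_TABLE = {0x0000: 0x0001}

def swap_label(label_field: int) -> int:
    # The label field is 32 bits; the upper 16 bits are reserved and only
    # the lower 16 bits carry the label.
    in_label = label_field & 0xFFFF
    return LABEL_TABLE[in_label] & 0xFFFF
```

Unlike MPLS, the contents of `LABEL_TABLE` would be pushed down by the metropolitan area server rather than negotiated between devices.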
Based on the above characteristics of the video network, one of the core concepts of the embodiments of the present application is provided. Following the video networking protocol, when a video receiving end receives a video stream from the video network, each received frame is first stored in a buffer queue. Each time a frame is sent from the buffer queue to the decoder, the interval before the next frame is sent is determined according to the number of frames currently buffered in the queue. The larger delay between an I frame and a P frame caused by the uniform packetization operation is thereby distributed evenly over the intervals at which the individual frames are sent, so that the video receiving end feeds the decoder at a smoothly varying interval. This solves the problem of video stutter caused by the long time needed to receive a key frame and improves the smoothness of the picture.
Example one
Referring to fig. 5, a flowchart illustrating the steps of embodiment 1 of a video stream decoding method according to the present application is shown. The method may be applied to a video network, where the video network may include a video receiving end configured with a decoder.
Referring to fig. 6, the video receiving end is communicatively connected with a video networking server. Before receiving video streams through the video network, the video receiving end must register with the video networking server; only after registration can it access the video network and receive stream data. Within the video network, both the instruction information sent by the video receiving end and the stream data it receives are forwarded through the video networking server.
In this embodiment, the video receiving end is a hardware device that can be used in the video network and is suitable for the network transmission protocol of the video network, and includes the hardware structure of the terminal described in section 1.2 above. The decoder may be an h.264 decoder, and may decode a video stream conforming to a video networking protocol in a video networking.
The video stream decoding method according to the embodiment of the present application may specifically include the following steps:
step 501, the video receiving end acquires timestamp information of a frame of video stream every time the video receiving end receives the frame of video stream, and sends the received frame of video stream to a buffer queue.
The video stream in the embodiment of the present application comes from the video network; that is, each frame received by the video receiving end conforms to the video networking protocol. A video stream under the video networking protocol differs from one transmitted over the internet in that its protocol header consists of socket information, including the protocol number of the data link layer, the interface index number, the header type, the packet type, the physical layer address length, and so on, and does not include the network layer protocol number, IP address, and similar fields carried by an internet protocol video stream. When the video receiving end receives a frame under the video networking protocol, the protocol header can therefore be parsed simply and quickly. Because no network layer protocol number or IP address is involved, the video stream is transmitted directly end to end within the video network, without a complex network layer protocol mechanism to determine the local network address of the destination terminal. Transmission is therefore efficient and fast, the receiving end works more efficiently, decoding is quicker, and the picture is smoother.
In the present application, a frame of the video stream can be understood as complete key frame data or complete difference frame data. A key frame is the first frame of a group of continuous pictures, or the frame in which a key action in the motion or change of a person or object occurs. It serves as the reference for subsequent difference frames and is a complete image: all data of the picture is retained, so the data volume is very large. A difference frame, also called a prediction frame, is predicted from a previous key frame or difference frame and keeps only the difference between the current frame and one or more preceding frames; it is partial image data, so its data volume is much smaller than a key frame's. Consequently, the time taken to receive a frame varies: receiving a key frame may take several times as long as receiving a difference frame. If the video receiving end simply handed each frame to the decoder as it arrived, the decoder would wait a long time for the next key frame after decoding the preceding difference frames. If that wait is long enough to be perceived by the human eye, the picture visibly stalls whenever a key frame appears, playback is not smooth, and the user experience is poor.
In the embodiment of the application, when the video receiving end receives a frame, it does not immediately send it to the decoder. Instead, the frame is first placed in the buffer queue, the delay between key frames and difference frames is distributed evenly, and the frames are then sent to the decoder in order at the evenly distributed intervals. This avoids the large difference between the decoder's decoding intervals for difference frames and key frames which, if recognizable by the human eye, produces obvious stuttering.
In the embodiment of the present application, timestamp information may be understood as a piece of complete, verifiable data indicating that a frame already existed at a specific point in time, thereby proving when the frame data was generated. Here, a frame comes into being at the video receiving end only when it is received, so the timestamp information of a frame records the time at which the frame was received; in this embodiment the timestamp is millisecond-level. A buffer queue may be understood as a linear storage structure, which can be regarded as a channel through which the data stream passes.
In a specific implementation, each time the video receiving end receives a frame, it sends the frame to the buffer queue, so that over a period of time several frames accumulate in the queue, arranged in the order of the times represented by the timestamp information they carry.
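Step 501 can be sketched as a timestamped enqueue. The class and method names below are illustrative only, a minimal sketch rather than the patented implementation:

```python
import collections
import time

class FrameBufferQueue:
    """Linear buffer holding received frames in receive-timestamp order."""

    def __init__(self):
        self._queue = collections.deque()

    def push(self, frame: bytes) -> int:
        ts_ms = int(time.time() * 1000)  # millisecond-level receive timestamp
        self._queue.append((ts_ms, frame))
        return ts_ms

    def pop(self):
        return self._queue.popleft()     # oldest frame leaves first

    def __len__(self):
        return len(self._queue)
```

Because frames are appended as they arrive and removed from the front, the queue naturally keeps them in the order of their timestamps.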
Step 502, the video receiving end determines the frame rate according to the respective timestamp information of the two adjacent frames of video streams.
In practice, the frame rate is the number of frames transmitted per second, and also the number of times the graphics processor can refresh the picture each second. In the embodiment of the application, the frame rate is determined from the timestamp information of adjacent frames: once the number of received frames reaches a preset count, the frame rate is computed from the timestamps of each pair of adjacent frames. Specifically, the video receiving end calculates the time interval between every two adjacent frames from their respective timestamps, averages these intervals, and derives the frame rate from the average interval, so that the picture refresh capability of the video receiving end matches its frame receiving rate. The preset count may be set to 3 or 5; that is, as soon as the video receiving end has received 3 or 5 frames, it determines the frame rate from the timestamps of each pair of adjacent frames.
For example, suppose the frame rate is determined after the first five frames are received, and the time values represented by their timestamps, in chronological order and accurate to the millisecond, are: 2016-08-04 10:34:42:100, 2016-08-04 10:34:42:132, 2016-08-04 10:34:42:150, 2016-08-04 10:34:42:180, 2016-08-04 10:34:42:232. The intervals between adjacent frames are then 32 ms, 18 ms, 30 ms and 52 ms, giving an average interval of 33 ms. This means the video receiving end receives one frame every 33 ms on average, i.e. 30 frames per second, so the frame rate may be set to 30. The video receiving end then processes 30 pictures per second, matching its frame receiving efficiency, and when decoding and playing, the refresh rate of the picture is consistent with the receiving rate, so the picture is smoother.
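The computation in this example can be reproduced with a short sketch (the function name is illustrative; only the millisecond parts of the five example timestamps are kept):

```python
def frame_rate_from_timestamps(ts_ms):
    """Derive the frame rate from per-frame millisecond timestamps (step 502)."""
    # Pairwise intervals between adjacent frames, then their average.
    intervals = [b - a for a, b in zip(ts_ms, ts_ms[1:])]
    avg_ms = sum(intervals) / len(intervals)
    return intervals, round(1000 / avg_ms)  # frames per second

# The five example timestamps, keeping only the millisecond offsets:
intervals, fps = frame_rate_from_timestamps([100, 132, 150, 180, 232])
```

This yields the intervals 32, 18, 30 and 52 ms, an average of 33 ms, and a frame rate of 30, matching the example.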
Step 503, when the video receiving end sends a frame of video stream to the decoder, the video receiving end determines a first sending interval time for sending a next frame of video stream according to the frame rate, the preset time coefficient value and the number of the currently buffered frame video streams in the buffer queue.
In the embodiment of the present application, after the video receiving end determines the frame rate, it still has to prevent the picture from stalling when the decoder waits too long between decoding the previous difference frame and the next key frame. The embodiment therefore distributes the interval between receiving a key frame and a difference frame evenly over the intervals at which the individual frames are sent; that is, the key-frame/difference-frame gap is smoothed, which solves the above problem.
In a specific implementation, each time the video receiving end sends a frame from the buffer queue to the decoder, it determines the first sending interval for the next frame from the frame rate, the set time coefficient value, and the number of frames currently buffered in the queue. In practice, the number of buffered frames at each send may be the same as, or different from, the previous time. With the scheme of step 503, the video receiving end does not send frames to the decoder at strictly equal intervals but at slightly fluctuating ones. This ensures both that every received frame reaches the decoder and that the variation in the decoder's per-frame decoding interval stays below what the human eye can recognize, improving the smoothness of the picture.
The time coefficient can be understood as the reference time-interval unit by which the gap between receiving a key frame and a difference frame is distributed evenly over the intervals at which the individual frames are sent. It can be determined in advance by the user from the frame rate recognizable to the human eye, and the data size of a key frame can be determined from the pictures that the video stream's capture device needs to acquire. For example, in a real-time video call the key frames carry a large amount of data and transmit slowly; the human eye generally cannot perceive stutter at frame rates above 24 frames per second but may perceive it below that, and since the frame interval at 24 frames per second is 42 milliseconds, the first sending interval must not exceed 42 milliseconds. The time coefficient value may accordingly be determined from the frame rate to be 1 millisecond.
In the embodiment of the present application, the frame rate may be determined once the number of buffered frames reaches a certain count. Accordingly, the video receiving end starts sending the first frame once the buffer queue holds the preset number of frames, and each time it sends a frame to the decoder it determines the first sending interval for the next frame from the frame rate, the preset time coefficient value, and the number of frames currently buffered in the queue.
In an alternative embodiment, step 503 may include the following sub-steps:
Sub-step 5031: the video receiving end judges whether the number of currently buffered frames in the buffer queue is greater than a preset number; if so, go to sub-step 5032; if not, go to sub-step 5033.
In sub-step 5032, the video receiving end determines the first sending interval time according to a preset first formula, where the first formula is:
t1 = 1000/n - m × (x1 - y)
in the first formula, t1 (an integer) is the first sending interval, n is the frame rate (so 1000/n is taken as a whole number of milliseconds, e.g. 33 ms for n = 30), m is the time coefficient value, x1 is the number of currently buffered frames, and y is the preset number. In practice, y may be set by the user and is generally the number of frames buffered in the queue under normal conditions.
For example, the frame rate is 30, the temporal coefficient value is 3, and the preset number is 5, that is, the video receiving end can receive 30 pictures within one second, and also has the capability of processing 30 frames of video streams, and the interval time of each frame of video stream is changed according to the number of frame video streams in the buffer queue.
When a frame is sent from the buffer queue for the Nth time and 7 frames are buffered, formula one gives a sending interval of 27 milliseconds between frames; that is, the (N+1)th frame is sent from the queue after an interval of 27 milliseconds. Here N is a positive integer, i.e. 1, 2, 3, and so on.
When the next frame is sent from the buffer queue for the (N+1)th time and 6 frames are buffered, formula one gives a sending interval of 30 milliseconds, so the next frame is sent from the queue after an interval of 30 milliseconds.
When the next frame is sent from the buffer queue for the (N+2)th time and 5 frames are buffered, formula one gives a sending interval of 33 milliseconds, so the next frame is sent from the queue after an interval of 33 milliseconds.
In sub-step 5033, the video receiving end determines the first sending interval time according to a preset second formula, where the second formula is:
t2 = 1000/n + m × (y - x2)

and t2 is an integer.
In the second formula, t2 is the first transmission interval time, n is the frame rate, m is the time coefficient value, x2 is the number of the currently buffered frame video streams, and y is the preset number. In practice, y in the formula may be set by a user, and is generally the number of frame video streams buffered in the buffering queue under a normal condition.
For example, the frame rate is 30, the temporal coefficient value is 3, and the preset number is 5, that is, the video receiving end can receive 30 pictures within one second, and also has the capability of processing 30 frames of video streams, and the interval time of each frame of video stream is changed according to the number of frame video streams in the buffer queue.
When the next frame is sent from the buffer queue for the Nth time and 4 frames are buffered, formula two gives a sending interval of 36 milliseconds between frames, so the next frame is sent from the queue after an interval of 36 milliseconds.
When the next frame is sent from the buffer queue for the (N+1)th time and 3 frames are buffered, formula two gives a sending interval of 39 milliseconds, so the next frame is sent from the queue after an interval of 39 milliseconds.
In summary, the more frames in the buffer queue, the faster the video receiving end is receiving them, and formula one shortens the sending interval between adjacent frames accordingly; the fewer frames in the queue, the slower the receiving rate, and formula two lengthens the interval by one or more time coefficient values to match the receiving delay. The interval at which frames are sent to the decoder thus stays in dynamic balance with the rate at which the video receiving end receives them. In other words, the sending interval between frames differs from the average interval corresponding to the frame rate by N time coefficient values, so the gap between key frames and difference frames is distributed evenly over the per-frame sending intervals, improving the fluency of video playback.
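Formulas one and two, together with the worked examples above, can be checked with a minimal sketch (the function name is illustrative):

```python
def first_send_interval_ms(n: int, m: int, x: int, y: int) -> int:
    """First sending interval per formulas one and two.

    n: frame rate; m: time coefficient (ms); x: frames currently buffered;
    y: preset number (normal queue depth).
    """
    base = 1000 // n                   # average interval implied by the frame rate
    if x > y:
        return base - m * (x - y)      # formula one: queue filling up, speed up
    return base + m * (y - x)          # formula two: queue draining, slow down
```

With n = 30, m = 3 and y = 5, buffer depths of 7, 6, 5, 4 and 3 frames reproduce the 27, 30, 33, 36 and 39 millisecond intervals of the examples.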
It can also be seen from the two formulas that, once the frame rate is fixed, the smaller the time coefficient value, the larger the permitted variation in the number of buffered frames, i.e. the longer the time allowed for receiving a key frame, and hence the better the smoothness.
Meanwhile, because the sending interval between frames in the embodiment of the application floats up and down by the time coefficient value around the average interval corresponding to the frame rate, the number of frames received per unit time matches the number of frames decoded by the decoder per unit time; that is, the amount of data sent by the video sending end and the amount the video receiving end sends to the decoder are approximately the same per unit time, keeping the video synchronized.
Step 504, the video receiving end sequentially sends each frame of video stream buffered in the buffer queue to the decoder according to the first sending interval time; the decoder is used for decoding the video streams of the frames.
That is, each time the video receiving end finishes sending a frame, it waits the determined first sending interval and then sends the next one; in this manner every frame in the buffer queue is sent to the decoder, and the decoder decodes and plays each frame it receives.
According to the embodiment of the application, the delay incurred in receiving key frames is distributed evenly over the intervals at which the individual frames are sent, so the sending interval fluctuates only by the time coefficient value and never reaches a range recognizable by the human eye, solving the problem of video stutter caused by the long time needed to receive a key frame.
In an alternative embodiment, before step 503 the method may further include the following steps:
step S2, the video receiving end determines a second sending interval time according to the frame rate.
After the video receiving end determines the frame rate, the video receiving end may determine the frame interval time corresponding to the frame rate, that is, after the frame rate is determined, it may determine how many frames are sent within one second, and further obtain the interval time of each frame sent within one second.
Illustratively, taking the frame rate as 30 as an example, that is, 30 frames are transmitted within one second, the second transmission interval time is 33 milliseconds.
In step S3, the video receiving end sends the first frame received into the buffer queue to the decoder.
In practice, before or after the frame rate is determined, the video receiving end may first send the first frame video stream in the buffer queue to the decoder, that is, the video receiving end does not need to send the frame video stream until the number of the frame video streams reaches a certain number, so as to reduce the waiting time of the decoder.
Since the first frame is a key frame and the second frame and the following 2 or 3 frames are generally difference frames, those subsequent frames arrive comparatively quickly. After the first frame has been sent and the second sending interval has been determined, the second frame is therefore sent according to the second sending interval.
In practice, steps S2 and S3 determine the interval between sending the first frame and sending the second frame according to the second sending interval.
Based on the above steps S2 and S3, step 503 in the embodiment of the present application may specifically be the content of step 503':
Step 503': after the video receiving end has sent the second frame in the buffer queue to the decoder, each time it sends a frame to the decoder it determines the first sending interval for the next frame from the frame rate, the time coefficient value, and the number of frames currently stored in the buffer queue.
That is, after the video receiving end has sent the first frame video stream and the second frame video stream, it calculates the sending interval time between each frame video stream each time it sends a frame video stream, and sends each frame video stream according to the sending interval time.
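Steps S2, S3 and 503' together can be sketched as a single pacing loop. This is a hedged illustration under assumed names: `decode` stands in for handing one frame to the real decoder, and the interval rule is formulas one and two from above.

```python
import collections
import time

def drain_to_decoder(frames, decode, n=30, m=3, y=5):
    """Pace buffered frames into the decoder (steps S2/S3, then step 503')."""
    queue = collections.deque(frames)
    base = 1000 // n                            # second sending interval, ms
    if queue:
        decode(queue.popleft())                 # step S3: first frame at once
    if queue:
        time.sleep(base / 1000.0)               # step S2: wait one frame interval
        decode(queue.popleft())
    while queue:                                # step 503': per-frame pacing
        x = len(queue)
        gap = base - m * (x - y) if x > y else base + m * (y - x)
        time.sleep(gap / 1000.0)
        decode(queue.popleft())
```

The first frame goes out immediately to reduce the decoder's waiting time, the second follows after the frame-rate interval, and every later frame is paced by the queue-depth-dependent first sending interval.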
In another implementation manner of the embodiment of the present application, as shown in fig. 7, a flowchart illustrating steps of a video stream decoding method in the implementation manner is shown, and specifically includes the following steps:
step 701, the video receiving end acquires timestamp information of one frame of video stream every time the video receiving end receives one frame of video stream, and sends the received one frame of video stream to a buffer queue.
The specific content of step 701 may refer to the description of step 501.
Step 702, the video receiving end determines a frame rate according to respective timestamp information of two adjacent frames of video streams.
The specific content of step 702 may refer to the description of step 502.
In step 703, the video receiving end determines a time coefficient value according to the received timestamp information of the first frame video stream.
In practice, if the first frame video stream received by the video receiving end is a key frame video stream, the time taken for actually receiving the first frame video stream can be determined according to the timestamp information of the first frame video stream, and the time coefficient value can be determined according to the time taken.
That is, in the present embodiment, the determining, by the video receiving end, the time coefficient value according to the timestamp information of the received first frame video stream means: the video receiving end determines the time spent on receiving the first frame video stream according to the time stamp information of the first frame video stream, and determines the time coefficient value according to the spent time.
In a specific implementation, the video sending end packetizes each frame uniformly before sending it; that is, one frame is divided into several network packets, each carrying a frame identifier, a packet sequence number, and timestamp information. The video receiving end splices the packets that share a frame identifier back into one frame in sequence-number order, so it can determine the time spent receiving the first frame from the difference between the timestamp of the first network packet and that of the last network packet in the spliced frame, and calculate the time coefficient value from that elapsed time. In an alternative implementation, the time coefficient value may be determined from the difference between the time spent receiving the first frame and the minimum inter-frame interval recognizable by the human eye.
For example, suppose receiving the first frame takes 45 milliseconds, an interval long enough to be recognized by the human eye. Each later key frame that takes about 45 milliseconds to arrive noticeably drains the buffer queue within the same period (100 milliseconds, say), and if frames were forwarded as received, the 45-millisecond gap between a difference frame and a key frame would make the video visibly stall. If instead the 45 milliseconds spent on the first frame is distributed evenly over the intervals at which the subsequent frames are sent, the per-frame sending interval smooths out the delay introduced by receiving key frames and stays below what the human eye can recognize. Specifically, because several difference frames lie between key frames and difference frames are received very quickly, spreading the key frame's receiving time over the difference-frame intervals, which effectively lengthens each difference frame's sending interval, leaves ample time for each key frame to arrive. The key frame's long receiving time is thereby offset, its sending interval stays outside the range recognizable by the human eye, and the stutter problem is solved.
Based on this, in the example in which the time taken to receive the first frame video stream is 45 milliseconds and the minimum frame video interval time recognizable to the human eye is 42 milliseconds, the time coefficient value may be set to 3 milliseconds (45 minus 42).
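Under this alternative, the time coefficient value is simply the excess of the first-frame receive time over the minimum interval the human eye can recognize; a minimal sketch, where the 42 ms threshold is the example value from the text:

```python
MIN_RECOGNIZABLE_MS = 42  # minimum frame interval recognizable to the human eye (example value)

def time_coefficient(first_frame_recv_ms: int) -> int:
    """Excess of the first-frame receive time over the recognizable threshold,
    clamped at zero for frames that arrive faster than the threshold."""
    return max(first_frame_recv_ms - MIN_RECOGNIZABLE_MS, 0)

# A 45 ms receive time yields a time coefficient value of 3 ms, as above.
```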
In an alternative embodiment, it also follows from the first formula that the smaller the time coefficient value, the larger the permitted difference in the number of frame video streams in the buffer queue, that is, the longer the time allowed for receiving a key frame video stream. The time coefficient value can therefore also be determined from a matching comparison table, in which multiple sets of time length values and time coefficient values are stored in a one-to-one relationship.
Illustratively, when the time taken to receive the first frame video stream is 45 milliseconds, the matching comparison table gives a corresponding time coefficient value of 2 milliseconds; when it is 46 milliseconds, the table gives a time coefficient value of 1 millisecond.
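The matching comparison table variant can be sketched as a plain mapping; the table below contains only the two example entries from the text, and the default value is an assumption for illustration:

```python
# Example matching comparison table: first-frame receive time (ms) -> time coefficient value (ms)
MATCHING_TABLE = {45: 2, 46: 1}

def time_coefficient_from_table(recv_ms: int, default: int = 0) -> int:
    """Look up the time coefficient value for a given receive time."""
    return MATCHING_TABLE.get(recv_ms, default)
```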
Step 704, when the video receiving end sends a frame of video stream to the decoder, the video receiving end determines a first sending interval time for sending the next frame of video stream according to the frame rate, the time coefficient value and the number of the currently buffered frame video streams in the buffer queue.
The specific content of step 704 may refer to the description of step 503.
Optionally, step 704 may be preceded by the steps of:
step S4, the video receiving end determines a second sending interval time according to the frame rate.
After the video receiving end determines the frame rate, it can determine the frame interval time corresponding to that frame rate: once the frame rate is known, the number of frames sent within one second is known, from which the interval between the frames sent within that second is obtained.
Illustratively, taking a frame rate of 30 as an example, that is, 30 frames transmitted within one second, the second transmission interval time is 33 milliseconds (1000/30, rounded down to a whole millisecond).
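The second sending interval is just the per-frame interval implied by the frame rate; a one-line sketch in integer milliseconds, matching the 30 fps to 33 ms example:

```python
def second_sending_interval_ms(frame_rate: int) -> int:
    """Interval between frames, in whole milliseconds, for a given frame rate."""
    return 1000 // frame_rate

# A frame rate of 30 gives 33 ms between frames; 25 gives 40 ms.
```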
Step S5, the video receiving end sends the first frame video stream received into the buffer queue to the decoder.
In practice, when a predetermined number of frame video streams are buffered in the buffer queue, the first frame video stream is sent to the decoder.
Illustratively, when the preset number is 5, the first frame video stream received is sent to the decoder once 5 frame video streams have been buffered in the buffer queue.
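The start-up condition of steps S4 and S5 (begin draining the buffer queue only once it holds the preset number of frames) can be sketched as follows; the function and variable names are hypothetical, with `deque` standing in for the buffer queue:

```python
from collections import deque

PRESET_COUNT = 5  # example preset number from the text

buffer_queue = deque()
started = False

def on_frame_received(frame, send_to_decoder):
    """Buffer incoming frame video streams; once PRESET_COUNT frames are
    buffered, send the first frame received to the decoder."""
    global started
    buffer_queue.append(frame)
    if not started and len(buffer_queue) >= PRESET_COUNT:
        send_to_decoder(buffer_queue.popleft())  # first frame received goes out first
        started = True
```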
Based on the above steps S4 and S5, step 704 of the present embodiment may specifically be the content of step 704':
step 704', after the video receiving end sends the second frame video stream in the buffer queue to the decoder (at an interval of the second sending interval time after the first), each time it sends a frame video stream to the decoder it determines the first sending interval time for sending the next frame video stream according to the frame rate, the time coefficient value and the number of the frame video streams currently stored in the buffer queue.
That is, the video receiving end determines the first sending interval time only when sending the frame video streams that follow the first and second frame video streams.
As an alternative example of this embodiment, step 704 may specifically include the following sub-steps:
in sub-step 7041, the video receiving end determines whether the number of currently buffered frame video streams in the buffer queue is greater than a preset number; if so, it proceeds to sub-step 7042; if not, it proceeds to sub-step 7043.
In sub-step 7042, the video receiving end determines the first sending interval time according to a preset first formula;
and in sub-step 7043, the video receiving end determines the first sending interval time according to a preset second formula.
Sub-step 7042 may refer to the description of step 5031, and sub-step 7043 may refer to the description of step 5032.
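The formula-selection logic of sub-steps 7041 to 7043 can be sketched as below. Note that the patent's first and second formulas appear only as images in this publication, so the two formula bodies here are illustrative stand-ins, not the actual formulas: they shorten the interval when the queue holds more than the preset number of frames and lengthen it otherwise, which matches the qualitative behavior described in the text:

```python
def first_sending_interval_ms(n: int, m: int, x: int, y: int) -> int:
    """Dispatch between the two pacing formulas by comparing the buffered
    frame count x with the preset number y (sub-steps 7041-7043).
    n is the frame rate and m the time coefficient value.  The formula
    bodies are stand-ins, NOT the patent's formulas (which are images)."""
    base = 1000 // n  # nominal inter-frame interval for frame rate n
    if x > y:
        # Queue over-full: shorten the interval (stand-in for the first formula).
        return max(base - m * (x - y), 0)
    # Queue at or below the preset number: lengthen it (stand-in for the second formula).
    return base + m * (y - x)
```

With n = 30, m = 3, y = 5, a queue of 7 frames gives 27 ms and a queue of 4 frames gives 36 ms, draining or refilling the buffer toward the preset level.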
Step 705, the video receiving end sequentially sends each frame of video stream buffered in the buffer queue to the decoder according to the first sending interval time; the decoder is used for decoding the video streams of the frames.
The specific content of step 705 may refer to the description of step 504.
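Step 705's drain loop can be sketched as follows; `interval_for` stands in for the formula-based computation of the first sending interval, and all names are illustrative:

```python
import time
from collections import deque

def drain(buffer_queue: deque, send_to_decoder, interval_for):
    """Send each buffered frame video stream to the decoder in order,
    recomputing the sending interval after every frame (step 705).
    interval_for(queue_len) returns the next interval in milliseconds."""
    while buffer_queue:
        send_to_decoder(buffer_queue.popleft())
        if buffer_queue:
            time.sleep(interval_for(len(buffer_queue)) / 1000.0)
```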
To sum up, compared with video stream decoding in the prior art, the embodiments of the present application improve on two main aspects, namely the video networking environment in which video stream decoding is applied and the setting of the time interval for sending frame video streams, and obtain the following beneficial effects:
the method is applied to the video networking, and the video stream is the video stream conforming to the video networking protocol, so that the transmission speed, the receiving speed and the analysis and decoding efficiency of the video stream are improved, and the fluency of a final video stream picture is improved.
2. In the embodiment of the present application, because there are a plurality of difference frames between consecutive key frames, the time for receiving a difference frame is very short. When each frame of video stream is sent to the decoder, the interval for sending the next frame of video stream is determined according to the frame rate, the preset time coefficient value and the number of buffered frame video streams, so the time for receiving a key frame is evenly distributed over the interval of each difference frame. This is equivalent to delaying the sending interval of each difference frame, which leaves ample receiving time for each key frame; the long receiving time of a key frame can thus be offset in time, and the sending interval of the key frame stays outside the range recognizable to human eyes, thereby solving the problem of video stutter. Although the intervals at which the frame video streams are sent to the decoder are then not uniform, they remain outside the range recognizable to human eyes, so the stutter that would be caused by the overlong receiving time of certain key frame video streams is avoided and the fluency of the playing picture is improved.
3. The time coefficient value can be preset by a user, or set by the video receiving end according to the receiving time of the first frame video stream, which improves the practicability and flexibility of the method, allows the longer time needed to receive a key frame video stream to be better distributed on average, and achieves the expected picture fluency.
It should be noted that, for simplicity of description, the method embodiments are described as a series of acts or combinations of acts, but those skilled in the art will recognize that the embodiments are not limited by the order of acts described, as some steps may occur in other orders or concurrently. Further, those skilled in the art will also appreciate that the embodiments described in the specification are preferred embodiments, and the acts involved are not necessarily required by the embodiments of the application.
Embodiment 2
Referring to fig. 8, a block diagram of embodiment 2 of a video stream decoding apparatus according to the present application is shown. The apparatus may be applied to a video network that includes a video receiving end, in which a decoder is configured; the apparatus may be located at the video receiving end and may specifically include the following modules:
the video stream buffering module 801 is configured to send a received frame of video stream to a buffering queue every time the frame of video stream is received;
a timestamp information obtaining module 802, configured to obtain timestamp information of the frame of video stream;
a frame rate determining module 803, configured to determine a frame rate according to the respective timestamp information of two adjacent frames of video streams;
a first sending interval time determining module 804, configured to determine, when each frame of video stream is sent to the decoder, a first sending interval time for sending a next frame of video stream according to the frame rate, a preset time coefficient value, and the number of currently buffered frame video streams in the buffer queue;
a frame video stream sending module 805, configured to send each frame video stream buffered in the buffer queue to the decoder in sequence according to the first sending interval time; the decoder is used for decoding the frame video streams in sequence.
In an alternative embodiment, the apparatus further comprises:
a second sending interval time determining module 806, configured to determine a second sending interval time according to the frame rate;
a first sending module 807, configured to send the first frame video stream received first by the buffer queue to a decoder; and is used for sending the second frame video stream in the buffer queue to a decoder at intervals of the second sending interval time;
the first sending interval time determining module is configured to determine, after the first sending module sends the second frame of video stream in the buffer queue to the decoder, a first sending interval time for sending a next frame of video stream according to the frame rate and the number of currently stored frame of video streams in the buffer queue when each frame of video stream is sent to the decoder.
In an optional embodiment, the first transmission interval time determining module includes:
the first calculation submodule is used for determining the first sending interval time according to a preset first formula when the number of the frame video streams cached in the buffering queue at present is greater than a preset number; wherein, the first preset formula is as follows:
[first formula, shown as an image in the original publication]
in the first formula, t1 is the first sending interval time and takes an integer value, n is the frame rate, m is the time coefficient value, x1 is the number of currently buffered frame video streams, and y is the preset number.
The second calculation submodule is used for determining the first sending interval time according to a preset second formula when the number of the frame video streams currently cached in the buffering queue is smaller than or equal to a preset number;
wherein, the preset second formula is:
[second formula, shown as an image in the original publication]
t2 is an integer;
in the second formula, t2 is the first transmission interval time, n is the frame rate, m is the time coefficient value, x2 is the number of the currently buffered frame video streams, and y is the preset number.
In another embodiment, referring to fig. 9, another video stream decoding apparatus is shown, where the apparatus is applied to a video network, where the video network includes a video receiving end, and the apparatus may be located at the video receiving end, and specifically may include the following modules:
the video stream buffering module 901 is configured to send a received frame of video stream to a buffering queue every time the frame of video stream is received;
a timestamp information obtaining module 902, configured to obtain timestamp information of the frame of video stream;
a frame rate determining module 903, configured to determine a frame rate according to respective timestamp information of two adjacent frames of video streams;
a time coefficient value determining module 904, configured to determine a time coefficient value according to the received timestamp information of the first frame video stream;
a first sending interval time determining module 905, configured to determine, when each frame of video stream is sent to the decoder, a first sending interval time for sending a next frame of video stream according to the frame rate, the time coefficient value, and the number of currently buffered frame video streams in the buffer queue;
a frame video stream sending module 906, configured to send each frame video stream buffered in the buffer queue to the decoder in sequence according to the first sending interval time; the decoder is used for decoding the video streams of the frames in sequence.
In an optional embodiment, the first transmission interval time determining module includes:
the first calculation submodule is used for determining the first sending interval time according to a preset first formula when the number of the frame video streams cached in the buffering queue at present is greater than a preset number; wherein, the first preset formula is as follows:
[first formula, shown as an image in the original publication]
in the first formula, t1 is the first sending interval time and takes an integer value, n is the frame rate, m is the time coefficient value, x1 is the number of currently buffered frame video streams, and y is the preset number.
The second calculation submodule is used for determining the first sending interval time according to a preset second formula when the number of the frame video streams currently cached in the buffering queue is smaller than or equal to a preset number;
wherein, the preset second formula is:
[second formula, shown as an image in the original publication]
t2 is an integer;
in the second formula, t2 is the first transmission interval time, n is the frame rate, m is the time coefficient value, x2 is the number of the currently buffered frame video streams, and y is the preset number.
In an embodiment of the present application, a computer-readable storage medium is further provided, on which a computer program is stored, and the computer program, when executed by a processor, implements the contents of steps 501 to 504 and the contents of steps 701 to 705 in the method according to embodiment 1.
In an embodiment of the present application, an electronic device is further provided, which includes a memory, a processor, and a computer program stored in the memory and executable on the processor, and when the processor executes the computer program, the electronic device implements the contents of steps 501 to 504 and the contents of steps 701 to 705 in the method described in embodiment 1.
For the device embodiment, since it is basically similar to the method embodiment, the description is simple, and for the relevant points, refer to the partial description of the method embodiment.
The embodiments in the present specification are described in a progressive manner, each embodiment focuses on differences from other embodiments, and the same and similar parts among the embodiments are referred to each other.
As will be appreciated by one of skill in the art, embodiments of the present application may be provided as a method, apparatus, or computer program product. Accordingly, embodiments of the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, embodiments of the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
Embodiments of the present application are described with reference to flowchart illustrations and/or block diagrams of methods, terminal devices (systems), and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing terminal to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing terminal, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing terminal to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing terminal to cause a series of operational steps to be performed on the computer or other programmable terminal to produce a computer implemented process such that the instructions which execute on the computer or other programmable terminal provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
While preferred embodiments of the present application have been described, additional variations and modifications of these embodiments may occur to those skilled in the art once they learn of the basic inventive concepts. Therefore, it is intended that the appended claims be interpreted as including the preferred embodiment and all such alterations and modifications as fall within the true scope of the embodiments of the application.
Finally, it should also be noted that, herein, relational terms such as first and second, and the like may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or terminal that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or terminal. Without further limitation, an element defined by the phrase "comprising a …" does not exclude the presence of other like elements in a process, method, article, or terminal that comprises the element.
The foregoing describes a video stream decoding method and a corresponding video stream decoding apparatus provided in the present application in detail, and specific examples are applied herein to illustrate the principles and embodiments of the present application, and the description of the foregoing embodiments is only used to help understand the method and the core ideas of the present application; meanwhile, for a person skilled in the art, according to the idea of the present application, there may be variations in the specific embodiments and the application scope, and in summary, the content of the present specification should not be construed as a limitation to the present application.

Claims (10)

1. A method for decoding a video stream, the method being applied to a video network, the video network comprising a video receiving end, the video receiving end being configured with a decoder, the method comprising:
the video receiving end acquires the timestamp information of one frame of video stream when receiving one frame of video stream, and sends the received one frame of video stream to a buffer queue;
the video receiving end determines a frame rate according to respective timestamp information of two adjacent frames of video streams;
when the video receiving end sends a frame of video stream to the decoder, the video receiving end determines a first sending interval time for sending the next frame of video stream according to the frame rate, the preset time coefficient value and the number of the frame video streams currently cached in the buffer queue, and the method comprises the following steps: calculating the first sending interval time by using a preset first formula or a preset second formula;
the video receiving end sequentially sends each frame of video stream cached in the buffer queue to the decoder according to the first sending interval time; the decoder is used for decoding the video streams of the frames;
wherein the first formula is:
[first formula, shown as an image in the original publication]
the second formula is:
[second formula, shown as an image in the original publication]
t1 and t2 are the first sending interval times, n is the frame rate, and m is the time coefficient value; x1 and x2 are the numbers of currently buffered frame video streams, and y is a preset number; the time coefficient value refers to a reference time interval value that equally distributes the interval time between the reception of the key frame video stream and of the difference frame video stream over the time intervals at which the respective frame video streams are sent.
2. The method of claim 1, wherein before the step of determining the first transmission interval time for transmitting the next frame video stream according to the frame rate, the preset time coefficient value and the number of currently buffered frame video streams in the buffer queue when the video receiving end transmits one frame video stream to the decoder, the method further comprises:
the video receiving end determines a second sending interval time according to the frame rate;
the video receiving end sends the first frame video stream received by the buffer queue firstly to a decoder;
when the video receiving end sends a frame of video stream to the decoder, the step of determining a first sending interval time for sending the next frame of video stream according to the frame rate, the preset time coefficient value and the number of the frame of video streams currently cached in the buffer queue comprises the following steps:
after an interval of the second sending interval time, the video receiving end sends the second frame video stream in the buffer queue to the decoder; thereafter, when sending a frame video stream to the decoder, the video receiving end determines the first sending interval time for sending the next frame video stream according to the frame rate, the time coefficient value and the number of the frame video streams currently stored in the buffer queue.
3. The method as claimed in claim 1, wherein the step of determining, at the video receiving end, a first sending interval time for sending a next frame video stream according to the frame rate, the preset time coefficient value and the number of the currently buffered frame video streams in the buffer queue when sending a frame video stream to the decoder comprises the following steps:
the video receiving end judges whether the number of the frame video streams currently cached in the buffer queue is larger than a preset number or not;
if so, the video receiving end determines the first sending interval time according to a preset first formula;
if not, the video receiving end determines the first sending interval time according to a preset second formula.
4. A method for decoding a video stream, the method being applied to a video network, the video network comprising a video receiving end, the video receiving end being configured with a decoder, the method comprising:
the video receiving end acquires the timestamp information of one frame of video stream when receiving one frame of video stream, and sends the received one frame of video stream to a buffer queue;
the video receiving end determines a frame rate according to respective timestamp information of two adjacent frames of video streams;
the video receiving end determines a time coefficient value according to the received time stamp information of the first frame video stream, and the method comprises the following steps: determining the time spent on receiving the first frame video stream according to the difference between the timestamp information of the first network packet and the timestamp information of the last network packet in the spliced first frame video stream, and calculating the time coefficient value according to the spent time;
when the video receiving end sends a frame of video stream to the decoder, the video receiving end determines a first sending interval time for sending the next frame of video stream according to the frame rate, the time coefficient value and the number of the frame video streams currently cached in the buffer queue, and the method comprises the following steps: calculating the first sending interval time by using a preset first formula or a preset second formula;
the video receiving end sequentially sends each frame of video stream cached in the buffer queue to the decoder according to the first sending interval time; the decoder is used for decoding the video streams of the frames;
wherein the first formula is:
[first formula, shown as an image in the original publication]
the second formula is:
[second formula, shown as an image in the original publication]
t1 and t2 are the first sending interval times, n is the frame rate, and m is the time coefficient value; x1 and x2 are the numbers of currently buffered frame video streams, and y is the preset number.
5. The method as claimed in claim 4, wherein the step of determining the first transmission interval time for transmitting the next frame video stream by the video receiving end according to the frame rate, the time coefficient value and the number of currently buffered frame video streams in the buffer queue when transmitting one frame video stream to the decoder comprises:
the video receiving end judges whether the number of the frame video streams currently cached in the buffer queue is larger than a preset number or not;
if so, the video receiving end determines the first sending interval time according to a preset first formula;
if not, the video receiving end determines the first sending interval time according to a preset second formula.
6. The device for decoding the video stream is applied to a video network, wherein the video network comprises a video receiving end, and a decoder is arranged in the video receiving end; the device is located at the video receiving end and comprises:
the video stream buffering module is used for sending a received frame of video stream to a buffering queue every time the frame of video stream is received;
the time stamp information acquisition module is used for acquiring the time stamp information of the frame of video stream;
the frame rate determining module is used for determining the frame rate according to the respective timestamp information of two adjacent frames of video streams;
a first sending interval time determining module, configured to determine, when each frame of video stream is sent to the decoder, a first sending interval time for sending a next frame of video stream according to the frame rate, a preset time coefficient value, and the number of currently buffered frame video streams in the buffer queue, where the first sending interval time determining module includes: calculating the first sending interval time by using a preset first formula or a preset second formula;
a frame video stream sending module, configured to send each frame video stream buffered in the buffer queue to the decoder in sequence according to the first sending interval time; the decoder is used for sequentially decoding the video streams of each frame;
wherein the first formula is:
[first formula, shown as an image in the original publication]
the second formula is:
[second formula, shown as an image in the original publication]
t1 and t2 are the first sending interval times, n is the frame rate, and m is the time coefficient value; x1 and x2 are the numbers of currently buffered frame video streams, and y is a preset number; the time coefficient value refers to a reference time interval value that equally distributes the interval time between the reception of the key frame video stream and of the difference frame video stream over the time intervals at which the respective frame video streams are sent.
7. The apparatus of claim 6, further comprising:
a second sending interval time determining module, configured to determine a second sending interval time according to the frame rate;
a first sending module, configured to send a first frame video stream received by the buffer queue first to a decoder; and is used for sending the second frame video stream in the buffer queue to the decoder at intervals of the second sending interval time;
and the first sending interval time determining module is used for determining the first sending interval time for sending the next frame of video stream according to the frame rate, the time coefficient value and the number of the frame video streams cached currently in the buffer queue when sending one frame of video stream to the decoder after the first sending module sends the second frame of video stream to the decoder.
8. The apparatus of claim 6, wherein the first transmission interval time determining module comprises:
the first calculation submodule is used for determining the first sending interval time according to a preset first formula when the number of the frame video streams cached in the buffering queue at present is greater than a preset number;
and the second calculation submodule is used for determining the first sending interval time according to a preset second formula when the number of the frame video streams currently cached in the buffering queue is less than or equal to a preset number.
9. The device for decoding the video stream is applied to a video network, wherein the video network comprises a video receiving end, and a decoder is arranged in the video receiving end; the device is located at the video receiving end and comprises:
the video stream buffering module is used for sending a received frame of video stream to a buffering queue every time the frame of video stream is received;
the time stamp information acquisition module is used for acquiring the time stamp information of the frame of video stream;
the frame rate determining module is used for determining the frame rate according to the respective timestamp information of two adjacent frames of video streams;
a time coefficient value determining module, configured to determine a time coefficient value according to timestamp information of a received first frame video stream, including: determining the time spent on receiving the first frame video stream according to the difference between the timestamp information of the first network packet and the timestamp information of the last network packet in the spliced first frame video stream, and calculating the time coefficient value according to the spent time;
a first sending interval time determining module, configured to determine, when each frame of video stream is sent to the decoder, a first sending interval time for sending a next frame of video stream according to the frame rate, the time coefficient value, and the number of currently buffered frame video streams in the buffer queue, where the first sending interval time determining module includes: calculating the first sending interval time by using a preset first formula or a preset second formula;
a frame video stream sending module, configured to send each frame video stream buffered in the buffer queue to the decoder in sequence according to the first sending interval time; the decoder is used for sequentially decoding the video streams of each frame;
wherein the first formula is:
[first formula, shown as an image in the original publication]
the second formula is:
[second formula, shown as an image in the original publication]
t1 and t2 are the first sending interval times, n is the frame rate, and m is the time coefficient value; x1 and x2 are the numbers of currently buffered frame video streams, and y is the preset number.
10. The apparatus of claim 9, wherein the first sending interval time determining module comprises:
a first calculation submodule, configured to determine the first sending interval time according to the preset first formula when the number of frames currently buffered in the buffer queue is greater than the preset number; and
a second calculation submodule, configured to determine the first sending interval time according to the preset second formula when the number of frames currently buffered in the buffer queue is less than or equal to the preset number.
CN201910205510.5A 2019-03-18 2019-03-18 Video stream decoding method and device Active CN110072143B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910205510.5A CN110072143B (en) 2019-03-18 2019-03-18 Video stream decoding method and device

Publications (2)

Publication Number Publication Date
CN110072143A CN110072143A (en) 2019-07-30
CN110072143B true CN110072143B (en) 2021-03-12

Family

ID=67366343

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910205510.5A Active CN110072143B (en) 2019-03-18 2019-03-18 Video stream decoding method and device

Country Status (1)

Country Link
CN (1) CN110072143B (en)

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103139636A (en) * 2011-12-05 2013-06-05 优视科技有限公司 Streaming media data processing method and device and streaming media data reproduction equipment
CN104683866A (en) * 2013-11-29 2015-06-03 成都鼎桥通信技术有限公司 Playing processing method for received streaming video
CN104918133A (en) * 2014-03-12 2015-09-16 北京视联动力国际信息技术有限公司 Method and device for playing video streams in articulated naturality web
CN106385620A (en) * 2016-10-25 2017-02-08 浙江红苹果电子有限公司 Data smoothing and outputting method based on streaming media
CN107371061A (en) * 2017-08-25 2017-11-21 普联技术有限公司 A kind of video stream playing method, device and equipment
CN108347645A (en) * 2018-01-19 2018-07-31 浙江大华技术股份有限公司 A kind of method and device that video frame decoding is shown
US10116989B1 (en) * 2016-09-12 2018-10-30 Twitch Interactive, Inc. Buffer reduction using frame dropping
CN109168083A (en) * 2018-10-23 2019-01-08 青岛海信电器股份有限公司 A kind of Streaming Media real time playing method and device

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8707139B2 (en) * 2006-10-18 2014-04-22 Kencast, Inc. Systems, methods, apparatus, and computer program products for providing forward error correction with low latency

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant