CN114302142A - Video encoding method, image transmission apparatus, and storage medium - Google Patents

Video encoding method, image transmission apparatus, and storage medium

Info

Publication number
CN114302142A
CN114302142A
Authority
CN
China
Prior art keywords
reference frame
packet loss
duration
preset
loss rate
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111586374.2A
Other languages
Chinese (zh)
Inventor
顾冬珏
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Mobile Communications Group Co Ltd
MIGU Interactive Entertainment Co Ltd
MIGU Culture Technology Co Ltd
Original Assignee
China Mobile Communications Group Co Ltd
MIGU Interactive Entertainment Co Ltd
MIGU Culture Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China Mobile Communications Group Co Ltd, MIGU Interactive Entertainment Co Ltd, MIGU Culture Technology Co Ltd filed Critical China Mobile Communications Group Co Ltd
Priority to CN202111586374.2A priority Critical patent/CN114302142A/en
Publication of CN114302142A publication Critical patent/CN114302142A/en
Pending legal-status Critical Current

Abstract

The invention discloses a video coding method, an image sending device, and a storage medium, and belongs to the field of image transmission. The video coding method comprises the following steps: acquiring the average packet loss rate of a video image in the current detection time period; determining a long-term reference frame according to the average packet loss rate, and acquiring the real-time average packet loss rate at each moment and the current round-trip delay of the current round-trip delay period at each moment; adjusting the reference frame insertion time interval according to the real-time average packet loss rate; determining a target long-term reference frame according to the current round-trip delay; and encoding each P frame of the video image using the target long-term reference frame and the reference frame insertion time interval. The invention thus effectively guarantees video transmission and image quality even when the network quality is poor.

Description

Video encoding method, image transmission apparatus, and storage medium
Technical Field
The present invention relates to the field of image transmission technologies, and in particular, to a video encoding method, an image sending device, and a storage medium.
Background
In the related art, a fixed long-term reference frame strategy is mainly adopted, i.e., a long-term reference frame (LTR, Long Term Reference) is inserted at a fixed period, or a fixed number of frames following the LTR frame are encoded with reference to the LTR frame.
However, when the network is poor, the receiving end may lose the LTR frame, so the transmission and image quality of the video cannot be effectively guaranteed.
Disclosure of Invention
The invention mainly aims to provide a video coding method, an image sending device, and a storage medium, so as to solve the problem that video transmission and image quality cannot be effectively guaranteed when the receiving end loses an LTR frame.
To achieve the above object, in a first aspect, the present invention provides a video encoding method, including:
acquiring the average packet loss rate of a video image in the current detection time period;
determining a long-term reference frame according to the average packet loss rate, and acquiring the real-time average packet loss rate of each moment and the current round-trip delay of the current round-trip delay period of each moment;
adjusting the insertion time interval of the reference frame according to the real-time average packet loss rate;
determining a target long-term reference frame according to the current round-trip delay;
encoding each P-frame of the video image using the target long-term reference frame and the reference frame insertion interval.
In one embodiment, the determining the target long-term reference frame according to the current round-trip delay includes:
judging whether the current round-trip delay is larger than a preset delay threshold value or not;
if the current round-trip delay is larger than the preset delay threshold, determining the long-term reference frame as a target long-term reference frame;
and if the current round-trip delay is smaller than the preset delay threshold, determining a target long-term reference frame according to feedback information sent by a receiving end.
In an embodiment, if the current round-trip delay is smaller than the preset delay threshold, determining a target long-term reference frame according to feedback information sent by a receiving end includes:
if the current round-trip delay is smaller than the preset delay threshold, judging whether feedback information of the historical long-term reference frame confirmed to be received by a receiving end is received;
if the feedback information is received, determining the historical long-term reference frame as a target long-term reference frame;
and if the feedback information is not received, screening the target long-term reference frame from each P frame according to the reference frame insertion time interval.
In an embodiment, before obtaining the average packet loss rate of the video image in the current detection time period, the method further includes:
acquiring the packet loss rate at the previous moment of the current moment and the duration of the previous detection time period;
and determining the duration of the current detection time period according to the packet loss rate and the duration of the previous detection time period.
In an embodiment, the determining the duration of the current detection time period according to the packet loss rate and the duration of the previous detection time period includes:
determining the ratio of the packet loss rate to a preset packet loss rate threshold;
if the ratio is greater than or equal to a preset reference value, determining the duration of the current detection time period according to the duration of the previous detection time period and a first preset formula: the first preset formula is as follows:
m_i = m_{i-1} - adjustmentFact^2;
if the ratio is smaller than the preset reference value, determining the duration of the current detection time period according to the duration of the previous detection time period and a second preset formula: the second preset formula is as follows:
m_i = m_{i-1} + 1;
wherein m_i is the duration of the current detection time period, m_{i-1} is the duration of the previous detection time period, and adjustmentFact is a specific value.
In an embodiment, the determining the duration of the current detection time period according to the packet loss rate and the duration of the previous detection time period includes:
obtaining the time length to be determined according to the packet loss rate and the time length of the previous detection time period;
if the duration to be determined is greater than or equal to a preset maximum detection duration, taking the preset maximum detection duration as the duration of the current detection time period;
if the duration to be determined is smaller than a preset maximum detection duration and larger than a preset minimum detection duration, taking the duration to be determined as the duration of the current detection time period;
and if the time length to be determined is less than or equal to the preset minimum detection time length, taking the preset minimum detection time length as the time length of the current detection time period.
In an embodiment, the adjusting the reference frame insertion time interval according to the real-time average packet loss rate includes:
obtaining a reference frame insertion time interval according to the real-time average packet loss rate, the preset packet loss rate threshold and a first preset formula; the first preset formula is as follows:
[Equation image omitted: first preset formula giving LTRFramePeriod from averageLossRate, n, and t.]
wherein LTRFramePeriod is the reference frame insertion time interval, averageLossRate is the real-time average packet loss rate, n is the preset packet loss rate threshold, and t is a natural number.
In an embodiment, after the adjusting the reference frame insertion time interval according to the real-time average packet loss rate, the method further includes:
and if the reference frame insertion time interval is larger than a preset maximum insertion time interval, taking the preset maximum insertion time interval as the reference frame insertion time interval.
In a second aspect, the present invention also provides an image transmission apparatus, including: a memory, a processor and a video encoding program stored on the memory and executable on the processor, the video encoding program being configured to implement the steps of the video encoding method as described above.
In a third aspect, the present invention also provides a computer-readable storage medium, on which a video coding program is stored, which when executed by a processor implements the steps of the video coding method as described above.
The embodiment of the invention provides a video coding method, an image sending device, and a storage medium. The video coding method determines a long-term reference frame for encoding according to the average packet loss rate in the current detection time period, so that a long-term reference frame can be inserted immediately and the degradation of image quality caused by losing a long-term reference frame (such as screen corruption) is avoided. It also determines a target long-term reference frame according to the current round-trip delay RTT, and adjusts the reference frame insertion time interval in real time according to the real-time average packet loss rate at each subsequent moment and the current round-trip delay of the current round-trip delay period, so that the insertion time interval of subsequent target long-term reference frames better matches the actual network transmission conditions. Different coding strategies are thus adopted under different RTTs, video quality is guaranteed under poor network conditions, the duration of screen corruption is reduced, and the video picture is smoother.
Drawings
FIG. 1 is a block diagram of an image sending apparatus according to the present invention;
FIG. 2 is a flowchart illustrating a video encoding method according to a first embodiment of the present invention;
FIG. 3 is a flowchart illustrating a second embodiment of a video encoding method according to the present invention;
FIG. 4 is a flowchart illustrating a video encoding method according to a third embodiment of the present invention;
FIG. 5 is a flowchart illustrating a video encoding method according to a fourth embodiment of the present invention.
The implementation, functional features and advantages of the objects of the present invention will be further explained with reference to the accompanying drawings.
Detailed Description
It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.
In the related art, a fixed long-term reference frame strategy is mainly adopted, that is, a long-term reference frame (LTR, Long Term Reference) is inserted at a fixed period, or a fixed number of frames following the LTR frame are encoded with reference to the LTR frame. However, when the network is poor, the receiving end may lose the LTR frame, so the transmission and image quality of the video cannot be effectively guaranteed.
Therefore, the video coding method provided by the application inserts an LTR immediately when the network becomes poor, dynamically adjusts the long-term reference frame insertion time interval according to the packet loss rate afterwards, and determines the coding strategy of each frame according to the round-trip delay. This overcomes the drawback of inserting long-term reference frames at a fixed period, so that the video coding better adapts to network transmission in both strong-network and weak-network environments and provides better video quality.
Referring to fig. 1, fig. 1 is a schematic structural diagram of an image sending apparatus in a hardware operating environment according to an embodiment of the present application.
As shown in fig. 1, the image transmission apparatus may include: a processor 1001, such as a Central Processing Unit (CPU), a communication bus 1002, a user interface 1003, a network interface 1004, and a memory 1005. The communication bus 1002 is used to implement connection and communication between these components. The user interface 1003 may include a display (Display) and an input unit such as a keyboard (Keyboard); optionally, the user interface 1003 may also include a standard wired interface and a wireless interface. The network interface 1004 may optionally include a standard wired interface and a wireless interface (e.g., a Wireless Fidelity (Wi-Fi) interface). The memory 1005 may be a Random Access Memory (RAM), or may be a Non-Volatile Memory (NVM) such as a disk memory. The memory 1005 may alternatively be a storage device separate from the processor 1001.
Those skilled in the art will appreciate that the configuration shown in fig. 1 does not constitute a limitation of the image transmission apparatus, which may include more or fewer components than those shown, combine some components, or use a different arrangement of components.
As shown in fig. 1, the memory 1005, which is a storage medium, may include therein an operating system, a data storage module, a bluetooth communication module, a user interface module, and a video encoding program.
In the image transmission apparatus shown in fig. 1, the network interface 1004 is mainly used for data communication with a network server, and the user interface 1003 is mainly used for data interaction with a user. The image transmission device of the present invention includes the processor 1001 and the memory 1005; through the processor 1001, the image transmission device calls the video encoding program stored in the memory 1005 and executes the video encoding method provided by the embodiments of the present application.
A first embodiment of a video coding method according to the present application is proposed based on the above-mentioned hardware devices, but not limited to the above-mentioned hardware devices. Referring to fig. 2, fig. 2 is a flowchart illustrating a video encoding method according to a first embodiment of the present application.
In this embodiment, the method includes:
s101, acquiring the average packet loss rate of a video image in the current detection time period;
in this embodiment, the main execution body of the video encoding method is an image sending device, the image sending device is configured to encode video image data in the cache and send the encoded video image data to a receiving end, and the receiving end decodes the encoded video image data after receiving the encoded video image data. It can be understood that the image sending device may be a video server, a cloud game server, or a live broadcast server, and the receiving end may be a mobile terminal such as a mobile phone or a tablet, or a terminal device such as a computer.
The image sending device can obtain the average packet loss rate in the current detection time period during the encoding and transmission process. The current detection time period is the time period whose end time is the current time and whose start time lies a certain duration before the current time. The average packet loss rate in the current detection time period is used to evaluate the network transmission quality of the period in which the current time lies. The duration of the current detection time period can be denoted as m_i, in seconds, so that the network transmission condition at the current time can be evaluated from the average packet loss rate within the m_i seconds before the current time. If the average packet loss rate is high, the network at the current time is poor; if the average packet loss rate is low, the network at the current time is good.
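By way of an illustrative sketch (the class and variable names below are assumptions and not part of the original disclosure), the sender can keep per-second loss samples and report their average over the most recent m_i seconds:

```python
from collections import deque

class LossRateMonitor:
    """Keeps per-second packet loss samples and reports their average over the
    most recent detection time period of m_i seconds (illustrative names)."""

    def __init__(self, initial_window_seconds=10):
        self.window_seconds = initial_window_seconds   # duration m_i of the current detection time period
        self.samples = deque()                         # (timestamp_sec, loss_rate) pairs

    def add_sample(self, timestamp_sec, loss_rate):
        self.samples.append((timestamp_sec, loss_rate))

    def average_loss_rate(self, now_sec):
        # Discard samples that fall outside the current detection time period.
        while self.samples and self.samples[0][0] < now_sec - self.window_seconds:
            self.samples.popleft()
        if not self.samples:
            return 0.0
        return sum(rate for _, rate in self.samples) / len(self.samples)
```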
Step S102, determining a long-term reference frame according to the average packet loss rate, and acquiring the real-time average packet loss rate of each moment and the current round-trip delay of the current round-trip delay period of each moment.
After the average packet loss rate is obtained, the current actual network condition can be determined from it. Specifically, the average packet loss rate may be compared with a preset packet loss rate threshold. If the average packet loss rate is greater than or equal to the preset packet loss rate threshold, the network condition is fluctuating sharply, the network is poor, and a long-term reference frame may be lost. If the average packet loss rate is lower than the preset packet loss rate threshold, the network condition is good.
Specifically, when the average packet loss rate is greater than or equal to the preset packet loss rate threshold, a long-term reference frame LTR can be inserted immediately, so as to avoid the degradation of image quality (such as screen corruption) caused by losing a long-term reference frame. When the average packet loss rate is lower than the preset packet loss rate threshold, encoding can continue according to the preset LTR insertion strategy without change.
This is done in order to avoid making the algorithm more complex by frequently inserting the long-term reference frame LTR when the network is poor, and so that the bit-rate allocation is more reasonable, network transmission is accommodated, and image quality is guaranteed.
After the average packet loss rate is obtained, it can be compared with the preset packet loss rate threshold. If the average packet loss rate is greater than or equal to the preset packet loss rate threshold for the first time, the network condition is fluctuating sharply, the network is poor, and a long-term reference frame may be lost; a long-term reference frame LTR can then be inserted immediately, so as to avoid the degradation of image quality (such as screen corruption) caused by losing a long-term reference frame.
It is understood that when the transmitting end starts to encode and transmit the video image data stream, the transmitting end inserts the LTR periodically or dynamically based on certain preset rules. At this time, if the network at the current time is detected to be poor, the current time is taken as the period starting time, and a long-term reference frame insertion period is entered. That is, after entering the long-term reference frame insertion period, a new reference frame insertion time interval or insertion strategy may be adopted to adapt to the network transmission in the weak network environment at this time.
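A minimal sketch of this trigger, assuming a caller-supplied insert_ltr callback (the names are illustrative, not from the original text):

```python
def maybe_start_ltr_period(average_loss_rate, loss_threshold, insert_ltr, now_sec):
    """When the average packet loss rate in the current detection time period
    first reaches the preset threshold, insert an LTR immediately and treat the
    current time as the start of the long-term reference frame insertion period.
    insert_ltr is a caller-supplied callback; all names are illustrative."""
    if average_loss_rate >= loss_threshold:
        insert_ltr()        # insert a long-term reference frame right away
        return now_sec      # period start time
    return None             # keep the existing preset LTR insertion strategy
```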
Within the long-term reference frame insertion period, the real-time average packet loss rate at each moment and the current round-trip delay of the current round-trip delay period at each moment are acquired.
Specifically, the image sending device obtains the current round-trip delay of the current round-trip delay period of each time in the encoding transmission process.
Specifically, the sending end constructs and sends a network probe packet to the receiving end, and the network probe packet carries a sending timestamp begin_ntp_time. After receiving the latest network probe packet, the receiving end records the current timestamp last_recv_ntp_time. The receiving end then constructs a network probe reply packet and sets the reply packet timestamp processTime = current ntp timestamp - last_recv_ntp_time. After receiving the network probe reply packet from the receiving end, the sending end records its arrival time cur_ntp_time.
Thus, the method for calculating the current round trip delay RTT is as follows:
RTT=cur_ntp_time-begin_ntp_time-processTime。
for the current time, a specific value of the current round trip delay RTT of the current round trip delay period may be obtained through query.
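The probe exchange and the RTT formula above can be sketched as follows (function names are assumptions; time.time() stands in for an NTP-synchronized clock):

```python
import time

def send_probe():
    """Sender: record the departure timestamp begin_ntp_time in the probe."""
    return {"begin_ntp_time": time.time()}

def build_probe_reply(last_recv_ntp_time):
    """Receiver: processTime = current timestamp - time the probe arrived, so the
    sender can subtract the receiver's processing delay from the measurement."""
    return {"processTime": time.time() - last_recv_ntp_time}

def compute_rtt(probe, reply, cur_ntp_time):
    """Sender: RTT = cur_ntp_time - begin_ntp_time - processTime."""
    return cur_ntp_time - probe["begin_ntp_time"] - reply["processTime"]
```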
Step S103, adjusting the insertion time interval of the reference frame according to the real-time average packet loss rate;
in this step, the reference frame insertion time interval is dynamically and adaptively adjusted according to the real-time average packet loss rate within the long-term reference frame insertion period. This overcomes the drawback of a fixed long-term reference frame period, so that the video coding better adapts to network transmission in both strong-network and weak-network environments and provides better video quality.
The reference frame insertion time interval is obtained adaptively from the average packet loss rate and the preset packet loss rate threshold, and is subsequently adjusted according to the real-time average packet loss rate, so that video quality is guaranteed, the duration of screen corruption is reduced, and the video picture is smoother.
And step S104, determining the target long-term reference frame according to the current round-trip delay.
Specifically, within the long-term reference frame insertion period, the target long-term reference frame is adjusted according to the current round-trip delay and a preset delay threshold, so that a suitable coding frame strategy is determined for the specific current round-trip delay and a suitable target long-term reference frame is set. This avoids the problems that an unsuitable target long-term reference frame aggravates screen corruption during decoding or even increases network congestion.
And step S105, encoding each P frame of the video image by using the target long-term reference frame and the reference frame insertion time interval.
After the target long-term reference frame and the reference frame insertion time interval have been determined according to the network transmission conditions, i.e., the coding strategy for any moment within the long-term reference frame insertion period has been determined, each P frame of the video image can be encoded using the target long-term reference frame and the reference frame insertion time interval to obtain the code stream data to be sent. The sending end sends this code stream data to the receiving end. After receiving the code stream data, the receiving end decodes it to obtain the video image data.
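A hedged sketch of such a send-side loop, with the encoder operations passed in as callbacks (all names are illustrative, not from the original disclosure):

```python
def encode_stream(frames, encode_ltr, encode_p, get_target_ltr, get_insert_interval):
    """Send-side loop: a frame is encoded and marked as an LTR whenever the
    reference frame insertion interval has elapsed since the last LTR; every
    other frame is encoded as a P frame referencing the target long-term
    reference frame. The encoder operations are passed in as callbacks."""
    frames_since_ltr = 0
    for frame in frames:
        if frames_since_ltr >= get_insert_interval():
            encode_ltr(frame)                         # insert a long-term reference frame
            frames_since_ltr = 0
        else:
            encode_p(frame, ref=get_target_ltr())     # P frame referencing the target LTR
            frames_since_ltr += 1
```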
It can be understood that, in the video transmission process, the sending end constantly monitors the network transmission condition in real time, and constantly obtains the real-time average packet loss rate of the current time and the current round-trip delay of the current round-trip delay period of the current time at each time.
In addition, in this embodiment, when a preset condition is detected to be satisfied, the long-term reference frame insertion period may be ended, i.e., the encoder returns to inserting LTRs periodically or dynamically according to the normal preset rules. Specifically, the condition for ending the long-term reference frame insertion period may be that transmission of the video image is completed, or that the packet loss rate of each of a number of consecutive frames is lower than the preset packet loss rate threshold, i.e., the network transmission quality has become good, for example when the packet loss rate of every frame within the current detection time period is lower than the preset packet loss rate threshold; this embodiment does not limit this.
In this embodiment, the video coding method determines the network state from the average packet loss rate, so that a long-term reference frame can be inserted in real time and the degradation of image quality caused by losing a long-term reference frame (such as screen corruption) is avoided. The reference frame insertion time interval is then adjusted in real time according to the average packet loss rate at each subsequent moment, so that the long-term reference frame insertion time interval better matches the actual network transmission conditions. A target long-term reference frame is determined according to the current round-trip delay RTT, and different coding strategies are adopted under different RTTs, so that video quality is guaranteed under poor network conditions, the duration of screen corruption is reduced, and the video picture is smoother.
Based on the above embodiments, a second embodiment of the video encoding method of the present application is proposed. Referring to fig. 3, fig. 3 is a flowchart illustrating a video encoding method according to a second embodiment of the present application.
In this embodiment, step S104 includes:
step A10, judging whether the current round-trip delay is larger than the preset delay threshold value;
step A20, if the current round-trip delay is larger than the preset delay threshold, determining the current long-term reference frame as a target long-term reference frame;
step A30, if the current round-trip delay is smaller than the preset delay threshold, determining the target long-term reference frame according to the feedback information sent by the receiving end.
Specifically, if the network round-trip delay RTT is greater than the preset delay threshold (the threshold is determined empirically, for example 500 ms), the long-term reference frame determined in step S102 is taken as the target long-term reference frame, and the normal P frames within the LTRFramePeriod are forced to be encoded with reference to this LTR frame. When the RTT is long, the feedback period is also long and feedback packets may be lost, which lengthens the LTR step and causes image quality to deteriorate rapidly. Therefore, only the LTR inserted at the current time is used as the target long-term reference frame for encoding.
If the network RTT is less than the preset delay threshold, the target long-term reference frame is further determined according to the feedback information sent by the receiving end. That is, when the network is good, the currently determined long-term reference frame is not used; a simpler coding strategy can be adopted, or LTRs can again be inserted periodically or dynamically according to the normal preset rules, so that video quality is guaranteed, the duration of screen corruption is reduced, and the video picture is smoother.
Specifically, referring to fig. 4, step a30 includes:
step A301, if the current round-trip delay is smaller than the preset delay threshold, determining whether feedback information of the historical long-term reference frame confirmed to be received by the decoding end is received.
Step A302, if the feedback information is received, executing the step of determining the historical long-term reference frame as the target long-term reference frame.
Step A303, if the feedback information is not received, screening out a target long-term reference frame from each P frame according to the reference frame insertion time interval.
When the transmitting end encodes the current frame, a historical LTR frame confirmed by the receiving end is used as the encoding reference, which guarantees better fluency of the video image. The advantage of this reference relationship is that all video frames received by the receiving end use confirmed historical LTR frames as reference frames, so any received frame can be decoded and displayed as long as it is complete. Therefore, the sender encodes P frames other than LTR frames using a previously acknowledged historical LTR as reference. That is, when the network is good, the currently determined long-term reference frame is not used and the previous historical LTR is still used, so that video quality is guaranteed, the duration of screen corruption is reduced, and the video picture is smoother.
It is understood that, if the feedback information is received, the history LTR is normally received by the receiving end, and at this time, the history LTR may be normally encoded with reference.
If the feedback information is not received, an unexpected situation that the history LTR is lost may occur. At this time, normal I-P frame encoding can be performed according to the reference frame insertion interval, but some of the P frames are screened out and marked as LTR frames, i.e. target long-term reference frames.
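The selection logic of steps A10-A303 can be summarized in the following sketch (names and the None convention are assumptions):

```python
def select_target_ltr(current_rtt_ms, rtt_threshold_ms, current_ltr, acked_history_ltr):
    """Mirrors steps A10-A303: with a large RTT the freshly inserted LTR is the
    reference; with a small RTT the last LTR acknowledged by the receiver is
    preferred; if no acknowledgement has arrived, None is returned and the
    caller falls back to marking P frames as LTR at the insertion interval."""
    if current_rtt_ms > rtt_threshold_ms:    # step A20
        return current_ltr
    if acked_history_ltr is not None:        # steps A301/A302
        return acked_history_ltr
    return None                              # step A303
```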
Based on the above embodiments, a third embodiment of the video encoding method of the present application is proposed. Referring to fig. 5, fig. 5 is a flowchart illustrating a video encoding method according to a third embodiment of the present application.
In this embodiment, the method includes:
step S201, a packet loss rate at a time previous to the current time and a duration of a previous detection time period are obtained.
In this embodiment, the packet loss rate is the packet loss rate at the previous moment; for example, if the previous moment is the preceding 1 s, the packet loss rate is the packet loss rate within that 1 s. The previous detection time period is the detection time period corresponding to the previous moment, i.e., the time period whose end time is the previous moment and whose start time lies the duration of the previous detection time period before it. At the previous moment, the sending end obtains the average packet loss rate of the video image within the duration of the previous detection time period before that moment.
In this embodiment, the sending end determines a new current detection time period at each moment. The specific value of the duration of the current detection time period changes as the packet loss rate changes. For example, every 1 s the sending end takes the last duration as the duration of the previous detection time period and dynamically determines a new duration for the current detection time period.
Step S202, determining the duration of the current detection time period according to the packet loss rate and the duration of the previous detection time period.
Specifically, step S202 includes:
step B10, determining the ratio of the packet loss rate to a preset packet loss rate threshold;
step B20, if the ratio is greater than or equal to a preset reference value, determining the duration of the current detection time period according to the duration of the previous detection time period and a first preset formula: the first preset formula is as follows:
m_i = m_{i-1} - adjustmentFact^2;
step B30, if the ratio is smaller than the preset reference value, determining the duration of the current detection time period according to the duration of the previous detection time period and a second preset formula: the second preset formula is as follows:
m_i = m_{i-1} + 1;
wherein m_i is the duration of the current detection time period, m_{i-1} is the duration of the previous detection time period, and adjustmentFact is the ratio described above.
Specifically,
[Equation image omitted: definition of adjustmentFact in terms of currSecondLossRate and n.]
where adjustmentFact is a specific value, currSecondLossRate is the packet loss rate, and n is the preset packet loss rate threshold. m_i may be taken as 10.
If the ratio is greater than or equal to the preset reference value, that is, the packet loss rate at the previous moment is high, the duration m_i of the current detection time period is made to shrink rapidly. Conversely, when the packet loss rate at the previous moment is low, the duration m_i of the current detection time period is increased.
After the duration m_i is obtained, the time period whose end time is the current time and which extends m_i backwards from the current time is the current detection time period for the current time.
Therefore, in this embodiment, the duration of the current detection time period corresponding to each moment is dynamically adjusted according to the packet loss rate, so that the network transmission quality at the current moment can be judged more accurately.
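A sketch of this adjustment, assuming adjustmentFact is the ratio of the previous second's packet loss rate to the preset threshold and assuming a preset reference value of 1.0 (the exact definitions are given only by an equation image in the original):

```python
def update_detection_window(prev_window_s, prev_loss_rate, loss_threshold, reference_value=1.0):
    """Steps B10-B30. adjustmentFact is assumed here to be the ratio of the
    previous second's packet loss rate to the preset threshold; the preset
    reference value of 1.0 is also an assumption."""
    adjustment_fact = prev_loss_rate / loss_threshold
    if adjustment_fact >= reference_value:
        # High packet loss: shrink the window quickly, m_i = m_{i-1} - adjustmentFact^2.
        return prev_window_s - adjustment_fact ** 2
    # Low packet loss: grow the window slowly, m_i = m_{i-1} + 1.
    return prev_window_s + 1
```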
As an embodiment, step S202 includes:
(1) obtaining the time length to be determined according to the packet loss rate and the time length of the previous detection time period;
(2) if the duration to be determined is greater than or equal to the preset maximum detection duration, taking the preset maximum detection duration as the duration of the current detection time period;
(3) if the duration to be determined is less than the preset maximum detection duration and greater than the preset minimum detection duration, taking the duration to be determined as the duration of the current detection time period;
(4) and if the duration to be determined is less than or equal to the preset minimum detection duration, taking the preset minimum detection duration as the duration of the current detection time period.
Specifically, in the present embodiment,
[Equation image omitted: the duration to be determined is clamped between the preset minimum and maximum detection durations.]
Therefore, when the duration to be determined is greater than the preset maximum detection duration, the preset maximum detection duration is taken as the duration of the current detection time period. When the duration to be determined is less than the preset minimum detection duration, the preset minimum detection duration is taken as the duration of the current detection time period. When the duration to be determined lies between the preset minimum and maximum detection durations, the duration to be determined is taken as the duration m_i of the current detection time period.
In this embodiment, the upper and lower limits of the duration of the current detection time period are bounded by the preset maximum and minimum detection durations, which prevents the duration from going out of this range and affecting the accuracy of the network quality detection result.
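A one-line sketch of this clamping (names are illustrative):

```python
def clamp_detection_window(pending_duration_s, min_detect_s, max_detect_s):
    """Keeps the duration of the current detection time period within the preset
    minimum and maximum detection durations (steps (1)-(4) above)."""
    return min(max_detect_s, max(min_detect_s, pending_duration_s))
```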
Step S203, obtaining an average packet loss rate of the video image in the current detection time period.
Step S204, determining a long-term reference frame according to the average packet loss rate, and acquiring the real-time average packet loss rate of each moment and the current round-trip delay of the current round-trip delay period of each moment.
And step S205, adjusting the reference frame insertion time interval according to the real-time average packet loss rate.
In this embodiment, the steps specifically include:
obtaining a reference frame insertion time interval of the long-term reference frame insertion period according to the real-time average packet loss rate, the preset packet loss rate threshold and a first preset formula; the first preset formula is as follows:
[Equation image omitted: first preset formula giving LTRFramePeriod from averageLossRate, n, and t.]
wherein LTRFramePeriod is the reference frame insertion time interval, averageLossRate is the real-time average packet loss rate, n is the preset packet loss rate threshold, and t is a natural number. Optionally, t is 3.
Therefore, when the network is good, that is, when the real-time average packet loss rate is low, the reference frame insertion time interval is adjusted to be somewhat shorter; when the network is poor, it is adjusted to be somewhat longer, so that both image quality and network transmission are taken into account.
Step S206, if the reference frame insertion time interval is greater than a preset maximum insertion time interval, using the preset maximum insertion time interval as the reference frame insertion time interval.
Specifically,
LTRFramePeriod = min(LTRFramePeriod, maxRefFrameNum)
where the preset maximum insertion time interval maxRefFrameNum may optionally be 16. If LTRFramePeriod is less than the preset maximum number of referenceable frames, LTRFramePeriod is left unchanged; otherwise LTRFramePeriod is set to 16.
In this step, the LTR insertion step is bounded by the preset maximum insertion time interval, so that an excessively large insertion interval does not degrade normal encoding quality.
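Since the first preset formula is shown only as an image, the sketch below uses an assumed monotone mapping (the interval grows with the loss-rate/threshold ratio, scaled by t); only the clamp to maxRefFrameNum = 16 is taken from the description:

```python
import math

def update_ltr_frame_period(average_loss_rate, loss_threshold, t=3, max_ref_frame_num=16):
    """The mapping from the loss rate to the interval is an assumption (it grows
    with the loss-rate/threshold ratio, scaled by the natural number t); only the
    final clamp to maxRefFrameNum = 16 is taken from the description."""
    raw_period = math.ceil(t * (1.0 + average_loss_rate / loss_threshold))
    return min(raw_period, max_ref_frame_num)
```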
And step S207, determining a target long-term reference frame according to the current round-trip delay.
And step S208, encoding each P frame of the video image by using the target long-term reference frame and the reference frame insertion time interval.
For ease of understanding, a specific embodiment is shown below:
The live broadcast server sends video stream data to the mobile terminal. During this process, the live broadcast server obtains the current round-trip delay RTT of the current round-trip delay period at the current moment and the average packet loss rate within the 10 s before the current moment. When the average packet loss rate at time 03:00 is greater than or equal to the preset packet loss rate threshold for the first time, the live broadcast server takes the current frame as a long-term reference frame LTR and enters a long-term reference frame insertion period. A reference frame insertion time interval of 16 s is calculated from the ratio of the average packet loss rate to the preset packet loss rate threshold. At this time, if RTT > 500 ms, the normal P frames within this first reference frame insertion time interval are encoded and transmitted with reference to this LTR.
At each moment after entering the long-term reference frame insertion period, the live broadcast server obtains the current round-trip delay RTT of the current round-trip delay period and the real-time average packet loss rate within the current detection time period corresponding to that moment, and continuously updates the reference frame insertion time interval according to the real-time average packet loss rate, so that a short LTR step is used when the network is good and a long LTR step is used when the network is poor, balancing image quality and network transmission.
At 30:10, the live broadcast server detects that the per-second packet loss rates over 10 consecutive seconds are all lower than the preset packet loss rate threshold; it then ends the long-term reference frame insertion period and returns to inserting LTRs according to the conventional LTR insertion algorithm for video coding and transmission.
Furthermore, an embodiment of the present invention provides a computer storage medium on which a video encoding program is stored; when executed by a processor, the video encoding program implements the steps of the video encoding method described above, so a detailed description is omitted here. Likewise, the beneficial effects, which are the same as those of the method, are not described again. For technical details not disclosed in the embodiments of the computer-readable storage medium of the present application, reference is made to the description of the method embodiments of the present application. By way of example, the program instructions may be deployed to be executed on one computing device, or on multiple computing devices located at one site, or on multiple computing devices distributed across multiple sites and interconnected by a communication network.
It will be understood by those skilled in the art that all or part of the processes of the methods of the embodiments described above can be implemented by a computer program, which can be stored in a computer-readable storage medium, and when executed, can include the processes of the embodiments of the methods described above. The storage medium may be a magnetic disk, an optical disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), or the like.
It should be noted that the above-described embodiments of the apparatus are merely schematic, where units illustrated as separate components may or may not be physically separate, and components illustrated as units may or may not be physical units, may be located in one place, or may be distributed on multiple network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the present embodiment. In addition, in the drawings of the embodiment of the apparatus provided by the present invention, the connection relationship between the modules indicates that there is a communication connection between them, and may be specifically implemented as one or more communication buses or signal lines. One of ordinary skill in the art can understand and implement it without inventive effort.
Through the above description of the embodiments, those skilled in the art will clearly understand that the present invention may be implemented by software plus necessary general hardware, and may also be implemented by special hardware including special integrated circuits, special CPUs, special memories, special components and the like. Generally, functions performed by computer programs can be easily implemented by corresponding hardware, and specific hardware structures for implementing the same functions may be various, such as analog circuits, digital circuits, or dedicated circuits. However, the implementation of a software program is a more preferable embodiment for the present invention. Based on such understanding, the technical solutions of the present invention may be embodied in the form of a software product, where the computer software product is stored in a readable storage medium, such as a floppy disk, a usb disk, a removable hard disk, a Read-only memory (ROM), a random-access memory (RAM), a magnetic disk or an optical disk of a computer, and includes instructions for enabling a computer device (which may be a personal computer, a server, or a network device) to execute the methods according to the embodiments of the present invention.
The above description is only a preferred embodiment of the present invention, and not intended to limit the scope of the present invention, and all modifications of equivalent structures and equivalent processes, which are made by using the contents of the present specification and the accompanying drawings, or directly or indirectly applied to other related technical fields, are included in the scope of the present invention.

Claims (10)

1. A method of video encoding, the method comprising:
acquiring the average packet loss rate of a video image in the current detection time period;
determining a long-term reference frame according to the average packet loss rate, and acquiring the real-time average packet loss rate of each moment and the current round-trip delay of the current round-trip delay period of each moment;
adjusting the insertion time interval of the reference frame according to the real-time average packet loss rate;
determining a target long-term reference frame according to the current round-trip delay;
encoding each P-frame of the video image using the target long-term reference frame and the reference frame insertion interval.
2. The video coding method of claim 1, wherein determining the target long-term reference frame according to the current round-trip delay time comprises:
judging whether the current round-trip delay is larger than a preset delay threshold value or not;
if the current round-trip delay is larger than the preset delay threshold, determining the long-term reference frame as a target long-term reference frame;
and if the current round-trip delay is smaller than the preset delay threshold, determining a target long-term reference frame according to feedback information sent by a receiving end.
3. The video coding method of claim 2, wherein determining a target long-term reference frame according to feedback information sent by a receiving end if the current round-trip delay is smaller than the preset delay threshold comprises:
if the current round-trip delay is smaller than the preset delay threshold, judging whether feedback information of the historical long-term reference frame confirmed to be received by the receiving end is received;
if the feedback information is received, determining the historical long-term reference frame as a target long-term reference frame;
and if the feedback information is not received, screening the target long-term reference frame from each P frame according to the reference frame insertion time interval.
4. The video encoding method of claim 1, wherein the obtaining of the average packet loss rate of the video image in the current detection period is preceded by:
acquiring the packet loss rate at the previous moment of the current moment and the duration of the previous detection time period;
and determining the duration of the current detection time period according to the packet loss rate and the duration of the previous detection time period.
5. The video coding method of claim 4, wherein the determining the duration of the current detection period according to the packet loss ratio and the duration of the previous detection period comprises:
determining the ratio of the packet loss rate to a preset packet loss rate threshold;
if the ratio is greater than or equal to a preset reference value, determining the duration of the current detection time period according to the duration of the previous detection time period and a first preset formula: the first preset formula is as follows:
m_i = m_{i-1} - adjustmentFact^2;
if the ratio is smaller than the preset reference value, determining the duration of the current detection time period according to the duration of the previous detection time period and a second preset formula: the second preset formula is as follows:
m_i = m_{i-1} + 1;
wherein m_i is the duration of the current detection time period, m_{i-1} is the duration of the previous detection time period, and adjustmentFact is the ratio.
6. The video coding method of claim 4, wherein the determining the duration of the current detection period according to the packet loss ratio and the duration of the previous detection period comprises:
obtaining the time length to be determined according to the packet loss rate and the time length of the previous detection time period;
if the duration to be determined is greater than or equal to a preset maximum detection duration, taking the preset maximum detection duration as the duration of the current detection time period;
if the duration to be determined is smaller than a preset maximum detection duration and larger than a preset minimum detection duration, taking the duration to be determined as the duration of the current detection time period;
and if the time length to be determined is less than or equal to the preset minimum detection time length, taking the preset minimum detection time length as the time length of the current detection time period.
7. The video coding method of claim 5, wherein the adjusting the reference frame insertion time interval according to the real-time average packet loss ratio comprises:
obtaining the reference frame insertion time interval according to the real-time average packet loss rate, the preset packet loss rate threshold and a first preset formula; the first preset formula is as follows:
[Equation image omitted: first preset formula giving LTRFramePeriod from averageLossRate, n, and t.]
wherein LTRFramePeriod is the reference frame insertion time interval, averageLossRate is the real-time average packet loss rate, n is the preset packet loss rate threshold, and t is a natural number.
8. The video coding method according to any one of claims 1 to 7, wherein after the adjusting the reference frame insertion time interval according to the real-time average packet loss ratio, the method further comprises:
and if the reference frame insertion time interval is larger than a preset maximum insertion time interval, taking the preset maximum insertion time interval as the reference frame insertion time interval.
9. An image transmission apparatus characterized by comprising: memory, processor and a video coding program stored on the memory and executable on the processor, the video coding program being configured to implement the steps of the video coding method according to any of the claims 1 to 8.
10. A computer-readable storage medium, in which a video coding program is stored, which when executed by a processor implements the steps of the video coding method according to any one of claims 1 to 8.
CN202111586374.2A 2021-12-22 2021-12-22 Video encoding method, image transmission apparatus, and storage medium Pending CN114302142A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111586374.2A CN114302142A (en) 2021-12-22 2021-12-22 Video encoding method, image transmission apparatus, and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111586374.2A CN114302142A (en) 2021-12-22 2021-12-22 Video encoding method, image transmission apparatus, and storage medium

Publications (1)

Publication Number Publication Date
CN114302142A true CN114302142A (en) 2022-04-08

Family

ID=80969499

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111586374.2A Pending CN114302142A (en) 2021-12-22 2021-12-22 Video encoding method, image transmission apparatus, and storage medium

Country Status (1)

Country Link
CN (1) CN114302142A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116208863A (en) * 2022-12-30 2023-06-02 浙江中尊检测技术有限公司 Image data compression uploading method and device and electronic equipment

Similar Documents

Publication Publication Date Title
US11070794B2 (en) Video quality assessment method and apparatus
CN109600610B (en) Data encoding method, terminal and computer readable storage medium
CN110290428B (en) Congestion control method, device, terminal and storage medium
CN105430532A (en) Control method and system for adaptive adjustment of video data transmission
CN105992023B (en) The processing method and processing device of video image data
CN113747194B (en) Remote video transmission method, transmission device, storage medium and electronic equipment
CN109769023B (en) Data transmission method, related server and storage medium
CN113037697B (en) Video frame processing method and device, electronic equipment and readable storage medium
CN111263153B (en) Video encoding method, device, equipment and storage medium
CN110312150B (en) Video frame transmission method, system and server
JP4460506B2 (en) User experience quality estimation apparatus, method, and program
CN114302142A (en) Video encoding method, image transmission apparatus, and storage medium
CN110996035B (en) Information sending method and device
CN111617466A (en) Method and device for determining coding format and method for realizing cloud game
CN112087627A (en) Image coding control method, device, equipment and storage medium
CN115037416A (en) Data forward error correction processing method, device, electronic equipment and storage medium
CN112929704B (en) Data transmission method, device, electronic equipment and storage medium
US20150016285A1 (en) Systems and methods to handle codec changes in call quality calculations
CN115378832B (en) Congestion detection method and device, stream media transmission system, electronic equipment and medium
CN113194340B (en) Video transmission adjusting method and device
CN112261354B (en) Data transmission method based on multiple network cameras and related device
JP2011228823A (en) Packet loss rate estimating device, packet loss rate estimating method, packet loss rate estimating program, and communication system
CN115001632A (en) Information transmission method and device, electronic equipment and readable storage medium
CN111510703B (en) Video playing method, device and system
CN110062003B (en) Video data transmitting method, video data transmitting device, electronic equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination