CN110769269A - Local area network screen live broadcast delay optimization method - Google Patents


Info

Publication number
CN110769269A
Authority
CN
China
Prior art keywords
network
video
mode
average
stream
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201911087824.6A
Other languages
Chinese (zh)
Inventor
付鹏斌
任衡
杨惠荣
董澳静
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing University of Technology
Original Assignee
Beijing University of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing University of Technology filed Critical Beijing University of Technology
Priority to CN201911087824.6A priority Critical patent/CN110769269A/en
Publication of CN110769269A publication Critical patent/CN110769269A/en
Pending legal-status Critical Current


Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/21Server components or server architectures
    • H04N21/218Source of audio or video content, e.g. local disk arrays
    • H04N21/2187Live feed
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/102Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
    • H04N19/103Selection of coding mode or of prediction mode
    • H04N19/11Selection of coding mode or of prediction mode among a plurality of spatial predictive coding modes
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/134Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
    • H04N19/146Data rate or code amount at the encoder output
    • H04N19/147Data rate or code amount at the encoder output according to rate distortion criteria
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/23Processing of content or additional data; Elementary server operations; Server middleware
    • H04N21/231Content storage operation, e.g. caching movies for short term storage, replicating data over plural servers, prioritizing data for deletion
    • H04N21/23106Content storage operation, e.g. caching movies for short term storage, replicating data over plural servers, prioritizing data for deletion involving caching operations
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/25Management operations performed by the server for facilitating the content distribution or administrating data related to end-users or client devices, e.g. end-user or client device authentication, learning user preferences for recommending movies
    • H04N21/262Content or additional data distribution scheduling, e.g. sending additional data at off-peak times, updating software modules, calculating the carousel transmission frequency, delaying a video stream transmission, generating play-lists
    • H04N21/26208Content or additional data distribution scheduling, e.g. sending additional data at off-peak times, updating software modules, calculating the carousel transmission frequency, delaying a video stream transmission, generating play-lists the scheduling operation being performed under constraints
    • H04N21/26216Content or additional data distribution scheduling, e.g. sending additional data at off-peak times, updating software modules, calculating the carousel transmission frequency, delaying a video stream transmission, generating play-lists the scheduling operation being performed under constraints involving the channel capacity, e.g. network bandwidth
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/25Management operations performed by the server for facilitating the content distribution or administrating data related to end-users or client devices, e.g. end-user or client device authentication, learning user preferences for recommending movies
    • H04N21/266Channel or content management, e.g. generation and management of keys and entitlement messages in a conditional access system, merging a VOD unicast channel into a multicast channel
    • H04N21/2662Controlling the complexity of the video stream, e.g. by scaling the resolution or bitrate of the video stream based on the client capabilities
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/442Monitoring of processes or resources, e.g. detecting the failure of a recording device, monitoring the downstream bandwidth, the number of times a movie has been viewed, the storage space available from the internal hard disk
    • H04N21/44245Monitoring the upstream path of the transmission network, e.g. its availability, bandwidth
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/47End-user applications
    • H04N21/478Supplemental services, e.g. displaying phone caller identification, shopping application
    • H04N21/4788Supplemental services, e.g. displaying phone caller identification, shopping application communicating with other users, e.g. chatting

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Databases & Information Systems (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • General Engineering & Computer Science (AREA)
  • Two-Way Televisions, Distribution Of Moving Picture Or The Like (AREA)

Abstract

The invention relates to a local area network screen live broadcast delay optimization method for multi-screen, multi-device, real-time synchronous live broadcast in a local area network environment. First, the stream pushing end periodically monitors and evaluates the average uplink bandwidth and average round-trip delay of the network, and three network quality models are established; the stream pushing end adaptively adjusts the push bitrate according to the network condition, following the principle of 'fast reduction, slow rise', which keeps live stream playback smooth under different network environments. Second, the server caches the latest key frame; when the playing end starts playback it first obtains this key frame from server memory for loading and playing, which avoids triggering the player's timeout reconnection mechanism, speeds up first-frame display, and reduces the first-frame opening time to 2-4 seconds. Finally, the fast selection algorithm for intra-frame prediction modes is improved: coding complexity is reduced by cutting the number of candidate modes, and coding time can be reduced by up to about 22% while video quality remains essentially unchanged.

Description

Local area network screen live broadcast delay optimization method
Technical Field
The invention relates to the field of audio and video acquisition, encoding, stream pushing, transmission and decoding, in particular to a delay optimization method of a screen live broadcast system.
Background
Live interactive teaching is an effective use of modern education platforms and makes it possible to share and spread high-quality teaching resources more widely, but existing live interactive teaching suffers from severe stuttering and high live broadcast delay. Because of network conditions and the hardware limitations of mobile devices, stuttering during live broadcast is very likely: playback of teaching videos becomes jerky and PPT presentations show discontinuous pictures, which seriously disrupts normal teaching activities. Live broadcast delay is another key factor affecting the classroom live broadcast effect. For ordinary live teaching, a relatively high picture delay does not harm the experience as long as the viewer's picture stays smooth. For live teaching in a local area network environment, however, where the teacher acts as the broadcaster and the students sit in classrooms, an excessive picture delay renders the live teaching mode of little practical value and seriously degrades the effect of multi-classroom live teaching for campus open classes.
Live broadcast based on the HLS (HTTP Live Streaming) protocol targets high-concurrency scenarios and has a theoretical delay of 10 s to 30 s, so it cannot provide low-delay screen live broadcast in a local area network; UDP (User Datagram Protocol) multicast is only suitable for screen sharing between wired computers, not for sharing a computer screen with many mobile devices, and because UDP is an unreliable protocol it cannot guarantee picture quality.
Some schools introduce open, free Internet live broadcast platforms for live interactive teaching, but undesirable information can reach the classroom through the open Internet in various ways and disturb the normal teaching order, and such platforms offer insufficient support for synchronous interactive live teaching to multiple classrooms and multiple devices inside a school local area network. Live broadcast inside a school local area network, in turn, commonly suffers from high delay and poor fluency, which noticeably reduces the teaching effect. Low-delay live broadcast is therefore a challenging technical problem for local area network video live broadcast.
Disclosure of Invention
In view of the above problems, the invention analyzes each stage of the live broadcast pipeline and optimizes live broadcast delay from three angles: network stream pushing, first-frame opening time and video coding. First, the live push bitrate is adapted to the network condition and the push strategy is optimized; second, a key frame caching algorithm is designed and implemented to reduce first-frame loading delay; third, the intra-frame prediction part of the x264 encoder is improved to reduce coding delay. The result is a low-delay screen live broadcast system with an improved real-time experience.
The method of the invention is realized by the following main steps. First, the stream pushing end periodically monitors and evaluates the average uplink bandwidth and average round-trip delay of the network, and three network quality models are established; the stream pushing end adaptively adjusts the push bitrate according to the network condition, following the principle of 'fast reduction, slow rise', which keeps live stream playback smooth under different network environments. Second, the server caches the latest key frame; when the playing end starts playback it first obtains this key frame from server memory for loading and playing, which avoids triggering the player's timeout reconnection mechanism, speeds up first-frame display, and reduces the first-frame opening time to 2-4 seconds. Finally, the fast selection algorithm for intra-frame prediction modes is improved: coding complexity is reduced by cutting the number of candidate modes, and coding time can be reduced by up to about 22% while video quality remains essentially unchanged.
The delay optimization method of the screen live broadcast system comprises the following steps:
Step one, network quality evaluation, specifically: periodically monitoring the uplink push-stream network, obtaining the average uplink bandwidth and average round-trip delay of the network, using these two indicators as the basis for judging the network condition, and defining three network quality models to describe the network condition, namely network excellent, network good and network poor;
Step two, network adaptive stream pushing, specifically: on the basis of the network condition in step one, detecting and evaluating the network condition at fixed intervals; when the network is excellent, slowly raising the capture frame rate and resolution; when the network is good, keeping the current state with the frame rate and resolution unchanged; when the network is poor, sharply reducing the capture frame rate and video resolution; the adaptation follows the principle of 'fast reduction, slow rise';
Step three, key frame caching, specifically: periodically pulling the stream from the streaming media server to obtain the video stream, pushing it to local storage and generating a video file in FLV format; first checking the data type of the video stream and judging whether it is video data; if so, checking whether it is a key frame, otherwise continuing to pull the stream; if the data is marked as a key frame, computing the length of the video data, reading that many bytes as the key frame data, and updating the key frame cached on the server; after caching, the playing end first requests the key frame data cached on the server for decoding, rendering and playback when it starts live playback;
Step four, reducing the number of intra-frame coding candidate modes, specifically: for a 4x4 luminance block, computing texture coefficients for 4 image texture directions, deriving from them a texture direction value for each prediction mode, sorting these values, and adding the 2 modes with the smallest texture direction values to the candidate set; also adding the three commonly used prediction modes, vertical, horizontal and DC, giving 3 to 5 candidate modes in total.
Compared with the prior art, the method has the following advantages:
the stream pushing code rate is adaptively adjusted according to the network condition, and the fluency of live stream playing in different network environments is ensured; the method comprises the steps that key frames are cached, so that a playing end firstly requests the key frame data cached in a server to be decoded, rendered and played when performing live broadcast, an overtime reconnection mechanism of a player is avoided being triggered, the opening time of a first frame is optimized, and the opening time of the first frame is reduced to be within 4 seconds; by reducing the number of intra-prediction candidate modes, the coding complexity is reduced, and the coding time can be reduced by about 22% at most on the premise that the video quality is basically unchanged.
Drawings
FIG. 1 is a block diagram of the system framework of the present invention;
FIG. 2 is a flow chart of network upstream average bandwidth detection;
FIG. 3 is a schematic diagram of a network adaptive push flow strategy;
FIG. 4 is a flow chart of key frame caching;
FIG. 5 is a diagram of the time-consuming optimization front-back comparison (video resolution 1920 × 1080) of the opening of the first screen;
FIG. 6 is a graph of the comparison (video resolution 1280 × 720) before and after optimization of the time spent opening the first screen;
FIG. 7 is a graph of the comparison (video resolution 800X 600) before and after optimization of the time spent opening the first screen;
FIG. 8 is a flow chart of an improved 4x4 intra prediction mode selection algorithm;
fig. 9 is a graph of delay test results for a screen live system at four resolutions (800 × 600, 1024 × 768, 1280 × 720, 1920 × 1080);
FIG. 10 is a comparison of subjective effects of three test sequences (ppt, news, highway) before and after optimization;
Detailed Description
The invention is further described with reference to the following figures and detailed description.
The process of the method comprises the following steps:
(1) network quality evaluation
The network condition is adjusted with the network simulation tool Clumsy; the uplink push-stream network is monitored every 10 min, the average uplink bandwidth and average round-trip delay of the network are calculated, and three network quality models are established with these two indicators as the basis for judging the network condition. Network quality evaluation proceeds according to the following steps:
a. Acquire the network interface list: call the JpcapCaptor.getDeviceList() method of the Jpcap library to obtain the list of local network interfaces;
b. Open a network interface: listen on the interface with the JpcapCaptor.openDevice() method, set the maximum captured packet size to 65535 bytes, enable promiscuous mode to receive packets of all types, and set the packet capture timeout to 50 ms;
c. Capture packets: define a class MyPacketReceiver that implements the PacketReceiver interface and its receivePacket method; call the JpcapCaptor.loopPacket() method to capture packets, and classify and count each packet in the receivePacket callback after it is captured;
d. Calculate the average uplink bandwidth in real time: add up the sizes of packets captured within 60 seconds whose source IP address is the video capture end, and compute the bytes captured per second to obtain the average uplink bandwidth;
e. A timed task repeats steps c and d every 10 min;
f. Obtain the average round-trip delay: run the cmd command 'ping ipAddress -n pingTimes -l length' from the Java programming language to obtain the average round-trip delay, where ipAddress is the IP of the Nginx-rtmp server host, pingTimes is the number of requests sent to the server, and length is the size of each packet sent;
g. Establish the network quality models: the three network quality models shown in Table 1 are built from the two indicators, average uplink bandwidth and average round-trip delay. For 'network excellent' and 'network good' the two indicators are combined with 'and'; for 'network poor' they are combined with 'or'. When the average uplink bandwidth reaches 0.8 MB/s and the average round-trip delay is below 5 ms, the network is excellent; when the average uplink bandwidth is between 0.4 MB/s and 0.8 MB/s and the average round-trip delay is between 5 ms and 10 ms, the network is good; when the average uplink bandwidth is below 0.4 MB/s or the average round-trip delay exceeds 20 ms, the network is poor, and in this case the round-trip delay has the higher priority. A code sketch of this classification is given after Table 1.
TABLE 1 network quality model
Network quality | Average uplink bandwidth | Average round-trip delay | Relation of indicators
Network excellent | >= 0.8 MB/s | < 5 ms | and
Network good | 0.4-0.8 MB/s | 5-10 ms | and
Network poor | < 0.4 MB/s | > 20 ms | or (delay has priority)
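The classification described in steps a to g can be expressed compactly. The following Java sketch is a minimal illustration under the thresholds of Table 1; the class, enum and method names are illustrative and not part of the patent.

```java
/**
 * Minimal sketch of the network quality model of Table 1.
 * "Excellent" and "good" combine the two indicators with AND, "poor" with OR,
 * and the round-trip delay takes priority in the "poor" case.
 * All names are illustrative; they do not appear in the patent.
 */
public final class NetworkQualityModel {

    public enum Quality { EXCELLENT, GOOD, POOR }

    /**
     * @param avgUplinkBandwidthMBps average uplink bandwidth in MB/s
     * @param avgRttMs               average round-trip delay in ms
     */
    public static Quality evaluate(double avgUplinkBandwidthMBps, double avgRttMs) {
        // "Poor": either indicator is bad (OR relation); the delay has priority.
        if (avgRttMs > 20.0 || avgUplinkBandwidthMBps < 0.4) {
            return Quality.POOR;
        }
        // "Excellent": both indicators are good (AND relation).
        if (avgUplinkBandwidthMBps >= 0.8 && avgRttMs < 5.0) {
            return Quality.EXCELLENT;
        }
        // Everything else (e.g. 0.4-0.8 MB/s and 5-10 ms) is "good":
        // the push scheme is left unchanged in this state.
        return Quality.GOOD;
    }
}
```

The evaluation is invoked by the timed task of step e every 10 min, and its result drives the scheme adjustment described in section (2).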
(2) Network adaptive plug flow
First, 6 push-stream schemes are defined, each consisting mainly of a frame rate and a resolution, numbered from 1 in order of increasing push quality (scheme 1 is the lowest quality, scheme 6 the highest); Table 2 lists the frame rate and resolution of each scheme. 'Fast reduction, slow rise' means the following. When the network condition is poor, if the current scheme number is greater than 2 it is decreased by 2, sharply lowering the frame rate and resolution and reducing the pressure on the network bandwidth; otherwise the stream is pushed according to scheme 1. For example, if the stream pushing end is pushing at 25 FPS @ 1280 × 720 and the network is detected to be poor, the frame rate and resolution are reduced sharply and the stream is pushed at 15 FPS @ 800 × 600. When the network condition is excellent, if the current scheme number is below the highest scheme number (6 in this embodiment) it is increased by 1; if it is already the highest, it stays unchanged. For example, if the stream pushing end is pushing at 20 FPS @ 1024 × 768 and the network is detected to be excellent, the frame rate and resolution are raised slowly and the stream is pushed at 25 FPS @ 1280 × 720. When the network is good, the current state is kept and the frame rate and resolution are not changed.
TABLE 2 frame rates and resolutions for different push flow schemes
After the push mode is improved to network-adaptive stream pushing, the bitrate is adjusted according to the available network bandwidth when the network fluctuates strongly, so that video playback stays smooth overall. A higher video bitrate gives better video quality but consumes more network bandwidth; therefore, when bandwidth is insufficient on a weak network, leaving parameters such as resolution and bitrate unadjusted overloads the uplink push network and in turn causes live broadcast delay. A minimal sketch of this adjustment follows.
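The Java sketch below illustrates the 'fast reduction, slow rise' rule, reusing the Quality enum from the previous sketch. It assumes only that schemes are indexed 1 (lowest frame rate and resolution) to 6 (highest), as in Table 2; the class and method names are illustrative.

```java
/**
 * Minimal sketch of the "fast reduction, slow rise" scheme adjustment.
 * Schemes are numbered 1 (lowest frame rate/resolution) to 6 (highest);
 * the concrete frame rates and resolutions are those listed in Table 2
 * and are not repeated here. All names are illustrative.
 */
public final class AdaptivePushScheme {

    public static final int MIN_SCHEME = 1;
    public static final int MAX_SCHEME = 6;

    /** Returns the scheme to use next, given the current scheme and network quality. */
    public static int nextScheme(int current, NetworkQualityModel.Quality quality) {
        switch (quality) {
            case POOR:
                // Fast reduction: drop two schemes at once when possible,
                // otherwise fall back to the lowest scheme.
                return (current > 2) ? current - 2 : MIN_SCHEME;
            case EXCELLENT:
                // Slow rise: move up one scheme, capped at the highest.
                return Math.min(current + 1, MAX_SCHEME);
            case GOOD:
            default:
                // Network good: keep the current frame rate and resolution.
                return current;
        }
    }
}
```

For example, a poor network at scheme 5 yields scheme 3, while an excellent network at scheme 6 leaves the scheme unchanged.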
(3) Key frame buffer and first screen open time consuming test
A live broadcast is a real-time video stream that a user joins at a random point in time. If the first frame received is not a key frame, the playback buffer is empty and the decoder cannot decode it; the frame can only be dropped. The player then has to wait for the next key frame, and if the wait exceeds 5 seconds the player SDK's timeout reconnection mechanism is triggered and the player keeps refreshing and reconnecting until loading succeeds within 5 seconds, which severely harms the live viewing experience. Key frames are therefore cached according to the following steps:
a. The stream pushing end starts two push streams, executed as two concurrent threads: one video stream is pushed to the streaming media server, and the other periodically pushes short video segments to local storage; both use the FLV format;
b. Read the latest local video segment and parse it according to the FLV format;
c. Traverse the FLV Tags: read each FLV Tag header and judge the type from its first byte (0x12 is a script Tag, 0x08 an audio Tag, 0x09 a video Tag); if it is not a video Tag, move on to the next FLV Tag, otherwise go to step d;
d. Read the upper 4 bits of the first byte of the Tag data; if they are 0001 the frame is a key frame: read bytes 2 to 4 of the Tag header, compute the video data length, read key frame data of that length starting from the second byte of the Tag data, and go to step e; otherwise return to step c and traverse the next FLV Tag;
e. Check whether the cache queue already holds a key frame; if not, cache the current key frame; if it does, remove the old key frame and cache the new one;
f. Repeat steps b to e;
g. When live playback starts, the playing end first requests the key frame from server memory for playback and then requests the normal video stream.
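The following Java sketch illustrates steps c to e for a single FLV Tag, assuming the byte layout described above: the tag type in the first header byte (0x09 = video), the 24-bit data size in header bytes 2 to 4, and the frame type in the upper 4 bits of the first data byte (1 = key frame). The class and method names are illustrative.

```java
import java.util.concurrent.atomic.AtomicReference;

/**
 * Minimal sketch of the key-frame detection and caching of step three.
 * Byte offsets follow the FLV layout described in the text; all names
 * are illustrative and not part of the patent.
 */
public final class KeyFrameCache {

    // Single-slot cache: a new key frame always replaces the old one (step e).
    private final AtomicReference<byte[]> latestKeyFrame = new AtomicReference<>();

    /**
     * Inspects one FLV tag (11-byte header followed by the tag data) and
     * caches the payload if it carries a key frame.
     *
     * @return true if a key frame was found and the cache was updated
     */
    public boolean offerTag(byte[] tagHeader, byte[] tagData) {
        // Step c: only video tags (type byte 0x09) are of interest.
        if ((tagHeader[0] & 0xFF) != 0x09) {
            return false;
        }
        // Step d: the frame type is the upper 4 bits of the first data byte; 1 = key frame.
        int frameType = (tagData[0] & 0xF0) >> 4;
        if (frameType != 1) {
            return false;
        }
        // The video data length is the 24-bit big-endian value in header bytes 2-4.
        int dataSize = ((tagHeader[1] & 0xFF) << 16)
                     | ((tagHeader[2] & 0xFF) << 8)
                     |  (tagHeader[3] & 0xFF);
        // Key-frame bytes are read from the second byte of the tag data,
        // bounded by the data actually available.
        int start = 1;
        int length = Math.min(dataSize, tagData.length - start);
        byte[] keyFrame = new byte[length];
        System.arraycopy(tagData, start, keyFrame, 0, length);
        latestKeyFrame.set(keyFrame);
        return true;
    }

    /** Step g: the playing end requests the cached key frame first. */
    public byte[] latest() {
        return latestKeyFrame.get();
    }
}
```

Step g then amounts to the playing end calling latest() on this cache before it subscribes to the normal live stream.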
Because campus network traffic differs between daytime and night, the first-screen opening time was tested under three video resolutions (1920 × 1080, 1280 × 720 and 800 × 600) in two periods, daytime (10:00-17:00) and night (19:30-23:00), to evaluate the algorithm under different network conditions. For each resolution, 200 click-to-play operations were performed and the time from each click to the loading of the first screen image was recorded. The results for 1920 × 1080 are shown in Fig. 5, for 1280 × 720 in Fig. 6, and for 800 × 600 in Fig. 7.
Analysis of the results: at 1920 × 1080 (Fig. 5) the first-screen opening time before optimization is about 15-25 seconds both day and night, while after optimization the average is 3133 ms in the daytime and 3010 ms at night; at 1280 × 720 (Fig. 6) the pre-optimization time is about 10-25 seconds, and the post-optimization average is 3241 ms in the daytime and 3150 ms at night; at 800 × 600 (Fig. 7) the pre-optimization time is 10 s or more both day and night, and the post-optimization average is 2864 ms in the daytime and 2915 ms at night. The first-screen opening time is clearly reduced and the live picture is reached faster. The tests demonstrate the need for the key frame caching algorithm: it guarantees that the first frame received is a key frame and avoids the long first-screen wait caused by an arbitrary first-frame type.
With key frame caching added, the playing end first requests the key frame data cached on the server for decoding, rendering and playback when live playback starts, while the server keeps the cached key frame up to date; the first-frame opening time is thus reduced to within 4 seconds.
(4) Reducing the number of intra-coding candidate modes and coding time testing
For a 4x4 luminance block, texture coefficients are computed for 4 image texture directions, a texture direction value is then derived for each prediction mode, the values are sorted, and the 2 modes with the smallest texture direction values are added to the candidate set; the three commonly used prediction modes, vertical, horizontal and DC, are also added, giving 3 to 5 candidate modes in total. The intra-frame coding candidate modes are determined according to the following steps:
a. Define 4 video texture directions, 0°, 45°, 90° and 135°, and represent the texture coefficients in the four directions by the variables Angle0, Angle45, Angle90 and Angle135;
b. Divide the 4x4 luminance block into 4 sub-blocks of 2x2, denoted S1, S2, S3 and S4 in row-major order, where the value of each sub-block is the average of the 4 pixels it contains;
c. Estimate the image texture of the 4x4 luminance block from the sub-block values and compute the texture coefficients Angle0, Angle45, Angle90 and Angle135 of the four directions as follows:
(The formulas for Angle0, Angle45, Angle90 and Angle135 appear only as images in the original publication and are not reproduced here.)
d. For the 8 directional prediction modes (all modes except DC), compute the corresponding texture direction values according to the relation shown in Table 3;
TABLE 3 texture Direction values for eight prediction modes
(Table 3 appears only as an image in the original publication and is not reproduced here.)
e. Sort the texture direction values with quicksort and select the 2 modes with the smallest values as candidate modes; in addition, because the vertical, horizontal and DC modes are used most frequently in actual encoding, these three modes are also taken as candidates; since the two modes with the smallest texture direction values may already include the vertical or horizontal mode, the final candidate set contains at least 3 and at most 5 modes;
f. Perform RDO calculation on the 3-5 candidate modes and select the mode with the smallest RDO cost as the optimal prediction mode.
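The candidate-set construction of steps e and f can be sketched in Java as follows. The texture direction values are taken as an input because the exact coefficient formulas and the mode-to-direction mapping of Table 3 appear only as images in the original publication; the mode numbering (0 = vertical, 1 = horizontal, 2 = DC) follows the usual H.264 4x4 intra prediction convention, and all names are illustrative.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/**
 * Minimal sketch of the candidate-mode construction of step four.
 * All names are illustrative and not part of the patent.
 */
public final class IntraCandidateModes {

    public static final int MODE_VERTICAL = 0;
    public static final int MODE_HORIZONTAL = 1;
    public static final int MODE_DC = 2;

    /**
     * @param directionValues texture direction value of each mode, indexed by
     *                        mode number 0..8 (index 2, the DC mode, is ignored)
     * @return 3 to 5 candidate modes for the subsequent RDO pass
     */
    public static List<Integer> select(double[] directionValues) {
        // Sort the 8 directional modes (all except DC) by texture direction value.
        Integer[] modes = {0, 1, 3, 4, 5, 6, 7, 8};
        Arrays.sort(modes, (a, b) -> Double.compare(directionValues[a], directionValues[b]));

        // The 2 modes with the smallest values, plus the three common modes
        // (vertical, horizontal, DC); duplicates collapse, giving 3-5 candidates.
        Set<Integer> candidates = new LinkedHashSet<>();
        candidates.add(modes[0]);
        candidates.add(modes[1]);
        candidates.add(MODE_VERTICAL);
        candidates.add(MODE_HORIZONTAL);
        candidates.add(MODE_DC);
        return new ArrayList<>(candidates);
    }
}
```

The subsequent RDO pass evaluates only these 3 to 5 candidates and keeps the mode with the smallest rate-distortion cost, which is where the reduction in coding time comes from.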
A 500-frame PPT teaching video in YUV 4:2:0 format was recorded from the screen live broadcast system; it changes slowly and has rich local texture detail, which is representative of most application scenarios of a screen live broadcast system. This sequence, together with the official news and highway test sequences, was encoded with the x264 algorithm before and after the improvement, and the results are compared in Table 4.
TABLE 4 comparison of the improved x264 algorithm with the unmodified x264 algorithm
According to the experimental results, the improved x264 algorithm reduces the coding time of the PPT video sequence from the screen live broadcast system by 8.26% and raises its coding frame rate by 8.79%; for the official test sequences, the coding time of the news sequence is reduced by 3.74% and its coding frame rate raised by 4.16%, while the coding time of the highway sequence is reduced by 21.94% and its coding frame rate raised by 27.9%. For all 3 sequences there is essentially no loss in video quality. Applying the improved x264 coding algorithm in the screen live broadcast system therefore effectively reduces coding time and improves coding efficiency.
(5) Designing and realizing screen live broadcast system, and performing system delay test
The screen live broadcast system was run with 13 tablet computers (2 GB memory) as players, and the live delay was measured at four resolutions: 800 × 600, 1024 × 768, 1280 × 720 and 1920 × 1080. The network router is a Huawei enterprise router AR111-S and the access point a Huawei wireless AP3010DN-V2. The stream pushing end and the server are notebook computers with an Intel(R) Core(TM) i5-3230M CPU (dual-core, 2.6 GHz) and 8 GB of memory. The live delay is recorded every 100 video frames; 50 test runs covering 5000 frames in total were performed, the delay data are written in real time to a text file on the SD card, and the final results are shown in Fig. 9.
The results in Fig. 9 show that at push resolutions of 800 × 600, 1024 × 768 and 1280 × 720 the system delay is stable between 1 s and 3 s. Compared with the 5 s to 10 s delay before optimization, the optimized screen live broadcast system has a clearly lower delay at 1280 × 720 and below, which meets the classroom screen-sharing requirements of actual teaching.
The three video test sequences ppt, news and highway were encoded with the algorithm before and after the improvement; in Fig. 10, the first row shows images encoded with the original algorithm and the second row images encoded with the improved algorithm. The comparison shows that the loss of image quality is small and does not affect what students see in actual classroom screen-sharing teaching. The optimization scheme of the invention therefore reduces coding time with almost no effect on image quality and meets the requirements of classroom screen-sharing teaching.

Claims (5)

1. A local area network screen live broadcast delay optimization method is characterized by comprising the following steps:
Step one, network quality evaluation, specifically: periodically monitoring the uplink push-stream network, obtaining the average uplink bandwidth and average round-trip delay of the network, using these two indicators as the basis for judging the network condition, and defining three network quality models to describe the network condition, namely network excellent, network good and network poor;
Step two, network adaptive stream pushing, specifically: on the basis of the network condition in step one, detecting and evaluating the network condition at fixed intervals; when the network is excellent, slowly raising the capture frame rate and resolution; when the network is good, keeping the current state with the frame rate and resolution unchanged; when the network is poor, sharply reducing the capture frame rate and video resolution; the adaptation follows the principle of 'fast reduction, slow rise';
Step three, key frame caching, specifically: periodically pulling the stream from the streaming media server to obtain the video stream, pushing it to local storage and generating a video file in FLV format; first checking the data type of the video stream and judging whether it is video data; if so, checking whether it is a key frame, otherwise continuing to pull the stream; if the data is marked as a key frame, computing the length of the video data, reading that many bytes as the key frame data, and updating the key frame cached on the server; after caching, the playing end first requests the key frame data cached on the server for decoding, rendering and playback when it starts live playback;
Step four, reducing the number of intra-frame coding candidate modes, specifically: for a 4x4 luminance block, computing texture coefficients for 4 image texture directions, deriving from them a texture direction value for each prediction mode, sorting these values, and adding the 2 modes with the smallest texture direction values to the candidate set; also adding the three commonly used prediction modes, vertical, horizontal and DC, giving 3 to 5 candidate modes in total.
2. The method according to claim 1, wherein the network quality evaluation method in the first step is specifically as follows:
a. acquiring the network interface list: calling the JpcapCaptor.getDeviceList() method of the Jpcap library to obtain the list of local network interfaces;
b. opening a network interface: listening on the interface with the JpcapCaptor.openDevice() method, setting the maximum captured packet size to 65535 bytes, enabling promiscuous mode to receive packets of all types, and setting the packet capture timeout to 50 ms;
c. capturing packets: defining a class MyPacketReceiver that implements the PacketReceiver interface and its receivePacket method, calling the JpcapCaptor.loopPacket() method to capture packets, and processing each packet in the receivePacket callback after it is captured;
d. calculating the average uplink bandwidth in real time: adding up the sizes of packets captured within 60 seconds whose source IP address is the video capture end, and from this calculating the average uplink bandwidth;
e. a timed task repeats steps c and d every 10 min;
f. obtaining the average round-trip delay: running the cmd command 'ping ipAddress -n pingTimes -l length' from the Java programming language to obtain the average round-trip delay, where ipAddress is the IP of the Nginx-rtmp server host, pingTimes is the number of requests sent to the server, and length is the size of each packet sent;
g. establishing the network quality models: the network condition is adjusted with the network simulation tool Clumsy, and three network quality models are built from the two indicators of average uplink bandwidth and average round-trip delay; when the average uplink bandwidth reaches 0.8 MB/s and the average round-trip delay is below 5 ms, the network is excellent; when the average uplink bandwidth is between 0.4 MB/s and 0.8 MB/s and the average round-trip delay is between 5 ms and 10 ms, the network is good; when the average uplink bandwidth is below 0.4 MB/s or the average round-trip delay exceeds 20 ms, the network is poor, and in this case the round-trip delay has the higher priority.
3. The method of claim 1, wherein the network adaptive streaming method in the second step is as follows:
according to the principle that the network condition follows 'fast reduction and slow rise', the size of the stream-pushing code rate is adaptively adjusted, and the method specifically comprises the following steps: firstly, different plug flow schemes are defined, mainly comprising frame rate and resolution, and numbering is carried out from 1 according to the sequence of good plug flow quality to poor plug flow quality; when the network condition is poor, subtracting 2 from the serial number of the current plug flow scheme, greatly reducing the frame rate and the resolution ratio, reducing the pressure on the network bandwidth, if the serial number of the current plug flow scheme is 2, subtracting 1 from the serial number of the plug flow scheme, and if the serial number of the current plug flow scheme is 1, keeping unchanged; when the network condition is excellent, if the serial number of the current plug flow scheme is smaller than the serial number of the highest plug flow scheme, adding 1 to the serial number of the plug flow scheme, slowly increasing the frame rate and the resolution, and if the serial number of the current plug flow scheme is the highest serial number, keeping the serial number unchanged; when the network is good, the original state is kept, and the frame rate and the resolution ratio are not changed.
4. The method of claim 1, wherein the key frame buffering method in step three is as follows:
a. the stream pushing end starts two paths of stream pushing, one path of video stream is pushed to the server, and the other path of video stream regularly pushes flv small-segment videos to the local;
b. reading and analyzing local latest flv video segments;
c. traversing each FLV Tag, acquiring a first byte of the FLV Tag header, judging whether the value is 0x09, namely the video type, if so, entering the step d, otherwise, traversing the next FLV Tag;
d. acquiring the upper 4 bits of the first byte of the Tag data; if they are 0001 the frame is a key frame, in which case acquiring bytes 2 to 4 of the Tag header, calculating the length of the video data, acquiring key frame data of that length starting from the second byte of the Tag data, and then entering step e; if not, returning to step c to traverse the next FLV Tag;
e. checking whether the cache queue contains key frames, if not, caching, if so, removing the old key frames, and caching the new key frames;
f. repeating steps b-e;
g. when live playback starts, the playing end first requests the key frame from server memory for playback and then requests the normal video stream.
5. The method as claimed in claim 1, wherein the method for reducing the number of candidate intra-coding modes in step four comprises the following steps:
a. defining 4 video texture directions of 0 degrees, 45 degrees, 90 degrees and 135 degrees, and representing texture coefficients in four directions by using variables of Angle0, Angle45, Angle90 and Angle 135;
b. dividing the 4x4 luminance block into 4 sub-blocks of 2x2, denoted S1, S2, S3 and S4 in row-major order, where the value of each sub-block is the average of the 4 pixels it contains;
c. estimating the image texture of the 4x4 luminance block according to the sub-block pixel values, and calculating the texture coefficients Angle0, Angle45, Angle90 and Angle135 of four directions as follows:
(The texture coefficient formulas appear only as an image in the original publication and are not reproduced here.)
d. for the 8 directional prediction modes (all modes except DC), calculating the corresponding texture direction values;
e. quickly sorting the texture direction values and selecting the 2 modes with the smallest values as candidate modes; in addition, because the vertical, horizontal and DC modes are used most frequently in actual encoding, these three modes are also taken as candidates; since the two modes with the smallest texture direction values may already include the vertical or horizontal mode, the final candidate set contains at least 3 and at most 5 modes;
f. performing RDO calculation on the 3-5 candidate modes and selecting the mode with the smallest RDO cost as the optimal prediction mode.
CN201911087824.6A 2019-11-08 2019-11-08 Local area network screen live broadcast delay optimization method Pending CN110769269A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911087824.6A CN110769269A (en) 2019-11-08 2019-11-08 Local area network screen live broadcast delay optimization method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911087824.6A CN110769269A (en) 2019-11-08 2019-11-08 Local area network screen live broadcast delay optimization method

Publications (1)

Publication Number Publication Date
CN110769269A true CN110769269A (en) 2020-02-07

Family

ID=69336875

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911087824.6A Pending CN110769269A (en) 2019-11-08 2019-11-08 Local area network screen live broadcast delay optimization method

Country Status (1)

Country Link
CN (1) CN110769269A (en)

Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111417006A (en) * 2020-04-28 2020-07-14 广州酷狗计算机科技有限公司 Video screen projection method, device, terminal and storage medium
CN111654690A (en) * 2020-05-06 2020-09-11 北京百度网讯科技有限公司 Live video delay time determination method and device and electronic equipment
CN113497932A (en) * 2020-04-07 2021-10-12 上海交通大学 Method, system and medium for measuring video transmission time delay
CN113824987A (en) * 2021-09-30 2021-12-21 杭州网易云音乐科技有限公司 Method, medium, device and computing equipment for determining time consumption of first frame of live broadcast room
CN113992987A (en) * 2021-12-27 2022-01-28 北京蔚领时代科技有限公司 Intelligent code rate adjusting system and method suitable for cloud game scene
CN114222157A (en) * 2021-12-22 2022-03-22 厦门视诚科技有限公司 Multi-input signal portable broadcasting and stream pushing pre-monitoring system
CN115657985A (en) * 2022-10-13 2023-01-31 南京联创智慧城市科技有限公司 Method for realizing screen-division flow pushing optimization based on ffmpeg framework
CN117201862A (en) * 2023-11-02 2023-12-08 深圳康荣电子有限公司 Real-time interaction method based on multi-screen collaboration and related device
CN117221617A (en) * 2023-09-28 2023-12-12 杭州星犀科技有限公司 Live broadcast push flow system, method and computer storage medium

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101917590A (en) * 2009-12-17 2010-12-15 新奥特(北京)视频技术有限公司 Network live broadcasting system with playback function and player
US20160198226A1 (en) * 2005-04-18 2016-07-07 Mark Sinclair Krebs Multimedia System For Mobile Client Platforms

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20160198226A1 (en) * 2005-04-18 2016-07-07 Mark Sinclair Krebs Multimedia System For Mobile Client Platforms
CN101917590A (en) * 2009-12-17 2010-12-15 新奥特(北京)视频技术有限公司 Network live broadcasting system with playback function and player

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
付鹏斌 et al.: "屏幕直播系统延迟优化" (Delay optimization of a screen live broadcast system), 《计算机系统应用》 (Computer Systems & Applications) *

Cited By (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113497932A (en) * 2020-04-07 2021-10-12 上海交通大学 Method, system and medium for measuring video transmission time delay
CN113497932B (en) * 2020-04-07 2022-10-18 上海交通大学 Method, system and medium for measuring video transmission time delay
CN111417006A (en) * 2020-04-28 2020-07-14 广州酷狗计算机科技有限公司 Video screen projection method, device, terminal and storage medium
CN111654690A (en) * 2020-05-06 2020-09-11 北京百度网讯科技有限公司 Live video delay time determination method and device and electronic equipment
CN113824987B (en) * 2021-09-30 2024-04-30 杭州网易云音乐科技有限公司 Method, medium, device and computing equipment for determining time consumption of first frame of live broadcasting room
CN113824987A (en) * 2021-09-30 2021-12-21 杭州网易云音乐科技有限公司 Method, medium, device and computing equipment for determining time consumption of first frame of live broadcast room
CN114222157B (en) * 2021-12-22 2024-05-10 厦门视诚科技有限公司 Multi-input signal portable pilot broadcast plug flow pre-monitoring system
CN114222157A (en) * 2021-12-22 2022-03-22 厦门视诚科技有限公司 Multi-input signal portable broadcasting and stream pushing pre-monitoring system
CN113992987B (en) * 2021-12-27 2022-07-22 北京蔚领时代科技有限公司 Intelligent code rate adjusting system and method suitable for cloud game scene
CN113992987A (en) * 2021-12-27 2022-01-28 北京蔚领时代科技有限公司 Intelligent code rate adjusting system and method suitable for cloud game scene
CN115657985A (en) * 2022-10-13 2023-01-31 南京联创智慧城市科技有限公司 Method for realizing screen-division flow pushing optimization based on ffmpeg framework
CN117221617A (en) * 2023-09-28 2023-12-12 杭州星犀科技有限公司 Live broadcast push flow system, method and computer storage medium
CN117201862A (en) * 2023-11-02 2023-12-08 深圳康荣电子有限公司 Real-time interaction method based on multi-screen collaboration and related device
CN117201862B (en) * 2023-11-02 2024-02-27 深圳康荣电子有限公司 Real-time interaction method based on multi-screen collaboration and related device

Similar Documents

Publication Publication Date Title
CN110769269A (en) Local area network screen live broadcast delay optimization method
CN111135569B (en) Cloud game processing method and device, storage medium and electronic equipment
CN102598688B (en) Streaming encoded video data
WO2016131223A1 (en) Frame loss method for video frame and video sending apparatus
CN110636346B (en) Code rate self-adaptive switching method and device, electronic equipment and storage medium
US12052427B2 (en) Video data processing method and apparatus, and storage medium
CN101917613B (en) Acquiring and coding service system of streaming media
WO2020029935A1 (en) Video live-broadcast processing method, apparatus and terminal
CN110662114B (en) Video processing method and device, electronic equipment and storage medium
CN102291599A (en) Network video playing method and network video playing device
CN109168083B (en) Streaming media real-time playing method and device
CN108259964B (en) Video playing rate adjusting method and system
US11356739B2 (en) Video playback method, terminal apparatus, and storage medium
Zhu et al. HTML5 based media player for real-time video surveillance
CN113784118A (en) Video quality evaluation method and device, electronic equipment and storage medium
CN103686077A (en) Double buffering method applied to realtime audio-video data transmission of 3G wireless network
CN113938656A (en) Method for playing network monitoring camera video without plug-in and low delay by browser
CN113596112A (en) Transmission method for video monitoring
US20140099039A1 (en) Image processing device, image processing method, and image processing system
CN116962179A (en) Network transmission optimization method and device, computer readable medium and electronic equipment
Chen et al. Study on relationship between network video packet loss and video quality
CN112738571B (en) Method and device for determining streaming media parameters
JP2018514133A (en) Data processing method and apparatus
CN111818338B (en) Abnormal display detection method, device, equipment and medium
CN114513675A (en) Construction method of panoramic video live broadcast system

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication (application publication date: 20200207)