CN113905200B - Video processing method and device based on statistics

Info

Publication number: CN113905200B
Application number: CN202111170532.6A
Authority: CN
Inventors: 王卫东; 王军; 巩家雨; 孟萌
Original assignee: Yarward Electronic Co ltd
Current assignee: Yarward Electronic Co ltd
Priority date: 2021-10-08
Filing date: 2021-10-08
Publication date: 2023-07-11
Anticipated expiration: 2041-10-08
Also published as: CN113905200A

Abstract

The embodiment of the application discloses a video processing method and device based on statistics, which are used for solving the problem that existing methods for reducing the video frame rate use unreasonable strategies for retaining and discarding video frames. The method comprises the following steps: detecting the current input frame rate of a received video input sequence in real time, and detecting the current available bandwidth of the server in real time; determining a target output frame rate based on the target coding rate, the current input frame rate, and the current available bandwidth; determining a current frame extraction factor corresponding to the video input sequence based on the target output frame rate, the current input frame rate and a pre-stored frame extraction factor table; generating a current frame extraction mapping table based on the current frame extraction factor; determining the value of the current video frame in the current frame extraction mapping table according to the frame sequence number of the current video frame in the video input sequence; and determining, based on the value of the current video frame in the current frame extraction mapping table, whether the current video frame needs to be retained.

Description

Video processing method and device based on statistics

Technical Field

The present disclosure relates to the field of video processing technologies, and in particular, to a method and apparatus for video processing based on statistics.

Background

In a video call, the video may be sent from one of two places. One is the terminal, where the video stream comes from a camera. The other is the cloud, which acts as an extension of the camera on the terminal: a virtualized video source that adapts to the downlink network conditions of different subscribers. At either the terminal or the cloud, the network bandwidth fluctuates constantly, sometimes high and sometimes low. If the bandwidth is low and the video is not processed at all, the video receiver experiences stuttering and unsmooth playback. To avoid this as far as possible, the sending end needs to perform some adaptive processing on the video, for example reducing its resolution or its frame rate, and then transmit the video at a lower coding rate, so as to adapt to the current bandwidth and keep the video fluent.

However, reducing the resolution causes a larger loss of video definition. For scenes with low definition requirements, such as an ordinary video telephone call, reducing the resolution may not greatly affect the visual effect. For scenes with high definition requirements, such as desktop sharing in a video conference, the shared content often focuses on text details and requires higher definition, so the resolution cannot be reduced and only the frame rate can be reduced. Most conventional methods for reducing the video frame rate, however, discard a run of preceding frames when the bandwidth fluctuates. That is, continuous frame loss occurs when the bandwidth drops, so the video presented to the video receiver stutters at that moment. Although this approach preserves definition, it sacrifices part of the fluency and still does not solve the problem of video stuttering.

Disclosure of Invention

The embodiment of the application provides a video processing method and device based on statistics, which are used for solving the following technical problem: existing methods for reducing the video frame rate use unreasonable strategies for retaining and discarding video frames, cannot ensure the fluency of the video, and may cause video stuttering.

The embodiment of the application adopts the following technical scheme:

In one aspect, an embodiment of the present application provides a video processing method based on statistics, where the method includes: detecting the current input frame rate of a received video input sequence in real time, and detecting the current available bandwidth of a server in real time; determining a target output frame rate based on a target coding rate, the current input frame rate, and the current available bandwidth, wherein the target coding rate is obtained according to the resolution and the maximum frame rate of the video input sequence; determining a current frame extraction factor corresponding to the video input sequence based on the target output frame rate, the current input frame rate and a pre-stored frame extraction factor table; generating a current frame extraction mapping table based on the current frame extraction factor; determining the value of the current video frame in the current frame extraction mapping table according to the frame sequence number of the current video frame in the video input sequence; and determining whether the current video frame needs to be retained based on the value of the current video frame in the current frame extraction mapping table.

According to the embodiment of the application, the target output frame rate is calculated adaptively according to the current available bandwidth, a frame extraction mapping table is then generated according to the target output frame rate, and the frame extraction mapping table determines which frames in the current video frame sequence are retained and which are discarded. The frames to be discarded are evenly distributed. By extracting video frames in this way, the video can be extracted adaptively and uniformly as the bandwidth changes, which avoids video stuttering.

In a possible implementation manner, determining whether the current video frame needs to be retained based on the value of the current video frame in the current frame extraction mapping table specifically includes: if the value of the current video frame in the current frame extraction mapping table is 1, retaining the current video frame; and if the value of the current video frame in the current frame extraction mapping table is 0, discarding the current video frame.

In a possible implementation manner, after determining whether the current video frame needs to be retained based on the value of the current video frame in the current frame extraction mapping table, the method further includes: arranging all the retained video frames in time order to obtain a video output sequence; transmitting the video output sequence to a video encoder for encoding to obtain an output video stream; and sending the output video stream to a video receiving end.

In a possible implementation manner, the detecting, in real time, the current input frame rate of the received video input sequence specifically includes: counting the number of frames of the video input sequence received in the script execution time through a pre-created frame counting script; dividing the number of frames of the received video input sequence by the script execution time to obtain the current input frame rate of the video input sequence; wherein the frame number statistics script is executed once every a seconds, and the value of a is related to the timer trigger time corresponding to the frame number statistics script.

In the video call process, in some scenes, the video sent by the video sender needs to be stored in the cloud end, and then sent to the video receiver from the cloud end, and in the transmission process, the phenomenon of video frame loss can be possibly caused, so that the real-time frame rate of the current video input needs to be detected in real time.

In a possible implementation manner, the determining the target output frame rate based on the target coding rate, the current input frame rate and the current available bandwidth specifically includes: obtaining the target output frame rate n according to n=m×b/t; wherein m is the current input frame rate, b is the current available bandwidth, and t is the target coding rate.

In a possible implementation manner, determining the current frame extraction factor of the video input sequence based on the target output frame rate, the current input frame rate and a pre-stored frame extraction factor table specifically includes: pre-storing a frame extraction factor table, wherein the frame extraction factor table comprises an index row, a numerator row and a denominator row, and each column of the frame extraction factor table has an index value corresponding to a numerator value and a denominator value; obtaining a frame extraction proportion r of the video input sequence according to r=n/m; in the case where r is equal to 1, retaining all video frames in the video input sequence; in the case where r is smaller than 1, dividing the numerator value in each column of the frame extraction factor table by the denominator value to obtain the ratio of each column; traversing the ratio of each column, and comparing r with the traversed ratio; stopping the traversal when a ratio larger than r is reached for the first time, and determining that ratio as a first ratio; determining the index value corresponding to the first ratio; subtracting one from the index value to obtain a target index value; and determining the ratio of the numerator value to the denominator value corresponding to the target index value as the current frame extraction factor, wherein the current frame extraction factor is expressed as a fraction.

According to the embodiment of the application, the frame extraction proportion of the video input sequence is determined by calculating the ratio of the target output frame rate to the current input frame rate, then a proper frame extraction factor is searched in a frame extraction factor table which is carefully designed in advance, and then a frame extraction mapping table is generated according to a numerator and a denominator corresponding to the frame extraction factor, so that which frames should be extracted and which frames are discarded are determined. The scheme can uniformly extract video frames in the current input frame rate according to the current target output frame rate.

In a possible embodiment, before subtracting one from the index value to obtain the target index value, the method further includes: and if the index value corresponding to the first ratio is 0, the target index value is directly set to be 0.

In a possible implementation manner, generating the current frame extraction mapping table based on the current frame extraction factor specifically includes: generating an array with a preset length; starting from the first element, taking every Y elements of the array as an element group, wherein Y is the denominator value of the current frame extraction factor, and the number of elements contained in the last element group is the preset length modulo Y; and setting the values of the first X elements in each element group to 1 and the values of the remaining elements to 0 to obtain the current frame extraction mapping table, wherein X is the numerator value of the current frame extraction factor.

In a possible implementation manner, determining the value of the current video frame in the current frame extraction mapping table according to the frame sequence number of the current video frame in the video input sequence specifically includes: setting a frame sequence number for each video frame in the received video input sequence; determining the frame sequence number of the current video frame; taking the frame sequence number of the current video frame modulo the preset length to obtain the element number of the current video frame; and looking up the value corresponding to the element number in the current frame extraction mapping table, and determining that value as the value of the current video frame in the current frame extraction mapping table.

On the other hand, the embodiment of the application also provides a video processing device based on statistics, which comprises: at least one processor; and a memory communicatively coupled to the at least one processor; wherein the memory stores instructions executable by the at least one processor to enable the at least one processor to perform a statistics-based video processing method according to any one of the above embodiments.

According to the video processing method and device based on statistics, a frame extraction factor table is designed and a method for generating the frame extraction mapping table is defined, so that video frames can be extracted uniformly according to factors such as the current input frame rate of the video and the available bandwidth of the current network. The resulting video is smoother than video output by continuous frame dropping.

Drawings

In order to more clearly illustrate the embodiments of the present application or the technical solutions in the prior art, the drawings that are required to be used in the embodiments or the description of the prior art will be briefly described below, and it is obvious that the drawings in the following description are only some embodiments described in the present application, and that other drawings may be obtained according to these drawings without inventive effort to a person skilled in the art. In the drawings:

fig. 1 is a flowchart of a video processing method based on statistics according to an embodiment of the present application;

fig. 2 is a schematic structural diagram of an adaptive smooth frame extraction module according to an embodiment of the present application;

fig. 3 is a schematic structural diagram of a video processing device based on statistics according to an embodiment of the present application.

Detailed Description

In order to better understand the technical solutions in the present application, the following description will clearly and completely describe the technical solutions in the embodiments of the present application with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only some embodiments of the present application, not all embodiments. All other embodiments, which can be made by one of ordinary skill in the art based on the embodiments herein without making any inventive effort, shall fall within the scope of the present application.

First, the embodiment of the present application provides a video processing method based on statistics, as shown in fig. 1, the video processing method based on statistics specifically includes steps 101-106:

step 101, detecting the current input frame rate of the video input sequence in real time, and detecting the current available bandwidth of the server in real time.

Specifically, an adaptive smooth frame extraction module is generated in the statistics-based video processing device according to a video receiver. The statistics-based video processing device may be a terminal for sending a video stream, or may be a cloud server for temporarily storing the video stream. For example, in one-to-one video call, an adaptive smooth frame extraction module is generated in the terminal of the video sender, and the video frame extraction is processed and then sent to the video receiver. In a multi-user video conference, a video sender sends a video segment to a cloud server for temporary storage, so that a plurality of users can subscribe the video in the cloud server, at this time, in the cloud server, a self-adaptive smooth frame extraction module is correspondingly generated for each user subscribing the video, and the video is respectively subjected to frame extraction processing and then sent to a corresponding video subscriber.

Further, in a multi-person video conference, operations such as encoding and decoding are performed on the video while it is sent to the cloud server, so the frame rate of the video may be reduced or may fluctuate. The current input frame rate therefore needs to be detected in real time. This function is implemented as follows: a frame number statistics script is created in advance to count the number of frames of the video input sequence received during the script execution time, and that number of frames is then divided by the script execution time to obtain the current input frame rate of the video input sequence. A timer that is triggered every a seconds is created in the frame number statistics script, so that the script executes automatically every a seconds.

In one embodiment, the timer is typically triggered every 1 second to cause the frame count script to execute every second to count the real-time frame rate of the video input sequence received by the adaptive smoothing frame extraction module every second.
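The frame counting described above can be sketched in Python as follows. This is a minimal illustration, not the implementation of the application: the class name `FrameRateCounter`, its methods, and the injectable `clock` parameter are hypothetical names introduced here, and the timer that calls `sample()` every a seconds is left to the caller.

```python
import time


class FrameRateCounter:
    """Counts frames received in a window and reports frames per second.

    Hypothetical helper: the caller is expected to invoke sample() from a
    timer that fires every `a` seconds (1 second in the embodiment above).
    """

    def __init__(self, clock=time.monotonic):
        self._clock = clock              # injectable clock (assumption, eases testing)
        self._window_start = clock()     # start time of the current window
        self._frames = 0                 # frames received in the current window

    def on_frame(self):
        """Record that one video frame was received."""
        self._frames += 1

    def sample(self):
        """Return the frame rate of the window that just ended, then reset."""
        now = self._clock()
        elapsed = now - self._window_start
        rate = self._frames / elapsed if elapsed > 0 else 0.0
        self._window_start = now
        self._frames = 0
        return rate
```

Dividing by the measured elapsed time rather than by the nominal timer period keeps the estimate reasonable when the timer fires slightly late.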

Further, in the field of real-time audio and video, the video receiver is connected to the cloud server that sends the video, and the cloud server can use existing techniques to detect in real time the bandwidth currently available for transmission. This value is the current available bandwidth: if the video is sent at a rate above it, the video stutters, and if the video is sent at a rate below it, bandwidth is wasted. Typically, the current available bandwidth is less than the target coding rate. The adaptive smooth frame extraction module can therefore detect the current available bandwidth between the cloud server and the video receiver in real time through existing detection technology. In addition, because the current available bandwidth between the cloud server and each video receiver is not necessarily the same, and the current available bandwidth is an important factor affecting the number of frames extracted by the adaptive smooth frame extraction module, an adaptive smooth frame extraction module needs to be created for each video receiver.

Step 102, the video processing device based on statistics determines a target output frame rate based on the target encoding rate, the current input frame rate, and the current available bandwidth.

Specifically, in the real-time audio and video field, a target coding rate is allocated according to the resolution and the maximum frame rate of a video stream when video coding is performed. For example, a video stream with a resolution of 1920×1080 and a maximum frame rate of 30 frames/second is allocated a target coding rate of 2500 kbps, and a video stream with a resolution of 1280×720 and a maximum frame rate of 30 frames/second is allocated a target coding rate of 1800 kbps. Typically, the frame rate of a video stream does not exceed 30 frames/second, so the maximum frame rate of most video streams is 30 frames/second. Therefore, after the adaptive smooth frame extraction module in the statistics-based video processing device receives the video input sequence sent by the video sender, it allocates a target coding rate to the video input sequence according to the resolution and the maximum frame rate of the video input sequence, following the requirements of the real-time audio and video field or the experience of technicians.

Further, after the adaptive smoothing frame extraction module obtains the target coding rate, the current input frame rate and the current available bandwidth of the input video sequence, the target output frame rate n is calculated according to the formula n=m×b/t. Wherein m is the current input frame rate, b is the current available bandwidth, and t is the target coding rate.
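As a small worked illustration of the formula n=m×b/t, the following Python sketch computes the target output frame rate. The function name `target_output_frame_rate` and the sample bandwidth of 1250 kbps are assumptions chosen for illustration; the 2500 kbps target coding rate is the 1920×1080 example from the text. The only requirement on units is that b and t use the same one.

```python
def target_output_frame_rate(m, b, t):
    """Compute n = m * b / t.

    m: current input frame rate (frames/second)
    b: current available bandwidth (same unit as t, e.g. kbps)
    t: target coding rate (same unit as b)
    """
    if t <= 0:
        raise ValueError("target coding rate must be positive")
    return m * b / t


# 30 frames/second input, 1250 kbps available, 2500 kbps target coding rate.
print(target_output_frame_rate(30, 1250, 2500))  # 15.0
```

When the available bandwidth is half the target coding rate, the target output frame rate is half the input frame rate, so the bits spent per retained frame stay roughly unchanged.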

Step 103, the video processing device based on statistics determines the current frame extraction factor corresponding to the video input sequence based on the target output frame rate, the current input frame rate and a pre-stored frame extraction factor table.

Firstly, an adaptive smooth frame extraction module in the video processing equipment based on statistics calculates the proportion of the target output frame rate and the current input frame rate according to the calculated target output frame rate, wherein the proportion is the proportion of the video input sequence needing frame extraction within the current 1 second. Specifically, according to r=n/m, the frame extraction proportion r of the video input sequence in the current 1 second is obtained.

It should be noted that "frame extraction" in the present application means that video frames that need to be preserved are extracted from a video input sequence.

Further, a frame extraction factor table is pre-stored in the self-adaptive smooth frame extraction module:

Index	0	1	2	3	4	5	6	7	8	9	10	11	12	13
Numerator	1	1	1	1	1	1	1	1	3	2	3	4	5	9
Denominator	30	15	10	6	5	4	3	2	5	3	4	5	6	10

TABLE 1

As shown in Table 1, the frame extraction factor table includes an index row, a numerator row, and a denominator row. Each column of the frame extraction factor table has an index value corresponding to a numerator value and a denominator value. Note that the frame extraction factor table shown in Table 1 is a fixed table designed in advance, and its content generally does not change.

Under the condition that the frame extraction proportion r is equal to 1, the current bandwidth is indicated to ensure that the current video input sequence is completely output under the condition of no frame loss, so that frame extraction is not needed, and all video frames in the video input sequence are reserved.

When the frame extraction proportion r is smaller than 1, the current bandwidth cannot support outputting the current video input sequence without frame loss, so frame extraction is needed. In this case, the adaptive smooth frame extraction module computes the ratio of the numerator value to the denominator value in each column, and traverses the ratios of the columns of the frame extraction factor table from left to right.

Further, each time a column is traversed, the frame extraction proportion r is compared with the ratio of that column, until a ratio larger than r is found for the first time. This ratio is the first ratio, and the index value corresponding to the first ratio is then determined. One is subtracted from that index value to obtain the target index value. The ratio of the numerator value to the denominator value corresponding to the target index value is determined as the current frame extraction factor, which is expressed as a fraction.

In one embodiment, if r is calculated to be 0.8, then 0.8 is compared, from left to right, with the numerator-to-denominator ratios in Table 1: 1/30, 1/15, 1/10, 1/6, ..., 9/10, until the first ratio greater than 0.8 is found. Since 4/5 is 0.8 and 5/6 is about 0.83, the first ratio greater than 0.8 is 5/6, and the corresponding index value is 12. Then 12-1=11, so the target index value is 11. Since the numerator corresponding to index 11 is 4 and the denominator is 5, the current frame extraction factor is 4/5.

As a possible embodiment, in the case where the index value corresponding to the first ratio larger than the frame extraction proportion r is 0, the target index value is set to 0.

For example, if r is 0.01, then 0.01 is smaller than 1/30, so the first ratio larger than r is 1/30 and the corresponding index value is 0. In this case 1 is not subtracted from the index value 0; it is used directly as the target index value, and the current frame extraction factor is 1/30, which corresponds to the target index value 0.
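The lookup in Table 1 can be sketched in Python as follows. This is an illustrative sketch under stated assumptions rather than the implementation of the application: the function name `find_extraction_factor` is hypothetical, values of r above 1 are treated like r equal to 1, and the case where no column ratio exceeds r (r between 9/10 and 1) is not covered by the text, so the sketch falls back to the last column.

```python
from fractions import Fraction

# Table 1, copied column by column (index 0..13).
NUMERATORS = [1, 1, 1, 1, 1, 1, 1, 1, 3, 2, 3, 4, 5, 9]
DENOMINATORS = [30, 15, 10, 6, 5, 4, 3, 2, 5, 3, 4, 5, 6, 10]


def find_extraction_factor(r):
    """Return (numerator, denominator), or None when every frame is kept."""
    if r >= 1:
        # r == 1 keeps all frames; treating r > 1 the same way is an assumption.
        return None
    for index, (x, y) in enumerate(zip(NUMERATORS, DENOMINATORS)):
        if Fraction(x, y) > r:  # first ratio strictly larger than r
            target = index - 1 if index > 0 else 0
            return NUMERATORS[target], DENOMINATORS[target]
    # No ratio exceeds r: not specified in the text, so the last column
    # (9/10) is used here as an assumption.
    return NUMERATORS[-1], DENOMINATORS[-1]


print(find_extraction_factor(0.8))   # (4, 5)
print(find_extraction_factor(0.01))  # (1, 30)
```

Comparing against exact `Fraction` values avoids rounding each column ratio to a float before the comparison.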

Step 104, the video processing device based on statistics generates a current frame extraction mapping table based on the current frame extraction factor.

Specifically, the adaptive smooth frame extraction module generates an array with a preset length after obtaining the current frame extraction factor. As a preferred embodiment, the array has a length of 30 and the array element members are of the bool type.

Then, according to the numerator value X and the denominator value Y of the current frame extraction factor, every Y elements of the array, starting from the first element, are taken as an element group. The number of elements contained in the last element group is the preset length modulo Y. The values of the first X elements in each element group are then set to 1 and the values of the remaining elements are set to 0, which yields the current frame extraction mapping table.

For example, if the current frame extraction factor is 2/3, then since 30 is divisible by 3, every 3 elements of the array from left to right form an element group. In each element group the values of the first 2 elements are set to 1 and the value of the 3rd element is set to 0, which yields the current frame extraction mapping table corresponding to the frame extraction factor 2/3, as shown in Table 2:

1	1	0	1	1	0	1	1	0	1	1	0	1	1	0	1	1	0	1	1	0	1	1	0	1	1	0	1	1	0

TABLE 2

If the current frame extraction factor is 3/4, then since 30 is not divisible by 4, every 4 elements of the array from left to right form an element group and the remaining elements form the last element group, so the last element group contains 30 % 4 = 2 elements. In each full element group the values of the first 3 elements are 1 and the value of the 4th element is 0. The last element group has only 2 elements, fewer than 3, so both of its values are 1. The current frame extraction mapping table corresponding to the frame extraction factor 3/4 is shown in Table 3:

1	1	1	0	1	1	1	0	1	1	1	0	1	1	1	0	1	1	1	0	1	1	1	0	1	1	1	0	1	1

TABLE 3
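The construction of the mapping table can be sketched in Python as follows. The function name `build_extraction_map` is a hypothetical name introduced for illustration, and the sketch stores 1 and 0 as integers to match the tables, whereas the preferred embodiment uses bool elements.

```python
def build_extraction_map(x, y, length=30):
    """Build a 0/1 map: in every group of y elements the first x are 1.

    x: numerator of the current frame extraction factor
    y: denominator of the current frame extraction factor
    length: preset array length (30 in the preferred embodiment)
    """
    # i % y is the position of element i inside its group, so a shorter
    # last group is handled by the same rule as the full groups.
    return [1 if i % y < x else 0 for i in range(length)]


print(build_extraction_map(2, 3))  # the pattern 1, 1, 0 repeated ten times
print(build_extraction_map(3, 4))  # 1, 1, 1, 0 repeated seven times, then 1, 1
```

Note that for the factor 3/4 the map retains 23 of 30 frames rather than 22.5, because the shortened last group contains only retained elements.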

Step 105, the statistics-based video processing device determines the value of the current video frame in the current frame extraction mapping table according to the frame number of the current video frame in the video input sequence.

Specifically, the adaptive smooth frame extraction module generated by the video processing device sets a frame sequence number for each frame video frame in the currently received video input sequence starting from 0. The frame number of the first frame of the video input sequence is 0, the frame number of the second frame is 1, and so on.

Further, the adaptive smooth frame extraction module first sets the corresponding frame sequence number for the received current video frame, and then takes that frame sequence number modulo the preset length to obtain the element number of the current video frame. It then looks up the value corresponding to the element number in the current frame extraction mapping table, and determines that value as the value of the current video frame in the current frame extraction mapping table.

In one embodiment, if the current frame extraction mapping table is as shown in Table 2, the adaptive smooth frame extraction module takes the frame sequence number of each video frame received in the current 1 second modulo 30 to obtain the element number of each video frame in the current 1 second. The value corresponding to the element number of each video frame is then looked up in Table 2. For example, if the frame sequence number of a certain video frame is 40, then 40 % 30 = 10 is calculated, and the value corresponding to the 10th element in Table 2 is found to be 1; that is, the value of this video frame in the current frame extraction mapping table is 1. Here, the element number of the first element from the left in Table 2 is 1, the element number of the second element is 2, and so on.
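The lookup can be sketched in Python as follows. The function name `map_value_for_frame` is hypothetical, and the sketch assumes 0-based element numbering, as is usual for Python lists, which differs from the 1-based numbering used in the example above. For frame sequence number 40 both conventions give the value 1, because the 10th and 11th elements of Table 2 are both 1.

```python
def map_value_for_frame(frame_number, extraction_map):
    """Return the map value (1 = retain, 0 = discard) for a frame number."""
    # Assumption: element numbers are 0-based list indices.
    element_number = frame_number % len(extraction_map)
    return extraction_map[element_number]


table_2 = [1, 1, 0] * 10  # mapping table for the factor 2/3
print(map_value_for_frame(40, table_2))  # 40 % 30 = 10, prints 1
```

With 0-based numbering every result of the modulo operation, including 0, is a valid element number.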

Step 106, the statistics-based video processing device determines whether the current video frame needs to be reserved or not based on the value of the current video frame in the current frame extraction mapping table.

Specifically, if the value of the current video frame in the current frame extraction mapping table is 1, the current video frame is retained. If the value of the current video frame in the current frame extraction mapping table is 0, the current video frame is discarded. All retained video frames are arranged in time order to obtain a video output sequence. The video output sequence is sent to a video encoder for encoding to obtain an output video stream, and the output video stream is sent to a video receiving end.

In one embodiment, all video frames in the current 1 second whose corresponding value in the current frame extraction mapping table is 1 are retained, and all video frames whose corresponding value is 0 are discarded. After the retained frames are arranged in order and encoded, a video output sequence whose frame rate is the target output frame rate n is output.

It should be noted that one pass through steps 101-106 covers only 1 second. Because the current input frame rate of the video stream and the current available bandwidth of the server change in real time, the frame extraction mapping table generated in each second may differ, so the statistics-based video processing method described in steps 101-106 needs to be repeated every second.

In one embodiment, fig. 2 is a schematic structural diagram of an adaptive smooth frame extraction module provided in the embodiment of the present application. As shown in fig. 2, the adaptive smooth frame extraction module includes 2 adaptive modules: an input frame rate real-time detection module and a dynamic frame extraction mapping table generation module.

In fig. 2, the whole process of adaptive smooth frame extraction on one second of a video input sequence is as follows:

The video input sequence continuously inputs video frames to the adaptive smooth frame extraction module. After the adaptive smooth frame extraction module receives the video input sequence, it detects the current input frame rate of the video input sequence in real time through the input frame rate real-time detection module, obtains the target coding rate and the current available bandwidth, and then inputs the current input frame rate m, the target coding rate t and the current available bandwidth b as parameters to the dynamic frame extraction mapping table generation module. The dynamic frame extraction mapping table generation module calculates the current target output frame rate n through the adaptive smooth frame extraction function fun(m, t, b), and then generates a frame extraction mapping table according to the ratio of the target output frame rate to the current input frame rate. The adaptive smooth frame extraction module extracts frames from the video input sequence according to the frame extraction mapping table: it retains the video frames corresponding to 1 in the frame extraction mapping table and discards the video frames corresponding to 0, thereby outputting a video output sequence with a frame rate of n, which is sent to the video encoder for encoding.
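The one-second flow of fig. 2 can be sketched end to end in Python as follows. This is a hedged sketch, not the module of the application: the names `extraction_map_for` and `process_second` are hypothetical, frame sequence numbers are assumed to be 0-based, r greater than or equal to 1 is assumed to keep every frame, and the fallback to the last column when no ratio in Table 1 exceeds r is an assumption, since the text does not cover that case.

```python
from fractions import Fraction

# Table 1 as (numerator, denominator) pairs, index 0..13.
FACTOR_TABLE = [(1, 30), (1, 15), (1, 10), (1, 6), (1, 5), (1, 4), (1, 3),
                (1, 2), (3, 5), (2, 3), (3, 4), (4, 5), (5, 6), (9, 10)]
MAP_LENGTH = 30  # preset array length of the preferred embodiment


def extraction_map_for(m, t, b):
    """Return the 0/1 map for input rate m, coding rate t, bandwidth b."""
    n = m * b / t                      # target output frame rate
    r = n / m                          # frame extraction proportion
    if r >= 1:
        return [1] * MAP_LENGTH        # keep every frame (r > 1: assumption)
    target = len(FACTOR_TABLE) - 1     # assumed fallback when no ratio exceeds r
    for index, (x, y) in enumerate(FACTOR_TABLE):
        if Fraction(x, y) > r:         # first ratio larger than r
            target = max(index - 1, 0)
            break
    x, y = FACTOR_TABLE[target]
    return [1 if i % y < x else 0 for i in range(MAP_LENGTH)]


def process_second(frames, m, t, b, first_frame_number=0):
    """Return the frames retained from one second of input, in time order."""
    mapping = extraction_map_for(m, t, b)
    return [frame for offset, frame in enumerate(frames)
            if mapping[(first_frame_number + offset) % MAP_LENGTH] == 1]


one_second = list(range(30))  # stand-ins for 30 video frames
print(len(process_second(one_second, m=30, t=2500, b=2000)))  # 24
```

In this example r is 0.8, the selected factor is 4/5, and 24 of the 30 frames are retained, which matches the target output frame rate n = 24.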

In addition, the present application further provides a statistics-based video processing apparatus, as shown in fig. 3, the statistics-based video processing apparatus 300 includes:

at least one processor 301; and a memory 302 communicatively coupled to the at least one processor 301; wherein the memory 302 stores instructions executable by the at least one processor 301 to enable the at least one processor 301 to perform:

detecting the current input frame rate of a received video input sequence in real time, and detecting the current available bandwidth of the server in real time;

determining a target output frame rate based on the target coding rate, the current input frame rate, and the current available bandwidth; wherein the target coding rate is allocated according to the resolution and the maximum frame rate of the video input sequence;

determining a current frame extraction factor corresponding to a video input sequence based on a target output frame rate, a current input frame rate and a pre-stored frame extraction factor table;

generating a current frame extraction mapping table based on the current frame extraction factor;

determining the value of the current video frame in the current frame extraction mapping table according to the frame sequence number of the current video frame in the video input sequence;

determining whether the current video frame needs to be retained based on the value of the current video frame in the current frame extraction mapping table.
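The last four steps above can be sketched as follows, following the procedure recited in claims 6 to 9. This is only a sketch under stated assumptions: the contents of the frame extraction factor table and the preset table length are hypothetical values chosen for illustration, the table columns are assumed to be sorted by ascending ratio, and falling back to the last column when no ratio exceeds r is an assumption not stated in the text.

```python
# Hypothetical frame extraction factor table: column i has index i and
# holds (numerator, denominator). Values are illustrative only.
FACTOR_TABLE = [(1, 4), (1, 3), (1, 2), (2, 3), (3, 4)]
MAP_LENGTH = 12  # hypothetical preset length of the mapping table


def select_factor(n, m, table=FACTOR_TABLE):
    """Pick the frame extraction factor for target rate n and input rate m."""
    r = n / m  # frame extraction proportion
    if r >= 1:
        return (1, 1)  # retain every frame (r > 1 handled the same: assumption)
    first = None
    for index, (num, den) in enumerate(table):
        if num / den > r:  # first ratio larger than r stops the traversal
            first = index
            break
    if first is None:
        target = len(table) - 1  # assumption: no larger ratio, use last column
    elif first == 0:
        target = 0  # index 0 cannot be decremented
    else:
        target = first - 1  # step back one column
    return table[target]


def build_mapping_table(x, y, length=MAP_LENGTH):
    """Groups of y elements; the first x of each group are 1, the rest 0.

    The last group holds length % y elements when y does not divide length.
    """
    return [1 if (i % y) < x else 0 for i in range(length)]


def keep_frame(frame_seq, mapping):
    """Look up the frame by its sequence number modulo the table length."""
    return mapping[frame_seq % len(mapping)] == 1
```

With the hypothetical table above, a target output frame rate of 18 and an input frame rate of 30 give r = 0.6; the first ratio larger than 0.6 is 2/3 at index 3, so the factor at index 2, namely 1/2, is selected. Because the columns are assumed to ascend, stepping back one column yields a ratio no larger than r, except when the first column already exceeds r.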

According to the embodiment of the application, a target output frame rate calculation formula, a frame extraction factor table and a frame extraction mapping table generation algorithm are designed into the adaptive smooth frame extraction module, so that frames are extracted uniformly from the video stream and the frame extraction proportion follows changes in the current available bandwidth and in the current input frame rate of the video input sequence. When the current available bandwidth is large, the proportion of retained frames is large and the proportion of discarded frames is correspondingly small, so that the highest-quality video stream the bandwidth allows is output and bandwidth resources are fully utilized. Meanwhile, uniform frame extraction gives a better visual effect than discarding consecutive frames, which reduces video stuttering and improves the user experience.
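The difference between uniform frame extraction and consecutive frame loss can be illustrated with a small comparison. The sketch below is purely illustrative: the function names are hypothetical, and the consecutive-drop table is a hypothetical baseline used for comparison, not a strategy described in the application. Both tables retain the same number of frames; they differ in the longest run of discarded frames, which corresponds to the longest visible pause.

```python
def uniform_table(keep, period, length):
    # Retain the first `keep` frames of every `period` frames.
    return [1 if (i % period) < keep else 0 for i in range(length)]


def consecutive_table(kept, length):
    # Hypothetical baseline: retain the first `kept` frames, drop the rest.
    return [1 if i < kept else 0 for i in range(length)]


def longest_drop_run(table):
    # Length of the longest run of consecutive discarded frames (zeros).
    longest = current = 0
    for value in table:
        current = current + 1 if value == 0 else 0
        longest = max(longest, current)
    return longest
```

For 30 input frames reduced to 15, uniform_table(1, 2, 30) never discards more than one frame in a row, whereas consecutive_table(15, 30) discards 15 frames in a row.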

All embodiments in the application are described in a progressive manner; identical and similar parts of the embodiments may be referred to one another, and each embodiment focuses on its differences from the other embodiments. In particular, the apparatus, device and non-volatile computer storage medium embodiments are described relatively briefly because they are substantially similar to the method embodiments; for relevant points, refer to the corresponding description of the method embodiments.

The foregoing describes specific embodiments of the present application. Other embodiments are within the scope of the following claims. In some cases, the actions or steps recited in the claims can be performed in a different order than in the embodiments and still achieve desirable results. In addition, the processes depicted in the accompanying figures do not necessarily require the particular order shown, or sequential order, to achieve desirable results. In some embodiments, multitasking and parallel processing are also possible or may be advantageous.

The foregoing is merely exemplary of the present application and is not intended to limit the present application. Various modifications and variations of the embodiments of the present application will be apparent to those skilled in the art. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the embodiments of the present application should be included in the scope of the claims of the present application.

Claims

1. A method of statistics-based video processing, the method comprising:

detecting the current input frame rate of a received video input sequence in real time, and detecting the current available bandwidth of a server in real time;

determining a target output frame rate based on a target coding rate, the current input frame rate, and the current available bandwidth; wherein the target coding rate is obtained according to the resolution of the video input sequence and a maximum frame rate;

determining a current frame extraction factor corresponding to the video input sequence based on the target output frame rate, the current input frame rate and a pre-stored frame extraction factor table;

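generating a current frame extraction mapping table based on the current frame extraction factor;

determining the value of the current video frame in the current frame extraction mapping table according to the frame sequence number of the current video frame in the video input sequence;

and determining whether the current video frame needs to be retained based on the value of the current video frame in the current frame extraction mapping table.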

2. The statistics-based video processing method according to claim 1, wherein determining whether the current video frame needs to be retained based on the value of the current video frame in the current frame extraction mapping table comprises:

if the value of the current video frame in the current frame extraction mapping table is 1, retaining the current video frame;

and if the value of the current video frame in the current frame extraction mapping table is 0, discarding the current video frame.

3. The statistics-based video processing method according to claim 1, wherein after determining whether the current video frame needs to be retained based on the value of the current video frame in the current frame extraction mapping table, the method further comprises:

arranging all the retained video frames in chronological order to obtain a video output sequence;

transmitting the video output sequence to a video encoder for encoding to obtain an output video stream;

and sending the output video stream to a video receiving end.

4. The statistics-based video processing method according to claim 1, wherein detecting the current input frame rate of the received video input sequence in real time comprises:

counting, through a pre-created frame counting script, the number of frames of the video input sequence received within the script execution time;

dividing the number of frames of the received video input sequence by the script execution time to obtain the current input frame rate of the video input sequence; wherein the frame counting script is executed once every a seconds, and the value of a is related to the timer trigger time corresponding to the frame counting script.

5. The method for processing the video based on statistics according to claim 1, wherein determining the target output frame rate based on the target coding rate, the current input frame rate and the current available bandwidth comprises:

obtaining the target output frame rate n according to n=m×b/t;

wherein m is the current input frame rate, b is the current available bandwidth, and t is the target coding rate.

6. The method according to claim 5, wherein determining the current frame extraction factor of the video input sequence based on the target output frame rate, the current input frame rate and a pre-stored frame extraction factor table, comprises:

pre-storing a frame extraction factor table; the frame extraction factor table comprises an index row, a numerator row and a denominator row, wherein each column of the frame extraction factor table is provided with an index value corresponding to a numerator value and a denominator value;

obtaining a frame extraction proportion r of the video input sequence according to r=n/m;

in case r is equal to 1, retaining all video frames in the video input sequence;

dividing the numerator value in each column of the frame extraction factor table by the denominator value to obtain the ratio of each column under the condition that r is smaller than 1;

traversing the ratio of each column, and comparing r with the traversed ratio;

stopping traversing after traversing to a ratio larger than r for the first time, and determining the ratio as a first ratio;

determining an index value corresponding to the first ratio;

subtracting one from the index value to obtain a target index value;

determining the ratio corresponding to the target index value as the current frame extraction factor; wherein the current frame extraction factor is expressed as a fraction.

7. The method of statistics-based video processing according to claim 6, wherein prior to subtracting one from the index value to obtain the target index value, the method further comprises:

and if the index value corresponding to the first ratio is 0, the target index value is directly set to be 0.

8. The statistics-based video processing method according to claim 6, wherein generating the current frame extraction mapping table based on the current frame extraction factor comprises:

generating an array with a preset length;

taking every Y elements of the array, starting from the first element, as an element group; wherein Y is the denominator value of the current frame extraction factor; the number of elements contained in the last element group is the value obtained by taking the preset length modulo Y;

setting the values of the first X elements in each element group to 1, and setting the values of the remaining elements to 0, to obtain the current frame extraction mapping table; wherein X is the numerator value of the current frame extraction factor.

9. The method according to claim 8, wherein determining the value of the current video frame in the current frame extraction mapping table according to the frame sequence number of the current video frame in the video input sequence specifically comprises:

setting a frame sequence number for each video frame in the received video input sequence;

determining a frame sequence number of the current video frame;

taking the frame sequence number of the current video frame modulo the preset length to obtain the element sequence number of the current video frame;

and searching the current frame extraction mapping table for the value corresponding to the element sequence number, and determining that value as the value of the current video frame in the current frame extraction mapping table.

10. A statistics-based video processing apparatus, the apparatus comprising:

at least one processor; and

a memory communicatively coupled to the at least one processor; wherein

the memory stores instructions executable by the at least one processor to enable the at least one processor to perform a statistics-based video processing method according to any one of claims 1-9.