CN109168013B - Method, device and equipment for extracting frame and computer readable storage medium


Info

Publication number
CN109168013B
CN109168013B (application CN201811094881.2A)
Authority
CN
China
Prior art keywords
frame
coding
rate
time
frame extraction
Prior art date
Legal status
Active
Application number
CN201811094881.2A
Other languages
Chinese (zh)
Other versions
CN109168013A (en)
Inventor
朱材源
Current Assignee
Guangzhou Cubesili Information Technology Co Ltd
Original Assignee
Guangzhou Huaduo Network Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Guangzhou Huaduo Network Technology Co Ltd filed Critical Guangzhou Huaduo Network Technology Co Ltd
Priority to CN201811094881.2A
Publication of CN109168013A
Application granted
Publication of CN109168013B

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/21Server components or server architectures
    • H04N21/218Source of audio or video content, e.g. local disk arrays
    • H04N21/2187Live feed
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/23Processing of content or additional data; Elementary server operations; Server middleware
    • H04N21/234Processing of video elementary streams, e.g. splicing of video streams, manipulating MPEG-4 scene graphs
    • H04N21/2343Processing of video elementary streams, e.g. splicing of video streams, manipulating MPEG-4 scene graphs involving reformatting operations of video signals for distribution or compliance with end-user requests or end-user device requirements
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/25Management operations performed by the server for facilitating the content distribution or administrating data related to end-users or client devices, e.g. end-user or client device authentication, learning user preferences for recommending movies
    • H04N21/266Channel or content management, e.g. generation and management of keys and entitlement messages in a conditional access system, merging a VOD unicast channel into a multicast channel
    • H04N21/2662Controlling the complexity of the video stream, e.g. by scaling the resolution or bitrate of the video stream based on the client capabilities

Abstract

The embodiment of the application discloses a method, a device, equipment and a computer readable storage medium for frame extraction, comprising the following steps: acquiring, in real time, a coding rate that is received by a video encoder and suits the current network environment; calculating the coding frame rate corresponding to the coding rate according to a preset correspondence between coding rate and coding frame rate; and performing frame extraction according to the coding frame rate to complete the encoding operation. The method solves the technical problem that existing frame extraction methods cause low fluency of live video when network conditions are poor.

Description

Method, device and equipment for extracting frame and computer readable storage medium
Technical Field
The present application relates to the field of communications technologies, and in particular, to a method, an apparatus, a device, and a computer-readable storage medium for frame extraction.
Background
With the development of communication technology, live webcasting has become more and more popular. Live webcasting is a new form of social networking in which users on different communication platforms can watch the same content simultaneously over a network; it mainly covers real-time broadcasts of games, movies, television dramas, talent shows and the like.
In live webcasting, the live video is formed roughly as follows: video acquisition equipment at the anchor end continuously acquires video frames, the acquired video frames undergo frame extraction processing, a video encoder encodes them into the live video, and finally the live video is sent through a server to each terminal for viewers to watch.
The existing frame extraction method comprises the following steps: a fixed coding frame rate is preset, and then frame extraction is carried out according to the fixed coding frame rate.
However, this frame extraction method does not account for network fluctuation; when network conditions are poor, the fluency of the live video drops.
Disclosure of Invention
The embodiment of the application provides a frame extraction method, a frame extraction device, frame extraction equipment and a computer readable storage medium, so that live video remains fluent even when network conditions are poor.
In view of the above, a first aspect of the present application provides a method for frame extraction, the method comprising:
acquiring a coding rate which is received by a video encoder and is suitable for the current network environment in real time;
calculating the coding frame rate corresponding to the coding code rate according to the preset corresponding relation between the coding code rate and the coding frame rate;
and performing frame extraction according to the coding frame rate to finish the coding operation.
Preferably,
after acquiring the coding rate suitable for the current network environment received by the video encoder in real time, before calculating the coding frame rate corresponding to the coding rate according to the preset corresponding relationship between the coding rate and the coding frame rate, the method further comprises the following steps:
judging whether the coding code rate exceeds a preset code rate interval or not;
if the coding code rate is larger than the maximum value in the code rate interval, assigning the maximum value to the coding code rate;
and if the coding code rate is smaller than the minimum value in the code rate interval, assigning the minimum value to the coding code rate.
Preferably,
calculating the coding frame rate corresponding to the coding code rate according to the preset corresponding relationship between the coding code rate and the coding frame rate specifically comprises the following steps:
acquiring a first difference value between the coding rate and a minimum value in the rate interval;
acquiring a first ratio of the first difference value to the length of the code rate interval;
calculating a sine parameter according to a preset formula y = sin(πx/2), wherein x represents the first ratio and y represents the sine parameter;
and multiplying the sine parameter by the length of the code rate interval, and adding the minimum value of the code rate interval to obtain the coding frame rate.
Preferably,
performing frame extraction according to the encoding frame rate to complete the encoding operation specifically comprises:
calculating the time interval between two adjacent frame extractions according to the coding frame rate;
and performing frame extraction on the acquired frames according to the time interval to finish the encoding operation.
Preferably,
calculating the time interval between two adjacent frame extractions according to the coding frame rate specifically comprises:
and acquiring a second ratio of 1000ms to the coding frame rate, and taking the second ratio as the time interval between two adjacent frame extractions.
Preferably,
calculating the time interval between two adjacent frame extractions according to the coding frame rate specifically comprises:
acquiring a second difference value between the acquisition frame rate of the image acquisition equipment and the coding frame rate;
and acquiring a third ratio of 1000ms to the second difference, and taking the third ratio as the time interval between two adjacent frame extractions.
Preferably,
before acquiring a second difference value between the acquisition frame rate and the encoding frame rate of the image acquisition device, the method further includes:
and acquiring the acquisition frame rate of the image acquisition equipment in real time.
Preferably,
performing frame extraction according to the time interval to complete the encoding operation specifically comprises:
acquiring system time of a current frame;
judging whether the last frame extraction exists;
if the last frame extraction exists, acquiring the theoretical system time of the last frame extraction, and taking the difference value between the system time of the current frame and the theoretical system time of the last frame extraction as a third difference value;
if the last frame extraction does not exist, acquiring the system time of the first frame, and taking the difference value between the system time of the current frame and the system time of the first frame as a third difference value;
and comparing the third difference value with the time interval, and performing frame extraction coding or frame extraction discarding according to a comparison result to finish coding operation.
Preferably,
if the last frame extraction exists, after the theoretical system time of the last frame extraction is obtained, and the difference value between the system time of the current frame and the theoretical system time of the last frame extraction is used as a third difference value, the method further comprises the following steps:
acquiring the actual system time of the last frame extraction, taking the difference value between the actual system time of the last frame extraction and the theoretical system time of the last frame extraction as a time residual, and comparing the time residual with an acquisition time interval, wherein the acquisition time interval is a fourth ratio of 1000ms to the acquisition frame rate;
if the time residual is larger than the acquisition time interval, adding the time residual to the value of the third difference; and if the time residual is smaller than the acquisition time interval, not changing the value of the third difference.
Preferably,
the obtaining of the theoretical system time of the last frame extraction specifically includes:
judging whether another frame extraction exists before the last frame extraction; if so, acquiring the theoretical system time of that earlier frame extraction as the reference time, and if not, acquiring the system time of the first acquired frame as the reference time;
and taking the sum of the reference time and the time interval calculated before the last frame extraction as the theoretical system time of the last frame extraction.
A second aspect of the present application provides an apparatus for frame extraction, including:
the encoding code rate obtaining unit is used for obtaining the encoding code rate which is received by the video encoder and is suitable for the current network environment in real time;
the coding frame rate calculation unit is used for calculating the coding frame rate corresponding to the coding code rate according to the preset corresponding relation between the coding code rate and the coding frame rate;
and the frame extracting unit is used for extracting frames according to the coding frame rate so as to finish the coding operation.
Preferably,
the device further comprises:
the coding rate control unit is used for judging whether the coding rate exceeds a preset rate interval;
if the coding code rate is larger than the maximum value in the code rate interval, assigning the maximum value to the coding code rate;
and if the coding code rate is smaller than the minimum value in the code rate interval, assigning the minimum value to the coding code rate.
Preferably,
the encoding frame rate calculation unit is specifically configured to:
acquiring a first difference value between the coding rate and a minimum value in the rate interval;
acquiring a first ratio of the first difference value to the length of the code rate interval;
calculating a sine parameter according to a preset formula y = sin(πx/2), wherein x represents the first ratio and y represents the sine parameter;
and multiplying the sine parameter by the length of the code rate interval, and adding the minimum value of the code rate interval to obtain the coding frame rate.
Preferably,
the frame extracting unit specifically includes:
the time interval determining subunit is used for calculating the time interval between two adjacent frame extractions according to the coding frame rate;
and the frame extraction execution subunit is used for extracting frames from the acquired frames according to the time interval so as to complete the coding operation.
Preferably,
the time interval determining subunit is specifically configured to:
and acquiring a second ratio of 1000ms to the coding frame rate, and taking the second ratio as the time interval between two adjacent frame extractions.
Preferably,
the time interval determining subunit is specifically configured to:
acquiring a second difference value between the acquisition frame rate of the image acquisition equipment and the coding frame rate;
and acquiring a third ratio of 1000ms to the second difference, and taking the third ratio as the time interval between two adjacent frame extractions.
Preferably,
the device further comprises:
and the acquisition frame rate acquisition unit is used for acquiring the acquisition frame rate of the image acquisition equipment in real time.
Preferably,
the frame extraction execution subunit is specifically configured to:
acquiring system time of a current frame;
judging whether the last frame extraction exists;
if the last frame extraction exists, acquiring the theoretical system time of the last frame extraction, and taking the difference value between the system time of the current frame and the theoretical system time of the last frame extraction as a third difference value;
if the last frame extraction does not exist, acquiring the system time of the first frame, and taking the difference value between the system time of the current frame and the system time of the first frame as a third difference value;
and comparing the third difference value with the time interval, and performing frame extraction coding or frame extraction discarding according to a comparison result to finish coding operation.
Preferably,
if the last frame extraction exists, the frame extraction execution subunit is further configured to:
acquiring the actual system time of the last frame extraction, taking the difference value between the actual system time of the last frame extraction and the theoretical system time of the last frame extraction as a time residual, and comparing the time residual with an acquisition time interval, wherein the acquisition time interval is a fourth ratio of 1000ms to the acquisition frame rate;
if the time residual is larger than the acquisition time interval, adding the time residual to the value of the third difference; and if the time residual is smaller than the acquisition time interval, not changing the value of the third difference.
A third aspect of the present application provides a device for frame extraction, the device comprising a processor and a memory:
the memory is used for storing program codes and transmitting the program codes to the processor;
the processor is configured to execute the frame extraction method according to the first aspect according to instructions in the program code.
A fourth aspect of the present application provides a computer-readable storage medium for storing program code for performing the method of the first aspect.
According to the technical scheme, the embodiment of the application has the following advantages:
in the embodiment of the application, a frame extraction method is provided, which comprises the steps of firstly acquiring a coding rate which is received by a video encoder and is suitable for a current network environment in real time; then, calculating the coding frame rate corresponding to the coding code rate according to the preset corresponding relation between the coding code rate and the coding frame rate; finally, frame extraction is carried out according to the coding frame rate so as to complete the coding operation; when the network environment is poor, although the network bandwidth is lower than that when the network condition is normal, in the embodiment of the application, the coding frame rate is correspondingly reduced along with the deterioration of the network environment, so that the live video formed by coding is in the transmission range of the network bandwidth, the technical problem of low fluency of the live video caused by poor network condition is avoided, and the network adaptability of the network live broadcast is improved.
Drawings
FIG. 1 is an architecture diagram of a frame extraction system according to an embodiment of the present application;
FIG. 2 is a flowchart illustrating a method of a frame extraction method according to a first embodiment of the present application;
FIG. 3 is a flowchart illustrating a method of a frame extraction method according to a second embodiment of the present application;
FIG. 4 is a flowchart of a method for calculating an encoding frame rate according to a second embodiment of the present application;
FIG. 5 is a flowchart of a method for extracting frames at time intervals according to an embodiment of the present application;
FIG. 6 is a schematic structural diagram of a frame extracting apparatus according to an embodiment of the present application;
fig. 7 is a schematic structural diagram of a frame extracting device in an embodiment of the present application.
Detailed Description
In order to make the technical solutions of the present application better understood, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only a part of the embodiments of the present application, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
In the existing frame extraction method, frame extraction and encoding are performed at a fixed coding frame rate, so the coding rate of the resulting live video is fixed. Assume the coding rate of a live video coded at the fixed coding frame rate is 1000kbps. When the network environment is normal, suppose the network bandwidth can carry a live video with a coding rate of 1200kbps; when the network environment changes, the bandwidth may only carry a coding rate of 800kbps, so transmitting a live video coded at 1000kbps may stutter and lose fluency, giving users a poor viewing experience.
Therefore, the embodiment of the application provides a frame extraction method, so that the coding rate of a live video formed by coding can adapt to a network environment, the coding rate of the live video is in a network bandwidth transmission range, the live video is prevented from being unsmooth due to network environment fluctuation, and the watching experience of a user is improved.
As an example, fig. 1 shows a frame-extracting system architecture diagram in an embodiment of the present application, as shown in fig. 1, fig. 1 includes a live terminal 400 and a server 300. The live broadcast terminal 400 may be a device having a video capture function, a communication function, and a video encoding function, and may be, for example, a smart phone, a tablet computer, a notebook computer, a desktop computer, or the like; the live terminal 400 may further include a separate video capture device 100 and a video encoding device 200 having a communication function, and the live terminal 400 is not limited in this embodiment.
It is understood that the frame-extracting system shown in fig. 1 can be applied to a live scene, for example, a main broadcast acquires an original video through a live terminal 400, then performs frame-extracting coding on the original video to form a live video, and transmits the live video to a server 300 through a network, so that the server 300 transmits the live video to each terminal watching the live video.
In this scenario, the live broadcast terminal 400 may obtain a coding rate adapted to the network environment in real time, calculate the corresponding coding frame rate from that coding rate, and perform frame extraction and encoding at that frame rate to form the live video. The coding rate of the live video then tracks the network environment: even if the network environment deteriorates, the live video can still be carried by the network bandwidth, so its fluency does not drop.
It should be noted that the above application scenarios are only shown for the convenience of understanding the present application, and the embodiments of the present application are not limited in any way in this respect. Rather, embodiments of the present application may be applied to any scenario where applicable.
Various non-limiting embodiments of the present application are described in detail below with reference to the accompanying drawings.
Referring to fig. 2, fig. 2 is a flowchart illustrating a method of a frame extraction method according to a first embodiment of the present application, as shown in fig. 2, the method may specifically include:
step 201, acquiring the coding rate suitable for the current network environment received by the video encoder in real time.
It can be understood that the video encoder receives an encoding bitrate suitable for the current network environment; for example, when the network environment is good, the video encoder is notified that the suitable encoding bitrate is 1200kbps, and when the network environment degrades, it is notified that the suitable encoding bitrate is 1000kbps. The embodiment of the present application acquires this encoding bitrate in real time.
In the embodiment of the present application, the video encoder may be a module integrated in a live terminal, or may be a separate video encoding device.
Step 202, calculating the coding frame rate corresponding to the coding code rate according to the preset corresponding relationship between the coding code rate and the coding frame rate.
It can be understood that the encoding frame rate adapted to the network environment may be calculated by presetting a corresponding relationship, and the corresponding relationship is various, which is not limited in the embodiment of the present application.
And step 203, performing frame extraction according to the coding frame rate to complete the coding operation.
It can be understood that there are various ways to perform frame extraction according to the encoding frame rate; for example, multiple frames may be extracted from the original video for encoding, or one or more frames may be extracted from the original video for discarding, and the remaining frames may be encoded; the present application does not specifically limit the frame extraction method.
According to the embodiment of the application, the coding frame rate is calculated and adjusted in real time according to the change of the network environment, so that the coding rate of the live video formed by coding is within the transmission range of the network bandwidth, the phenomena of low fluency and stuttering of the live video caused by network fluctuation are avoided, and the watching experience of the user is improved.
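To make this flow concrete, here is a minimal Python sketch of the three steps; the function names are illustrative assumptions, and the correspondence shown is only a stand-in placeholder (a concrete sine-based correspondence is sketched in the second embodiment below).

```python
# Minimal sketch of steps 201-203; names and the placeholder mapping are
# assumptions for illustration, not part of the claimed embodiment.

def bitrate_to_framerate(bitrate_kbps: float) -> float:
    """Step 202: a preset correspondence between coding rate and coding
    frame rate. Placeholder: linear scaling clamped into 10-50 fps."""
    return max(10.0, min(50.0, bitrate_kbps / 24.0))

def on_bitrate_update(bitrate_kbps: float) -> float:
    """Step 201: called whenever the video encoder receives a coding rate
    suited to the current network environment. The returned frame rate
    then drives frame extraction (step 203)."""
    return bitrate_to_framerate(bitrate_kbps)

# Example: a 1200 kbps notification yields 50 fps under this placeholder.
```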
Referring to fig. 3, fig. 3 is a flowchart illustrating a method of a frame extraction method according to a second embodiment of the present application, as shown in fig. 3, the method may specifically include:
step 301, obtaining the coding rate suitable for the current network environment received by the video encoder in real time.
Step 302, judging whether the coding code rate exceeds a preset code rate interval;
if the coding code rate is greater than the maximum value in the code rate interval, assigning the maximum value to the coding code rate;
and if the coding code rate is smaller than the minimum value in the code rate interval, assigning the minimum value to the coding code rate.
In the embodiment of the application, the coding code rate is limited in the preset code rate interval through step 302, and different user requirements are met by setting different code rate intervals; for example, assuming that the code rate interval is [600kbps, 1000kbps ], if the three coding code rates obtained in step 301 are 500kbps, 800kbps and 1100kbps, the coding code rates are 600kbps, 800kbps and 1000kbps in sequence after the processing of step 302.
In the embodiment of the present application, the preset code rate interval is related to the type of the video encoder. For example, the coding rate of an H.264 video encoder is generally 1200kbps, so the maximum of the code rate interval is generally set to no more than 1200kbps; the coding rate of an H.265 video encoder is generally 1000kbps, so the maximum of the code rate interval is generally set to no more than 1000kbps.
The preset code rate interval can also be related to the service scenario. For example, during live broadcast, a co-streaming (Lianmai) scenario has higher real-time requirements, so packet loss must be avoided as far as possible. Specifically, assume that the maximum of the code rate interval without co-streaming is set to 1000kbps, and that the maximum coding rate of live video the network bandwidth can carry is 1100kbps; when the network environment deteriorates, the bandwidth drops and can easily fall below what is needed to carry live video coded at 1000kbps, causing packet loss. Therefore, the maximum of the code rate interval in the co-streaming scenario can be set lower than in the non-co-streaming scenario, for example 800kbps, to reduce the risk of packet loss.
The preset code rate interval may also be related to resolution; for example, the code rate intervals corresponding to standard-definition, high-definition and ultra-high-definition resolutions naturally differ.
It can be understood that, in order to ensure the quality of the live video image, the coding rate is limited by the minimum value of the rate interval.
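A minimal sketch of this clamping step (step 302), assuming the example interval [600kbps, 1000kbps] used in the text:

```python
def clamp_bitrate(bitrate_kbps: float,
                  lo_kbps: float = 600.0,
                  hi_kbps: float = 1000.0) -> float:
    """Limit the coding rate to the preset code rate interval: values above
    the maximum are assigned the maximum, values below the minimum are
    assigned the minimum (which also protects image quality)."""
    return max(lo_kbps, min(hi_kbps, bitrate_kbps))

# As in the text: 500, 800 and 1100 kbps become 600, 800 and 1000 kbps.
```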
Step 303, calculating the coding frame rate corresponding to the coding code rate according to the preset corresponding relationship between the coding code rate and the coding frame rate.
As mentioned above, the correspondence may take many forms, so there are correspondingly many ways to calculate the encoding frame rate. For example, please refer to fig. 4, which is a flowchart of a method for calculating the encoding frame rate in the second embodiment of the present application; as shown in fig. 4, calculating the coding frame rate corresponding to the coding code rate according to the preset correspondence between the coding code rate and the coding frame rate specifically includes:
step 41, obtaining a first difference value between the coding rate and the minimum value in the rate interval.
For example, assuming that the code rate interval is [600kbps, 1000kbps] and the coding code rate is 800kbps, the first difference is 200kbps.
Step 42, obtain a first ratio of the first difference to the length of the code rate interval.
In the embodiment of the present application, the first ratio is one-half.
Step 43, calculating the sine parameter according to the preset formula y = sin(πx/2), wherein x represents the first ratio and y represents the sine parameter.
When the first ratio is one-half, the sine parameter is sin(π/4) = √2/2 ≈ 0.707.
Step 44, multiplying the sine parameter by the length of the code rate interval, and adding the minimum value of the code rate interval to obtain the coding frame rate.
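The following sketch implements steps 41 to 44 literally, including the final step's reuse of the code rate interval bounds as stated in the text; a practical implementation would presumably substitute frame rate bounds there, but that substitution is not stated and is therefore left out.

```python
import math

def bitrate_to_framerate(bitrate_kbps: float,
                         lo_kbps: float = 600.0,
                         hi_kbps: float = 1000.0) -> float:
    """Map a clamped coding rate to a coding frame rate per steps 41-44."""
    span = hi_kbps - lo_kbps             # length of the code rate interval
    x = (bitrate_kbps - lo_kbps) / span  # steps 41-42: first difference, first ratio
    y = math.sin(math.pi * x / 2.0)      # step 43: sine parameter
    return y * span + lo_kbps            # step 44, exactly as stated in the text

# Example from the text: 800 kbps gives x = 1/2, y = sin(pi/4) ≈ 0.707,
# so the result is 0.707 * 400 + 600 ≈ 882.8.
```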
And step 304, calculating the time interval between two adjacent frame extractions according to the coding frame rate.
Step 305, frames are extracted from the collected frames at time intervals to complete the encoding operation.
It can be understood that there are various ways of performing frame extraction coding according to the coding frame rate, for example, a method of random frame extraction may be adopted, and specifically, if the coding frame rate is 50 frames/second, then 50 frames may be randomly extracted from the original video acquired within one second.
In the embodiment of the present application, steps 304 and 305 perform one frame extraction after each fixed time interval. As mentioned above, frames may be extracted from the original video for encoding: when the encoding frame rate is 50 frames/second, one frame is extracted every 20ms and encoded. Alternatively, frames may be extracted from the original video and discarded: assuming the encoding frame rate is 60 frames/second and the acquisition frame rate is 100 frames/second, 40 frames need to be extracted and discarded per second, i.e. one frame is extracted every 25ms.
Different frame extraction modes therefore call for different ways of calculating the time interval, for example:
calculating the time interval between two adjacent frame extractions according to the encoding frame rate may specifically include:
and acquiring a second ratio of 1000ms to the coding frame rate, and taking the second ratio as the time interval between two adjacent frame extractions.
Calculating the time interval between two adjacent frame extractions according to the encoding frame rate may further specifically include:
acquiring a second difference value between the acquisition frame rate and the coding frame rate of the image acquisition equipment;
and acquiring a third ratio of 1000ms to the second difference, and taking the third ratio as the time interval between two adjacent frame extractions.
It will be appreciated that when one or more frames are dropped from the original video, the size of the time interval is related to the acquisition frame rate.
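Both interval formulas are simple ratios; a minimal sketch, with function names of my own choosing:

```python
def encode_mode_interval_ms(encode_fps: float) -> float:
    """Second ratio: interval when each extracted frame is encoded."""
    return 1000.0 / encode_fps                  # 50 fps -> one frame every 20 ms

def drop_mode_interval_ms(capture_fps: float, encode_fps: float) -> float:
    """Third ratio: interval when extracted frames are discarded."""
    return 1000.0 / (capture_fps - encode_fps)  # 100 - 60 fps -> every 25 ms
```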
In practical applications, the acquisition frame rate fluctuates. For example, when the battery of the live broadcast terminal is very low, the acquisition frame rate may drop from the original 100 frames/second to 80 frames/second. Assume the encoding frame rate is 50 frames/second, so the frame extraction (drop) rate is 50 frames/second, i.e. 50 frames remain to be encoded each second. If the acquisition frame rate changes to 80 frames/second but the time interval is still calculated from the original 100 frames/second, the extraction rate remains 50 frames/second, so only 30 frames are actually encoded per second, 20 frames below the preset encoding frame rate, and the image quality of the live video visibly drops.
Therefore, before acquiring the second difference between the acquisition frame rate and the encoding frame rate of the image acquisition device, the method may further include:
and acquiring the acquisition frame rate of the image acquisition equipment in real time.
Therefore, by acquiring the acquisition frame rate in real time and adjusting the time interval accordingly, the encoding frame rate stays unchanged, or changes very little, even when the acquisition frame rate of the live broadcast terminal fluctuates, which safeguards the image quality of the live video.
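For instance (continuing the sketch above, with assumed names), the interval can simply be recomputed whenever a new acquisition frame rate is observed:

```python
def on_capture_fps_update(capture_fps: float, encode_fps: float) -> float:
    """Recompute the drop-mode interval from the latest acquisition frame
    rate, so the encoded frame count stays near the preset encoding rate."""
    return drop_mode_interval_ms(capture_fps, encode_fps)

# 100 -> 80 fps capture at 50 fps encoding: the interval goes from 20 ms
# (drop 50/s, encode 50/s) to ~33.3 ms (drop 30/s, encode 50/s).
```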
Further, please refer to fig. 5, which is a flowchart of a method for extracting frames at time intervals in the embodiment of the present application; as shown in fig. 5, performing frame extraction according to the time interval to complete the encoding operation may specifically include:
Step 501, acquiring the system time of the current frame.
Step 502, determine whether there is a previous frame extraction.
It may be appreciated that a previous frame extraction may or may not have occurred before the current frame is acquired.
Step 503, if the last frame extraction exists, acquiring the theoretical system time of the last frame extraction, and taking the difference value between the acquired system time of the current frame and the theoretical system time of the last frame extraction as a third difference value.
It will be appreciated that the acquisition time of each frame may also be advanced or delayed by fluctuations in the acquisition frame rate.
For example, assuming the acquisition frame rate is 100 frames/second, the system times of the acquired frames are theoretically 10ms, 20ms, 30ms, …… in sequence; if the encoding frame rate is 50 frames/second, the theoretical system time of each frame extraction is 20ms, 40ms, 60ms, ……
Step 504, if there is no last frame extraction, acquiring the system time of the first frame, and taking the difference between the system time of the current frame and the system time of the first frame as a third difference.
It will be appreciated that when no previous frame decimation has occurred, a third difference is calculated with respect to the system time at which the first frame was acquired.
Step 505, comparing the third difference value with the time interval, and performing frame extraction coding or frame extraction discarding according to the comparison result to complete the coding operation.
It should be noted that, in the embodiment of the present application, the third difference is calculated based on the theoretical system time of the previous frame extraction instead of the actual system time of the previous frame extraction, and is compared with the time interval to perform frame extraction coding, so that not only the accuracy of frame extraction but also the stability of the number of frame extractions can be ensured.
For example, still assume the acquisition frame rate is 100 frames/second, so the system times of the acquired frames are theoretically 10ms, 20ms, 30ms, ……, 1000ms in sequence; assume the frame extraction mode is to extract frames from the original video and discard them, and the encoding frame rate is 50 frames/second, so the frame extraction time interval is 20ms and the theoretical system time of each frame extraction is 20ms, 40ms, 60ms, ……, 1000ms. Now suppose fluctuation in the acquisition frame rate shifts the acquisition system time of the second frame, so the frames are actually acquired at 10ms, 25ms, 30ms, ……, 1000ms.
If the third difference is calculated with the actual system time of the previous frame extraction as the reference, a frame can only be extracted once it has been acquired, so the actual extraction times are 25ms, 50ms, 70ms, ……, 990ms: every extraction deviates from its theoretical system time, only 49 frames are extracted per second, and 51 frames are encoded per second.
If the third difference is calculated based on the theoretical system time of the previous frame extraction, the actual extraction times are 25ms, 40ms, 60ms, ……, 1000ms: only the first extraction deviates from its theoretical system time, the number of extracted frames remains 50, and the number of frames encoded per second likewise stays at 50.
The comparison therefore shows that taking the theoretical system time of the previous frame extraction as the reference for calculating the third difference ensures both the accuracy of the extraction times and the stability of the number of extracted frames, preventing the packet loss that would follow from an increased coding rate when too many frames are encoded, and preventing the drop in live video quality when too few frames are encoded.
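Assembling steps 501 to 505 for the drop mode gives roughly the following sketch. The exact comparison rule (extract and discard once the third difference reaches the interval) is an assumption; the text only states that the third difference is compared with the time interval.

```python
class FrameExtractor:
    """Drop-mode frame extraction anchored on theoretical extraction times
    (steps 501-505); the >= comparison rule is an assumed concretization."""

    def __init__(self, interval_ms: float):
        self.interval_ms = interval_ms
        self.first_frame_ms = None    # system time of the first acquired frame
        self.last_theoretical = None  # theoretical time of the last extraction

    def on_frame(self, now_ms: float) -> str:
        if self.first_frame_ms is None:
            self.first_frame_ms = now_ms                      # step 501
        ref = (self.last_theoretical                          # step 503
               if self.last_theoretical is not None
               else self.first_frame_ms)                      # step 504
        third_diff = now_ms - ref
        if third_diff >= self.interval_ms:                    # step 505
            # Advance by the interval rather than to now_ms, so that
            # acquisition jitter does not change the extraction count.
            self.last_theoretical = ref + self.interval_ms
            return "drop"
        return "encode"
```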
Further, if there is a previous frame extraction, after acquiring the theoretical system time of the previous frame extraction and taking the difference between the acquired system time of the current frame and the theoretical system time of the previous frame extraction as a third difference, the method may further include:
Step 506, acquiring the actual system time of the last frame extraction, taking the difference between the actual system time and the theoretical system time of the last frame extraction as the time residual, and comparing the time residual with the acquisition time interval, wherein the acquisition time interval is a fourth ratio of 1000ms to the acquisition frame rate.
Step 507, if the time residual is larger than the acquisition time interval, adding the time residual to the value of the third difference; and if the time residual is smaller than the acquisition time interval, leaving the value of the third difference unchanged.
It can be understood that the third difference is originally calculated relative to the theoretical system time of the last frame extraction; after the time residual is added, it is effectively calculated relative to the actual system time of the last frame extraction.
It should be noted that, in actual live broadcasting, the acquisition time interval may fluctuate slightly or heavily, and the time residual is mainly caused by such fluctuation. In the embodiment of the present application, when the time residual is greater than the theoretical acquisition time interval, the fluctuation is not regarded as normal but attributed to other business-logic factors: for example, when the anchor enables beauty or face-slimming functions, the time residual may exceed the theoretical acquisition time interval. A large time residual may also arise because the battery of the live broadcast terminal is very low, or because the terminal overheats after working too long.
Specifically, assume the acquisition frame rate is 100 frames/second, the frame extraction mode is to extract frames from the original video and discard them, and the encoding frame rate is 50 frames/second; theoretically the frame extraction rate is then also 50 frames/second, i.e. 50 frames are extracted and discarded each second. When the time residual is too large, the number of frames actually acquired per second may fall to 90; if the third difference were still calculated purely from the theoretical system time of the last frame extraction, 50 frames would still be accurately extracted and discarded, leaving only 40 frames to be encoded, fewer than the preset 50, so the image quality of the live video would drop.
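The residual correction of steps 506 and 507 can be sketched as a small helper applied to the third difference; its placement (immediately after the third difference is computed) is an assumption:

```python
def apply_time_residual(third_diff_ms: float,
                        actual_last_ms: float,
                        theoretical_last_ms: float,
                        capture_interval_ms: float) -> float:
    """Steps 506-507: fold abnormal acquisition drift into the third
    difference; capture_interval_ms is the fourth ratio, 1000 ms divided
    by the acquisition frame rate."""
    residual = actual_last_ms - theoretical_last_ms
    if residual > capture_interval_ms:   # abnormal fluctuation: re-anchor
        return third_diff_ms + residual
    return third_diff_ms                 # normal jitter: leave unchanged
```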
Further, there are various ways to obtain the theoretical system time, for example, obtaining the theoretical system time of the last frame extraction may specifically include:
judging whether another frame extraction exists before the last frame extraction; if so, acquiring the theoretical system time of that earlier frame extraction as the reference time, and if not, acquiring the system time of the first acquired frame as the reference time;
and taking the sum of the reference time and the time interval calculated before the last frame extraction as the theoretical system time of the last frame extraction.
It can be understood that, in the embodiment of the present application, the theoretical system time of the last frame extraction is calculated from the theoretical system time of the penultimate frame extraction plus the time interval.
Referring to fig. 6, a schematic structural diagram of a frame extracting apparatus in an embodiment of the present application; as shown in fig. 6, an embodiment of the present application provides an apparatus for frame extraction, including:
an encoding rate obtaining unit 601, configured to obtain, in real time, an encoding rate suitable for a current network environment and received by a video encoder;
a coding frame rate calculating unit 602, configured to calculate a coding frame rate corresponding to a coding code rate according to a preset correspondence between the coding code rate and a coding frame rate;
a frame extracting unit 603, configured to perform frame extraction according to the encoding frame rate to complete the encoding operation.
Further, the apparatus for extracting frames may further include:
the coding rate control unit is used for judging whether the coding rate exceeds a preset rate interval;
if the coding code rate is greater than the maximum value in the code rate interval, assigning the maximum value to the coding code rate;
and if the coding code rate is smaller than the minimum value in the code rate interval, assigning the minimum value to the coding code rate.
Further, the encoding frame rate calculating unit 602 may specifically be configured to:
acquiring a first difference value between the coding code rate and a minimum value in a code rate interval;
acquiring a first ratio of the first difference value to the length of the code rate interval;
the sine parameter is calculated according to the preset formula y = sin(πx/2), where x represents the first ratio and y represents the sine parameter.
And multiplying the sine parameter by the length of the code rate interval, and adding the minimum value of the code rate interval to obtain the coding frame rate.
Further, the frame extracting unit 603 may specifically include:
the time interval determining subunit is used for calculating the time interval between two adjacent frame extractions according to the coding frame rate;
and the frame extraction execution subunit is used for extracting frames from the acquired frames according to the time interval so as to complete the coding operation.
Further, the time interval determining subunit may be specifically configured to:
and acquiring a second ratio of 1000ms to the coding frame rate, and taking the second ratio as the time interval between two adjacent frame extractions.
Further, the time interval determining subunit may be specifically configured to:
acquiring a second difference value between the acquisition frame rate and the coding frame rate of the image acquisition equipment;
and acquiring a third ratio of 1000ms to the second difference, and taking the third ratio as the time interval between two adjacent frame extractions.
Further, the apparatus for extracting frames further comprises:
and the acquisition frame rate acquisition unit is used for acquiring the acquisition frame rate of the image acquisition equipment in real time.
Further, the frame extraction execution subunit may be specifically configured to:
acquiring system time of a current frame;
judging whether the last frame extraction exists;
if the last frame extraction exists, acquiring the theoretical system time of the last frame extraction, and taking the difference value between the acquired system time of the current frame and the theoretical system time of the last frame extraction as a third difference value;
if the last frame extraction does not exist, acquiring the system time of the first frame, and taking the difference value between the system time of the current frame and the system time of the first frame as a third difference value;
and comparing the third difference value with the time interval, and performing frame extraction coding or frame extraction discarding according to the comparison result to finish the coding operation.
Further, if there is a last frame extraction, the frame extraction execution subunit is further configured to:
acquiring the actual system time of the last frame extraction, taking the difference value between the actual system time of the last frame extraction and the theoretical system time of the last frame extraction as a time residual, and comparing the time residual with an acquisition time interval, wherein the acquisition time interval is a fourth ratio of 1000ms to the acquisition frame rate;
if the time residual is larger than the acquisition time interval, adding the time residual to the value of the third difference; and if the time residual is smaller than the acquisition time interval, leaving the value of the third difference unchanged.
Another frame extraction device is provided in the embodiment of the present application. As shown in fig. 7, for convenience of description, only the portions related to the embodiment of the present application are shown; for undisclosed technical details, please refer to the method portion of the embodiment. The frame extraction device may be a terminal, and the terminal may be any terminal device including a mobile phone, a tablet computer, a personal digital assistant (PDA), a point-of-sale (POS) terminal, a vehicle-mounted computer, and the like. The following takes the terminal being a mobile phone as an example:
fig. 7 is a block diagram illustrating a partial structure of a mobile phone related to a terminal provided in an embodiment of the present application. Referring to fig. 7, the handset includes: radio Frequency (RF) circuit 1010, memory 1020, input unit 1030, display unit 1040, sensor 1050, audio circuit 1060, wireless fidelity (WiFi) module 1070, processor 1080, and power source 1090. Those skilled in the art will appreciate that the handset configuration shown in fig. 7 is not intended to be limiting and may include more or fewer components than those shown, or some components may be combined, or a different arrangement of components.
The following describes each component of the mobile phone in detail with reference to fig. 7:
RF circuit 1010 may be used for receiving and transmitting signals during information transmission and reception or during a call; in particular, downlink information from a base station is received and passed to processor 1080 for processing, and uplink data is transmitted to the base station. In general, RF circuit 1010 includes, but is not limited to, an antenna, at least one amplifier, a transceiver, a coupler, a low noise amplifier (LNA), a duplexer, and the like. In addition, the RF circuitry 1010 may also communicate with networks and other devices via wireless communications. The wireless communication may use any communication standard or protocol, including but not limited to Global System for Mobile communications (GSM), General Packet Radio Service (GPRS), Code Division Multiple Access (CDMA), Wideband Code Division Multiple Access (WCDMA), Long Term Evolution (LTE), e-mail, Short Message Service (SMS), etc.
The memory 1020 can be used for storing software programs and modules, and the processor 1080 executes various functional applications and data processing of the mobile phone by operating the software programs and modules stored in the memory 1020. The memory 1020 may mainly include a storage program area and a storage data area, wherein the storage program area may store an operating system, an application program required by at least one function (such as a sound playing function, an image playing function, etc.), and the like; the storage data area may store data (such as audio data, a phonebook, etc.) created according to the use of the cellular phone, and the like. Further, the memory 1020 may include high speed random access memory, and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other volatile solid state storage device.
The input unit 1030 may be used to receive input numeric or character information and generate key signal inputs related to user settings and function control of the cellular phone. Specifically, the input unit 1030 may include a touch panel 1031 and other input devices 1032. The touch panel 1031, also referred to as a touch screen, may collect touch operations by a user (e.g., operations by a user on or near the touch panel 1031 using any suitable object or accessory such as a finger, a stylus, etc.) and drive corresponding connection devices according to a preset program. Alternatively, the touch panel 1031 may include two parts, a touch detection device and a touch controller. The touch detection device detects the touch direction of a user, detects a signal brought by touch operation and transmits the signal to the touch controller; the touch controller receives touch information from the touch sensing device, converts the touch information into touch point coordinates, sends the touch point coordinates to the processor 1080, and can receive and execute commands sent by the processor 1080. In addition, the touch panel 1031 may be implemented by various types such as a resistive type, a capacitive type, an infrared ray, and a surface acoustic wave. The input unit 1030 may include other input devices 1032 in addition to the touch panel 1031. In particular, other input devices 1032 may include, but are not limited to, one or more of a physical keyboard, function keys (such as volume control keys, switch keys, etc.), a track ball, a mouse, a joystick, or the like.
The display unit 1040 may be used to display information input by a user or information provided to the user and various menus of the cellular phone. The Display unit 1040 may include a Display panel 1041, and optionally, the Display panel 1041 may be configured by using a Liquid Crystal Display (LCD), an Organic Light-Emitting Diode (OLED), or the like. Further, the touch panel 1031 can cover the display panel 1041, and when the touch panel 1031 detects a touch operation on or near the touch panel 1031, the touch operation is transmitted to the processor 1080 to determine the type of the touch event, and then the processor 1080 provides a corresponding visual output on the display panel 1041 according to the type of the touch event. Although in fig. 7, the touch panel 1031 and the display panel 1041 are two independent components to implement the input and output functions of the mobile phone, in some embodiments, the touch panel 1031 and the display panel 1041 may be integrated to implement the input and output functions of the mobile phone.
The handset may also include at least one sensor 1050, such as a light sensor, motion sensor, and other sensors. Specifically, the light sensor may include an ambient light sensor and a proximity sensor, wherein the ambient light sensor may adjust the brightness of the display panel 1041 according to the brightness of ambient light, and the proximity sensor may turn off the display panel 1041 and/or the backlight when the mobile phone moves to the ear. As one of the motion sensors, the accelerometer sensor can detect the magnitude of acceleration in each direction (generally, three axes), can detect the magnitude and direction of gravity when stationary, and can be used for applications of recognizing the posture of a mobile phone (such as horizontal and vertical screen switching, related games, magnetometer posture calibration), vibration recognition related functions (such as pedometer and tapping), and the like; as for other sensors such as a gyroscope, a barometer, a hygrometer, a thermometer, and an infrared sensor, which can be configured on the mobile phone, further description is omitted here.
Audio circuit 1060, speaker 1061 and microphone 1062 may provide an audio interface between the user and the mobile phone. The audio circuit 1060 can transmit the electrical signal converted from received audio data to the speaker 1061, where it is converted into a sound signal and output; conversely, the microphone 1062 converts collected sound signals into electrical signals, which are received by the audio circuit 1060 and converted into audio data; the audio data is then processed by the processor 1080 and sent, for example, to another mobile phone via the RF circuit 1010, or output to the memory 1020 for further processing.
WiFi is a short-range wireless transmission technology. Through the WiFi module 1070, the mobile phone can help the user send and receive e-mail, browse web pages, access streaming media and so on, providing wireless broadband Internet access. Although fig. 7 shows the WiFi module 1070, it is understood that it is not an essential part of the handset and may be omitted as needed without changing the essence of the invention.
The processor 1080 is a control center of the mobile phone, connects various parts of the whole mobile phone by using various interfaces and lines, and executes various functions of the mobile phone and processes data by operating or executing software programs and/or modules stored in the memory 1020 and calling data stored in the memory 1020, thereby integrally monitoring the mobile phone. Optionally, processor 1080 may include one or more processing units; preferably, the processor 1080 may integrate an application processor, which handles primarily the operating system, user interfaces, applications, etc., and a modem processor, which handles primarily the wireless communications. It is to be appreciated that the modem processor described above may not be integrated into processor 1080.
The handset also includes a power source 1090 (e.g., a battery) for powering the various components. Preferably, the power source is logically coupled to the processor 1080 via a power management system, which manages charging, discharging, and power consumption.
Although not shown, the mobile phone may further include a camera, a bluetooth module, etc., which are not described herein.
In the embodiment of the present application, the processor 1080 included in the terminal further has the following functions:
acquiring, in real time, a coding rate that is received by a video encoder and is suited to the current network environment;
calculating the coding frame rate corresponding to the coding code rate according to the preset corresponding relation between the coding code rate and the coding frame rate;
and performing frame extraction according to the coding frame rate to finish the coding operation.
Further, after acquiring, in real time, the coding rate received by the video encoder that is suited to the current network environment, and before calculating the coding frame rate corresponding to the coding rate according to the preset corresponding relationship between the coding rate and the coding frame rate, the method may further include:
judging whether the coding code rate falls outside a preset code rate interval;
if the coding code rate is greater than the maximum value in the code rate interval, assigning the maximum value to the coding code rate;
and if the coding code rate is smaller than the minimum value in the code rate interval, assigning the minimum value to the coding code rate.
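As a non-limiting illustration, the clamping step may be sketched in Python as follows; the numeric interval bounds are hypothetical example values, since the embodiment only requires that some preset code rate interval exist.

    def clamp_bitrate(bitrate_kbps, rate_min=200.0, rate_max=800.0):
        """Clamp the received coding rate into the preset code rate interval.

        rate_min and rate_max are hypothetical example bounds; the method
        only assumes that some preset interval [rate_min, rate_max] exists.
        """
        if bitrate_kbps > rate_max:
            return rate_max
        if bitrate_kbps < rate_min:
            return rate_min
        return bitrate_kbps

Clamping first guarantees that the ratio formed in the mapping step described next always lies in [0, 1].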
Further, calculating the coding frame rate corresponding to the coding code rate according to the preset corresponding relationship between the coding code rate and the coding frame rate may specifically include:
acquiring a first difference value between the coding code rate and a minimum value in a code rate interval;
acquiring a first ratio of the first difference value to the length of the code rate interval;
calculating the sine parameter according to a preset formula y = sin(πx/2), where x represents the first ratio and y represents the sine parameter;
and multiplying the sine parameter by the length of the code rate interval, and adding the minimum value of the code rate interval, to obtain the coding frame rate.
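A minimal Python sketch of this mapping follows. One hedge: the text above reuses the code rate interval in the final multiply-and-add, but since the result is a coding frame rate, the sketch assumes a preset frame rate interval [fps_min, fps_max]; this reading, and all numeric bounds, are assumptions of the example.

    import math

    def bitrate_to_fps(bitrate_kbps, rate_min=200.0, rate_max=800.0,
                       fps_min=10.0, fps_max=30.0):
        """Map a clamped coding rate to a coding frame rate via y = sin(pi*x/2).

        Mapping onto a frame rate interval [fps_min, fps_max] is an assumption
        of this sketch; all numeric bounds are hypothetical examples.
        """
        x = (bitrate_kbps - rate_min) / (rate_max - rate_min)  # first ratio
        y = math.sin(math.pi * x / 2.0)                        # sine parameter
        return fps_min + y * (fps_max - fps_min)

    # Example: 500 kbps gives x = 0.5 and y = sin(pi/4), about 0.707,
    # i.e. roughly 10 + 0.707 * 20, about 24.1 fps.

Since sin(πx/2) ≥ x on [0, 1], this mapping always yields a frame rate at least as high as a linear mapping would: the frame rate degrades gently near the top of the rate interval and drops steeply only as the code rate approaches its minimum.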
Further, performing frame extraction according to the encoding frame rate to complete the encoding operation may specifically include:
calculating the time interval between two adjacent frame extractions according to the coding frame rate;
and extracting frames from the collected frames according to the time interval to complete the coding operation.
Further, calculating the time interval between two adjacent frame extractions according to the coding frame rate may specifically include:
acquiring a second ratio of 1000 ms to the coding frame rate, and taking the second ratio as the time interval between two adjacent frame extractions.
Further, calculating the time interval between two adjacent frame extractions according to the coding frame rate may specifically include:
acquiring a second difference value between the acquisition frame rate of the image acquisition device and the coding frame rate;
and acquiring a third ratio of 1000 ms to the second difference value, and taking the third ratio as the time interval between two adjacent frame extractions.
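The two interval definitions can be written side by side in Python; which one applies depends on whether a "frame extraction" denotes a frame kept for encoding (first variant) or a surplus frame removed from a faster capture stream (second variant), which is one reading of the text rather than an explicit statement.

    def keep_interval_ms(coding_fps):
        """First variant: time between two kept (encoded) frames."""
        return 1000.0 / coding_fps

    def drop_interval_ms(capture_fps, coding_fps):
        """Second variant: time between two dropped frames when thinning a
        capture stream down to the coding frame rate; assumes that
        capture_fps is greater than coding_fps."""
        return 1000.0 / (capture_fps - coding_fps)

    # Example: capturing at 30 fps while coding at 24 fps keeps a frame about
    # every 41.7 ms, or equivalently drops one frame about every 166.7 ms.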
Further, before acquiring the second difference value between the acquisition frame rate of the image acquisition device and the coding frame rate, the method may further include:
acquiring the acquisition frame rate of the image acquisition device in real time.
Further, performing frame extraction according to the time interval to complete the coding operation specifically may include:
acquiring system time of a current frame;
judging whether the last frame extraction exists;
if the last frame extraction exists, acquiring the theoretical system time of the last frame extraction, and taking the difference value between the acquired system time of the current frame and the theoretical system time of the last frame extraction as a third difference value;
if the last frame extraction does not exist, acquiring the system time of the first frame, and taking the difference value between the system time of the current frame and the system time of the first frame as a third difference value;
and comparing the third difference value with the time interval, and performing frame extraction coding or frame extraction discarding according to the comparison result to finish the coding operation.
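Putting this timing test into code, a minimal sketch can keep two pieces of state: the system time of the first frame and the theoretical system time of the last frame extraction. The class and method names are hypothetical, the sketch adopts the reading that a frame is extracted for coding once the elapsed time reaches the time interval and is discarded otherwise, and the theoretical-time update anticipates the rule given below (reference time plus time interval).

    class FrameGate:
        """Per-frame encode-or-discard decision based on the timing test above.

        A minimal sketch under the stated assumptions; all names are
        hypothetical, and the time interval is treated as fixed here although
        it may be recomputed as the coding rate changes.
        """

        def __init__(self, interval_ms):
            self.interval_ms = interval_ms
            self.first_frame_ms = None           # system time of the first frame
            self.last_theoretical_ms = None      # theoretical time of the last extraction

        def should_encode(self, now_ms):
            if self.first_frame_ms is None:
                self.first_frame_ms = now_ms     # remember the first frame's system time
            if self.last_theoretical_ms is not None:
                third_diff = now_ms - self.last_theoretical_ms   # a last extraction exists
            else:
                third_diff = now_ms - self.first_frame_ms        # no extraction yet
            if third_diff >= self.interval_ms:
                # Advance the theoretical clock by one interval from the
                # reference time instead of snapping to now_ms, so that
                # timing jitter does not accumulate across frames.
                reference = (self.last_theoretical_ms
                             if self.last_theoretical_ms is not None
                             else self.first_frame_ms)
                self.last_theoretical_ms = reference + self.interval_ms
                return True
            return False

With a 41.7 ms interval and a 30 fps capture stream (one frame roughly every 33.3 ms), this gate passes about four of every five frames, matching the expected 24/30 ratio.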
Further, if there is a previous frame extraction, after acquiring the theoretical system time of the previous frame extraction and taking the difference between the acquired system time of the current frame and the theoretical system time of the previous frame extraction as a third difference, the method may further include:
acquiring the actual system time of the last frame extraction, taking the difference between the actual system time of the last frame extraction and the theoretical system time of the last frame extraction as a time residual, and comparing the time residual with an acquisition time interval, where the acquisition time interval is a fourth ratio of 1000 ms to the acquisition frame rate;
if the time residual is larger than the acquisition time interval, adding the time residual to the third difference value; and if the time residual is smaller than the acquisition time interval, the value of the third difference is not changed.
Further, the obtaining of the theoretical system time of the last frame extraction may specifically include:
judging whether another frame extraction exists before the last frame extraction; if so, acquiring the theoretical system time of that earlier frame extraction as the reference time, and if not, acquiring the system time of the first frame as the reference time;
and taking the sum of the reference time and the time interval calculated before the last frame extraction as the theoretical system time of the last frame extraction.
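The time-residual correction and the theoretical-time rule can be sketched together in Python. The direction of the addition, folding the residual into the third difference, follows from the contrast with the "not changed" branch above; the helper names are hypothetical.

    def corrected_third_diff(third_diff_ms, actual_last_ms,
                             theoretical_last_ms, capture_fps):
        """Fold the time residual into the third difference when the last
        extraction actually ran later than its theoretical time by more
        than one acquisition interval.

        A sketch of one reading of the text; names are hypothetical.
        """
        residual = actual_last_ms - theoretical_last_ms   # actual minus theoretical
        acquisition_interval_ms = 1000.0 / capture_fps    # the fourth ratio
        if residual > acquisition_interval_ms:
            return third_diff_ms + residual               # ran late: enlarge the difference
        return third_diff_ms                              # on time: leave it unchanged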
The embodiment of the present application further provides a computer-readable storage medium for storing a program code, where the program code is configured to execute any one implementation of the frame extraction method described in the foregoing embodiments.
It is clear to those skilled in the art that, for convenience and brevity of description, the specific working processes of the above-described systems, apparatuses and units may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again.
The terms "first," "second," "third," "fourth," and the like in the description of the application and the above-described figures, if any, are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used is interchangeable under appropriate circumstances such that the embodiments of the application described herein are, for example, capable of operation in sequences other than those illustrated or otherwise described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
It should be understood that in the present application, "at least one" means one or more, and "a plurality" means two or more. "And/or" describes an association between associated objects and indicates that three relationships may exist; for example, "A and/or B" may indicate: only A, only B, or both A and B, where A and B may be singular or plural. The character "/" generally indicates that the former and latter associated objects are in an "or" relationship. "At least one of the following" or similar expressions refer to any combination of the items, including any combination of a single item or plural items. For example, at least one of a, b, or c may represent: a, b, c, "a and b", "a and c", "b and c", or "a and b and c", where a, b, and c may be single or plural.
In the several embodiments provided in the present application, it should be understood that the disclosed system, apparatus and method may be implemented in other manners. For example, the above-described apparatus embodiments are merely illustrative, and for example, the division of the units is only one logical division, and other divisions may be realized in practice, for example, a plurality of units or components may be combined or integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, devices or units, and may be in an electrical, mechanical or other form.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, and can also be realized in a form of a software functional unit.
The integrated unit, if implemented in the form of a software functional unit and sold or used as a stand-alone product, may be stored in a computer-readable storage medium. Based on such understanding, the technical solution of the present application, in essence or in the part contributing to the prior art, or in whole or in part, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the methods described in the embodiments of the present application. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disk.
The above embodiments are only used for illustrating the technical solutions of the present application, and not for limiting the same; although the present application has been described in detail with reference to the foregoing embodiments, it should be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions in the embodiments of the present application.

Claims (11)

1. A frame extraction method, comprising:
acquiring, in real time, a coding rate that is received by a video encoder and is suited to the current network environment;
calculating the coding frame rate corresponding to the coding code rate according to the preset corresponding relation between the coding code rate and the coding frame rate;
calculating the time interval between two adjacent frame extractions according to the coding frame rate;
acquiring system time of a current frame;
judging whether the last frame extraction exists;
if the last frame extraction exists, acquiring the theoretical system time of the last frame extraction, and taking the difference value between the system time of the current frame and the theoretical system time of the last frame extraction as a third difference value;
if the last frame extraction does not exist, acquiring the system time of the first frame, and taking the difference value between the system time of the current frame and the system time of the first frame as a third difference value;
and comparing the third difference value with the time interval, and performing frame extraction coding or frame extraction discarding according to a comparison result to finish coding operation.
2. The method of claim 1, wherein after acquiring, in real time, the coding rate received by the video encoder that is suited to the current network environment, and before calculating the coding frame rate corresponding to the coding rate according to the preset correspondence between the coding rate and the coding frame rate, the method further comprises:
judging whether the coding code rate falls outside a preset code rate interval;
if the coding code rate is larger than the maximum value in the code rate interval, assigning the maximum value to the coding code rate;
and if the coding code rate is smaller than the minimum value in the code rate interval, assigning the minimum value to the coding code rate.
3. The method according to claim 2, wherein calculating the coding frame rate corresponding to the coding rate according to the preset correspondence between the coding rate and the coding frame rate specifically comprises:
acquiring a first difference value between the coding rate and a minimum value in the rate interval;
acquiring a first ratio of the first difference value to the length of the code rate interval;
calculating a sine parameter according to a preset formula y = sin(πx/2), wherein x represents the first ratio and y represents the sine parameter;
and multiplying the sine parameter by the length of the code rate interval, and adding the minimum value of the code rate interval, to obtain the coding frame rate.
4. The method according to claim 1, wherein calculating the time interval between two adjacent frame extractions according to the coding frame rate specifically comprises:
acquiring a second ratio of 1000 ms to the coding frame rate, and taking the second ratio as the time interval between two adjacent frame extractions.
5. The method according to claim 1, wherein calculating the time interval between two adjacent frame extractions according to the coding frame rate specifically comprises:
acquiring a second difference value between the acquisition frame rate of the image acquisition equipment and the coding frame rate;
and acquiring a third ratio of 1000 ms to the second difference value, and taking the third ratio as the time interval between two adjacent frame extractions.
6. The method of claim 5, further comprising, before obtaining a second difference between an acquisition frame rate of an image acquisition device and the encoding frame rate:
and acquiring the acquisition frame rate of the image acquisition equipment in real time.
7. The method according to claim 1, wherein if there is a last frame extraction, after obtaining a theoretical system time of the last frame extraction and taking a difference between the system time of the current frame and the theoretical system time of the last frame extraction as a third difference, the method further comprises:
acquiring the actual system time of the last frame extraction, taking the difference between the actual system time of the last frame extraction and the theoretical system time of the last frame extraction as a time residual, and comparing the time residual with an acquisition time interval, wherein the acquisition time interval is a fourth ratio of 1000 ms to the acquisition frame rate;
if the time residual is larger than the acquisition time interval, adding the time residual to the third difference value; and if the time residual is smaller than the acquisition time interval, not changing the value of the third difference.
8. The method of claim 1, wherein obtaining the theoretical system time of the last frame decimation specifically comprises:
judging whether another frame extraction exists before the last frame extraction; if so, acquiring the theoretical system time of that earlier frame extraction as the reference time, and if not, acquiring the system time of the first collected frame as the reference time;
and taking the sum of the reference time and the time interval calculated before the last frame extraction as the theoretical system time of the last frame extraction.
9. A frame extraction apparatus, comprising:
the encoding code rate obtaining unit is used for obtaining the encoding code rate which is received by the video encoder and is suitable for the current network environment in real time;
the coding frame rate calculation unit is used for calculating the coding frame rate corresponding to the coding code rate according to the preset corresponding relation between the coding code rate and the coding frame rate;
the frame extracting unit is used for extracting frames according to the coding frame rate so as to finish the coding operation;
the frame extracting unit comprises:
the time interval determining subunit is used for calculating the time interval between two adjacent frame extractions according to the coding frame rate;
the frame extraction execution subunit is used for acquiring the system time of the current frame; judging whether the last frame extraction exists; if the last frame extraction exists, acquiring the theoretical system time of the last frame extraction, and taking the difference value between the acquired system time of the current frame and the theoretical system time of the last frame extraction as a third difference value; if the last frame extraction does not exist, acquiring the system time of the first frame, and taking the difference value between the system time of the current frame and the system time of the first frame as a third difference value; and comparing the third difference value with the time interval, and performing frame extraction coding or frame extraction discarding according to the comparison result to finish the coding operation.
10. A frame extraction device, the device comprising a processor and a memory:
the memory is used for storing program codes and transmitting the program codes to the processor;
the processor is configured to execute the frame extraction method according to any one of claims 1 to 8 according to instructions in the program code.
11. A computer-readable storage medium for storing program code for performing the frame extraction method of any one of claims 1-8.
CN201811094881.2A 2018-09-19 2018-09-19 Method, device and equipment for extracting frame and computer readable storage medium Active CN109168013B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811094881.2A CN109168013B (en) 2018-09-19 2018-09-19 Method, device and equipment for extracting frame and computer readable storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201811094881.2A CN109168013B (en) 2018-09-19 2018-09-19 Method, device and equipment for extracting frame and computer readable storage medium

Publications (2)

Publication Number Publication Date
CN109168013A CN109168013A (en) 2019-01-08
CN109168013B true CN109168013B (en) 2020-09-25

Family

ID=64879616

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811094881.2A Active CN109168013B (en) 2018-09-19 2018-09-19 Method, device and equipment for extracting frame and computer readable storage medium

Country Status (1)

Country Link
CN (1) CN109168013B (en)

Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110324654A (en) * 2019-08-02 2019-10-11 广州虎牙科技有限公司 Main broadcaster end live video frame processing method, device, equipment, system and medium
CN111682980A (en) * 2020-05-28 2020-09-18 西安万像电子科技有限公司 Image coding method and device
CN113170054A (en) * 2020-07-28 2021-07-23 深圳市大疆创新科技有限公司 Video transmission method, movable platform and computer readable storage medium
CN112911306B (en) * 2021-01-15 2023-04-07 北京奇艺世纪科技有限公司 Video processing method and device, electronic equipment and storage medium
CN113207016B (en) * 2021-03-29 2022-05-27 新华三大数据技术有限公司 Virtual machine image frame rate control method, network equipment and storage medium
CN113905200B (en) * 2021-10-08 2023-07-11 山东亚华电子股份有限公司 Video processing method and device based on statistics


Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8285218B2 (en) * 2009-08-31 2012-10-09 The Nielsen Company (Us), Llc Methods and apparatus to identify wireless carrier performance effects
CN107211078B (en) * 2015-01-23 2020-07-31 瑞典爱立信有限公司 V L C-based video frame synchronization
US11089373B2 (en) * 2016-12-29 2021-08-10 Sling Media Pvt Ltd Seek with thumbnail generation and display during placeshifting session

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104469367A (en) * 2014-12-16 2015-03-25 南京仟壹视讯信息技术有限公司 Video bitrate control method based on frame loss and quantization parameter adjustment
CN107079135A (en) * 2016-01-29 2017-08-18 深圳市大疆创新科技有限公司 Method of transmitting video data, system, equipment and filming apparatus
CN107026856A (en) * 2017-03-30 2017-08-08 上海七牛信息技术有限公司 The optimization method and optimization system of a kind of network plug-flow quality
CN107623851A (en) * 2017-09-01 2018-01-23 苏州科达科技股份有限公司 Video code flow transmission control unit and control method

Also Published As

Publication number Publication date
CN109168013A (en) 2019-01-08


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
TR01 Transfer of patent right

Effective date of registration: 20210113

Address after: 510000 3108, 79 Wanbo 2nd Road, Nancun Town, Panyu District, Guangzhou City, Guangdong Province

Patentee after: GUANGZHOU CUBESILI INFORMATION TECHNOLOGY Co.,Ltd.

Address before: 28th floor, block B1, Wanda Plaza, Nancun Town, Panyu District, Guangzhou City, Guangdong Province

Patentee before: GUANGZHOU HUADUO NETWORK TECHNOLOGY Co.,Ltd.

EE01 Entry into force of recordation of patent licensing contract

Application publication date: 20190108

Assignee: GUANGZHOU HUADUO NETWORK TECHNOLOGY Co.,Ltd.

Assignor: GUANGZHOU CUBESILI INFORMATION TECHNOLOGY Co.,Ltd.

Contract record no.: X2021440000053

Denomination of invention: Method, device and equipment for extracting frame and computer readable storage medium

Granted publication date: 20200925

License type: Common License

Record date: 20210208
