CN117376641A - Video processing method, electronic device, apparatus and storage medium - Google Patents
- Publication number
- CN117376641A (application CN202210772946.4A)
- Authority
- CN
- China
- Prior art keywords
- video frame
- video
- sequence
- frame
- frames
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Links
- 238000003672 processing method Methods 0.000 title claims abstract description 48
- 238000000034 method Methods 0.000 claims abstract description 31
- 238000012545 processing Methods 0.000 claims description 64
- 230000015654 memory Effects 0.000 claims description 25
- 230000008569 process Effects 0.000 claims description 12
- 230000004044 response Effects 0.000 claims description 11
- 238000013528 artificial neural network Methods 0.000 claims description 3
- 238000010586 diagram Methods 0.000 description 13
- 238000004590 computer program Methods 0.000 description 8
- 238000004891 communication Methods 0.000 description 4
- 230000006870 function Effects 0.000 description 4
- 230000003993 interaction Effects 0.000 description 3
- 238000012986 modification Methods 0.000 description 3
- 230000004048 modification Effects 0.000 description 3
- 238000005516 engineering process Methods 0.000 description 2
- 238000003780 insertion Methods 0.000 description 2
- 230000037431 insertion Effects 0.000 description 2
- 239000004973 liquid crystal related substance Substances 0.000 description 2
- 230000001413 cellular effect Effects 0.000 description 1
- 238000010276 construction Methods 0.000 description 1
- 238000012217 deletion Methods 0.000 description 1
- 230000037430 deletion Effects 0.000 description 1
- 238000013461 design Methods 0.000 description 1
- 238000011161 development Methods 0.000 description 1
- 230000007246 mechanism Effects 0.000 description 1
- 238000010295 mobile communication Methods 0.000 description 1
- 230000003287 optical effect Effects 0.000 description 1
- 230000001953 sensory effect Effects 0.000 description 1
- 238000006467 substitution reaction Methods 0.000 description 1
- 230000000007 visual effect Effects 0.000 description 1
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/40—Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
- H04N21/43—Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
- H04N21/44—Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs
- H04N21/4402—Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs involving reformatting operations of video signals for household redistribution, storage or real-time display
- H04N21/440281—Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs involving reformatting operations of video signals for household redistribution, storage or real-time display by altering the temporal resolution, e.g. by frame skipping
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N5/00—Details of television systems
- H04N5/222—Studio circuitry; Studio devices; Studio equipment
- H04N5/262—Studio circuits, e.g. for mixing, switching-over, change of character of image, other special effects ; Cameras specially adapted for the electronic generation of special effects
- H04N5/265—Mixing
Landscapes
- Engineering & Computer Science (AREA)
- Multimedia (AREA)
- Signal Processing (AREA)
- Compression Or Coding Systems Of Tv Signals (AREA)
Abstract
The disclosure provides a video processing method, an electronic device, an apparatus, and a storage medium. The method includes: acquiring adjacent first and second video frames in a first video frame sequence; determining the time difference between the time stamps of the adjacent first and second video frames; and, when the time difference is greater than a set threshold, inserting at least one third video frame between the first video frame and the second video frame to serve as a second video frame sequence, which is encoded to obtain a video file. Thus, when video frames between adjacent first and second video frames in the first video frame sequence have been discarded, at least one third video frame generated according to the content of the first and second video frames is inserted between them, stabilizing the frame rate of the resulting second video frame sequence, so that the video file encoded from the second video frame sequence plays back stably and smoothly and the user experience is improved.
Description
Technical Field
The disclosure relates to the technical field of video processing, and in particular to a video processing method, an electronic device, an apparatus, and a storage medium.
Background
With the development of video technology and internet technology, the video recording capability of electronic devices is becoming increasingly important. During video recording, the frame rate of video frames should be kept stable so that the recorded video plays back stably and smoothly. However, during recording the device may not obtain enough computing resources to complete video processing, so video frames are discarded; the video then cannot play back stably and smoothly, which degrades the user experience.
Disclosure of Invention
The present disclosure provides a method, an electronic device, an apparatus, and a storage medium for video processing.
According to an aspect of the present disclosure, there is provided a video processing method including: acquiring adjacent first and second video frames in a first video frame sequence; determining the time difference between the time stamps of the adjacent first and second video frames in the first video frame sequence; and, when the time difference is greater than a set threshold, inserting at least one third video frame between the first video frame and the second video frame to serve as a second video frame sequence for encoding to obtain a video file, wherein the third video frame is generated according to the content of the first video frame and the second video frame.
According to another aspect of the present disclosure, there is provided an electronic device including: an image sensor for outputting an original frame sequence; at least one processor coupled to the image sensor; and a memory communicatively coupled to the at least one processor; wherein the memory stores instructions executable by the at least one processor to cause the at least one processor to process the original sequence of frames to obtain a first sequence of video frames and to perform a video processing method according to an embodiment of the first aspect of the present disclosure.
According to another aspect of the present disclosure, there is provided a video processing apparatus including: an acquisition module for acquiring adjacent first and second video frames in a first video frame sequence; a determining module configured to determine the time difference between the time stamps of the adjacent first and second video frames in the first video frame sequence; and a processing module for inserting, when the time difference is greater than a set threshold, at least one third video frame between the first video frame and the second video frame to serve as a second video frame sequence for encoding to obtain a video file, wherein the third video frame is generated according to the content of the first video frame and the second video frame.
According to another aspect of the present disclosure, there is provided a non-transitory computer-readable storage medium storing computer instructions for causing the computer to perform the video processing method according to the embodiment of the first aspect of the present disclosure.
According to another aspect of the present disclosure, there is provided a computer program product comprising a computer program which, when executed by a processor, implements the video processing method according to the above-described first aspect embodiment of the present disclosure.
According to the technical scheme of the present disclosure, when the time difference between the time stamps of adjacent first and second video frames in the first video frame sequence is greater than a set threshold, at least one third video frame is inserted between the first and second video frames to serve as a second video frame sequence, which is encoded to obtain a video file; the third video frame is generated according to the content of the first and second video frames. Thus, when video frames between the adjacent first and second video frames have been discarded, at least one third video frame can be generated from their content and inserted between them, so that the frame rate of the resulting second video frame sequence is stable, the video file encoded from it plays back stably and smoothly, and the user experience is improved.
It should be understood that the description in this section is not intended to identify key or critical features of the embodiments of the disclosure, nor is it intended to be used to limit the scope of the disclosure. Other features of the present disclosure will become apparent from the following specification.
Drawings
The drawings are for a better understanding of the present solution and are not to be construed as limiting the present disclosure. Wherein:
FIG. 1 is a schematic diagram of a video processing flow in the related art;
FIG. 2 is a flow chart of a related art camera application requesting image data;
FIG. 3 is a schematic diagram of the processing flow of each video frame in a related art video;
FIG. 4 is a schematic diagram of processing each video frame in a related art video;
FIG. 5 is a schematic diagram of a video frame loss processing mechanism in the related art;
FIG. 6 is a flowchart of a video processing method according to an embodiment of the present disclosure;
FIG. 7 is a flowchart of a video processing method according to a second embodiment of the present disclosure;
FIG. 8 is a flowchart of a video processing method according to a third embodiment of the present disclosure;
FIG. 9 is a schematic illustration of inserting a third video frame between a first video frame and a second video frame provided by an embodiment of the present disclosure;
FIG. 10 is a flowchart of a video processing method according to a fourth embodiment of the present disclosure;
FIG. 11 is a flow chart of a camera application requesting image data in an embodiment of the present disclosure;
FIG. 12 is a flow diagram of a video processing method of an embodiment of the present disclosure;
FIG. 13 is a schematic structural diagram of an electronic device according to a fifth embodiment of the present disclosure;
FIG. 14 is a schematic structural diagram of a video processing apparatus according to a sixth embodiment of the present disclosure;
FIG. 15 is a block diagram of an electronic device used to implement a video processing method of an embodiment of the present disclosure.
Detailed Description
Exemplary embodiments of the present disclosure are described below in conjunction with the accompanying drawings, which include various details of the embodiments of the present disclosure to facilitate understanding, and should be considered as merely exemplary. Accordingly, one of ordinary skill in the art will recognize that various changes and modifications of the embodiments described herein can be made without departing from the scope and spirit of the present disclosure. Also, descriptions of well-known functions and constructions are omitted in the following description for clarity and conciseness.
Currently, the camera software architecture in electronic devices mostly uses the Camera2 application programming interface (Camera2 API). The Camera2 API defines a data structure for a request at the software layer, so that a camera application can request different types of image data for subsequent processing.
Fig. 1 is a schematic diagram of a video processing flow in the related art, in the video recording scenario of an electronic device. In fig. 1, an original frame sequence acquired by an image sensor (camera image sensor) is processed by an image signal processor (Image Signal Processing, ISP) to obtain a preview frame sequence (preview frames) and a video frame sequence (video frames); the preview frame sequence and the video frame sequence, after processing by the relevant algorithms, are encoded by the camera application (Camera APP) to obtain a video file. Under normal conditions, the camera application continuously issues requests according to the requirements of the Camera2 API to request new image data.
As shown in fig. 2, the image sensor of the camera outputs frame N and frame N+1 at fixed intervals according to a preset frame rate, corresponding to request N and request N+1 issued by the camera application, respectively.
It should be noted that, as shown in fig. 3, a schematic diagram of the processing flow of each video frame in a related art video, the image output by the image sensor is frame N; after the electronic device processes it (process frame N), result N is returned to the camera application. Each frame must be handled in order, that is, frame N+1 cannot finish processing before frame N has been processed. Because the memory and processing power of the electronic device prevent too many images from being processed in parallel at the same time, the number of images processed in parallel is limited; in this disclosure the upper limit is taken to be 6. If the electronic device processes images normally, frame N is already processed by the time frame N+6 arrives, i.e., frames N+1 through N+6, six images in total, are being processed, which does not exceed the upper limit the electronic device can handle simultaneously. In this case the electronic device can process all images normally.
However, as shown in fig. 4, suppose frame N takes too long to process and its processing is still not finished at the moment frame N+6 is ready to be input to the processor of the electronic device. The electronic device then cannot process the image of frame N+6 while still satisfying the limit of at most 6 images processed simultaneously, so frame N+6 must be discarded and cannot be processed.
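The capacity-limited behavior described above can be illustrated with a minimal sketch (illustrative only, not part of the patent; the class and names are hypothetical): a pipeline holds at most 6 in-flight frames, and a frame arriving while the pipeline is full is discarded, as happens to frame N+6 when frame N stalls.

```python
from collections import deque

MAX_IN_FLIGHT = 6  # upper limit of images processed in parallel (per the text)

class FramePipeline:
    """Hypothetical model of the device's image-processing pipeline."""
    def __init__(self, capacity=MAX_IN_FLIGHT):
        self.capacity = capacity
        self.in_flight = deque()   # frames currently being processed
        self.dropped = []          # frames discarded for lack of capacity

    def arrive(self, frame_id):
        """A new frame arrives from the image sensor."""
        if len(self.in_flight) >= self.capacity:
            self.dropped.append(frame_id)   # no room: the frame is discarded
            return False
        self.in_flight.append(frame_id)
        return True

    def finish_oldest(self):
        """The oldest in-flight frame finishes processing."""
        return self.in_flight.popleft() if self.in_flight else None

# Frame N stalls: frames N..N+5 fill the pipeline, so frame N+6 is dropped.
p = FramePipeline()
for n in range(7):   # frames N+0 .. N+6 arrive while none finish
    p.arrive(n)
```

Once the oldest frame finally finishes, capacity frees up and later frames can be accepted again, which is exactly why only the one frame is lost in the fig. 4 example.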
In addition, under conditions such as high temperature or preemption by other applications, the processing speed of the modules responsible for image processing in the electronic device, such as the memory, the central processing unit (CPU), the graphics processing unit (GPU), the image signal processor (ISP), and the embedded neural-network processing unit (NPU), is reduced. The phenomenon of discarded video frames therefore inevitably occurs, the preview or recorded picture stutters, and the user experience is affected. Further, as shown in fig. 5, if the clock rates of modules such as the memory, CPU, GPU, ISP, and NPU are raised to increase their processing capacity, power consumption increases further and the whole device heats up, so the clock rate of each module has to be limited again to reduce power consumption, and the design is trapped in a loop with no solution.
In view of the foregoing, the present disclosure proposes a video processing method, an electronic device, an apparatus, and a storage medium.
The following describes a video processing method, an electronic device, an apparatus, and a storage medium of an embodiment of the present disclosure with reference to the accompanying drawings.
Fig. 6 is a flowchart of a video processing method according to an embodiment of the disclosure.
The embodiments of the present disclosure are described taking as an example a video processing method configured in a video processing apparatus, which may be applied to any electronic device, so that the electronic device can perform video processing functions.
The electronic device may be any device with computing capability, for example a personal computer, a mobile terminal, or a server; the mobile terminal may be, for example, a vehicle-mounted device, a mobile phone, a tablet computer, a personal digital assistant, or a wearable device with an operating system, a touch screen, and/or a display screen.
As shown in fig. 6, the video processing method may include the steps of:
step 601, acquiring a first video frame and a second video frame adjacent to each other in a first video frame sequence.
In the disclosed embodiments, adjacent first and second video frames may be read from a first sequence of video frames.
The first video frame sequence may be generated from an original frame sequence, which may be an image sequence generated from an original frame captured by an image sensor in a camera application in the electronic device.
It should be noted that the first video frame may correspond to frame N+5 in fig. 4 and the second video frame to frame N+7 in fig. 4. The timestamp of the first video frame may be the time at which the first video frame, after being collected by the image sensor, is input to the image processor, and likewise for the timestamp of the second video frame. When the time difference between the timestamps of frame N+5 and frame N+7 is detected to be greater than the set threshold, a frame of image (i.e., the third video frame) is calculated from frame N+5 and frame N+7.
Step 602, determining a time difference of the time stamp for adjacent first video frames and second video frames in the first video frame sequence.
As one possible implementation of the embodiments of the present disclosure, the time difference may be determined as the difference between the time stamp of the first video frame and the time stamp of the second video frame. For example, each time stamp may be the time at which acquisition of the corresponding frame is completed, specifically the time at which the frame is transferred to the image processor; the time difference of the time stamps is then the difference between the time the first video frame is input to the image processor and the time the second video frame is input to the image processor.
Step 603, inserting at least one third video frame between the first video frame and the second video frame to encode as the second video frame sequence to obtain the video file when the time difference is greater than the set threshold.
The third video frame is generated according to the contents of the first video frame and the second video frame.
In the disclosed embodiments, the set threshold may be determined from the frame rate of the original frame sequence.
It should be noted that, when determining the set threshold, as one example, the video processing apparatus may obtain the frame rate of the original frame sequence from the camera application and calculate the set threshold from that frame rate; as another example, the video processing apparatus may obtain the set threshold directly from the camera application, the set threshold having been calculated by the camera application from the frame rate of the original frame sequence. The two examples differ in what the video processing apparatus obtains from the camera application.
For example, in response to an enabling instruction of the camera application, the frame rate of the original frame sequence is obtained from the camera application; the original frame sequence is acquired by the image sensor and is used to generate the first video frame sequence; the set threshold is then determined based on the frame rate. For instance, if the frame rate of the original frame sequence is f, the set threshold may be 1.5 × 1/f.
For another example, in response to an enabling instruction of the camera application, the set threshold is acquired from the camera application, the set threshold having been determined according to the frame rate of the original frame sequence, which is acquired by the image sensor and used to generate the first video frame sequence. That is, the camera application may calculate the set threshold in advance from the frame rate of the original frame sequence and store it, so that the video processing apparatus can directly read or acquire the set threshold from the camera application without calculating it itself, which improves the processing efficiency of the video processing apparatus.
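The threshold in the example above (1.5 × 1/f for frame rate f) and the drop check against it can be sketched as follows. This is an illustrative sketch, not the patent's implementation: the function names are hypothetical, timestamps are assumed to be in seconds, and the 1.5 factor is taken from the example.

```python
def set_threshold(frame_rate: float) -> float:
    """Set threshold derived from the original frame rate f,
    taken here as 1.5 * (1/f) per the example in the text."""
    return 1.5 * (1.0 / frame_rate)

def frame_dropped(ts_first: float, ts_second: float, frame_rate: float) -> bool:
    """True when the timestamp gap between adjacent frames exceeds the
    set threshold, i.e. at least one frame was discarded between them."""
    return (ts_second - ts_first) > set_threshold(frame_rate)

# At 30 fps frames should arrive every ~33.3 ms, so the threshold is 50 ms;
# a gap of ~66.7 ms (one missing frame) exceeds it, a normal gap does not.
threshold = set_threshold(30.0)
one_frame_missing = frame_dropped(0.0, 2 / 30.0, 30.0)
```

The 1.5 factor sits half-way between one and two frame intervals, so a single late frame is tolerated while a genuinely missing frame is detected.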
Further, when the time difference is greater than the set threshold, that is, when the adjacent first and second video frames in the first video frame sequence correspond to non-adjacent original frames in the original frame sequence, at least one third video frame has been discarded between the first and second video frames. Therefore, to stabilize the frame rate of the video frames and achieve stable, smooth playback, at least one third video frame can be generated according to the content of the first and second video frames and inserted between them; the first video frame sequence after insertion of the at least one third video frame serves as the second video frame sequence, which is encoded to obtain the video file.
For example, the first video frame is frame N+5 in fig. 4 and the second video frame is frame N+7 in fig. 4; when the time difference between the time stamps of frame N+5 and frame N+7 is greater than the set threshold, at least one third video frame (e.g., frame N+6 in fig. 4) is calculated from frame N+5 and frame N+7.
In summary, when the time difference between the time stamps of adjacent first and second video frames in the first video frame sequence is greater than the set threshold, at least one third video frame is inserted between them to serve as the second video frame sequence for encoding into a video file. Thus, when video frames have been discarded, at least one third video frame is generated according to the content of the first and second video frames and inserted between them, so that the video frame rate is stable, playback is stable and smooth, and the user experience is improved.
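The overall flow of this embodiment — compare adjacent timestamps and insert generated frames wherever the gap exceeds the threshold — can be sketched as below. All names are illustrative, frames are modeled as flat lists of pixel values, and the linear pixel blend is only a simple stand-in for the frame generation discussed later (MEMC), not the patent's method.

```python
def interpolate(first, second, k, n):
    """Stand-in third-frame generator: linear blend of pixel values,
    the k-th of n inserted frames getting weight k/(n+1)."""
    w = k / (n + 1)
    return [(1 - w) * a + w * b for a, b in zip(first, second)]

def build_second_sequence(frames, timestamps, threshold, interval):
    """frames: the first video frame sequence (lists of pixel values);
    timestamps: per-frame capture timestamps in seconds;
    interval: the nominal frame interval 1/f.
    Returns the second video frame sequence with third frames inserted."""
    out = [frames[0]]
    for i in range(1, len(frames)):
        gap = timestamps[i] - timestamps[i - 1]
        if gap > threshold:
            missing = round(gap / interval) - 1   # how many frames were dropped
            for k in range(1, missing + 1):
                out.append(interpolate(frames[i - 1], frames[i], k, missing))
        out.append(frames[i])
    return out

# Two frames 2/f apart at f = 30: exactly one frame was dropped between them.
seq = build_second_sequence(
    frames=[[0.0], [30.0]],
    timestamps=[0.0, 2 / 30.0],
    threshold=1.5 / 30.0,
    interval=1 / 30.0,
)
```

The resulting sequence would then be handed to the encoder at a steady frame rate.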
In order to clearly illustrate how the above-described embodiments acquire adjacent first and second video frames in a first video frame sequence, the present disclosure proposes another video processing method.
Fig. 7 is a flowchart of a video processing method according to a second embodiment of the disclosure.
As shown in fig. 7, the video processing method may include the steps of:
in response to the camera application outputting the second video frame of the first sequence of video frames, the second video frame is stored, step 701.
It should be appreciated that the video frames in the first video frame sequence are output by the camera application in order, frame by frame, so the first video frame is output before the second video frame; both can therefore be obtained by storing them as they are output.
In embodiments of the present disclosure, the second video frame may be obtained in response to the camera application outputting the second video frame in the first video frame sequence.
Step 702, reading a stored first video frame; the first video frame is stored when the camera application outputs the first video frame before the second video frame.
Further, the first video frame may be obtained by reading the stored first video frame, where the stored first video frame was saved when the camera application output the first video frame prior to the second video frame.
Step 703, determining a time difference of the time stamp for the adjacent first video frame and second video frame in the first video frame sequence.
Step 704, inserting at least one third video frame between the first video frame and the second video frame to encode as the second video frame sequence to obtain the video file in case that the time difference is larger than the set threshold.
The third video frame is generated according to the contents of the first video frame and the second video frame.
It should be noted that steps 703 to 704 may be implemented in any of the manners in the embodiments of the present disclosure; this embodiment does not limit them, and they are not repeated here.
In summary, the second video frame in the first video frame sequence is stored in response to the camera application outputting it, and the stored first video frame is read; the first and second video frames are thus obtained by storing the second video frame output by the camera application and reading the previously stored first video frame.
To clearly illustrate how a second sequence of video frames may be encoded to obtain a video file, another video processing method is proposed by the present disclosure.
Fig. 8 is a flowchart of a video processing method according to a third embodiment of the present disclosure.
As shown in fig. 8, the video processing method may include the steps of:
step 801, a first video frame and a second video frame adjacent to each other in a first video frame sequence are acquired.
Step 802, determining a time difference of the time stamp for adjacent first video frames and second video frames in the first video frame sequence.
Step 803, inserting at least one third video frame between the first video frame and the second video frame to encode as the second video frame sequence to obtain the video file when the time difference is greater than the set threshold.
The third video frame is generated according to the contents of the first video frame and the second video frame.
As an example, when the time difference is greater than the set threshold, at least one third video frame may be generated according to the content of the first and second video frames and inserted between the first and second video frames in the first video frame sequence; the first video frame sequence after insertion of the at least one third video frame may then be encoded as the second video frame sequence to obtain the video file.
Step 804 sequentially outputs the stored first video frame and the at least one third video frame to the encoder to cause the encoder to encode based on the stored first video frame and the at least one third video frame.
For example, as shown in fig. 9, taking frame N+5 as the first video frame, frame N+7 as the second video frame, and the insertion of one third video frame (the interpolated frame) as an example, frame N+5 (the first video frame) and the interpolated frame (the third video frame) may be output to the encoder in sequence, and the encoder stores and encodes frame N+5 and the interpolated frame.
And step 805, deleting the stored first video frame when the stored first video frame and at least one third video frame are output.
Further, in order to reduce the memory consumption of the electronic device, the stored first video frame may be deleted when the stored first video frame and the at least one third video frame are output.
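The store/output/delete bookkeeping of steps 701, 702, 804, and 805 can be sketched roughly as follows. This is an illustrative sketch only; the buffer class and names are not from the patent, and interpolation is omitted to isolate the memory-management idea: at most one previous frame is retained, and it is released as soon as it has been handed to the encoder.

```python
class FrameBuffer:
    """Hypothetical holder for the one previously output video frame
    (steps 701/702), released once sent to the encoder (steps 804/805)."""
    def __init__(self):
        self.stored = None

    def on_frame(self, frame, encoder_input):
        if self.stored is not None:
            encoder_input.append(self.stored)  # output the stored first frame
            self.stored = None                 # delete it to free memory
        self.stored = frame                    # store the new (second) frame

encoder_input = []
buf = FrameBuffer()
for f in ["frame N+5", "frame N+7"]:
    buf.on_frame(f, encoder_input)
# "frame N+5" has been emitted and released; "frame N+7" is now the stored
# frame awaiting comparison with the next one.
```

Because only one frame is ever retained outside the encoder, the extra memory cost of the scheme stays constant regardless of how long the recording runs.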
As another example, in the case where the time difference is less than or equal to the set threshold, the first video frame and the second video frame are sequentially output to be encoded as the second video frame sequence to obtain the video file.
It should be noted that steps 801 to 802 may be implemented in any of the manners in the embodiments of the present disclosure; this embodiment does not limit them, and they are not repeated here.
In summary, the stored first video frame and the at least one third video frame are output to the encoder in sequence, so that the encoder encodes each video frame of the second video frame sequence in order to obtain the encoded file; and the stored first video frame is deleted once the stored first video frame and the at least one third video frame have been output, thereby reducing resource consumption.
In order to clearly illustrate how the above embodiments insert at least one third video frame between the first and second video frames, when the time difference is greater than the set threshold, to be encoded as a second video frame sequence into a video file, the present disclosure proposes another video processing method.
Fig. 10 is a flowchart of a video processing method according to a fourth embodiment of the present disclosure.
As shown in fig. 10, the video processing method may include the steps of:
In step 1001, adjacent first video frames and second video frames in a first video frame sequence are acquired.
Step 1002, determining the time difference between the time stamps of the adjacent first and second video frames in the first video frame sequence.
In step 1003, in case the time difference is larger than the set threshold, at least one third video frame is generated based on the motion estimation and motion compensation MEMC from the first video frame and the second video frame.
The third video frame is generated according to the contents of the first video frame and the second video frame.
In an embodiment of the present disclosure, when the time difference is greater than the set threshold, motion estimation and motion compensation (Motion Estimation and Motion Compensation, abbreviated MEMC) is used to estimate and analyze the motion trend between the first video frame and the second video frame in both the horizontal and vertical directions, generating at least one third video frame between the first video frame and the second video frame.
At step 1004, at least one third video frame is inserted between the first video frame and the second video frame to be encoded as a second video frame sequence to obtain a video file.
Further, the generated at least one third video frame is inserted between the first video frame and the second video frame in the first video frame sequence, the first video frame sequence after the at least one third video frame is inserted is used as the second video frame sequence, and the video file can be obtained by encoding the second video frame sequence.
For example, as shown in fig. 11, when a video frame has been dropped between the first video frame and the second video frame, the MEMC module generates at least one third video frame (e.g., frame n+6 in fig. 11) from the content of the first video frame (e.g., frame n+5 in fig. 11) and the second video frame (e.g., frame n+7 in fig. 11), and inserts the generated at least one third video frame between the first video frame and the second video frame, so that the video frame sequence with the inserted frame(s) is encoded.
It should be noted that steps 1001 to 1002 may be implemented in any manner described in the embodiments of the present disclosure; this is not limited here and is not repeated.
In summary, when the time difference is greater than the set threshold, at least one third video frame is generated from the first video frame and the second video frame based on motion estimation and motion compensation (MEMC), and the at least one third video frame is inserted between the first video frame and the second video frame for encoding as a second video frame sequence to obtain a video file. Thus, when a video frame has been dropped between the first video frame and the second video frame, the missing frame(s) can be predicted by motion estimation and motion compensation and inserted between them, so that the frame rate of the resulting second video frame sequence is stable, the video file encoded from the second video frame sequence plays back smoothly, and the user experience is improved.
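Steps 1001 to 1004 can be sketched as a small interpolation routine. True MEMC estimates motion vectors in the horizontal and vertical directions and motion-compensates the intermediate frame; in this illustrative sketch, simple linear blending of pixel values stands in for the MEMC module, and the function name and nominal frame interval are assumptions.

```python
def interpolate_frames(first, second, t_first, t_second, interval):
    """Generate the dropped frame(s) between two adjacent frames whose
    timestamp gap exceeds one nominal frame interval, and return the
    sequence with the generated frame(s) inserted between them.
    Linear pixel blending is a stand-in for real MEMC here."""
    out = [first]
    gap = t_second - t_first
    n_missing = round(gap / interval) - 1   # dropped frames between the pair
    for k in range(1, n_missing + 1):
        w = k / (n_missing + 1)             # temporal position of new frame
        # Blend each pixel; an actual MEMC module would instead shift
        # pixels along estimated motion vectors before blending.
        third = [(1 - w) * a + w * b for a, b in zip(first, second)]
        out.append(third)
    out.append(second)
    return out


# Example: 30 fps recording (interval ~33.3 ms), one frame dropped
# between frame n+5 and frame n+7 (frames shown as tiny pixel lists).
seq = interpolate_frames([0.0, 0.0], [2.0, 4.0],
                         t_first=0.0, t_second=2 / 30, interval=1 / 30)
# seq now holds three frames; the middle one is the generated third frame
```

The returned sequence corresponds to the second video frame sequence that is then handed to the encoder.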
In order to more clearly illustrate the above embodiments, an example will now be described.
For example, as shown in fig. 12, fig. 12 is a flow chart of a video processing method according to an embodiment of the disclosure.
1. Opening a camera application, enabling MEMC, and recording a video frame rate;
2. starting video recording, and reading video frames from a video frame sequence (original frame sequence) of the video recording to generate a first video frame sequence;
3. the MEMC module checks whether the input frame rate matches the recorded frame rate;
4. when the input frame rate matches the recorded value, the input frames are encoded directly; when the input frame rate differs from the recorded value, a frame-insertion operation is performed, and the video frame sequence after frame insertion is encoded.
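The decision in steps 3 and 4 above can be sketched as a comparison of the observed inter-frame gap against the recorded frame rate. The 1.5x slack factor and the function name are illustrative assumptions, not values taken from the disclosure.

```python
def process_pair(ts_prev, ts_cur, recorded_fps, slack=1.5):
    """Decide whether a pair of adjacent frames can be encoded directly
    or needs frame insertion first. A gap noticeably longer than one
    nominal frame interval suggests a dropped frame."""
    expected = 1.0 / recorded_fps   # nominal inter-frame interval
    gap = ts_cur - ts_prev
    if gap <= slack * expected:
        return "encode_directly"
    return "interpolate_then_encode"


# 30 fps recording: a ~33 ms gap is normal, a ~66 ms gap implies a drop
assert process_pair(0.0, 0.034, 30) == "encode_directly"
assert process_pair(0.0, 0.066, 30) == "interpolate_then_encode"
```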
In the video processing method of the embodiment of the disclosure, when the time difference between the time stamps of adjacent first and second video frames in the first video frame sequence is greater than a set threshold, at least one third video frame is inserted between the first video frame and the second video frame for encoding as a second video frame sequence to obtain a video file. Thus, when a video frame has been dropped between adjacent first and second video frames in the first video frame sequence, at least one third video frame is generated from the content of the first and second video frames and inserted, so that the frame rate of the resulting second video frame sequence is stable, the video file encoded from it plays back smoothly, and the user experience is improved.
In order to implement the above-described embodiments, the present disclosure proposes an electronic device.
Fig. 13 is a schematic structural diagram of an electronic device according to a fifth embodiment of the present disclosure.
As shown in fig. 13, an electronic device 1300 includes: an image sensor 1310, at least one processor 1320 coupled to the image sensor, and a memory 1330 communicatively coupled to the at least one processor.
Wherein, the image sensor 1310 is used for outputting an original frame sequence; the memory 1330 stores instructions executable by the at least one processor to cause the at least one processor to process the original frame sequence to obtain a first video frame sequence and perform the video processing method described in the above embodiments.
As one possible implementation of the embodiments of the present disclosure, the at least one processor 1320 includes: a central processing unit CPU, a first processor and an encoder.
The CPU is used for sequentially processing all frames in the original frame sequence and sequentially outputting all frames in the first video frame sequence obtained after the processing; the first processor is connected with the CPU and is used for executing a video processing method based on the first video frame sequence to obtain a second video frame sequence; and the encoder is connected with the first processor and used for encoding based on the second video frame sequence to obtain a video file.
As one possible implementation of the embodiments of the present disclosure, the first processor is a graphics processing unit (GPU) or an embedded neural network processing unit (NPU).
As one possible implementation of the embodiments of the present disclosure, the at least one processor 1320 includes: a central processing unit CPU and an encoder.
The CPU is used for sequentially processing all frames in the original frame sequence and sequentially outputting all frames in the first video frame sequence obtained after the processing; an encoder, coupled to the CPU, for performing a video processing method based on the first video frame sequence to obtain a second video frame sequence; and encoding based on the second video frame sequence to obtain a video file.
The electronic device of the embodiment of the disclosure inserts at least one third video frame between the first video frame and the second video frame for encoding as the second video frame sequence to obtain the video file when the time difference between the time stamps of adjacent first and second video frames in the first video frame sequence is greater than a set threshold. Thus, when a video frame has been dropped between adjacent first and second video frames in the first video frame sequence, at least one third video frame is generated from the content of the first and second video frames and inserted between them, so that the frame rate of the resulting second video frame sequence is stable, the video file encoded from it plays back smoothly, and the user experience is improved.
In order to achieve the above-described embodiments, the present disclosure proposes a video processing apparatus.
Fig. 14 is a schematic structural diagram of a video processing apparatus according to a sixth embodiment of the present disclosure.
As shown in fig. 14, the video processing apparatus 1400 includes: an acquisition module 1410, a determination module 1420, and a processing module 1430.
The acquiring module 1410 is configured to acquire a first video frame and a second video frame that are adjacent to each other in the first video frame sequence; a determining module 1420 configured to determine a time difference of the time stamps for adjacent first video frames and second video frames in the first video frame sequence; and the processing module 1430 is configured to insert at least one third video frame between the first video frame and the second video frame when the time difference is greater than the set threshold value, so as to encode the second video frame sequence to obtain the video file, where the third video frame is generated according to the content of the first video frame and the second video frame.
As one possible implementation of the embodiments of the present disclosure, the obtaining module 1410 is specifically configured to: store the second video frame in response to the camera application outputting the second video frame in the first video frame sequence; and read the stored first video frame, the first video frame having been stored when the camera application output it before the second video frame.
As one possible implementation of an embodiment of the present disclosure, the video processing apparatus 1400 further includes: an output module and a deletion module.
The output module is configured to sequentially output the stored first video frame and the at least one third video frame to the encoder, so that the encoder encodes based on the stored first video frame and the at least one third video frame; and the deleting module is configured to delete the stored first video frame once the stored first video frame and the at least one third video frame have been completely output.
As one possible implementation of the embodiments of the present disclosure, the processing module 1430 is specifically configured to: generating at least one third video frame based on the motion estimation and motion compensation MEMC from the first video frame and the second video frame if the time difference is greater than the set threshold; at least one third video frame is inserted between the first video frame and the second video frame to be used as a second video frame sequence for encoding to obtain a video file.
As one possible implementation of the embodiments of the present disclosure, the processing module 1430 is further configured to: and under the condition that the time difference is smaller than or equal to a set threshold value, sequentially outputting the first video frame and the second video frame to be used as a second video frame sequence for encoding to obtain a video file.
As one possible implementation of an embodiment of the disclosure, the obtaining module 1410 is further configured to: in response to an enabling instruction of the camera application, obtaining a frame rate of an original frame sequence from the camera application; the original frame sequence is acquired by an image sensor and is used for generating a first video frame sequence; a set threshold is determined based on the frame rate.
As one possible implementation of an embodiment of the disclosure, the obtaining module 1410 is further configured to: acquiring a set threshold value from the camera application in response to an enabling instruction of the camera application; wherein, the threshold value is set according to the frame rate of the original frame sequence; the original frame sequence is acquired by an image sensor for generating a first video frame sequence.
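Determining the set threshold from the frame rate of the original frame sequence, as the obtaining module does above, can be sketched as follows. The factor of 1.5 nominal frame intervals is an illustrative assumption: any gap noticeably longer than one interval suggests a dropped frame. The function name is hypothetical.

```python
def set_threshold_from_fps(fps, factor=1.5):
    """Derive the set threshold (in seconds) from the recording frame
    rate: a multiple of the nominal inter-frame interval. The 1.5x
    factor is an assumption for illustration only."""
    return factor / fps


# At 30 fps the nominal interval is ~33.3 ms, giving a ~50 ms threshold;
# a timestamp gap above this indicates at least one dropped frame.
threshold = set_threshold_from_fps(30)
```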
The video processing apparatus of the embodiment of the disclosure inserts at least one third video frame between the first video frame and the second video frame for encoding as the second video frame sequence to obtain a video file when the time difference between the time stamps of adjacent first and second video frames in the first video frame sequence is greater than a set threshold. Thus, when a video frame has been dropped between adjacent first and second video frames in the first video frame sequence, at least one third video frame is generated from the content of the first and second video frames and inserted between them, so that the frame rate of the resulting second video frame sequence is stable, the video file encoded from it plays back smoothly, and the user experience is improved.
To achieve the above embodiments, the present disclosure also proposes a non-transitory computer-readable storage medium storing computer instructions for causing a computer to execute the video processing method described in the above embodiments.
To achieve the above embodiments, the present disclosure also proposes a computer program product comprising a computer program which, when executed by a processor, implements the video processing method according to the above embodiments of the present disclosure.
According to an embodiment of the disclosure, the disclosure also provides another electronic device.
Fig. 15 is a block diagram of an electronic device for the video processing method according to an embodiment of the present disclosure. Electronic devices are intended to represent various forms of digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other appropriate computers. The electronic device may also represent various forms of mobile devices, such as personal digital assistants, cellular telephones, smartphones, wearable devices, and other similar computing devices. The components shown herein, their connections and relationships, and their functions, are meant to be exemplary only, and are not meant to limit implementations of the disclosure described and/or claimed herein.
As shown in fig. 15, the electronic device includes: one or more processors 1501, a memory 1502, and interfaces for connecting the components, including high-speed interfaces and low-speed interfaces. The various components are interconnected using different buses and may be mounted on a common motherboard or in other manners as desired. The processor may process instructions executed within the electronic device, including instructions stored in or on the memory to display graphical information of a GUI on an external input/output device, such as a display device coupled to the interface. In other embodiments, multiple processors and/or multiple buses may be used, along with multiple memories, if desired. Likewise, multiple electronic devices may be connected, each providing a portion of the necessary operations (e.g., as a server array, a set of blade servers, or a multiprocessor system). In fig. 15, one processor 1501 is taken as an example.
Memory 1502 is a non-transitory computer-readable storage medium provided by the present disclosure. Wherein the memory stores instructions executable by the at least one processor to cause the at least one processor to perform the video processing methods provided by the present disclosure. The non-transitory computer readable storage medium of the present disclosure stores computer instructions for causing a computer to perform the video processing method provided by the present disclosure.
The memory 1502 serves as a non-transitory computer-readable storage medium that may be used to store non-transitory software programs, non-transitory computer-executable programs, and modules, such as the program instructions/modules corresponding to the video processing method in the embodiments of the present disclosure (e.g., the acquisition module 1410, the determination module 1420, and the processing module 1430 shown in fig. 14). The processor 1501 executes various functional applications and data processing of the server, i.e., implements the video processing method in the above method embodiments, by executing the non-transitory software programs, instructions, and modules stored in the memory 1502.
Memory 1502 may include a storage program area and a storage data area, wherein the storage program area may store an operating system, at least one application program required for functionality; the storage data area may store data created according to the use of the electronic device of the video processing method, and the like. In addition, the memory 1502 may include high-speed random access memory, and may also include non-transitory memory, such as at least one magnetic disk storage device, flash memory device, or other non-transitory solid-state storage device. In some embodiments, memory 1502 may optionally include memory located remotely from processor 1501, which may be connected to the electronic device of the video processing method via a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The electronic device of the video processing method may further include: an input device 1503 and an output device 1504. The processor 1501, the memory 1502, the input device 1503, and the output device 1504 may be connected by a bus or in other manners; in fig. 15, connection by a bus is taken as an example.
The input device 1503 may receive input numeric or character information and generate key signal inputs related to user settings and function control of the electronic device of the video processing method, such as input devices of a touch screen, a keypad, a mouse, a track pad, a touch pad, a pointer stick, one or more mouse buttons, a track ball, a joystick, etc. The output device 1504 may include a display device, auxiliary lighting devices (e.g., LEDs), and haptic feedback devices (e.g., vibration motors), among others. The display device may include, but is not limited to, a Liquid Crystal Display (LCD), a Light Emitting Diode (LED) display, and a plasma display. In some implementations, the display device may be a touch screen.
Various implementations of the systems and techniques described here can be realized in digital electronic circuitry, integrated circuitry, application-specific integrated circuits (ASICs), computer hardware, firmware, software, and/or combinations thereof. These various embodiments may include: implementation in one or more computer programs that are executable and/or interpretable on a programmable system including at least one programmable processor, which may be a special-purpose or general-purpose programmable processor, that may receive data and instructions from, and transmit data and instructions to, a storage system, at least one input device, and at least one output device.
These computer programs (also referred to as programs, software, software applications, or code) include machine instructions for a programmable processor, and may be implemented in a high-level procedural and/or object-oriented programming language, and/or in assembly/machine language. As used herein, the terms "machine-readable medium" and "computer-readable medium" refer to any computer program product, apparatus, and/or device (e.g., magnetic discs, optical disks, memory, programmable logic devices (PLDs)) used to provide machine instructions and/or data to a programmable processor, including a machine-readable medium that receives machine instructions as a machine-readable signal. The term "machine-readable signal" refers to any signal used to provide machine instructions and/or data to a programmable processor.
To provide for interaction with a user, the systems and techniques described here can be implemented on a computer having: a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to a user; and a keyboard and pointing device (e.g., a mouse or trackball) by which a user can provide input to the computer. Other kinds of devices may also be used to provide for interaction with a user; for example, feedback provided to the user may be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user may be received in any form, including acoustic input, speech input, or tactile input.
The systems and techniques described here can be implemented in a computing system that includes a background component (e.g., as a data server), or that includes a middleware component (e.g., an application server), or that includes a front-end component (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such background, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include: local Area Networks (LANs), wide Area Networks (WANs), and the internet.
The computer system may include a client and a server. The client and server are typically remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other.
It should be appreciated that various forms of the flows shown above may be used to reorder, add, or delete steps. For example, the steps recited in the present disclosure may be performed in parallel, sequentially, or in a different order, provided that the desired results of the technical solutions presented in the present disclosure are achieved, and are not limited herein.
The above detailed description should not be taken as limiting the scope of the present disclosure. It will be apparent to those skilled in the art that various modifications, combinations, sub-combinations and alternatives are possible, depending on design requirements and other factors. Any modifications, equivalent substitutions and improvements made within the spirit and principles of the present disclosure are intended to be included within the scope of the present disclosure.
Claims (13)
1. A video processing method, comprising:
acquiring adjacent first video frames and second video frames in a first video frame sequence;
determining a time difference of a time stamp for adjacent first video frames and second video frames in the first video frame sequence;
inserting at least one third video frame between the first video frame and the second video frame under the condition that the time difference is larger than a set threshold value, and encoding the third video frame as a second video frame sequence to obtain a video file; wherein the third video frame is generated according to the contents of the first video frame and the second video frame.
2. The method of claim 1, wherein the acquiring adjacent first and second video frames in the first sequence of video frames comprises:
Storing a second video frame in the first sequence of video frames in response to a camera application outputting the second video frame;
reading the stored first video frame; wherein the stored first video frame was stored when the camera application output the first video frame before the second video frame.
3. The method according to claim 2, wherein the method further comprises:
sequentially outputting the stored first video frame and the at least one third video frame to an encoder to cause the encoder to encode based on the stored first video frame and the at least one third video frame;
and deleting the stored first video frame under the condition that the stored first video frame and the at least one third video frame are completely output.
4. A method according to any one of claims 1-3, wherein said inserting at least one third video frame between said first video frame and said second video frame to encode as a second sequence of video frames to obtain a video file, if said time difference is greater than a set threshold, comprises:
Generating the at least one third video frame based on motion estimation and motion compensation MEMC from the first video frame and the second video frame if the time difference is greater than the set threshold;
and inserting the at least one third video frame between the first video frame and the second video frame to be used as the second video frame sequence for encoding to obtain a video file.
5. A method according to any one of claims 1-3, wherein the method further comprises:
and under the condition that the time difference is smaller than or equal to the set threshold value, outputting the first video frame and the second video frame sequentially to be used as a second video frame sequence for encoding to obtain a video file.
6. A method according to any one of claims 1-3, wherein the method further comprises:
in response to an enabling instruction of the camera application, obtaining a frame rate of an original frame sequence from the camera application; wherein the original frame sequence is acquired by an image sensor for generating the first video frame sequence;
and determining the set threshold according to the frame rate.
7. A method according to any one of claims 1-3, wherein the method further comprises:
Acquiring the set threshold value from the camera application in response to an enabling instruction of the camera application; wherein, the set threshold is determined according to the frame rate of the original frame sequence; the original frame sequence is acquired by an image sensor for generating the first video frame sequence.
8. An electronic device, comprising:
an image sensor for outputting an original frame sequence;
at least one processor coupled to the image sensor; and
a memory communicatively coupled to the at least one processor; wherein,
the memory stores instructions executable by the at least one processor to cause the at least one processor to process the original sequence of frames to obtain a first sequence of video frames and to perform the video processing method of any of claims 1-7.
9. The electronic device of claim 8, wherein the at least one processor comprises: a central processing unit CPU, a first processor and an encoder;
the CPU is used for sequentially processing all frames in the original frame sequence and sequentially outputting all frames in the first video frame sequence obtained after the processing;
The first processor is connected with the CPU and is used for executing the video processing method based on the first video frame sequence to obtain a second video frame sequence;
the encoder is connected with the first processor and used for encoding based on the second video frame sequence to obtain a video file.
10. The electronic device of claim 9, wherein the first processor is a graphics processor GPU or an embedded neural network processor NPU.
11. The electronic device of claim 8, wherein the at least one processor comprises: a central processing unit CPU and an encoder;
the CPU is used for sequentially processing all frames in the original frame sequence and sequentially outputting all frames in the first video frame sequence obtained after the processing;
the encoder is connected with the CPU and is used for executing the video processing method based on the first video frame sequence to obtain a second video frame sequence; and encoding based on the second video frame sequence to obtain a video file.
12. A video processing apparatus, comprising:
the acquisition module is used for acquiring adjacent first video frames and second video frames in the first video frame sequence;
A determining module, configured to determine a time difference of a time stamp for a first video frame and a second video frame adjacent to each other in the first video frame sequence;
and the processing module is used for inserting at least one third video frame between the first video frame and the second video frame under the condition that the time difference is larger than a set threshold value so as to be used as a second video frame sequence for encoding to obtain a video file, wherein the third video frame is generated according to the content of the first video frame and the second video frame.
13. A non-transitory computer readable storage medium storing computer instructions for causing the computer to perform the method of any one of claims 1-7.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210772946.4A CN117376641A (en) | 2022-06-30 | 2022-06-30 | Video processing method, electronic device, apparatus and storage medium |
Publications (1)
Publication Number | Publication Date |
---|---|
CN117376641A true CN117376641A (en) | 2024-01-09 |
Family
ID=89391641
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202210772946.4A Pending CN117376641A (en) | 2022-06-30 | 2022-06-30 | Video processing method, electronic device, apparatus and storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN117376641A (en) |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||