CN114125298A

CN114125298A - Video generation method and device, electronic equipment and computer readable storage medium

Info

Publication number: CN114125298A
Application number: CN202111423165.6A
Authority: CN
Inventors: 董晓龙
Original assignee: Guangdong Oppo Mobile Telecommunications Corp Ltd
Current assignee: Guangdong Oppo Mobile Telecommunications Corp Ltd
Priority date: 2021-11-26
Filing date: 2021-11-26
Publication date: 2022-03-01

Abstract

The application discloses a video generation method, a video generation device, an electronic device and a non-volatile computer-readable storage medium. The video generation method comprises the following steps: calculating a loss value between any two frames of shot images according to the attitude data of the shot images with the first preset frame number; selecting a frame of shot image with the minimum loss value as a selected frame from shot images with a second preset frame number after the first preset frame number, wherein the second preset frame number is smaller than the first preset frame number; acquiring an image with the minimum loss value with a selected frame in a plurality of frames of shot images before the selected frame as a next selected frame; and circularly executing the step of acquiring the image with the minimum loss value with the selected frame in the multi-frame shot images before the selected frame as the next selected frame so as to generate the target video according to the acquired multiple selected frames. The loss value between any two continuous frames in the multiple selected frames is minimum, so that the target video generated according to the multiple selected frames is less influenced by jitter, and the video quality of the target video is ensured.

Description

Video generation method and device, electronic equipment and computer readable storage medium

Technical Field

The present application relates to the field of electronic technologies, and in particular, to a video generation method, a video generation apparatus, an electronic device, and a non-volatile computer-readable storage medium.

Background

Currently, smart devices (such as smart phones, smart watches, tablet computers, and the like) have become articles for daily use. When a user shoots through the intelligent terminal device for delayed shooting, because the shooting duration of the delayed shooting is generally long and the number of shot image frames is large, a predetermined number of image frames need to be selected from a large number of image frames to generate a delayed shooting video with a moderate video length (for example, one frame is selected at every first preset frame number), however, slight jitter also has a large influence on the delayed video, and the video quality of the delayed video is seriously reduced.

Disclosure of Invention

The embodiment of the application provides a video generation method, a video generation device, electronic equipment and a non-volatile computer readable storage medium.

The video generation method comprises the steps of calculating a loss value between any two frames of shot images according to attitude data of the shot images with a first preset frame number, wherein the loss value is larger when the difference of the attitude data of the shot images of any two frames is larger; selecting a frame of the shot image with the minimum loss value as a selected frame from the shot images with a second preset frame number after the first preset frame number, wherein the second preset frame number is smaller than the first preset frame number; acquiring an image with the minimum loss value between the selected frame and multiple frames of the shot images before the selected frame as a next selected frame; and circularly executing the step of taking the image with the minimum loss value with the selected frame in the multiple frames of shot images before the selected frame is obtained as the next selected frame so as to generate a target video according to the obtained multiple selected frames.

The video generation device comprises a calculation module, a selection module, an acquisition module and a generation module. The calculation module is used for calculating a loss value between any two frames of the shot images according to the attitude data of the shot images with a first preset frame number, wherein the larger the difference of the attitude data of any two frames of the shot images is, the larger the loss value is; the selection module is used for selecting a frame of the shot image with the minimum loss value as a selected frame from the shot images with a second preset frame number after the first preset frame number, wherein the second preset frame number is smaller than the first preset frame number; the acquisition module is used for acquiring an image with the minimum loss value between the selected frame and multiple frames of the shot images before the selected frame as a next selected frame; the generating module is used for circularly executing the step of taking the image with the minimum loss value with the selected frame as the next selected frame in the plurality of frames of shot images before the selected frame is obtained, so as to generate the target video according to the plurality of obtained selected frames.

The electronic device of the embodiment of the application comprises a processor. The processor is used for calculating a loss value between any two frames of shot images according to the attitude data of the shot images with a first preset frame number, wherein the larger the difference of the attitude data of any two frames of the shot images is, the larger the loss value is; selecting a frame of the shot image with the minimum loss value as a selected frame from the shot images with a second preset frame number after the first preset frame number, wherein the second preset frame number is smaller than the first preset frame number; acquiring an image with the minimum loss value between the selected frame and multiple frames of the shot images before the selected frame as a next selected frame; and circularly executing the step of taking the image with the minimum loss value with the selected frame in the multiple frames of shot images before the selected frame is obtained as the next selected frame so as to generate a target video according to the obtained multiple selected frames.

The non-transitory computer-readable storage medium of embodiments of the present application contains a computer program that, when executed by one or more processors, causes the processors to perform a video generation method that: calculating a loss value between any two frames of shot images according to the attitude data of the shot images with a first preset frame number, wherein the larger the difference of the attitude data of the shot images with any two frames is, the larger the loss value is; selecting a frame of the shot image with the minimum loss value as a selected frame from the shot images with a second preset frame number after the first preset frame number, wherein the second preset frame number is smaller than the first preset frame number; acquiring an image with the minimum loss value between the selected frame and multiple frames of the shot images before the selected frame as a next selected frame; and circularly executing the step of taking the image with the minimum loss value with the selected frame in the multiple frames of shot images before the selected frame is obtained as the next selected frame so as to generate a target video according to the obtained multiple selected frames.

According to the video generation method, the video generation device, the electronic equipment and the nonvolatile computer readable storage medium of the embodiment of the application, firstly, according to the attitude data of the shot images with a first preset frame number, the loss value between any two shot images is calculated, the loss value is in direct proportion to the difference of the attitude data of any two shot images, then, in the shot images with a second preset frame number after the first preset frame number, the shot image with the minimum loss value is selected as a selected frame in a backtracking mode, then, the backtracking is carried out forward, the shot image with the minimum loss value between the selected frames is found and is used as the next selected frame, then, the backtracking is carried out continuously until the shot images with the whole first preset frame number are backtracked completely, a plurality of selected frames can be selected from the shot images with the first preset frame number according to a dynamic selection interval, and the loss value between any two continuous frames in the multiple selected frames is minimum, so that the target video generated according to the multiple selected frames is less influenced by jitter, and the video quality of the target video is ensured.

Additional aspects and advantages of embodiments of the present application will be set forth in part in the description which follows and, in part, will be obvious from the description, or may be learned by practice of embodiments of the present application.

Drawings

The above and/or additional aspects and advantages of the present application will become apparent and readily appreciated from the following description of the embodiments, taken in conjunction with the accompanying drawings of which:

FIG. 1 is a schematic flow diagram of a video generation method according to some embodiments of the present application;

FIG. 2 is a schematic diagram of a video generation apparatus according to some embodiments of the present application;

FIG. 3 is a schematic plan view of an electronic device of some embodiments of the present application;

FIG. 4 is a schematic flow chart diagram of a video generation method in accordance with certain embodiments of the present application;

FIG. 5 is a schematic flow chart diagram of a video generation method according to some embodiments of the present application;

FIG. 6 is a schematic flow chart diagram of a video generation method in accordance with certain embodiments of the present application;

FIG. 7 is a schematic diagram of a connection state of a non-volatile computer readable storage medium and a processor of some embodiments of the present application.

Detailed Description

Reference will now be made in detail to embodiments of the present application, examples of which are illustrated in the accompanying drawings, wherein like or similar reference numerals refer to the same or similar elements or elements having the same or similar function throughout. The embodiments described below by referring to the drawings are exemplary only for the purpose of explaining the embodiments of the present application, and are not to be construed as limiting the embodiments of the present application.

In the process of shooting the delayed video, the terminal may shake to cause poor imaging effect, and even if the terminal shakes slightly when image frames are selected at intervals of a first preset number of frames, the terminal looks like severe shaking when the delayed video is generated according to the selected image frames, so that the video quality of the delayed video is seriously affected.

Referring to fig. 1, a video generation method is provided in an embodiment of the present application. The video generation method comprises the following steps:

011: calculating a loss value between any two frames of shot images according to the attitude data of the shot images with the first preset frame number, wherein the larger the difference of the attitude data of any two frames of shot images is, the larger the loss value is;

012: selecting a shot image corresponding to the minimum loss value as a selected frame from shot images of a second preset frame number after the first preset frame number, wherein the second preset frame number is smaller than the first preset frame number;

013: acquiring an image with the minimum loss value with a selected frame in a plurality of frames of shot images before the selected frame as a next selected frame;

014: and circularly executing the step of acquiring the image with the minimum loss value with the selected frame in the multi-frame shot images before the selected frame as the next selected frame so as to generate the target video according to the acquired multiple selected frames.

Referring to fig. 2, the present embodiment provides a video generating apparatus 10. The video generation device 10 includes a calculation module 11, a selection module 12, an acquisition module 13, and a generation module 14. The video generation method according to the embodiment of the present application is applicable to the video generation device 10. The calculating module 11, the selecting module 12, the obtaining module 13, and the generating module 14 are respectively configured to execute step 011, step 012, step 013, and step 014. That is, the calculating module 11 is configured to calculate a loss value between any two captured images according to the pose data of the captured images of the first preset number of frames, where the greater the difference between the pose data of any two captured images is, the greater the loss value is; the selecting module 12 is configured to select, as a selected frame, a frame of a captured image with a minimum loss value from captured images with a second preset frame number after the first preset frame number, where the second preset frame number is smaller than the first preset frame number; the obtaining module 13 is configured to obtain, as a next selected frame, an image with a minimum loss value with respect to the selected frame from among multiple frames of shot images before the selected frame; the generating module 14 is configured to cyclically execute the step of acquiring, as a next selected frame, an image with a smallest loss value with respect to the selected frame in the captured images of the plurality of frames before the selected frame, so as to generate the target video according to the acquired plurality of selected frames.

Referring to fig. 3, an electronic device 100 is further provided in the present embodiment. The electronic device 100 comprises a processor 20. The video generation method according to the embodiment of the present application is applicable to the electronic device 100. Processor 20 is configured to perform step 011, step 012, step 013, and step 014. That is, the processor 20 is configured to calculate a loss value between any two captured images based on the pose data of the captured images of the first preset number of frames, the greater the difference of the pose data of any two captured images, the greater the loss value; selecting a frame of shot image with the minimum loss value as a selected frame from shot images with a second preset frame number after the first preset frame number, wherein the second preset frame number is smaller than the first preset frame number; acquiring an image with the minimum loss value with a selected frame in a plurality of frames of shot images before the selected frame as a next selected frame; and circularly executing the step of acquiring the image with the minimum loss value with the selected frame in the multi-frame shot images before the selected frame as the next selected frame so as to generate the target video according to the acquired multiple selected frames.

The electronic device 100 includes a housing 30. The electronic device 100 may be a cell phone, a tablet computer, a display device, a notebook computer, a teller machine, a gate, a smart watch, a head-up display device, a game console, etc. As shown in fig. 3, in the embodiment of the present application, the electronic device 100 is a mobile phone as an example, and it is understood that the specific form of the electronic device 100 is not limited to the mobile phone. The housing 30 may also be used to mount functional modules of the electronic device 100, such as a display device, an imaging device, a power supply device, and a communication device, so that the housing 30 provides protection for the functional modules, such as dust prevention, drop prevention, and water prevention.

Specifically, the electronic device 100 further includes a camera 40 and an attitude sensor 50, the camera 40 may be configured to capture a plurality of consecutive frames of captured images, the attitude sensor 50 is configured to acquire attitude data corresponding to the captured images when the camera 40 captures the images, for example, the attitude sensor 50 and the camera 40 perform information acquisition at the same frame rate, so that the captured images and the attitude data correspond to each other one by one, and of course, the attitude sensor 50 may acquire attitude data at a higher frame rate, so that one frame of captured image corresponds to a plurality of attitude data. The attitude sensor 50 may be a gyroscope, an accelerometer, or the like, and the attitude data may include a roll angle, a pitch angle, and a yaw angle of the electronic device 100 or the image sensor, and it is understood that the accuracy of determining the attitude data of the captured image by using the roll angle, the pitch angle, and the yaw angle of the image sensor is high.

It is understood that the number of frames of the photographed images is large in the delayed photographing scene, and in order to reduce the amount of calculation, when the delayed photographing video is generated, all the photographed images may be divided into a plurality of parts, and only one part may be processed at a time, thereby improving the video generation efficiency. Wherein the number of frames of the photographed image of each section may be a first preset number of frames.

After the processor 20 obtains the captured images with the first preset number of frames and the corresponding posture data of the captured images, the loss value between any two captured images can be calculated. For example, the processor 20 may calculate the difference in the pose data of any two captured images to determine a loss value, the greater the difference in the pose data of any two captured images, the greater the loss value.

Then, the processor 20 starts to trace back, and selects the image with the minimum loss value as the selected frame from the shot images with the second preset frame number after the first preset frame number, and it can be understood that the loss values exist between each frame of shot image and other shot images with the first preset frame number, so that all the loss values between each frame of shot image with the second preset frame number and other shot images can be obtained first, and then the shot image corresponding to the minimum loss value is found, that is, the shot image can be determined as the first selected frame.

The shot images with the first preset frame number are arranged in front of and behind the shooting time to determine the serial number of each frame of shot images, and the shooting time of the shot images with the first preset frame number behind the first preset frame number is later than that of other shot images in the shot images with the first preset frame number. For example, the first preset frame number is 100 frames, and if the first preset frame number is 100, the first preset frame number is respectively marked as a serial number 1 to a serial number 100 from front to back according to the shooting time, and the shot image of the second preset frame number can be the last 10 frames in the 100 frames.

Then, the processor 20 continues to backtrack, searches for an image with the minimum loss value with the first selected frame in the captured image before the first selected frame, determines the image as the second selected frame, then continues to backtrack again, searches for an image with the minimum loss value with the second selected frame in the captured image before the second selected frame, determines the image as the third selected frame, and so on until the captured image with the first preset frame number is backtracked. For example, when the number of frames of the captured image before the selected frame is less than the threshold of the second preset number of frames (e.g., 5 frames, 10 frames, etc.), it is determined that the captured image with the first preset number of frames has been traced back, and at this time, the step of continuously acquiring the previous selected frame of the currently selected frame is stopped. The second preset frame number threshold value can be determined according to a preset frame selection interval, the preset frame selection interval can be determined according to the frame number requirement of the delayed video and the total frame number of the shot images, and if the second preset frame number threshold value is a preset multiple of the preset frame selection interval, such as 0.3 time, 0.5 time and the like, the situation that the frame number interval between the actually selected frames is too small, and the frame number of the finally generated target video does not meet the requirement is prevented.

And finally, synthesizing the selected frames of the selected multi-frames according to the sequence number to generate the target video played in sequence, wherein the target video is less influenced by jitter because the loss value between any two continuous frames in the target video is minimum.

In the video generation method, the video generation device 10 and the electronic device 100 according to the embodiment of the present application, firstly, according to the attitude data of the shot images with the first preset frame number, the loss value between any two shot images is calculated, the loss value is in direct proportion to the difference between the attitude data of any two shot images, then, in the back-tracking mode, the shot image with the minimum loss value is selected as the selected frame from the shot images with the second preset frame number after the first preset frame number, then, the back-tracking is performed forward, the image with the minimum loss value between the selected frames is found and is used as the next selected frame, then, the back-tracking is performed continuously until the whole shot image with the first preset frame number is back-tracked, that is, according to the dynamic selection interval from the shot image with the first preset frame number, a plurality of selected frames are selected, and the loss value between any two continuous frames in the plurality of selected frames is minimum, therefore, the target video generated according to the multi-frame selected frame is less influenced by jitter, and the video quality of the target video is ensured. The method can meet the preset frame selection interval, and can ensure that the video frames with the camera posture and the picture content as close as possible are selected, so that the target video has a better stable effect.

Referring to fig. 2, 3 and 4, in some embodiments, step 011 includes:

0111: arranging the shot images with a first preset frame number according to the shooting time sequence to determine the serial number of each frame of shot image, wherein the earlier the shooting time is, the smaller the serial number is;

0112: calculating a first loss value according to the attitude data of the previous frame of shot image and the attitude data of the next frame of shot image in any two frames of shot images;

0113: when the selected frame does not exist after the shooting image of the next frame, calculating a second loss value according to the attitude data of the shooting image of the next frame and the attitude data of the last frame in the shooting image of a second preset frame number; and

0114: and when the selected frame exists after the image is shot in the next frame, calculating a second loss value according to the attitude data of the shot image in the next frame and the attitude data of the first selected frame after the shot image in the next frame.

0115: determining the loss value based on the first loss value and the second loss value.

In some embodiments, the calculation module 11 is further configured to perform step 0111, step 0112, step 0113, step 0114, and step 0115. Namely, the calculating module 11 is further configured to arrange the shot images of the first preset number of frames in the shooting time sequence to determine the serial number of each frame of shot image, where the earlier the shooting time is, the smaller the serial number is; calculating a first loss value according to the attitude data of the previous frame of shot image and the attitude data of the next frame of shot image in any two frames of shot images; when the selected frame does not exist after the shooting image of the next frame, calculating a second loss value according to the attitude data of the shooting image of the next frame and the attitude data of the last frame in the shooting image of a second preset frame number; when the selected frame exists after the next frame of shot image, calculating a second loss value according to the attitude data of the next frame of shot image and the attitude data of the first selected frame after the next frame of shot image; determining the loss value based on the first loss value and the second loss value.

In certain embodiments, processor 20 is configured to perform

steps

0111, 0112, 0113, 0114, and 0115. The processor 20 is configured to arrange the shot images of the first preset number of frames in the shooting time sequence to determine the serial number of each frame of shot image, wherein the earlier the shooting time is, the smaller the serial number is; calculating a first loss value according to the attitude data of the previous frame of shot image and the attitude data of the next frame of shot image in any two frames of shot images; when the selected frame does not exist after the shooting image of the next frame, calculating a second loss value according to the attitude data of the shooting image of the next frame and the attitude data of the last frame in the shooting image of a second preset frame number; when the selected frame exists after the next frame of shot image, calculating a second loss value according to the attitude data of the next frame of shot image and the attitude data of the first selected frame after the next frame of shot image; determining the loss value based on the first loss value and the second loss value.

Specifically, for convenience of describing the anteroposterior relationship between the photographed images of different frames, the photographed images are also arranged in the order of photographing time to determine the number of the photographed images of each frame, the greater the number, the further back. In calculating the loss value, in addition to calculating the first loss value between each frame of photographed image and all other photographed images, the loss value between each frame of photographed image and the reference frame may be calculated, thereby more accurately determining the loss value.

In calculating the first loss value, the processor 20 calculates the first loss value based on the attitude data of the preceding captured image and the attitude data of the succeeding captured image in any two captured images, for example, the processor 20 calculates the first loss value based on the difference between the attitude data of the preceding captured image and the attitude data of the succeeding captured image in any two captured images.

When calculating the second loss value, the processor 20 first determines whether a selected frame exists after the next frame of captured image (please refer to the foregoing description for determining the selected frame), and if not, calculates the second loss value according to the pose data of the next frame of captured image and the pose data of the last frame in the captured images of the second preset number of frames; and if so, calculating a second loss value according to the attitude data of the next frame of shot image and the attitude data of the first selected frame after the next frame of shot image. In this manner, the first loss value and the second loss value can be accurately determined.

Finally, processor 20 may determine a loss value based on the first loss value and the second loss value, for example, the loss value calculation formula is as follows:

wherein i and j are respectively a previous frame photographed image and a subsequent frame photographed imageAnd (5) taking a frame image, cost (j) is a second loss value, K is any integer from 1 to i-1, and i-K can traverse to any frame before the ith frame image. That is, the loss value is determined based on the second loss value corresponding to the next captured image and the minimum value of the first loss value between the previous captured image and any captured image before the previous captured image, so as to accurately determine the loss value, and it can be understood that, taking the first preset frame number as 100 frames as an example, 100 × 100 loss values can be calculated, thereby forming 100 × 100 Dv [ i ] s][j]And the matrix can facilitate the backtracking of the selected frame in the follow-up process through the loss value matrix.

In other embodiments, the processor 20 calculates a third loss value from the attitude data of the previous captured image and the attitude data of the next captured image in any two captured images; then, the processor 20 may further calculate a fourth loss value based on the attitude data of the next frame captured image and the attitude data of the selected frame having the smallest difference from the serial number of the next frame captured image, from among any two frames of captured images. And finally, calculating the loss value according to the third loss value and the fourth loss value. Therefore, the fourth loss value is obtained by calculation according to the shot image of the next frame and the selected frame all the time, and the accuracy of the fourth loss value is high.

Referring to fig. 2, 3 and 5, in some embodiments, step 0112 includes:

01121: calculating a first sub-loss value according to the attitude data of the previous frame of shot image and the attitude data of the next frame of shot image in any two frames of shot images;

01122: calculating a second sub-loss value according to the image data of the previous frame of shot image and the image data of the next frame of shot image in any two frames of shot images;

01123: calculating a third sub-loss value according to the sequence number of the previous frame of shot image and the sequence number of the next frame of shot image in any two frames of shot images;

01124: a first loss value is calculated based on the first sub-loss value, the second sub-loss value, and the third sub-loss value.

In certain embodiments, the calculation module 11 is further configured to perform

steps

01121, 01122, 01123, and 01124. Namely, the calculating module 11 is further configured to calculate a first sub-loss value according to the pose data of the previous captured image and the pose data of the next captured image in any two captured images; calculating a second sub-loss value according to the image data of the previous frame of shot image and the image data of the next frame of shot image in any two frames of shot images; calculating a third sub-loss value according to the sequence number of the previous frame of shot image and the sequence number of the next frame of shot image in any two frames of shot images; a first loss value is calculated based on the first sub-loss value, the second sub-loss value, and the third sub-loss value.

In certain embodiments, processor 20 is configured to perform step 01121, step 01122, step 01123, and step 01124. Namely, the processor 20 is configured to calculate a first sub-loss value according to the pose data of the previous captured image and the pose data of the next captured image in any two captured images; calculating a second sub-loss value according to the image data of the previous frame of shot image and the image data of the next frame of shot image in any two frames of shot images; calculating a third sub-loss value according to the sequence number of the previous frame of shot image and the sequence number of the next frame of shot image in any two frames of shot images; a first loss value is calculated based on the first sub-loss value, the second sub-loss value, and the third sub-loss value.

Specifically, in addition to the loss value of the attitude data, the image data and the number difference factor of the two captured images may be considered when calculating the first loss value of any two captured images.

Wherein, for the time-lapse photography, the larger the difference of the image data is, the larger the jitter is, and therefore, the larger the difference of the image data is, the larger the first loss value can be determined to be; the larger the difference between the serial numbers of the two shot images is, the larger the difference between the two shot images is, and the larger the first loss value is.

Therefore, when calculating the first loss value, a first sub-loss value may be first calculated according to the pose data of the previous frame captured image and the pose data of the next frame captured image in any two frames of captured images, where the first sub-loss value represents the influence of the change of the pose data on the first loss value; calculating a second sub-loss value from image data of a previous frame captured image and image data of a next frame captured image in any two frames of captured images, the second sub-loss value representing an influence of a change in the image data on the first loss value, for example, calculating an image similarity between the image data of the previous frame captured image and the image data of the next frame captured image to determine the second sub-loss value, wherein the greater the similarity, the smaller the second sub-loss value; calculating a third sub-loss value according to the sequence number of the shot image of the previous frame and the sequence number of the shot image of the next frame in any two frames of shot images, wherein the third sub-loss value represents the influence of the frame number interval between different frames on the first loss value, and for example, the larger the difference value between the sequence number of the shot image of the previous frame and the sequence number of the shot image of the next frame is, the larger the third sub-loss value is; or, determining a third sub-loss value according to the absolute value of the difference value between the sequence number of the previous frame of shot image and the sequence number of the next frame of shot image and the difference value between the preset frame selection interval, wherein the smaller the difference value between the absolute value of the difference value and the preset frame selection interval is, the smaller the third sub-loss value is, so that the frame selection interval is not too large or too small, and the size of the target video is ensured to meet the requirement. Finally, the processor 20 calculates a first loss value according to the first sub-loss value, the second sub-loss value and the third sub-loss value, for example, the first loss value is the sum of the first sub-loss value, the second sub-loss value and the third sub-loss value.

In one example, the first loss value is calculated as follows: cost ═ alpha Cost_gyro+βCost_image+γCost_speedWhere Cost represents a first loss value, Cost_gyroRepresenting a first sub-loss value, Cost_imageRepresenting a second sub-loss value, Cost_speedAnd the third sub-loss value is expressed, alpha, beta and gamma are all weights, and can be set according to the importance of different sub-loss values, for example, the first sub-loss value corresponding to the attitude data can be set to be larger, so that the attitude data between any two continuous frames in the finally generated target video is ensured to be less changed.

Wherein, when calculating the first sub-loss value, the attitude angle of the previous frame image and the next frame image can be selected according to any two frames imageAnd calculating a first sub-loss value by using a first difference value of the attitude angles of the shot images and a second difference value of the attitude angle of the shot image of the next frame and the average value of the attitude angles of the shot images of all the first preset frames. For example, the calculation formula of the first sub-loss value is as follows: cost_gyro＝μCost_posture+vCost_std(ii) a Among them, Cost_postureThe relative position loss value represents the difference of the posture data of the front frame and the back frame, and the smaller the difference of the posture data is, the more stable the image change of the front frame and the back frame is; cost_stdThe attitude variance loss value represents the difference between the attitude data of the next frame of captured image and the average of the attitude data of all captured images, and the smaller the attitude variance loss value is, the more the attitude data of the final selected frame is represented by the average attitude data.

The specific formula for calculating the relative position loss value is as follows: cost_posture＝abs((∑_iΔθ_x)²+(∑_iΔθ_y)²+ i Δ θ z2-abs (j Δ θ x2+ j Δ θ y2+ j Δ θ z2), the specific formula for calculating the attitude variance loss value is as follows: coststd ═ abs ((∑ e)_iΔθ_x-avg(Δθ_x))²+(∑_iΔθ_y-avg(Δθ_y))²+(∑_iΔθ_z-avg(Δθ_z))²) Wherein, Δ θ_x,Δθ_y,Δθ_zPitch, roll and yaw, respectively, abs indicates the absolute value of the result taken in the back brackets.

It can be understood that the calculation of the first loss value covers the calculation of the loss value between any two frames of shot images, and the calculation of the second loss value is basically the same as the calculation principle of the first loss value, only the previous frame of shot image when the first loss value is calculated is replaced by the selected frame, which is not described herein again.

Referring to fig. 2, 3 and 6, in some embodiments, step 01122 includes:

01125: identifying matched feature points in the previous frame of shot image and the next frame of shot image;

01126: calculating the offset between the matched feature points; and

01127: and calculating a second sub-loss value according to the offset.

In certain embodiments, the second generation module 13 is configured to perform step 01125, step 01126, and step 01127. Namely, the calculation module 11 is further configured to identify matched feature points in the previous frame of captured image and the next frame of captured image; calculating the offset between the matched feature points; and calculating a second sub-loss value according to the offset.

In certain embodiments, processor 20 is configured to perform step 01125, step 01126, and step 01127. That is, the processor 20 is configured to identify matching feature points in the previous captured image and the next captured image; calculating the offset between the matched feature points; and calculating a second sub-loss value according to the offset.

Specifically, when the second sub-loss value is calculated, matched feature points in the previous frame of captured image and the next frame of captured image can be identified, and the matched feature points correspond to the same acquisition point in the captured scene (the acquisition point is a part of the captured scene). The amount of shift between the matched feature points can be determined from the shift between the matched feature points (e.g., the difference between the image coordinates of the two matched feature points in the previous frame captured image and the next frame captured image), and thus the second sub-loss value is calculated from the amount of shift.

In other embodiments, matching feature points in any two adjacent captured images can be identified in advance, so as to determine an offset between the matching feature points between any two adjacent frames, and then a first offset of the feature points matching the previous captured image and the first captured image and a second offset of the feature points matching the next captured image and the first captured image can be determined according to the offset; and thus a second sub-loss value is calculated from the first offset amount and the second offset amount, the larger the difference between the first offset amount and the second offset amount is, the larger the second sub-loss value is.

For example, if the previous captured image is the 3 rd frame and the subsequent captured image is the 5 th frame, the first offset may be obtained according to the offset of the feature point matched with the first frame and the second frame and the sum of the offsets of the feature point matched with the second frame and the third frame, and the second offset may be obtained according to the offset of the feature point matched with the first frame and the second frame, the offset of the feature point matched with the second frame and the third frame, the offset of the feature point matched with the third frame and the fourth frame, and the sum of the offsets of the feature point matched with the fourth frame and the fifth frame, so that the difference between the offsets of any two frames of captured images can be obtained only by calculating the offsets of any two consecutive captured images in advance, and the calculation amount is small.

Referring to fig. 7, the present embodiment further provides a non-volatile computer-readable storage medium 200 containing a computer program 201. The computer program 201, when executed by the one or more processors 20, causes the one or more processors 20 to perform the video generation method of any of the embodiments described above.

Referring to fig. 1, for example, the computer program 201, when executed by the one or more processors 20, causes the processors 20 to perform the following video generation method:

Referring to fig. 4, for another example, the computer program 201, when executed by the one or more processors 20, causes the processors 20 to perform the following video generation method:

In the description herein, references to the description of "certain embodiments," "in one example," "exemplary," etc., mean that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the application. In this specification, schematic representations of the above terms do not necessarily refer to the same embodiment or example. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples. Furthermore, various embodiments or examples and features of different embodiments or examples described in this specification can be combined and combined by one skilled in the art without contradiction.

Any process or method descriptions in flow charts or otherwise described herein may be understood as representing modules, segments, or portions of code which include one or more executable instructions for implementing specific logical functions or steps of the process, and the scope of the preferred embodiments of the present application includes other implementations in which functions may be executed out of order from that shown or discussed, including substantially concurrently or in reverse order, depending on the functionality involved, as would be understood by those reasonably skilled in the art of the present application.

Although embodiments of the present application have been shown and described above, it is to be understood that the above embodiments are exemplary and not to be construed as limiting the present application, and that changes, modifications, substitutions and alterations can be made to the above embodiments by those of ordinary skill in the art within the scope of the present application.

Claims

1. A method of video generation, comprising:

calculating a loss value between any two frames of shot images according to the attitude data of the shot images with a first preset frame number, wherein the larger the difference of the attitude data of the shot images with any two frames is, the larger the loss value is;

selecting the shot image corresponding to the minimum loss value as a selected frame from the shot images of a second preset frame number after the first preset frame number, wherein the second preset frame number is smaller than the first preset frame number;

acquiring an image with the minimum loss value between the selected frame and multiple frames of the shot images before the selected frame as a next selected frame;

and circularly executing the step of taking the image with the minimum loss value with the selected frame in the multiple frames of shot images before the selected frame is obtained as the next selected frame so as to generate a target video according to the obtained multiple selected frames.

2. The video generation method of claim 1, further comprising:

and when the frame number of the shot image before the selected frame is smaller than a second preset frame number threshold value, stopping continuously executing the step of obtaining the image with the minimum loss value with the selected frame in the multiple frames of shot images before the selected frame as the next selected frame.

3. The video generation method according to claim 1, wherein the loss value includes the first loss value and a second loss value, and the calculating of the loss value between any two frames of the captured images based on the pose data of the captured images of the first preset number of frames includes:

arranging the shot images of the first preset frame number according to the shooting time sequence to determine the serial number of each frame of the shot images;

calculating a first loss value according to the attitude data of the shot image of the previous frame and the attitude data of the shot image of the next frame in any two frames of shot images;

when the selected frame does not exist after the shooting image of the next frame, calculating a second loss value according to the attitude data of the shooting image of the next frame and the attitude data of the last frame in the shooting image of the second preset frame number;

when the selected frame exists after the shooting image of the next frame, calculating the second loss value according to the attitude data of the shooting image of the next frame and the attitude data of the first selected frame after the shooting image of the next frame;

determining the loss value based on the first loss value and the second loss value.

4. The video generation method according to claim 1, wherein the loss value includes the third loss value and a fourth loss value, and the calculating of the loss value between any two frames of the captured images from the pose data of the captured images of the first preset number of frames includes:

calculating a third loss value according to the attitude data of the shot image in the previous frame and the attitude data of the shot image in the next frame in any two frames of shot images; and

calculating a fourth loss value according to the attitude data of the next frame of the shot images and the attitude data of the selected frame with the smallest difference value with the serial number of the next frame of the shot images in any two frames of the shot images;

determining the loss value according to the third loss value and the fourth loss value.

5. The video generation method according to claim 3 or 4, wherein the calculating the first loss value from the attitude data of the captured image of the previous frame and the attitude data of the captured image of the next frame in any two frames of the captured images includes:

calculating a first sub-loss value according to the attitude data of the shot image of the previous frame and the attitude data of the shot image of the next frame in any two frames of shot images;

calculating a second sub-loss value according to image data of a previous frame of the shot image and image data of a next frame of the shot image in any two frames of the shot images;

calculating a third sub-loss value according to the sequence number of the shot image in the previous frame and the sequence number of the shot image in the next frame in any two frames of shot images;

calculating the first loss value according to the first sub-loss value, the second sub-loss value and the third sub-loss value.

6. The video generation method according to claim 5, wherein the captured image of a previous frame includes the selected frame.

7. The video generation method according to claim 5, wherein the attitude data includes an attitude angle, and the calculating of the first sub-loss value from the attitude data of the preceding frame of the captured image and the attitude data of the succeeding frame of the captured image in any two frames of the captured images includes:

and calculating the first sub-loss value according to a first difference value between the attitude angle of the shot image in the previous frame and the attitude angle of the shot image in the next frame and a second difference value between the attitude angle of the shot image in the next frame and the average value of the attitude angles of the shot images in all the first preset frames in any two frames of shot images.

8. The video generation method according to claim 5, wherein the calculating the second sub-loss value from image data of a preceding frame of the captured image and image data of a subsequent frame of the captured image in any two frames of the captured images includes:

identifying matched feature points in the shot image of the previous frame and the shot image of the next frame;

calculating the offset between the matched feature points; and

and calculating the second sub-loss value according to the offset.

9. The video generation method according to claim 5, wherein the calculating the second sub-loss value from image data of a preceding frame of the captured image and image data of a subsequent frame of the captured image in any two frames of the captured images includes:

identifying matched feature points in the shot images adjacent to any two frames;

calculating the offset between the matched feature points in the shot images adjacent to any two frames;

determining a first offset of a characteristic point of the shot image of the previous frame relative to a characteristic point of the shot image of the first frame and a second offset of the characteristic point of the shot image of the next frame relative to the characteristic point of the shot image of the first frame according to the offsets;

and calculating the second sub-loss value according to the first offset and the second offset.

10. A video generation apparatus, comprising:

the calculation module is used for calculating a loss value between any two frames of shot images according to the attitude data of the shot images with a first preset frame number, wherein the larger the difference of the attitude data of the shot images with any two frames is, the larger the loss value is;

a selecting module, configured to select, as a selected frame, a frame of the captured image with a minimum loss value from the captured images with a second preset frame number after the first preset frame number, where the second preset frame number is smaller than the first preset frame number;

the acquisition module is used for acquiring an image with the minimum loss value between the selected frame and multiple frames of the shot images before the selected frame as a next selected frame;

and the generating module is used for circularly executing the step of taking the image with the minimum loss value with the selected frame as the next selected frame in the plurality of frames of shot images before the selected frame is acquired so as to generate the target video according to the acquired selected frames.

11. An electronic device is characterized by comprising a processor, wherein the processor is used for calculating a loss value between any two frames of shot images according to attitude data of the shot images with a first preset number of frames, and the loss value is larger when the difference of the attitude data of any two frames of the shot images is larger; selecting a frame of the shot image with the minimum loss value as a selected frame from the shot images with a second preset frame number after the first preset frame number, wherein the second preset frame number is smaller than the first preset frame number; acquiring an image with the minimum loss value between the selected frame and multiple frames of the shot images before the selected frame as a next selected frame; and circularly executing the step of taking the image with the minimum loss value with the selected frame in the multiple frames of shot images before the selected frame is obtained as the next selected frame so as to generate a target video according to the obtained multiple selected frames.

12. A non-transitory computer-readable storage medium comprising a computer program which, when executed by a processor, causes the processor to perform the video generation method of any of claims 1-9.