WO2022246608A1

WO2022246608A1 - Method for generating panoramic video, apparatus, and mobile platform

Info

Publication number: WO2022246608A1
Application number: PCT/CN2021/095551
Authority: WO
Inventors: 李广; 王程昊; 徐斌
Original assignee: 深圳市大疆创新科技有限公司
Priority date: 2021-05-24
Filing date: 2021-05-24
Publication date: 2022-12-01

Abstract

A method for generating a panoramic video, an apparatus, and a mobile platform. The method comprises: obtaining multiple first images acquired by a photography apparatus of a mobile platform photographing a target object from different angles, the first images being photographed while the mobile platform moves around the target object along a preset trajectory; preprocessing the first images to obtain second images, wherein the preprocessing produces a smooth transition from the last-frame first image to the first-frame first image among the multiple first images; and generating a panoramic video on the basis of the multiple second images. Because the images are photographed using the photography apparatus of the mobile platform, the photographing trajectory is more flexible, and a high-quality panoramic video with a smooth transition from its end to its beginning can be generated.

Description

Method, device and mobile platform for generating panoramic video

Technical field

The present application relates to the technical field of data processing, and more specifically relates to a method, device and mobile platform for generating panoramic video.

Background art

A surround panoramic video is a type of panoramic video that can freely display a subject through 360° around the subject, with the viewing angle chosen arbitrarily by the user. The traditional shooting method is to fix the subject in place, take several pictures around it with a shooting device, and fuse the pictures through post-processing into a full-field-of-view image or a panoramic video with similar functions.

Traditional surround panoramic videos are generally shot using fixed tracks or multiple shooting devices at fixed positions, which greatly restricts the versatility of this type of panoramic video shooting. Moreover, the first and last frames of the panoramic video cannot be connected, so the result is presented as a fixed, single video that lacks an interactive look and feel.

Summary of the invention

This summary introduces a series of concepts in simplified form, which are described in further detail in the detailed description. The summary is not intended to identify the key features or essential technical features of the claimed technical solution, nor to determine the protection scope of the claimed technical solution.

The first aspect of the embodiment of the present invention provides a method for generating panoramic video, including:

obtaining multiple frames of first images obtained by a shooting device of a movable platform shooting a target object from different angles, the first images being taken while the movable platform runs around the target object along a preset trajectory;

preprocessing the first images to obtain second images, wherein the preprocessing is used to produce a smooth transition between the last-frame first image and the first-frame first image among the multiple frames of first images; and

generating a panoramic video based on multiple frames of the second images.

The second aspect of the embodiment of the present invention provides a method for generating panoramic video, including:

obtaining multiple frames of images obtained by a shooting device of a movable platform shooting a target object from different angles, the images being taken while the movable platform runs around the target object along a preset trajectory, wherein the displacement between the start and the end of the preset trajectory is greater than or equal to a first preset threshold; and

generating a panoramic video based on the multiple frames of images, wherein, in the panoramic video, the image displacement between the first frame and the last frame is less than or equal to a second preset threshold.

The third aspect of the embodiment of the present invention provides a device for generating a panoramic video, the device comprising:

memory for storing executable instructions;

a processor, configured to execute the instructions stored in the memory, so that the processor performs the following steps:

A panoramic video is generated based on multiple frames of the second image.

The fourth aspect of the embodiment of the present invention provides a device for generating a panoramic video, the device comprising:

memory for storing executable instructions;

The fifth aspect of the embodiment of the present invention provides a mobile platform, including:

Movable platform body;

A photographing device, mounted on the movable platform body, for photographing the target object;

and the device for generating a panoramic video as described above, wherein the device for generating a panoramic video is communicatively connected to the shooting device and is configured to generate a panoramic video based on the images captured by the shooting device.

A sixth aspect of the embodiment of the present invention provides a computer storage medium, on which a computer program is stored, and when the program is executed by a processor, the above-mentioned method for generating a panoramic video is implemented.

With the method, device and movable platform for generating a panoramic video according to the embodiments of the present invention, images are captured by the photographing device of the movable platform, so the shooting trajectory is more flexible and a high-quality panoramic video with a smooth transition between its end and its beginning can be generated.

Description of the drawings

In order to illustrate the technical solutions in the embodiments of the present invention more clearly, the drawings used in the description of the embodiments are briefly introduced below. Obviously, the drawings in the following description show only some embodiments of the present invention. Those skilled in the art can obtain other drawings based on these drawings without any creative effort.

In the drawings:

Fig. 1 shows a schematic flowchart of a method for generating a panoramic video according to an embodiment of the present invention;

Fig. 2 shows a schematic flowchart of preprocessing a first image according to an embodiment of the present invention;

Fig. 3 shows a schematic flowchart of a method for generating a panoramic video according to another embodiment of the present invention;

Fig. 4 shows a schematic block diagram of an apparatus for generating a panoramic video according to an embodiment of the present invention.

Detailed description

In order to make the objects, technical solutions and advantages of the present invention more apparent, exemplary embodiments according to the present invention are described in detail below with reference to the accompanying drawings. Obviously, the described embodiments are only some embodiments of the present invention rather than all of them, and it should be understood that the present invention is not limited by the exemplary embodiments described here. Based on the embodiments described herein, all other embodiments obtained by those skilled in the art without creative effort shall fall within the protection scope of the present invention.

In the following description, numerous specific details are given in order to provide a more thorough understanding of the present invention. It will be apparent, however, to one skilled in the art that the present invention may be practiced without one or more of these details. In other instances, some technical features known in the art are not described, in order to avoid obscuring the present invention.

It should be understood that the invention can be embodied in different forms and should not be construed as limited to the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the invention to those skilled in the art.

The terminology used herein is for the purpose of describing particular embodiments only and is not intended to limit the invention. As used herein, the singular forms "a", "an" and "the" are intended to include the plural forms as well, unless the context clearly dictates otherwise. It should also be understood that the terms "consists of" and/or "comprising", when used in this specification, specify the presence of stated features, integers, steps, operations, elements and/or parts, but do not exclude the presence or addition of one or more other features, integers, steps, operations, elements, parts and/or groups. As used herein, the term "and/or" includes any and all combinations of the associated listed items.

In order to thoroughly understand the present invention, detailed steps and detailed structures will be provided in the following description, so as to explain the technical solution proposed by the present invention. Preferred embodiments of the present invention are described in detail below, however, the present invention may have other embodiments besides these detailed descriptions.

The method, device and mobile platform for generating a panoramic video of the present application will be described in detail below with reference to the accompanying drawings. If there is no conflict, the features in the following embodiments and implementations can be combined with each other.

For a traditional surround panoramic video, a smooth transition between the last frame and the first frame of the video often places strict requirements on the motion trajectory, so the traditional shooting method often needs a fixed track arranged around the subject. This makes shooting extremely inconvenient, especially for subjects such as large outdoor buildings, around which a fixed track is impractical. Shooting along a fixed track has a further problem: the user cannot customize the shooting range, which seriously degrades the user experience of subsequent products and makes interaction poor.

However, if a fixed track is not used for shooting, the motion trajectory of the shooting device fluctuates considerably and the resulting panoramic video shakes noticeably, which seriously affects the visual experience. The uncertainty of the trajectory also prevents the end of such a panoramic video from connecting to its beginning, so it can only be presented as a single video that lacks an interactive look and feel.

In view of the above problems, the method for generating a panoramic video in the embodiment of the present invention shoots with the shooting device of a movable platform, which makes it convenient for users to customize the shooting trajectory; the movable platform can also follow a trajectory closer to the ideal one, which facilitates subsequent image processing. In addition, the method preprocesses the image sequence before generating the panoramic video from the images, so that the transition between the last and first frame images is smooth and the end of the generated panoramic video connects to its beginning, yielding a high-quality panoramic video.

In the following, the method for generating a panoramic video provided by an embodiment of the present invention is first described with reference to Fig. 1. Fig. 1 shows a flowchart of a method 100 for generating a panoramic video according to an embodiment of the present invention. As shown in Fig. 1, the method 100 includes the following steps:

In step S110, multiple frames of first images obtained by the shooting device of the movable platform shooting the target object from different angles are obtained, the first images being taken while the movable platform runs around the target object along a preset trajectory;

In step S120, the first images are preprocessed to obtain second images, wherein the preprocessing is used to produce a smooth transition between the last-frame first image and the first-frame first image among the multiple frames of first images;

In step S130, a panoramic video is generated based on multiple frames of the second images.

The method 100 for generating a panoramic video in the embodiment of the present invention may be executed by the processor of the movable platform or by another electronic device communicatively connected to the movable platform. When the method 100 is executed by the processor of the movable platform, the processor may perform the above processing in real time on the images captured by the photographing device of the movable platform to obtain the panoramic video. When the method 100 is executed by another device, the first images captured by the shooting device may be transmitted to that device through the communication system of the movable platform, and that device performs the above processing to obtain the panoramic video.

Wherein, the movable platform may include an aircraft (such as a drone), a robot, an unmanned vehicle, an unmanned ship, and the like. Taking a UAV as an example, it may include one or more power units for providing power for the UAV to fly in the air. One or more power units can make the drone move in one or more degrees of freedom, so that the drone can fly around the target object, and during the flight, ensure that its shooting device is always facing the shooting target. Compared with fixed track devices such as rotating object trays, the shooting track is more flexible when shooting with mobile platform shooting devices such as drones. Since there is a certain range of motion in the process of shooting with a movable platform, the embodiment of the present invention preprocesses the images to make a smooth transition between each frame of images, thereby obtaining a high-quality panoramic video.

In step S110, a preset trajectory around the target object is first determined, and the movable platform is controlled to run around the target object along the preset trajectory, so that it starts from a certain point, completes one circuit around the target object, and returns to the starting point. The motion of the movable platform should be as smooth and stable as possible. There are no restrictions on the height, shape or radius of the preset trajectory; it is only necessary that the start and end positions of the trajectory be substantially the same, so that the beginning and the end of the trajectory connect smoothly. Exemplarily, the preset trajectory may be a circle or an ellipse.

In some embodiments, a preset trajectory can be generated according to user instructions, and the movable platform can be controlled to run around the target object along the preset trajectory. Since the method 100 for generating a panoramic video in the embodiment of the present invention shoots with the shooting device of the movable platform, it is convenient for the user to customize the shooting trajectory: the user can adjust the height, shape and radius of the trajectory, the running speed of the movable platform, and so on according to actual needs. While the movable platform runs around the target object along the preset trajectory, the shooting device is controlled to shoot the target object at a preset frame rate, so as to obtain first images taken from various angles around the target object.
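As an illustration of a user-defined circular or elliptical preset trajectory, the following minimal sketch generates closed-loop waypoints around a target. The function name `elliptical_orbit`, its parameters, and the waypoint layout `(x, y, z, yaw)` are hypothetical and are not taken from this application; a real flight controller would impose its own interface.

```python
import math

def elliptical_orbit(center, semi_major, semi_minor, height, num_points):
    """Return waypoints (x, y, z, yaw) on a closed ellipse around `center`.

    yaw points the camera at the target. The last waypoint repeats the first,
    so the start and end positions of the trajectory coincide.
    """
    cx, cy = center
    waypoints = []
    for i in range(num_points + 1):              # +1 closes the loop
        theta = 2.0 * math.pi * (i % num_points) / num_points
        x = cx + semi_major * math.cos(theta)
        y = cy + semi_minor * math.sin(theta)
        yaw = math.atan2(cy - y, cx - x)         # heading from platform to target
        waypoints.append((x, y, height, yaw))
    return waypoints

# A circle is the special case semi_major == semi_minor.
orbit = elliptical_orbit(center=(0.0, 0.0), semi_major=10.0, semi_minor=10.0,
                         height=5.0, num_points=36)
```

Repeating the first waypoint at the end reflects the requirement that the start and end positions be substantially the same; in practice the platform will only return approximately to the start, which is what the preprocessing described below compensates for.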

In other embodiments, the preset trajectory may also be automatically determined based on one or more of the target object's type, size, shape or movement characteristics. The type, size, shape and movement characteristics of the target object can be obtained through image analysis or can be input by the user. For example, during the operation of the movable platform, it is possible to detect in real time the target object that meets the preset shooting requirements according to the shooting picture of the shooting device, determine the type, size, shape and movement characteristics of the target object, and generate a preset trajectory according to the above characteristics.

Exemplarily, the shooting device can be controlled to start shooting when the movable platform reaches the starting point of the preset trajectory and to stop shooting when the movable platform reaches the end point of the preset trajectory. Alternatively, the shooting device can be controlled to shoot continuously, and the images taken while the movable platform completes one circuit along the preset trajectory are extracted as the first images to be processed.

Afterwards, in step S120, the first images are preprocessed to obtain second images, wherein the preprocessing is used to produce a smooth transition between the last-frame first image and the first-frame first image among the multiple frames of first images, so as to obtain a shooting effect in which the beginning and the end of the preset trajectory appear seamlessly connected. The preprocessing at least reduces the difference between the last-frame first image and the first-frame first image, so that no obvious seam appears in the transition area between the last and first frames when the user adjusts the viewing angle. Further, the preprocessing in step S120 produces a smooth transition between all frames of first images, so that the differences between consecutive first images are evenly distributed, approximating the effect of a shooting device that moves at a constant speed and shoots smoothly and stably throughout.

Since there is a mapping relationship between the pose of the shooting device and the first image it captures, the difference between the first and last frame images can be reduced by reducing the difference between the poses of the shooting device corresponding to the first and last frame images and then mapping the adjusted poses to the images. Therefore, referring to Fig. 2, in one embodiment, preprocessing the first images includes:

In step S210, the actual displacement between the poses of the photographing device corresponding to the first images of adjacent frames is obtained;

In step S220, the actual displacement is adjusted to obtain the target displacement, wherein the first target displacement between the last-frame first image and the first-frame first image is smaller than the first actual displacement between the last-frame first image and the first-frame first image;

And, in step S230, the first images are mapped according to the target displacement to obtain the second images.

Since there is a correspondence between the pose of the photographing device and the images it collects, as one implementation, image matching and motion estimation methods can be used in step S210 to obtain the actual displacement between the poses of the photographing device corresponding to the first images of adjacent frames. Specifically, feature points are first extracted from each first image, and the feature points in the first images of adjacent frames are then matched to obtain matching feature point pairs. The actual displacement between the first images of adjacent frames is then obtained from the matching feature point pairs.

Exemplarily, the SIFT (Scale-Invariant Feature Transform) algorithm may be used to detect SIFT feature points in the first image, which locates local features in the image. The SIFT feature points detected by the algorithm are scale and rotation invariant: they are not affected by the size or orientation of the image and are only slightly affected by lighting, noise and the like. After the SIFT feature points are extracted, each SIFT feature point can be compared in feature space with several neighborhood points in the adjacent frame, and the neighborhood point with the highest matching degree is found by a matching algorithm. A SIFT feature point and its neighborhood point with the highest matching degree are called a matching feature point pair. Exemplarily, a k-d tree algorithm may be used to find the neighborhood point with the highest matching degree.
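The nearest-neighbor search over feature descriptors can be sketched as follows. This is a minimal illustration, assuming descriptors are plain tuples of floats and that the "highest matching degree" means the smallest Euclidean distance; the function names `build_kd_tree`, `nearest` and `match_features` are hypothetical. Descriptor extraction itself (for example SIFT) is not shown.

```python
import math

def build_kd_tree(points, depth=0):
    """Build a k-d tree over (descriptor, index) pairs; returns nested tuples."""
    if not points:
        return None
    axis = depth % len(points[0][0])
    points = sorted(points, key=lambda p: p[0][axis])
    mid = len(points) // 2
    return (points[mid],
            build_kd_tree(points[:mid], depth + 1),
            build_kd_tree(points[mid + 1:], depth + 1),
            axis)

def nearest(node, query, best=None):
    """Return (distance, index) of the stored descriptor closest to `query`."""
    if node is None:
        return best
    (desc, idx), left, right, axis = node
    dist = math.dist(desc, query)
    if best is None or dist < best[0]:
        best = (dist, idx)
    diff = query[axis] - desc[axis]
    near, far = (left, right) if diff < 0 else (right, left)
    best = nearest(near, query, best)
    if abs(diff) < best[0]:          # the far side may still hold a closer point
        best = nearest(far, query, best)
    return best

def match_features(desc_prev, desc_next):
    """Pair each descriptor of the previous frame with its nearest neighbour."""
    tree = build_kd_tree([(d, j) for j, d in enumerate(desc_next)])
    return [(i, nearest(tree, d)[1]) for i, d in enumerate(desc_prev)]

# Toy 2-D descriptors; real SIFT descriptors have 128 dimensions.
prev_desc = [(0.0, 0.0), (1.0, 1.0), (5.0, 5.0)]
next_desc = [(5.1, 4.9), (0.1, -0.1), (0.9, 1.2)]
pairs = match_features(prev_desc, next_desc)
```

In practice, matches are usually filtered further (for instance by rejecting ambiguous or outlying pairs) before they are used for motion estimation; that filtering is outside the scope of this sketch.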

Further, obtaining the actual displacement between the first images of adjacent frames according to the matching feature point pairs includes: performing nonlinear optimization according to the matching feature point pairs to obtain a second homography matrix between the first images of adjacent frames; and obtaining the actual displacement between the first images of adjacent frames according to the second homography matrix and the intrinsic parameter matrix of the shooting device. The homography matrix describes the mapping relationship between planes, here the mapping relationship between the first images of adjacent frames. To distinguish it from the first homography matrix described later, the homography matrix describing the mapping relationship between the first images of adjacent frames is called the second homography matrix, and the homography matrix describing the mapping relationship between a first image and the corresponding second image is called the first homography matrix.

Afterwards, from the mapping relationship between the first images of adjacent frames, the mapping relationship between the corresponding poses of the photographing device can be obtained, that is, the actual displacement between the poses of the photographing device corresponding to the first images of adjacent frames. The second homography matrix may be obtained by nonlinear optimization, or by another optimization algorithm, from the coordinates of multiple matching feature point pairs. After the actual displacement between the poses corresponding to adjacent first images is obtained, the actual displacement of each first image relative to the first-frame first image can be obtained by accumulation, which yields the displacement distribution curve of the multiple frames of first images.
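To make the role of the homography concrete, the sketch below estimates a 3x3 homography from exactly four matching point pairs by solving a linear system with the last entry fixed to 1. This is a swapped-in simplification, not the nonlinear optimization described above; such a closed-form estimate could serve, for example, as a starting point for a subsequent refinement over many pairs. All function names are hypothetical.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x

def homography_from_pairs(pairs):
    """Estimate H (3x3, h33 = 1) from four ((x, y), (u, v)) pairs.

    (x, y) is a point in the previous frame and (u, v) its match in the next
    frame. No three of the four points may be collinear.
    """
    a, b = [], []
    for (x, y), (u, v) in pairs:
        a.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        b.append(u)
        a.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        b.append(v)
    h = solve_linear(a, b) + [1.0]
    return [h[0:3], h[3:6], h[6:9]]

def apply_homography(h, point):
    """Map a 2D point through H using homogeneous coordinates."""
    x, y = point
    w = h[2][0] * x + h[2][1] * y + h[2][2]
    return ((h[0][0] * x + h[0][1] * y + h[0][2]) / w,
            (h[1][0] * x + h[1][1] * y + h[1][2]) / w)

# Four corners of a unit square shifted by (2, 3): H is a pure translation.
pairs = [((0, 0), (2, 3)), ((1, 0), (3, 3)), ((0, 1), (2, 4)), ((1, 1), (3, 4))]
H = homography_from_pairs(pairs)
```

With noisy matches from real images, many more than four pairs would be used together with a robust or iterative estimator; the four-pair solve only shows how point correspondences constrain the matrix.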

Exemplarily, the actual displacement between the first images of adjacent frames can be calculated according to the following formulas (1) and (2):

$$P_t = H P_{t-1} \tag{2}$$

where:

H represents the second homography matrix between the first images of adjacent frames; H relates the 2D homogeneous coordinates of the same 3D space point in the image planes of the adjacent first images;

R represents the rotation matrix between the first images of adjacent frames;

T represents the translation between the first images of adjacent frames;

K_1 represents the intrinsic parameter matrix of the shooting device, which is fixed and depends on the focal length and the optical center;

n^T represents the unit normal vector of the object plane relative to the first frame;

P_{t-1} represents the coordinates of a feature point in the first image of the previous frame;

P_t represents the coordinates of the matching point in the first image of the next frame.

Specifically, the second homography matrix H is first obtained according to formula (2) from the coordinates P_{t-1} and P_t of the matching feature point pairs, using a nonlinear optimization method; then, according to formula (1), the rotation matrix R and the translation T between the first images of adjacent frames are obtained from the second homography matrix H, the known intrinsic parameter matrix K_1, and the unit normal vector n^T.
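Formula (1) is not reproduced in this text. For orientation only, the commonly used plane-induced homography relation built from the same symbols is sketched below. The plane distance d is an assumption that does not appear in the symbol list above (the original formula may absorb it into T), and the sign of the translation term depends on the chosen direction of the plane normal, so this should be read as an assumed form rather than as the formula of this application.

```latex
% Assumed form of formula (1): homography induced by a plane with unit normal n
% at distance d from the camera (d is not defined in the source text).
H = K_1 \left( R + \frac{T\, n^{T}}{d} \right) K_1^{-1}

% Formula (2), as given in the text:
P_t = H\, P_{t-1}

% Under this assumed form, a pure rotation (T = 0) reduces to
H = K_1 R K_1^{-1} \quad\Longrightarrow\quad R = K_1^{-1} H K_1
```

Under this form, once H has been estimated from point pairs and K_1 is known, the bracketed term K_1^{-1} H K_1 can be decomposed into R and the scaled translation T/d.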

When calculating the actual displacement, both the translation model and the rotation model can be considered, so that the actual rotation angle and the actual translation amount are both calculated; alternatively, only the translation model or only the rotation model can be considered, so that only one of the two is calculated. For example, if the focal length of the photographing device is short, only the actual rotation angle R may be calculated and the actual translation amount T ignored; if the focal length is infinite, only the actual translation amount T may be calculated and the actual rotation angle R ignored. Correspondingly, when the actual displacement is adjusted to obtain the target displacement, the actual rotation angle is adjusted to obtain the target rotation angle if the rotation model is considered, and the actual translation amount is adjusted to obtain the target translation amount if the translation model is considered.

It should be noted that the above feature point detection method and image registration method are only examples and are not limiting; any suitable feature point detection method and image registration method can be applied in the method 100 for generating a panoramic video according to the embodiment of the present invention.

In another embodiment, the actual displacement can be obtained from the inertial measurement data measured by the inertial measurement device of the movable platform while the first images are captured. Specifically, the pose of the photographing device can be obtained from the inertial measurement data measured by the inertial measurement device mounted on the movable platform, and the actual displacement of the pose of the photographing device between the first images of adjacent frames can then be determined from the poses corresponding to the individual first images. The inertial measurement device includes, but is not limited to, an accelerometer and a gyroscope. The gyroscope measures the angular velocity of the shooting device about each axis, and the accelerometer measures the linear acceleration of the shooting device along each axis. Integrating the angular velocity signal measured by the gyroscope over time gives attitude information such as the instantaneous direction of motion and the inclination angle; integrating the acceleration signal measured by the accelerometer over time gives the velocity of the shooting device. From these, the pose of the shooting device at each moment during image acquisition can be obtained.

Exemplarily, whether the actual displacement is calculated by image matching or from inertial measurement data, the actual displacements in different dimensions may be calculated separately. For example, for the actual translation amount, the translation amounts in the x, y and z directions are calculated separately; for the actual rotation angle, the rotation angles about the three rotation axes, pitch (p), yaw (y) and roll (r), are calculated separately. Subsequently, when the actual displacement is adjusted to obtain the target displacement, the actual displacements in the different dimensions are likewise adjusted separately, so as to obtain the target displacements in the different dimensions.
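The time integration described above can be sketched for a single axis as follows. This is a simplified illustration under stated assumptions: samples arrive at a fixed interval `dt`, sensor bias and gravity have already been removed, and rectangle-rule integration is accurate enough. The function names and the sample layout are hypothetical.

```python
def integrate_imu(samples, dt):
    """Integrate single-axis gyroscope and accelerometer samples.

    samples: list of (angular_rate_rad_s, linear_accel_m_s2) pairs
    returns: list of (angle_rad, velocity_m_s, position_m) after each sample
    """
    angle = velocity = position = 0.0
    poses = []
    for rate, accel in samples:
        angle += rate * dt           # angular velocity -> attitude angle
        velocity += accel * dt       # acceleration -> velocity
        position += velocity * dt    # velocity -> position
        poses.append((angle, velocity, position))
    return poses

def displacement_between(poses, i, j):
    """Actual (rotation, translation) displacement between samples i and j."""
    return (poses[j][0] - poses[i][0], poses[j][2] - poses[i][2])

# Ten samples at 10 Hz with constant angular rate and constant acceleration.
poses = integrate_imu([(0.1, 1.0)] * 10, dt=0.1)
rotation, translation = displacement_between(poses, 0, 9)
```

A full implementation would integrate all three axes, track orientation with rotation matrices or quaternions, and associate each image with the pose at its capture time; integration drift is one reason the image-based estimate may be preferred or combined with the inertial one.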

After the actual displacement is obtained, in step S220, the actual displacement between the first images in adjacent frames is adjusted, so as to obtain the target displacement between the first images in adjacent frames. Wherein, the actual displacement represents the actual displacement of the photographing device during the process of capturing the first image, and the target displacement represents the expected displacement of the photographing device, that is, assuming that the photographing device moves according to the target displacement, a smooth transition image can be collected; The purpose of the target displacement is to map the first image according to the target displacement, so as to obtain a smooth transition of the second image. Since the preset trajectory is unidirectional, the sudden change of the image is mainly between the first image of the last frame and the first image of the first frame, so the above adjustment reduces the displacement between the first image of the last frame and the first image of the first frame , so that the first target displacement between the first image in the last frame and the first image in the first frame is smaller than the first actual displacement between the first image in the last frame and the first image in the first frame.

Since the actual displacement between the first image of the adjacent frame is small, and the actual displacement between the first image of the last frame and the first image of the first frame is relatively large, when adjusting the actual displacement, the first image of the last frame can be adjusted to The actual displacement between one image and the first image of the first frame is allocated among the remaining frames of images, that is, in step S220, adjusting the actual displacement to obtain the target displacement includes: obtaining the first image of the last frame to the first frame of the first frame The deviation between the target displacement between the images and the actual displacement between the first image in the last frame and the first image in the first frame, and distribute the deviation among the first images in the remaining frames.

It can be understood that since the above adjustment reduces the displacement between the first image of the last frame and the first image of the first frame, at least one of the first image of the last frame and the first image of the first frame needs to share the deviation, that is to say , the above deviation must be distributed at least between the first image in the last frame and the first image in the previous frame, or between the first image in the first frame and the first image in the next frame. Preferably, both the first image in the last frame and the first image in the first frame need to share part of the deviation, so as to improve the smoothness of image transition.

Exemplarily, the method for distributing the deviation between the first and last frames to the first images of the remaining frames includes: linearly distributing the above deviation, so that on the basis of the actual displacement between the first images of each frame, increase Or reduce partial bias. Specifically, after calculating the deviation between the actual displacement and the target displacement between the first image of the last frame and the first image of the first frame, starting from the first image of a certain frame in the middle, moving towards the first image of the first frame and the first image of the last frame respectively. Motion compensation is performed on the first image of the frame, and a linear distribution method is adopted during the period, so that the actual displacement between the first images of every two adjacent frames increases or decreases part of the deviation, so as to complete the smooth transition between the first and last frames. For example, if the actual displacement between the first images of two adjacent frames is expressed as D ₁ , and the deviation allocated between the first images of the two adjacent frames is expressed as △D, then the two adjacent frames after the partial deviation are allocated The target displacement D ₂ =D ₁ +ΔD between the first images.
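One way to read the linear distribution is that every pair of adjacent frames receives an equal share ΔD of the deviation, so that the accumulated correction grows linearly with the frame index. The sketch below follows that reading for a single displacement dimension; the equal-share weighting, the function name `distribute_deviation` and the sign convention (deviation = actual minus target closing displacement) are assumptions made for illustration.

```python
def distribute_deviation(adjacent, closing_actual, closing_target):
    """Spread the last-to-first deviation evenly over adjacent-frame displacements.

    adjacent:       actual displacements D1 between consecutive frames
    closing_actual: actual displacement from the last frame back to the first
    closing_target: desired (smaller) displacement from the last frame to the first
    returns:        target displacements D2 = D1 + dD for every adjacent pair
    """
    deviation = closing_actual - closing_target
    share = deviation / len(adjacent)        # equal share per pair (assumed)
    return [d1 + share for d1 in adjacent]

# Four adjacent steps of 1.0 and a closing jump of 6.0 (loop total 10.0).
# Reducing the closing jump to 2.0 moves 4.0 onto the four adjacent steps.
d2 = distribute_deviation([1.0, 1.0, 1.0, 1.0], closing_actual=6.0,
                          closing_target=2.0)
```

The sum of the adjusted adjacent displacements plus the target closing displacement equals the original loop total, so the adjustment redistributes motion around the loop without changing its overall extent.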

In some embodiments, in order to further improve the smoothness of the transition between images, after the deviation between the first and last frames is distributed to the first image of each frame, a displacement distribution curve is obtained according to the actual displacement after the distribution of the deviation, the displacement distribution curve is smoothed, and the smoothed displacement distribution curve is differenced to obtain the target displacement between the first images of each frame. That is to say, the target displacements D₂ between the first images of adjacent frames are accumulated to obtain the displacement of the first image of each frame relative to the first image of the first frame, thereby obtaining the displacement distribution curve; the displacement distribution curve is then smoothed and differenced to obtain the final target displacement D_target. After the above smoothing process, the target motion trajectory of the shooting device is smoother and more stable.
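Exemplarily, the accumulate, smooth and difference sequence described above can be sketched as follows. The embodiment does not name a specific smoothing filter, so the centered moving average, the window size, and the choice to keep both endpoints of the curve fixed (so that the total displacement is unchanged) are assumptions made only for illustration; the function name smooth_displacements is hypothetical.

```python
from itertools import accumulate


def smooth_displacements(d2, window=3):
    """Accumulate D2 into a curve, smooth the curve, then difference it."""
    # Displacement of each frame relative to the first frame.
    curve = [0.0] + list(accumulate(d2))
    half = window // 2
    smoothed = []
    for i in range(len(curve)):
        lo, hi = max(0, i - half), min(len(curve), i + half + 1)
        smoothed.append(sum(curve[lo:hi]) / (hi - lo))  # moving average (assumed filter)
    # Assumption: pin both endpoints so the total displacement is preserved.
    smoothed[0], smoothed[-1] = curve[0], curve[-1]
    # Difference of the smoothed curve gives D_target for each adjacent pair.
    return [b - a for a, b in zip(smoothed, smoothed[1:])]


d2 = [1.0, 3.0, 1.0, 3.0]  # hypothetical jittery displacements
d_target = smooth_displacements(d2)
print(d_target)
```

In this sketch the sum of the returned values equals the sum of the input values, while the spread between the largest and smallest adjacent displacement becomes smaller.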

Since the panoramic video effect expected in the embodiment of the present invention requires a sufficiently close and stable transition between the last frame image and the first frame image, in some embodiments, after the multiple frames of first images are acquired, the method further includes: selecting the above-mentioned first image of the last frame from at least two frames of first images at the end according to a preset condition. That is, the first image of the last frame used to generate the panoramic video is not necessarily the last first image obtained during image acquisition, but the frame, among the last several frames collected, that is closest to the first image of the first frame. For example, assuming that a total of N frames of first images are acquired, if the first image of the (N-1)th frame satisfies the preset condition and is close enough to the first image of the first frame, the first image of the (N-1)th frame is used as the first image of the last frame and the first image of the Nth frame is discarded, thereby reducing the difference between the first image of the last frame and the first image of the first frame and further ensuring a smooth transition between the first and last frames.

In one embodiment, when the first image of the last frame is selected from the at least two frames of first images at the end according to the preset condition, each of the at least two frames of first images at the end can be matched with the first image of the first frame, and the frame with the largest number of matching feature point pairs with the first image of the first frame is used as the first image of the last frame. The matching can be performed according to the SIFT feature points extracted in step S210, so as to obtain the matching feature point pairs between each of the at least two frames of first images at the end and the first image of the first frame. The larger the number of matching feature point pairs, the higher the similarity between the images.

In another embodiment, the actual displacement between each of the at least two frames of first images at the end and the first image of the first frame can be obtained, and the frame with the smallest actual displacement is used as the first image of the last frame. Exemplarily, the actual displacement can be obtained by image matching or from inertial measurement data, or can be obtained by accumulating the actual displacements between the first images of adjacent frames obtained in step S210. The smaller the actual displacement, the higher the similarity between the images.
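Exemplarily, the two selection criteria described above can be sketched as follows. The sketch assumes that the number of matching feature point pairs, or the scalar actual displacement relative to the first frame, has already been computed for each candidate frame at the end; the function names, frame indices and values are hypothetical.

```python
def select_by_matches(match_counts):
    """match_counts: {frame index: matched feature point pairs with the first frame}."""
    return max(match_counts, key=match_counts.get)


def select_by_displacement(displacements):
    """displacements: {frame index: actual displacement relative to the first frame}."""
    return min(displacements, key=lambda idx: abs(displacements[idx]))


# Hypothetical candidates: the last three of 100 captured frames (indices 97..99).
match_counts = {97: 180, 98: 240, 99: 210}
displacements = {97: 3.1, 98: 0.8, 99: -1.9}

last = select_by_matches(match_counts)
frames = list(range(100))
kept = frames[: last + 1]  # frames captured after the selected last frame are discarded
print(last, select_by_displacement(displacements), len(kept))
```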

In some embodiments, it is also possible to retain the last captured first image as the first image of the last frame, and to select, from the first several frames of first images collected, the image with the highest similarity to the first image of the last frame as the first image of the first frame used to generate the panoramic video. For example, assuming that a total of N frames of first images are acquired, if the first image of the second frame is close enough to the first image of the Nth frame, the first image of the second frame is used as the first image of the first frame, and the originally captured first image of the first frame is discarded. The method for selecting the first image of the first frame is similar to the method for selecting the first image of the last frame described above, and will not be repeated here.

After the target displacement is obtained, in step S230, the first image is mapped according to the target displacement to obtain a second image. Exemplarily, a first homography matrix between the first image and the second image can be obtained according to the target displacement and the internal reference matrix of the shooting device; the first homography matrix describes the mapping relationship between the first image and the second image. Afterwards, the first image is mapped according to the first homography matrix to obtain the second image.

Wherein, the first homography matrix can be obtained according to the target displacement and the internal reference matrix by referring to formula (1).

After obtaining the first homography matrix, the first image may be mapped with reference to formula (2) to obtain the second image.
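Exemplarily, one common way of building a homography from a rotation and an internal reference matrix is sketched below. Formulas (1) and (2) of the embodiment are not reproduced in this passage, so the sketch does not claim to match them: it assumes the standard pure-rotation relation H = K·R·K⁻¹, a corrective rotation about the vertical axis only, and hypothetical intrinsic parameters fx, fy, cx and cy.

```python
import math


def matmul(a, b):
    """Multiply two 3x3 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def rotation_homography(fx, fy, cx, cy, yaw):
    """H = K * R * K^-1 for a rotation of `yaw` radians about the vertical axis."""
    k = [[fx, 0.0, cx], [0.0, fy, cy], [0.0, 0.0, 1.0]]
    k_inv = [[1.0 / fx, 0.0, -cx / fx], [0.0, 1.0 / fy, -cy / fy], [0.0, 0.0, 1.0]]
    c, s = math.cos(yaw), math.sin(yaw)
    r = [[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]]
    return matmul(matmul(k, r), k_inv)


def map_pixel(h, x, y):
    """Apply homography h to pixel (x, y) using homogeneous coordinates."""
    u, v, w = (h[i][0] * x + h[i][1] * y + h[i][2] for i in range(3))
    return u / w, v / w


h = rotation_homography(fx=800.0, fy=800.0, cx=320.0, cy=240.0, yaw=math.radians(2.0))
print(map_pixel(h, 320.0, 240.0))  # the principal point shifts horizontally
```

Under these assumptions, the principal point is moved horizontally by fx·tan(yaw) and keeps its vertical coordinate; mapping every pixel of the first image in this way yields the second image.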

Through the series of image processing steps described above, the shooting angles of the second images of the first and last frames are close enough to achieve a smooth transition between the images. Further, due to differences in light intensity, the brightness of the images may be uneven, so that the generated panoramic video appears alternately bright and dark. Therefore, after the above steps are performed, the preprocessing of the first image further includes: smoothing the brightness of the multiple frames of first images to reduce the brightness difference between the first image of the last frame and the first image of the first frame. Of course, in order to further improve the visual effect of the panoramic video, when the brightness of the multiple frames of first images is smoothed, not only is the brightness difference between the first image of the last frame and the first image of the first frame reduced, but the brightness of the first images of the remaining frames is also balanced.

For example, the illumination non-uniformity of an image can be corrected by the illumination model of the shooting device; then a histogram mapping table between two adjacent images can be established from the relationship between their overlapping areas, and the two images are mapped and transformed as a whole, so that overall brightness and color consistency is finally achieved.
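Exemplarily, a histogram mapping table of the kind mentioned above can be sketched as follows. The embodiment does not specify how the table is built, so the sketch assumes classical histogram matching based on cumulative distribution functions, 8-bit gray levels, and overlapping areas given as flat lists of pixel values; the function name histogram_mapping_table and the sample pixels are hypothetical.

```python
from bisect import bisect_left
from itertools import accumulate


def histogram_mapping_table(source, reference, levels=256):
    """Return a lookup table mapping source gray levels onto the reference distribution."""
    def cdf(pixels):
        hist = [0] * levels
        for p in pixels:
            hist[p] += 1
        return [c / len(pixels) for c in accumulate(hist)]

    src_cdf, ref_cdf = cdf(source), cdf(reference)
    # For each source level, take the first reference level whose CDF reaches the same quantile.
    return [min(bisect_left(ref_cdf, q), levels - 1) for q in src_cdf]


overlap_a = [10, 10, 20, 20]  # hypothetical overlap pixels of the darker image
overlap_b = [30, 30, 40, 40]  # hypothetical overlap pixels of the brighter image
table = histogram_mapping_table(overlap_a, overlap_b)
print([table[p] for p in overlap_a])  # [30, 30, 40, 40]
```

The table is computed only from the overlapping areas but can then be applied to every pixel of the image, which corresponds to transforming the image as a whole.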

Finally, in step S130, a panoramic video is generated based on multiple frames of second images. Wherein, multiple frames of second images may be arranged according to the original order, and used as video frames of the panoramic video, so as to obtain a panoramic video with a full field of view around the target object.

After the panoramic video is generated, a step of displaying the panoramic video is also included. The generation and display of the panoramic video can be implemented in different devices. For example, the processor of the mobile platform can send the generated panoramic video to the client, and display it on the display interface of the client.

In some embodiments, the panoramic video and a progress bar associated with the panoramic video can be displayed synchronously. The progress bar is used to describe the preset trajectory and to describe the position, in the preset trajectory, that corresponds to the panoramic video displayed at the current moment. The progress bar can be implemented as a strip, a ring, a sphere, etc., and includes an operation control. The position of the operation control represents the position in the preset trajectory corresponding to the panoramic video displayed at the current moment, that is, the shooting position corresponding to the picture displayed at the current moment; by adjusting the operation control, the viewing angle of the panoramic video can be adjusted. The panoramic video can be played automatically, that is, the viewing angle is switched automatically; the viewing angle can also be switched as the user adjusts the operation control. Wherein, the preset trajectory described by the progress bar is connected end to end, that is, by adjusting the operation control, the viewing angle corresponding to the tail end of the preset trajectory can transition directly to the viewing angle corresponding to the head end of the preset trajectory; and because the images used to generate the panoramic video in the embodiment of the present invention are preprocessed as described above, there is no obvious abrupt change during the transition.
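Exemplarily, the end-to-end connection of the progress bar can be sketched with modular arithmetic, as follows. The sketch assumes that the position of the operation control is expressed as a fraction of one full loop around the target object and that the panoramic video has a fixed number of frames; both function names and the frame count are hypothetical.

```python
def frame_for_position(position, num_frames):
    """Map a control position (1.0 = one full loop) to a frame index, wrapping around."""
    return int((position % 1.0) * num_frames) % num_frames


def step_view(frame_index, step, num_frames):
    """Move the view by `step` frames; the tail end wraps to the head end and vice versa."""
    return (frame_index + step) % num_frames


num_frames = 120
print(frame_for_position(0.5, num_frames))   # 60
print(step_view(119, 1, num_frames))         # 0: last frame transitions to first frame
print(step_view(0, -1, num_frames))          # 119: first frame transitions to last frame
```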

The exemplary steps involved in the method for generating a panoramic video according to the embodiment of the present invention have been described above. The method 100 for generating a panoramic video in the embodiment of the present invention uses the shooting device of a movable platform to capture images, so that the shooting trajectory is more flexible, and by smoothing the first and last frame images, a high-quality panoramic video with a smooth transition from beginning to end can be generated.

Next, with reference to FIG. 3, a method 300 for generating a panoramic video provided by another aspect of the embodiment of the present invention will be described. As shown in FIG. 3, a method 300 for generating a panoramic video in an embodiment of the present invention includes the following steps:

In step S310, multiple frames of images obtained by the shooting device of the movable platform shooting the target object at different angles are acquired, the images being taken while the movable platform runs around the target object along a preset trajectory, wherein the displacement between the two ends of the preset trajectory is greater than or equal to a first preset threshold;

In step S320, a panoramic video is generated based on the multiple frames of images, wherein, in the panoramic video, the image displacement between the first frame and the last frame is less than or equal to a second preset threshold.

Wherein, the displacement between the two ends of the preset trajectory is the displacement of the photographing device in a spatial coordinate system; the spatial coordinate system may be GPS or a similar absolute physical coordinate system, or the coordinate system of the photographing device. That is, the displacement between the two ends of the preset trajectory is the actual displacement of the photographing device between the pose corresponding to the first frame of image and the pose corresponding to the last frame of image. The image displacement between the first frame and the last frame is the displacement between the images. Since the displacement between the first and last ends of the preset trajectory is greater than or equal to the first preset threshold, there is an abrupt change between the images taken at the two ends; when the panoramic video is generated based on the multiple frames of images, the images are processed so that the image displacement between the first frame and the last frame is less than or equal to the second preset threshold, and the panoramic video can therefore transition smoothly between the first and last frame images.

In step S310, the target object may be determined based on a user's selection instruction, or may be identified based on an image. When the target object is automatically identified based on images taken by the shooting device, the target object can be recognized in real time during the flight of the movable platform; alternatively, a control instruction to start automatic shooting, input by the user through the control device, can be obtained, and in response to the control instruction, identification of target objects meeting the preset shooting conditions within the shooting range is started.

Target objects meeting preset shooting conditions within the shooting range of the shooting device of the movable platform may be identified by any suitable method. Target objects include but are not limited to people, animals, plants, buildings, etc. For example, the target object can be identified according to the shape information or position information of each object in the image. For example, the shape information includes the top view shape, side view shape, etc. of the object, and the top view shape may be, for example, the roof shape of a building. The location information may include the longitude and latitude coordinates of the object, or any other coordinate information that can determine the location of the object. After the shape information and position information of each object in the image are recognized, it is matched with the shape information and position information of the pre-marked target object, and the successfully matched object is determined as the target object. Wherein, the successful matching may mean that both the shape information and the position information are successfully matched, or one item of the shape information and the position information is successfully matched.

After the target object is determined, in response to the user's shooting instruction, the movable platform carrying the shooting device is controlled to shoot while moving around the target object along a preset trajectory. It can be understood that even if the preset trajectory is set so that its head and tail coincide, the head and tail ends cannot be joined seamlessly while the movable platform is actually controlled to run along the preset trajectory, so the displacement between the two ends is greater than or equal to the first preset threshold.

Exemplarily, a preset trajectory running around the target object is first generated, so as to control the movable platform to run around the target object along the preset trajectory. Wherein, the preset trajectory can be determined by the system based on one or more of the type, size, shape or movement characteristics of the target object, and the system can be the control system of the movable platform, or another computing system or device capable of communicating with the movable platform. Wherein, the type of the target object may include the type to which the target object belongs among preset categories such as people, animals, plants, buildings, etc.; the size of the target object may include the height, dimensions, etc. of the target object; the shape of the target object may be the shape of its outline, including but not limited to its projected shape on the horizontal plane; the movement characteristics of the target object may include whether the target object is moving or stationary, the moving direction and moving speed of the target object, and the like. The preset trajectory can be planned according to the type, size, shape, and movement characteristics of the target object, so that the proportion of the target object in the shooting frame, the position of the target object in the shooting frame, etc. meet the preset requirements.

Alternatively, the preset trajectory may be determined in response to a user's setting instruction. For example, the user may input the setting instruction by drawing the preset trajectory on the real-time shooting screen through a control device communicatively connected to the movable platform, or by inputting the relevant parameters of the preset trajectory through the control device, so that the control device or the movable platform generates the preset trajectory according to the relevant parameters.

Afterwards, in step S320, a panoramic video is generated based on the multi-frame images acquired in step S310. Exemplarily, multi-frame images are firstly preprocessed, and then a panoramic video is generated based on the preprocessed images. The preprocessing performed on the multi-frame images reduces the difference between the first frame image and the last frame image, so that in the generated panoramic video, the image displacement between the first frame and the last frame is less than or equal to the second preset threshold. For a method of preprocessing an image, reference may be made to related descriptions in the method 100 for generating a panoramic video.

After the panoramic video is generated, a step of displaying the panoramic video is also included. For example, when the panoramic video is displayed, the panoramic video and the progress bar associated with the panoramic video can be displayed synchronously; the progress bar is used to describe the preset trajectory and to describe the position, in the preset trajectory, that corresponds to the panoramic video displayed at the current moment; moreover, the preset trajectory described by the progress bar is connected end to end. During playback, the panoramic video can switch the viewing angle automatically, or can switch the viewing angle as the user adjusts the progress bar.

To sum up, the method 300 for generating a panoramic video according to the embodiment of the present invention captures an image by a camera on a movable platform, the shooting track is more flexible, and a high-quality panoramic video with a smooth transition from the beginning to the end can be generated.

As shown in FIG. 4, the embodiment of the present invention also provides a device 400 for generating a panoramic video. The device 400 includes one or more memories 410 and one or more processors 420; of course, as needed, the device 400 for generating a panoramic video may also have other components and structures, such as a communication interface, a display, and the like. The processor 420 is configured to execute the program instructions stored in the memory 410, so that the processor 420 executes the steps of the method for generating a panoramic video described above. To avoid repetition, for the detailed description of some steps, reference may be made to the above, and details are not repeated here.

Specifically, the memory 410 is used for storing various data and executable program instructions generated during the process of generating the panoramic video, for example, for storing various application programs or algorithms for realizing various specific functions. The memory 410 may include one or more computer program products, and the computer program products may include various forms of computer-readable storage media, such as volatile memory and/or non-volatile memory. The volatile memory may include, for example, random access memory (RAM) and/or cache memory (cache). The non-volatile memory may include, for example, read-only memory (ROM), hard disk, flash memory, and the like.

The processor 420 may be a central processing unit (CPU), a graphics processing unit (GPU), an application specific integrated circuit (ASIC), a field programmable gate array (FPGA), or another form of processing unit with data processing capabilities and/or instruction execution capabilities, and can control other components in the device 400 to perform desired functions. For example, the processor 420 may include one or more embedded processors, processor cores, microprocessors, logic circuits, hardware finite state machines (FSMs), digital signal processors (DSPs), graphics processing units (GPUs), or a combination thereof.

In one embodiment, when the instructions stored in the memory 410 are executed by the processor 420, the processor executes the following steps: acquiring multiple frames of first images obtained by the photographing device of the movable platform photographing the target object at different angles, the first images being taken while the movable platform runs around the target object along a preset trajectory; preprocessing the first images to obtain second images, wherein the preprocessing is used to achieve a smooth transition from the first image of the last frame to the first image of the first frame among the multiple frames of first images; and generating a panoramic video based on the multiple frames of second images.

Exemplarily, the preprocessing of the first image includes: obtaining the actual displacement between the poses of the shooting device corresponding to the first images of adjacent frames; adjusting the actual displacement to obtain the target displacement, wherein the first target displacement between the first image of the last frame and the first image of the first frame is smaller than the first actual displacement between the first image of the last frame and the first image of the first frame; and mapping the first image according to the target displacement to obtain the second image.

Exemplarily, mapping the first image according to the target displacement to obtain the second image includes: obtaining the first homography matrix between the first image and the second image according to the target displacement and the internal reference matrix of the shooting device; and mapping the first image according to the first homography matrix to obtain the second image.

Exemplarily, adjusting the actual displacement to obtain the target displacement includes: obtaining the deviation between the target displacement and the actual displacement between the first image of the last frame and the first image of the first frame; and distributing the deviation among the first images of the remaining frames.

Exemplarily, distributing the deviation among the remaining first images of each frame includes: linearly distributing the deviation, so that part of the deviation is increased or decreased on the basis of the actual displacement between the first images of each frame.

Exemplarily, adjusting the actual displacement further includes: obtaining a displacement distribution curve according to the actual displacement after the distribution of the deviation; smoothing the displacement distribution curve; and differencing the smoothed displacement distribution curve to obtain the target displacement between the first images of each frame.

Exemplarily, obtaining the actual displacement between the poses of the shooting device corresponding to the first images of adjacent frames includes: extracting feature points in the first images, and matching the feature points in the first images of adjacent frames to obtain the matching feature point pairs of the first images of adjacent frames; and obtaining the actual displacement between the first images of adjacent frames according to the matching feature point pairs.

Exemplarily, obtaining the actual displacement between the first images of adjacent frames according to the matching feature point pairs includes: performing nonlinear optimization according to the matching feature point pairs to obtain the second homography matrix between the first images of adjacent frames; and obtaining the actual displacement between the first images of adjacent frames according to the second homography matrix and the internal reference matrix of the shooting device.

Exemplarily, obtaining the actual displacement between the poses of the photographing device corresponding to the first images of adjacent frames includes: obtaining the actual displacement according to the inertial measurement data measured by the inertial measurement device of the movable platform during the process of photographing the first images.

Exemplarily, adjusting the actual displacement to obtain the target displacement includes: respectively adjusting the actual displacement in different dimensions so as to obtain the target displacement in different dimensions.

Exemplarily, the actual displacement includes at least one of an actual rotation angle and an actual translation amount; adjusting the actual displacement to obtain a target displacement includes at least one of the following: adjusting the actual rotation angle to obtain a target rotation angle; adjusting the actual translation amount to obtain a target translation amount.

Exemplarily, the processor 420 is further configured to: select the first image of the last frame from at least two frames of first images at the end according to a preset condition. For example, each of the at least two frames of first images at the end can be matched with the first image of the first frame, and the frame with the largest number of matching feature point pairs with the first image of the first frame can be used as the first image of the last frame. Alternatively, the actual displacement between each of the at least two frames of first images at the end and the first image of the first frame may be obtained, and the frame with the smallest actual displacement is used as the first image of the last frame.

Exemplarily, the preprocessing of the first image further includes: smoothing brightness of multiple frames of the first image, so as to reduce brightness difference between the last frame of the first image and the first frame of the first image.

After the panoramic video is generated, the processor 420 is also used to: synchronously display the panoramic video and the progress bar associated with the panoramic video, wherein the progress bar is used to describe the preset trajectory and to describe the position, in the preset trajectory, that corresponds to the panoramic video displayed at the current moment, and the preset trajectory described by the progress bar is connected end to end. Exemplarily, the processor 420 is also configured to: generate a preset trajectory according to user instructions; control the movable platform to run around the target object along the preset trajectory; and control the shooting device to capture the first images.

In another embodiment, when the instructions stored in the memory 410 are executed by the processor 420, the processor is made to perform the following steps:

Acquiring multiple frames of images obtained by the shooting device of the movable platform shooting the target object at different angles, the images being taken while the movable platform runs around the target object along the preset trajectory, wherein the displacement between the first and last ends of the preset trajectory is greater than or equal to a first preset threshold;

Generating a panoramic video based on the multiple frames of images, wherein, in the panoramic video, the image displacement between the first frame and the last frame is less than or equal to a second preset threshold.

Further, before acquiring multiple frames of images obtained by the shooting device of the movable platform shooting the target object at different angles, it also includes: responding to a user's shooting instruction, controlling the movable platform to shoot around the target object along a preset trajectory. Wherein, the target object may be determined based on a user's selection instruction, or the target object may be identified based on an image. The preset trajectory may be determined by the system based on one or more of the type, size, shape or movement characteristics of the target object, or the preset trajectory may be determined in response to a user's setting instruction.

The above describes the main functions of the components of the apparatus 400 for generating a panoramic video. For further details, refer to the related descriptions in the method 100 for generating a panoramic video and the method 300 for generating a panoramic video, and details are not repeated here.

Another aspect of the embodiments of the present invention provides a movable platform, including a movable platform body and a photographing device, the photographing device being mounted on the movable platform body for photographing a target object; the movable platform further includes the above device 400 for generating a panoramic video, which is communicatively connected with the photographing device and is used for generating a panoramic video based on images captured by the photographing device.

Exemplarily, the movable platform may include an aircraft (such as a drone), a robot, an unmanned vehicle, an unmanned boat, and the like. The movable platform is described below by taking an aircraft as an example, but it can be understood that this is not intended to limit the application scenario of the present application. Those skilled in the art should understand that any embodiments described herein regarding aircraft are applicable to any aircraft (such as unmanned aerial vehicles, also called drones).

The aircraft may include a processor, a memory, a power mechanism, a sensing system, and a communication system. These components are interconnected by a bus system and/or other forms of connection mechanisms. In some embodiments, the photographing device can be mounted on the aircraft through a carrier such as a gimbal.

The power mechanism may include one or more rotating bodies, propellers, paddles, engines, motors, wheels, bearings, magnets, or nozzles. For example, the rotating body of the power mechanism may be a self-fastening rotating body, a rotating body assembly, or another rotating body power unit. An aircraft can have one or more power mechanisms. All power mechanisms can be of the same type. Optionally, one or more power mechanisms may be of different types. The power mechanism can be mounted on the aircraft by suitable means, such as via support elements (e.g., drive shafts). The power mechanism can be installed in any suitable position of the aircraft, such as the top, bottom, front, rear, side or any combination thereof.

In some embodiments, the power mechanism is capable of causing the aircraft to take off vertically from a surface, or land vertically on a surface, without requiring any horizontal motion of the aircraft (e.g., without taxiing on a runway). Optionally, the power mechanism may allow the aircraft to hover at a preset position and/or direction in the air. One or more power mechanisms may be controlled independently of the other power mechanisms. Optionally, one or more power mechanisms can be controlled simultaneously. For example, an aircraft may have multiple horizontally oriented rotating bodies to provide lift and/or thrust for the aircraft. The horizontally oriented rotating bodies can be actuated to provide the aircraft with the capability to take off vertically, land vertically, and hover. In some embodiments, one or more of the horizontally oriented rotating bodies may rotate clockwise, while the other one or more of the horizontally oriented rotating bodies may rotate counterclockwise. For example, the number of rotating bodies that rotate clockwise is equal to the number of rotating bodies that rotate counterclockwise. The rotation rate of each horizontally oriented rotating body can be varied independently to control the lift and/or thrust produced by each rotating body, thereby adjusting the spatial orientation, velocity, and/or acceleration of the aircraft (e.g., with respect to up to three degrees of freedom of rotation and translation).

The sensing system may include one or more sensors to sense the spatial orientation, velocity and/or acceleration of the aircraft (e.g., with respect to up to three degrees of freedom of rotation and translation). The one or more sensors may include GPS sensors, motion sensors, inertial sensors, proximity sensors, or image sensors. The sensing data provided by the sensing system can be used to track the spatial orientation, velocity and/or acceleration of the target. Optionally, the sensing system may be used to collect data about the environment of the aircraft, such as weather conditions, nearby potential obstacles, locations of geographic features, locations of man-made structures, image information, and the like.

The communication system can communicate, through wireless signals, with a control device that has its own communication system. A communication system may include any number of transmitters, receivers, and/or transceivers for wireless communication. The communication may be one-way, such that data is sent in one direction only. For example, in one-way communication only the aircraft transmits data to the control device, or vice versa: one or more transmitters of one communication system send data to one or more receivers of the other communication system. Optionally, the communication may be two-way, so that data can be transmitted in both directions between the aircraft and the control device. In two-way communication, one or more transmitters of each communication system can send data to one or more receivers of the other communication system.

In some embodiments, the control device can provide control data to one or more of the aircraft, the carrier, and the photographing device, and receive information from one or more of the aircraft, the carrier, and the photographing device (such as position and/or motion information of the aircraft, the carrier, or the photographing device, image data captured by the photographing device such as a camera, etc.). In some embodiments, the control data of the control device may include instructions about the position, movement, actuation, or control of the aircraft, the carrier, and/or the photographing device. For example, the control data may result in a change in the position and/or orientation of the aircraft (e.g., by controlling a power mechanism), or cause a movement of the carrier relative to the aircraft (e.g., by controlling the carrier). The control data of the control device can also control the photographing device, for example by controlling the operation of a camera or other photographing device (capturing still or moving images, zooming, turning on or off, switching shooting modes, changing image resolution, changing focal length, changing depth of field, changing exposure time, changing viewing angle or field of view). In some embodiments, the communication from the aircraft, the carrier, and/or the photographing device may include information from one or more sensors (such as the sensing system or the photographing device). The communication may include sensing information transmitted from one or more sensors of different types, such as GPS sensors, motion sensors, inertial sensors, proximity sensors, or image sensors. The sensing information relates to the pose (such as orientation and position), motion, or acceleration of the aircraft, the carrier, and/or the photographing device. The sensing information transmitted from the photographing device includes data captured by the photographing device or the state of the photographing device. The control data transmitted by the control device can be used to control the state of one or more of the aircraft, the carrier, or the photographing device. Alternatively or additionally, each of the carrier and the photographing device may include a communication module for communicating with the control device, so that the control device can communicate with or track the aircraft, the carrier, and the photographing device individually.

In some embodiments, the aircraft may communicate with remote devices other than the control device, and the control device may also communicate with remote devices other than the aircraft. For example, the aircraft and/or the control device may communicate with another aircraft or with a carrier or photographing device of another aircraft. When desired, the additional remote device may be a second control device or another computing device (such as a computer, desktop, tablet, smart phone, or other mobile device). The remote device may transmit data to the aircraft, receive data from the aircraft, transmit data to the control device, and/or receive data from the control device. Optionally, the remote device may be connected to the Internet or another telecommunication network to allow data received from the aircraft and/or the control device to be uploaded to a website or server.

In some embodiments, the movement of the aircraft, the movement of the carrier, and the movement of the photographing device relative to a fixed reference object (such as the external environment), and/or the movement between them, can all be controlled by the control device. The control device may be a remote control terminal located far away from the aircraft, the carrier, and/or the photographing device. The control device may be located on or attached to a support platform. Optionally, the control device may be handheld or wearable. For example, the control device may include a smart phone, a tablet computer, a desktop computer, a computer, glasses, gloves, a helmet, a microphone, or any combination thereof. The control device may comprise a user interface such as a keyboard, mouse, joystick, touch screen, or display. Any suitable user input may be used to interact with the control device, such as manual input commands, voice control, gesture control, or positional control (e.g., by movement, position, or tilt of the control device).

The aircraft may include one or more memories on which computer programs executable by the processor are stored, for example program instructions implementing the corresponding steps of the method for generating panoramic video according to the embodiment of the present application. The memory may include one or more computer program products, which may include various forms of computer-readable storage media, such as volatile memory and/or non-volatile memory. The volatile memory may include, for example, random access memory (RAM) and/or cache memory (cache). The non-volatile memory may include, for example, a read-only memory (ROM), a hard disk, a flash memory, and the like.

The aircraft may include one or more processors. The processor may be a central processing unit (CPU), a graphics processing unit (GPU), an application specific integrated circuit (ASIC), a field programmable gate array (FPGA), or another form of processing unit having data processing capabilities and/or instruction execution capabilities, and may control other components in the aircraft to perform desired functions. The processor can execute the program instructions stored in the memory, so as to execute the relevant steps in the method for generating a panoramic video in the embodiment of the present application described above. For example, a processor can include one or more embedded processors, processor cores, microprocessors, logic circuits, hardware finite state machines (FSMs), digital signal processors (DSPs), or combinations thereof. In this embodiment, the processor includes a field programmable gate array (FPGA), or one or more ARM processors.

Since the mobile platform in the embodiment of the present invention has the apparatus 400 for generating panoramic video in the embodiment of the present invention, it also has similar advantages.

In addition, an embodiment of the present invention also provides a computer storage medium on which a computer program is stored, and when the computer program is executed by a processor, the steps of the method 100 or 300 for generating a panoramic video are implemented. Computer storage media may include, for example, a memory card of a smartphone, a memory component of a tablet computer, a hard disk of a personal computer, read-only memory (ROM), erasable programmable read-only memory (EPROM), portable compact disk ROM (CD-ROM), USB memory, or any combination of the above storage media. The computer-readable storage medium can be any combination of one or more computer-readable storage media.

Those skilled in the art can appreciate that the units and algorithm steps of the examples described in conjunction with the embodiments disclosed herein can be implemented by electronic hardware, or a combination of computer software and electronic hardware. Whether these functions are executed by hardware or software depends on the specific application and design constraints of the technical solution. Those skilled in the art may use different methods to implement the described functions for each specific application, but such implementation should not be regarded as exceeding the scope of the present invention.

Those skilled in the art can clearly understand that for the convenience and brevity of the description, the specific working process of the above-described system, device and unit can refer to the corresponding process in the foregoing method embodiment, which will not be repeated here.

In the several embodiments provided in this application, it should be understood that the disclosed systems, devices and methods may be implemented in other ways. For example, the device embodiments described above are only illustrative. For example, the division of the units is only a logical function division, and in actual implementation there may be other division methods: multiple units or components can be combined or integrated into another system, or some features may be ignored or not implemented. In addition, the mutual coupling, direct coupling, or communication connection shown or discussed may be an indirect coupling or communication connection between devices or units through some interfaces, and may be in electrical, mechanical, or other forms.

The units described as separate components may or may not be physically separated, and the components shown as units may or may not be physical units, that is, they may be located in one place, or may be distributed to multiple network units. Part or all of the units can be selected according to actual needs to achieve the purpose of the solution of this embodiment.

In addition, each functional unit in each embodiment of the present invention may be integrated into one processing unit, each unit may exist separately physically, or two or more units may be integrated into one unit.

If the functions described above are realized in the form of software function units and sold or used as independent products, they can be stored in a computer-readable storage medium. Based on this understanding, the essence of the technical solution of the present invention, or the part that contributes to the prior art, or a part of the technical solution, can be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions used to make a computer device (which may be a personal computer, a server, a network device, etc.) execute all or part of the steps of the methods described in the various embodiments of the present invention. The aforementioned storage media include a USB flash drive, a removable hard disk, read-only memory (ROM), random access memory (RAM), a magnetic disk, an optical disc, and other media that can store program code.

The above is only a specific embodiment of the present invention, but the scope of protection of the present invention is not limited thereto. Any changes or substitutions that a person skilled in the art can easily think of within the technical scope disclosed in the present invention should be covered within the protection scope of the present invention. Therefore, the protection scope of the present invention should be based on the protection scope of the claims.

Although example embodiments have been described herein with reference to the accompanying drawings, it should be understood that the above-described example embodiments are exemplary only and are not intended to limit the scope of the invention thereto. Various changes and modifications can be made therein by those skilled in the art without departing from the scope and spirit of the invention. All such changes and modifications are intended to be included within the scope of the invention as claimed in the appended claims.

In the several embodiments provided in this application, it should be understood that the disclosed devices and methods may be implemented in other ways. For example, the device embodiments described above are only illustrative. For example, the division of the units is only a logical function division, and in actual implementation there may be other division methods: multiple units or components can be combined or integrated into another device, or some features may be omitted or not implemented.

In the description provided herein, numerous specific details are set forth. However, it is understood that embodiments of the invention may be practiced without these specific details. In some instances, well-known methods, structures and techniques have not been shown in detail in order not to obscure the understanding of this description.

Similarly, it should be understood that in the description of the exemplary embodiments of the invention, in order to streamline the disclosure and to facilitate an understanding of one or more of the various inventive aspects, various features of the invention are sometimes grouped together in a single embodiment, figure, or description thereof. This method of disclosure, however, is not to be interpreted as reflecting an intention that the claimed invention requires more features than are expressly recited in each claim. Rather, as the corresponding claims reflect, the inventive point lies in that the corresponding technical problem can be solved by using fewer than all features of a single disclosed embodiment. Thus, the claims following the Detailed Description are hereby expressly incorporated into this Detailed Description, with each claim standing on its own as a separate embodiment of this invention.

It will be appreciated by those skilled in the art that all features disclosed in this specification (including accompanying claims, abstract and drawings), and all processes or units of any method or apparatus so disclosed, may be used in any combination, except where at least some of the features, processes, or units are mutually exclusive. Each feature disclosed in this specification (including accompanying claims, abstract and drawings) may be replaced by alternative features serving the same, equivalent or similar purpose, unless expressly stated otherwise.

Furthermore, those skilled in the art will understand that although some embodiments described herein include some features included in other embodiments but not others, combinations of features from different embodiments are meant to be within the scope of the invention and form different embodiments. For example, in the claims, any one of the claimed embodiments can be used in any combination.

The various component embodiments of the present invention may be implemented in hardware, or in software modules running on one or more processors, or in a combination thereof. Those skilled in the art should understand that a microprocessor or a digital signal processor (DSP) may be used in practice to implement some or all functions of some modules according to the embodiments of the present invention. The present invention can also be implemented as an apparatus program (for example, a computer program and a computer program product) for performing a part or all of the methods described herein. Such a program for realizing the present invention may be stored on a computer-readable medium, or may be in the form of one or more signals. Such a signal may be downloaded from an Internet site, or provided on a carrier signal, or provided in any other form.

It should be noted that the above-mentioned embodiments illustrate rather than limit the invention, and that those skilled in the art will be able to design alternative embodiments without departing from the scope of the appended claims. In the claims, any reference signs placed between parentheses shall not be construed as limiting the claim. The invention can be implemented by means of hardware comprising several distinct elements, and by means of a suitably programmed computer. In a unit claim enumerating several means, several of these means can be embodied by one and the same item of hardware. The use of the words first, second, and third, etc. does not indicate any order. These words can be interpreted as names.

The above is only a specific embodiment of the present invention or a description of the specific embodiment, and the protection scope of the present invention is not limited thereto. Any changes or substitutions that a person familiar with the technical field can easily think of within the technical scope disclosed in the present invention should be covered within the protection scope of the present invention. The protection scope of the present invention should be based on the protection scope of the claims.

Claims

A method for generating panoramic video, characterized in that the method comprises:

Obtain multiple frames of first images obtained by shooting the target object at different angles by the shooting device of the movable platform, and the first images are taken while the movable platform is running around the target object along a preset track;

Preprocessing the first image to obtain a second image, wherein the preprocessing is used to make a smooth transition between the first image in the last frame and the first image in the first frame among the multiple frames of the first image;

A panoramic video is generated based on multiple frames of the second image.
The method according to claim 1, wherein said preprocessing the first image comprises:

Acquiring the actual displacement between the poses of the photographing device corresponding to the first image in adjacent frames;

adjusting the actual displacement to obtain a target displacement, wherein a first target displacement between the first image in the last frame and the first image in the first frame is smaller than a first actual displacement between the first image in the last frame and the first image in the first frame;

Mapping the first image according to the target displacement to obtain the second image.
The method according to claim 2, wherein said mapping said first image according to said target displacement to obtain said second image comprises:

obtaining a first homography matrix between the first image and the second image according to the target displacement and the internal reference matrix of the photographing device;

Mapping the first image according to the first homography matrix to obtain the second image.
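
As a non-limiting illustration of the mapping recited above, the following sketch builds a homography from a rotation and the internal reference (intrinsic) matrix using the standard pure-rotation relation H = K R K^-1, under the convention that R rotates viewing rays from the first image into the second. It assumes that the target displacement reduces to a pure rotation about the optical axis and that the intrinsic matrix has zero skew; the function names and the numeric values are hypothetical and do not come from the embodiments.

```python
import math


def mat_mul(a, b):
    """Multiply two 3x3 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def intrinsic_inverse(k):
    """Invert a zero-skew intrinsic matrix [[fx, 0, cx], [0, fy, cy], [0, 0, 1]]."""
    fx, fy, cx, cy = k[0][0], k[1][1], k[0][2], k[1][2]
    return [[1.0 / fx, 0.0, -cx / fx],
            [0.0, 1.0 / fy, -cy / fy],
            [0.0, 0.0, 1.0]]


def roll_rotation(theta):
    """Rotation by theta radians about the optical axis (assumed displacement)."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


def rotation_homography(k, r):
    """Homography H = K * R * K^-1 induced by a pure camera rotation R."""
    return mat_mul(mat_mul(k, r), intrinsic_inverse(k))


def warp_point(h, x, y):
    """Map pixel (x, y) through homography h and dehomogenize the result."""
    u = h[0][0] * x + h[0][1] * y + h[0][2]
    v = h[1][0] * x + h[1][1] * y + h[1][2]
    w = h[2][0] * x + h[2][1] * y + h[2][2]
    return u / w, v / w


# Hypothetical intrinsics: 1000 px focal length, principal point (640, 360).
K = [[1000.0, 0.0, 640.0], [0.0, 1000.0, 360.0], [0.0, 0.0, 1.0]]
H = rotation_homography(K, roll_rotation(math.radians(2.0)))
```

Applying `warp_point` to every pixel position (or passing `H` to an image-warping routine) would yield the second image; the principal point stays fixed under a rotation about the optical axis, which gives a quick sanity check.
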
The method according to claim 2 or 3, wherein said adjusting said actual displacement to obtain a target displacement comprises:

Acquiring a deviation between the target displacement between the first image in the last frame and the first image in the first frame and the actual displacement between the first image in the last frame and the first image in the first frame;

Distributing the deviation between frames of the first image.
The method according to claim 4, wherein the distributing the deviation between the first images of each frame comprises:

The deviation is distributed linearly, so that part of the deviation is increased or decreased on the basis of the actual displacement between the frames of the first image.
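
As a non-limiting illustration of the linear distribution recited above, the following sketch treats the displacement as a single scalar (for example, one rotation angle) and spreads the deviation in equal shares over the adjacent-frame displacements, so that the accumulated correction grows linearly with the frame index. It assumes that the total displacement around the closed loop is to be preserved; the function and parameter names are hypothetical.

```python
def distribute_deviation(actual_steps, actual_closing, target_closing):
    """Spread the closing-gap deviation evenly over adjacent-frame displacements.

    actual_steps   -- actual displacement between each pair of adjacent frames
    actual_closing -- actual displacement between the last and the first frame
    target_closing -- desired (smaller) displacement between last and first frame
    """
    if not actual_steps:
        raise ValueError("at least one adjacent-frame displacement is required")
    # Deviation that has to be absorbed by the adjacent-frame displacements
    # (assumption: the sum around the closed loop stays unchanged).
    deviation = actual_closing - target_closing
    share = deviation / len(actual_steps)
    return [step + share for step in actual_steps]


# Three adjacent steps of 10 degrees and a 6 degree gap between last and first.
adjusted = distribute_deviation([10.0, 10.0, 10.0], 6.0, 0.0)
```

In this example each adjacent-frame displacement is increased by 2 degrees, so the gap between the last frame and the first frame closes while the loop total remains 36 degrees.
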
The method according to claim 4 or 5, wherein the adjusting the actual displacement further comprises:

Obtaining a displacement distribution curve according to the actual displacement after allocating the deviation;

smoothing the displacement distribution curve;

Differencing the smoothed displacement distribution curve to obtain the target displacement between the first images in each frame.
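
As a non-limiting illustration of the three steps recited above, the following sketch accumulates scalar displacements into a cumulative curve, smooths the curve with a moving average while keeping both endpoints fixed, and differences the result. The moving average and the fixed endpoints are assumptions made for this sketch, as the claim does not name a particular smoothing method; the function name and the window size are hypothetical.

```python
from itertools import accumulate


def smooth_and_difference(steps, window=3):
    """Return smoothed per-frame displacements whose sum equals sum(steps)."""
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd integer")
    # Displacement distribution curve: cumulative displacement per frame.
    curve = [0.0] + list(accumulate(steps))
    half = window // 2
    smoothed = curve[:]
    # Endpoints stay fixed so the total displacement is preserved.
    for i in range(1, len(curve) - 1):
        lo, hi = max(0, i - half), min(len(curve), i + half + 1)
        smoothed[i] = sum(curve[lo:hi]) / (hi - lo)
    # Differencing the smoothed curve gives the target displacements.
    return [b - a for a, b in zip(smoothed, smoothed[1:])]


uneven = [1.0, 3.0, 1.0, 3.0]
target = smooth_and_difference(uneven)
```

For the uneven input above, the spread between the largest and smallest per-frame displacement shrinks, while the total displacement over the sequence is unchanged.
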
The method according to any one of claims 2-6, wherein the acquiring the actual displacement between the poses of the photographing device corresponding to the first image in adjacent frames comprises:

extracting feature points in the first image, and matching the feature points in the first image in adjacent frames to obtain matching feature point pairs in the first image in adjacent frames;

The actual displacement between the first images in adjacent frames is obtained according to the pair of matching feature points.
The method according to claim 7, wherein the obtaining the actual displacement between the first images of adjacent frames according to the pair of matching feature points comprises:

performing nonlinear optimization according to the matching feature point pairs to obtain a second homography matrix between the first images in adjacent frames;

The actual displacement between the first images in adjacent frames is obtained according to the second homography matrix and the internal reference matrix of the photographing device.
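
As a non-limiting illustration of the last step recited above, the following sketch recovers a rotation from a homography and the internal reference (intrinsic) matrix using the pure-rotation relation R ~ K^-1 H K, which holds up to scale. The estimation of the homography from matching feature point pairs by nonlinear optimization is not shown. The sketch assumes a zero-skew intrinsic matrix and a displacement that is a rotation about the optical axis; all names and values are hypothetical.

```python
import math


def mat_mul(a, b):
    """Multiply two 3x3 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def intrinsic_inverse(k):
    """Invert a zero-skew intrinsic matrix [[fx, 0, cx], [0, fy, cy], [0, 0, 1]]."""
    fx, fy, cx, cy = k[0][0], k[1][1], k[0][2], k[1][2]
    return [[1.0 / fx, 0.0, -cx / fx],
            [0.0, 1.0 / fy, -cy / fy],
            [0.0, 0.0, 1.0]]


def rotation_from_homography(h, k):
    """Return K^-1 * H * K, which equals the rotation up to a scale factor."""
    return mat_mul(mat_mul(intrinsic_inverse(k), h), k)


def roll_angle(r):
    """In-plane rotation angle; atan2 is unaffected by a positive scale on r."""
    return math.atan2(r[1][0], r[0][0])


# Hypothetical intrinsics and a synthetic homography for a 3 degree rotation,
# deliberately scaled by 2 to mimic the scale ambiguity of an estimated H.
K = [[1000.0, 0.0, 640.0], [0.0, 1000.0, 360.0], [0.0, 0.0, 1.0]]
c, s = math.cos(math.radians(3.0)), math.sin(math.radians(3.0))
R_true = [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]
H = [[2.0 * v for v in row]
     for row in mat_mul(mat_mul(K, R_true), intrinsic_inverse(K))]
recovered = roll_angle(rotation_from_homography(H, K))
```
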
The method according to any one of claims 2-6, wherein the obtaining the actual displacement between the poses of the photographing device corresponding to the first image in adjacent frames comprises:

The actual displacement is obtained according to the inertial measurement data measured by the inertial measurement device of the movable platform during the process of capturing the first image.
The method according to any one of claims 2-9, wherein the adjusting the actual displacement to obtain the target displacement comprises:

The actual displacements in different dimensions are respectively adjusted to obtain the target displacements in different dimensions.
The method according to any one of claims 2-10, wherein the actual displacement includes at least one of an actual rotation angle and an actual translation amount; and the adjusting the actual displacement to obtain a target displacement includes at least one of the following:

adjusting the actual rotation angle to obtain a target rotation angle;

The actual translation amount is adjusted to obtain a target translation amount.
The method according to any one of claims 1-11, further comprising: selecting the first image of the last frame from at least two frames of the first images at the end according to preset conditions.
The method according to claim 12, wherein the selecting the first image of the last frame from at least two frames of the first images at the end according to preset conditions comprises:

Match each of the at least two frames of the first images at the end with the first image of the first frame, and use the frame of the first image having the largest number of matching feature point pairs with the first image of the first frame as the first image of the last frame.
The method according to claim 12, wherein the selecting the first image of the last frame from at least two frames of the first images at the end according to preset conditions comprises:

Respectively acquire the actual displacement between the first image of at least two frames at the end and the first image of the first frame;

The frame of the first image with the smallest actual displacement is used as the first image of the last frame.
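
As a non-limiting illustration of the two selection criteria recited above, the following sketch picks the last frame either by the largest number of matching feature point pairs with the first frame or by the smallest displacement magnitude relative to the first frame. It assumes that the match counts and scalar displacements of the candidate frames have already been computed; the dictionary layout and function names are hypothetical.

```python
def select_by_match_count(match_counts):
    """Return the candidate frame index with the most matches to the first frame."""
    if not match_counts:
        raise ValueError("no candidate frames")
    return max(match_counts, key=match_counts.get)


def select_by_displacement(displacements):
    """Return the candidate frame index closest in pose to the first frame."""
    if not displacements:
        raise ValueError("no candidate frames")
    return min(displacements, key=lambda idx: abs(displacements[idx]))


# Hypothetical candidates: frame index -> value measured against the first frame.
counts = {118: 40, 119: 75, 120: 62}
offsets = {118: 4.0, 119: -0.5, 120: 1.5}
best_by_matches = select_by_match_count(counts)
best_by_offset = select_by_displacement(offsets)
```
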
The method according to any one of claims 2-14, wherein said preprocessing the first image further comprises:

Smoothing the brightness of the multiple frames of the first image to reduce the brightness difference between the last frame of the first image and the first frame of the first image.
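
As a non-limiting illustration of the brightness smoothing recited above, the following sketch smooths the per-frame mean brightness with a circular moving average, so that the last frame and the first frame are treated as neighbours, and derives a gain for each frame. The circular moving average, the use of a single mean brightness value per frame, and all names are assumptions made for this sketch.

```python
def circular_smooth(values, window=3):
    """Moving average over a sequence whose end wraps around to its start."""
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd integer")
    n, half = len(values), window // 2
    return [sum(values[(i + o) % n] for o in range(-half, half + 1)) / window
            for i in range(n)]


def brightness_gains(mean_brightness, window=3):
    """Per-frame multiplicative gain that moves each frame to its smoothed level."""
    targets = circular_smooth(mean_brightness, window)
    # A frame with zero mean brightness is left unchanged to avoid dividing by 0.
    return [t / b if b > 0 else 1.0 for t, b in zip(targets, mean_brightness)]


# Hypothetical mean brightness that drifts upward during the flight.
means = [100.0, 110.0, 120.0, 130.0]
targets = circular_smooth(means)
gains = brightness_gains(means)
```

Multiplying the pixel values of each frame by its gain would, for this input, reduce the brightness gap between the last frame and the first frame from 30 to roughly 3.3.
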
The method according to any one of claims 1-15, further comprising:

Synchronously displaying the panoramic video and a progress bar associated with the panoramic video, wherein the progress bar is used to describe the preset track and the position in the preset track corresponding to the panoramic video displayed at the current moment, and the preset track described by the progress bar is connected end to end.
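
As a non-limiting illustration of a progress bar whose track is connected end to end, the following sketch maps the index of the currently displayed frame to an angle on a closed ring, so that playing past the last frame wraps back to the start. It assumes that the frames are evenly spaced along the preset track and that the progress bar is drawn as a ring; the function name is hypothetical.

```python
def loop_position(frame_index, frame_count):
    """Return the position on the closed track as an angle in [0, 360) degrees."""
    if frame_count <= 0:
        raise ValueError("frame_count must be positive")
    # The modulo makes the track wrap around: frame N maps back to frame 0.
    return 360.0 * (frame_index % frame_count) / frame_count


quarter = loop_position(30, 120)
wrapped = loop_position(120, 120)
```
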
The method according to any one of claims 1-16, further comprising:

generating the preset trajectory according to a user instruction;

controlling the movable platform to run around the target object along the preset track;

During the process of the movable platform running around the target object along a preset trajectory, the photographing device is controlled to photograph the first image.
A method for generating panoramic video, characterized in that the method comprises:

Obtaining multiple frames of images obtained by the shooting device of the movable platform shooting the target object at different angles, the images being taken while the movable platform is running around the target object along a preset trajectory, wherein the displacement between the first and last ends of the preset trajectory is greater than or equal to a first preset threshold;

A panoramic video is generated based on the multiple frames of images, wherein, in the panoramic video, the image displacement between the first frame and the last frame is less than or equal to a second preset threshold.
The method according to claim 18, characterized in that before acquiring the multi-frame images obtained by shooting the target object at different angles by the shooting device of the movable platform, further comprising:

In response to a user's shooting instruction, the movable platform is controlled to surround the target object along the preset trajectory to shoot.
The method according to claim 19, wherein the target object is determined based on a user's selection instruction, or the target object is identified based on an image.
The method according to claim 19, wherein the preset trajectory is determined by the system based on one or more of the type, size, shape or movement characteristics of the target object, or the preset trajectory is determined in response to a user's setting instruction.
The method according to any one of claims 18-21, further comprising:

Synchronously displaying the panoramic video and a progress bar associated with the panoramic video, wherein the progress bar is used to describe the preset track and the position in the preset track corresponding to the panoramic video displayed at the current moment, and the preset track described by the progress bar is connected end to end.
A device for generating panoramic video, characterized in that the device comprises:

memory for storing executable instructions;

a processor, configured to execute the instructions stored in the memory, so that the processor performs the following steps:

Obtain multiple frames of first images obtained by shooting the target object at different angles by the shooting device of the movable platform, and the first images are taken while the movable platform is running around the target object along a preset track;

Preprocessing the first image to obtain a second image, wherein the preprocessing is used to make a smooth transition between the first image in the last frame and the first image in the first frame among the multiple frames of the first image;

A panoramic video is generated based on multiple frames of the second image.
The device according to claim 23, wherein said preprocessing the first image comprises:

Acquiring the actual displacement between the poses of the photographing device corresponding to the first image in adjacent frames;

adjusting the actual displacement to obtain a target displacement, wherein a first target displacement between the first image in the last frame and the first image in the first frame is smaller than a first actual displacement between the first image in the last frame and the first image in the first frame;

Mapping the first image according to the target displacement to obtain the second image.
The device according to claim 24, wherein said mapping said first image according to said target displacement to obtain said second image comprises:

obtaining a first homography matrix between the first image and the second image according to the target displacement and the internal reference matrix of the photographing device;

Mapping the first image according to the first homography matrix to obtain the second image.
The device according to claim 24 or 25, wherein said adjusting said actual displacement to obtain a target displacement comprises:

Acquiring a deviation between the target displacement between the first image in the last frame and the first image in the first frame and the actual displacement between the first image in the last frame and the first image in the first frame;

Distributing the deviation among the remaining frames of the first image.
The device according to claim 26, wherein the distributing the deviation between the first images of the remaining frames comprises:

The deviation is distributed linearly, so that part of the deviation is increased or decreased on the basis of the actual displacement between the frames of the first image.
The device according to claim 26 or 27, wherein the adjusting the actual displacement further comprises:

Obtaining a displacement distribution curve according to the actual displacement after allocating the deviation;

smoothing the displacement distribution curve;

Differencing the smoothed displacement distribution curve to obtain the target displacement between the first images in each frame.
The device according to any one of claims 24-28, wherein the acquiring the actual displacement between the poses of the shooting device corresponding to the first image in adjacent frames comprises:

extracting feature points in the first image, and matching the feature points in the first image in adjacent frames to obtain matching feature point pairs in the first image in adjacent frames;

The actual displacement between the first images in adjacent frames is obtained according to the pair of matching feature points.
The device according to claim 29, wherein the obtaining the actual displacement between the first images in adjacent frames according to the pair of matching feature points comprises:

performing nonlinear optimization according to the matching feature point pairs to obtain a second homography matrix between the first images in adjacent frames;

The actual displacement between the first images in adjacent frames is obtained according to the second homography matrix and the internal reference matrix of the photographing device.
The device according to any one of claims 24-28, wherein the acquiring the actual displacement between the poses of the photographing device corresponding to the first image in adjacent frames comprises:

The actual displacement is obtained according to the inertial measurement data measured by the inertial measurement device of the movable platform during the process of capturing the first image.
The device according to any one of claims 24-31, wherein the adjusting the actual displacement to obtain the target displacement comprises:

The actual displacements in different dimensions are respectively adjusted to obtain the target displacements in different dimensions.
The device according to any one of claims 24-32, wherein the actual displacement includes at least one of an actual rotation angle and an actual translation amount; and the adjusting the actual displacement to obtain a target displacement includes at least one of the following:

adjusting the actual rotation angle to obtain a target rotation angle;

The actual translation amount is adjusted to obtain a target translation amount.
The device according to any one of claims 23-33, wherein the processor is further configured to: select the first image of the last frame from at least two frames of the first images at the end according to preset conditions.
The device according to claim 34, wherein the selecting the first image of the last frame from at least two frames of the first images at the end according to preset conditions comprises:

Match each of the at least two frames of the first images at the end with the first image of the first frame, and use the frame of the first image having the largest number of matching feature point pairs with the first image of the first frame as the first image of the last frame.
The device according to claim 34, wherein the selecting the first image of the last frame from at least two frames of the first images at the end according to preset conditions comprises:

Respectively acquire the actual displacement between the first image of at least two frames at the end and the first image of the first frame;

The frame of the first image with the smallest actual displacement is used as the first image of the last frame.
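The two alternative selection rules above (most matching feature point pairs, or smallest actual displacement relative to the first frame) can be sketched as follows. This is a minimal illustration: representing each frame's features as a set of hashable descriptors, counting matches by set intersection, and using a single scalar displacement per candidate are stand-ins assumed for the sketch, not the matching or displacement estimation described in the claims.

```python
def count_matches(features_a, features_b):
    """Toy match count: number of descriptors shared by both frames."""
    return len(set(features_a) & set(features_b))

def select_by_matches(candidates, first_features):
    """Index of the end candidate with the most matches to the first frame."""
    return max(range(len(candidates)),
               key=lambda i: count_matches(candidates[i], first_features))

def select_by_displacement(displacements):
    """Index of the end candidate with the smallest displacement magnitude."""
    return min(range(len(displacements)),
               key=lambda i: abs(displacements[i]))
```

If several candidates tie, both helpers return the earliest one. The claims do not say how ties are resolved.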
The device according to any one of claims 24-36, wherein the preprocessing the first image further comprises:

Smoothing the brightness of the multiple frames of the first image to reduce the brightness difference between the last frame of the first image and the first frame of the first image.
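One way to picture the brightness smoothing, offered only as a sketch, is to apply a gain to each frame that grows gradually from 1 at the first frame to the ratio (first-frame brightness / last-frame brightness) at the last frame, so that the last frame ends up as bright as the first without a sudden jump in between. The geometric interpolation of the gain, the use of a single mean brightness per frame, and the 8-bit pixel range are assumptions of this example.

```python
def closing_gains(mean_brightness):
    """Per-frame gains that bring the last frame's mean to the first frame's."""
    n = len(mean_brightness)
    if n < 2:
        return [1.0] * n
    if mean_brightness[0] <= 0 or mean_brightness[-1] <= 0:
        raise ValueError("mean brightness must be positive")
    ratio = mean_brightness[0] / mean_brightness[-1]
    # Gain is 1 for the first frame and reaches `ratio` at the last frame.
    return [ratio ** (i / (n - 1)) for i in range(n)]

def apply_gain(pixels, gain):
    """Scale 8-bit pixel values by a gain and clamp to the range 0..255."""
    return [max(0, min(255, round(p * gain))) for p in pixels]
```

For example, with per-frame means of 100, 105, 110 and 120, the last frame receives a gain of about 0.83 and the intermediate frames receive gains between that value and 1.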
The device according to any one of claims 23-37, wherein the processor is further configured to:

Synchronously displaying the panoramic video and a progress bar associated with the panoramic video, wherein the progress bar is used to describe the preset trajectory and the position in the preset trajectory corresponding to the panoramic video displayed at the current moment, and the preset trajectory described by the progress bar is connected end to end.
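A progress bar that is connected end to end can be pictured as a ring on which the marker returns to its starting point after the last frame. The following sketch maps a frame index to a position on such a ring. Drawing the trajectory as a circle, along with the function name and its parameters, is an assumption for illustration. The claims only require that the displayed trajectory be closed.

```python
import math

def progress_on_ring(frame_index, frame_count, radius=1.0):
    """Return (fraction of the loop, (x, y) marker position on the ring)."""
    if frame_count <= 0:
        raise ValueError("frame_count must be positive")
    # The modulo makes the marker wrap around, so the bar has no end point.
    fraction = (frame_index % frame_count) / frame_count
    angle = 2.0 * math.pi * fraction
    return fraction, (radius * math.cos(angle), radius * math.sin(angle))
```

Because the index wraps, playing past the last frame places the marker back at the start of the ring, which mirrors the seamless transition from the last frame to the first frame of the video.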
The device according to any one of claims 23-38, wherein the processor is further configured to:

generating the preset trajectory according to a user instruction;

controlling the movable platform to run around the target object along the preset trajectory;

During the process of the movable platform running around the target object along a preset trajectory, the photographing device is controlled to photograph the first image.
A device for generating panoramic video, characterized in that the device comprises:

memory for storing executable instructions;

a processor, configured to execute the instructions stored in the memory, so that the processor performs the following steps:

Obtaining multiple frames of images obtained by the shooting device of the movable platform shooting the target object at different angles, the images being taken while the movable platform runs around the target object along a preset trajectory, wherein the displacement between the first and last ends of the preset trajectory is greater than or equal to a first preset threshold;

A panoramic video is generated based on the multiple frames of images, wherein, in the panoramic video, the image displacement between the first frame and the last frame is less than or equal to a second preset threshold.
The device according to claim 40, wherein, before obtaining the multiple frames of images obtained by the photographing device of the movable platform photographing the target object at different angles, the steps further comprise:

In response to a user's shooting instruction, the movable platform is controlled to move around the target object along the preset trajectory while shooting.
The device according to claim 40, wherein the target object is determined based on a user's selection instruction, or the target object is identified based on an image.
The device according to claim 40, wherein the preset trajectory is determined by the system based on one or more of the type, size, shape or movement characteristics of the target object, or the preset trajectory is determined in response to a user's setting command.
The device according to any one of claims 40-43, further comprising:

Synchronously displaying the panoramic video and a progress bar associated with the panoramic video, wherein the progress bar is used to describe the preset trajectory and the position in the preset trajectory corresponding to the panoramic video displayed at the current moment, and the preset trajectory described by the progress bar is connected end to end.
A mobile platform, characterized in that the mobile platform comprises:

Movable platform body;

A photographing device, mounted on the movable platform body, for photographing the target object;

And, the device for generating a panoramic video according to any one of claims 23-44, wherein the device for generating a panoramic video is communicatively connected to the shooting device, and is used to generate a panoramic video based on an image captured by the shooting device.
A computer storage medium on which a computer program is stored, wherein the method for generating panoramic video according to any one of claims 1 to 22 is implemented when the program is executed by a processor.