WO2023159611A1 - Image photographing method and device, and movable platform - Google Patents

Image photographing method and device, and movable platform

Info

Publication number
WO2023159611A1
Authority
WO
WIPO (PCT)
Prior art keywords
target object
image
target
movable platform
focal length
Prior art date
Application number
PCT/CN2022/078433
Other languages
French (fr)
Chinese (zh)
Inventor
刘宝恩
王涛
李鑫超
Original Assignee
深圳市大疆创新科技有限公司
Priority date
Filing date
Publication date
Application filed by 深圳市大疆创新科技有限公司
Priority to PCT/CN2022/078433 (WO2023159611A1)
Priority to CN202280050536.7A (CN117716702A)
Publication of WO2023159611A1

Classifications

    • G05D 1/00 Control of position, course or altitude of land, water, air, or space vehicles, e.g. automatic pilot (G Physics; G05 Controlling; Regulating; G05D Systems for controlling or regulating non-electric variables)
    • G06T 7/00 Image analysis (G Physics; G06 Computing; Calculating or counting; G06T Image data processing or generation, in general)
    • G06T 7/246 Analysis of motion using feature-based methods, e.g. the tracking of corners or segments (G06T 7/20 Analysis of motion)

Definitions

  • The present disclosure relates to the technical field of unmanned control and, in particular, to an image capture method, device, and movable platform capable of improving the accuracy of unmanned automatic capture.
  • With the development of unmanned aerial vehicle technology, tasks such as power inspection, bridge inspection, and oil and gas pipeline inspection, which require repeated inspection of targets, are gradually being undertaken by unmanned aerial vehicles. During the inspection process, the unmanned aerial vehicle needs to take precise photos/videos of the target, so that the status of the inspection target can later be compared and checked to verify the working state of key equipment.
  • In the related art there is a teaching/replay inspection scheme based on a zoom lens: the position, attitude, and focal length of the movable platform used to photograph the target object are recorded during the teaching process, and the target object is then photographed automatically according to that position, attitude, and focal length during the replay process. This scheme, however, cannot correct gimbal control deviation or fuselage shake caused by strong wind during replay.
  • The purpose of the present disclosure is to provide an image capturing method, device, and movable platform for overcoming, at least to a certain extent, the problem of inaccurate target object positioning during unmanned shooting that exists in the related art.
  • According to a first aspect of the embodiments of the present disclosure, an image capturing method is provided, applied to a movable platform on which a camera device is mounted, the movable platform collecting images through the camera device. The method includes: collecting a first target image at a first preset focal length, and identifying a target object in the first target image; when the target object is identified in the first target image, adjusting the position and attitude of the movable platform according to the position of the target object in the first target image, so that the target object is located at a preset position in the picture captured by the camera device; and when the target object cannot be identified in the first target image, continuously increasing the focal length of the camera device and identifying the target object according to the current image collected by the camera device, until the focal length of the camera device is equal to a second preset focal length. In the process of adjusting the focal length of the camera device, whenever the target object is identified according to the current image collected by the camera device, the position and attitude of the movable platform are adjusted according to the position of the target object in the current image, so that the target object is located at the preset position in the picture captured by the camera device.
  • According to a second aspect of the present disclosure, an image capture device is provided, comprising: a memory configured to store program code; and one or more processors coupled to the memory and configured to execute, based on instructions stored in the memory, the following method: collecting a first target image at a first preset focal length and identifying a target object in the first target image; when the target object is identified in the first target image, adjusting the position and attitude of the movable platform according to the position of the target object in the first target image, so that the target object is located at a preset position in the picture captured by the camera device; and when the target object cannot be identified in the first target image, continuously increasing the focal length of the camera device and identifying the target object according to the current image collected by the camera device, until the focal length of the camera device is equal to a second preset focal length; wherein, in the process of adjusting the focal length of the camera device, whenever the target object is identified according to the current image collected by the camera device, the position and attitude of the movable platform are adjusted according to the position of the target object in the current image, so that the target object is located at the preset position in the picture captured by the camera device.
  • According to a third aspect of the present disclosure, a movable platform is provided, including: a body; a power system provided on the body and used to provide power for the movable platform; a camera device arranged on the body and used for capturing images; a memory; and a processor coupled to the memory, the processor being configured to execute, based on instructions stored in the memory, the image capturing method described in any of the preceding items.
  • According to a fourth aspect of the present disclosure, a computer-readable storage medium is provided, on which a program is stored; when the program is executed by a processor, the image capturing method described in any one of the above items is implemented.
  • By locating the target object at a shorter focal length and then repeatedly adjusting the pose of the movable platform, increasing the focal length, and relocating the target object, the embodiments of the present disclosure keep the camera aligned with the target object through an iterative alignment process.
  • During unmanned automatic shooting, the movable platform can thereby overcome problems such as an inaccurate given position (control error from the teaching process) and fuselage shake caused by strong wind or other external forces, so that accurate unmanned automatic shooting is realized.
  • FIG. 1 is a flowchart of an image capturing method in an exemplary embodiment of the present disclosure.
  • Fig. 2 is a sub-flowchart of step S102 in an embodiment of the present disclosure.
  • FIG. 3 is a sub-flowchart of step S104 in an embodiment of the present disclosure.
  • FIG. 4 is a sub-flowchart of step S106 in an embodiment of the present disclosure.
  • Fig. 5 is a flowchart of an image capture method in an embodiment of the present disclosure.
  • Fig. 6 is a flowchart of an image capture method in another embodiment of the present disclosure.
  • FIG. 7 is a schematic diagram of a movable platform in one embodiment of the present disclosure.
  • FIG. 8 is a block diagram of an image capturing device in an exemplary embodiment of the present disclosure.
  • Example embodiments will now be described more fully with reference to the accompanying drawings.
  • Example embodiments may, however, be embodied in many forms and should not be construed as limited to the examples set forth herein; rather, these embodiments are provided so that this disclosure will be thorough and complete and will fully convey the concept of example embodiments to those skilled in the art.
  • the described features, structures, or characteristics may be combined in any suitable manner in one or more embodiments.
  • numerous specific details are provided in order to give a thorough understanding of embodiments of the present disclosure.
  • Those skilled in the art will appreciate that the technical solutions of the present disclosure may be practiced with one or more of the specific details omitted, or that other methods, components, devices, steps, etc. may be adopted.
  • well-known technical solutions have not been shown or described in detail to avoid obscuring aspects of the present disclosure.
  • FIG. 1 is a flowchart of an image capturing method in an exemplary embodiment of the present disclosure.
  • the method shown in FIG. 1 can be applied to a movable platform, where a camera is mounted on the movable platform, and images are collected by the movable platform through the camera.
  • an image capture method 100 may include:
  • Step S102: collecting a first target image at a first preset focal length, and identifying a target object in the first target image;
  • Step S104: when the target object is identified in the first target image, adjusting the position and attitude of the movable platform according to the position of the target object in the first target image, so that the target object is located at a preset position in the picture captured by the camera device;
  • Step S106: when the target object cannot be identified in the first target image, continuously increasing the focal length of the camera device and identifying the target object according to the current image collected by the camera device, until the focal length of the camera device is equal to a second preset focal length; wherein, in the process of adjusting the focal length of the camera device, whenever the target object is identified according to the current image collected by the camera device, the position and attitude of the movable platform are adjusted according to the position of the target object in the current image, so that the target object is located at the preset position in the picture captured by the camera device.
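  • For illustration only, the loop formed by steps S102 to S106 can be sketched as below. This is a minimal Python sketch, not part of the disclosure; the camera, platform, and detector objects and their methods (set_focal_length, capture_frame, adjust_pose, detect, take_photo) are hypothetical placeholders.

```python
def capture_target(cam, platform, detector, f1, f2, step):
    """Sketch of steps S102-S106: start at the short focal length f1, re-center
    the target whenever it is detected, and zoom in steps until f2 is reached."""
    cam.set_focal_length(f1)                       # step S102: first preset focal length
    frame = cam.capture_frame()
    box = detector.detect(frame)                   # identify the target object

    focal = f1
    while focal < f2:                              # step S106: keep increasing the focal length
        if box is not None:                        # step S104: re-center the target
            dx, dy = offset_from_center(box, frame.shape)
            platform.adjust_pose(dx, dy)           # adjust position and attitude
        focal = min(focal + step, f2)
        cam.set_focal_length(focal)
        frame = cam.capture_frame()
        box = detector.detect(frame)               # re-identify at the new focal length

    return cam.take_photo() if box is not None else None


def offset_from_center(box, shape):
    """Pixel offset of the box center (x, y, w, h) from the image center."""
    h, w = shape[:2]
    cx, cy = box[0] + box[2] / 2, box[1] + box[3] / 2
    return cx - w / 2, cy - h / 2
```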
  • The movable platform may be, for example, an unmanned aerial vehicle (UAV), and the camera device is mounted on the UAV.
  • the movable platform further includes a pan-tilt, the pan-tilt is mounted on the UAV, and the camera device is mounted on the pan-tilt.
  • the method provided by the embodiments of the present disclosure can be applied in the replay phase of the target detection task and the tracking phase of the target tracking task, so as to perform automatic focusing and automatic shooting on the target object.
  • The replay stage of the target detection task builds on the teaching stage: during the teaching stage, the movable platform (such as an unmanned aerial vehicle) is manually controlled, and the shooting position, shooting height, focal length, etc. for the target to be detected (such as a communication base station or an electric tower) are recorded.
  • In the replay stage, the movable platform automatically and regularly shoots the target to be detected according to the recorded shooting position, shooting height, and focal length, so as to realize automatic, regular monitoring of the target and improve monitoring efficiency.
  • In the target tracking task, a tracking range is set for a target to be detected (such as a person, an animal, or a vehicle), and the movable platform (such as an unmanned aerial vehicle) automatically tracks the target to monitor its real-time position and observe its actions and state.
  • UAV power inspection needs to regularly check the safety of key tower components, so regular and repeated inspections are required; automated inspection can therefore greatly improve efficiency and accuracy.
  • Automatic power inspection can be divided into teaching mode and replay mode.
  • The teaching mode is a fully manual or semi-automatic operation in which a normal power inspection task is performed; at each waypoint, data such as the attitude of the aircraft and of the gimbal at each photo (video) point are recorded.
  • The replay mode is a fully automatic flight: the aircraft flies to each waypoint in turn, adjusts the gimbal attitude according to the waypoint data, and takes the corresponding photos/videos.
  • In the related art, the replay mainly follows a single target detection: the target is detected only once, and then the gimbal pose and camera focal length are directly adjusted to shoot the target.
  • This control process takes a long time. If a large cumulative deviation occurs in the gimbal control during the process, or the fuselage shakes noticeably due to external factors such as excessive wind, then, because the camera is more sensitive to control deviation and body shake at a telephoto focal length, the target in the final shooting result may deviate obviously from the center of the frame or even fall outside the frame. In other words, the existing teaching/replay inspection scheme cannot solve the problems of gimbal control deviation and fuselage shake caused by factors such as strong wind, so gimbal stabilization plays an important role in the reliability of target shooting.
  • To this end, the embodiments of the present disclosure first use a feature point matching method at a low-magnification (short) focal length to detect the precise position of the inspection target in the frame, which serves as the initialization box for precise photography during replay; then, while the gimbal attitude and zoom are being adjusted, a target tracking algorithm tracks the position of the target in the frame and corrects the gimbal control in real time to keep the target in the center of the frame, realizing gimbal tracking of the target and achieving a gimbal stabilization effect.
  • Once zooming is completed and the target is kept in the center of the frame, the target is photographed, completing the precise shooting of the inspection target.
  • This process of searching for the target at a small focal length, tracking the target with the gimbal, and then shooting the target at a large focal length enables the aircraft to capture the details of the target stably, accurately, and clearly at a relatively long distance, achieving a good inspection effect.
  • In step S102, a first target image is captured at the first preset focal length, and the target object is identified in the first target image.
  • In the replay phase of the target detection task, the second preset focal length and the position and attitude of the movable platform when capturing the first target image are all obtained from the teaching data of the teaching phase.
  • the first preset focal length can be obtained according to the tracking range of the target tracking task.
  • The first preset focal length may be a focal length shorter than the second preset (shooting) focal length set in the teaching stage, so as to obtain a larger shooting field of view and facilitate real-time positioning of the target object.
  • the position (including latitude and longitude and altitude) and posture (including lens orientation) of the movable platform when the first target image is collected are obtained according to the teaching data in the teaching phase.
  • the first preset focal length can be obtained according to the tracking range of the target tracking task, and the position and posture of the movable platform during shooting can also be obtained according to the tracking range of the target tracking task.
  • For example, the shooting height may be kept more than x meters away from the target object to prevent the target object from discovering the platform (x is a value greater than zero).
  • Fig. 2 is a sub-flowchart of step S102 in an embodiment of the present disclosure.
  • step S102 may include:
  • Step S1021 acquiring the feature points of the target object according to the standard feature image of the target object;
  • Step S1022 performing feature point recognition in the first target image according to the feature points of the target object, so as to determine a positioning frame of the target object.
  • the feature points of the target object can be obtained in advance according to the standard feature image of the target object.
  • For example, the feature points of the target object can be extracted according to the standard feature image of the target object provided in the teaching stage, or according to a pre-provided standard feature image of the target object.
  • a convolutional neural network may be used to extract local feature points from the current image, and the extracted local feature points are compared with feature points of the target object to identify the target object.
  • the target object can be identified in the current image through the trained neural network model.
  • The neural network model may be, for example, a convolutional neural network (CNN) model. Based on the learnability of a CNN, using a CNN to extract feature points can better handle target objects, such as tower insulators, that have no local texture, sparse texture, or repeated texture, and can accurately obtain the position of the target object.
  • In other embodiments, an image block detection method may also be used to identify the target object, or target recognition algorithms such as the scale-invariant feature transform (SIFT) algorithm, the Speeded Up Robust Features (SURF) algorithm, or the Oriented FAST and Rotated BRIEF (ORB) algorithm may be used, or target detection may be performed with various deep learning algorithms; the present disclosure places no particular restriction on this.
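  • As an illustration of the feature-matching route, the following sketch uses OpenCV's ORB detector with a brute-force matcher to locate the region of a teaching image inside a replay image. It is only one possible realization under assumed parameters, not the method of the disclosure itself, which also covers CNN-based local features, SIFT, SURF, and image-block detection.

```python
import cv2
import numpy as np


def locate_target(teach_img, replay_img, min_matches=10):
    """Match ORB features of the teaching image against the replay image and
    return a bounding box (x, y, w, h) around the matched points, or None."""
    orb = cv2.ORB_create(nfeatures=2000)
    kp1, des1 = orb.detectAndCompute(teach_img, None)
    kp2, des2 = orb.detectAndCompute(replay_img, None)
    if des1 is None or des2 is None:
        return None

    matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
    matches = sorted(matcher.match(des1, des2), key=lambda m: m.distance)
    if len(matches) < min_matches:
        return None                                   # target not found in this frame

    # Bounding box of the best-matched keypoint positions in the replay image.
    pts = np.float32([kp2[m.trainIdx].pt for m in matches[:50]])
    x, y, w, h = cv2.boundingRect(pts)
    return x, y, w, h
```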
  • feature point recognition can be performed starting from the center point of the first target image.
  • Because the teaching data assumes that the current pose can accurately capture the target object, that is, that the target object is located near the center of the first target image, starting feature point recognition from the center point of the first target image can improve recognition efficiency.
  • In step S104, when the target object is identified in the first target image, the position and attitude of the movable platform are adjusted according to the position of the target object in the first target image, so that the target object is located at a preset position in the picture captured by the camera device.
  • the preset position may include, for example, a central area of the captured image, and the central area may be, for example, a smaller area within a preset length and width around the central point of the captured image, and its shape may be, for example, a rectangle or a circle.
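  • Such a central area can be expressed as a simple membership test, sketched below; the half-width, half-height, and radius thresholds are assumed values chosen only for illustration.

```python
def in_central_rect(cx, cy, width, height, half_w=50, half_h=50):
    """True if the point (cx, cy) lies within a rectangle of the given
    half-size around the image center."""
    return abs(cx - width / 2) <= half_w and abs(cy - height / 2) <= half_h


def in_central_circle(cx, cy, width, height, radius=50):
    """True if the point (cx, cy) lies within a circle of the given radius
    around the image center."""
    return (cx - width / 2) ** 2 + (cy - height / 2) ** 2 <= radius ** 2
```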
  • FIG. 3 is a sub-flowchart of step S104 in an embodiment of the present disclosure.
  • step S104 may include:
  • Step S1041 taking the center of the first target image as the coordinate origin, and determining the first coordinates of the target object in the first target image;
  • Step S1042 determining the position and attitude adjustment values of the movable platform according to the first coordinates, where the position and attitude adjustment values include at least one of a horizontal adjustment value, a vertical adjustment value, and a rotation angle adjustment value;
  • Step S1043 adjusting the position and attitude of the movable platform according to the position and attitude adjustment value.
  • Specifically, the recognition box of the target object can be obtained, the center of the first target image is used as the coordinate origin, and the coordinates of the recognition box in the first target image are determined as the coordinates of the target object, so as to determine the first coordinates of the target object.
  • position and attitude adjustment values of the movable platform can be determined according to the first coordinates.
  • For example, the movable platform can be controlled to move 50 coordinate units in the positive direction of the X axis and 10 coordinate units in the negative direction of the Y axis, so that the target object coincides with the coordinate origin, that is, the center of the first target image, and the target is therefore located at the center of the next frame image.
  • The proportional relationship between a coordinate unit and the movement of the movable platform can be determined according to the scale used in the current coordinate system.
  • the scale can be determined according to the flying height of the movable platform or the distance between the movable platform and the target object, which is not particularly limited in the present disclosure.
  • In some embodiments, the shooting angle of the movable platform can also be adjusted, that is, the rotation angle of the movable platform can be controlled, so as to capture the target surface of the target object as far as possible, such as the side of an electric tower on which a key facility is installed, or the face of a tracked target.
  • The rotation angle adjustment value of the movable platform can be obtained by first determining, from the feature recognition results of the target object, the angle difference between the side of the target object currently photographed and the standard side to be photographed, and then converting this angle difference according to the scale of the current coordinate system.
  • In some embodiments, an adjustment value for the distance between the movable platform and the target object may also be determined. For example, when an unmanned aerial vehicle takes a bird's-eye view of a target object and the flying height of the fuselage rises under the influence of strong wind, the recognition box of the target object in the current image becomes smaller than the recognition box from the teaching stage or a preset recognition box; in this case, the flying height of the movable platform can be readjusted according to the set flying height, or a distance adjustment value can be converted from the proportional relationship between the size of the recognition box and its preset standard value, and the UAV can be controlled to approach or move away from the target. There are many types of position and attitude adjustment values for the movable platform, and those skilled in the art can set them according to the actual situation.
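  • The conversion from the first coordinates to the adjustment values discussed above can be sketched as follows. The scale factors (metres per coordinate unit, metres per unit of box-size error) are hypothetical and would in practice depend on the flying height or the distance to the target.

```python
from dataclasses import dataclass


@dataclass
class PoseAdjustment:
    horizontal: float   # movement along the X axis, in metres
    vertical: float     # movement along the Y axis, in metres
    distance: float     # movement towards (+) or away from (-) the target, in metres


def adjustment_from_coords(x, y, box_size, ref_box_size,
                           metres_per_unit=0.01, metres_per_scale=1.0):
    """x, y: first coordinates of the target with the image center as origin.
    box_size / ref_box_size: current recognition-box size vs. its preset standard."""
    horizontal = -x * metres_per_unit             # move so the target returns to the origin
    vertical = -y * metres_per_unit
    scale_error = 1.0 - box_size / ref_box_size   # box smaller than the reference: move closer
    distance = scale_error * metres_per_scale
    return PoseAdjustment(horizontal, vertical, distance)
```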
  • the movable platform can be controlled to adjust its position and attitude so that the shooting center of the camera device is aligned with the target object.
  • the shooting center of the camera can be aligned with the target object by adjusting the position and attitude of the pan/tilt.
  • After the pose of the movable platform is adjusted, the current shooting center is regarded as aligned with the target object by default, and the focal length can then be adjusted to the second preset focal length to shoot the target object.
  • the second preset focal length is a preset focal length value at which the target object can be accurately observed.
  • In the replay stage, the second preset focal length may be, for example, the ideal shooting focal length set in the teaching stage; in the tracking stage, it may be, for example, a preset tracking shooting focal length value.
  • In the process of adjusting the position and attitude of the movable platform, the target object is continuously tracked so that it remains in the picture captured by the camera device.
  • In step S106, when the target object cannot be identified in the first target image, the focal length of the camera device is continuously increased, and the target object is identified according to the current image collected by the camera device until the focal length of the camera device is equal to the second preset focal length.
  • In this process, whenever the target object is identified, the position and attitude of the movable platform are adjusted so that the target object is located at a preset position in the picture captured by the camera device.
  • continuously increasing the focal length of the imaging device includes: continuously increasing the focal length of the imaging device with a preset step size.
  • the current image collected by the camera device in step S106 is not the captured image, but the cached data of the real-time field of view of the camera device.
  • The current image is used to assist image recognition and position analysis, and is deleted within a short period of time. Therefore, when the embodiment of the present disclosure is executed by a processor, the processor acquires the current image in real time and analyzes it to identify the target object, so that, while continuously zooming, the pose of the movable platform is adjusted according to the position of the target object in the current image, keeping the target object at a preset position in the picture captured by the camera device, the preset position being, for example, a central area.
  • the central area may be, for example, a smaller area within a preset length and width range around the central point of the current image, and its shape is, for example, a rectangle or a circle.
  • FIG. 4 is a sub-flowchart of step S106 in an embodiment of the present disclosure.
  • step S106 may include:
  • Step S1061: acquiring feature points of the target object according to the standard feature image of the target object;
  • Step S1062: performing feature point recognition in the current image collected by the camera device according to the feature points of the target object, so as to determine a positioning box of the target object.
  • The process of identifying the target object in step S106 is similar to the process of identifying the target object in step S104; both can perform feature point identification based on various algorithms such as a convolutional neural network (CNN), and details are not repeated here.
  • feature point recognition can also be performed from the center point of the current image to improve recognition efficiency.
  • Feature point recognition starting from the center point of the current image can also be performed near the position of the recognition box in the previous frame image, so as to determine the recognition box of the target object in the current image.
  • In some embodiments, the feature points of the target object can also be obtained from the image features inside the recognition box of the target object in the previous frame image, so that the feature information of the target object stays based on the latest observations and the recognition accuracy is improved.
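  • The idea of searching near the previous recognition box can be sketched as follows: the search is restricted to a window around the last known box, and the feature template is refreshed from the newly found box. The margin value and the locate_target helper from the earlier feature-matching sketch are assumptions made only for illustration.

```python
def track_in_roi(prev_box, template_img, frame, margin=80):
    """Search for the target only in a window around the previous box, then
    convert the result back to full-frame coordinates."""
    x, y, w, h = prev_box
    x0, y0 = max(0, x - margin), max(0, y - margin)
    x1 = min(frame.shape[1], x + w + margin)
    y1 = min(frame.shape[0], y + h + margin)
    roi = frame[y0:y1, x0:x1]

    box = locate_target(template_img, roi)           # reuse the feature-matching sketch
    if box is None:
        return None, template_img                    # target lost in this frame
    bx, by, bw, bh = box
    new_box = (x0 + bx, y0 + by, bw, bh)
    # Refresh the feature template from the latest observation of the target.
    new_template = frame[new_box[1]:new_box[1] + bh, new_box[0]:new_box[0] + bw]
    return new_box, new_template
```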
  • The robust automatic inspection solution proposed in this disclosure, based on feature matching to locate the target and on gimbal tracking, can solve the problems of automatic precise positioning and stable shooting of targets such as electric towers; it corrects gimbal control deviation to achieve a gimbal stabilization effect, so that the target is always kept in the center of the frame during the control process, until zooming ends and a complete, clear image of the inspection target is captured.
  • The present disclosure extracts local features based on a CNN, which better adapts to situations where the target has no local texture or has repeated texture.
  • The method proposed in the embodiments of the present disclosure obtains the initialization box for accurate photography by locating the target through local feature matching, and then combines it with target gimbal tracking to stabilize the gimbal during the replay process, finally achieving an accurate and robust inspection target shooting effect.
  • It can solve the problem that, during automatic inspection, the target is not in the center of the frame or not in the field of view because camera (aircraft) positioning or gimbal control is not accurate enough, or because the camera (aircraft) shakes due to external factors such as strong wind, and it can be applied to industrial drones with inspection functions.
  • The gimbal stabilization inspection solution proposed in the embodiments of the present disclosure, which combines the precise photography initialization box with gimbal tracking, is an automatic inspection solution with closed-loop control.
  • Fig. 5 is a flowchart of an image capture method in an embodiment of the present disclosure.
  • an image capturing method 500 may include:
  • Step S501: shooting a replay image at a short focal length;
  • Step S502: identifying the target object through the CNN algorithm;
  • Step S503: extracting feature points and descriptors of the target object;
  • Step S501': obtaining the target teaching image;
  • Step S503': extracting feature points and descriptors of the target object from the teaching image;
  • Step S504: matching the feature points;
  • Step S505: calculating the target box in the short-focus reshot image;
  • Step S506: adjusting the gimbal pose and zoom;
  • Step S507: tracking;
  • Step S508: determining whether zooming and control are finished; if yes, proceed to step S509 to shoot the inspection target with the telephoto focal length and then end the process; if not, return to step S506 to readjust the gimbal pose and zoom.
  • In step S501, short-focus shooting of the replay image refers to capturing an image at the first preset focal length, which is a real-time observation of the target object.
  • In step S501', the teaching image of the target object, that is, the target teaching image, is obtained from the teaching data.
  • the CNN algorithm is used to identify the target object in two pictures (replay image and teaching picture) and extract the feature points and descriptors of the target object in the two pictures.
  • the target positioning method based on local feature point matching can better deal with irregular shape targets and reduce the false detection problem caused by background matching.
  • the feature point matching can use local feature extraction methods, or traditional methods, including but not limited to SIFT, SURF, ORB algorithms, etc.
  • In step S504, feature point matching is performed on the feature points and descriptors extracted from the two pictures, so that in step S505 the target box in the short-focus reshot image is calculated according to the matching result; this is also called extracting the initialization box.
  • the initialization box acquisition method can be replaced by an image patch-based detection method, which is not limited to traditional methods or deep learning methods.
  • Steps S506 to S508 are a process of cyclically and iteratively adjusting the pose of the pan/tilt.
  • Here, zooming refers to continuously increasing the focal length, for example increasing it by a preset step; adjusting the gimbal pose refers, for example, to adjusting parameters such as the horizontal direction, the vertical direction, and the deflection angle so as to adjust the shooting angle. Adjusting the gimbal pose and zooming can be performed simultaneously while the camera device continues to collect real-time images.
  • During step S506, target tracking is performed in step S507 according to the images continuously collected by the camera device and the recognized positions of the target object in those images.
  • The target tracking can be completed directly with the same feature point matching method used to obtain the initialization box, that is, for each frame (or every few frames), the feature point matching method is used to update the position of the target box.
  • The criterion for ending zooming and control is that the predetermined focal length has been reached and the target box is still in the center of the frame; shooting the target at this point completes the accurate reshooting of the inspection target with the gimbal stabilization effect.
  • When step S508 judges that zooming and control are finished, the process goes to step S509, where the telephoto focal length (the second preset focal length) is used to photograph the inspection target, and the process then ends.
  • The embodiments of the present disclosure use a CNN-based feature point detection method to obtain an accurate photography initialization box and use a target gimbal tracking method to stabilize the gimbal, realizing closed-loop control of inspection target shooting and overcoming the shooting deviation caused, during zooming and gimbal pose adjustment, by gimbal control error and external factors such as strong wind. This has at least the following advantages:
  • the local features extracted by the CNN can better adapt to targets with no local texture or with repeated texture, and can accurately obtain the position of the target box;
  • Fig. 6 is a flowchart of an image capture method in another embodiment of the present disclosure.
  • the complete process of the image capturing method may include:
  • Step S601 acquiring a first image at a first preset focal length, configuring the first image as a current image, and configuring the first preset focal length as a current focal length.
  • Step S602 identifying the target object in the current image.
  • Step S603 judging whether the target object is recognized, if the target object is recognized, go to step S604, otherwise go to step S613.
  • Step S604 updating the current feature points of the target object according to the recognition result of the target object in the current image.
  • Step S605: judging whether the target object is located in the center of the current image; if not, proceed to step S606 to adjust the pose according to the coordinate difference between the target object and the center of the current image, and then proceed to step S607; if yes, proceed directly to step S607.
  • Step S607: judging whether the current focal length is equal to the second preset focal length; if it is equal, proceed to step S609; otherwise, proceed to step S608, increase the current focal length, acquire a second image, configure the second image as the current image, and return to step S602.
  • Step S609 acquiring a third image, and configuring the third image as a current image.
  • Step S610: judging whether the target object is located in the center of the current image; if not, proceed to step S611 to adjust the pose according to the coordinate difference between the target object and the center of the current image, and return to step S609 until the target object is located in the center of the current image; if yes, proceed to step S612 to shoot the target object.
  • Step S613: if the target object is not identified in the current image, judging whether the current focal length is equal to the third preset focal length; if not, proceed to step S614, reduce the current focal length, acquire a fourth image, configure the fourth image as the current image, and then return to step S602; if yes, proceed to step S615 to output recognition failure information.
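  • A condensed sketch of the flow in FIG. 6, including the fallback to the third preset focal length when the target is lost, is given below. The camera, platform, and detector objects, their methods, and the centering tolerance are hypothetical placeholders, and offset_from_center is reused from the earlier sketch.

```python
def replay_capture(cam, platform, detector, f1, f2, f3, step, tol=10):
    """Sketch of FIG. 6: iterative zoom from f1 to f2, falling back towards f3
    when the target cannot be found, and shooting once centered at f2."""
    focal = f1
    cam.set_focal_length(focal)
    while True:
        frame = cam.capture_frame()                    # current image (S601/S608/S609/S614)
        box = detector.detect(frame)                   # identify the target (S602/S603)
        if box is None:                                # target lost (S613-S615)
            if focal <= f3:
                return None                            # output recognition-failure information
            focal = max(focal - step, f3)              # reduce focal length, widen the view
            cam.set_focal_length(focal)
            continue
        detector.update_template(frame, box)           # refresh current feature points (S604)
        dx, dy = offset_from_center(box, frame.shape)
        if abs(dx) > tol or abs(dy) > tol:             # re-center if needed (S605/S606, S610/S611)
            platform.adjust_pose(dx, dy)
            continue
        if focal < f2:                                 # not yet at the shooting focal length (S607)
            focal = min(focal + step, f2)              # increase focal length (S608)
            cam.set_focal_length(focal)
        else:
            return cam.take_photo()                    # shoot the target object (S612)
```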
  • steps S601 to S608 are an iterative zooming method when the target object can be directly recognized in the first image.
  • the current image is a parameter, not the only image, and it can be assigned a value so as to be equal to different images at different times;
  • The current focal length is also a parameter rather than a single fixed value; it too can be assigned so that it equals different focal length values at different moments.
  • The method for identifying the target object in steps S602 and S603 is as described in the above embodiments and is not repeated here.
  • the current feature point of the target object is a parameter.
  • Initially, the current feature points of the target object are equal to the feature points of the target object in the teaching image.
  • As the current image is assigned to subsequently collected images, the parameter holding the feature points of the target object is updated in real time according to the feature points of the newly recognized target object, so that the feature points of the target object stay based on the latest recognition data and the recognition errors caused by the discrepancy between the teaching data and the real-time situation are reduced.
  • In steps S605 and S606, if the target object is not located in the center of the current image (or the central area described in the preceding embodiments), the position and attitude of the gimbal or the UAV can be adjusted according to the coordinate difference between the target object and the center of the current image.
  • the center of the current image can be used as the origin for calculation.
  • In step S607, if the target object is located at the center of the current image, or is located at the center of the current image after pose adjustment, it is judged whether the current focal length has reached the set shooting focal length (the second preset focal length). If it has not been reached, the process proceeds to step S608: the focal length is further increased to enlarge the proportion of the target object in the image and improve its shooting clarity, a second image is collected at the newly increased focal length, and the method of steps S602 to S607 is used to perform pose adjustment according to the image captured at the larger focal length, so that the target object is also located at the shooting center of the current image at the larger focal length, until the preset shooting focal length, that is, the second preset focal length, is reached.
  • Steps S609 to S612 are the process of fine-tuning the shooting at the second preset focal length to complete the shooting.
  • In step S609, the third image can be captured directly at the second preset focal length, and the third image can be saved as the shooting result of the target object.
  • When the current focal length becomes equal to the second preset focal length after the focal length is increased, the second image taken at the second preset focal length is set as the current image; thus, in step S606, the pose (of the unmanned aerial vehicle or gimbal) is adjusted according to the target object in the current image, and after the judgment of step S607 the process proceeds directly to step S609 to acquire the third image.
  • In step S610, a judgment is made on the third image acquired (not necessarily shot) after the pose adjustment at the second preset focal length. If the pose adjustment is not yet in place (the target object is not located at the center of the current image/third image), the process goes to step S611 to continue adjusting the pose; once the pose is in place, the process goes to step S612 to take the picture.
  • The acquired first image, second image, and third image can all be real-time image data seen through the lens before the camera's shutter is pressed.
  • Steps S613 to S615 are a processing method when the target is found to be lost.
  • the current focal length can be reduced to increase the shooting field of view and re-identify and locate the target object.
  • recognition failure information may be output to report that the target is lost.
  • the third preset focal length is shorter than the first preset focal length, and the third preset focal length can be set by those skilled in the art according to the shooting capability of the camera device.
  • The method provided by the embodiment shown in FIG. 6 first controls the movable platform to shoot at the set position and height under the small first preset focal length, so as to ensure that the target object is within the lens; it then adopts an iterative zoom-and-position method to continuously locate the target object, and finally shoots the target object at the second preset focal length, which ensures a clear shot.
  • Because the embodiment shown in FIG. 6 iteratively collects images, recognizes the target object, adjusts the pose, and zooms, the feature information and position information of the target object can be updated from the images collected in real time, gradually approaching the ideal shooting pose and shooting focal length, thereby avoiding failures in shooting the target object caused by deviations in the pre-input control parameters or by fuselage shake from external forces such as strong wind.
  • the gimbal stability enhancement inspection solution provided by the embodiments of the present disclosure which combines the precise camera initialization frame and gimbal tracking, is an automatic inspection solution with closed-loop control.
  • When replaying the inspection target, the feature point matching method is first used to obtain the target box in the short-focus picture as the initialization box for accurate photography of the inspection target; the target tracking method is then used to track the target with the gimbal, to resist the cumulative control errors that may occur during gimbal pose adjustment and zooming, or the deviation caused by external forces such as strong wind.
  • When the gimbal is tracking, the current feature points of the target object are first extracted from the recognition box of the previous frame; a feature search and matching is then performed near the old box position in the current frame, the recognition box position is updated, and the gimbal is controlled so that the target object stays in the center of the shooting frame, achieving the gimbal stabilization effect. This process is iterated until zooming and control are finished.
  • The criterion for ending zooming and control is that the predetermined focal length has been reached and the target box is still in the center of the frame; shooting the target at this point completes the precise reshooting of the inspection target.
  • Target tracking by the gimbal keeps the target in the center of the frame during the entire control process, which better handles target shooting deviation, or even loss of the target, caused by gimbal control errors or external forces such as strong wind; it realizes gimbal stabilization during the replay process and finally achieves an accurate and robust inspection target shooting effect.
  • FIG. 7 is a schematic diagram of a movable platform in one embodiment of the present disclosure.
  • the movable platform 700 may include:
  • the power system 72 is located in the body, and the power system is used to provide power for the movable platform;
  • the camera device 73 is arranged on the body and is used for collecting images
  • a processor 75 coupled to the memory, and the processor is configured to execute the image capturing method of the embodiment shown in FIGS. 1 to 6 based on instructions stored in the memory.
  • the processor 75 controls the power system 72 to adjust the position and attitude of the movable platform 700 .
  • In some embodiments, the movable platform 700 further includes a pan-tilt 76 on which the camera device 73 is mounted.
  • the processor 75 controls the power system 72 to adjust the position and attitude of the pan-tilt 76 to keep the target object at the shooting center.
  • The embodiments of the present disclosure can be used for industrial drones with inspection functions, to solve the problem that, during automatic inspection, the target is not in the center of the frame or within the field of view because camera (aircraft) positioning or gimbal control is not accurate enough, or because the camera (aircraft) shakes due to external factors such as strong wind.
  • the present disclosure further provides an image capturing device, which may be used to execute the foregoing method embodiments.
  • FIG. 8 is a block diagram of an image capturing device in an exemplary embodiment of the present disclosure.
  • an image capture device 800 may include:
  • memory 81 configured to store program codes
  • a processor 82 coupled to the memory 81, the processor being configured to execute the image capturing method described above; in the process of adjusting the focal length, whenever the target object is identified in the current image, the position and attitude of the movable platform are adjusted according to the position of the target object in the current image, so that the target object is located at the preset position in the picture captured by the camera device.
  • the second preset focal length and the position and posture of the movable platform when capturing the first target image are obtained according to the teaching data in the teaching phase of the target detection task.
  • the first preset focal length is obtained according to the tracking range of the target tracking task.
  • In some embodiments, the processor 82 is configured to: determine the first coordinates of the target object in the first target image by taking the position of the central region of the first target image as the coordinate origin; determine, according to the first coordinates, a position and attitude adjustment value of the movable platform, the position and attitude adjustment value including at least one of a horizontal adjustment value, a vertical adjustment value, and a rotation angle adjustment value; and adjust the position and attitude of the movable platform according to the position and attitude adjustment value.
  • In some embodiments, the processor 82 is configured to: acquire the feature points of the target object according to the standard feature image of the target object; and perform feature point recognition in the first target image according to the feature points of the target object, so as to determine the positioning box of the target object.
  • In some embodiments, the processor 82 is configured to: acquire the feature points of the target object according to the standard feature image of the target object; and perform feature point recognition in the current image collected by the camera device according to the feature points of the target object, so as to determine the positioning box of the target object.
  • the processor 82 is configured to perform feature point recognition starting from a central area of the first target image.
  • In some embodiments, the processor 82 is configured to: use a convolutional neural network to extract local feature points from the first target image; and compare the extracted local feature points with the feature points of the target object to identify the target object.
  • the processor 82 is configured to continuously increase the focal length of the camera device with a preset step size.
  • the processor 82 is configured to continuously track the target object during the process of adjusting the position and posture of the movable platform according to the position of the target object in the first target image, so that The target object is continuously in the picture captured by the camera device.
  • The example implementations described here can be implemented by software, or by software combined with necessary hardware. Therefore, the technical solutions according to the embodiments of the present disclosure can be embodied in the form of a software product, which can be stored in a non-volatile storage medium (such as a CD-ROM, USB flash drive, or removable hard disk) or on a network, and which includes several instructions to make a computing device (which may be a personal computer, a server, a terminal device, a network device, etc.) execute the method according to the embodiments of the present disclosure.
  • a computer-readable storage medium on which a program product capable of implementing the above-mentioned method in this specification is stored.
  • Various aspects of the present disclosure can also be implemented in the form of a program product including program code; when the program product is run on a terminal device, the program code causes the terminal device to execute the steps according to the various exemplary embodiments of the present disclosure described in the "Exemplary Method" section above in this specification.
  • the program product for implementing the above method according to the embodiment of the present invention may adopt a portable compact disk read-only memory (CD-ROM) and include program codes, and may run on a terminal device such as a personal computer.
  • the program product of the present invention is not limited thereto.
  • a readable storage medium may be any tangible medium containing or storing a program, and the program may be used by or in combination with an instruction execution system, apparatus or device.
  • the program product may reside on any combination of one or more readable media.
  • the readable medium may be a readable signal medium or a readable storage medium.
  • the readable storage medium may be, for example, but not limited to, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, device, or device, or any combination thereof. More specific examples (non-exhaustive list) of readable storage media include: electrical connection with one or more conductors, portable disk, hard disk, random access memory (RAM), read only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), optical fiber, portable compact disk read-only memory (CD-ROM), optical storage devices, magnetic storage devices, or any suitable combination of the foregoing.
  • a computer readable signal medium may include a data signal carrying readable program code in baseband or as part of a carrier wave. Such propagated data signals may take many forms, including but not limited to electromagnetic signals, optical signals, or any suitable combination of the foregoing.
  • a readable signal medium may also be any readable medium other than a readable storage medium that can transmit, propagate, or transport a program for use by or in conjunction with an instruction execution system, apparatus, or device.
  • Program code embodied on a readable medium may be transmitted using any appropriate medium, including but not limited to wireless, wireline, optical fiber cable, RF, etc., or any suitable combination of the foregoing.
  • Program code for carrying out the operations of the present disclosure may be written in any combination of one or more programming languages, including object-oriented programming languages such as Java and C++, as well as conventional procedural programming languages such as "C" or similar programming languages.
  • the program code may execute entirely on the user's computing device, partly on the user's device, as a stand-alone software package, partly on the user's computing device and partly on a remote computing device, or entirely on the remote computing device or server to execute.
  • The remote computing device may be connected to the user's computing device through any kind of network, including a local area network (LAN) or a wide area network (WAN), or may be connected to an external computing device (for example, through the Internet using an Internet service provider).
  • By locating the target object at a shorter focal length and then repeatedly adjusting the pose of the movable platform, increasing the focal length, and relocating the target object, the embodiments of the present disclosure keep the camera aligned with the target object through an iterative alignment process.
  • During unmanned automatic shooting, the movable platform can thereby overcome problems such as an inaccurate given position (control error from the teaching process) and fuselage shake caused by strong wind or other external forces, so that accurate unmanned automatic shooting is realized.

Abstract

An image photographing method and device, and a movable platform. The method comprises: capturing a first target image at a first preset focal length, and identifying a target object in the first target image; when the target object is identified in the first target image, adjusting the position and orientation of a movable platform according to the position of the target object in the first target image, so as to cause the position of the target object to be located at a preset position in a photography scene of a camera device; when the target object is not identified in the first target image, continuously increasing the focal length of the camera device, and identifying the target object according to a current image captured by the camera device, until the focal length of the camera device is equal to a second preset focal length; and during adjustment of the focal length of the camera device, when the target object is recognized according to a current image captured by the camera device at any moment, adjusting the position and orientation of the movable platform according to the position of the target object in the current image. An embodiment of the present disclosure can improve the accuracy of unmanned automated photography.

Description

图像拍摄方法、装置与可移动平台Image capturing method, device and movable platform 技术领域technical field
本公开涉及无人控制技术领域,具体而言,涉及一种能够提高无人自动拍摄准确性的图像拍摄方法、装置与可移动平台。The present disclosure relates to the technical field of unmanned control, and in particular, relates to an image capture method, device and movable platform capable of improving the accuracy of unmanned automatic capture.
背景技术Background technique
随着无人飞行器技术的发展,电力巡检、桥梁巡检、输油气管道巡检等需要对目标进行反复巡检的任务逐渐由无人飞行器承担。在巡检过程中,无人飞行器需要对目标进行精准的拍照/录像,以便后期对巡检目标的状态进行比对检查,以查看关键设备的工作状态。在相关技术中存在基于变焦镜头的示教/复演的巡检方案,即记录示教过程中拍摄目标对象的可移动平台的位置、姿态和焦距,在复演过程中自动根据该位置、姿态、焦距拍摄目标对象。但是在复演过程中定位目标对象时,调整云台位置和姿态和焦距耗时较长,且无法解决示教过程中云台控制偏差、实际拍摄过程中强风等因素造成的机身晃动等导致的拍摄误差。With the development of unmanned aerial vehicle technology, tasks such as power inspections, bridge inspections, and oil and gas pipeline inspections that require repeated inspections of targets are gradually undertaken by unmanned aerial vehicles. During the inspection process, the unmanned aerial vehicle needs to take precise photos/videos of the target, so that the status of the inspection target can be compared and checked later to check the working status of key equipment. In the related art, there is a teaching/replay inspection scheme based on a zoom lens, that is, to record the position, attitude and focal length of the movable platform of the target object during the teaching process, and automatically according to the position and attitude during the replay process. , focal length to shoot the target object. However, when locating the target object in the replay process, it takes a long time to adjust the position, attitude and focus of the gimbal, and it is impossible to solve the gimbal control deviation during the teaching process and the shaking of the fuselage caused by factors such as strong winds during the actual shooting process. shooting error.
需要说明的是,在上述背景技术部分公开的信息仅用于加强对本公开的背景的理解,因此可以包括不构成对本领域普通技术人员已知的现有技术的信息。It should be noted that the information disclosed in the above background section is only for enhancing the understanding of the background of the present disclosure, and therefore may include information that does not constitute the prior art known to those of ordinary skill in the art.
SUMMARY OF THE INVENTION
An object of the present disclosure is to provide an image capturing method, an image capturing device, and a movable platform, which overcome, at least to a certain extent, the problem of inaccurate target-object positioning during unmanned shooting in the related art.
According to a first aspect of the embodiments of the present disclosure, an image capturing method is provided, applied to a movable platform on which a camera device is mounted, the movable platform capturing images through the camera device. The method includes: capturing a first target image at a first preset focal length, and identifying a target object in the first target image; when the target object is identified in the first target image, adjusting the position and attitude of the movable platform according to the position of the target object in the first target image, so that the target object is located at a preset position in the picture captured by the camera device; when the target object cannot be identified in the first target image, continuously increasing the focal length of the camera device and identifying the target object in the current image captured by the camera device, until the focal length of the camera device equals a second preset focal length; and, during adjustment of the focal length of the camera device, whenever the target object is identified in the current image captured by the camera device, adjusting the position and attitude of the movable platform according to the position of the target object in the current image, so that the target object is located at the preset position in the picture captured by the camera device.
According to a second aspect of the present disclosure, an image capturing device is provided, including: a memory configured to store program code; and one or more processors coupled to the memory, the processors being configured to execute, based on instructions stored in the memory, the following method: capturing a first target image at a first preset focal length, and identifying a target object in the first target image; when the target object is identified in the first target image, adjusting the position and attitude of the movable platform according to the position of the target object in the first target image, so that the target object is located at a preset position in the picture captured by the camera device; when the target object cannot be identified in the first target image, continuously increasing the focal length of the camera device and identifying the target object in the current image captured by the camera device, until the focal length of the camera device equals a second preset focal length; wherein, during adjustment of the focal length of the camera device, whenever the target object is identified in the current image captured by the camera device, the position and attitude of the movable platform are adjusted according to the position of the target object in the current image, so that the target object is located at the preset position in the picture captured by the camera device.
According to a third aspect of the present disclosure, a movable platform is provided, including: a body; a power system provided on the body and configured to supply power to the movable platform; a camera device provided on the body and configured to capture images; a memory; and a processor coupled to the memory, the processor being configured to execute, based on instructions stored in the memory, the image capturing method according to any one of the above items.
According to a fourth aspect of the present disclosure, a computer-readable storage medium is provided, on which a program is stored; when the program is executed by a processor, the image capturing method according to any one of the above items is implemented.
In the embodiments of the present disclosure, the target object is first located at a shorter focal length, and the pose of the movable platform is then repeatedly adjusted while the focal length is increased and the target object is relocated, so that alignment with the target object is continuously achieved through an iterative alignment process. When the unmanned movable platform shoots automatically, it can therefore overcome problems such as an inaccurate given position (control error from the teaching process) and fuselage shake caused by external forces such as strong wind, and achieve accurate unmanned automatic shooting.
It should be understood that the foregoing general description and the following detailed description are merely exemplary and explanatory, and do not limit the present disclosure.
BRIEF DESCRIPTION OF THE DRAWINGS
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the present disclosure and, together with the description, serve to explain the principles of the present disclosure. Obviously, the drawings in the following description are merely some embodiments of the present disclosure, and a person of ordinary skill in the art may obtain other drawings from them without creative effort.
FIG. 1 is a flowchart of an image capturing method in an exemplary embodiment of the present disclosure.
FIG. 2 is a sub-flowchart of step S102 in an embodiment of the present disclosure.
FIG. 3 is a sub-flowchart of step S104 in an embodiment of the present disclosure.
FIG. 4 is a sub-flowchart of step S106 in an embodiment of the present disclosure.
FIG. 5 is a flowchart of an image capturing method in an embodiment of the present disclosure.
FIG. 6 is a flowchart of an image capturing method in another embodiment of the present disclosure.
FIG. 7 is a schematic diagram of a movable platform in an embodiment of the present disclosure.
FIG. 8 is a block diagram of an image capturing device in an exemplary embodiment of the present disclosure.
DETAILED DESCRIPTION OF THE EMBODIMENTS
Example embodiments will now be described more fully with reference to the accompanying drawings. Example embodiments may, however, be embodied in many forms and should not be construed as limited to the examples set forth herein; rather, these embodiments are provided so that the present disclosure will be thorough and complete and will fully convey the concept of the example embodiments to those skilled in the art. The described features, structures, or characteristics may be combined in any suitable manner in one or more embodiments. In the following description, numerous specific details are provided to give a thorough understanding of the embodiments of the present disclosure. However, those skilled in the art will appreciate that the technical solutions of the present disclosure may be practiced while omitting one or more of the specific details, or that other methods, components, devices, steps, and the like may be adopted. In other instances, well-known technical solutions are not shown or described in detail to avoid obscuring aspects of the present disclosure.
In addition, the drawings are merely schematic illustrations of the present disclosure; the same reference numerals in the drawings denote the same or similar parts, and repeated descriptions thereof are omitted. Some of the block diagrams shown in the drawings are functional entities and do not necessarily correspond to physically or logically independent entities. These functional entities may be implemented in software, in one or more hardware modules or integrated circuits, or in different networks and/or processor devices and/or microcontroller devices.
Example implementations of the present disclosure are described in detail below with reference to the accompanying drawings.
FIG. 1 is a flowchart of an image capturing method in an exemplary embodiment of the present disclosure. The method shown in FIG. 1 may be applied to a movable platform on which a camera device is mounted, the movable platform capturing images through the camera device.
Referring to FIG. 1, the image capturing method 100 may include:
Step S102: capturing a first target image at a first preset focal length, and identifying a target object in the first target image;
Step S104: when the target object is identified in the first target image, adjusting the position and attitude of the movable platform according to the position of the target object in the first target image, so that the target object is located at a preset position in the picture captured by the camera device;
Step S106: when the target object cannot be identified in the first target image, continuously increasing the focal length of the camera device and identifying the target object in the current image captured by the camera device, until the focal length of the camera device equals a second preset focal length; wherein, during adjustment of the focal length of the camera device, whenever the target object is identified in the current image captured by the camera device, the position and attitude of the movable platform are adjusted according to the position of the target object in the current image, so that the target object is located at the preset position in the picture captured by the camera device.
In the embodiments of the present disclosure, the movable platform may be, for example, an unmanned aerial vehicle (UAV), with the camera device mounted on the UAV. In one embodiment, the movable platform further includes a gimbal; the gimbal is mounted on the UAV and the camera device is mounted on the gimbal.
The method provided by the embodiments of the present disclosure can be applied in the replay phase of a target detection task and in the tracking phase of a target tracking task, to automatically focus on and automatically photograph the target object.
The replay phase of a target detection task means that, in the teaching phase of the task, the movable platform (for example, a UAV) is operated manually and the shooting position, shooting height, focal length, and so on for the target to be inspected (for example, a communication base station or an electricity pylon) are recorded; subsequently, in the replay phase, the movable platform automatically and periodically photographs the target according to the recorded shooting position, shooting height, and focal length, thereby achieving automatic, periodic monitoring of the target and improving monitoring efficiency.
A target tracking task means that, given a tracking range for a target to be detected (a movable target such as a person, an animal, or a vehicle), the movable platform (for example, a UAV) automatically tracks the target to monitor its real-time position and observe its actions, state, and so on.
Whether in the replay phase of a target detection task or in the tracking phase of a target tracking task, the target object must be located, focused on, and photographed. In some cases, because the movable platform shoots automatically without human control, parameters such as the pre-input shooting position, shooting height, and shooting focal length may deviate from the actual situation, or weather factors such as strong wind, rain, or snow may cause the fuselage to shake. Without human intervention, the captured image may then contain no target object, or only a partial or blurred image of the target object, causing the unmanned automatic shooting task to fail.
Take UAV power-line inspection as an example. Inspection requires regularly checking the safety of key components of electricity pylons, so inspections must be repeated periodically, and automated inspection can greatly improve efficiency and accuracy. Automated power-line inspection can be divided into a teaching mode and a replay mode. The teaching mode is fully manual or semi-automatic operation that performs a normal inspection task; during the process, each waypoint records data such as the attitude of the aircraft and the gimbal at each photo (video) point. The replay mode flies fully automatically: according to the waypoint data stored in the teaching mode, the aircraft flies to each waypoint in turn, adjusts the gimbal attitude according to the waypoint data, and takes the corresponding photos/videos.
In existing inspection schemes, replay mainly follows a target detection flow: the target is detected only once, and then the gimbal pose and camera focal length are adjusted directly to photograph the target. In actual operation, however, this control flow takes a long time. If a large cumulative deviation occurs in gimbal control during this process, or the fuselage shakes noticeably due to external factors such as strong wind, and given that a long focal length is sensitive to control deviation and fuselage shake, the target in the final shot may deviate noticeably from the center of the frame or even fall outside it. In other words, existing teaching/replay inspection schemes cannot solve the problem of gimbal control deviation or of fuselage shake caused by factors such as strong wind. Gimbal stabilization therefore plays an important role in the reliability of target shooting.
In the embodiments of the present disclosure, during replay, a feature-point matching method is first used at a low-magnification focal length to detect the precise position of the inspection target in the frame as the initialization box for precise shooting. Then, while the gimbal attitude and zoom are adjusted, a target tracking algorithm tracks the position of the target in the frame and corrects the gimbal control in real time so that the target stays at the center of the frame, realizing gimbal tracking of the target and achieving a gimbal stabilization effect. When zooming is complete and the target remains at the center of the frame, the target is photographed, completing the precise shooting flow for the inspection target. Searching for the target at a short focal length, tracking it with the gimbal, and then photographing it at a long focal length allows the aircraft to capture target details stably, accurately, and clearly at a relatively long distance, achieving a good inspection result.
Each step of the image capturing method 100 is described in detail below.
In step S102, a first target image is captured at a first preset focal length, and a target object is identified in the first target image.
In the embodiments of the present disclosure, when the image capturing method 100 is applied in the replay phase of a target detection task, the second preset focal length and the position and attitude of the movable platform when capturing the first target image are all obtained from the teaching data of the teaching phase of the target detection task. When the image capturing method 100 is applied in the tracking phase of a target tracking task, the first preset focal length may be obtained from the tracking range of the target tracking task.
When the method of the embodiments of the present disclosure is applied in the replay phase, the first preset focal length may be shorter than the second preset focal length set for shooting in the teaching phase, so that the shorter focal length provides a larger field of view and makes real-time localization of the target object easier. The position (including longitude, latitude, and altitude) and attitude (including lens orientation) of the movable platform when capturing the first target image are all obtained from the teaching data of the teaching phase.
When the method of the embodiments of the present disclosure is applied in the tracking phase, the first preset focal length may be obtained from the tracking range of the target tracking task, and the position and attitude of the movable platform during shooting may also be obtained from the tracking range. For example, in one embodiment, when the target object to be tracked is set at point A and the shooting height is set to be more than x meters from the target object to avoid being noticed (x being a value greater than zero), the first preset focal length can be computed from the value of x while keeping the field of view as large as possible. The first target image is then captured at the largest field of view the movable platform can achieve, centered on point A, and the target object is relocated from the first target image, preventing failures caused by inaccurate target positioning or by the target moving while information is transferred and the movable platform gets into position.
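For illustration only, the following is a minimal sketch (not part of the disclosed embodiments) of how a first preset focal length could be derived from a tracking distance such as x and a desired field-of-view width, using the pinhole camera relation; the function name, parameters, and numbers are assumptions.

```python
def first_preset_focal_length(distance_m, target_width_m, sensor_width_mm,
                              margin=2.0, min_focal_mm=24.0):
    """Estimate a short focal length (mm) that keeps the target plus a safety
    margin inside the horizontal field of view at the given distance.

    Pinhole relation: field_width = sensor_width * distance / focal_length,
    so focal_length = sensor_width * distance / field_width.
    """
    desired_field_m = target_width_m * margin            # widen the view around point A
    focal_mm = sensor_width_mm * distance_m / desired_field_m
    return max(focal_mm, min_focal_mm)                   # never below the widest lens setting

# e.g. a target about 4 m wide observed from 100 m with a 6.4 mm wide sensor:
# first_preset_focal_length(100, 4, 6.4) -> 80.0 (mm)
```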
FIG. 2 is a sub-flowchart of step S102 in an embodiment of the present disclosure.
Referring to FIG. 2, in one embodiment, step S102 may include:
Step S1021: acquiring feature points of the target object from a standard feature image of the target object;
Step S1022: performing feature-point recognition in the first target image according to the feature points of the target object, to determine a positioning box of the target object.
The feature points of the target object can be acquired in advance from its standard feature image. In the replay phase of a target detection task, the feature points of the target object can be extracted from the standard feature image provided in the teaching phase; in the tracking phase of a target tracking task, they can be extracted from the standard feature image given by the tracking task.
In the embodiments of the present disclosure, a convolutional neural network may be used to extract local feature points from the current image, and the extracted local feature points are compared with the feature points of the target object to identify the target object.
In the embodiment shown in FIG. 2, the target object can be identified in the current image by a trained neural network model, for example a convolutional neural network (CNN) model. Because a CNN is learnable, using a CNN to extract feature points can better handle target objects, such as pylon insulators, that are locally textureless, poorly textured, or have repetitive textures, and can obtain the target object's position accurately. Compared with traditional image-patch-based target recognition methods, using a CNN model for local feature-point matching can better handle irregularly shaped targets and reduce false detections caused by background matching.
In other embodiments of the present disclosure, the target object may also be identified using an image-patch detection method, or using target recognition algorithms such as the Scale-Invariant Feature Transform (SIFT) algorithm, the Speeded-Up Robust Features (SURF) algorithm, or the Oriented FAST and Rotated BRIEF (ORB) algorithm, or target detection may be performed with various deep learning algorithms; the present disclosure places no particular limitation on this.
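As an illustration of the feature-point matching idea, the following sketch uses the ORB algorithm (one of the alternatives named above) through OpenCV rather than the CNN extractor of the embodiment; the function name, thresholds, and the bounding-box heuristic are assumptions, not part of the disclosure.

```python
import cv2
import numpy as np

def locate_target(template_gray, frame_gray, min_matches=10):
    """Locate the target of `template_gray` (standard feature image) inside
    `frame_gray` (captured image) and return a positioning box (x, y, w, h),
    or None when too few feature matches are found."""
    orb = cv2.ORB_create(nfeatures=1000)
    kp_t, des_t = orb.detectAndCompute(template_gray, None)
    kp_f, des_f = orb.detectAndCompute(frame_gray, None)
    if des_t is None or des_f is None:
        return None
    matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
    matches = sorted(matcher.match(des_t, des_f), key=lambda m: m.distance)
    if len(matches) < min_matches:
        return None
    # bound the matched keypoints in the frame to form the positioning box
    pts = np.float32([kp_f[m.trainIdx].pt for m in matches[:50]])
    x, y, w, h = cv2.boundingRect(pts)
    return x, y, w, h
```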
In one implementation, feature-point recognition can start from the center point of the first target image. The first target image is captured under the assumption that the current pose can capture the target object accurately, i.e., that the target object is at the center of the first target image, so starting feature-point recognition from the center of the first target image can improve recognition efficiency.
In step S104, when the target object is identified in the first target image, the position and attitude of the movable platform are adjusted according to the position of the target object in the first target image, so that the target object is located at a preset position in the picture captured by the camera device.
The preset position may, for example, include a central region of the captured picture. The central region may, for example, be a relatively small region within a preset length and width around the center point of the captured image, for example rectangular or circular in shape.
FIG. 3 is a sub-flowchart of step S104 in an embodiment of the present disclosure.
Referring to FIG. 3, in one embodiment, step S104 may include:
Step S1041: taking the center of the first target image as the coordinate origin, determining first coordinates of the target object in the first target image;
Step S1042: determining position and attitude adjustment values of the movable platform according to the first coordinates, the position and attitude adjustment values including at least one of a horizontal adjustment value, a vertical adjustment value, and a rotation-angle adjustment value;
Step S1043: adjusting the position and attitude of the movable platform according to the position and attitude adjustment values.
When the target object is identified in the first target image using local feature-point comparison, a recognition box of the target object can be obtained; with the center of the first target image as the coordinate origin, the coordinates of this recognition box in the first target image are taken as the coordinates of the target object, thereby determining the first coordinates of the target object.
Next, the position and attitude adjustment values of the movable platform can be determined from the first coordinates. For example, when the first coordinates are (-50, 10), the movable platform can be controlled to move 50 coordinate units in the positive X direction and 10 coordinate units in the negative Y direction, so that the target object coincides with the coordinate origin, i.e., the center of the first target image, and is located at the center of the next image frame. The scale between coordinate units and the movable platform's motion can be determined from the scale used by the current coordinate system, which in turn can be determined from the flight altitude of the movable platform or the distance between the movable platform and the target object; the present disclosure places no particular limitation on this.
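A minimal sketch of how the first coordinates might be converted into horizontal and vertical adjustment values, assuming a simple linear scale between pixels and platform motion; the function name and the scale factor are illustrative assumptions.

```python
def pose_adjustment(target_cx, target_cy, units_per_pixel):
    """Convert the target's pixel offset from the image center (the coordinate
    origin) into horizontal/vertical adjustment values for the platform or
    gimbal. For first coordinates (-50, 10) this yields a move of +50 units
    along X and -10 units along Y, bringing the target back to the origin."""
    dx = -target_cx * units_per_pixel   # horizontal adjustment value
    dy = -target_cy * units_per_pixel   # vertical adjustment value
    return dx, dy
```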
In addition to the horizontal and vertical adjustment values, in one embodiment the shooting angle of the movable platform may also be adjusted, i.e., the rotation angle of the movable platform is controlled so as to capture the target surface of the target object as fully as possible, for example the side of a pylon where key equipment is installed, or the face of a tracked person. The rotation-angle adjustment value of the movable platform can be determined from the feature recognition result of the target object, for example by determining the angular difference between the currently photographed side of the target object and the standard side intended to be photographed, and then converting that angular difference, using the scale of the current coordinate system, into the rotation-angle adjustment value of the movable platform.
In another embodiment, a distance adjustment value between the movable platform and the target object may also be determined. For example, when a UAV photographs the target object from above and strong wind pushes the fuselage to a higher flight altitude, the recognition box of the target object in the current image becomes smaller than the recognition box from the teaching phase or the preset recognition box. In that case, the flight altitude of the movable platform can be readjusted according to the set flight altitude, or a distance adjustment value can be derived from the ratio between the size of the recognition box and the preset standard recognition-box size, controlling the UAV to move closer to or farther from the target object. There can be many kinds of position and attitude adjustment values for the movable platform, and a person skilled in the art can set them according to the actual situation.
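A minimal sketch of deriving a distance adjustment value from the ratio between the current recognition-box size and a reference size, assuming the box height scales inversely with the distance to the target; all names are illustrative.

```python
def distance_adjustment(current_box_h, reference_box_h, current_distance_m):
    """Estimate how far to move toward (+) or away from (-) the target so that
    the recognition box returns to its reference size."""
    # if the box is smaller than the reference, the platform is too far away
    desired_distance = current_distance_m * current_box_h / reference_box_h
    return current_distance_m - desired_distance   # > 0: fly closer, < 0: back off
```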
When the camera device is mounted directly on the movable platform, the movable platform can be controlled to adjust its position and attitude so that the shooting center of the camera device is aligned with the target object. When the camera device is mounted on a gimbal of the movable platform, the position and attitude of the gimbal can be adjusted so that the shooting center of the camera device is aligned with the target object.
After the pose of the movable platform has been adjusted, the current shooting center is assumed to be aligned with the target object, and the focal length can be adjusted to the second preset focal length to photograph the target object.
The second preset focal length is a preset focal length value at which the target object can be observed accurately. In the replay phase, the second preset focal length may, for example, be the ideal shooting focal length set in the teaching phase; in the tracking phase, it may, for example, be a preset tracking shooting focal length value.
In the embodiments of the present disclosure, while the position and attitude of the movable platform are adjusted according to the position of the target object in the first target image, the target object is continuously tracked so that it remains in the picture captured by the camera device.
In one embodiment, it may also be arranged that when the target object is already in the central region of the current image, no pose adjustment is performed and the focal length is adjusted directly to the second preset focal length for shooting, which reduces the amount of computation and improves efficiency.
In step S106, when the target object cannot be identified in the first target image, the focal length of the camera device is continuously increased and the target object is identified in the current image captured by the camera device, until the focal length of the camera device equals the second preset focal length. During adjustment of the focal length of the camera device, whenever the target object is identified in the current image captured by the camera device, the position and attitude of the movable platform are adjusted according to the position of the target object in the current image, so that the target object is located at the preset position in the picture captured by the camera device.
In the embodiments of the present disclosure, continuously increasing the focal length of the camera device includes continuously increasing the focal length of the camera device by a preset step.
The current image captured by the camera device in step S106 is not a photographed image but buffered data of the camera device's real-time field of view. This current image is used to assist image recognition and position analysis and is deleted after a short time. Therefore, when the embodiments of the present disclosure are executed by a processor, the processor acquires the current image in real time and analyzes it to identify the target object, so as to adjust the pose of the movable platform while zooming continuously, according to the position of the target object in the current image, so that the target object is located at the preset position in the picture captured by the camera device; the preset position is, for example, the central region. The central region may, for example, be a relatively small region within a preset length and width around the center point of the current image, for example rectangular or circular in shape.
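A minimal sketch of the central-region test described above, assuming a rectangular region whose half-width and half-height are a preset fraction of the frame size; the fraction and names are illustrative assumptions.

```python
def in_center_region(box, frame_w, frame_h, frac=0.1):
    """Return True when the center of the recognition box falls inside a
    rectangular central region of the frame (half-size = frac * frame size)."""
    x, y, w, h = box
    cx, cy = x + w / 2.0, y + h / 2.0
    return (abs(cx - frame_w / 2.0) <= frac * frame_w and
            abs(cy - frame_h / 2.0) <= frac * frame_h)
```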
FIG. 4 is a sub-flowchart of step S106 in an embodiment of the present disclosure.
Referring to FIG. 4, in one embodiment, step S106 may include:
Step S1061: acquiring feature points of the target object from the standard feature image of the target object;
Step S1062: performing feature-point recognition in the current image captured by the camera device according to the feature points of the target object, to determine a positioning box of the target object.
The process of identifying the target object in step S106 is similar to that in step S104; feature-point recognition can be performed based on various algorithms such as a convolutional neural network (CNN), and is not described again here.
During feature recognition, feature-point recognition can likewise start from the center point of the current image to improve recognition efficiency. Starting from the center of the current image can take the form of recognizing near the position of the recognition box in the previous image frame, to determine the recognition box of the target object in the current image.
In one embodiment, in step S1061, the feature points of the target object may also be acquired from the image features inside the recognition box of the target object in the previous image frame, so that the target object's information is based on the most recent data, improving its accuracy and the recognition accuracy.
The robust automated inspection scheme proposed by the present disclosure, based on feature-matching target localization and gimbal tracking, can solve the problems of automatic precise localization and stable shooting of targets such as electricity pylons. To overcome shooting deviations caused by gimbal control error or strong wind while the gimbal position is being adjusted and the camera is zooming, the precise-shooting matching box is first used as the initialization box for gimbal tracking of the target, overcoming deviations during gimbal control and zooming and achieving a gimbal stabilization effect, so that the target stays at the center of the frame throughout the control process until zooming ends and a complete, clear image of the inspection target is captured. Compared with traditional target detection methods, extracting local features with a CNN, as in the present invention, adapts better to locally textureless or repetitively textured targets.
The method proposed by the embodiments of the present disclosure locates the target based on local feature matching to obtain the initialization box for precise shooting, and then combines it with gimbal tracking of the target to stabilize the gimbal during replay, ultimately achieving precise and robust shooting of the inspection target. It can be used to solve the problem, during automatic inspection, of the target not being at the center of the frame or not being in the field of view due to insufficiently precise camera (aircraft) positioning or gimbal control, or due to fuselage shake caused by external forces such as strong wind, and it can be applied to industrial UAVs with inspection functions.
The gimbal-stabilized inspection scheme proposed by the embodiments of the present disclosure, which combines a precise-shooting initialization box with gimbal tracking, is an automatic inspection scheme with closed-loop control. When the inspection target is re-shot during replay, the feature-point matching method is first used to obtain the target box in the short-focal-length frame as the initialization box for precise shooting of the inspection target, and the target tracking method is then used to perform gimbal tracking of that target, so as to resist cumulative control errors that may occur during gimbal pose adjustment and zooming, or deviations caused by external forces such as strong wind.
FIG. 5 is a flowchart of an image capturing method in an embodiment of the present disclosure.
Referring to FIG. 5, the image capturing method 500 may include:
Step S501: capturing a replay image at a short focal length;
Step S502: identifying the target object with a CNN algorithm;
Step S503: extracting feature points and descriptors of the target object.
Meanwhile, in step S501', the target teaching image is acquired;
in step S502', the target object is identified with a CNN algorithm;
in step S503', feature points and descriptors of the target object are extracted.
Next:
Step S504: matching feature points;
Step S505: computing the target box in the short-focal-length re-shot image;
Step S506: adjusting the gimbal pose and zooming;
Step S507: tracking;
Step S508: determining whether zooming and control have finished; if yes, proceeding to step S509 to photograph the inspection target at a long focal length and then ending the flow; if not, returning to step S506 to continue adjusting the gimbal pose and zoom.
In step S501, capturing the replay image at a short focal length means capturing an image at the first preset focal length, which amounts to real-time observation of the target object. In step S501', the teaching image of the target object, i.e., the target teaching image, is obtained from the teaching data.
In steps S502, S503 and steps S502', S503', the CNN algorithm is used to identify the target object in the two images (the replay image and the teaching image) and to extract the feature points and descriptors of the target object in both.
Because a CNN is learnable, using a CNN to extract feature points can better handle targets, such as pylon insulators, that are poorly textured or have repetitive textures. A target localization method based on local feature-point matching can also handle irregularly shaped targets better than image-patch-based methods, reducing false detections caused by background matching. Feature-point matching may use local feature extraction methods or traditional methods, including but not limited to the SIFT, SURF, and ORB algorithms.
In step S504, feature-point matching is performed on the feature points and descriptors extracted from the two images, and in step S505 the target box in the short-focal-length re-shot image is computed from the matching result; this is also called extracting the initialization box. The method of obtaining the initialization box may be replaced by an image-patch-based detection method, and is not limited to traditional or deep learning methods.
Steps S506 to S508 form a loop that iteratively adjusts the gimbal pose.
In step S506, zooming means continuously increasing the focal length, for example increasing the focal length by a preset step; adjusting the gimbal pose means, for example, adjusting the gimbal pose in parameters such as the horizontal direction, the vertical direction, and the yaw angle, so as to adjust the shooting angle. Gimbal pose adjustment and zooming can be performed simultaneously, while the camera device keeps capturing real-time images.
Based on the images continuously captured by the camera device in step S506 and on the result of identifying the target object's position in those images, target tracking is performed in step S507. Target tracking can be completed directly with the feature-point matching method used to obtain the initialization box, i.e., the target box position is updated with the feature-point matching method for every frame (or every few frames). During gimbal tracking, features can first be extracted inside the target box of the previous frame, then a feature search and matching is performed near the old box position in the current frame, the target box position is updated, and the gimbal is controlled so that the target stays at the center of the frame, achieving gimbal stabilization; this process is iterated until zooming and control are finished. Zooming and control are considered finished when the predetermined focal length has been reached and the target box is still at the center of the frame; photographing the target at that moment completes the entire precise re-shooting of the inspection target with the gimbal stabilization effect.
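A minimal sketch of one gimbal-tracking iteration as described above: features inside the previous frame's box are matched in a window around the old box position in the current frame, and the box is updated. It reuses the hypothetical locate_target helper from the earlier sketch; the search margin is an assumption.

```python
def track_step(prev_frame, prev_box, cur_frame, search_margin=1.5):
    """One tracking iteration: search for the previous box's content near the
    old box position in the current frame and return the updated box, or None."""
    x, y, w, h = prev_box
    template = prev_frame[y:y + h, x:x + w]          # features from the old box
    mx = int(w * search_margin)                      # search window margins
    my = int(h * search_margin)
    x0, y0 = max(0, x - mx), max(0, y - my)
    window = cur_frame[y0:y0 + h + 2 * my, x0:x0 + w + 2 * mx]
    hit = locate_target(template, window)            # sketch from the earlier example
    if hit is None:
        return None
    wx, wy, ww, wh = hit
    return x0 + wx, y0 + wy, ww, wh                  # box in full-frame coordinates
```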
After step S508 determines that zooming and control have finished, the flow proceeds to step S509, where the inspection target is photographed at the long focal length (the second preset focal length), and then the flow ends.
The embodiments of the present disclosure use a CNN-based feature-point detection method to obtain the precise-shooting initialization box and use gimbal tracking of the target for gimbal stabilization, realizing closed-loop control of inspection-target shooting and overcoming shooting deviations during zooming and gimbal pose adjustment that are caused by the gimbal's own control error and by external forces such as strong wind. This provides at least the following advantages:
1. Because a CNN is learnable, local features extracted by a CNN adapt better to locally textureless or repetitively textured targets and obtain the target box position accurately.
2. Compared with a shooting approach that detects the target once and then directly controls the gimbal pose and zoom, gimbal target tracking keeps the target near the center of the frame throughout the control process and can better cope with target-shooting deviations, or even loss of the target, caused by gimbal control errors or by external forces such as strong wind.
FIG. 6 is a flowchart of an image capturing method in another embodiment of the present disclosure.
Referring to FIG. 6, in one embodiment, the complete flow of the image capturing method may include:
Step S601: acquiring a first image at the first preset focal length, configuring the first image as the current image, and configuring the first preset focal length as the current focal length.
Step S602: identifying the target object in the current image.
Step S603: determining whether the target object is identified; if the target object is identified, proceeding to step S604; otherwise, proceeding to step S613.
Step S604: updating the current feature points of the target object according to the recognition result of the target object in the current image.
Step S605: determining whether the target object is located at the center of the current image; if not, proceeding to step S606 to adjust the pose according to the coordinate difference between the target object and the center of the current image, and then proceeding to step S607; if yes, proceeding directly to step S607.
Step S607: determining whether the current focal length equals the second preset focal length; if it equals the second preset focal length, proceeding to step S609; otherwise, proceeding to step S608 to increase the current focal length, acquire a second image, configure the second image as the current image, and return to step S602.
Step S609: acquiring a third image and configuring the third image as the current image.
Step S610: determining whether the target object is located at the center of the current image; if not, proceeding to step S611 to adjust the pose according to the coordinate difference between the target object and the center of the current image and then returning to step S609, until the target object is located at the center of the current image; if yes, proceeding to step S612 to photograph the target object.
Step S613: if the target object has not been identified in the current image, determining whether the current focal length equals a third preset focal length; if not, proceeding to step S614 to decrease the current focal length, acquire a fourth image, configure the fourth image as the current image, and then return to step S602; if yes, proceeding to step S615 to output recognition failure information.
In the embodiment shown in FIG. 6, steps S601 to S608 form the iterative zooming method used when the target object can be identified directly in the first image. In step S601, the current image is a parameter rather than a single fixed image; it can be assigned so as to equal different images at different times. Likewise, the current focal length is a parameter rather than a single value, and it can be assigned so as to equal different focal length values at different times.
The method of identifying the target object in steps S602 and S603 is as described in the above embodiments and is not repeated here.
In step S604, the current feature points of the target object are a parameter. When the current image is the first image, the current feature points of the target object equal the feature points of the target object in the teaching image; when the current image is another image, the not-yet-updated current feature points equal the feature points of the target object recognized in the previous image frame. Updating this parameter in real time in step S604 according to the feature points of the identified target object therefore keeps the target object's feature points based on the most recent recognition data, reducing recognition errors caused by discrepancies between the teaching data and the real-time situation.
In steps S605 and S606, if the target object is not located at the center of the current image (or in the central region described in the foregoing embodiments), the position and attitude of the gimbal or the UAV can be adjusted according to the coordinate difference between the target object and the center of the current image. When computing the coordinates of the target object, the center of the current image can be used as the origin.
In step S607, if the target object is located at the center of the current image, or the target object is located at the center of the current image after pose adjustment, it can be determined whether the current focal length has reached the set shooting focal length (the second preset focal length). If it has not, the flow proceeds to step S608 to continue increasing the focal length, which enlarges the target object's share of the image and improves the clarity with which it is captured. A second image is captured at the new, larger focal length, and pose adjustment is performed in the manner of steps S602 to S607 on images captured at the larger focal length, so that the target object remains at the shooting center of the current image at the larger focal length as well, until the preset shooting focal length, i.e., the second preset focal length, is reached.
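A minimal sketch of the iterative loop of steps S602 to S608, assuming hypothetical camera and gimbal interfaces (grab, template, focal_length, center_on) and reusing the locate_target and in_center_region sketches above; the step size is an assumption.

```python
def zoom_in_on_target(camera, gimbal, f_target, f_step=5.0):
    """Iterate: grab the live view, locate the target, re-center it, then step
    the focal length up, until the second preset focal length is reached."""
    while True:
        frame = camera.grab()                        # buffered live view, not a photo
        box = locate_target(camera.template, frame)  # sketch from the earlier example
        if box is None:
            return False                             # hand over to the lost-target path
        h, w = frame.shape[:2]
        if not in_center_region(box, w, h):
            gimbal.center_on(box)                    # assumed gimbal/pose interface
        if camera.focal_length >= f_target:
            return True                              # at the second preset focal length
        camera.focal_length = min(camera.focal_length + f_step, f_target)
```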
Steps S609 to S612 fine-tune the shot at the second preset focal length and then complete the shooting.
If no fine-tuning is needed, the third image can be captured directly at the second preset focal length in step S609 and saved as the shooting result for the target object.
In some cases, however, when the focal length has been increased and the current focal length equals the second preset focal length, the second image captured at the second preset focal length is set as the current image. Consequently, after the pose (of the UAV or gimbal) is adjusted in step S606 according to the target object in the current image, the flow passes the judgment of step S607 and goes directly to step S609 to capture the third image, without the result of the pose adjustment having been verified, i.e., without checking whether the pose adjustment is in place.
Therefore, in step S610, the third image captured (not necessarily photographed) after pose adjustment at the second preset focal length can be checked. If the pose adjustment is not in place (the target object is not at the center of the current image/third image), the flow proceeds to step S611 to continue adjusting the pose until the adjustment is in place, and then proceeds to step S612 to shoot.
Before the target object is photographed in step S612, the acquired first image, second image, and third image may all be image data that changes in real time in the lens while the camera's shutter has not yet been pressed.
Steps S613 to S615 are one way of handling a lost target. In some cases, for example when strong wind shakes the fuselage, the captured target image may contain no target object, whether it is the first image captured at the first preset focal length or a second image captured at an increased focal length. In that case, the current focal length can be reduced to enlarge the field of view and re-identify and relocate the target object. If the target object still cannot be identified in the current image when the current focal length has been reduced to the third preset focal length, recognition failure information can be output to report that the target is lost. The third preset focal length is shorter than the first preset focal length and can be set by a person skilled in the art according to the shooting capability of the camera device.
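A minimal sketch of the lost-target handling of steps S613 to S615, reusing the hypothetical camera interface and locate_target sketch from the previous examples; the focal-length step is an assumption.

```python
def reacquire_or_fail(camera, f_min, f_step=5.0):
    """Lost-target path: widen the view by reducing the focal length until the
    target is found again or the third preset focal length (f_min) is reached."""
    while camera.focal_length > f_min:
        camera.focal_length = max(camera.focal_length - f_step, f_min)
        frame = camera.grab()
        box = locate_target(camera.template, frame)
        if box is not None:
            return box                               # target reacquired, resume the flow
    raise RuntimeError("recognition failed: target lost")   # step S615
```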
The method provided by the embodiment shown in FIG. 6 first controls the movable platform to shoot at a relatively short first preset focal length, at the set position and height, so as to ensure first of all that the target object is in the lens; it then locates the target object repeatedly with the iterative zoom-and-locate method, and finally photographs the target object at the second preset focal length, at which a clear shot can be guaranteed.
Through the cyclic iteration of the embodiment shown in FIG. 6 (capturing an image, identifying the target object, adjusting the pose, and zooming), the feature information and position information of the target object can be updated from the images captured in real time, gradually approaching the ideal shooting pose and shooting focal length and avoiding failures to photograph the target object caused by deviations in the pre-input control parameters or by fuselage shake caused by external forces such as strong wind.
The gimbal-stabilized inspection scheme provided by the embodiments of the present disclosure, which combines the precise-shooting initialization box with gimbal tracking, is an automatic inspection scheme with closed-loop control. When the inspection target is re-shot during replay, the feature-point matching method is first used to obtain the target box in the short-focal-length frame as the initialization box for precise shooting of the inspection target, and the target tracking method is then used to perform gimbal tracking of that target, so as to resist cumulative control errors that may occur during gimbal pose adjustment and zooming, or deviations caused by external forces such as strong wind. During gimbal tracking, the current feature points of the target object are first extracted inside the recognition box of the previous frame, then a feature search and matching is performed near the old box position in the current frame, the recognition-box position is updated, and the gimbal is controlled so that the target object stays at the center of the shooting frame, achieving gimbal stabilization; this process is iterated until zooming and control are finished. Zooming and control are considered finished when the predetermined focal length has been reached and the target box is still at the center of the frame; photographing the target at that moment completes the precise re-shooting of the inspection target. Compared with a shooting approach that detects the target once and then directly controls the gimbal pose and zoom, gimbal target tracking keeps the target near the center of the frame throughout the control process, copes better with target-shooting deviations or even target loss caused by gimbal control errors or by external forces such as strong wind, realizes gimbal stabilization during replay, and ultimately achieves precise, robust shooting of the inspection target.
FIG. 7 is a schematic diagram of a movable platform in one embodiment of the present disclosure.
Referring to FIG. 7, the movable platform 700 may include:
a body 71;
a power system 72, provided on the body, the power system being configured to provide power for the movable platform;
a camera device 73, provided on the body and configured to collect images;
a memory 74; and
a processor 75 coupled to the memory, the processor being configured to execute the image capturing method of the embodiments shown in FIG. 1 to FIG. 6 based on instructions stored in the memory.
In an exemplary embodiment of the present disclosure, when the position and attitude of the movable platform 700 are adjusted during execution of the image capturing method, the processor 75 controls the power system 72 to adjust the position and attitude of the movable platform 700.
In another embodiment of the present disclosure, the movable platform 700 further includes a gimbal 76, and the camera device 73 is mounted on the gimbal 76. When the position and attitude of the movable platform 700 are adjusted during execution of the image capturing method, the processor 75 controls the power system 72 to adjust the position and attitude of the gimbal 76, so as to keep the target object at the center of the shot.
The embodiments of the present disclosure can be applied to industrial unmanned aerial vehicles with an inspection function, to solve the problem that, during automatic inspection, the target is not at the center of the picture or within the field of view because camera (aircraft) positioning and gimbal control are not precise enough, and the problem that the target is not at the center of the picture or within the field of view because external forces such as strong wind cause the body of the camera (aircraft) to shake.
Corresponding to the foregoing method embodiments, the present disclosure further provides an image capturing device, which can be used to execute the foregoing method embodiments.
FIG. 8 is a block diagram of an image capturing device in an exemplary embodiment of the present disclosure.
Referring to FIG. 8, the image capturing device 800 may include:
    a memory 81 configured to store program code; and
    one or more processors 82 coupled to the memory 81, the processors 82 being configured to execute the following method based on instructions stored in the memory 81:
    collecting a first target image at a first preset focal length, and identifying a target object in the first target image;
    when the target object is identified in the first target image, adjusting the position and attitude of the movable platform according to the position of the target object in the first target image, so that the position of the target object is located at a preset position in the picture captured by the camera device;
    when the target object cannot be identified in the first target image, continuously increasing the focal length of the camera device and identifying the target object from the current image collected by the camera device, until the focal length of the camera device is equal to a second preset focal length;
    wherein, in the process of adjusting the focal length of the camera device, whenever the target object is identified from the current image collected by the camera device, the position and attitude of the movable platform are adjusted according to the position of the target object in the current image, so that the position of the target object is located at the preset position in the picture captured by the camera device. A minimal sketch of this control flow is given after this paragraph.
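As a non-limiting illustration, the following Python sketch shows one possible way to organize the acquire / identify / adjust / zoom flow described above. The camera, detect and adjust_pose callables are hypothetical placeholders for the platform interface and the feature-point detector; they are assumptions made for this sketch, not part of the disclosure.

def capture_target(camera, detect, adjust_pose,
                   first_focal_length, second_focal_length, zoom_step):
    """Iteratively locate the target and zoom until the second preset focal length."""
    focal_length = first_focal_length
    camera.set_focal_length(focal_length)
    while True:
        frame = camera.capture()
        box = detect(frame)                      # positioning box, or None if not identified
        if box is not None:
            adjust_pose(frame, box)              # keep the target at the preset position
        if focal_length >= second_focal_length:
            break                                # predetermined focal length reached
        focal_length = min(focal_length + zoom_step, second_focal_length)
        camera.set_focal_length(focal_length)
    return camera.capture()                      # final shot at the second preset focal length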
In an exemplary embodiment of the present disclosure, the second preset focal length and the position and attitude of the movable platform when the first target image is collected are both obtained from teaching data of the teaching phase of the target detection task.
In an exemplary embodiment of the present disclosure, the first preset focal length is obtained according to the tracking range of the target tracking task.
In an exemplary embodiment of the present disclosure, the processor 82 is configured to: determine first coordinates of the target object in the first target image with the position of the central area of the first target image as the coordinate origin; determine position and attitude adjustment values of the movable platform according to the first coordinates, the position and attitude adjustment values including at least one of a horizontal adjustment value, a vertical adjustment value, and a rotation angle adjustment value; and adjust the position and attitude of the movable platform according to the position and attitude adjustment values. A sketch of one possible computation of this kind is given below.
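As a non-limiting illustration, the sketch below shows one plausible way to turn the first coordinates, measured from the picture center, into a horizontal (yaw) and a vertical (pitch) adjustment value under a pinhole-camera approximation. The field-of-view-based conversion is an assumption made for this sketch and is not a formula stated in the disclosure.

import math

def pose_adjustment(box, image_width, image_height, hfov_deg, vfov_deg):
    """box = (x, y, w, h) in pixels; returns (yaw_deg, pitch_deg) corrections."""
    box_cx = box[0] + box[2] / 2.0
    box_cy = box[1] + box[3] / 2.0
    # First coordinates: offset of the box center from the image center.
    dx = box_cx - image_width / 2.0
    dy = box_cy - image_height / 2.0
    # Approximate focal length in pixels from the horizontal/vertical field of view.
    fx = (image_width / 2.0) / math.tan(math.radians(hfov_deg) / 2.0)
    fy = (image_height / 2.0) / math.tan(math.radians(vfov_deg) / 2.0)
    yaw = math.degrees(math.atan2(dx, fx))     # horizontal adjustment value
    pitch = -math.degrees(math.atan2(dy, fy))  # vertical adjustment value (image y grows downward)
    return yaw, pitch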
In an exemplary embodiment of the present disclosure, the processor 82 is configured to: acquire feature points of the target object according to a standard feature image of the target object; and perform feature point recognition in the first target image according to the feature points of the target object, so as to determine a positioning box of the target object.
In an exemplary embodiment of the present disclosure, the processor 82 is configured to: acquire feature points of the target object according to a standard feature image of the target object; and perform feature point recognition in the current image collected by the camera device according to the feature points of the target object, so as to determine a positioning box of the target object.
In an exemplary embodiment of the present disclosure, the processor 82 is configured to perform the feature point recognition starting from the central area of the first target image.
In an exemplary embodiment of the present disclosure, the processor 82 is configured to: extract local feature points from the first target image by using a convolutional neural network; and compare the extracted local feature points with the feature points of the target object, so as to identify the target object. A sketch of this matching step is given below.
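As a non-limiting illustration, the sketch below matches locally extracted feature descriptors against the reference feature points of the target object using a nearest-neighbor ratio test, and derives a positioning box from the matched points. The extract_local_features callable is a hypothetical stand-in for the convolutional neural network mentioned above, and the ratio-test matching is a common choice assumed for this sketch rather than a step stated in the disclosure.

import numpy as np

def identify_target(image, ref_descriptors, extract_local_features,
                    ratio=0.8, min_matches=10):
    """Return a positioning box (x, y, w, h) around matched feature points, or None."""
    keypoints, descriptors = extract_local_features(image)  # (N, 2) and (N, D) arrays
    if descriptors is None or len(descriptors) == 0 or len(ref_descriptors) < 2:
        return None
    matched = []
    for i, d in enumerate(descriptors):
        dist = np.linalg.norm(ref_descriptors - d, axis=1)   # distance to every reference feature
        nearest, second = np.partition(dist, 1)[:2]
        if nearest < ratio * second:                         # Lowe-style ratio test
            matched.append(i)
    if len(matched) < min_matches:
        return None                                          # not enough evidence for the target
    pts = np.asarray(keypoints)[matched]
    x, y = pts.min(axis=0)
    w, h = pts.max(axis=0) - pts.min(axis=0)
    return (int(x), int(y), int(w), int(h))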
In an exemplary embodiment of the present disclosure, the processor 82 is configured to continuously increase the focal length of the camera device with a preset step size.
In an exemplary embodiment of the present disclosure, the processor 82 is configured to continuously track the target object in the process of adjusting the position and attitude of the movable platform according to the position of the target object in the first target image, so that the target object remains in the picture captured by the camera device.
It should be noted that although several modules or units of the device for performing actions are mentioned in the detailed description above, this division is not mandatory. In fact, according to the embodiments of the present disclosure, the features and functions of two or more modules or units described above may be embodied in one module or unit. Conversely, the features and functions of one module or unit described above may be further divided and embodied by a plurality of modules or units.
Those skilled in the art will understand that various aspects of the present invention may be implemented as a system, a method, or a program product. Therefore, various aspects of the present invention may be embodied in the following forms: an entirely hardware implementation, an entirely software implementation (including firmware, microcode, and the like), or an implementation combining hardware and software, which may be collectively referred to herein as a "circuit", a "module", or a "system".
Through the description of the above embodiments, those skilled in the art will readily understand that the example embodiments described here may be implemented by software, or by software combined with necessary hardware. Therefore, the technical solutions according to the embodiments of the present disclosure may be embodied in the form of a software product, which may be stored in a non-volatile storage medium (such as a CD-ROM, a USB flash drive, or a portable hard disk) or on a network, and which includes several instructions for causing a computing device (such as a personal computer, a server, a terminal device, or a network device) to execute the methods according to the embodiments of the present disclosure.
In an exemplary embodiment of the present disclosure, a computer-readable storage medium is also provided, on which a program product capable of implementing the above methods of this specification is stored. In some possible implementations, various aspects of the present invention may also be implemented in the form of a program product including program code; when the program product runs on a terminal device, the program code causes the terminal device to execute the steps according to the various exemplary embodiments of the present invention described in the "Exemplary Method" section above in this specification.
The program product for implementing the above methods according to the embodiments of the present invention may take the form of a portable compact disc read-only memory (CD-ROM), include program code, and run on a terminal device such as a personal computer. However, the program product of the present invention is not limited thereto. In this document, a readable storage medium may be any tangible medium that contains or stores a program, and the program may be used by or in combination with an instruction execution system, apparatus, or device.
The program product may use any combination of one or more readable media. The readable medium may be a readable signal medium or a readable storage medium. The readable storage medium may be, for example but not limited to, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the above. More specific examples (a non-exhaustive list) of readable storage media include: an electrical connection having one or more conductors, a portable disk, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the above.
A computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, in which readable program code is carried. Such a propagated data signal may take many forms, including but not limited to an electromagnetic signal, an optical signal, or any suitable combination of the above. A readable signal medium may also be any readable medium other than a readable storage medium, and the readable medium may send, propagate, or transmit a program for use by or in combination with an instruction execution system, apparatus, or device.
The program code contained on the readable medium may be transmitted over any appropriate medium, including but not limited to wireless, wired, optical cable, RF, and the like, or any suitable combination of the above.
Program code for performing the operations of the present invention may be written in any combination of one or more programming languages, including object-oriented programming languages such as Java and C++, as well as conventional procedural programming languages such as the "C" language or similar programming languages. The program code may be executed entirely on the user's computing device, partly on the user's device, as a stand-alone software package, partly on the user's computing device and partly on a remote computing device, or entirely on the remote computing device or server. Where a remote computing device is involved, the remote computing device may be connected to the user's computing device through any kind of network, including a local area network (LAN) or a wide area network (WAN), or may be connected to an external computing device (for example, through the Internet using an Internet service provider).
In addition, the above drawings are merely schematic illustrations of the processing included in the methods according to the exemplary embodiments of the present invention, and are not intended to be limiting. It is easy to understand that the processing shown in the above drawings does not indicate or limit the chronological order of the processing, and it is also easy to understand that the processing may be executed synchronously or asynchronously, for example in a plurality of modules.
Other embodiments of the present disclosure will be readily apparent to those skilled in the art from consideration of the specification and practice of the invention disclosed herein. The present application is intended to cover any variations, uses, or adaptations of the present disclosure that follow the general principles of the present disclosure and include common general knowledge or customary technical means in the technical field that are not disclosed in the present disclosure. The specification and embodiments are to be regarded as exemplary only, with the true scope and concept of the present disclosure being indicated by the claims.
Industrial Applicability
In the embodiments of the present disclosure, by locating the target object at a shorter focal length and then repeatedly adjusting the pose of the movable platform, increasing the focal length, and re-locating the target object, alignment with the target object is continuously achieved through an iterative alignment process. When a movable platform under unmanned control performs automatic shooting, the movable platform can thus overcome problems such as an inaccurately given position (control error from the teaching process) and body shake caused by external forces such as strong wind, and achieve accurate unmanned automatic shooting.

Claims (26)

  1. An image capturing method, characterized in that the method is applied to a movable platform, a camera device is mounted on the movable platform, and the movable platform collects images through the camera device, the method comprising:
    collecting a first target image at a first preset focal length, and identifying a target object in the first target image;
    when the target object is identified in the first target image, adjusting a position and an attitude of the movable platform according to a position of the target object in the first target image, so that the position of the target object is located at a preset position in a picture captured by the camera device;
    when the target object cannot be identified in the first target image, continuously increasing a focal length of the camera device, and identifying the target object from a current image collected by the camera device, until the focal length of the camera device is equal to a second preset focal length;
    wherein, in the process of adjusting the focal length of the camera device, whenever the target object is identified from the current image collected by the camera device, the position and attitude of the movable platform are adjusted according to a position of the target object in the current image, so that the position of the target object is located at the preset position in the picture captured by the camera device.
  2. The image capturing method according to claim 1, characterized in that the method is applied to a replay phase of a target detection task, and the second preset focal length and the position and attitude of the movable platform when the first target image is collected are both obtained from teaching data of a teaching phase of the target detection task.
  3. The image capturing method according to claim 1, characterized in that the method is applied to a tracking phase of a target tracking task, and the first preset focal length is obtained according to a tracking range of the target tracking task.
  4. The image capturing method according to claim 1, characterized in that the adjusting the position and attitude of the movable platform according to the position of the target object in the first target image comprises:
    determining first coordinates of the target object in the first target image with a center of the first target image as a coordinate origin;
    determining position and attitude adjustment values of the movable platform according to the first coordinates, the position and attitude adjustment values comprising at least one of a horizontal adjustment value, a vertical adjustment value, and a rotation angle adjustment value; and
    adjusting the position and attitude of the movable platform according to the position and attitude adjustment values.
  5. The image capturing method according to claim 1, characterized in that the identifying a target object in the first target image comprises:
    acquiring feature points of the target object according to a standard feature image of the target object; and
    performing feature point recognition in the first target image according to the feature points of the target object, so as to determine a positioning box of the target object.
  6. The image capturing method according to claim 1, characterized in that the identifying the target object from the current image collected by the camera device comprises:
    acquiring feature points of the target object according to a standard feature image of the target object; and
    performing feature point recognition in the current image collected by the camera device according to the feature points of the target object, so as to determine a positioning box of the target object.
  7. The image capturing method according to claim 5, characterized in that the performing feature point recognition in the first target image according to the feature points of the target object comprises:
    performing the feature point recognition starting from a center point of the first target image.
  8. The image capturing method according to claim 5 or 7, characterized in that the performing feature point recognition in the first target image according to the feature points of the target object comprises:
    extracting local feature points from the first target image by using a convolutional neural network; and
    comparing the extracted local feature points with the feature points of the target object, so as to identify the target object.
  9. The image capturing method according to claim 1, characterized in that the continuously increasing the focal length of the camera device comprises: continuously increasing the focal length of the camera device with a preset step size.
  10. The image capturing method according to claim 1, characterized in that, in the process of adjusting the position and attitude of the movable platform according to the position of the target object in the first target image, the target object is continuously tracked, so that the target object remains in the picture captured by the camera device.
  11. The image capturing method according to claim 1, characterized in that the movable platform comprises an unmanned aerial vehicle.
  12. The image capturing method according to claim 11, characterized in that the movable platform further comprises a gimbal, the gimbal is mounted on the unmanned aerial vehicle, and the camera device is mounted on the gimbal.
  13. An image capturing device, characterized by comprising:
    a memory configured to store program code; and
    one or more processors coupled to the memory, the processors being configured to execute the following method based on instructions stored in the memory:
    collecting a first target image at a first preset focal length, and identifying a target object in the first target image;
    when the target object is identified in the first target image, adjusting a position and an attitude of the movable platform according to a position of the target object in the first target image, so that the position of the target object is located at a preset position in a picture captured by the camera device;
    when the target object cannot be identified in the first target image, continuously increasing a focal length of the camera device, and identifying the target object from a current image collected by the camera device, until the focal length of the camera device is equal to a second preset focal length;
    wherein, in the process of adjusting the focal length of the camera device, whenever the target object is identified from the current image collected by the camera device, the position and attitude of the movable platform are adjusted according to a position of the target object in the current image, so that the position of the target object is located at the preset position in the picture captured by the camera device.
  14. The image capturing device according to claim 13, characterized in that the second preset focal length and the position and attitude of the movable platform when the first target image is collected are both obtained from teaching data of a teaching phase of a target detection task.
  15. The image capturing device according to claim 13, characterized in that the first preset focal length is obtained according to a tracking range of a target tracking task.
  16. The image capturing device according to claim 13, characterized in that the processor is configured to:
    determine first coordinates of the target object in the first target image with a position of a central area of the first target image as a coordinate origin;
    determine position and attitude adjustment values of the movable platform according to the first coordinates, the position and attitude adjustment values comprising at least one of a horizontal adjustment value, a vertical adjustment value, and a rotation angle adjustment value; and
    adjust the position and attitude of the movable platform according to the position and attitude adjustment values.
  17. The image capturing device according to claim 13, characterized in that the processor is configured to:
    acquire feature points of the target object according to a standard feature image of the target object; and
    perform feature point recognition in the first target image according to the feature points of the target object, so as to determine a positioning box of the target object.
  18. The image capturing device according to claim 13, characterized in that the processor is configured to:
    acquire feature points of the target object according to a standard feature image of the target object; and
    perform feature point recognition in the current image collected by the camera device according to the feature points of the target object, so as to determine a positioning box of the target object.
  19. The image capturing device according to claim 17, characterized in that the processor is configured to:
    perform the feature point recognition starting from a central area of the first target image.
  20. The image capturing device according to claim 17 or 19, characterized in that the processor is configured to:
    extract local feature points from the first target image by using a convolutional neural network; and
    compare the extracted local feature points with the feature points of the target object, so as to identify the target object.
  21. The image capturing device according to claim 13, characterized in that the processor is configured to continuously increase the focal length of the camera device with a preset step size.
  22. The image capturing device according to claim 13, characterized in that the processor is configured to:
    continuously track the target object in the process of adjusting the position and attitude of the movable platform according to the position of the target object in the first target image, so that the target object remains in the picture captured by the camera device.
  23. A movable platform, characterized by comprising:
    a body;
    a power system, provided on the body, the power system being configured to provide power for the movable platform;
    a camera device, provided on the body and configured to collect images;
    a memory; and
    a processor coupled to the memory, the processor being configured to execute the image capturing method according to any one of claims 1-12 based on instructions stored in the memory.
  24. The movable platform according to claim 23, characterized in that, when the position and attitude of the movable platform are adjusted during execution of the image capturing method, the processor controls the power system to adjust the position and attitude of the movable platform.
  25. The movable platform according to claim 23, characterized in that the movable platform further comprises a gimbal, and the camera device is mounted on the gimbal; when the position and attitude of the movable platform are adjusted during execution of the image capturing method, the processor controls the power system to adjust the position and attitude of the gimbal.
  26. A computer-readable storage medium, on which a program is stored, characterized in that, when the program is executed by a processor, the image capturing method according to any one of claims 1-12 is implemented.
PCT/CN2022/078433 2022-02-28 2022-02-28 Image photographing method and device, and movable platform WO2023159611A1 (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
PCT/CN2022/078433 WO2023159611A1 (en) 2022-02-28 2022-02-28 Image photographing method and device, and movable platform
CN202280050536.7A CN117716702A (en) 2022-02-28 2022-02-28 Image shooting method and device and movable platform

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2022/078433 WO2023159611A1 (en) 2022-02-28 2022-02-28 Image photographing method and device, and movable platform

Publications (1)

Publication Number Publication Date
WO2023159611A1 true WO2023159611A1 (en) 2023-08-31

Family

ID=87764477

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/078433 WO2023159611A1 (en) 2022-02-28 2022-02-28 Image photographing method and device, and movable platform

Country Status (2)

Country Link
CN (1) CN117716702A (en)
WO (1) WO2023159611A1 (en)



Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105867362A (en) * 2016-04-20 2016-08-17 北京博瑞爱飞科技发展有限公司 Terminal equipment and control system of unmanned aerial vehicle
CN110291482A (en) * 2018-07-31 2019-09-27 深圳市大疆创新科技有限公司 It makes a return voyage control method, device and equipment
CN112084815A (en) * 2019-06-13 2020-12-15 成都天府新区光启未来技术研究院 Target detection method based on camera focal length conversion, storage medium and processor
CN111432130A (en) * 2020-04-23 2020-07-17 西安智文琛软件有限公司 Automatic tracking focusing snapshot method, snapshot system, unmanned aerial vehicle and application
JP6951796B1 (en) * 2020-05-15 2021-10-20 国立大学法人広島大学 Image recognition device, authentication system, image recognition method and program

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116899832A (en) * 2023-09-13 2023-10-20 广东全丰智能装备有限公司 Dispensing manipulator control system and dispensing device
CN116899832B (en) * 2023-09-13 2023-12-29 广东全丰智能装备有限公司 Dispensing manipulator control system and dispensing device

Also Published As

Publication number Publication date
CN117716702A (en) 2024-03-15

Similar Documents

Publication Publication Date Title
CN112164015B (en) Monocular vision autonomous inspection image acquisition method and device and power inspection unmanned aerial vehicle
US11897606B2 (en) System and methods for improved aerial mapping with aerial vehicles
CN111272148B (en) Unmanned aerial vehicle autonomous inspection self-adaptive imaging quality optimization method for power transmission line
WO2020014909A1 (en) Photographing method and device and unmanned aerial vehicle
CN108803668B (en) Intelligent inspection unmanned aerial vehicle nacelle system for static target monitoring
WO2021189456A1 (en) Unmanned aerial vehicle inspection method and apparatus, and unmanned aerial vehicle
US11024052B2 (en) Stereo camera and height acquisition method thereof and height acquisition system
WO2020107372A1 (en) Control method and apparatus for photographing device, and device and storage medium
WO2019104641A1 (en) Unmanned aerial vehicle, control method therefor and recording medium
CN115439424A (en) Intelligent detection method for aerial video image of unmanned aerial vehicle
WO2020102976A1 (en) Cradle head adjustment method, cradle head adjustment device, mobile platform and medium
CN112653844A (en) Camera holder steering self-adaptive tracking adjustment method
TW201926240A (en) Panoramic photographing method for unmanned aerial vehicle and unmanned aerial vehicle using the same
WO2023159611A1 (en) Image photographing method and device, and movable platform
CN110337668B (en) Image stability augmentation method and device
CN110602376B (en) Snapshot method and device and camera
CN114326771A (en) Unmanned aerial vehicle shooting route generation method and system based on image recognition
WO2020061771A1 (en) Parameter processing method and device for camera and image processing apparatus
WO2021189429A1 (en) Image photographing method and device, movable platform, and storage medium
CN109062220B (en) Method and device for controlling terminal movement
CN114594770B (en) Inspection method for inspection robot without stopping
CN113438399B (en) Target guidance system, method for unmanned aerial vehicle, and storage medium
CN112702513B (en) Double-optical-pan-tilt cooperative control method, device, equipment and storage medium
JP2019219874A (en) Autonomous moving and imaging control system and autonomous moving body
WO2022000211A1 (en) Photography system control method, device, movable platform, and storage medium