WO2022040988A1 - Image processing method and apparatus, and movable platform - Google Patents

Image processing method and apparatus, and movable platform

Info

Publication number
WO2022040988A1
Authority
WO
WIPO (PCT)
Prior art keywords
image
pixel block
pixel
camera
pose
Prior art date
Application number
PCT/CN2020/111450
Other languages
French (fr)
Chinese (zh)
Inventor
周游
刘洁
陈希
Original Assignee
深圳市大疆创新科技有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 深圳市大疆创新科技有限公司
Priority to PCT/CN2020/111450
Priority to CN202080039127.8A
Publication of WO2022040988A1

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/50Depth or shape recovery
    • G06T7/55Depth or shape recovery from multiple images
    • G06T7/593Depth or shape recovery from multiple images from stereo images
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/23Clustering techniques
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00Image enhancement or restoration
    • G06T5/50Image enhancement or restoration using two or more images, e.g. averaging or subtraction
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/10Segmentation; Edge detection
    • G06T7/11Region-based segmentation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/10Segmentation; Edge detection
    • G06T7/13Edge detection
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/10Segmentation; Edge detection
    • G06T7/194Segmentation; Edge detection involving foreground-background segmentation
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N23/00Cameras or camera modules comprising electronic image sensors; Control thereof
    • H04N23/60Control of cameras or camera modules
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N23/00Cameras or camera modules comprising electronic image sensors; Control thereof
    • H04N23/95Computational photography systems, e.g. light-field imaging systems
    • H04N23/951Computational photography systems, e.g. light-field imaging systems by using two or more images to influence resolution, frame rate or aspect ratio
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/10Image acquisition modality
    • G06T2207/10004Still image; Photographic image
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/10Image acquisition modality
    • G06T2207/10016Video; Image sequence
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/10Image acquisition modality
    • G06T2207/10052Images from lightfield camera

Definitions

  • the present application relates to the technical field of image processing, and in particular, to an image processing method, device and movable platform.
  • In the process of shooting images or videos, users often face scenes where non-target subjects appear within the shooting angle of view, and these non-target subjects are also captured in the final image or video, which affects the final shooting effect. For example, assuming that the user is photographing a building, there may be some non-target subjects such as passers-by, vehicles, trash cans or telephone poles next to the building. These non-target subjects will block the photographed building or appear in the captured image, which affects the display effect of the image. In order to improve the shooting effect of images or videos and better meet the shooting needs of users, it is necessary to propose a solution for removing non-target shooting objects from images.
  • the present application provides an image processing method, device and movable platform.
  • an image processing method comprising:
  • the second image includes a second pixel block corresponding to a target object, and the target object is an object in the first image that is occluded by the object to be filtered out;
  • the first pixel block in the first image is replaced by the second pixel block to generate a replaced first image.
  • an image processing method comprising:
  • determining, from images other than the first image, a third image in which the pixel block at the corresponding pixel position is a static area;
  • the first pixel block in the first image is replaced with the pixel block at the corresponding pixel position in the third image.
  • an image processing apparatus includes a processor, a memory, and a computer program stored in the memory and executable by the processor, and when the processor executes the computer program, the following steps are implemented:
  • the second image includes a second pixel block corresponding to a target object, and the target object is an object in the first image that is occluded by the object to be filtered out;
  • the first pixel block in the first image is replaced by the second pixel block to generate a replaced first image.
  • an image processing apparatus includes a processor, a memory, and a computer program stored in the memory and executable by the processor, and when the processor executes the computer program, the following steps are implemented:
  • the first pixel block in the first image is replaced with the pixel block at the corresponding pixel position in the third image.
  • a movable platform is provided, where the movable platform includes a camera device and any one of the image processing devices in the embodiments of the present application.
  • the second image including the target object occluded by the object to be filtered out is used to complete the occluded target object in the first image, so as to eliminate the object to be filtered out from the first image.
  • the method is suitable not only for filtering out dynamic objects, but also for filtering out static objects.
  • in this way, non-shooting target objects in an image can be automatically filtered out according to the user's needs, which can improve the display effect of the image and the user experience.
  • FIG. 1 is a schematic diagram of filtering out non-shooting target objects in an image according to an embodiment of the present application.
  • FIG. 2 is a flowchart of an image processing method according to an embodiment of the present application.
  • FIG. 3 is a schematic diagram of determining a first pixel block corresponding to an object to be filtered out according to an embodiment of the present application.
  • FIG. 4 is a schematic diagram of a prompting interface for prompting a user to adjust to a second pose according to an embodiment of the present application.
  • FIG. 5 is a schematic diagram of adjusting a camera device to a second pose according to an embodiment of the present application.
  • FIG. 6 is a schematic diagram of determining a second pose according to an embodiment of the present application.
  • FIG. 7 is a schematic diagram of determining corresponding pixel regions of an object to be filtered and a target object according to an embodiment of the present application.
  • FIG. 8 is a schematic diagram of determining a second orientation according to an embodiment of the present application.
  • FIG. 9 is a schematic diagram of determining whether an image captured by a camera device can be used as a second image according to an embodiment of the present application.
  • FIG. 10 is a schematic diagram of determining whether an image captured by a camera device can be used as a second image according to an embodiment of the present application.
  • FIG. 11( a ) is a schematic diagram of filtering out dynamic objects according to an embodiment of the present application.
  • FIG. 11( b ) is a schematic diagram of determining a third image according to an embodiment of the present application.
  • FIG. 12 is a flowchart of an image processing method according to an embodiment of the present application.
  • FIG. 13 is a schematic diagram of an application scenario of an embodiment of the present application.
  • FIG. 14 is a schematic diagram of filtering out dynamic objects according to an embodiment of the present application.
  • FIG. 15 is a schematic diagram of a frame selection of an object to be filtered out and an occluded background area according to an embodiment of the present application.
  • FIG. 16 is a schematic diagram of filtering out static objects according to an embodiment of the present application.
  • FIG. 17 is a schematic diagram of a logical structure of an image processing apparatus according to an embodiment of the present application.
  • in the process of shooting, there may be non-shooting target objects within the shooting angle of view, and these non-shooting target objects are also captured in the image, which affects the display effect of the image.
  • These non-target objects may be dynamic, such as walking passers-by, vehicles, etc., or static, such as trash cans, telephone poles, buildings, etc.
  • the user wants to photograph the house 11 and its surrounding scenery, but the trash can 12 and passers-by in front of the house are also photographed in the image 13, which seriously affects the visual effect of the image. Therefore, it is necessary to remove these non-shooting target objects in the image, as shown in (b) in FIG. 1 , so that the image has a better shooting effect and improves the user experience.
  • the embodiment of the present application provides an image processing method.
  • a second image is collected at a second pose from which the target object occluded by the object to be filtered out can be observed, and the second image is used to complement the target object occluded by the object to be filtered out in the first image, so as to achieve the purpose of removing the object to be filtered out.
  • the flow chart of the method is shown in FIG. 2 and includes the following steps:
  • S204: Acquire a second image collected by the camera in the second pose, where the second image includes a second pixel block corresponding to a target object, and the target object is an object in the first image that is occluded by the object to be filtered out;
  • the image processing method in this embodiment of the present application may be performed by the camera device that captures the first image and the second image, and the camera device may be any device with an image or video capture function.
  • the camera device may be a camera, a terminal equipped with a camera such as a mobile phone, tablet, laptop or smart wearable device, or a movable platform equipped with a camera such as a drone, unmanned vehicle or handheld gimbal.
  • the image processing method of the embodiments of the present application may also be performed by another device communicatively connected to the camera device, for example a cloud server; after collecting the first image and the second image, the camera device sends them to the cloud server for processing.
  • of course, if the camera device is a movable platform, the image processing method in this embodiment of the present application may also be executed by a control terminal communicatively connected to the movable platform, which is not limited in this application.
  • the object to be filtered out in this embodiment of the present application refers to an object that the user wishes to remove from the first image.
  • the object to be filtered out may be a dynamic object or a static object, and there may be one or more objects to be filtered out.
  • the target object in this embodiment of the present application refers to an object in the first image that is occluded by the object to be filtered out, and there may be one or more target objects.
  • the first pixel block corresponding to the object to be filtered out can be determined in the first image, and a second image containing the target object occluded by the object to be filtered out is then acquired;
  • the second pixel block corresponding to the target object in the second image is used to perform replacement processing on the first pixel block in the first image, so that the object to be filtered out in the first image can be eliminated.
  • in other words, by acquiring images collected by the camera in different poses, the image that includes the target object occluded by the object to be filtered out is used to complement the occluded target object in the other image, so as to eliminate the object to be filtered out from that image.
  • This method can automatically filter out the non-shooting target objects in the image according to user needs, which can improve the display effect of the image and the user experience.
  • the device can automatically identify the object to be filtered out from the first image. For example, some identification rules for the object to be filtered out can be preset, and the object to be filtered out can then be identified according to these rules; for instance, objects such as trash cans, utility poles and moving vehicles can be automatically identified from the first image as objects to be filtered out, and objects located at the edge of the image can also be identified as objects to be filtered out.
  • the identified object to be filtered out can also be displayed to the user, for example, an image of the object to be filtered out is displayed in a frame on the user interface, and subsequent steps are performed after the user confirms.
  • the object to be filtered out may be determined in the first image according to the user's instruction, for example, the user may click or frame the object to be filtered out in the image.
  • the first pixel block may be a pixel block that only includes the object to be filtered out, or a pixel block that includes the object to be filtered out and its surrounding part of the background.
  • the first pixel block may be a human-shaped outline area 31 including only the person to be filtered out, or may be a rectangular area 32 including the person to be filtered out and its surrounding background area.
  • the user instruction for determining the object to be filtered may include a check box input by the user through a human-computer interaction interface, where the check box is used to frame the object to be filtered out.
  • the user can directly draw a selection frame on the user interface to select the objects to be filtered out.
  • the selection frame drawn by the user can be a rectangular selection frame, a circular selection frame or an irregular shaped selection frame, which can be set according to actual needs.
  • the image area selected by the user through the selection frame may be used as the first pixel block corresponding to the object to be filtered out. Since the selection frame drawn by the user may not be accurate enough, some edge areas of the object to be filtered out may not be framed while some background areas are framed instead; when the first pixel block is replaced, part of the object to be filtered out may remain in the replaced image, which degrades the replacement result. In order to avoid this phenomenon, after the user inputs the marquee to select the object to be filtered out, the first image can be subjected to superpixel segmentation processing to obtain multiple image areas, and the pixel block selected by the marquee is then adjusted according to the multiple image areas.
  • the principle of superpixel segmentation processing is to group pixels, gathering adjacent pixels with similar texture, color, brightness and other characteristics into a group as an image area, so that the image can be divided into multiple image areas. The pixel block framed by the marquee can then be adjusted according to the multiple image areas.
  • the superpixel segmentation can be implemented by using a currently common algorithm, and details are not described herein again.
  • the ratio of the portion of each image region that falls into the marquee to the whole image region can be used to adjust the pixel block selected by the marquee. For example, first determine, for each image area, the proportion of the part that falls into the marquee; if the proportion is greater than a preset proportion, such as 50%, the image area is considered to be selected by the marquee and is included in the marquee, and if the proportion is smaller than the preset proportion, the image area is considered not selected and is placed outside the marquee, so as to adjust the pixel block selected by the marquee.
  • in order to ensure that the first pixel block corresponding to the object to be filtered out is within the selection frame as much as possible, after adjusting the pixel block selected by the selection frame in the above-mentioned manner, the selection frame can also be enlarged to expand the pixel block selected by the marquee, which then serves as the first pixel block.
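The following is a minimal sketch of the marquee-refinement idea described above, assuming the superpixels are computed with SLIC (the library choice, the parameter values and the 50% overlap threshold are illustrative assumptions, not the patent's specification):

```python
# Sketch: refine a user-drawn selection box into a pixel mask using superpixels.
import numpy as np
from skimage.segmentation import slic

def refine_marquee(image, box, n_segments=300, min_overlap=0.5, margin=5):
    """image: HxWx3 RGB array; box: (x0, y0, x1, y1) user marquee."""
    x0, y0, x1, y1 = box
    labels = slic(image, n_segments=n_segments, compactness=10, start_label=0)
    inside = np.zeros(labels.shape, dtype=bool)
    inside[y0:y1, x0:x1] = True

    mask = np.zeros(labels.shape, dtype=bool)
    for lab in np.unique(labels):
        region = labels == lab
        # keep a superpixel if more than `min_overlap` of its area falls inside the box
        if inside[region].mean() > min_overlap:
            mask |= region

    # enlarge the adjusted selection slightly so the whole object is covered
    ys, xs = np.nonzero(mask)
    if xs.size:
        x0e, x1e = max(xs.min() - margin, 0), min(xs.max() + margin, mask.shape[1] - 1)
        y0e, y1e = max(ys.min() - margin, 0), min(ys.max() + margin, mask.shape[0] - 1)
        mask[y0e:y1e + 1, x0e:x1e + 1] = True
    return mask
```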
  • the camera device can be mounted on a movable platform, and the movement of the camera device can be controlled by controlling the movement of the movable platform, so that the camera device is adjusted to a second pose at which the target object occluded by the object to be filtered out can be observed, and the second image is acquired.
  • the movable platform can be any electronic device that includes a power component that can drive the movable platform to move.
  • the movable platform can be any one of drones, unmanned vehicles, unmanned ships, or intelligent robots.
  • the camera device can also be mounted on the movable platform through the pan-tilt.
  • the movable platform can be provided with a gimbal, and the camera device can be fixed on the gimbal; the movement of the camera device can be controlled by controlling the movement of the movable platform,
  • and the camera device can also be made to move relative to the movable platform by controlling the movement of the gimbal, so as to control the movement of the camera device and adjust it to the second pose.
  • the second pose may include a second position and a second orientation.
  • in a scene where the camera device is mounted on the movable platform through the gimbal, when the camera device is controlled to move, the movable platform can be controlled to move so that
  • the camera device is located at the second position, and the orientation of the camera device can be adjusted to the second orientation by controlling the rotation of the gimbal.
  • a gimbal can be set on the UAV, and a camera device can be installed on the gimbal.
  • the UAV can be controlled to fly to the second position; after reaching the second position,
  • the pan/tilt can be controlled to rotate, so that the orientation of the camera device is adjusted to the second orientation.
  • a variety of ways can be used in acquiring the second image. For example, it is possible to first determine the second pose where the target object can be observed, and then directly control the camera to adjust to the second pose to collect the second image.
  • the pose of the camera can also be changed continuously to obtain multiple frames of images collected in different poses; each time a frame of image is collected, it is determined whether the image includes the complete target object, until an image including the complete target object is obtained, which is then used as the second image.
  • in order to capture the target object that is occluded by the object to be filtered out, the second pose at which the camera can observe the target object may be automatically determined, and the camera is then controlled to move and adjust to the second pose to acquire the second image.
  • the camera device can be mounted on movable platforms such as drones and unmanned vehicles. Therefore, the position and attitude at which the camera device can observe the complete target object can be automatically calculated, and the drone or unmanned vehicle can then be automatically controlled to move to the corresponding position and collect the second image.
  • alternatively, the second pose at which the camera can capture the target object can be determined, and a prompt message indicating the second pose is then sent to the user, so that the user can control the camera to move and adjust to the second pose and acquire the second image.
  • the second pose can be automatically calculated first, and then a prompt message indicating the second pose is sent to the user through the interactive interface.
  • the prompt information can be text information or image information.
  • for example, prompt information such as "move 100 meters to the east from the current position" or "move 50 meters to the right from the current position" can be displayed on the interactive interface, so that the user can, according to the prompt information,
  • control the camera device to move to the corresponding position before shooting.
  • the user uses the movable platform to shoot, the user can control the control terminal corresponding to the movable platform according to the prompt information, and control the movable platform to move to the corresponding position through the control terminal.
  • the second pose includes a second position and a second orientation
  • the prompt information can also be image information.
  • an image that identifies the second position in the second pose can be displayed to the user on the interactive interface, where the second position from which the target object can be observed can be an area, so this area can be framed in the image, as shown in FIG. 4.
  • the rotation angle information corresponding to the second orientation adjusted to the second pose can also be displayed to the user, so that the user can adjust the camera device to capture the second image according to the displayed position information and angle information.
  • the position from which the occluded target object can be photographed is related to the relative positions of the object to be filtered out and the target object; for example, when the distance between the object to be filtered out and the target object is large, moving a small distance is enough to capture the complete target object.
  • when the distance between the object to be filtered out and the target object is relatively short, it may be necessary to move a long distance to collect an image including the complete target object. Therefore, in some embodiments, when determining the second pose, the position information of the object to be filtered out and the position information of the target object may be determined first, and the second pose is then determined according to the position information of the object to be filtered out and the position information of the target object.
  • the size of the object to be filtered out also affects the position from which the target object can be completely photographed. For example, if the object to be filtered out is large, the camera may have to be moved to a farther position to completely capture the target object; if it is small, the target object may be fully captured by moving a small distance. Therefore, in some embodiments, when the second pose is determined, the size of the object to be filtered out may also be determined, and the second pose is then determined according to the size of the object to be filtered out, the position information of the object to be filtered out, and the position information of the target object.
  • the trash can 51 in the figure is the object to be filtered out
  • the house 52 is the target object to be occluded
  • the black dot in the figure represents the position of the camera device.
  • the camera can bypass the object to be filtered out 51 (for example, go behind the object to be filtered out, as shown at "Position 2") to capture the complete target object 52; of course, the camera can also be translated a certain distance from the current shooting position to reach "Position 3" so that the target object 52 falls within the field of view of the camera.
  • the first pose includes a first position and a first orientation
  • the second pose includes a second position and a second orientation
  • the second position is located on a line that passes through the first position and is parallel to the plane where the object to be filtered out is located,
  • and the second orientation points to the position of the object to be filtered out; that is, the camera device can be translated a certain distance from its current first position to reach the second position, and the orientation of the camera device is then adjusted to point to the object to be filtered out.
  • when determining the second position, the moving distance may be determined according to the position information of the object to be filtered out, the position information of the target object, and the size of the object to be filtered out, and the second position is then determined according to the first position and
  • the moving distance. For example, as shown in FIG. 6, the small cuboid 61 in the figure is the object to be filtered out, the width of the object to be filtered out is L, the distance between the object to be filtered out and the camera is d1, and the large cuboid 62 is the occluded
  • target object; the distance between the target object and the camera device is d2, and the object to be filtered out and the target object are converted to a view from a top-down perspective.
  • the object to be filtered is shown as 65 in the figure
  • the area of the target object that is occluded by the object to be filtered is shown as 64 in the figure
  • the "position A" in the figure is the first position
  • the image plane schematic diagram 66 is a schematic diagram of the image collected by the camera at "position A",
  • the image plane schematic diagram 67 is a schematic diagram of the image collected by the camera at "position B", where "position B" is the position at which the camera device can just observe the left edge of the occluded area of the target object;
  • "position B" can be reached by translating a distance D from the first position, and it can be seen from FIG. 6 that the moving distance D can be solved by formula (1):
  • the distance d1 between the object to be filtered and the camera device and the distance d2 between the target object and the camera device may be determined by using multiple images collected by the camera device in different poses.
  • the width L of the object to be filtered out can be determined according to the distance between the object to be filtered out and the camera device and the imaging size of the object to be filtered out.
  • the second location can be any location in this area.
  • the three-dimensional space coordinates of "position B" can be determined according to the current three-dimensional space coordinates of the first position and the moving distance D, the three-dimensional space coordinates corresponding to the second position can then be further determined, and the camera is controlled to move to the second position.
  • an area including the object to be filtered out is determined, for example, as shown in FIG. 7 , an area 71 including the object to be filtered out 70 is determined, and then an annular area 72 surrounding the area 71 is determined.
  • a plurality of feature points can be extracted from the area 71 first, and the feature point extraction can be performed by using an existing feature point extraction algorithm, which will not be repeated here.
  • the matching points of the extracted feature points in the remaining multi-frame images can be determined, and the optical flow vector of each feature point can then be determined according to these matching points; from the optical flow vectors of the feature points,
  • the optical flow vector of the center of the object to be filtered out (that is, the center of the area 71) relative to each image can be fitted,
  • so that the matching points of the center of the object to be filtered out (that is, the area 71) in the remaining multi-frame images can be determined; according to the center of the object to be filtered out and its matching points,
  • the BA (Bundle Adjustment) algorithm can be used to determine the internal and external parameters of the camera, and the depth distance of the center of the object to be filtered out is determined according to the internal and external parameters of the camera, which is the distance d1 between the object to be filtered out and the camera device.
  • for the distance d2 between the target object and the camera device, feature points can be extracted from the annular region 72, and a similar method is then used to determine d2, which will not be repeated here.
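As a rough illustration of how such depths could be obtained once the camera's intrinsic matrix K and the relative pose between two of the collected frames are known (a simplification of the bundle-adjustment step described above; the names and the median-depth choice are assumptions):

```python
import numpy as np
import cv2

def estimate_depth(K, R, t, pts_ref, pts_other):
    """K: 3x3 intrinsics; (R, t): pose of the second view relative to the first;
    pts_ref, pts_other: Nx2 matched pixel coordinates of the region's feature points."""
    P1 = K @ np.hstack([np.eye(3), np.zeros((3, 1))])
    P2 = K @ np.hstack([R, t.reshape(3, 1)])
    # triangulatePoints takes 2xN arrays and returns 4xN homogeneous points
    X = cv2.triangulatePoints(P1, P2, pts_ref.T.astype(float), pts_other.T.astype(float))
    X = X[:3] / X[3]
    return float(np.median(X[2]))  # depth along the first camera's optical axis

# d1 could be estimated from feature points in area 71, d2 from points in annular area 72.
```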
  • when the second orientation is determined, it may be determined according to the first position and the position of the object to be filtered out in the image frame captured by the camera. For example, during the movement of the camera device, the position of the object to be filtered out in the captured image can be detected in real time, and the orientation of the camera device can be continuously adjusted to keep the object to be filtered out in the center of the image frame, so that the second orientation is also determined when the camera device moves to the second position.
  • alternatively, according to the position of the center of the object to be filtered out in the first image and the pose parameters corresponding to the first pose, it can be determined at which attitude angle the center of the object to be filtered out will be located at the center of the frame when the camera moves to the second position;
  • this attitude angle corresponds to the second orientation, thereby determining the second orientation.
  • when determining the second orientation, it may also be determined according to the first position, the positions of the left and right endpoints of the object to be filtered out, and the positions of the left and right endpoints of the target object. For example, as shown in FIG. 8, which side of the first position the second position is located on can be determined according to the three-dimensional coordinates of the first position and the second position; when the second position is located on the right side of the first position,
  • a connecting line AD can be determined according to the left endpoint A of the object to be filtered out and the right endpoint D of the target object,
  • and the second orientation points to the object to be filtered out along the connecting line AD.
  • according to the three-dimensional coordinates of the two endpoints, the attitude angle corresponding to the connecting line can also be solved.
  • similarly, when the second position is located on the left side of the first position, a connecting line BC can be determined according to the left endpoint B of the object to be filtered out and the right endpoint C of the target object, and the second orientation points to the object to be filtered out along the connecting line BC.
  • the attitude angle corresponding to the connection line BC can be determined according to the three-dimensional coordinates of the left endpoint B of the object to be filtered and the right endpoint C of the target object.
  • the second image may also be acquired by continuously adjusting the pose of the camera to acquire the image and then judging whether the acquired image can be used as the second image.
  • the pose of the camera can also be continuously changed to obtain multiple frames of images collected by the camera at different poses. Each time the camera collects a frame of image, it can be determined whether the image includes
  • the second pixel block corresponding to the target object; if it does, the image is used as the second image.
  • whether the image includes the second pixel block may be determined by the user, or may be determined automatically by the device executing the image processing method. Taking a camera device mounted on a movable platform such as a drone as an example, the user can adjust the pose of the drone and collect images in different poses, the drone can send the collected images back to the control terminal, and when the user determines that an image includes the complete target object, the user can click on the image to use it as the second image. Of course, it is also possible to automatically determine whether a collected image can be used as the second image by performing certain processing and identification on the collected image.
  • for automatic determination, a plurality of first feature points may be extracted from the first pixel block, and a plurality of second feature points may be extracted from the surrounding area of the first pixel block; for each frame of image collected after the camera device changes its pose, the first matching points of the first feature points and the second matching points of the second feature points in that image can be determined, and whether the image includes the second pixel block is then determined according to the positional relationship between the first matching points and the second matching points in the image.
  • the first feature point may be a feature point located within the first side of the first pixel block
  • the second feature point may be a feature point located outside the second side of the first pixel block
  • the first side is the opposite side of the second side; for example, the first side is the left side of the first pixel block and the second side is the right side of the first pixel block.
  • (a) of FIG. 9 is a schematic diagram of the first image 90; the first pixel block 91 can be determined from the first image 90,
  • a plurality of first feature points 92 are extracted within the first side (i.e., the left side) of the first pixel block 91, and a plurality of second feature points 93 are extracted outside the second side (i.e., the right side) of the first pixel block.
  • (b) of FIG. 9 is a schematic diagram of the image 94;
  • the first matching points 95 of the first feature points and the second matching points 96 of the second feature points can be determined in the image 94;
  • when the second matching points 96 are located on the first side (the left side) of the first matching points 95, it is considered that the target object is no longer blocked by the object to be filtered out, so it can be determined that the image includes the second pixel block corresponding to the target object.
  • alternatively, the plurality of second feature points may be located in a ring-shaped pixel block surrounding the first pixel block, and whether the image includes the second pixel block is determined according to the positions of the first matching points and the second matching points;
  • a preset number may be determined according to actual requirements, for example, it may be required that 90% of the second matching points are located on one side of the first matching points. As shown in FIG. 10,
  • (a) is a schematic diagram of the first image 100
  • the first pixel block 101 can be determined from the first image 100
  • a plurality of first feature points 102 can be extracted from the first pixel block 101 , extract a plurality of second feature points 104 in the annular pixel block 103 around the first pixel block 101, when the camera changes the pose to collect a frame of image
  • (b) is a schematic diagram of the image 105
  • the first feature point 102 is at the first matching point 106 of the image 105
  • the second feature point 104 is at the second matching point 107 of the image 105
  • the positional relationship between the first matching points 106 and the second matching points 107 can be determined; as shown in the figure, when more than a certain number of second matching points 107 (for example, more than 90% of the total number of second matching points) are located on one side of the first matching points 106, it is considered that the target object is no longer blocked by the object to be filtered out, so it can be determined that the image includes the second pixel block corresponding to the target object.
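A minimal sketch of this check, assuming the feature points are tracked into the new frame with pyramidal Lucas-Kanade optical flow and that "one side" refers to the horizontal direction of FIG. 9/10 (the function names, the 90% ratio and the left/right convention are illustrative assumptions):

```python
import numpy as np
import cv2

def includes_second_block(img0, img1, first_pts, second_pts, ratio=0.9):
    """first_pts: first feature points inside the first pixel block;
    second_pts: second feature points outside it (e.g. in a surrounding ring).
    Returns True when enough second matching points end up on one side of the
    first matching points, which the method treats as the target being visible."""
    g0 = cv2.cvtColor(img0, cv2.COLOR_BGR2GRAY)
    g1 = cv2.cvtColor(img1, cv2.COLOR_BGR2GRAY)
    p1 = first_pts.astype(np.float32).reshape(-1, 1, 2)
    p2 = second_pts.astype(np.float32).reshape(-1, 1, 2)
    m1, s1, _ = cv2.calcOpticalFlowPyrLK(g0, g1, p1, None)
    m2, s2, _ = cv2.calcOpticalFlowPyrLK(g0, g1, p2, None)
    m1 = m1[s1.ravel() == 1].reshape(-1, 2)
    m2 = m2[s2.ravel() == 1].reshape(-1, 2)
    if len(m1) == 0 or len(m2) == 0:
        return False
    ref_x = m1[:, 0].mean()                # mean x of the first matching points
    frac_left = (m2[:, 0] < ref_x).mean()  # fraction of second matches to their left
    return max(frac_left, 1.0 - frac_left) >= ratio
```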
  • in some scenarios, the target object occluded by the object to be filtered out may not be completely captured;
  • that is, a complete target object cannot be captured by adjusting the pose of the camera device, so the target object occluded by the object to be filtered out in the first image cannot be complemented.
  • prompt information may be sent to the user to prompt the user that the object to be filtered out in the first image cannot be filtered out in the current scene.
  • a prompt message indicating that the object to be filtered out cannot be filtered out is sent to the user, where the prompt message can be displayed in the form of a pop-up window; for example, pop-up information can be displayed on the user's
  • interactive interface, prompting the user that the currently selected object to be filtered out cannot be filtered out.
  • the first preset condition may be at least one of the following: the first distance between the object to be filtered out and the target object is less than a first preset threshold, the second distance between the target object and the camera is less than a second preset threshold, or the distance relationship between the first distance and the second distance does not satisfy a preset second condition.
  • the first preset threshold, the second preset threshold and the second condition can be flexibly set according to the actual scene. For example, if the distance between the object to be filtered and the target object is less than 1 meter, the complete target object cannot be photographed.
  • the first preset threshold is set to 1 meter.
  • the second preset threshold and the second preset condition may be set by similar means, and details are not described herein again.
  • before performing the replacement processing on the first pixel block of the first image with the second pixel block in the second image, the second pixel block may first be determined in the collected second image.
  • the mapping relationship between the pixel points of the first image and the pixel points of the second image may be determined first; according to the mapping relationship,
  • the mapping area of the first pixel block in the second image is determined as the second pixel block, and the second pixel block is then used to replace the first pixel block in the first image.
  • third feature points may be extracted from the peripheral area of the first pixel block in the first image, third matching points of the third feature points are determined in the second image, and the mapping relationship is then determined according to the third feature points and the third matching points.
  • the mapping relationship between the pixels of the first image and the pixels of the second image can be represented by a homography matrix. For example, assuming that the pixel coordinate of a pixel point in the first image is P1, the pixel coordinate of the corresponding pixel point in the second image is P2, and the homography matrix is H, then the pixel points of the first image and the second image satisfy formula (2):
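Formula (2) is not reproduced in this excerpt; written in the standard homogeneous-coordinate form it would be (an assumption based on the usual homography relation, consistent with the 8-unknown remark below since h33 is normally fixed to 1):

\[
P_2 \;\sim\; H\,P_1,
\qquad
\begin{bmatrix} u_2 \\ v_2 \\ 1 \end{bmatrix}
\;\sim\;
\begin{bmatrix}
h_{11} & h_{12} & h_{13}\\
h_{21} & h_{22} & h_{23}\\
h_{31} & h_{32} & 1
\end{bmatrix}
\begin{bmatrix} u_1 \\ v_1 \\ 1 \end{bmatrix}.
\]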
  • since the homography matrix H has 8 unknowns, at least 4 pairs of feature points and matching points are needed to solve it. Therefore, at least 4 third feature points can be extracted from the surrounding area of the first pixel block in the first image (such as the area surrounding the first pixel block), the third matching points of these feature points are determined in the second image, and H is then solved according to the third feature points and the third matching points.
  • when solving H, the RANSAC (Random Sample Consensus) algorithm can be used.
  • after H is obtained, the mapping area of the first pixel block of the first image in the second image can be determined according to H, and the mapping area is then used as the second pixel block to replace the first pixel block in the first image, so as to complement the target object in the first image that is occluded by the object to be filtered out.
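A compact sketch of this replacement step with OpenCV, assuming the matched third feature points are already available (warping the second image into the first image's frame is an equivalent way of applying the mapping; the names and RANSAC threshold are illustrative assumptions):

```python
import numpy as np
import cv2

def replace_block(first_img, second_img, pts_first, pts_second, block_mask):
    """pts_first / pts_second: Nx2 matched points around the first pixel block (N >= 4);
    block_mask: boolean mask of the first pixel block in the first image."""
    # estimate the homography with RANSAC so mismatched point pairs are rejected
    H, _ = cv2.findHomography(pts_second.astype(np.float32),
                              pts_first.astype(np.float32), cv2.RANSAC, 3.0)
    h, w = first_img.shape[:2]
    # warp the second image into the first image's coordinates, then copy the block area
    warped = cv2.warpPerspective(second_img, H, (w, h))
    result = first_img.copy()
    result[block_mask] = warped[block_mask]
    return result
```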
  • alternatively, a ring-shaped pixel block surrounding the first pixel block can be determined in the first image, and then
  • a matching ring-shaped block matching the ring-shaped pixel block is determined in the second image, and the pixel block surrounded by the matching ring-shaped block in the second image is used as the second pixel block.
  • the first image can be an image with a better shooting effect.
  • the camera device may collect multiple images in the first pose, and the first image is then selected from the multiple images, where the first image may be selected by the user, or may be automatically selected by the device executing the image processing method according to image clarity, brightness, picture composition and other information.
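One simple way the automatic selection could be done is to score frames by sharpness, for example with the variance of the Laplacian (a common heuristic; this criterion and the names are assumptions, and brightness or composition scores could be combined in the same way):

```python
import cv2

def pick_reference(images):
    """Return the sharpest frame as the first (reference) image."""
    def sharpness(img):
        gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
        return cv2.Laplacian(gray, cv2.CV_64F).var()
    return max(images, key=sharpness)
```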
  • the category information of the object to be filtered out may be determined first, where the category information is used to identify the object to be filtered out as a dynamic object or a static object, and a corresponding processing method is then adopted to eliminate the object to be filtered out according to its category information.
  • determining whether the object to be filtered out is a dynamic object or a static object may be determined according to a plurality of images collected by the camera device in the first pose.
  • for example, the motion of each pixel of the object to be filtered out in the first image relative to the plurality of images may be examined to determine whether
  • the object to be filtered out is a static object or a dynamic object.
  • if the object to be filtered out is a static object, the steps of determining the first pixel block corresponding to the object to be filtered out in the first image, obtaining the second image that is captured by the camera in the second pose and includes
  • the second pixel block, and using the second pixel block to replace the first pixel block in the first image to generate the replacement-processed first image are performed.
  • if the object to be filtered out is a dynamic object, the third pixel block corresponding to the object to be filtered out may be determined in the first image, and then, among the other images acquired by the camera in the first pose,
  • the fourth pixel blocks located at the pixel position corresponding to the third pixel block are examined; a third image in which the fourth pixel block is a static area (that is, the area corresponding to the third pixel block is not blocked) is determined from the other images, and the fourth pixel block in the third image is used to replace the third pixel block in the first image.
  • as shown in (a) of FIG. 11, the third pixel block 110 corresponding to the object to be filtered out can be determined in the first image, and the fourth pixel blocks (111 in the figure) located at the corresponding pixel position of the third pixel block 110 in the other images (image 1, image 2 and image 3 in the figure) are then determined; it can then be determined whether the fourth pixel block 111 is a static area, and if so, that image is used as the third image.
  • for example, if the fourth pixel block 111 in image 1 is a static area,
  • image 1 is regarded as the third image,
  • and the fourth pixel block 111 in image 1 is used to replace the third pixel block 110 in the first image.
  • when determining the third image, the dynamic areas in the other images may be determined first, and then, for the first image,
  • the pixel blocks located at the pixel position corresponding to the third pixel block in the other images are examined in order of acquisition time from nearest to farthest relative to the first image, until the pixel block at the corresponding pixel position does not overlap with the dynamic area (i.e., is not blocked), and that image is regarded as the third image.
  • the first image is the Kth frame image collected by the camera
  • the other images are the K+1th frame, the K+2th frame, the K+3th frame, and so on, collected by the camera;
  • the dynamic areas in the K+1th frame, the K+2th frame and the K+3th frame can be determined first.
  • when determining the dynamic areas, the optical flow vectors of each frame relative to its neighboring frames can be calculated; if the modulo length of the optical flow vector of a pixel point is greater than a certain threshold, the pixel point is considered to be moving, and the pixels determined to be moving are then clustered to obtain multiple pixel point sets.
  • a set in which the number of pixels is greater than a certain value is considered to be a motion area.
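A minimal sketch of this dynamic-area detection, using dense Farnebäck optical flow and connected-component clustering as stand-ins for the unspecified flow and clustering methods (the thresholds and function names are illustrative assumptions):

```python
import numpy as np
import cv2

def dynamic_mask(frame_a, frame_b, mag_thresh=2.0, min_pixels=200):
    """Mark pixels whose optical-flow vector (frame_a -> frame_b) has a modulo length
    above `mag_thresh`, then keep only clusters larger than `min_pixels`."""
    g0 = cv2.cvtColor(frame_a, cv2.COLOR_BGR2GRAY)
    g1 = cv2.cvtColor(frame_b, cv2.COLOR_BGR2GRAY)
    flow = cv2.calcOpticalFlowFarneback(g0, g1, None, 0.5, 3, 15, 3, 5, 1.2, 0)
    moving = (np.linalg.norm(flow, axis=2) > mag_thresh).astype(np.uint8)
    # cluster moving pixels with connected components; drop tiny clusters as noise
    n, labels, stats, _ = cv2.connectedComponentsWithStats(moving, connectivity=8)
    mask = np.zeros(moving.shape, dtype=bool)
    for i in range(1, n):
        if stats[i, cv2.CC_STAT_AREA] >= min_pixels:
            mask |= labels == i
    return mask
```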
  • as shown in (b) of FIG. 11, the rectangular area and the circular area in the image are dynamic areas and the remaining areas are static areas; the third pixel block 121 corresponding to the object to be filtered out is determined in the first image, and the pixel blocks located at the pixel position of the third pixel block 121 in the K+1th frame,
  • the K+2th frame and the K+3th frame are the areas 122 framed by the dotted lines.
  • it is first determined whether the pixel block 122 corresponding to the pixel position of the third pixel block 121 in the K+1th frame overlaps with the dynamic area; if so, it is then determined whether the pixel block 122 at the corresponding pixel position in the K+2th frame overlaps with the dynamic area; when it is determined that the K+2th frame meets the requirement, the K+2th frame is used as the third image, and the third pixel block of the first image is replaced with the pixel block at the corresponding pixel position in this frame.
  • when the camera device collects images in the first pose, it usually collects multiple frames of images continuously to form an image sequence. In adjacent frames, the position of a dynamic object does not change much, so it is not suitable to use adjacent frames to filter out the dynamic objects of the first image, and judging these image frames one by one is resource-intensive. Therefore, when acquiring the multiple images collected by the camera, some images that can reflect the changes of dynamic objects can be selected from the image sequence collected by the camera, so that these images can be used more efficiently to filter out dynamic objects. Thus, in some embodiments, the other images in the plurality of images except the first image may be images whose differences from the first image exceed a preset threshold, or images that are separated from the first image by a specified number of frames. For example, the first image can be used as a reference, and images separated from it by a specified number of frames
  • are acquired as the above-mentioned multiple images; for instance, if the first image is the 5th frame of the image sequence, the other images are the 10th, 15th, 20th and 25th frames, and so on.
  • when the static objects in the image are filtered out by collecting images at different poses, the dynamic objects in the image will interfere with the filtering out of the static objects to a certain extent, resulting in an inability to filter out the static objects well.
  • therefore, in some embodiments, before using the above method to filter out static objects, the dynamic objects in the image may be filtered out first.
  • the present application also provides an image processing method, which can be used to automatically remove dynamic objects in an image.
  • the method is shown in FIG. 12 and includes the following steps:
  • the image processing method in this embodiment of the present application may be performed by a camera device that collects the first image and the second image
  • the camera device may be any device with an image acquisition function.
  • the camera device may be a camera, or a terminal equipped with a camera
  • such as a mobile phone, tablet, laptop or smart wearable device; it can also be a movable platform equipped with a camera, such as a drone or unmanned vehicle.
  • the image processing method of the present application can also be executed by another device communicatively connected to the camera device, for example a cloud server; the camera collects the first image and the second image and sends them to the cloud server for processing. Of course, if the camera device is a movable platform, the image processing method of the present application can also be executed by a control terminal communicatively connected to the movable platform, which is not limited in the present application.
  • the dynamic objects to be filtered out in the embodiments of the present application are objects that the user wishes to remove from the image.
  • the dynamic objects to be filtered out may be determined by the user or selected by the user, and there may be one or more dynamic objects to be filtered out.
  • the first image can be selected from the image sequence continuously collected by the camera in a certain fixed pose, where the first image can be selected by the user, or can be automatically selected by the device executing the image processing method, for example by automatically selecting from the image sequence an image with better clarity, composition or shooting angle as the first image.
  • after the first image is determined, the first pixel block corresponding to the dynamic object to be filtered out can be determined in the first image, multiple frames of second images are then determined from the image sequence, and from the second images a third image is determined
  • in which the pixel block at the pixel position corresponding to the first pixel block is a static area (that is, an image in which the pixel block at the corresponding pixel position is not occluded); the pixel block at the corresponding pixel position of the first pixel block in the third image
  • is then used to replace the first pixel block in the first image to remove the dynamic object.
  • the adjacent frames of the first image may not be used to filter out the dynamic object to be filtered out.
  • some images that are quite different from the first image can be selected from the image sequence as the second images;
  • for example, the second image may be an image whose difference from the first image exceeds a preset threshold, or an image that is spaced apart from the first image by a specified number of frames.
  • for example, the first image can be used as a reference, and an image whose difference from the reference exceeds the preset threshold
  • is acquired as the second image; it is also possible to acquire images at a specified frame interval from the first image. For example, if the first image is the 5th frame of the image sequence, the second images are the 10th frame, the 15th frame, the 20th frame, the 25th frame, and so on.
  • the following operations may be performed for each frame of the second image: calculating the optical flow vector of each pixel of the first image relative to the second image,
  • determining, from the pixel points of the first image, the target pixel points whose optical flow vector modulo length is greater than the preset threshold, and clustering these target pixel points to obtain the first pixel block corresponding to the dynamic object.
  • the optical flow vector of each pixel of the first image and its adjacent one or more frames of images can be calculated.
  • if the modulo length of the optical flow vector of a pixel is greater than a certain threshold, the pixel is considered to be moving; the pixels determined to be moving are then clustered to obtain multiple sets of pixels, and a set in which the number of pixels is greater than a certain value (sets with too few pixels can be ignored as noise) is considered to be a dynamic object.
  • the dynamic regions in the plurality of second images can be determined, and for each first pixel block the corresponding third image can be determined in the following manner: the pixel block at the pixel position of the dynamic object of the first image is examined in the second images in order of acquisition time from nearest to farthest relative to the first image, until the pixel block at the corresponding pixel position does not overlap the dynamic area, and that second image is regarded as the third image.
  • in this way, a reference image (i.e., the first image) can be determined from multiple frames of images collected by the camera in the same pose, and the first pixel block corresponding to the dynamic object to be filtered out can then be determined from the reference image;
  • by determining whether the pixel block at the pixel position corresponding to the first pixel block in the other images is a static area, an image in which the area corresponding to the first pixel block is not occluded can be quickly screened out, and the first pixel block in the first image is then replaced with the pixel block at the corresponding pixel position of that image, which can quickly and efficiently remove the dynamic object from the first image.
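A short sketch of this fill-from-the-nearest-static-frame loop, assuming all frames share the same pose and that per-frame dynamic masks have already been computed (for example with a detector like the dynamic_mask sketch above; the names are illustrative assumptions):

```python
import numpy as np

def fill_dynamic_blocks(reference, key_frames, key_frame_dyn_masks, dyn_blocks):
    """reference: the first image; key_frames: other frames ordered from nearest to
    farthest in acquisition time; key_frame_dyn_masks: per-key-frame boolean dynamic
    masks; dyn_blocks: list of boolean masks, one per dynamic object in the reference."""
    result = reference.copy()
    for block in dyn_blocks:
        for frame, frame_dyn in zip(key_frames, key_frame_dyn_masks):
            # use the nearest key frame whose corresponding region is entirely static
            if not np.any(frame_dyn & block):
                result[block] = frame[block]
                break
    return result
```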
  • as shown in FIG. 13, the user can use the control terminal 132 to control the camera device 133 mounted on the drone 131 to collect images, and the drone 131 can send
  • the images collected by the camera 133 back to the control terminal so as to display them to the user. Filtering out the objects in the image can be performed by the control terminal.
  • the following describes the filtering methods for dynamic objects and static objects respectively. Since dynamic objects will cause certain interference to the filtering of static objects, the dynamic objects can be filtered out first, and the static objects can then be filtered out.
  • the camera can be controlled to collect a series of image sequences at a certain fixed pose, and then the user selects a reference image I0 from the image sequence, or the control terminal automatically selects the reference image I0 from the image sequence.
  • the key frame is an image frame with a large difference from the reference image I0, for example, the angle or displacement of an object may be different from the angle or displacement of the object in the reference image I0
  • the reference image I0 is the Kth frame
  • the key frame can be an image frame with an interval of 5, 10, 15, and 20 from the Kth frame.
  • the remaining key frames also calculate the optical flow vector with the two frames (or multiple frames) before and after itself, respectively, and determine the dynamic area and the static area of each key frame.
  • for each dynamic object of the k0th frame, a key frame in which the corresponding area is a static area is determined.
  • the triangle, square and circular areas in the figure represent dynamic areas, and the rest of the areas are static areas.
  • for example, the corresponding area in the k-1th frame, such as the circular dotted area in the figure, is a static area, so the k-1th frame can be used to fill the circular dynamic area of the k0th frame.
  • The remaining triangular dynamic area and square dynamic area need to be filled from the k-5th frame and the k7th frame respectively.
  • For each dynamic object, the corresponding area of the selected key frame can be used to replace that dynamic object in the k0 frame, so as to achieve the purpose of filtering out the dynamic objects.
  • The static object to be filtered out can be determined in the reference image I0, either recognized automatically by the control terminal or determined by the user by frame selection on the interactive interface. As shown in Figure 15, the user can frame the static object to be filtered out in the reference image I0 (as shown in panel (1)). Since the frame selected by the user is not very accurate, the selection frame can be adjusted automatically: for example, the reference image I0 is subjected to superpixel segmentation to obtain multiple image areas, and the ratio of the part of each area that falls into the selection frame to that image area is then determined; areas with a high ratio are placed inside the frame and the others outside it. The frame adjusted by this method is shown in panel (2).
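A sketch of this selection-frame refinement, assuming SLIC superpixels from scikit-image and the 50% overlap ratio used as an example elsewhere in the text; names and parameter values are illustrative:

```python
import numpy as np
from skimage.segmentation import slic

def refine_box(image, box_mask, n_segments=400, ratio=0.5):
    """box_mask: boolean mask of the user's rough selection frame."""
    segments = slic(image, n_segments=n_segments, compactness=10)
    refined = np.zeros_like(box_mask)
    for label in np.unique(segments):
        region = segments == label
        # fraction of this superpixel that falls inside the user's frame
        overlap = (region & box_mask).sum() / region.sum()
        if overlap > ratio:
            refined[region] = True      # move the whole superpixel inside the frame
    return refined
```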
  • Feature points can be extracted, and feature point tracking and matching between the multiple frames of images, as well as between preceding and following frames, can be used to determine the depth distances of the static object and of the background area. The details are as follows:
  • feature point extraction is performed on the corresponding area of the static object on the reference image.
  • the feature point extraction can use a general algorithm, such as Harris algorithm, SIFT algorithm, SURF algorithm, ORB algorithm, etc.
  • the sparse method can be used to first extract the feature points of the image.
  • the corner points can be selected as the feature points.
  • Optional corner detection algorithms include FAST (features from accelerated segment test), SUSAN, and the Harris operator. The following is an example using the Harris corner detection algorithm:
  • The matrix A is the structure tensor constructed from the image gradients, as in formula (3),
  • where Ix and Iy are the gradient information of a point on the image in the x and y directions respectively, and the corner response function Mc can be defined as formula (4):
  • det(A) is the determinant of matrix A
  • trace(A) is the trace of matrix A
  • the sensitivity of the detector is adjusted by a tunable parameter in formula (4) (the constant κ in the standard Harris response)
  • the set threshold is Mth.
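A compact sketch of the Harris response described by formulas (3) and (4): the structure tensor A is assembled from the gradients Ix and Iy, and the response Mc = det(A) − κ·trace(A)² is thresholded with Mth. The smoothing window and the values of κ and Mth are illustrative:

```python
import cv2
import numpy as np

def harris_corners(gray, kappa=0.04, Mth=1e6):
    gray = gray.astype(np.float64)
    Ix = cv2.Sobel(gray, cv2.CV_64F, 1, 0, ksize=3)   # gradient in x
    Iy = cv2.Sobel(gray, cv2.CV_64F, 0, 1, ksize=3)   # gradient in y
    # entries of the structure tensor A, smoothed over a local window (formula (3))
    Ixx = cv2.GaussianBlur(Ix * Ix, (5, 5), 1.0)
    Iyy = cv2.GaussianBlur(Iy * Iy, (5, 5), 1.0)
    Ixy = cv2.GaussianBlur(Ix * Iy, (5, 5), 1.0)
    det_A = Ixx * Iyy - Ixy * Ixy
    trace_A = Ixx + Iyy
    Mc = det_A - kappa * trace_A ** 2                  # response of formula (4)
    ys, xs = np.where(Mc > Mth)                        # keep points whose response exceeds Mth
    return np.stack([xs, ys], axis=1)
```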
  • the displacement h of the feature point between the previous and next image frames can be obtained through the iteration of formula (5),
  • the feature points can be updated continuously.
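The iterative displacement estimate of formula (5) appears to correspond to a Lucas–Kanade style update; the sketch below uses OpenCV's pyramidal tracker as a stand-in, with illustrative window size and termination criteria:

```python
import cv2
import numpy as np

def track_features(prev_gray, cur_gray, prev_pts):
    prev_pts = prev_pts.astype(np.float32).reshape(-1, 1, 2)
    cur_pts, status, _ = cv2.calcOpticalFlowPyrLK(
        prev_gray, cur_gray, prev_pts, None,
        winSize=(21, 21), maxLevel=3,
        criteria=(cv2.TERM_CRITERIA_EPS | cv2.TERM_CRITERIA_COUNT, 30, 0.01))
    good = status.ravel() == 1
    # displacement h of each successfully tracked feature point
    h = (cur_pts - prev_pts)[good].reshape(-1, 2)
    return cur_pts[good].reshape(-1, 2), h
```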
  • The center of a static object does not necessarily have a feature point. Therefore, for the center of the static object, the fitted optical flow vector must be used to determine the position of the center in each image, so that the BA algorithm can then be used to obtain the three-dimensional coordinates of the center of the static object.
  • The optical flow vector of the center point of the static object can be estimated from the optical flow vectors of the other feature points within the region corresponding to the static object framed in the image, specifically as in formula (6):
  • x i is the optical flow vector of the feature points in the frame
  • w i is the weight, which can be determined according to the 2D image position of the feature point and the center point, as shown in formula (7):
  • the weighting parameter in formula (7) is adjusted according to experience and is an adjustable parameter
  • d i represents the distance from feature point i to the center point; (u i , v i ) represents the 2D image pixel coordinates of feature point i
  • (u 0 , v 0 ) represents the 2D image pixel coordinates of the center point of the target frame.
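A sketch of the weighted estimate of formulas (6) and (7): the centre's optical flow is a weighted sum of the flow vectors of the framed feature points, with weights that decay with the 2D distance d_i to the centre (u0, v0). The Gaussian form of the weights and the spread parameter are assumptions; the text only says the parameter is tuned empirically:

```python
import numpy as np

def center_flow(feature_pts, feature_flows, center, sigma=50.0):
    """feature_pts, feature_flows: Nx2 arrays; center: the 2D point (u0, v0)."""
    d = np.linalg.norm(feature_pts - center, axis=1)      # d_i: distance to the centre
    w = np.exp(-(d ** 2) / (2 * sigma ** 2))               # w_i from the 2D positions (assumed form)
    w = w / w.sum()
    return (w[:, None] * feature_flows).sum(axis=0)        # fitted flow vector of the centre point
```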
  • the disparity and optical flow of the center of the static object can be calculated, and the three-dimensional depth information of the center of the static object can be obtained.
  • the 3D depth information of the background region can be calculated.
  • d1 and d2 are the depth distance of the static object and the depth distance of the background area obtained in the previous step.
  • the maximum width L of the static object can be determined according to the size of the static object in the image and the depth distance of the static object.
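One way to make this concrete is the pinhole relation between pixel width, depth and metric width; the formula below is a standard assumption and is not spelled out in the original text:

```python
def object_width(width_px: float, d1: float, fx: float) -> float:
    """Metric width L of the static object from its width in pixels, its depth d1,
    and the focal length fx expressed in pixels (pinhole camera assumption)."""
    return width_px * d1 / fx
```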
  • Figure 6 shows that the drone is flying to the right, so the limit position where all the occluded background areas can be seen is that the left edge of the occluded background area is just observed. Of course, the drone can also fly to the left, and the corresponding limit viewing angle is to just see the right edge of the occluded background area.
  • the camera orientation can be adjusted at the same time, and the right edge of the pixel area corresponding to the object to be filtered can be centered. Through the above method, the adjustment of the camera pose can be completed, and the purpose of adjusting the viewing angle is achieved.
  • the pose where the occluded background area can be observed is determined, and the drone can be automatically controlled to adjust to the pose, and the image In can be obtained by shooting at this pose.
  • Feature points can be extracted from the pixel region corresponding to the background region in the reference image I0, and feature point matching can be performed in the image In to obtain a matching point queue.
  • the H matrix represents the mapping relationship between two matched pixels on two images collected by the camera at different poses, as shown in formula (8):
  • x 0 is a feature point on the background region of image I0
  • x n is the point in image In that matches x 0 .
  • Using the H matrix to represent the mapping relationship between two points actually requires these pixels to lie on the same plane in space.
  • The background area can be treated as a plane only when it is sufficiently far away. Therefore, when d1 is relatively small (for example, less than 100 m), the user should also be reminded that the filtering effect is poor or filtering is not possible.
  • The tolerance parameter, that is, the maximum degree of unevenness allowed when fitting the plane, can be selected according to the depth of the background (for example, 2% of the background depth); if a plane cannot be fitted, the user is prompted that the filtering effect is poor or filtering is not possible.
  • the H matrix has 8 unknowns and requires at least 4 pairs of points to calculate.
  • the RANSAC (Random sample consensus) algorithm can be used to effectively filter out feature points and matching points with poor matching degree, further improve the effectiveness of the results, and obtain a more accurate H matrix.
  • According to the H matrix, the whole image In can be projected onto the camera pose corresponding to image I0 to obtain the image In'; the area of image I0 corresponding to the object to be filtered out is then covered with the corresponding pixel area of image In', and the static object is thereby removed.
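A sketch of this final step, assuming OpenCV: H is estimated with RANSAC from the matched background points, In is warped to the pose of I0, and the warped pixels are pasted over the region of the object to be filtered out. Function and variable names are illustrative:

```python
import cv2
import numpy as np

def remove_static_object(I0, In, pts0, ptsn, object_mask):
    # pts0 / ptsn: matched background feature points in I0 and In (Nx2, N >= 4)
    H, inliers = cv2.findHomography(ptsn, pts0, cv2.RANSAC, 3.0)   # RANSAC rejects bad matches
    h, w = I0.shape[:2]
    In_warped = cv2.warpPerspective(In, H, (w, h))        # In projected to the pose of I0 (In')
    result = I0.copy()
    result[object_mask > 0] = In_warped[object_mask > 0]  # replace the occluded region in I0
    return result
```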
  • the present application also provides an image processing apparatus.
  • the image processing apparatus includes a processor 171 , a memory 172 , and a computer program stored in the memory 172 and executable by the processor 171 .
  • when the processor executes the computer program, the following steps are implemented:
  • the second image includes a second pixel block corresponding to a target object, and the target object is an object in the first image that is occluded by the object to be filtered out;
  • the first pixel block in the first image is replaced by the second pixel block to generate a replaced first image.
  • when the processor is configured to acquire the second image captured by the camera in the second pose, it is specifically configured to:
  • the camera is controlled to move to adjust to the second pose and capture the second image.
  • when the processor is configured to acquire the second image captured by the camera in the second pose, it is specifically configured to:
  • when the processor is configured to determine the second pose, it is specifically configured to:
  • the second pose is determined according to the position information of the object to be filtered and the position information of the target object.
  • when the processor is configured to determine the second pose according to the position information of the object to be filtered and the position information of the target object, it is specifically configured to:
  • the second pose is determined according to the position information of the object to be filtered, the position information of the target object, and the size of the object to be filtered.
  • the first pose includes a first position and a first orientation
  • the second pose includes a second position and a second orientation
  • the second position is located on a straight line passing through the first position and parallel to the plane where the object to be filtered is located, and the second orientation points to the position where the object to be filtered is located.
  • the second location is determined by:
  • the second position is determined according to the first position and the moving distance.
  • the second orientation is determined by:
  • the second orientation is determined according to the first position, the positions of the left and right endpoints of the object to be filtered, and the positions of the left and right endpoints of the target object.
  • the second pose includes a second position and a second orientation
  • when the processor is configured to send prompt information indicating the second pose to the user, it is specifically configured to:
  • the image marked with the second position is displayed to the user, and the rotation angle information corresponding to the second orientation is displayed.
  • when the processor is configured to acquire the second image captured by the camera in the second pose, it is specifically configured to:
  • An image including the second pixel block is used as the second image.
  • when the processor is configured to determine whether the image includes the second pixel block, it is specifically configured to:
  • Whether the image includes the second pixel block is determined according to the positional relationship between the first matching point and the second matching point in the image.
  • the first feature point is located within a first side of the first pixel block, and the second feature point is located outside a second side of the first pixel block, wherein the first side is the side opposite the second side;
  • when the processor is used to determine whether the image includes the second pixel block according to the positional relationship between the first matching point and the second matching point, it is specifically used for:
  • a plurality of the second feature points are located in a ring-shaped pixel block surrounding the first pixel block, and when the processor is configured to determine, according to the first matching points and the second matching points, whether the image includes the second pixel block,
  • the processor is specifically used for:
  • when the camera device is mounted on a movable platform and the processor is used to control the camera device to move, the processor is specifically used to:
  • Movement of the movable platform is controlled to control movement of the camera.
  • the camera device is mounted on a movable platform through a pan/tilt
  • when the processor is used to control the camera device to move, it is specifically used to:
  • the movable platform is controlled to move, and/or the pan/tilt is controlled to generate relative motion between the camera and the movable platform, so as to control the camera to move.
  • the camera device is mounted on the movable platform through a pan/tilt head
  • the second pose includes a second position and a second orientation
  • when the processor is configured to control the camera device to move so that the camera device is adjusted to the second pose, it is specifically used for:
  • the movable platform is controlled to move so that the camera is located at the second position; and the pan-tilt is controlled to rotate so that the orientation of the camera is adjusted to the second orientation.
  • the movable platform includes any one of an unmanned aerial vehicle, an unmanned vehicle, and an unmanned boat.
  • when the processor is configured to determine the first pixel block corresponding to the object to be filtered out in the first image, it is specifically configured to:
  • the first pixel block corresponding to the object to be filtered out is determined from the first image.
  • the instruction includes a selection frame input by the user through a human-computer interaction interface, and the selection frame is used to frame-select the static target object.
  • the first pixel block is a pixel block selected by the marquee, and the device is further configured to:
  • when the processor adjusts the pixel block selected by the frame based on the plurality of image regions, it is specifically configured to:
  • adjust the pixel block selected by the frame according to the ratio, for each image area in the plurality of image areas, between the part of that image area that falls within the frame and the whole image area.
  • the apparatus is also used to:
  • the preset first condition includes one or more of the following:
  • the first distance between the object to be filtered and the target object is less than a first preset threshold
  • the distance magnitude relationship between the first distance and the second distance does not satisfy a preset second condition.
  • the apparatus is also used to:
  • the second pixel block is determined in the second image.
  • when the processor is configured to determine the second pixel block in the second image, it is specifically configured to:
  • the mapping area of the first pixel block in the second image is determined according to the mapping relationship as the second pixel block.
  • when the processor is configured to determine the mapping relationship between the pixels of the first image and the pixels of the second image, it is specifically configured to:
  • the mapping relationship is determined based on the third feature point and the third matching point.
  • when the processor is configured to determine the second pixel block in the second image, it is specifically configured to:
  • a pixel block surrounded by the matching ring-shaped block in the second image is used as the second pixel block.
  • when the processor is configured to acquire the first image captured by the camera in the first pose, it is specifically configured to:
  • the first image is determined among the plurality of images.
  • before the processor is configured to determine the first pixel block corresponding to the object to be filtered out in the first image, the processor is further configured to:
  • perform the step of determining the first pixel block corresponding to the object to be filtered out in the first image, the step of acquiring the second image, including the second pixel block, collected by the camera in the second pose, and the step of replacing the first pixel block in the first image with the second pixel block to generate the replaced first image.
  • the apparatus is also used to:
  • the object to be filtered is a dynamic object, perform the following steps:
  • the third pixel block in the first image is replaced with the fourth pixel block in the third image.
  • when the processor is configured to determine the category of the object to be filtered out from the plurality of images, it is specifically configured to:
  • the category information of the object to be filtered out is determined according to the optical flow of each pixel of the object to be filtered out relative to the images other than the first image among the plurality of images.
  • when the processor is configured to determine, from the other images, the third image in which the fourth pixel block is a static area, it is specifically configured to:
  • in order of acquisition time from nearest to farthest from the first image, determine the pixel block at the corresponding pixel position of the third pixel block in each of the other images, until the pixel block at the corresponding pixel position does not overlap the dynamic area; that other image is then used as the third image.
  • the difference between the other images and the first image exceeds a preset threshold
  • the other images are images spaced apart from the first image by a specified frame.
  • The present application also provides another image processing apparatus. The image processing apparatus includes a processor, a memory, and a computer program stored in the memory and executable by the processor; when the processor executes the computer program, the following steps are implemented:
  • the first pixel block in the first image is replaced with a pixel block at the corresponding pixel position in the third image.
  • when the processor is used to determine the first pixel block corresponding to the dynamic object in the first image, it is specifically used to:
  • the target pixel points are clustered to obtain the first pixel block corresponding to the dynamic object.
  • when the processor is configured to determine, from the plurality of second images, the third image in which the pixel block at the corresponding pixel position is a static area, it is specifically configured to:
  • in order of acquisition time of the second images from nearest to farthest from the first image, determine the pixel block at the corresponding pixel position of the first pixel block in each second image, until the pixel block at the corresponding pixel position does not overlap the dynamic area; that second image is then used as the third image.
  • the difference between the second image and the first image exceeds a preset threshold
  • the second image is an image spaced apart from the first image by a specified frame.
  • the present application also provides a movable platform, and the movable platform can be any device such as an unmanned aerial vehicle, an unmanned vehicle, an unmanned ship, an intelligent robot, and a handheld PTZ.
  • the movable platform includes a camera device and an image processing device.
  • The image processing device can implement any of the image processing methods in the embodiments of the present application. For specific implementation details, refer to the descriptions of the embodiments of the above image processing methods, which are not repeated here.
  • an embodiment of the present specification further provides a computer storage medium, where a program is stored in the storage medium, and when the program is executed by a processor, the image processing method in any of the foregoing embodiments is implemented.
  • Embodiments of the present specification may take the form of a computer program product embodied on one or more storage media having program code embodied therein, including but not limited to disk storage, CD-ROM, optical storage, and the like.
  • Computer-usable storage media includes permanent and non-permanent, removable and non-removable media, and storage of information can be accomplished by any method or technology.
  • Information may be computer readable instructions, data structures, modules of programs, or other data.
  • Examples of computer storage media include, but are not limited to, phase-change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read only memory (ROM), Electrically Erasable Programmable Read Only Memory (EEPROM), Flash Memory or other memory technology, Compact Disc Read Only Memory (CD-ROM), Digital Versatile Disc (DVD) or other optical storage, Magnetic tape cassettes, magnetic tape magnetic disk storage or other magnetic storage devices or any other non-transmission medium that can be used to store information that can be accessed by a computing device.


Abstract

An image processing method and apparatus, and a movable platform. The method comprises: acquiring a first image collected by a photographic apparatus in a first pose, and determining, in the first image, a first pixel block corresponding to an object to be filtered out; acquiring a second image collected by the photographic apparatus in a second pose, wherein the second image comprises a second pixel block corresponding to a target object, and the target object is an object, which is obscured by the object to be filtered out, in the first image; and performing replacement processing on the first pixel block in the first image by means of the second pixel block, so as to generate a first image which has been subjected to replacement processing.

Description

Image processing method, device and movable platform

Technical Field

The present application relates to the technical field of image processing, and in particular to an image processing method, device and movable platform.

Background Art

In the process of shooting images or videos, users often face scenes in which non-target subjects appear within the shooting field of view; these non-target subjects are also captured in the final image or video and affect the final shooting effect. For example, if the user is photographing a building, non-target subjects such as passers-by, vehicles, trash cans or telephone poles may appear next to the building; these non-target subjects block the photographed building, or appear in the captured image and affect its display effect. In order to improve the shooting effect of images or videos and better meet users' shooting needs, it is necessary to propose a solution for removing non-target subjects from an image.

Summary of the Invention

In view of this, the present application provides an image processing method, device and movable platform.
According to a first aspect of the present application, an image processing method is provided, the method comprising:

acquiring a first image collected by a camera device in a first pose, and determining, in the first image, a first pixel block corresponding to an object to be filtered out;

acquiring a second image collected by the camera device in a second pose, the second image including a second pixel block corresponding to a target object, the target object being an object in the first image that is occluded by the object to be filtered out;

performing replacement processing on the first pixel block in the first image by means of the second pixel block, to generate a replaced first image.

According to a second aspect of the present application, an image processing method is provided, the method comprising:

determining a first pixel block corresponding to a dynamic object in a first image;

determining the pixel blocks at the pixel position of the first pixel block at the corresponding pixel positions in a plurality of second images, the plurality of second images and the first image being collected by a camera device in the same pose;

determining, from the plurality of second images, a third image in which the pixel block at the corresponding pixel position is a static area;

replacing the first pixel block in the first image with the pixel block at the corresponding pixel position in the third image.
According to a third aspect of the present application, an image processing apparatus is provided, the image processing apparatus including a processor, a memory, and a computer program stored in the memory and executable by the processor; when the processor executes the computer program, the following steps are implemented:

acquiring a first image collected by a camera device in a first pose, and determining, in the first image, a first pixel block corresponding to an object to be filtered out;

acquiring a second image collected by the camera device in a second pose, the second image including a second pixel block corresponding to a target object, the target object being an object in the first image that is occluded by the object to be filtered out;

performing replacement processing on the first pixel block in the first image by means of the second pixel block, to generate a replaced first image.

According to a fourth aspect of the present application, an image processing apparatus is provided, the image processing apparatus including a processor, a memory, and a computer program stored in the memory and executable by the processor; when the processor executes the computer program, the following steps are implemented:

determining a first pixel block corresponding to a dynamic object in a first image;

determining the pixel blocks at the pixel position of the first pixel block at the corresponding pixel positions in a plurality of second images, the plurality of second images and the first image being collected by a camera device in the same pose;

determining, from the plurality of second images, a third image in which the pixel block at the corresponding pixel position is a static area;

replacing the first pixel block in the first image with the pixel block at the corresponding pixel position in the third image.

According to a fifth aspect of the present application, a movable platform is provided, the movable platform including a camera device and any one of the image processing apparatuses in the embodiments of the present application.

By applying the solution provided by the present application, images collected by the camera device in different poses are acquired, and the second image, which includes the target object occluded by the object to be filtered out, is used to complete the occluded target object in the first image, so as to eliminate the object to be filtered out in the first image. This is applicable not only to filtering out dynamic objects but also to filtering out static objects. With this method, non-target objects in an image can be filtered out automatically according to the user's needs, which can improve the display effect of the image and the user experience.
Brief Description of the Drawings

In order to illustrate the technical solutions in the embodiments of the present application more clearly, the following briefly introduces the drawings used in the description of the embodiments. Obviously, the drawings in the following description are only some embodiments of the present application; for those of ordinary skill in the art, other drawings can also be obtained from these drawings without creative effort.
FIG. 1 is a schematic diagram of filtering out non-shooting target objects in an image according to an embodiment of the present application.

FIG. 2 is a flowchart of an image processing method according to an embodiment of the present application.

FIG. 3 is a schematic diagram of determining the first pixel block corresponding to the object to be filtered out according to an embodiment of the present application.

FIG. 4 is a schematic diagram of a prompt interface for prompting the user to adjust to the second pose according to an embodiment of the present application.

FIG. 5 is a schematic diagram of adjusting the camera device to the second pose according to an embodiment of the present application.

FIG. 6 is a schematic diagram of determining the second pose according to an embodiment of the present application.

FIG. 7 is a schematic diagram of determining the corresponding pixel regions of the object to be filtered out and the target object according to an embodiment of the present application.

FIG. 8 is a schematic diagram of determining the second orientation according to an embodiment of the present application.

FIG. 9 is a schematic diagram of determining whether an image captured by the camera device can be used as the second image according to an embodiment of the present application.

FIG. 10 is a schematic diagram of determining whether an image captured by the camera device can be used as the second image according to an embodiment of the present application.

FIG. 11(a) is a schematic diagram of filtering out dynamic objects according to an embodiment of the present application.

FIG. 11(b) is a schematic diagram of determining a third image according to an embodiment of the present application.

FIG. 12 is a flowchart of an image processing method according to an embodiment of the present application.

FIG. 13 is a schematic diagram of an application scenario according to an embodiment of the present application.

FIG. 14 is a schematic diagram of filtering out dynamic objects according to an embodiment of the present application.

FIG. 15 is a schematic diagram of frame-selecting the object to be filtered out and the occluded background area according to an embodiment of the present application.

FIG. 16 is a schematic diagram of filtering out static objects according to an embodiment of the present application.

FIG. 17 is a schematic diagram of the logical structure of an image processing apparatus according to an embodiment of the present application.
Detailed Description

The technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application. Obviously, the described embodiments are only some of the embodiments of the present application, not all of them. Based on the embodiments in the present application, all other embodiments obtained by those of ordinary skill in the art without creative effort fall within the protection scope of the present application.

During the process of shooting a video or image, there may be some non-target objects within the shooting field of view, and these non-target objects are also captured in the image, affecting its display effect. These non-target objects may be dynamic, such as walking passers-by or vehicles, or static, such as trash cans, telephone poles or buildings. As shown in (a) of FIG. 1, the user wants to photograph the house 11 and its surrounding scenery, but the trash can 12 and a passer-by in front of the house are also captured in the image 13, which seriously affects the visual effect of the image. Therefore, it is necessary to remove these non-target objects from the image, as shown in (b) of FIG. 1, so that the image has a better shooting effect and the user experience is improved.
In order to remove non-target objects (hereinafter referred to as objects to be filtered out) from an image, an embodiment of the present application provides an image processing method: a first image is collected in a first pose, and a second image is then collected in a second pose from which the target object occluded by the object to be filtered out in the first image can be observed, so that the second image is used to complete the occluded target object in the first image, thereby removing the object to be filtered out.

Specifically, the flowchart of the method is shown in FIG. 2 and includes the following steps:

S202: acquiring a first image collected by a camera device in a first pose, and determining, in the first image, a first pixel block corresponding to an object to be filtered out;

S204: acquiring a second image collected by the camera device in a second pose, the second image including a second pixel block corresponding to a target object, the target object being an object in the first image that is occluded by the object to be filtered out;

S206: performing replacement processing on the first pixel block in the first image by means of the second pixel block, to generate a replaced first image.
The image processing method of the embodiments of the present application may be executed by the camera device that collects the first image and the second image. The camera device may be any device with an image or video capture function; for example, it may be a camera module, a terminal provided with a camera such as a camera, mobile phone, tablet, notebook computer or smart wearable device, or a movable platform provided with a camera such as an unmanned aerial vehicle, an unmanned vehicle or a handheld gimbal. Of course, in some embodiments, the image processing method of the embodiments of the present application may also be executed by another device communicatively connected to the camera device, for example a cloud server: after collecting the first image and the second image, the camera device sends them to the cloud server for processing. Of course, if the camera device is a movable platform, the image processing method of the embodiments of the present application may also be executed by a control terminal communicatively connected to the movable platform, which is not limited in the present application.

The object to be filtered out in the embodiments of the present application refers to an object that the user wishes to remove from the first image. The object to be filtered out may be a dynamic object or a static object, and there may be one or more objects to be filtered out.

The target object in the embodiments of the present application refers to an object in the first image that is occluded by the object to be filtered out; there may be one or more target objects.

After the first image collected by the camera device is acquired, the first pixel block corresponding to the object to be filtered out can be determined in the first image; a second image containing the target object occluded by the object to be filtered out is then acquired, and the second pixel block corresponding to the target object in the second image is used to replace the first pixel block in the first image, so that the object to be filtered out in the first image can be eliminated.

In the present application, by acquiring images collected by the camera device in different poses, the image including the target object occluded by the object to be filtered out is used to complete the target object in another image, so as to eliminate the object to be filtered out in that image. This is applicable not only to filtering out dynamic objects but also to filtering out static objects. With this method, non-target objects in an image can be filtered out automatically according to the user's needs, improving the display effect of the image and the user experience.
The object to be filtered out in the first image may be selected by the user or identified automatically by the device. In some embodiments, the device may automatically identify the object to be filtered out from the first image; for example, some identification rules for objects to be filtered out may be preset, and objects are identified according to these rules. For example, objects such as trash cans, telephone poles and moving vehicles may be automatically identified from the first image as objects to be filtered out, and objects located at the edge of the image may also be identified as objects to be filtered out. Of course, after the object to be filtered out is identified automatically, the identified object may first be shown to the user, for example by displaying an image with the object framed on the user interface, and the subsequent steps are performed after the user confirms.

In some embodiments, the object to be filtered out may be determined in the first image according to a user instruction; for example, the user may click or frame the object to be removed in the image. The first pixel block may be a pixel block that includes only the object to be filtered out, or a pixel block that includes the object to be filtered out and part of its surrounding background. For example, as shown in FIG. 3, the first pixel block may be the human-shaped outline area 31 that includes only the person to be filtered out, or the rectangular area 32 that includes the person to be filtered out and the surrounding background area.

In some embodiments, the user instruction for determining the object to be filtered out may include a selection frame input by the user through a human-computer interaction interface, the selection frame being used to frame-select the object to be filtered out. For example, the user may directly draw a selection frame on the user interface to select the object to be filtered out; the selection frame drawn by the user may be a rectangular frame, a circular frame or an irregularly shaped frame, which may be set according to actual needs.

In some embodiments, the image area framed by the user's selection frame may be used as the first pixel block corresponding to the object to be filtered out. Since the frame drawn by the user may not be precise enough, part of the edge area of the object to be filtered out may not be framed while some background area is framed instead; as a result, after the first pixel block is replaced, incompletely replaced parts of the object to be filtered out may remain in the replaced image and cause artifacts. To avoid this, after the user inputs the selection frame, superpixel segmentation may first be performed on the first image to obtain a plurality of image areas, and the pixel block framed by the selection frame is then adjusted according to these image areas. The principle of superpixel segmentation is to group pixels: adjacent pixels with similar texture, color, brightness and other characteristics are grouped together as one image area, so that the image is divided into a plurality of image areas; the pixel block framed by the selection frame can then be adjusted according to these image areas. The superpixel segmentation of the first image can be implemented by currently common algorithms, which are not described in detail here.

In some embodiments, when the pixel block framed by the selection frame is adjusted according to the plurality of image areas obtained by superpixel segmentation, the adjustment may be made according to the ratio of the part of each image area that falls into the frame to that image area. For example, the proportion of the part of each image area that falls into the frame may first be determined; if the proportion is greater than a preset ratio, for example greater than 50%, the image area is considered selected by the frame and is placed inside the frame; if it is smaller, the image area is considered not selected and is placed outside the frame, thereby adjusting the pixel block framed by the selection frame. Of course, in some embodiments, in order to ensure as far as possible that the first pixel block corresponding to the object to be filtered out is inside the frame, after the framed pixel block is adjusted in the above manner, the frame may also be enlarged, and the enlarged framed pixel block is used as the first pixel block.

By performing superpixel processing on the image and then fine-tuning the selection frame input by the user to adjust its content, the user does not need to frame the object to be filtered out precisely; the image area that the user wants to select can still be determined fairly accurately, so that the first pixel block corresponding to the object to be filtered out is determined accurately, which facilitates the user's operation.
In order to conveniently control the camera device to adjust to different poses so as to collect images in different poses, in some embodiments the camera device may be mounted on a movable platform, and the movement of the movable platform may be controlled to control the movement of the camera device, so as to adjust to the second pose from which the target object occluded by the object to be filtered out can be observed and to collect the second image.

In some embodiments, the movable platform may be any electronic device that includes a power component capable of driving the movable platform to move. For example, the movable platform may be an unmanned aerial vehicle, an unmanned vehicle, an unmanned ship or an intelligent robot.

In some embodiments, the camera device may also be mounted on the movable platform through a gimbal. For example, the movable platform may be provided with a gimbal and the camera device may be fixed on the gimbal; the movement of the camera device may then be controlled by controlling the movement of the movable platform, or relative motion between the camera device and the movable platform may be produced by controlling the movement of the gimbal, so as to control the camera device to move and adjust to the second pose.

In some embodiments, the second pose may include a second position and a second orientation. In the scenario where the camera device is mounted on the movable platform through a gimbal, when the camera device is controlled to move, the movable platform may be controlled to move so that the camera device is located at the second position, and the gimbal may be controlled to rotate so that the orientation of the camera device is adjusted to the second orientation. Taking an unmanned aerial vehicle as an example of the movable platform, a gimbal may be provided on the UAV and the camera device may be mounted on the gimbal; after the second pose is determined, the UAV may be controlled to fly to the second position, and after reaching the second position, the gimbal may be controlled to rotate so that the orientation of the camera device is adjusted to the second orientation.
The second image may be acquired in a number of ways. For example, the second pose from which the target object can be observed may be determined first, and the camera device is then directly controlled to adjust to the second pose and collect the second image; alternatively, the pose of the camera device may be changed continuously to obtain multiple frames of images collected in different poses, and each time a frame is collected it is judged whether the image includes the complete target object, until an image including the complete target object is obtained and used as the second image.

In some embodiments, in order to collect an image including the target object occluded by the object to be filtered out, the second pose from which the camera device can observe the target object may first be determined automatically, and the camera device is then controlled to move and adjust to the second pose so as to collect the second image. For example, the camera device may be mounted on a movable platform such as an unmanned aerial vehicle or an unmanned vehicle; therefore, the position and attitude at which the camera device can observe the complete target object can be calculated automatically, and the UAV or unmanned vehicle is then automatically controlled to move to the corresponding position and collect the second image.

In some embodiments, the second pose from which the target object can be captured may also be determined automatically first, and prompt information indicating the second pose is then sent to the user, so that the user controls the camera device to move and adjust to the second pose and collect the second image. For example, in a scenario where the user shoots with a camera device such as a mobile phone or camera, the second pose may first be calculated automatically, and prompt information indicating the second pose is then sent to the user through the interactive interface. The prompt information may be text information or image information; for example, prompts such as "move 100 meters east from the current position" or "move 50 meters to the right from the current position" may be displayed on the interactive interface, so that the user can move the camera device to the corresponding position according to the prompt and then shoot. Of course, if the user shoots with a movable platform, the user may operate the control terminal corresponding to the movable platform according to the prompt information and control the movable platform to move to the corresponding position through the control terminal.

Of course, in some embodiments, the second pose includes a second position and a second orientation, and when the prompt information indicating the second pose is sent to the user, an image marked with the second position may be shown to the user, together with the rotation angle information corresponding to adjusting to the second orientation. In order to make the prompt more intuitive and enable the user to locate the second pose quickly, the prompt information may also be image information; for example, an image marking the second position of the second pose may be shown to the user on the interactive interface. The second position from which the target object can be observed may be an area, so this area may be framed in the image, as shown in FIG. 4. At the same time, the rotation angle information corresponding to the second orientation may also be shown to the user, so that the user adjusts the camera device according to the displayed position and angle information to collect the second image.
由于可以拍摄到目标对象的位置与待滤除对象和目标对象的相对位置有关,比如,待滤除对象与目标对象的距离较远时,移动较小的距离即可以采集到包含完整目标对象的图像,待滤除对象与目标对象的距离较近时, 可能需要移动较远的距离才可采集到包含完整的目标对象的图像。因此,在某些实施例中,在确定第二位姿时,可以先确定待滤除对象的位置信息和目标对象的位置信息,然后根据待滤除对象的位置信息和目标对象的位置信息确定第二位姿。Since the position of the target object that can be photographed is related to the relative position of the object to be filtered and the target object, for example, when the distance between the object to be filtered and the target object is far, moving a small distance can collect the complete target object. When the distance between the object to be filtered and the target object is relatively short, it may be necessary to move a long distance to collect an image including the complete target object. Therefore, in some embodiments, when determining the second pose, the location information of the object to be filtered and the location information of the target object may be determined first, and then the location information of the object to be filtered and the location information of the target object may be determined. second pose.
当然,待滤除对象的尺寸对可以完整拍摄到目标对象的位置也有影响,比如待滤除对象尺寸较大,可能得移动到较远的位置才可完整拍摄到目标对象,而待滤除对象尺寸较小,可能只需移动较小的距离即可完整拍摄到目标对象。因此,在某些实施例中,根据待滤除对象的位置信息和目标对象的位置信息确定第二位姿时,可以确定待滤除对象的尺寸,然后根据待滤除对象的尺寸、待滤除对象的位置信息以及目标对象的位置信息确定第二位姿。Of course, the size of the object to be filtered also affects the position where the target object can be completely photographed. For example, if the size of the object to be filtered is large, the target object may have to be moved to a farther position to be completely photographed. Small in size, it may be possible to fully capture the target object by moving a small distance. Therefore, in some embodiments, when the second pose is determined according to the position information of the object to be filtered and the position information of the target object, the size of the object to be filtered may be determined, and then the size of the object to be filtered, the size of the object to be filtered and the size of the object to be filtered may be determined. The second pose is determined by dividing the position information of the object and the position information of the target object.
如图5所示,图中垃圾桶51为待滤除对象,房子52为被遮挡的目标对象,图中的黑点表示摄像装置所在的位置。假设当前摄像装置在“位置1”拍摄到第一图像,第一图像中目标对象52被待滤除对象51遮挡,在确定可以完整拍摄到目标对象52的第二位姿时,可以让摄像装置绕过待滤除对象51(比如绕到待滤除对象后面,如图中的“位置2”),以拍摄到完整的目标对象52,当然,也可以沿着当前拍摄位置移动一段距离,到达“位置3”,以便目标对象52可以落入摄像装置的视角范围。在某些实施例中,第一位姿包括第一位置和第一朝向,第二位姿包括第二位置和第二朝向,第二位置位于经过第一位置且与待滤除对象所在平面平行的直线上,第二朝向指向所述待滤除对象所在的位置,即可以沿着摄像装置当前所在的第一位置平移一段距离,到达第二位置,再将摄像装置的朝向调整成指向待滤除对象。As shown in FIG. 5 , the trash can 51 in the figure is the object to be filtered out, the house 52 is the target object to be occluded, and the black dot in the figure represents the position of the camera device. Assuming that the current camera captures the first image at "Position 1", and the target object 52 in the first image is blocked by the object 51 to be filtered out, when it is determined that the second pose of the target object 52 can be completely captured, the camera can Bypass the object 51 to be filtered (for example, go behind the object to be filtered, as shown in "Position 2") to capture the complete target object 52, of course, you can also move a distance along the current shooting position to reach "Position 3" so that the target object 52 can fall within the field of view of the camera. In some embodiments, the first pose includes a first position and a first orientation, the second pose includes a second position and a second orientation, and the second position passes through the first position and is parallel to a plane where the object to be filtered is located On the straight line, the second orientation points to the position of the object to be filtered, that is, you can translate a distance along the current first position of the camera device to reach the second position, and then adjust the orientation of the camera device to point to the object to be filtered. remove objects.
In some embodiments, when determining the second position, a moving distance may be determined according to the position information of the object to be filtered out, the position information of the target object, and the size of the object to be filtered out, and the second position may then be determined according to the first position and the moving distance. For example, as shown in FIG. 6, the small cuboid 61 in the figure is the object to be filtered out, the width of the object to be filtered out is L, the distance between the object to be filtered out and the camera is d1, and the large cuboid 62 is the occluded target object, whose distance from the camera is d2. The object to be filtered out and the target object are converted to a top view, where the target object in the top view is shown as 63, the object to be filtered out in the top view is shown as 65, and the region of the target object occluded by the object to be filtered out is shown as 64. "Position A" in the figure is the first position, image-plane diagram 66 illustrates the image captured by the camera at the first position, and image-plane diagram 67 illustrates the image captured by the camera at "Position B", where "Position B" is the position at which the camera can just observe the left edge of the occluded region of the target object. "Position B" can be reached by translating a distance D from the first position. As can be seen from FIG. 6, the moving distance D can be solved by formula (1):
D = L · d2 / (d2 − d1)    Formula (1)
Here, the distance d1 between the object to be filtered out and the camera and the distance d2 between the target object and the camera may be determined from multiple images captured by the camera at different poses. The width L of the object to be filtered out may be determined according to the distance between the object to be filtered out and the camera and the imaging size of the object to be filtered out.
As can be seen from FIG. 6, the occluded region can be observed from anywhere in the region to the right of the line connecting "Position B" and the right edge of the object to be filtered out; therefore, the second position may be any position within that region. After the moving distance D is determined, the three-dimensional coordinates of "Position B" can be determined from the current three-dimensional coordinates of the first position and the moving distance D, the three-dimensional coordinates corresponding to the second position can then be determined, and the camera can be controlled to move to the second position.
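As a minimal illustration of this step, the following Python sketch computes the translation distance D of formula (1) and a candidate second position. It assumes d1, d2, and L have already been estimated as described below and that the translation direction is supplied as a vector parallel to the plane of the object to be filtered out; the function names and parameters are illustrative, not part of the disclosed method.

```python
import numpy as np

def translation_distance(d1: float, d2: float, width: float) -> float:
    """Distance D from formula (1): D = L * d2 / (d2 - d1).

    d1: depth of the object to be filtered out, d2: depth of the target
    object, width: width L of the object to be filtered out. Returns
    infinity when the two depths are (nearly) equal, i.e. the occluded
    region can never be fully revealed by translating the camera.
    """
    if d2 - d1 <= 1e-6:
        return float("inf")
    return width * d2 / (d2 - d1)

def second_position(first_pos: np.ndarray, move_dir: np.ndarray, D: float) -> np.ndarray:
    """Translate the first camera position by D along a direction parallel
    to the plane of the object to be filtered out."""
    move_dir = move_dir / np.linalg.norm(move_dir)
    return first_pos + D * move_dir

# Example: occluder 1 m wide at 4 m, target at 10 m  ->  D ≈ 1.67 m
D = translation_distance(d1=4.0, d2=10.0, width=1.0)
pos_b = second_position(np.array([0.0, 0.0, 0.0]), np.array([1.0, 0.0, 0.0]), D)
```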
In some embodiments, when determining the distance d1 between the object to be filtered out and the camera and the distance d2 between the target object and the camera, multiple frames of images captured by the camera at different poses may be acquired, and a region including the object to be filtered out may be determined in one of the frames. For example, as shown in FIG. 7, a region 71 including the object 70 to be filtered out is determined, and an annular region 72 surrounding the region 71 is then determined. When determining the distance d1 between the object to be filtered out and the camera, a plurality of feature points may first be extracted from the region 71; an existing feature-point extraction algorithm may be used, which is not described in detail here. The matching points of the extracted feature points in the remaining frames can then be determined, the optical flow vector of each feature point can be determined from its matching points in the remaining frames, and the optical flow vectors of the center of the object to be filtered out (i.e., of the region 71) relative to each of the frames can be fitted from the optical flow vectors of the feature points, so that the matching points of the center of the object to be filtered out in the remaining frames can be determined. From the feature points and matching points of the center of the object to be filtered out, a bundle adjustment (BA) algorithm can be used to determine the intrinsic and extrinsic parameters of the camera, and the depth of the center of the object to be filtered out can be determined from these parameters; this depth is the distance d1 between the object to be filtered out and the camera. For the distance d2 between the target object and the camera, feature points can be extracted from the annular region 72 and d2 can be determined in a similar manner, which is not repeated here.
In some embodiments, when determining the second orientation, the second orientation may be determined according to the first position and the position of the object to be filtered out in the image frames captured by the camera. For example, while the camera is moving, the position of the object to be filtered out in the captured frames may be detected in real time, and the orientation of the camera may be continuously adjusted to keep the object to be filtered out at the center of the frame, so that the second orientation is determined once the camera reaches the second position. For example, from the position of the center of the object to be filtered out in the first image and the pose parameters corresponding to the first pose, the attitude angle at which the center of the object to be filtered out lies at the center of the frame when the camera has moved to the second position can be determined, thereby determining the second orientation.
In some embodiments, when determining the second orientation, the second orientation may also be determined according to the first position, the positions of the left and right endpoints of the object to be filtered out, and the positions of the left and right endpoints of the target object. For example, as shown in FIG. 8, which side of the first position the second position lies on may first be determined from the three-dimensional coordinates of the first position and the second position. When the second position is on the right side of the first position, a line AD may be determined from the left endpoint A of the object to be filtered out and the right endpoint D of the target object, and the second orientation points toward the object to be filtered out along the line AD. Since the three-dimensional coordinates of the left endpoint A of the object to be filtered out and the right endpoint D of the target object can be determined, the attitude angle corresponding to the line connecting the two endpoints can also be solved. Likewise, when the second position is on the left side of the first position, a line BC may be determined from the right endpoint B of the object to be filtered out and the left endpoint C of the target object, and the second orientation points toward the object to be filtered out along the line BC; the attitude angle corresponding to the line BC can be determined from the three-dimensional coordinates of the endpoints B and C.
Of course, when acquiring the second image, the second image may also be obtained by continuously adjusting the pose of the camera, capturing images, and determining whether each captured image can serve as the second image. For example, in some embodiments, the pose of the camera may be changed continuously to obtain multiple frames captured at different poses; each time the camera captures a frame, it can be determined whether that frame includes the second pixel block corresponding to the target object, and if so, that frame is taken as the second image.
Whether an image includes the second pixel block may be judged by the user, or may be judged automatically by the device executing the image processing method. Taking a camera mounted on a movable platform such as an unmanned aerial vehicle (UAV) as an example, the user may adjust the pose of the UAV and capture images at different poses, the UAV may transmit the captured images back to the control terminal, and when the user judges that an image includes the target object, the user may click on that image to use it as the second image. Of course, whether a captured image can serve as the second image may also be determined automatically by processing and recognizing the captured image. For example, in some embodiments, after the first image is captured and the first pixel block corresponding to the object to be filtered out is determined from the first image, a plurality of first feature points may be extracted from the first pixel block and a plurality of second feature points may be extracted from the surrounding area of the first pixel block; for each frame captured after the camera changes its pose, the first matching points of the first feature points and the second matching points of the second feature points in that frame can be determined, and whether the frame includes the second pixel block can then be determined from the positional relationship between the first matching points and the second matching points in the frame.
In some embodiments, the first feature points may be feature points located inside the first side of the first pixel block, and the second feature points may be feature points located outside the second side of the first pixel block. When determining, from the positional relationship between the first matching points and the second matching points, whether an image includes the second pixel block, it can be judged whether the second matching points are located on the first side of the first matching points; if so, the image is judged to include the second pixel block. Here, the first side is the side opposite to the second side; for example, if the first side is the left side of the first pixel block, the second side is its right side, and if the first side is the upper side of the first pixel block, the second side is its lower side. For example, as shown in FIG. 9, part (a) of FIG. 9 is a schematic diagram of the first image 90. The first pixel block 91 can be determined from the first image 90, a plurality of first feature points 92 are extracted inside the first side (i.e., the left side) of the first pixel block 91, and a plurality of second feature points 93 are extracted outside the second side (i.e., the right side) of the first pixel block. After the camera changes its pose and captures a frame, shown schematically as image 94 in part (b), the first matching points 95 of the first feature points and the second matching points 96 of the second feature points in the image 94 can be determined, and the positional relationship between the first matching points 95 and the second matching points 96 can be judged. As shown in the figure, the second matching points 96 are located on the first side (the left side) of the first matching points 95, so the target object is considered to be no longer occluded by the object to be filtered out, and it can therefore be determined that the image includes the second pixel block corresponding to the target object.
In some embodiments, there are a plurality of second feature points, and these second feature points may be located in an annular pixel block surrounding the first pixel block. When determining, from the positional relationship between the first matching points and the second matching points, whether an image includes the second pixel block, it is judged that the image includes the second pixel block when a preset number of the second matching points are located on one side of the first matching points. The preset number may be determined according to actual requirements; for example, it may be required that 90% of the second matching points lie on one side of the first matching points. As shown in FIG. 10, part (a) is a schematic diagram of the first image 100. The first pixel block 101 can be determined from the first image 100, a plurality of first feature points 102 are extracted inside the first pixel block 101, and a plurality of second feature points 104 are extracted in the annular pixel block 103 surrounding the first pixel block 101. After the camera changes its pose and captures a frame, shown schematically as image 105 in part (b), the first matching points 106 of the first feature points 102 and the second matching points 107 of the second feature points 104 in the image 105 can be determined, and the positional relationship between the first matching points 106 and the second matching points 107 can be judged. As shown in the figure, when more than a certain number of the second matching points 107 (for example, more than 90% of the total number of second matching points) are located on one side of the first matching points 106, the target object is considered to be no longer occluded by the object to be filtered out, and it can therefore be determined that the image includes the second pixel block corresponding to the target object.
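The following is a simplified sketch of this visibility test, assuming the camera translates horizontally so the comparison can be done on the x (column) coordinates of the matched points only; the 90% ratio and the function name are illustrative.

```python
import numpy as np

def second_block_visible(first_matches: np.ndarray,
                         second_matches: np.ndarray,
                         ratio: float = 0.9) -> bool:
    """Treat the target as unoccluded when at least `ratio` of the
    background matches (second matching points) lie on one side of the
    occluder matches (first matching points).

    first_matches / second_matches: Nx2 arrays of pixel coordinates (x, y).
    """
    left_of = np.mean(second_matches[:, 0] < first_matches[:, 0].min())
    right_of = np.mean(second_matches[:, 0] > first_matches[:, 0].max())
    return max(left_of, right_of) >= ratio
```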
In some scenes, even if the pose of the camera is adjusted, the target object occluded by the object to be filtered out cannot be completely photographed. For example, when the object to be filtered out is very close to the target object, the complete target object cannot be photographed by adjusting the pose of the camera, so the part of the target object occluded by the object to be filtered out in the first image cannot be completed. In such a scene, prompt information may be sent to the user indicating that the object to be filtered out in the first image cannot be removed in the current scene. Therefore, in some embodiments, when a preset first condition is triggered, prompt information indicating that the object to be filtered out cannot be removed is sent to the user. The prompt information may be displayed in the form of a pop-up window; for example, a pop-up message may be displayed on the user interaction interface to inform the user that the currently selected object to be filtered out cannot be removed.
In some embodiments, the first preset condition may be at least one of the following: a first distance between the object to be filtered out and the target object is smaller than a first preset threshold; a second distance between the target object and the camera is smaller than a second preset threshold; or the relationship between the first distance and the second distance does not satisfy a preset second condition. Of course, the first preset threshold, the second preset threshold, and the second condition may be set flexibly according to the actual scene. For example, if the complete target object cannot be photographed when the distance between the object to be filtered out and the target object is less than 1 meter, the first preset threshold may be set to 1 meter. The second preset threshold and the second condition may be set by similar means, which are not described again here.
In some embodiments, before the first pixel block of the first image is replaced with the second pixel block in the second image, the second pixel block may first be determined in the captured second image. There are many ways to determine the second pixel block. In some embodiments, when determining the second pixel block in the second image, a mapping relationship between the pixels of the first image and the pixels of the second image may first be determined, the mapping region of the first pixel block in the second image may be determined from this mapping relationship and taken as the second pixel block, and the second pixel block may then be used to replace the first pixel block in the first image.
In some implementations, when determining the mapping relationship between the pixels of the first image and the pixels of the second image, third feature points may first be extracted from the surrounding area of the first pixel block in the first image, the third matching points of the third feature points may be determined in the second image, and the mapping relationship may then be determined from the third feature points and the third matching points. The mapping relationship between the pixels of the first image and the pixels of the second image can be characterized by a homography matrix. For example, assuming that the pixel coordinates of a pixel in the first image are P1, the pixel coordinates of the corresponding pixel in the second image are P2, and the homography matrix is H, the pixels of the first image and the pixels of the second image satisfy formula (2):
P2 = H · P1    Formula (2)
Since the homography matrix H has 8 unknowns, at least 4 pairs of feature points and matching points are needed to solve it. Therefore, at least 4 third feature points can be extracted from the surrounding area of the first pixel block in the first image (for example, a ring of pixels surrounding the first pixel block), the third matching points of these points in the second image can be determined, and H can be solved from the third feature points and the third matching points.
Of course, in some embodiments, to make the solved homography matrix H more accurate, the RANSAC (Random Sample Consensus) algorithm may be used to remove poorly matched pairs among the third feature points and third matching points, and H may be solved from the remaining, more accurately matched third feature points and third matching points, so as to obtain a more accurate H and ensure the validity of the result.
After H is determined, the mapping region of the first pixel block of the first image in the second image can be determined from H, and this mapping region can be taken as the second pixel block and used to replace the first pixel block in the first image, so as to complete the part of the target object occluded by the object to be filtered out in the first image.
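A hedged OpenCV sketch of this replacement step is shown below. For convenience it estimates the homography in the direction second image → first image (the inverse of the mapping in formula (2)) and warps the second image into the first image's frame before copying the pixels of the first pixel block; the function name, the RANSAC reprojection threshold, and the mask representation are assumptions.

```python
import cv2
import numpy as np

def replace_occluded_block(first_img, second_img, pts_first, pts_second, block_mask):
    """Estimate H from >= 4 point correspondences around the occluder
    (RANSAC rejects bad matches), warp the second image into the first
    image's frame, and copy the pixels of the first pixel block from the
    warped image.

    pts_first / pts_second: Nx2 float32 arrays of the third feature points
    and their matches; block_mask: uint8 mask of the first pixel block.
    """
    H, inliers = cv2.findHomography(pts_second, pts_first, cv2.RANSAC, 3.0)
    h, w = first_img.shape[:2]
    warped = cv2.warpPerspective(second_img, H, (w, h))
    result = first_img.copy()
    result[block_mask > 0] = warped[block_mask > 0]   # fill the occluded block
    return result
```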
In some embodiments, when determining the second pixel block in the second image, an annular pixel block surrounding the first pixel block may also first be determined in the first image, a matching annular block that matches this annular pixel block may then be determined in the second image, and the pixel block enclosed by the matching annular block in the second image may be taken as the second pixel block.
Since the image from which the object to be filtered out is removed is usually an image the user intends to use afterwards, the first image may be chosen to be an image with a relatively good shooting effect, for example one in which the photographed target is relatively clear and complete. Therefore, in some embodiments, the camera may capture multiple images at the first pose and the first image may be selected from these images; the first image may be selected by the user, or may be selected automatically by the device executing the image processing method according to information such as the sharpness, brightness, and composition of the images.
Of course, since a dynamic object moves by itself, it is better suited to being removed using multiple images captured by the camera at the same pose, without moving the camera. A static object, which does not move by itself, can be removed by moving the camera and capturing images at different poses. Therefore, in some implementations, after the user determines the first image and before the first pixel block corresponding to the object to be filtered out is determined in the first image, category information of the object to be filtered out may first be determined, the category information identifying the object to be filtered out as a dynamic object or a static object, and the object to be filtered out may then be removed using the processing method corresponding to its category. Whether the object to be filtered out is a dynamic object or a static object may be determined from multiple images captured by the camera at the first pose.
In some embodiments, when the category of the object to be filtered out is determined from the multiple images captured by the camera at the first pose, the category information of the object to be filtered out may be determined from the optical flow vectors of the pixels of the object to be filtered out in the first image relative to the other images among the multiple images. For example, the optical flow vectors of the pixels of the object to be filtered out relative to the other images may be computed, and if the magnitudes of the optical flow vectors are greater than a preset threshold, the object to be filtered out is considered a dynamic object.
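A minimal sketch of this dynamic/static classification, assuming grayscale images captured at the same pose and using dense Farneback optical flow as a stand-in for whatever flow estimator is actually employed; the flow-magnitude threshold and the pixel-ratio criterion are illustrative values, not from the source.

```python
import cv2
import numpy as np

def is_dynamic(images_gray, block_mask, flow_thresh=2.0, ratio=0.5):
    """Classify the object covered by `block_mask` in images_gray[0] as
    dynamic if, against any other image taken from the same pose, a large
    share of its pixels has an optical-flow magnitude above flow_thresh
    (pixels)."""
    ref = images_gray[0]
    for other in images_gray[1:]:
        flow = cv2.calcOpticalFlowFarneback(ref, other, None,
                                            0.5, 3, 15, 3, 5, 1.2, 0)
        mag = np.linalg.norm(flow, axis=2)           # per-pixel flow magnitude
        if np.mean(mag[block_mask > 0] > flow_thresh) > ratio:
            return True
    return False
```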
In some embodiments, if the object to be filtered out is a static object, the steps of determining the first pixel block corresponding to the object to be filtered out in the first image, acquiring the second image that is captured by the camera at the second pose and includes the second pixel block, and replacing the first pixel block in the first image with the second pixel block to generate the replaced first image are executed.
In some embodiments, if the object to be filtered out is determined to be a dynamic object, the third pixel block corresponding to the object to be filtered out in the first image may be determined; the fourth pixel block located at the pixel position corresponding to the pixel position of the third pixel block may then be determined in the other images captured by the camera at the first pose; a third image in which the fourth pixel block is a static region (i.e., an image in which the region corresponding to the third pixel block is not occluded) is determined from these other images; and the third pixel block in the first image is replaced with the fourth pixel block in the third image. As shown in FIG. 11(a), the third pixel block 110 corresponding to the object to be filtered out can be determined in the first image, the fourth pixel block (111 in the figure) at the pixel position corresponding to the pixel position of the third pixel block 110 can be determined in the other images (e.g., image 1, image 2, and image 3 in the figure), and it can then be judged whether the fourth pixel block 111 is a static region; if so, that image is taken as the third image. For example, if the fourth pixel block 111 in image 1 is a static region, image 1 is taken as the third image, and the fourth pixel block 111 in image 1 is used to replace the third pixel block 110 in the first image.
In some embodiments, when determining, among the images other than the first image, the third image in which the fourth pixel block is a static region, the dynamic regions in the other images may first be determined. Then, for each object to be filtered out in the first image, the pixel block at the pixel position corresponding to the pixel position of the third pixel block is determined in the other images in order of their capture time from nearest to farthest relative to the first image, until an image is found in which the pixel block at the corresponding pixel position does not overlap any dynamic region (i.e., is not occluded); that image is taken as the third image. For example, as shown in FIG. 11(b), assume that the first image is the K-th frame captured by the camera and the other images are the (K+1)-th, (K+2)-th, (K+3)-th frames, and so on. The dynamic regions in the (K+1)-th, (K+2)-th, and (K+3)-th frames may first be determined. When determining the dynamic region of each frame, the optical flow vectors between the pixels of the current frame and one or more neighboring frames can be computed; if the magnitude of a pixel's optical flow vector is greater than a certain threshold, the pixel is considered to be moving. The pixels judged to be moving are then clustered to obtain multiple sets of pixels, and a region whose set contains more pixels than a certain value is considered a moving region (sets with too few pixels may be noise and can be ignored). Assume the rectangular and circular regions in the figure are dynamic regions and the remaining regions are static. The third pixel block 121 corresponding to the object to be filtered out is determined in the first image, and the pixel blocks at the pixel positions corresponding to the pixel position of the third pixel block 121 in the (K+1)-th, (K+2)-th, and (K+3)-th frames are the regions 122 enclosed by dashed boxes. When determining the third image, it can first be judged whether the pixel block 122 at the corresponding pixel position in the frame adjacent to the first image (for example, the (K+1)-th frame) overlaps the dynamic region of that frame; if it does, it is then judged whether the pixel block 122 at the corresponding pixel position in the (K+2)-th frame overlaps the dynamic region. When the (K+2)-th frame is judged to meet the requirement, it is taken as the third image, and the pixel block at the corresponding pixel position in that frame is used to replace the third pixel block of the first image.
When the camera captures images at the first pose, it usually captures multiple frames continuously, forming an image sequence. Since the differences between two or more adjacent frames in this sequence may be small, a dynamic object barely changes position between adjacent frames, so adjacent frames are not well suited to filtering the dynamic object out of the first image, and judging these frames one by one is resource-intensive. Therefore, when acquiring the multiple images captured by the camera, some images that reflect the changes of the dynamic object can be selected from the image sequence captured by the camera, so that these images can be used more efficiently to filter out the dynamic object. Accordingly, in some embodiments, the images among the multiple images other than the first image may be images whose difference from the first image exceeds a preset threshold, or images separated from the first image by a specified number of frames. For example, with the first image as the reference, if the angle or displacement of a specified object in a frame of the sequence differs from the angle or displacement of that object in the first image by more than a preset threshold, that frame is acquired as one of the multiple images; alternatively, images separated from the first image by a specified number of frames may be acquired, for example, if the first image is the 5th frame of the sequence, the other images are the 10th, 15th, 20th, 25th frames, and so on.
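A trivial sketch of the fixed-interval selection described above; a difference-based criterion could replace the fixed step. Indices are zero-based, so frame 5 of the sequence has index 4; the step and count are illustrative.

```python
def select_keyframes(num_frames: int, ref_idx: int, step: int = 5, max_count: int = 5):
    """Pick frames spaced `step` frames apart after the reference frame,
    e.g. ref_idx=4 (frame 5) -> indices 9, 14, 19, ... (frames 10, 15, 20, ...)."""
    ids = range(ref_idx + step, num_frames, step)
    return list(ids)[:max_count]
```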
In some embodiments, when static objects in an image are filtered out using images captured at different poses, dynamic objects in the image interfere to some extent with the removal of the static objects, so that the static objects cannot be removed well. Therefore, in some embodiments, before the static objects are filtered out with the above method, the dynamic objects in the image may be filtered out first.
In addition, the present application further provides an image processing method that can be used to automatically remove dynamic objects from an image. As shown in FIG. 12, the method includes the following steps:
S1202: determining a first pixel block corresponding to a dynamic object to be filtered out in a first image;
S1204: determining pixel blocks located, in multiple second images, at the pixel positions corresponding to the pixel position of the first pixel block, the multiple second images and the first image being captured by a camera at the same pose;
S1206: determining, from the multiple second images, a third image in which the pixel block at the corresponding pixel position is a static region;
S1208: replacing the first pixel block in the first image with the pixel block at the corresponding pixel position in the third image.
The image processing method of the embodiments of the present application may be executed by the camera that captures the first image and the second images. The camera may be any device with an image capture function; for example, it may be a camera module, a terminal equipped with a camera such as a still camera, a mobile phone, a tablet, a laptop computer, or a smart wearable device, or a movable platform equipped with a camera such as an unmanned aerial vehicle or an unmanned vehicle. Of course, in some embodiments, the image processing method of the present application may also be executed by another device communicatively connected to the camera, for example a cloud server to which the camera sends the first image and the second images after capturing them. If the camera is a movable platform, the image processing method of the present application may also be executed by a control terminal communicatively connected to the movable platform, which is not limited in the present application.
The dynamic object to be filtered out in the embodiments of the present application is an object the user wishes to remove from the image. The dynamic object to be filtered out may be determined by the user or selected automatically, and there may be one or more dynamic objects to be filtered out.
The first image may be selected from an image sequence continuously captured by the camera at a certain fixed pose. The first image may be selected by the user, or may be selected automatically by the device executing the image processing method, for example by automatically choosing from the sequence an image with good sharpness, composition, or shooting angle as the first image.
After the first image is determined, the first pixel block corresponding to the dynamic object to be filtered out in the first image can be determined; multiple second images are then determined from the image sequence, and from the second images a third image is determined in which the pixel block at the pixel position corresponding to the pixel position of the first pixel block is a static region (i.e., an image in which the pixel block at the corresponding pixel position is not occluded); the pixel block at the corresponding pixel position in the third image is then used to replace the first pixel block in the first image, so as to remove the dynamic object.
Since the differences between two or more adjacent frames in the image sequence may be small and the dynamic object may barely change position between neighboring frames, the frames adjacent to the first image may not be usable for filtering out the dynamic object. In order to screen out more quickly the frames that can be used to filter the dynamic object out of the first image, in some embodiments, images that differ considerably from the first image may be selected from the sequence as the second images; for example, a second image may be an image whose difference from the first image exceeds a preset threshold, or an image separated from the first image by a specified number of frames. For example, with the first image as the reference, if the angle or displacement of an object in a frame of the sequence differs from the angle or displacement of that object in the first image by more than a preset threshold, that frame is acquired as a second image; alternatively, images separated from the first image by a specified number of frames may be acquired, for example, if the first image is the 5th frame of the sequence, the second images are the 10th, 15th, 20th, 25th frames, and so on.
In some embodiments, when determining the first pixel block corresponding to the dynamic object in the first image, the following operations may be performed for each second image: computing the optical flow vectors of the pixels of the first image relative to the second image; determining, among the pixels of the first image, the target pixels whose optical flow vector magnitude is greater than a preset threshold; and clustering the target pixels to obtain the first pixel block corresponding to the dynamic object. For example, the optical flow vectors between each pixel of the first image and one or more neighboring frames can be computed; if the magnitude of a pixel's optical flow vector is greater than a certain threshold, the pixel is considered to be moving; the pixels judged to be moving are then clustered to obtain multiple sets of pixels, and a region whose set contains more pixels than a certain value is considered a dynamic object (sets with too few pixels may be noise and can be ignored).
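The following sketch, assuming grayscale frames from the same fixed pose, thresholds the per-pixel flow magnitude and then groups the moving pixels; connected-component labeling is used here as a simple stand-in for the unspecified clustering step, and the threshold and minimum area are illustrative.

```python
import cv2
import numpy as np

def moving_object_masks(ref_gray, other_gray, flow_thresh=2.0, min_area=200):
    """Per-pixel flow magnitude against one other frame, thresholded, then
    grouped into connected components; components smaller than min_area are
    discarded as noise. Returns one boolean mask per moving object."""
    flow = cv2.calcOpticalFlowFarneback(ref_gray, other_gray, None,
                                        0.5, 3, 15, 3, 5, 1.2, 0)
    moving = (np.linalg.norm(flow, axis=2) > flow_thresh).astype(np.uint8)
    n, labels = cv2.connectedComponents(moving)
    masks = []
    for lbl in range(1, n):                      # label 0 is the background
        mask = labels == lbl
        if mask.sum() >= min_area:
            masks.append(mask)
    return masks
```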
In some embodiments, when replacing the first pixel block in the first image with the pixel block at the corresponding pixel position in the third image, the dynamic regions in the multiple second images may be determined, and the third image corresponding to each first pixel block may be determined as follows: in order of the capture time of the second images from nearest to farthest relative to the first image, the pixel block at the pixel position corresponding to the pixel position of the dynamic object in the first image is determined in each second image, until a second image is found in which the pixel block at the corresponding pixel position does not overlap any dynamic region; that second image is taken as the third image.
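A short sketch of this nearest-first search and replacement, assuming the keyframes are already ordered by their temporal distance from the reference frame and that a boolean dynamic-region mask has been computed for each of them; the names and the fallback behaviour are assumptions.

```python
import numpy as np

def fill_from_nearest_static(ref_img, block_mask, keyframes, dynamic_masks):
    """Walk the keyframes from nearest to farthest and use the first one
    whose pixels under `block_mask` do not overlap that keyframe's dynamic
    region; copy those pixels into the reference image.

    keyframes: images captured from the same fixed pose as ref_img;
    dynamic_masks: one boolean mask of dynamic regions per keyframe.
    """
    out = ref_img.copy()
    for img, dyn in zip(keyframes, dynamic_masks):
        if not np.any(dyn & block_mask):          # corresponding region is static
            out[block_mask] = img[block_mask]
            return out
    return None  # no keyframe can fill this block; the object cannot be removed
```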
With the method provided by the embodiments of the present application, a reference image (i.e., the first image) can be determined from multiple frames captured by the camera at the same pose, and the first pixel block corresponding to the dynamic object to be filtered out can be determined from the reference image. By judging whether the pixel block at the pixel position corresponding to the pixel position of the first pixel block in the other images is a static region, an image in which that pixel block is not occluded can be quickly screened out, and the pixel block at the corresponding pixel position in that image can be used to replace the first pixel block in the first image, so that the dynamic object can be removed from the first image quickly and efficiently.
To further explain the image processing method provided by the implementations of the present application, a specific embodiment is described below.
Usually, when a user shoots an image or a video, some objects that are not the intended subject also fall within the shooting field of view, so that the final image contains unwanted objects that occlude the target the user wants to shoot. These objects therefore need to be removed. The following provides a method for removing such objects (i.e., objects to be filtered out) from an image, where an object to be filtered out may be a dynamic object or a static object. Take the scene in which a user captures images with a UAV carrying a camera as an example: as shown in FIG. 13, the user can control, through the control terminal 132, the camera 133 mounted on the UAV 131 to capture images, and the UAV 131 can transmit the images captured by the camera 133 back to the control terminal for display to the user. Filtering objects out of the image can be executed by the control terminal. The removal methods for dynamic objects and static objects are described below. Since dynamic objects interfere with the removal of static objects, the dynamic objects can be filtered out first, and the static objects afterwards.
Filtering out dynamic objects:
1. The camera can be controlled to capture a series of images at a certain fixed pose, and a reference image I0 is then selected from the image sequence either by the user or automatically by the control terminal.
2. Multiple key frames are selected from the image sequence, where a key frame is a frame that differs considerably from the reference image I0; for example, it may be a frame in which the angle or displacement of an object differs from the angle or displacement of that object in the reference image I0 by a certain threshold, or a frame separated from the reference image by a specified number of frames. For example, if the reference image I0 is the K-th frame, the key frames may be the frames separated from the K-th frame by 5, 10, 15, and 20 frames.
3. The optical flow vectors between each pixel of the reference image I0 and the other selected key frames are computed. If the magnitude of a single pixel's optical flow vector is greater than a certain threshold, the pixel is considered to have changed between different moments, i.e., to be moving. The pixels judged to be moving are clustered to obtain multiple sets of pixels; if a set contains more pixels than a certain value, the region corresponding to the pixels in the set is considered a moving object.
4. The optical flow vectors of each of the other key frames with respect to the two key frames before and after it (or more frames) are computed in the same way, so as to determine the dynamic regions and static regions of each key frame.
5. Starting from the key frame nearest to the k0-th frame (i.e., the reference image I0), a key frame that can fill each dynamic object of the k0-th frame is sought, by judging whether the region corresponding to each dynamic object of the k0-th frame is a static region in that key frame. As shown in FIG. 14, the triangular, square, and circular regions in the figure represent dynamic regions, and the remaining regions are static. For example, for the circular dynamic region of the k0-th frame, the corresponding region in the (k−1)-th frame is determined first, shown as the dashed circular region in the figure; this region is static, so the (k−1)-th frame can be used to fill the circular dynamic region of the k0-th frame. Similarly, the remaining triangular dynamic region and square dynamic region need to be filled with the (k−5)-th frame and the (k+7)-th frame, respectively.
6. After the key frame that can fill each dynamic object of the k0-th frame is determined, the corresponding region of that key frame can be used to replace each dynamic object of the k0-th frame, thereby filtering out the dynamic objects.
Filtering out static objects:
7. Determine the pixel region corresponding to the object to be filtered out and the pixel region corresponding to the occluded background.
The static object to be filtered out can be determined in the reference image I0, either recognized automatically by the control terminal or selected by the user with a box on the interactive interface. As shown in FIG. 15, the user can draw a box around the static object to be filtered out in the reference image I0 (see panel (1)). Since the user's box is not very accurate, it can be adjusted automatically; for example, superpixel segmentation can be applied to the reference image I0 to obtain multiple image regions, and for each region the proportion of the region that falls inside the box is computed; if the proportion is greater than a certain value, the region is included in the box, otherwise it is excluded. The box adjusted in this way is shown in panel (2). After automatic adjustment yields a more accurate box, the box can be enlarged slightly, for example by 5%, to ensure that the static object is entirely inside it; some background will then also fall inside the box, and this background is the part that needs to be completed, as shown in panel (3). On this basis the box can be extended further to obtain the background region, the ring-shaped region shown in panel (4).
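A hedged sketch of this box refinement, using SLIC as a stand-in for the unspecified superpixel method; the 0.5 overlap ratio, the number of superpixels, and the dilation amounts are illustrative. The second helper derives the surrounding background band (the ring-shaped region of panel (4)) from the refined object mask.

```python
import numpy as np
from skimage.segmentation import slic
from scipy.ndimage import binary_dilation

def refine_box(image, box_mask, ratio=0.5, expand_px=10):
    """Snap a rough user box to superpixel boundaries: keep each superpixel
    whose overlap with the box exceeds `ratio` of its area, then dilate the
    result by a few pixels so the static object is fully inside."""
    labels = slic(image, n_segments=400, compactness=10, start_label=1)
    refined = np.zeros_like(box_mask, dtype=bool)
    for lbl in np.unique(labels):
        seg = labels == lbl
        if (seg & box_mask).sum() > ratio * seg.sum():
            refined |= seg
    return binary_dilation(refined, iterations=expand_px)

def background_ring(object_mask, ring_px=15):
    """The surrounding background band is the difference between a further
    dilation of the object mask and the object mask itself."""
    return binary_dilation(object_mask, iterations=ring_px) & ~object_mask
```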
8. Determine the depths of the static object and of the background region.
For the pixel region corresponding to the static object and the pixel region corresponding to the background region detected in the previous step, feature points can be extracted, and feature-point tracking and matching can be performed between multiple frames as well as between consecutive frames, to determine the depth of the static object and the depth of the background region. The details are as follows:
(1) Feature point extraction
According to the static object selected with the box, feature points are extracted in the corresponding region of the static object on the reference image. A general-purpose algorithm can be used for feature-point extraction, such as the Harris, SIFT, SURF, or ORB algorithm.
To reduce the amount of computation, a sparse method can be used, extracting feature points of the image first. Corners are generally chosen as feature points, and available corner detection algorithms include FAST (features from accelerated segment test), SUSAN, and the Harris operator. The Harris corner detection algorithm is used as an example below:
Define the matrix A as the structure tensor, as in formula (3):
A = Σ_w [ Ix²     Ix·Iy
          Ix·Iy   Iy²  ]    Formula (3)

where the sum is taken over a small window w around the point.
Here Ix and Iy are the gradients of a point on the image in the x and y directions, respectively. The function Mc can be defined as in formula (4):
Mc = λ1·λ2 − κ·(λ1 + λ2)² = det(A) − κ·trace²(A)    Formula (4)
where det(A) is the determinant of the matrix A, trace(A) is the trace of the matrix A, and κ is a parameter for adjusting the sensitivity. A threshold Mth is set, and when Mc > Mth, the point can be regarded as a feature point.
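A small sketch of formulas (3) and (4): Sobel gradients give Ix and Iy, a box filter averages their products over a window to build the structure-tensor entries, and the Harris response Mc = det(A) − κ·trace²(A) is thresholded to pick feature points. The window sizes, κ, and the relative threshold are illustrative choices.

```python
import cv2
import numpy as np

def harris_response(gray, ksize=3, block=5, kappa=0.04):
    """Compute Mc = det(A) - kappa * trace(A)^2 for every pixel."""
    gray = np.float32(gray)
    Ix = cv2.Sobel(gray, cv2.CV_32F, 1, 0, ksize=ksize)   # x gradient
    Iy = cv2.Sobel(gray, cv2.CV_32F, 0, 1, ksize=ksize)   # y gradient
    Ixx = cv2.boxFilter(Ix * Ix, -1, (block, block))       # window-averaged tensor entries
    Iyy = cv2.boxFilter(Iy * Iy, -1, (block, block))
    Ixy = cv2.boxFilter(Ix * Iy, -1, (block, block))
    det = Ixx * Iyy - Ixy * Ixy
    trace = Ixx + Iyy
    return det - kappa * trace * trace

# gray_img is assumed to be an 8-bit grayscale image loaded elsewhere
# mc = harris_response(gray_img)
# corners = np.argwhere(mc > 0.01 * mc.max())   # Mth chosen relative to the peak response
```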
(2) KLT (Kanade–Lucas–Tomasi feature tracker) feature point tracking and matching algorithm
Feature points can be tracked across multiple frames so that their motion (optical flow) can be computed. Let h be the offset between two consecutive images, with the former image F(x) and the latter image G(x) = F(x + h).
For each feature point, the displacement h of the feature point between the two image frames can be obtained by iterating formula (5):
h_(k+1) = h_k + [ Σ_x w(x) · F′(x + h_k) · (G(x) − F(x + h_k)) ] / [ Σ_x w(x) · F′(x + h_k)² ],  h_0 = 0    Formula (5)
To ensure the reliability of the result, the latter image can first be taken as F(x) and the former as G(x), and for a given feature point the offset h of the latter image relative to the former is computed; then, in the reverse direction, the offset h′ of that feature point in the former image relative to the latter is computed. In theory h = −h′, and only when this condition is satisfied can the tracked point be considered correct. Here h is the optical flow vector h = (Δu, Δv).
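A sketch of this forward–backward consistency check built on OpenCV's pyramidal LK tracker; the error tolerance and the array layout (prev_pts as an N×1×2 float32 array, as cv2.calcOpticalFlowPyrLK expects) are assumptions of this example.

```python
import cv2
import numpy as np

def track_with_back_check(prev_gray, next_gray, prev_pts, max_err=1.0):
    """Track points forward with pyramidal LK, re-track them backward, and
    keep only points whose forward and backward displacements agree
    (h ≈ -h'), i.e. the returned point lands back near its start."""
    p1, st1, _ = cv2.calcOpticalFlowPyrLK(prev_gray, next_gray, prev_pts, None)
    p0_back, st2, _ = cv2.calcOpticalFlowPyrLK(next_gray, prev_gray, p1, None)
    err = np.linalg.norm(prev_pts - p0_back, axis=2).ravel()
    good = (st1.ravel() == 1) & (st2.ravel() == 1) & (err < max_err)
    flow = (p1 - prev_pts).reshape(-1, 2)          # per-point (Δu, Δv)
    return prev_pts[good], p1[good], flow[good]
```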
(3) Updating feature points
During tracking, some feature points can no longer be observed due to viewpoint changes, while new feature points appear; therefore, the feature points can be updated continuously.
(4) Computing the position of the center of the static object
Since there is not necessarily a feature point at the center of the static object, a fitted optical flow vector is used to determine the position of the center of the static object in each image, so that the BA algorithm can be applied to obtain the three-dimensional coordinates of the center of the static object.
The center point of the static object can be estimated from the optical flow vectors of the other feature points in the region corresponding to the static object selected in the image, as in formula (6):
x0 = Σ_i w_i · x_i    Formula (6)
where x_i is the optical flow vector of a feature point inside the box and w_i is its weight, which can be determined from the 2D image positions of the feature point and the center point, as in formula (7):
wi = exp(-di² / σ²)        Formula (7)
where σ is an adjustable parameter tuned empirically, and di denotes the distance from feature point i to the center point:
di = sqrt((ui - u0)² + (vi - v0)²)
where (ui, vi) are the 2D image pixel coordinates of feature point i and (u0, v0) are the 2D image pixel coordinates of the center point of the target frame.
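As an illustration, the sketch below computes this weighted estimate of the center point's optical flow. It assumes the Gaussian-style weight reconstructed above for formula (7) and additionally normalizes the weights so that they sum to 1; both σ and the normalization are example assumptions.

```python
import numpy as np

def estimate_center_flow(flows, pts, center, sigma=30.0):
    """flows: Nx2 optical flow vectors x_i of the feature points inside the box.
    pts: Nx2 pixel coordinates (u_i, v_i); center: (u_0, v_0) of the target frame."""
    d = np.linalg.norm(pts - np.asarray(center, dtype=np.float64), axis=1)  # d_i
    w = np.exp(-(d ** 2) / (sigma ** 2))        # weights from formula (7)
    w = w / np.sum(w)                           # normalize so the weights sum to 1
    return (w[:, None] * flows).sum(axis=0)     # x_0 = sum_i w_i * x_i, formula (6)
```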
Through the above steps (1)-(4), the disparity and the optical flow of the center of the static object can be calculated, and the three-dimensional depth information of the center of the static object can be obtained.
In a similar way, the three-dimensional depth information of the background region can be calculated.
9. Determine the camera pose at which the background region can be observed
Referring to FIG. 6, assume that the camera is at "position 1" when the reference image I0 is captured. The viewing angle from which the camera can observe the entire occluded region can be reached by translating from "position 1" to "position 2". Here d1 and d2 are the depth of the static object and the depth of the background region obtained in the previous step. The maximum width L of the static object can be calculated from the size of the static object in the image and the depth of the static object. The translation distance D from "position 1" to "position 2" can then be solved by formula (1):
D = L·d2 / (d2 - d1)        Formula (1)
As can be seen from formula (1), when the static object is very close to the background region, i.e. when d2 ≈ d1, D approaches infinity. Therefore, when the static object is too close to the background region, the static object cannot be filtered out, and in this case a prompt indicating that filtering is not possible can be issued to the user.
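A small sketch of this step is given below. It takes formula (1) in the reconstructed form D = L·d2/(d2 - d1) (an assumption about the exact expression, consistent with D diverging when d2 ≈ d1) and uses a hypothetical minimum depth gap to decide when to report that filtering is impossible.

```python
def required_translation(L, d1, d2, min_gap_ratio=0.02):
    """L: max width of the static object; d1: its depth; d2: background depth."""
    if d2 - d1 <= min_gap_ratio * d2:
        # Object and background are too close (d2 ~ d1): D would diverge,
        # so the caller should prompt the user that filtering is not possible.
        return None
    return L * d2 / (d2 - d1)
```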
FIG. 6 illustrates the drone flying to the right, so the limit position at which the entire occluded background region can be seen is the one where the left edge of the occluded background region is just observed. The drone can of course also fly to the left, in which case the limit viewing angle is the one at which the right edge of the occluded background region is just seen. After the translation distance D has been determined, the camera orientation can be adjusted at the same time so that the right edge of the pixel region corresponding to the object to be filtered out is centered in the frame. The adjustment of the camera pose is thus completed, achieving the purpose of adjusting the viewing angle.
10. Calculate the homography matrix H
The previous step determined the pose at which the occluded background region can be observed. The drone can be automatically controlled to adjust to this pose, and an image In is captured at this pose.
Feature points can be extracted from the pixel region corresponding to the background region in the reference image I0 and matched in the image In, yielding a queue of matched point pairs. The homography matrix H is determined from the feature points and their matching points:
    | h11  h12  h13 |
H = | h21  h22  h23 |
    | h31  h32   1  |
The matrix H represents the mapping between two matched pixel points in two images captured by the camera at different poses, as in formula (8):
x0 = H·xn        Formula (8)
where x0 is a feature point in the background region of image I0 and xn is its matching point in image In. Representing the mapping between the two points by an H matrix actually requires the corresponding points to lie on the same plane in space. When the camera is far from the scene being photographed, the background region can be treated as a plane. Therefore, when d1 is relatively small (for example, less than 100 m), the user should also be prompted that the filtering effect will be poor or that filtering is not possible.
Alternatively, the three-dimensional points on the background (once depth information has been obtained for the feature points, they can be converted into three-dimensional points) can be fitted to a plane. The tolerance parameter used when fitting the plane (i.e., the maximum degree of unevenness allowed) can be chosen according to the depth of the background (for example, 2% of the background depth). If a plane cannot be fitted, the user is prompted that the filtering effect will be poor or that filtering is not possible. The H matrix has 8 unknowns, so at least 4 pairs of points are required to solve for it.
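The following is a rough sketch of such a planarity check on the background 3D points. The 2%-of-depth tolerance follows the example above, while the SVD-based least-squares plane fit is an assumption chosen for illustration, not a requirement of this application.

```python
import numpy as np

def background_is_planar(points, depth, tol_ratio=0.02):
    """points: Nx3 background 3D points; depth: representative background depth."""
    centroid = points.mean(axis=0)
    # The plane normal is the right singular vector with the smallest singular value.
    _, _, vt = np.linalg.svd(points - centroid)
    normal = vt[-1]
    # Out-of-plane residual (unevenness) of every point.
    residuals = np.abs((points - centroid) @ normal)
    return residuals.max() <= tol_ratio * depth
```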
The RANSAC (Random Sample Consensus) algorithm can be used to effectively reject poorly matched feature points and matching points, further improving the validity of the result and yielding a more accurate H matrix.
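A minimal sketch of estimating H with RANSAC is shown below, using OpenCV's findHomography. The point direction follows formula (8) (points of In are mapped onto I0); the reprojection threshold is a hypothetical value.

```python
import cv2
import numpy as np

def estimate_homography(pts_i0, pts_in, ransac_thresh=3.0):
    """pts_i0, pts_in: Nx2 matched points (background region of I0 and of image In)."""
    if len(pts_i0) < 4:
        return None, None  # 8 unknowns: at least 4 point pairs are required
    # RANSAC rejects poorly matched pairs and returns the inlier mask.
    H, inliers = cv2.findHomography(np.float32(pts_in), np.float32(pts_i0),
                                    cv2.RANSAC, ransac_thresh)
    return H, inliers
```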
11. Fill the occluded background region in image I0 with image In, so as to filter out the static object
As shown in FIG. 16, using the homography matrix H determined in the previous step, the whole image In can be projected according to the H matrix onto the camera pose corresponding to image I0 to obtain an image In'. The pixel region in the image In' corresponding to the object to be filtered out is then used to replace and cover the corresponding region in image I0, thereby removing the static object.
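A minimal sketch of this filling step is given below. It assumes H has been estimated as above and that a binary mask of the first pixel block (the object to be filtered out) in I0 is available; the mask itself is an input assumed for the example.

```python
import cv2
import numpy as np

def remove_static_object(i0, i_n, H, object_mask):
    """i0, i_n: images at pose 1 and pose 2; H maps In pixels onto I0 (x0 = H * xn).
    object_mask: uint8 mask of the object's pixel block in i0 (non-zero inside)."""
    h, w = i0.shape[:2]
    # Project the whole image In onto the camera pose of I0, giving In'.
    i_n_warped = cv2.warpPerspective(i_n, H, (w, h))
    # Replace only the occluded pixel block of I0 with the warped background.
    result = i0.copy()
    result[object_mask > 0] = i_n_warped[object_mask > 0]
    return result
```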
In addition, the present application further provides an image processing apparatus. As shown in FIG. 17, the image processing apparatus includes a processor 171, a memory 172, and a computer program stored in the memory 172 and executable by the processor 171. When the processor executes the computer program, the following steps are implemented:
获取摄像装置在第一位姿采集的第一图像,并在所述第一图像中确定待滤除对象对应的第一像素区块;acquiring the first image collected by the camera in the first pose, and determining the first pixel block corresponding to the object to be filtered out in the first image;
acquiring a second image captured by the camera in a second pose, the second image including a second pixel block corresponding to a target object, the target object being an object that is occluded by the object to be filtered out in the first image;
通过所述第二像素区块对所述第一图像中所述第一像素区块做替换处理,以生成替换处理后的第一图像。The first pixel block in the first image is replaced by the second pixel block to generate a replaced first image.
在某些实施例中,所述处理器用于获取所述摄像装置在第二位姿采集的第二图像时,具体用于:In some embodiments, when the processor is configured to acquire the second image captured by the camera in the second pose, the processor is specifically configured to:
确定所述第二位姿;determining the second pose;
控制所述摄像装置运动以调整至所述第二位姿并采集所述第二图像。The camera is controlled to move to adjust to the second pose and capture the second image.
在某些实施例中,所述处理器用于获取所述摄像装置在第二位姿采集的第二图像时,具体用于:In some embodiments, when the processor is configured to acquire the second image captured by the camera in the second pose, the processor is specifically configured to:
确定所述第二位姿;determining the second pose;
向用户发出指示所述第二位姿的提示信息,以使所述用户根据所述提示信息控制所述摄像装置运动以调整至所述第二位姿并采集所述第二图像。Sending prompt information indicating the second pose to the user, so that the user controls the camera to move according to the prompt information to adjust to the second pose and capture the second image.
在某些实施例中,所述处理器用于确定所述第二位姿时,具体用于:In some embodiments, when the processor is configured to determine the second pose, it is specifically configured to:
获取所述待滤除对象的位置信息以及所述目标对象的位置信息;Obtain the location information of the object to be filtered and the location information of the target object;
根据所述述待滤除对象的位置信息以及所述目标对象的位置信息确定所述第二位姿。The second pose is determined according to the position information of the object to be filtered and the position information of the target object.
在某些实施例中,所述处理器用于根据所述待滤除对象的位置信息以及所述目标对象的位置信息确定所述第二位姿,具体用于:In some embodiments, the processor is configured to determine the second pose according to the position information of the object to be filtered and the position information of the target object, and is specifically configured to:
根据所述待滤除对象的位置信息、所述目标对象的位置信息以及所述待滤除对象的尺寸确定所述第二位姿。The second pose is determined according to the position information of the object to be filtered, the position information of the target object, and the size of the object to be filtered.
在某些实施例中,所述第一位姿包括第一位置和第一朝向,所述第二位姿包括第二位置和第二朝向;In some embodiments, the first pose includes a first position and a first orientation, and the second pose includes a second position and a second orientation;
所述第二位置位于经过所述第一位置且所述待滤除对象所在平面平行的直线上,第二朝向指向所述待滤除对象所在的位置。The second position is located on a straight line passing through the first position and parallel to the plane where the object to be filtered is located, and the second orientation points to the position where the object to be filtered is located.
在某些实施例中,所述第二位置通过以下方式确定:In certain embodiments, the second location is determined by:
根据所述待滤除对象的位置信息、所述目标对象的位置信息以及所述 待滤除对象的尺寸确定移动距离;Determine the moving distance according to the position information of the object to be filtered, the position information of the target object and the size of the object to be filtered;
根据所述第一位置和所述移动距离确定所述第二位置。The second position is determined according to the first position and the moving distance.
在某些实施例中,所述第二朝向通过以下方式确定:In some embodiments, the second orientation is determined by:
根据所述第一位置以及所述待滤除对象在所述摄像装置采集的图像画面中的位置确定所述第二朝向;或Determine the second orientation according to the first position and the position of the object to be filtered out in the image frame captured by the camera; or
根据所述第一位置、所述待滤除对象左右端点的位置以及所述目标对象左右端点的位置确定所述第二朝向。The second orientation is determined according to the first position, the positions of the left and right endpoints of the object to be filtered, and the positions of the left and right endpoints of the target object.
In some embodiments, the second pose includes a second position and a second orientation, and when the processor is configured to send prompt information indicating the second pose to the user, the processor is specifically configured to:
向用户展示标识有所述第二位置的图像,以及展示调整至所述第二朝向对应的旋转角度信息。The image marked with the second position is displayed to the user, and the rotation angle information corresponding to the second orientation is displayed.
在某些实施例中,所述处理器用于获取所述摄像装置在第二位姿采集的第二图像时,具体用于:In some embodiments, when the processor is configured to acquire the second image captured by the camera in the second pose, the processor is specifically configured to:
控制所述摄像装置运动以改变所述摄像装置的位姿并采集得到多帧图像,针对每一帧图像,判断所述图像中是否包括所述第二像素区块;Controlling the motion of the camera to change the pose of the camera and collecting multiple frames of images, and for each frame of image, determine whether the second pixel block is included in the image;
将包括所述第二像素区块的图像作为所述第二图像。An image including the second pixel block is used as the second image.
在某些实施例中,所述处理器用于判断所述图像中是否包括所述第二像素区块时,具体用于:In some embodiments, when the processor is configured to determine whether the image includes the second pixel block, it is specifically configured to:
在所述第一像素区块中确定第一特征点,以及所述第一像素区块的周边区域确定第二特征点;determining a first feature point in the first pixel block, and determining a second feature point in a peripheral area of the first pixel block;
针对每一帧所述图像,确定所述第一特征点在所述图像中的第一匹配点,以及所述第二特征点在所述图像中的第二匹配点;For each frame of the image, determine a first matching point of the first feature point in the image, and a second matching point of the second feature point in the image;
根据所述第一匹配点和所述第二匹配点的在图像中的位置关系确定所述图像是否包括所述第二像素区块。Whether the image includes the second pixel block is determined according to the positional relationship between the first matching point and the second matching point in the image.
在某些实施例中,所述第一特征点位于所述第一像素区块第一侧之内,所述第二特征点位于所述第一像素区块第二侧之外,其中,所述第一侧为所述第二侧的对侧;In some embodiments, the first feature point is located within a first side of the first pixel block, and the second feature point is located outside a second side of the first pixel block, wherein the the first side is the opposite side of the second side;
所述处理器用于根据所述第一匹配点和所述第二匹配点的位置关系确定所述图像是否包括所述第二像素区块时,具体用于:When the processor is used to determine whether the image includes the second pixel block according to the positional relationship between the first matching point and the second matching point, it is specifically used for:
当判定所述第二匹配点位于所述第一匹配点的第一侧时,判定所述图像包括所述第二像素区块。When it is determined that the second matching point is located on the first side of the first matching point, it is determined that the image includes the second pixel block.
在某些实施例中,多个所述第二特征点位于环绕所述第一像素区域的环状像素区块中,所述处理器用于根据所述第一匹配点和所述第二匹配点的位置关系确定所述图像是否包括所述第二像素区块时,具体用于:In some embodiments, a plurality of the second feature points are located in a ring-shaped pixel block surrounding the first pixel area, and the processor is configured to determine the first matching point and the second matching point according to the first matching point and the second matching point. When determining whether the image includes the second pixel block, it is specifically used for:
当多个所述第二匹配点中预设数量的所述第二匹配点位于所述第一匹配点的一侧时,则判定所述第二图像中包括所述第二像素区块。When a preset number of the second matching points among the plurality of second matching points are located on one side of the first matching point, it is determined that the second pixel block is included in the second image.
在某些实施例中,所述摄像装置搭载于可移动平台,所述处理器用于控制所述摄像装置运动时,具体用于:In some embodiments, the camera device is mounted on a movable platform, and the processor is used to control the movement of the camera device, specifically:
控制所述可移动平台运动以控制所述摄像装置运动。Movement of the movable platform is controlled to control movement of the camera.
在某些实施例中,所述摄像装置通过云台搭载于可移动平台,所述处理器用于控制所述摄像装置运动时,具体用于:In some embodiments, the camera device is mounted on a movable platform through a pan/tilt, and the processor is used to control the movement of the camera device, specifically:
控制所述可移动平台运动,和/或,控制所述云台使所述摄像装置与所述可移动平台之间产生相对运动,以控制所述摄像装置运动。The movable platform is controlled to move, and/or the pan/tilt is controlled to generate relative motion between the camera and the movable platform, so as to control the camera to move.
在某些实施例中,所述摄像装置通过云台搭载于可移动平台,所述第二位姿包括第二位置和第二朝向,所述处理器用于控制所述摄像装置运动以使所述摄像装置调整至所述第二位姿时,具体用于:In some embodiments, the camera device is mounted on the movable platform through a pan/tilt head, the second pose includes a second position and a second orientation, and the processor is configured to control the camera device to move so as to make the camera device move. When the camera device is adjusted to the second pose, it is specifically used for:
控制所述可移动平台运动,以使所述摄像装置位于所述第二位置;以及控制所述云台转动,以使所述摄像装置的朝向调整至所述第二朝向。The movable platform is controlled to move so that the camera is located at the second position; and the pan-tilt is controlled to rotate so that the orientation of the camera is adjusted to the second orientation.
在某些实施例中,所述可移动平台包括无人机、无人车、无人船中的任一种。In some embodiments, the movable platform includes any one of an unmanned aerial vehicle, an unmanned vehicle, and an unmanned boat.
在某些实施例中,所述处理器用于在所述第一图像中确定待滤除对象对应的第一像素区块时,具体用于:In some embodiments, when the processor is configured to determine the first pixel block corresponding to the object to be filtered out in the first image, it is specifically configured to:
响应于用户的指令,从所述第一图像中确定待滤除对象对应的第一像素区块。In response to the user's instruction, the first pixel block corresponding to the object to be filtered out is determined from the first image.
在某些实施例中,所述指令包括用户通过人机交互界面输入的选框,所述选框用于框选所述静态目标对象。In some embodiments, the instruction includes a check box input by the user through a human-computer interaction interface, and the check box is used to frame the static target object.
在某些实施例中,所述第一像素区块为所述选框框选的像素区块,所述装置还用于:In some embodiments, the first pixel block is a pixel block selected by the marquee, and the device is further configured to:
对所述第一图像进行超像素分割处理,得到多个图像区域,基于所述多个图像区域调整所述选框框选的像素区块。Perform superpixel segmentation processing on the first image to obtain multiple image areas, and adjust the pixel blocks selected by the frame selection based on the multiple image areas.
在某些实施例中,所述处理器基于所述多个图像区域调整所述选框框选的像素区块时,具体用于:In some embodiments, when the processor adjusts the pixel blocks selected by the frame based on the plurality of image regions, the processor is specifically configured to:
根据所述多个图像区域中各图像区域落入所述选框的部分与各图像区域的占比调整所述选框框选的像素区块。The pixel block selected by the frame is adjusted according to the ratio of the portion of each image area that falls within the frame and each image area in the plurality of image areas.
在某些实施例中,所述装置还用于:In certain embodiments, the apparatus is also used to:
当预设第一条件触发后,向用户发出无法滤除所述待滤除对象的提示信息。When the preset first condition is triggered, a prompt message that the object to be filtered out cannot be filtered out is sent to the user.
在某些实施例中,所述预设第一条件包括以下一种或多种:In some embodiments, the preset first condition includes one or more of the following:
所述待滤除对象与所述目标对象的第一距离小于第一预设阈值;The first distance between the object to be filtered and the target object is less than a first preset threshold;
或所述目标对象的与所述摄像装置的第二距离小于第二预设阈值;or the second distance between the target object and the camera device is less than a second preset threshold;
或,所述第一距离与所述第二距离之间的距离大小关系不满足预设第二条件。Or, the distance magnitude relationship between the first distance and the second distance does not satisfy a preset second condition.
在某些实施例中,所述装置还用于:In certain embodiments, the apparatus is also used to:
在所述第二图像中确定所述第二像素区块。The second pixel block is determined in the second image.
在某些实施例中,所述处理器用于在所述第二图像中确定所述第二像素区块时,具体用于:In some embodiments, when the processor is configured to determine the second pixel block in the second image, it is specifically configured to:
确定所述第一图像的像素点与所述第二图像的像素点之间的映射关系;determining the mapping relationship between the pixels of the first image and the pixels of the second image;
根据所述映射关系确定所述第一像素区块在所述第二图像的映射区域,作为所述第二像素区块。The mapping area of the first pixel block in the second image is determined according to the mapping relationship as the second pixel block.
在某些实施例中,所述处理器用于确定所述第一图像的像素点与所述 第二图像的像素点之间的映射关系时,具体用于:In some embodiments, when the processor is configured to determine the mapping relationship between the pixels of the first image and the pixels of the second image, it is specifically configured to:
在所述第一像素区块的周边区域提取第三特征点,并在所述第二图像中确定所述第三特征点的第三匹配点;Extract the third feature point in the peripheral area of the first pixel block, and determine the third matching point of the third feature point in the second image;
基于所述第三特征点和所述第三匹配点确定所述映射关系。The mapping relationship is determined based on the third feature point and the third matching point.
在某些实施例中,所述处理器用于在所述第二图像中确定所述第二像素区块时,具体用于:In some embodiments, when the processor is configured to determine the second pixel block in the second image, it is specifically configured to:
在所述第一图像中确定一环绕所述第一像素区块的环状像素区块;determining a ring-shaped pixel block surrounding the first pixel block in the first image;
在所述第二图像中确定与所述环状像素区块相匹配的匹配环状区块;determining a matching ring-shaped block that matches the ring-shaped pixel block in the second image;
将述第二图像中所述匹配环状区块包围的像素区块作为所述第二像素区块。A pixel block surrounded by the matching ring-shaped block in the second image is used as the second pixel block.
在某些实施例中,所述处理器用于获取摄像装置在第一位姿采集的第一图像时,具体用于:In some embodiments, when the processor is configured to acquire the first image captured by the camera in the first pose, the processor is specifically configured to:
获取摄像装置在第一位姿采集的多张图像;acquiring multiple images collected by the camera device in the first pose;
在所述多张图像中确定所述第一图像。The first image is determined among the plurality of images.
在某些实施例中,所述处理器用于在所述第一图像中确定待滤除对象对应的第一像素区块之前,还用于:In some embodiments, before the processor is configured to determine the first pixel block corresponding to the object to be filtered out in the first image, the processor is further configured to:
通过所述多张图像确定所述待滤除对象的类别信息,所述类别信息用于标识所述待滤除对象为动态对象或静态对象;Determine the category information of the object to be filtered out by using the plurality of images, where the category information is used to identify the object to be filtered out as a dynamic object or a static object;
if the object to be filtered out is a static object, performing the steps of determining the first pixel block corresponding to the object to be filtered out in the first image, acquiring the second image captured by the camera in the second pose and including the second pixel block, and replacing the first pixel block in the first image with the second pixel block to generate the replaced first image.
在某些实施例中,所述装置还用于:In certain embodiments, the apparatus is also used to:
若所述待滤除对象为动态对象,则执行以下步骤:If the object to be filtered is a dynamic object, perform the following steps:
确定待滤除对象在所述第一图像中对应的第三像素区块;Determine the third pixel block corresponding to the object to be filtered out in the first image;
确定所述第三像素区域所在像素位置在所述多张图像中除所述第一图像外的其他图像中对应像素位置上的第四像素区块;Determine the fourth pixel block on the corresponding pixel position in the other images except the first image in the plurality of images at the pixel position of the third pixel area;
从所述其他图像中确定所述第四像素区块为静态区域的第三图像;Determine from the other images that the fourth pixel block is the third image of the static area;
利用所述第三图像中的所述第四像素区块替换所述第一图像中的所述第三像素区块。The third pixel block in the first image is replaced with the fourth pixel block in the third image.
在某些实施例中,所述处理器用于通过所述多张图像确定所述待滤除对象的所述类别时,具体用于:In some embodiments, when the processor is configured to determine the category of the object to be filtered out from the plurality of images, the processor is specifically configured to:
根据所述待滤除对象的各像素点相对于所述多张图像中除所述第一图像外的其他图像的光流确定所述待滤除对象的类别信息。The category information of the object to be filtered out is determined according to the optical flow of each pixel of the object to be filtered out relative to other images in the plurality of images except the first image.
在某些实施例中,所述处理器用于从所述其他图像中确定所述第四像素区块为静态区域的第三图像时,具体用于:In some embodiments, when the processor is configured to determine from the other images that the fourth pixel block is the third image of the static area, the processor is specifically configured to:
确定所述其他图像中的动态区域;determining dynamic regions in said other images;
针对每个所述待滤除对象,执行以下步骤:For each object to be filtered out, perform the following steps:
按照所述其他图像与所述第一图像采集顺序由近及远的顺序,确定所述第三像素区块所在像素位置在所述其他图像中的对应像素位置的像素区块,直至所述对应像素位置的像素区块与所述动态区域不重叠,则将所述其他图像作为所述第三图像。According to the order of acquisition of the other images and the first image from near to far, determine the pixel block of the corresponding pixel position in the other image where the pixel position of the third pixel block is located, until the corresponding pixel position If the pixel block of the pixel position does not overlap with the dynamic area, the other image is used as the third image.
In some embodiments, the difference between the other images and the first image exceeds a preset threshold; or
所述其他图像为与所述第一图像间隔指定帧的图像。The other images are images spaced apart from the first image by a specified frame.
其中,所述图像处理装置进行图像处理的具体细节参考上述图像处理方法中各实施例的描述,在此不再赘述。The specific details of the image processing performed by the image processing apparatus refer to the descriptions of the embodiments in the above-mentioned image processing method, which will not be repeated here.
Further, the present application also provides another image processing apparatus. The image processing apparatus includes a processor, a memory, and a computer program stored in the memory and executable by the processor. When the processor executes the computer program, the following steps are implemented:
确定第一图像中的动态对象对应的第一像素区块;Determine the first pixel block corresponding to the dynamic object in the first image;
确定所述第一像素区块所在像素位置在多张第二图像中的对应像素位置的像素区块,所述多张第二图像与所述第一图像通过摄像装置在同一位姿采集得到;Determine the pixel block of the pixel position of the first pixel block corresponding to the pixel position in a plurality of second images, and the plurality of second images and the first image are collected in the same pose by a camera;
从所述多张第二图像中确定所述对应像素位置的像素区块为静态区域 的第三图像;From the plurality of second images, determine that the pixel block of the corresponding pixel position is the third image of the static area;
利用所述第三图像中的所述对应像素位置的像素区块替换所述第一图像中的所述第一像素区块。The first pixel block in the first image is replaced with a pixel block at the corresponding pixel position in the third image.
在某些实施例中,所述处理器用于确定所述第一图像中的动态对象对应的第一像素区块时,具体用于:In some embodiments, when the processor is used to determine the first pixel block corresponding to the dynamic object in the first image, it is specifically used to:
针对所述每一帧所述第二图像执行以下操作:Do the following for each frame of the second image:
计算所述第一图像各像素点相对于所述第二图像的光流;calculating the optical flow of each pixel of the first image relative to the second image;
从所述第一图像各像素点确定所述光流的模长大于预设阈值的目标像素点;Determine, from each pixel of the first image, a target pixel whose modulo length of the optical flow is greater than a preset threshold;
对所述目标像素点进行聚类处理,得到所述动态对象对应的第一像素区块。The target pixel points are clustered to obtain the first pixel block corresponding to the dynamic object.
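As an illustration of these operations, a minimal sketch is given below. Dense Farneback optical flow and connected-component clustering are assumptions chosen for the example, and the magnitude threshold and minimum area are hypothetical values; they are not prescribed by this application.

```python
import cv2
import numpy as np

def dynamic_pixel_blocks(first_gray, second_gray, mag_thresh=2.0, min_area=50):
    # Dense optical flow of every pixel of the first image relative to the second image.
    flow = cv2.calcOpticalFlowFarneback(first_gray, second_gray, None,
                                        0.5, 3, 15, 3, 5, 1.2, 0)
    magnitude = np.linalg.norm(flow, axis=2)

    # Target pixels: those whose flow modulus exceeds the preset threshold.
    moving = (magnitude > mag_thresh).astype(np.uint8)

    # Simple clustering of the target pixels into connected pixel blocks.
    num, labels, stats, _ = cv2.connectedComponentsWithStats(moving, connectivity=8)
    blocks = [labels == i for i in range(1, num) if stats[i, cv2.CC_STAT_AREA] >= min_area]
    return blocks  # list of boolean masks, one per dynamic pixel block
```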
在某些实施例中,所述处理器用于从所述多张第二图像中确定所述对应像素位置的像素区块为静态区域的第三图像;具体用于:In some embodiments, the processor is configured to determine from the plurality of second images that the pixel block corresponding to the pixel position is the third image of the static area; specifically:
确定所述多张第二图像中的动态区域;determining dynamic regions in the plurality of second images;
针对每个所述第一像素区块,执行以下步骤:For each of the first pixel blocks, the following steps are performed:
按照所述第二图像与所述第一图像采集顺序由近及远的顺序,确定所述第一像素区块所在的像素位置在所述第二图像中的对应像素位置的像素区块,直至所述对应像素位置的像素区块与所述动态区域不重叠,则将所述第二图像作为所述第三图像。According to the second image and the first image acquisition sequence from near to far, determine the pixel block where the pixel position of the first pixel block is located in the corresponding pixel position in the second image, until If the pixel block corresponding to the pixel position does not overlap with the dynamic area, the second image is used as the third image.
在某些实施例中,所述第二图像与所述第一图像的差异超过预设阈值;或In some embodiments, the difference between the second image and the first image exceeds a preset threshold; or
所述第二图像为与所述第一图像间隔指定帧的图像。The second image is an image spaced apart from the first image by a specified frame.
其中,所述图像处理装置进行图像处理的具体细节参考上述图像处理方法中各实施例的描述,在此不再赘述。The specific details of the image processing performed by the image processing apparatus refer to the descriptions of the embodiments in the above-mentioned image processing method, which will not be repeated here.
In addition, the present application further provides a movable platform. The movable platform may be any device such as an unmanned aerial vehicle, an unmanned vehicle, an unmanned ship, an intelligent robot, or a handheld gimbal. The movable platform includes a camera device and an image processing device, and the image processing device can implement any one of the image processing methods in the embodiments of the present application. For specific implementation details, reference is made to the descriptions of the embodiments of the above image processing method, which will not be repeated here.
相应地,本说明书实施例还提供一种计算机存储介质,所述存储介质中存储有程序,所述程序被处理器执行时实现上述任一实施例中图像处理方法。Correspondingly, an embodiment of the present specification further provides a computer storage medium, where a program is stored in the storage medium, and when the program is executed by a processor, the image processing method in any of the foregoing embodiments is implemented.
本说明书实施例可采用在一个或多个其中包含有程序代码的存储介质(包括但不限于磁盘存储器、CD-ROM、光学存储器等)上实施的计算机程序产品的形式。计算机可用存储介质包括永久性和非永久性、可移动和非可移动媒体,可以由任何方法或技术来实现信息存储。信息可以是计算机可读指令、数据结构、程序的模块或其他数据。计算机的存储介质的例子包括但不限于:相变内存(PRAM)、静态随机存取存储器(SRAM)、动态随机存取存储器(DRAM)、其他类型的随机存取存储器(RAM)、只读存储器(ROM)、电可擦除可编程只读存储器(EEPROM)、快闪记忆体或其他内存技术、只读光盘只读存储器(CD-ROM)、数字多功能光盘(DVD)或其他光学存储、磁盒式磁带,磁带磁磁盘存储或其他磁性存储设备或任何其他非传输介质,可用于存储可以被计算设备访问的信息。Embodiments of the present specification may take the form of a computer program product embodied on one or more storage media having program code embodied therein, including but not limited to disk storage, CD-ROM, optical storage, and the like. Computer-usable storage media includes permanent and non-permanent, removable and non-removable media, and storage of information can be accomplished by any method or technology. Information may be computer readable instructions, data structures, modules of programs, or other data. Examples of computer storage media include, but are not limited to, phase-change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read only memory (ROM), Electrically Erasable Programmable Read Only Memory (EEPROM), Flash Memory or other memory technology, Compact Disc Read Only Memory (CD-ROM), Digital Versatile Disc (DVD) or other optical storage, Magnetic tape cassettes, magnetic tape magnetic disk storage or other magnetic storage devices or any other non-transmission medium that can be used to store information that can be accessed by a computing device.
对于装置实施例而言,由于其基本对应于方法实施例,所以相关之处参见方法实施例的部分说明即可。以上所描述的装置实施例仅仅是示意性的,其中所述作为分离部件说明的单元可以是或者也可以不是物理上分开的,作为单元显示的部件可以是或者也可以不是物理单元,即可以位于一个地方,或者也可以分布到多个网络单元上。可以根据实际的需要选择其中的部分或者全部模块来实现本实施例方案的目的。本领域普通技术人员在不付出创造性劳动的情况下,即可以理解并实施。For the apparatus embodiments, since they basically correspond to the method embodiments, reference may be made to the partial descriptions of the method embodiments for related parts. The device embodiments described above are only illustrative, wherein the units described as separate components may or may not be physically separated, and the components displayed as units may or may not be physical units, that is, they may be located in One place, or it can be distributed over multiple network elements. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution in this embodiment. Those of ordinary skill in the art can understand and implement it without creative effort.
需要说明的是,在本文中,诸如第一和第二等之类的关系术语仅仅用来将一个实体或者操作与另一个实体或操作区分开来,而不一定要求或者暗示这些实体或操作之间存在任何这种实际的关系或者顺序。术语“包括”、“包含”或者其任何其他变体意在涵盖非排他性的包含,从而使得包括一系列要素的过程、方法、物品或者设备不仅包括那些要素,而且还包括没 有明确列出的其他要素,或者是还包括为这种过程、方法、物品或者设备所固有的要素。在没有更多限制的情况下,由语句“包括一个……”限定的要素,并不排除在包括所述要素的过程、方法、物品或者设备中还存在另外的相同要素。It should be noted that, in this document, relational terms such as first and second are only used to distinguish one entity or operation from another entity or operation, and do not necessarily require or imply any relationship between these entities or operations. any such actual relationship or sequence exists. The terms "comprising", "comprising" or any other variation thereof are intended to encompass non-exclusive inclusion such that a process, method, article or device comprising a list of elements includes not only those elements, but also other not expressly listed elements, or also include elements inherent to such a process, method, article or apparatus. Without further limitation, an element qualified by the phrase "comprising a..." does not preclude the presence of additional identical elements in a process, method, article or apparatus that includes the element.
以上对本发明实施例所提供的方法和装置进行了详细介绍,本文中应用了具体个例对本发明的原理及实施方式进行了阐述,以上实施例的说明只是用于帮助理解本发明的方法及其核心思想;同时,对于本领域的一般技术人员,依据本发明的思想,在具体实施方式及应用范围上均会有改变之处,综上所述,本说明书内容不应理解为对本发明的限制。The methods and devices provided by the embodiments of the present invention have been described in detail above. The principles and implementations of the present invention are described in this paper by using specific examples. At the same time, for those of ordinary skill in the art, according to the idea of the present invention, there will be changes in the specific implementation and application scope. To sum up, the content of this description should not be construed as a limitation to the present invention. .

Claims (75)

  1. 一种图像处理方法,其特征在于,所述方法包括:An image processing method, characterized in that the method comprises:
    获取摄像装置在第一位姿采集的第一图像,并在所述第一图像中确定待滤除对象对应的第一像素区块;acquiring the first image collected by the camera in the first pose, and determining the first pixel block corresponding to the object to be filtered out in the first image;
    acquiring a second image captured by the camera in a second pose, the second image including a second pixel block corresponding to a target object, the target object being an object that is occluded by the object to be filtered out in the first image;
    通过所述第二像素区块对所述第一图像中所述第一像素区块做替换处理,以生成替换处理后的第一图像。The first pixel block in the first image is replaced by the second pixel block to generate a replaced first image.
  2. 根据权利要求1所述的方法,其特征在于,获取所述摄像装置在第二位姿采集的第二图像,包括:The method according to claim 1, wherein acquiring the second image collected by the camera at the second pose comprises:
    确定所述第二位姿;determining the second pose;
    控制所述摄像装置运动以调整至所述第二位姿并采集所述第二图像。The camera is controlled to move to adjust to the second pose and capture the second image.
  3. 根据权利要求1所述的方法,其特征在于,获取所述摄像装置在第二位姿采集的第二图像,包括The method according to claim 1, wherein acquiring the second image collected by the camera at the second pose comprises:
    确定所述第二位姿;determining the second pose;
    向用户发出指示所述第二位姿的提示信息,以使所述用户根据所述提示信息控制所述摄像装置运动以调整至所述第二位姿并采集所述第二图像。Sending prompt information indicating the second pose to the user, so that the user controls the camera to move according to the prompt information to adjust to the second pose and capture the second image.
  4. 根据权利要求2或3所述的方法,其特征在于,确定所述第二位姿,包括:The method according to claim 2 or 3, wherein determining the second pose comprises:
    获取所述待滤除对象的位置信息以及所述目标对象的位置信息;Obtain the location information of the object to be filtered and the location information of the target object;
    根据所述述待滤除对象的位置信息以及所述目标对象的位置信息确定所述第二位姿。The second pose is determined according to the position information of the object to be filtered and the position information of the target object.
  5. 根据权利要求4所述的方法,其特征在于,根据所述待滤除对象的位 置信息以及所述目标对象的位置信息确定所述第二位姿,包括:The method according to claim 4, wherein determining the second pose according to the position information of the object to be filtered and the position information of the target object, comprising:
    根据所述待滤除对象的位置信息、所述目标对象的位置信息以及所述待滤除对象的尺寸确定所述第二位姿。The second pose is determined according to the position information of the object to be filtered, the position information of the target object, and the size of the object to be filtered.
  6. 根据权利要求5所述的方法,其特征在于,所述第一位姿包括第一位置和第一朝向,所述第二位姿包括第二位置和第二朝向;The method according to claim 5, wherein the first posture includes a first position and a first orientation, and the second posture includes a second position and a second orientation;
    所述第二位置位于经过所述第一位置且与所述待滤除对象所在平面平行的直线上,第二朝向指向所述待滤除对象所在的位置。The second position is located on a straight line passing through the first position and parallel to the plane where the object to be filtered is located, and the second orientation points to the position where the object to be filtered is located.
  7. 根据权利要求6所述的方法,其特征在于,所述第二位置通过以下方式确定:The method of claim 6, wherein the second position is determined by:
    根据所述待滤除对象的位置信息、所述目标对象的位置信息以及所述待滤除对象的尺寸确定移动距离;Determine the moving distance according to the position information of the object to be filtered, the position information of the target object and the size of the object to be filtered;
    根据所述第一位置和所述移动距离确定所述第二位置。The second position is determined according to the first position and the moving distance.
  8. 根据权利要求6所述的方法,其特征在于,所述第二朝向通过以下方式确定:The method according to claim 6, wherein the second orientation is determined by:
    根据所述第一位置以及所述待滤除对象在所述摄像装置采集的图像画面中的位置确定所述第二朝向;或Determine the second orientation according to the first position and the position of the object to be filtered out in the image frame captured by the camera; or
    根据所述第一位置、所述待滤除对象左右端点的位置以及所述目标对象左右端点的位置确定所述第二朝向。The second orientation is determined according to the first position, the positions of the left and right endpoints of the object to be filtered, and the positions of the left and right endpoints of the target object.
  9. 根据权利要求3所述的方法,其特征在于,所述第二位姿包括第二位置和第二朝向,向用户发出指示所述第二位姿的提示信息,包括:The method according to claim 3, wherein the second pose includes a second position and a second orientation, and sending prompt information indicating the second pose to the user, comprising:
    向用户展示标识有所述第二位置的图像,以及展示调整至所述第二朝向对应的旋转角度信息。The image marked with the second position is displayed to the user, and the rotation angle information corresponding to the second orientation is displayed.
  10. 根据权利要求1所述的方法,其特征在于,获取所述摄像装置在第二位姿采集的第二图像,包括:The method according to claim 1, wherein acquiring the second image collected by the camera at the second pose comprises:
    控制所述摄像装置运动以改变所述摄像装置的位姿并采集得到多帧图像,针对每一帧图像,判断所述图像中是否包括所述第二像素区块;Controlling the motion of the camera to change the pose of the camera and collecting multiple frames of images, and for each frame of image, determine whether the second pixel block is included in the image;
    将包括所述第二像素区块的图像作为所述第二图像。An image including the second pixel block is used as the second image.
  11. 根据权利要求10所述的方法,其特征在于,判断所述图像中是否包括所述第二像素区块,包括:The method of claim 10, wherein determining whether the image includes the second pixel block comprises:
    在所述第一像素区块中确定第一特征点,以及所述第一像素区块的周边区域确定第二特征点;determining a first feature point in the first pixel block, and determining a second feature point in a peripheral area of the first pixel block;
    针对每一帧所述图像,确定所述第一特征点在所述图像中的第一匹配点,以及所述第二特征点在所述图像中的第二匹配点;For each frame of the image, determine a first matching point of the first feature point in the image, and a second matching point of the second feature point in the image;
    根据所述第一匹配点和所述第二匹配点的在图像中的位置关系确定所述图像是否包括所述第二像素区块。Whether the image includes the second pixel block is determined according to the positional relationship between the first matching point and the second matching point in the image.
  12. 根据权利要求11所述的方法,其特征在于,所述第一特征点位于所述第一像素区块第一侧之内,所述第二特征点位于所述第一像素区块第二侧之外,其中,所述第一侧为所述第二侧的对侧;The method of claim 11, wherein the first feature point is located within a first side of the first pixel block, and the second feature point is located on a second side of the first pixel block In addition, wherein, the first side is the opposite side of the second side;
    根据所述第一匹配点和所述第二匹配点的位置关系确定所述图像是否包括所述第二像素区块,包括:Determine whether the image includes the second pixel block according to the positional relationship between the first matching point and the second matching point, including:
    当判定所述第二匹配点位于所述第一匹配点的第一侧时,判定所述图像包括所述第二像素区块。When it is determined that the second matching point is located on the first side of the first matching point, it is determined that the image includes the second pixel block.
  13. 根据权利要求11所述的方法,其特征在于,多个所述第二特征点位于环绕所述第一像素区域的环状像素区块中,根据所述第一匹配点和所述第二匹配点的位置关系确定所述图像是否包括所述第二像素区块,包括:The method according to claim 11, wherein a plurality of the second feature points are located in a ring-shaped pixel block surrounding the first pixel area, according to the first matching point and the second matching The positional relationship of the point determines whether the image includes the second pixel block, including:
    当多个所述第二匹配点中预设数量的所述第二匹配点位于所述第一匹配点的一侧时,则判定所述第二图像中包括所述第二像素区块。When a preset number of the second matching points among the plurality of second matching points are located on one side of the first matching point, it is determined that the second pixel block is included in the second image.
  14. 根据权利要求2、3或10所述的方法,其特征在于,所述摄像装置搭 载于可移动平台,所述控制所述摄像装置运动,包括:The method according to claim 2, 3 or 10, wherein the camera device is mounted on a movable platform, and the control of the camera device to move includes:
    控制所述可移动平台运动以控制所述摄像装置运动。Movement of the movable platform is controlled to control movement of the camera.
  15. 根据权利要2、3或10所述的方法,其特征在于,所述摄像装置通过云台搭载于可移动平台;所述控制所述摄像装置运动,包括:The method according to claim 2, 3 or 10, wherein the camera device is mounted on a movable platform through a pan/tilt; the controlling the movement of the camera device includes:
    控制所述可移动平台运动,和/或,控制所述云台使所述摄像装置与所述可移动平台之间产生相对运动,以控制所述摄像装置运动。The movable platform is controlled to move, and/or the pan/tilt is controlled to generate relative motion between the camera and the movable platform, so as to control the camera to move.
  16. 根据权利要2或3所述的方法,其特征在于,所述摄像装置通过云台搭载于可移动平台,所述第二位姿包括第二位置和第二朝向,控制所述摄像装置运动以使所述摄像装置调整至所述第二位姿,包括:The method according to claim 2 or 3, wherein the camera device is mounted on a movable platform through a pan/tilt, the second pose includes a second position and a second orientation, and the camera device is controlled to move to Adjusting the camera device to the second pose includes:
    控制所述可移动平台运动,以使所述摄像装置位于所述第二位置;以及控制所述云台转动,以使所述摄像装置的朝向调整至所述第二朝向。The movable platform is controlled to move so that the camera is located at the second position; and the pan-tilt is controlled to rotate so that the orientation of the camera is adjusted to the second orientation.
  17. 根据权利要求14-16任一项所述的方法,其特征在于,所述可移动平台包括无人机、无人车、无人船中的任一种。The method according to any one of claims 14-16, wherein the movable platform comprises any one of an unmanned aerial vehicle, an unmanned vehicle, and an unmanned boat.
  18. 根据权利要求1-17任一项所述的方法,其特征在于,在所述第一图像中确定待滤除对象对应的第一像素区块,包括:The method according to any one of claims 1-17, wherein determining the first pixel block corresponding to the object to be filtered out in the first image comprises:
    响应于用户的指令,从所述第一图像中确定待滤除对象对应的第一像素区块。In response to the user's instruction, the first pixel block corresponding to the object to be filtered out is determined from the first image.
  19. 根据权利要求18所述的方法,其特征在于,所述指令包括用户通过人机交互界面输入的选框,所述选框用于框选所述静态目标对象。The method according to claim 18, wherein the instruction comprises a check box input by a user through a human-computer interaction interface, and the check box is used to frame the static target object.
  20. 据权利要求19所述的方法,其特征在于,所述第一像素区块为所述选框框选的像素区块,所述方法还包括:The method of claim 19, wherein the first pixel block is a pixel block selected by the frame, and the method further comprises:
    对所述第一图像进行超像素分割处理,得到多个图像区域,基于所述多个图像区域调整所述选框框选的像素区块。Perform superpixel segmentation processing on the first image to obtain multiple image areas, and adjust the pixel blocks selected by the frame selection based on the multiple image areas.
  21. 据权利要求20所述的方法,其特征在于,基于所述多个图像区域调 整所述选框框选的像素区块,包括:The method according to claim 20, wherein, adjusting the pixel block selected by the frame selection based on the plurality of image regions, comprising:
    根据所述多个图像区域中各图像区域落入所述选框的部分与各图像区域的占比调整所述选框框选的像素区块。The pixel block selected by the frame is adjusted according to the ratio of the portion of each image area that falls within the frame and each image area in the plurality of image areas.
  22. 根据权利要求1-21任一项所述的方法,其特征在于,所述方法还包括:The method according to any one of claims 1-21, wherein the method further comprises:
    当预设第一条件触发后,向用户发出无法滤除所述待滤除对象的提示信息。When the preset first condition is triggered, a prompt message that the object to be filtered out cannot be filtered out is sent to the user.
  23. 根据权利要求22所述的方法,其特征在于,所述预设第一条件包括以下一种或多种:The method according to claim 22, wherein the preset first condition comprises one or more of the following:
    所述待滤除对象与所述目标对象的第一距离小于第一预设阈值;The first distance between the object to be filtered and the target object is less than a first preset threshold;
    或所述目标对象的与所述摄像装置的第二距离小于第二预设阈值;or the second distance between the target object and the camera device is less than a second preset threshold;
    或所述第一距离与所述第二距离之间的距离大小关系不满足预设第二条件。Or the distance magnitude relationship between the first distance and the second distance does not satisfy a preset second condition.
  24. 根据权利要求1-23任一项所述的方法,其特征在于,所述方法还包括:The method according to any one of claims 1-23, wherein the method further comprises:
    在所述第二图像中确定所述第二像素区块。The second pixel block is determined in the second image.
  25. 根据权利要求24所述的方法,其特征在于,在所述第二图像中确定所述第二像素区块,包括:The method of claim 24, wherein determining the second pixel block in the second image comprises:
    确定所述第一图像的像素点与所述第二图像的像素点之间的映射关系;determining the mapping relationship between the pixels of the first image and the pixels of the second image;
    根据所述映射关系确定所述第一像素区块在所述第二图像的映射区域,作为所述第二像素区块。The mapping area of the first pixel block in the second image is determined according to the mapping relationship as the second pixel block.
  26. 根据权利要求25所述的方法,其特征在于,确定所述第一图像的像素点与所述第二图像的像素点之间的映射关系,包括:The method according to claim 25, wherein determining the mapping relationship between the pixels of the first image and the pixels of the second image comprises:
    在所述第一像素区块的周边区域提取第三特征点,并在所述第二图像中 确定所述第三特征点的第三匹配点;Extract the 3rd feature point in the peripheral area of described first pixel block, and determine the 3rd matching point of described 3rd feature point in described second image;
    基于所述第三特征点和所述第三匹配点确定所述映射关系。The mapping relationship is determined based on the third feature point and the third matching point.
  27. 根据权利要求24所述的方法,其特征在于,在所述第二图像中确定所述第二像素区块,包括:The method of claim 24, wherein determining the second pixel block in the second image comprises:
    在所述第一图像中确定一环绕所述第一像素区块的环状像素区块;determining a ring-shaped pixel block surrounding the first pixel block in the first image;
    在所述第二图像中确定与所述环状像素区块相匹配的匹配环状区块;determining a matching ring-shaped block that matches the ring-shaped pixel block in the second image;
    将述第二图像中所述匹配环状区块包围的像素区块作为所述第二像素区块。A pixel block surrounded by the matching ring-shaped block in the second image is used as the second pixel block.
  28. 根据权利要求1-27任一项所述的方法,其特征在于,获取摄像装置在第一位姿采集的第一图像包括:The method according to any one of claims 1-27, wherein acquiring the first image captured by the camera in the first pose comprises:
    获取摄像装置在第一位姿采集的多张图像;acquiring multiple images collected by the camera device in the first pose;
    在所述多张图像中确定所述第一图像。The first image is determined among the plurality of images.
  29. 根据权利要求28所述的方法,其特征在于,在所述第一图像中确定待滤除对象对应的第一像素区块之前,还包括:The method according to claim 28, wherein before determining the first pixel block corresponding to the object to be filtered out in the first image, the method further comprises:
    通过所述多张图像确定所述待滤除对象的类别信息,所述类别信息用于标识所述待滤除对象为动态对象或静态对象;Determine the category information of the object to be filtered out by using the plurality of images, where the category information is used to identify the object to be filtered out as a dynamic object or a static object;
    if the object to be filtered out is a static object, performing the steps of determining the first pixel block corresponding to the object to be filtered out in the first image, acquiring the second image captured by the camera in the second pose and including the second pixel block, and replacing the first pixel block in the first image with the second pixel block to generate the replaced first image.
  30. 根据权利要求29所述的方法,其特征在于,若所述待滤除对象为动态对象,则执行以下步骤:The method according to claim 29, wherein if the object to be filtered is a dynamic object, the following steps are performed:
    确定待滤除对象在所述第一图像中对应的第三像素区块;Determine the third pixel block corresponding to the object to be filtered out in the first image;
    确定所述第三像素区块所在像素位置在所述多张图像中除所述第一图像 外的其他图像中的对应像素位置上的第四像素区块;Determine the fourth pixel block on the corresponding pixel position in the other images except the first image in the plurality of images at the pixel position of the third pixel block;
    从所述其他图像中确定所述第四像素区块为静态区域的第三图像;Determine from the other images that the fourth pixel block is the third image of the static area;
    利用所述第三图像中的所述第四像素区块替换所述第一图像中的所述第三像素区块。The third pixel block in the first image is replaced with the fourth pixel block in the third image.
  31. 根据权利要求29或30所述的方法,其特征在于,通过所述多张图像确定所述待滤除对象的所述类别,包括:The method according to claim 29 or 30, wherein determining the category of the object to be filtered out from the plurality of images comprises:
    根据所述待滤除对象的各像素点相对于所述其他图像的光流确定所述待滤除对象的类别信息。The category information of the object to be filtered is determined according to the optical flow of each pixel of the object to be filtered relative to the other images.
  32. 根据权利要求30所述的方法,其特征在于,从所述其他图像中确定所述第四像素区块为静态区域的第三图像,包括:The method according to claim 30, wherein determining from the other images that the fourth pixel block is a third image of a static area, comprising:
    确定所述其他图像中动态区域;determining dynamic regions in said other images;
    针对每个所述待滤除对象,执行以下步骤:For each object to be filtered out, perform the following steps:
    按照所述其他图像与所述第一图像采集顺序由近及远的顺序,确定所述第三像素区块所在像素位置在所述其他图像中的对应像素位置的像素区块,直至所述对应像素位置的像素区块与所述动态区域不重叠,则将所述其他图像作为所述第三图像。According to the order of acquisition of the other images and the first image from near to far, determine the pixel block of the corresponding pixel position in the other image where the pixel position of the third pixel block is located, until the corresponding pixel position If the pixel block at the pixel position does not overlap with the dynamic area, the other image is used as the third image.
  33. The method according to any one of claims 29-32, wherein the difference between the other images and the first image exceeds a preset threshold; or
    所述其他图像为与所述第一图像间隔指定帧的图像。The other images are images spaced apart from the first image by a specified frame.
  34. 一种图像处理方法,其特征在于,所述方法包括:An image processing method, characterized in that the method comprises:
    确定第一图像中的动态对象对应的第一像素区块;Determine the first pixel block corresponding to the dynamic object in the first image;
    确定所述第一像素区块所在像素位置在多张第二图像中的对应像素位置的像素区块,所述多张第二图像与所述第一图像通过摄像装置在同一位姿采集得到;Determine the pixel block of the pixel position of the first pixel block corresponding to the pixel position in a plurality of second images, and the plurality of second images and the first image are collected in the same pose by a camera;
    从所述多张第二图像中确定所述对应像素位置的像素区块为静态区域的第三图像;From the plurality of second images, it is determined that the pixel block of the corresponding pixel position is the third image of the static area;
    利用所述第三图像中的所述对应像素位置的像素区块替换所述第一图像中的所述第一像素区块。The first pixel block in the first image is replaced with the pixel block at the corresponding pixel position in the third image.
  35. 根据权利要求34所述的方法,其特征在于,确定所述第一图像中的动态对象对应的第一像素区块,包括:The method according to claim 34, wherein determining the first pixel block corresponding to the dynamic object in the first image comprises:
    针对所述每一帧所述第二图像执行以下操作:Do the following for each frame of the second image:
    计算所述第一图像各像素点相对于所述第二图像的光流;calculating the optical flow of each pixel of the first image relative to the second image;
    从所述第一图像各像素点确定所述光流的模长大于预设阈值的目标像素点;Determine, from each pixel of the first image, a target pixel whose modulo length of the optical flow is greater than a preset threshold;
    对所述目标像素点进行聚类处理,得到所述动态对象对应的第一像素区块。The target pixel points are clustered to obtain the first pixel block corresponding to the dynamic object.
  36. 根据权利要求35所述的方法,其特征在于,从所述多张第二图像中确定所述对应像素位置的像素区块为静态区域的第三图像,包括:The method according to claim 35, wherein determining from the plurality of second images that the pixel block corresponding to the pixel position is the third image of the static area, comprising:
    确定所述多张第二图像中的动态区域;determining dynamic regions in the plurality of second images;
    针对每个所述第一像素区块,执行以下步骤:For each of the first pixel blocks, the following steps are performed:
    按照所述第二图像与所述第一图像采集顺序由近及远的顺序,确定所述第一像素区域所在像素位置在所述第二图像中的对应像素位置的像素区块,直至所述对应位置的像素区块与所述动态区域不重叠,则将所述第二图像作为所述第三图像。According to the sequence of acquiring the second image and the first image from near to far, determine the pixel block of the pixel position where the first pixel area is located in the corresponding pixel position in the second image, until the If the pixel block at the corresponding position does not overlap with the dynamic area, the second image is used as the third image.
  37. 根据权利要求34-36任一项所述的方法,其特征在于,所述第二图像与所述第一图像的差异超过预设阈值;或The method according to any one of claims 34-36, wherein the difference between the second image and the first image exceeds a preset threshold; or
    所述第二图像为与所述第一图像间隔指定帧的图像。The second image is an image spaced apart from the first image by a specified frame.
  38. 一种图像处理装置,其特征在于,所述图像处理装置包括处理器、 存储器、存储于所述存储器所述处理器可执行的计算机程序,所述处理器执行所述计算机程序时,实现以下步骤:An image processing apparatus, characterized in that the image processing apparatus includes a processor, a memory, and a computer program stored in the memory and executable by the processor, and when the processor executes the computer program, the following steps are implemented :
    获取摄像装置在第一位姿采集的第一图像,并在所述第一图像中确定待滤除对象对应的第一像素区块;acquiring the first image collected by the camera in the first pose, and determining the first pixel block corresponding to the object to be filtered out in the first image;
    acquiring a second image captured by the camera in a second pose, the second image including a second pixel block corresponding to a target object, the target object being an object that is occluded by the object to be filtered out in the first image;
    通过所述第二像素区块对所述第一图像中所述第一像素区块做替换处理,以生成替换处理后的第一图像。The first pixel block in the first image is replaced by the second pixel block to generate a replaced first image.
  39. 根据权利要求38所述的装置,其特征在于,所述处理器用于获取所述摄像装置在第二位姿采集的第二图像时,具体用于:The device according to claim 38, wherein when the processor is configured to acquire the second image collected by the camera in the second pose, the processor is specifically configured to:
    确定所述第二位姿;determining the second pose;
    控制所述摄像装置运动以调整至所述第二位姿并采集所述第二图像。The camera is controlled to move to adjust to the second pose and capture the second image.
  40. 根据权利要求38所述的装置,其特征在于,所述处理器用于获取所述摄像装置在第二位姿采集的第二图像时,具体用于:The device according to claim 38, wherein when the processor is configured to acquire the second image collected by the camera in the second pose, the processor is specifically configured to:
    确定所述第二位姿;determining the second pose;
    向用户发出指示所述第二位姿的提示信息,以使所述用户根据所述提示信息控制所述摄像装置运动以调整至所述第二位姿并采集所述第二图像。Sending prompt information indicating the second pose to the user, so that the user controls the camera to move according to the prompt information to adjust to the second pose and capture the second image.
  41. 根据权利要求39或40所述的装置,其特征在于,所述处理器用于确定所述第二位姿时,具体用于:The device according to claim 39 or 40, wherein when the processor is configured to determine the second pose, it is specifically configured to:
    获取所述待滤除对象的位置信息以及所述目标对象的位置信息;Obtain the location information of the object to be filtered and the location information of the target object;
    根据所述述待滤除对象的位置信息以及所述目标对象的位置信息确定所述第二位姿。The second pose is determined according to the position information of the object to be filtered and the position information of the target object.
  42. 根据权利要求41所述的装置,其特征在于,所述处理器用于根据所 述待滤除对象的位置信息以及所述目标对象的位置信息确定所述第二位姿,具体用于:The device according to claim 41, wherein the processor is configured to determine the second pose according to the position information of the object to be filtered and the position information of the target object, and is specifically used for:
    根据所述待滤除对象的位置信息、所述目标对象的位置信息以及所述待滤除对象的尺寸确定所述第二位姿。The second pose is determined according to the position information of the object to be filtered, the position information of the target object, and the size of the object to be filtered.
  43. 根据权利要42所述的装置,其特征在于,所述第一位姿包括第一位置和第一朝向,所述第二位姿包括第二位置和第二朝向;The device of claim 42, wherein the first posture includes a first position and a first orientation, and the second posture includes a second position and a second orientation;
    所述第二位置位于经过所述第一位置且与所述待滤除对象所在平面平行的直线上，所述第二朝向指向所述待滤除对象所在的位置。The second position is located on a straight line that passes through the first position and is parallel to the plane in which the object to be filtered out is located, and the second orientation points toward the position of the object to be filtered out.
  44. 根据权利要求43所述的装置,其特征在于,所述第二位置通过以下方式确定:The apparatus of claim 43, wherein the second position is determined by:
    根据所述待滤除对象的位置信息、所述目标对象的位置信息以及所述待滤除对象的尺寸确定移动距离;Determine the moving distance according to the position information of the object to be filtered, the position information of the target object and the size of the object to be filtered;
    根据所述第一位置和所述移动距离确定所述第二位置。The second position is determined according to the first position and the moving distance.
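For illustration only: one way the moving distance of claim 44 could be estimated is with a similar-triangles model that uses the camera-to-occluder distance, the camera-to-target distance, and the occluder width; the model and all names below are assumptions, not the claimed method itself.

```python
import numpy as np

def lateral_shift_to_reveal(camera_pos, occluder_pos, target_pos, occluder_width):
    """Distance the camera must move parallel to the occluder plane so that
    the line of sight to the target clears the occluder edge."""
    d_occ = np.linalg.norm(np.asarray(occluder_pos, float) - np.asarray(camera_pos, float))
    d_tgt = np.linalg.norm(np.asarray(target_pos, float) - np.asarray(camera_pos, float))
    if d_tgt <= d_occ:
        raise ValueError("the target must lie behind the object to be filtered out")
    # similar triangles: the shift grows as the target sits closer behind the occluder
    return 0.5 * occluder_width * d_tgt / (d_tgt - d_occ)
```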
  45. 根据权利要求43或44所述的装置,其特征在于,所述第二朝向通过以下方式确定:The device according to claim 43 or 44, wherein the second orientation is determined by:
    根据所述第一位置以及所述待滤除对象在所述摄像装置采集的图像画面中的位置确定所述第二朝向;或Determine the second orientation according to the first position and the position of the object to be filtered out in the image frame captured by the camera; or
    根据所述第一位置、所述待滤除对象左右端点的位置以及所述目标对象左右端点的位置确定所述第二朝向。The second orientation is determined according to the first position, the positions of the left and right endpoints of the object to be filtered, and the positions of the left and right endpoints of the target object.
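For illustration only: a simple realisation of the second orientation of claims 43 and 45 is a viewing direction from the second position toward the object to be filtered out; the helper below is hypothetical.

```python
import numpy as np

def second_orientation(second_position, occluder_position):
    """Unit viewing direction from the second camera position toward the
    object to be filtered out; gimbal yaw/pitch could be derived from it."""
    direction = np.asarray(occluder_position, float) - np.asarray(second_position, float)
    return direction / np.linalg.norm(direction)
```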
  46. 根据权利要求40所述的装置，其特征在于，所述第二位姿包括第二位置和第二朝向，所述处理器用于向用户发出指示所述第二位姿的提示信息时，具体用于：The device according to claim 40, wherein the second pose includes a second position and a second orientation, and when the processor is configured to send prompt information indicating the second pose to the user, the processor is specifically configured to:
    向用户展示标识有所述第二位置的图像,以及展示调整至所述第二朝向对应的旋转角度信息。The image marked with the second position is displayed to the user, and the rotation angle information corresponding to the second orientation is displayed.
  47. 根据权利要求38所述的装置,其特征在于,所述处理器用于获取所述摄像装置在第二位姿采集的第二图像时,具体用于:The device according to claim 38, wherein when the processor is configured to acquire the second image collected by the camera in the second pose, the processor is specifically configured to:
    控制所述摄像装置运动以改变所述摄像装置的位姿并采集得到多帧图像,针对每一帧图像,判断所述图像中是否包括所述第二像素区块;Controlling the motion of the camera to change the pose of the camera and collecting multiple frames of images, and for each frame of image, determine whether the second pixel block is included in the image;
    将包括所述第二像素区块的图像作为所述第二图像。An image including the second pixel block is used as the second image.
  48. 根据权利要求47所述的装置,其特征在于,所述处理器用于判断所述图像中是否包括所述第二像素区块时,具体用于:The device according to claim 47, wherein when the processor is configured to determine whether the image includes the second pixel block, the processor is specifically configured to:
    在所述第一像素区块中确定第一特征点，以及在所述第一像素区块的周边区域确定第二特征点；determining a first feature point in the first pixel block, and determining a second feature point in a peripheral area of the first pixel block;
    针对每一帧所述图像,确定所述第一特征点在所述图像中的第一匹配点,以及所述第二特征点在所述图像中的第二匹配点;For each frame of the image, determine a first matching point of the first feature point in the image, and a second matching point of the second feature point in the image;
    根据所述第一匹配点和所述第二匹配点在图像中的位置关系确定所述图像是否包括所述第二像素区块。Whether the image includes the second pixel block is determined according to the positional relationship between the first matching point and the second matching point in the image.
  49. 根据权利要求48所述的装置，其特征在于，所述第一特征点位于所述第一像素区块第一侧之内，所述第二特征点位于所述第一像素区块第二侧之外，其中，所述第一侧为所述第二侧的对侧；The device according to claim 48, wherein the first feature point is located inside a first side of the first pixel block, and the second feature point is located outside a second side of the first pixel block, wherein the first side is the side opposite to the second side;
    所述处理器用于根据所述第一匹配点和所述第二匹配点的位置关系确定所述图像是否包括所述第二像素区块时,具体用于:When the processor is used to determine whether the image includes the second pixel block according to the positional relationship between the first matching point and the second matching point, it is specifically used for:
    当判定所述第二匹配点位于所述第一匹配点的第一侧时,判定所述图像包括所述第二像素区块。When it is determined that the second matching point is located on the first side of the first matching point, it is determined that the image includes the second pixel block.
  50. 根据权利要求48所述的装置，其特征在于，多个所述第二特征点位于环绕所述第一像素区域的环状像素区块中，所述处理器用于根据所述第一匹配点和所述第二匹配点的位置关系确定所述图像是否包括所述第二像素区块时，具体用于：The device according to claim 48, wherein a plurality of the second feature points are located in a ring-shaped pixel block surrounding the first pixel area, and when the processor is configured to determine whether the image includes the second pixel block according to the positional relationship between the first matching point and the second matching points, the processor is specifically configured to:
    当多个所述第二匹配点中预设数量的所述第二匹配点位于所述第一匹配点的一侧时,则判定所述第二图像中包括所述第二像素区块。When a preset number of the second matching points among the plurality of second matching points are located on one side of the first matching point, it is determined that the second pixel block is included in the second image.
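For illustration only: a rough Python/OpenCV sketch in the spirit of claims 48-50, matching feature points taken inside the first pixel block and in a band around it against a candidate frame and comparing their relative horizontal positions; the ORB detector, the margin, and the mean-position test are illustrative assumptions.

```python
import cv2
import numpy as np

def block_revealed(first_image, candidate_image, block_rect, margin=40):
    """Return True if the candidate frame appears to expose the area behind
    the block: peripheral matches end up on one side of the block matches."""
    x, y, w, h = block_rect
    orb = cv2.ORB_create(500)
    kp1, des1 = orb.detectAndCompute(first_image, None)
    kp2, des2 = orb.detectAndCompute(candidate_image, None)
    if des1 is None or des2 is None:
        return False
    inner, outer = set(), set()
    for i, k in enumerate(kp1):
        px, py = k.pt
        if x <= px < x + w and y <= py < y + h:
            inner.add(i)                      # first feature points (inside the block)
        elif x - margin <= px < x + w + margin and y - margin <= py < y + h + margin:
            outer.add(i)                      # second feature points (peripheral band)
    matches = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True).match(des1, des2)
    xs_inner = [kp2[m.trainIdx].pt[0] for m in matches if m.queryIdx in inner]
    xs_outer = [kp2[m.trainIdx].pt[0] for m in matches if m.queryIdx in outer]
    if not xs_inner or not xs_outer:
        return False
    # if the peripheral matches now sit clearly to one side of the block
    # matches, the previously occluded block is taken to be visible
    return np.mean(xs_outer) > np.mean(xs_inner)
```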
  51. 根据权利要求39、40或47所述的装置,其特征在于,所述摄像装置搭载于可移动平台,所述处理器用于控制所述摄像装置运动时,具体用于:The device according to claim 39, 40 or 47, wherein the camera device is mounted on a movable platform, and the processor is used to control the movement of the camera device, and is specifically used for:
    控制所述可移动平台运动以控制所述摄像装置运动。Movement of the movable platform is controlled to control movement of the camera.
  52. 根据权利要求39、40或47所述的装置，其特征在于，所述摄像装置通过云台搭载于可移动平台，所述处理器用于控制所述摄像装置运动时，具体用于：The device according to claim 39, 40 or 47, wherein the camera device is mounted on a movable platform through a pan/tilt, and when the processor is configured to control the movement of the camera device, the processor is specifically configured to:
    控制所述可移动平台运动,和/或,控制所述云台使所述摄像装置与所述可移动平台之间产生相对运动,以控制所述摄像装置运动。The movable platform is controlled to move, and/or the pan/tilt is controlled to generate relative motion between the camera and the movable platform, so as to control the camera to move.
  53. 根据权利要求39或40所述的装置，其特征在于，所述摄像装置通过云台搭载于可移动平台，所述第二位姿包括第二位置和第二朝向，所述处理器用于控制所述摄像装置运动以使所述摄像装置调整至所述第二位姿时，具体用于：The device according to claim 39 or 40, wherein the camera device is mounted on a movable platform through a pan/tilt head, the second pose includes a second position and a second orientation, and when the processor is configured to control the camera to move so as to adjust the camera to the second pose, the processor is specifically configured to:
    控制所述可移动平台运动,以使所述摄像装置位于所述第二位置;以及控制所述云台转动,以使所述摄像装置的朝向调整至所述第二朝向。The movable platform is controlled to move so that the camera is located at the second position; and the pan-tilt is controlled to rotate so that the orientation of the camera is adjusted to the second orientation.
  54. 根据权利要求51-53任一项所述的装置,其特征在于,所述可移动平台包括无人机、无人车、无人船中的任一种。The device according to any one of claims 51-53, wherein the movable platform comprises any one of an unmanned aerial vehicle, an unmanned vehicle, and an unmanned boat.
  55. 根据权利要求38-54任一项所述的装置,其特征在于,所述处理器用于在所述第一图像中确定待滤除对象对应的第一像素区块时,具体用于:The device according to any one of claims 38-54, wherein when the processor is configured to determine the first pixel block corresponding to the object to be filtered out in the first image, it is specifically configured to:
    响应于用户的指令,从所述第一图像中确定待滤除对象对应的第一像素区块。In response to the user's instruction, the first pixel block corresponding to the object to be filtered out is determined from the first image.
  56. 根据权利要求55所述的装置，其特征在于，所述指令包括用户通过人机交互界面输入的选框，所述选框用于框选所述静态目标对象。The apparatus according to claim 55, wherein the instruction includes a selection box input by the user through a human-computer interaction interface, and the selection box is used to frame-select the static target object.
  57. 根据权利要求56所述的装置，其特征在于，所述第一像素区块为所述选框框选的像素区块，所述装置还用于：The apparatus according to claim 56, wherein the first pixel block is the pixel block framed by the selection box, and the apparatus is further configured to:
    对所述第一图像进行超像素分割处理，得到多个图像区域，基于所述多个图像区域调整所述选框框选的像素区块。Perform superpixel segmentation processing on the first image to obtain a plurality of image areas, and adjust the pixel block framed by the selection box based on the plurality of image areas.
  58. 根据权利要求57所述的装置，其特征在于，所述处理器基于所述多个图像区域调整所述选框框选的像素区块时，具体用于：The device according to claim 57, wherein when the processor adjusts the pixel block framed by the selection box based on the plurality of image areas, the processor is specifically configured to:
    根据所述多个图像区域中各图像区域落入所述选框的部分与各图像区域的占比调整所述选框框选的像素区块。The pixel block framed by the selection box is adjusted according to the proportion of the portion of each of the plurality of image areas that falls within the selection box relative to that image area.
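For illustration only: a sketch of the adjustment in claims 57-58, snapping a user-drawn selection to superpixel boundaries; scikit-image's SLIC and the 50% overlap ratio are illustrative choices, and all names are hypothetical.

```python
import numpy as np
from skimage.segmentation import slic

def refine_selection(image, box_mask, overlap_ratio=0.5, n_segments=400):
    """Keep a superpixel in the selection if at least `overlap_ratio` of
    its area falls inside the user-drawn box; drop it otherwise."""
    labels = slic(image, n_segments=n_segments, start_label=0)
    refined = np.zeros_like(box_mask)
    for lbl in np.unique(labels):
        segment = labels == lbl
        if (box_mask[segment] > 0).mean() >= overlap_ratio:
            refined[segment] = 1
    return refined
```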
  59. 根据权利要求38-58任一项所述的装置,其特征在于,所述装置还用于:The device according to any one of claims 38-58, characterized in that, the device is further used for:
    当预设第一条件触发后,向用户发出无法滤除所述待滤除对象的提示信息。When the preset first condition is triggered, a prompt message that the object to be filtered out cannot be filtered out is sent to the user.
  60. 根据权利要求59所述的装置,其特征在于,所述预设第一条件包括以下一种或多种:The device according to claim 59, wherein the preset first condition comprises one or more of the following:
    所述待滤除对象与所述目标对象的第一距离小于第一预设阈值;The first distance between the object to be filtered and the target object is less than a first preset threshold;
    或所述目标对象与所述摄像装置的第二距离小于第二预设阈值；or the second distance between the target object and the camera device is less than a second preset threshold;
    或所述第一距离与所述第二距离之间的距离大小关系不满足预设第二条件。Or the distance magnitude relationship between the first distance and the second distance does not satisfy a preset second condition.
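For illustration only: a feasibility check in the spirit of claims 59-60; the thresholds and the particular distance relationship tested are assumptions, not the claimed conditions themselves.

```python
def removal_feasible(occluder_target_dist, camera_target_dist,
                     min_separation=0.5, min_camera_dist=2.0):
    """Return False when filtering cannot work: the occluder sits too close
    to the target, the target is too close to the camera, or the distance
    relationship is implausible (occluder not in front of the target)."""
    if occluder_target_dist < min_separation:
        return False
    if camera_target_dist < min_camera_dist:
        return False
    return occluder_target_dist < camera_target_dist
```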
  61. 根据权利要求38-60任一项所述的装置，其特征在于，所述装置还用于：The device according to any one of claims 38-60, wherein the device is further configured to:
    在所述第二图像中确定所述第二像素区块。The second pixel block is determined in the second image.
  62. 根据权利要求61所述的装置,其特征在于,所述处理器用于在所述第二图像中确定所述第二像素区块时,具体用于:The device according to claim 61, wherein when the processor is configured to determine the second pixel block in the second image, it is specifically configured to:
    确定所述第一图像的像素点与所述第二图像的像素点之间的映射关系;determining the mapping relationship between the pixels of the first image and the pixels of the second image;
    根据所述映射关系确定所述第一像素区块在所述第二图像的映射区域，作为所述第二像素区块。Determine, according to the mapping relationship, the mapping area of the first pixel block in the second image, and use it as the second pixel block.
  63. 根据权利要求62所述的装置,其特征在于,所述处理器用于确定所述第一图像的像素点与所述第二图像的像素点之间的映射关系时,具体用于:The device according to claim 62, wherein when the processor is configured to determine the mapping relationship between the pixels of the first image and the pixels of the second image, it is specifically configured to:
    在所述第一像素区块的周边区域提取第三特征点,并在所述第二图像中确定所述第三特征点的第三匹配点;Extract the third feature point in the peripheral area of the first pixel block, and determine the third matching point of the third feature point in the second image;
    基于所述第三特征点和所述第三匹配点确定所述映射关系。The mapping relationship is determined based on the third feature point and the third matching point.
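For illustration only: a sketch of claims 62-63 that estimates a mapping from feature points taken outside the block and carries the block over to the second image; using a single homography assumes the block's surroundings are roughly planar, and block_mask is assumed to be a uint8 mask with values 0/255.

```python
import cv2
import numpy as np

def map_block_into_second_image(first_image, second_image, block_mask):
    """Match features in the peripheral area of the block, fit a homography,
    and warp the block mask into the second image to mark the second pixel block."""
    orb = cv2.ORB_create(1000)
    # detect only outside the block, i.e. in its peripheral area
    kp1, des1 = orb.detectAndCompute(first_image, cv2.bitwise_not(block_mask))
    kp2, des2 = orb.detectAndCompute(second_image, None)
    if des1 is None or des2 is None:
        return None
    matches = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True).match(des1, des2)
    if len(matches) < 4:
        return None
    src = np.float32([kp1[m.queryIdx].pt for m in matches]).reshape(-1, 1, 2)
    dst = np.float32([kp2[m.trainIdx].pt for m in matches]).reshape(-1, 1, 2)
    H, _ = cv2.findHomography(src, dst, cv2.RANSAC, 3.0)
    if H is None:
        return None
    h, w = second_image.shape[:2]
    # the warped mask marks the second pixel block in the second image
    return cv2.warpPerspective(block_mask, H, (w, h))
```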
  64. 根据权利要求61所述的装置,其特征在于,所述处理器用于在所述第二图像中确定所述第二像素区块时,具体用于:The device according to claim 61, wherein when the processor is configured to determine the second pixel block in the second image, it is specifically configured to:
    在所述第一图像中确定一环绕所述第一像素区块的环状像素区块;determining a ring-shaped pixel block surrounding the first pixel block in the first image;
    在所述第二图像中确定与所述环状像素区块相匹配的匹配环状区块;determining a matching ring-shaped block that matches the ring-shaped pixel block in the second image;
    将所述第二图像中所述匹配环状区块包围的像素区块作为所述第二像素区块。The pixel block surrounded by the matching ring-shaped block in the second image is used as the second pixel block.
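For illustration only: masked template matching is one possible way to find the matching ring-shaped block of claim 64; the OpenCV method and the returned rectangle convention are assumptions.

```python
import cv2

def find_matching_ring(second_image, ring_patch, ring_mask):
    """Locate the best match for a ring-shaped patch cut out of the first
    image; the block enclosed by the matched ring is the second pixel block."""
    scores = cv2.matchTemplate(second_image, ring_patch, cv2.TM_CCORR_NORMED,
                               None, ring_mask)
    _, _, _, best = cv2.minMaxLoc(scores)
    h, w = ring_patch.shape[:2]
    return best[0], best[1], w, h   # x, y, width, height of the matched region
```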
  65. 根据权利要求38-64任一项所述的装置,其特征在于,所述处理器用于获取摄像装置在第一位姿采集的第一图像时,具体用于:The device according to any one of claims 38 to 64, wherein when the processor is configured to acquire the first image captured by the camera in the first pose, the processor is specifically configured to:
    获取摄像装置在第一位姿采集的多张图像;acquiring multiple images collected by the camera device in the first pose;
    在所述多张图像中确定所述第一图像。The first image is determined among the plurality of images.
  66. 根据权利要求65所述的装置,其特征在于,所述处理器用于在所述第一图像中确定待滤除对象对应的第一像素区块之前,还用于:The device according to claim 65, wherein the processor is further configured to: before determining the first pixel block corresponding to the object to be filtered out in the first image:
    通过所述多张图像确定所述待滤除对象的类别信息,所述类别信息用于标识所述待滤除对象为动态对象或静态对象;Determine the category information of the object to be filtered out by using the plurality of images, where the category information is used to identify the object to be filtered out as a dynamic object or a static object;
    若所述待滤除对象为静态对象，则执行所述在所述第一图像中确定待滤除对象对应的第一像素区块，获取所述摄像装置在第二位姿采集的包括第二像素区块的第二图像，并通过所述第二像素区块对所述第一图像中所述第一像素区块做替换处理，以生成替换处理后的第一图像的步骤。If the object to be filtered out is a static object, perform the steps of determining the first pixel block corresponding to the object to be filtered out in the first image, acquiring the second image including the second pixel block collected by the camera in the second pose, and replacing the first pixel block in the first image with the second pixel block to generate the replaced first image.
  67. 根据权利要求66所述的装置,其特征在于,所述装置还用于:The apparatus of claim 66, wherein the apparatus is further configured to:
    若所述待滤除对象为动态对象,则执行以下步骤:If the object to be filtered is a dynamic object, perform the following steps:
    确定待滤除对象在所述第一图像中对应的第三像素区块;Determine the third pixel block corresponding to the object to be filtered out in the first image;
    确定所述第三像素区块所在像素位置在所述多张图像中除所述第一图像外的其他图像中对应像素位置上的第四像素区块；determining a fourth pixel block at the pixel position, in each of the images other than the first image among the plurality of images, corresponding to the pixel position of the third pixel block;
    从所述其他图像中确定所述第四像素区块为静态区域的第三图像；determining, from the other images, a third image in which the fourth pixel block is a static area;
    利用所述第三图像中的所述第四像素区块替换所述第一图像中的所述第三像素区块。The third pixel block in the first image is replaced with the fourth pixel block in the third image.
  68. 根据权利要求66或67所述的装置,其特征在于,所述处理器用于通过所述多张图像确定所述待滤除对象的所述类别时,具体用于:The device according to claim 66 or 67, wherein when the processor is configured to determine the category of the object to be filtered out from the multiple images, it is specifically configured to:
    根据所述待滤除对象的各像素点相对于所述多张图像中除所述第一图像外的其他图像的光流确定所述待滤除对象的类别信息。The category information of the object to be filtered out is determined according to the optical flow of each pixel of the object to be filtered out relative to other images in the plurality of images except the first image.
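For illustration only: a sketch of the optical-flow test of claim 68 that labels the selected object dynamic when its flow magnitude clearly exceeds that of the rest of the frame; the Farneback flow and the ratio threshold are illustrative choices.

```python
import cv2
import numpy as np

def classify_object(prev_gray, curr_gray, object_mask, ratio=2.0):
    """Return 'dynamic' or 'static' by comparing the object's mean flow
    magnitude against the background's mean flow magnitude."""
    flow = cv2.calcOpticalFlowFarneback(prev_gray, curr_gray, None,
                                        0.5, 3, 15, 3, 5, 1.2, 0)
    magnitude = np.linalg.norm(flow, axis=2)
    obj_flow = magnitude[object_mask > 0].mean()
    bg_flow = magnitude[object_mask == 0].mean()
    return "dynamic" if obj_flow > ratio * max(bg_flow, 1e-6) else "static"
```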
  69. 根据权利要求67所述的装置，其特征在于，所述处理器用于从所述其他图像中确定所述第四像素区块为静态区域的第三图像时，具体用于：The device according to claim 67, wherein when the processor is configured to determine, from the other images, the third image in which the fourth pixel block is a static area, the processor is specifically configured to:
    确定所述其他图像中的动态区域;determining dynamic regions in said other images;
    针对每个所述待滤除对象,执行以下步骤:For each object to be filtered out, perform the following steps:
    按照所述其他图像与所述第一图像采集顺序由近及远的顺序，确定所述第三像素区块所在像素位置在所述其他图像中的对应像素位置的像素区块，直至所述对应像素位置的像素区块与所述动态区域不重叠，则将所述其他图像作为所述第三图像。In order of the acquisition time of the other images relative to the first image, from nearest to farthest, determine the pixel block at the pixel position in each of the other images corresponding to the pixel position of the third pixel block, until the pixel block at the corresponding pixel position does not overlap with the dynamic region; that other image is then used as the third image.
  70. 根据权利要求66-69任一项所述的装置，其特征在于，所述其他图像与所述第一图像的差异超过预设阈值；或The device according to any one of claims 66-69, wherein the difference between the other images and the first image exceeds a preset threshold; or
    所述其他图像为与所述第一图像间隔指定帧的图像。The other images are images spaced apart from the first image by a specified frame.
  71. 一种图像处理装置，其特征在于，所述图像处理装置包括处理器、存储器、存储于所述存储器且可由所述处理器执行的计算机程序，所述处理器执行所述计算机程序时，实现以下步骤：An image processing device, characterized in that the image processing device includes a processor, a memory, and a computer program stored in the memory and executable by the processor; when the processor executes the computer program, the following steps are implemented:
    确定第一图像中的动态对象对应的第一像素区块;Determine the first pixel block corresponding to the dynamic object in the first image;
    确定所述第一像素区块所在像素位置在多张第二图像中的对应像素位置的像素区块，所述多张第二图像与所述第一图像通过摄像装置在同一位姿采集得到；determining pixel blocks at the pixel positions, in a plurality of second images, corresponding to the pixel position of the first pixel block, where the plurality of second images and the first image are captured by a camera in the same pose;
    从所述多张第二图像中确定所述对应像素位置的像素区块为静态区域的第三图像；determining, from the plurality of second images, a third image in which the pixel block at the corresponding pixel position is a static area;
    利用所述第三图像中的所述对应像素位置的像素区块替换所述第一图像中的所述第一像素区块。The first pixel block in the first image is replaced with the pixel block at the corresponding pixel position in the third image.
  72. 根据权利要求71所述的装置,其特征在于,所述处理器用于确定所述第一图像中的动态对象对应的第一像素区块时,具体用于:The device according to claim 71, wherein when the processor is configured to determine the first pixel block corresponding to the dynamic object in the first image, it is specifically configured to:
    针对每一帧所述第二图像执行以下操作：performing the following operations for each frame of the second image:
    计算所述第一图像各像素点相对于所述第二图像的光流;calculating the optical flow of each pixel of the first image relative to the second image;
    从所述第一图像各像素点确定所述光流的模长大于预设阈值的目标像素点；Determine, from the pixels of the first image, target pixels whose optical-flow magnitude is greater than a preset threshold;
    对所述目标像素点进行聚类处理,得到所述动态对象对应的第一像素区块。The target pixel points are clustered to obtain the first pixel block corresponding to the dynamic object.
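For illustration only: a sketch of claim 72 that thresholds the per-pixel flow magnitude and then groups the surviving pixels; connected-component grouping stands in here for the clustering step, and both thresholds are assumptions.

```python
import cv2
import numpy as np

def dynamic_pixel_blocks(prev_gray, curr_gray, flow_thresh=2.0, min_area=200):
    """Return bounding boxes of pixel groups whose optical-flow magnitude
    exceeds the threshold; each box approximates a dynamic-object block."""
    flow = cv2.calcOpticalFlowFarneback(prev_gray, curr_gray, None,
                                        0.5, 3, 15, 3, 5, 1.2, 0)
    moving = (np.linalg.norm(flow, axis=2) > flow_thresh).astype(np.uint8)
    n, labels, stats, _ = cv2.connectedComponentsWithStats(moving)
    blocks = []
    for i in range(1, n):                      # label 0 is the background
        x, y, w, h, area = stats[i]
        if area >= min_area:
            blocks.append((x, y, w, h))
    return blocks
```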
  73. 根据权利要求71所述的装置，其特征在于，所述处理器用于从所述多张第二图像中确定所述对应像素位置的像素区块为静态区域的第三图像时，具体用于：The device according to claim 71, wherein when the processor is configured to determine, from the plurality of second images, the third image in which the pixel block at the corresponding pixel position is a static area, the processor is specifically configured to:
    确定所述多张第二图像中的动态区域;determining dynamic regions in the plurality of second images;
    针对每个所述第一像素区块,执行以下步骤:For each of the first pixel blocks, the following steps are performed:
    按照所述第二图像与所述第一图像采集顺序由近及远的顺序，确定所述第一像素区块所在的像素位置在所述第二图像中的对应像素位置的像素区块，直至所述对应像素位置的像素区块与所述动态区域不重叠，则将所述第二图像作为所述第三图像。In order of the acquisition time of the second images relative to the first image, from nearest to farthest, determine the pixel block at the pixel position in each second image corresponding to the pixel position of the first pixel block, until the pixel block at the corresponding pixel position does not overlap with the dynamic region; that second image is then used as the third image.
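For illustration only: a sketch of the nearest-first search of claims 69 and 73, returning the first co-captured frame whose co-located block does not touch any dynamic region; inputs are assumed to be ordered from nearest to farthest in acquisition time, and all names are hypothetical.

```python
def pick_static_source(block_rect, second_images, dynamic_masks):
    """Walk the frames nearest-first and return the first whose co-located
    block does not overlap its dynamic-region mask; None if none qualifies."""
    x, y, w, h = block_rect
    for image, dynamic_mask in zip(second_images, dynamic_masks):
        if not dynamic_mask[y:y + h, x:x + w].any():
            return image
    return None
```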
  74. 根据权利要求71-73任一项所述的装置，其特征在于，所述第二图像与所述第一图像的差异超过预设阈值；或The device according to any one of claims 71-73, wherein the difference between the second image and the first image exceeds a preset threshold; or
    所述第二图像为与所述第一图像间隔指定帧的图像。The second image is an image spaced apart from the first image by a specified frame.
  75. 一种可移动平台,其特征在于,所述可移动平台包括摄像装置以及如权利要求38-74任一项所述的图像处理装置。A movable platform, characterized in that, the movable platform comprises a camera device and the image processing device according to any one of claims 38-74.
PCT/CN2020/111450 2020-08-26 2020-08-26 Image processing method and apparatus, and movable platform WO2022040988A1 (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
PCT/CN2020/111450 WO2022040988A1 (en) 2020-08-26 2020-08-26 Image processing method and apparatus, and movable platform
CN202080039127.8A CN113950705A (en) 2020-08-26 2020-08-26 Image processing method and device and movable platform

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2020/111450 WO2022040988A1 (en) 2020-08-26 2020-08-26 Image processing method and apparatus, and movable platform

Publications (1)

Publication Number Publication Date
WO2022040988A1 true WO2022040988A1 (en) 2022-03-03

Family

ID=79327044

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/111450 WO2022040988A1 (en) 2020-08-26 2020-08-26 Image processing method and apparatus, and movable platform

Country Status (2)

Country Link
CN (1) CN113950705A (en)
WO (1) WO2022040988A1 (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN118175190A (en) * 2024-05-13 2024-06-11 国网山东省电力公司阳信县供电公司 Inspection method, system, equipment and medium for insulating rod type pole tower equipment

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20110103644A1 (en) * 2009-10-30 2011-05-05 Zoran Corporation Method and apparatus for image detection with undesired object removal
CN106056534A (en) * 2016-05-31 2016-10-26 中国科学院深圳先进技术研究院 Obstruction perspective method and device based on smart glasses
CN106791393A (en) * 2016-12-20 2017-05-31 维沃移动通信有限公司 A kind of image pickup method and mobile terminal
CN109035185A (en) * 2018-06-29 2018-12-18 努比亚技术有限公司 A kind of image processing method and terminal
CN109167893A (en) * 2018-10-23 2019-01-08 Oppo广东移动通信有限公司 Shoot processing method, device, storage medium and the mobile terminal of image

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2018086050A1 (en) * 2016-11-11 2018-05-17 深圳市大疆创新科技有限公司 Depth map generation method and unmanned aerial vehicle based on this method
CN108629283B (en) * 2018-04-02 2022-04-08 北京小米移动软件有限公司 Face tracking method, device, equipment and storage medium

Also Published As

Publication number Publication date
CN113950705A (en) 2022-01-18

Similar Documents

Publication Publication Date Title
US10867430B2 (en) Method and system of 3D reconstruction with volume-based filtering for image processing
US10284789B2 (en) Dynamic generation of image of a scene based on removal of undesired object present in the scene
US11838606B2 (en) Methods and systems for large-scale determination of RGBD camera poses
US9396399B1 (en) Unusual event detection in wide-angle video (based on moving object trajectories)
EP3296952B1 (en) Method and device for blurring a virtual object in a video
CN102148965B (en) Video monitoring system for multi-target tracking close-up shooting
CN109313799B (en) Image processing method and apparatus
US20180182114A1 (en) Generation apparatus of virtual viewpoint image, generation method, and storage medium
CN111815517B (en) Self-adaptive panoramic stitching method based on snapshot pictures of dome camera
JP2018151689A (en) Image processing apparatus, control method thereof, program and storage medium
CN109981972B (en) Target tracking method of robot, robot and storage medium
EP3629570A2 (en) Image capturing apparatus and image recording method
CN111866523B (en) Panoramic video synthesis method and device, electronic equipment and computer storage medium
CN106469435B (en) Image processing method, device and equipment
US11812154B2 (en) Method, apparatus and system for video processing
CN112640419B (en) Following method, movable platform, device and storage medium
CN106791456A (en) A kind of photographic method and electronic equipment
WO2022040988A1 (en) Image processing method and apparatus, and movable platform
CN105467741A (en) Panoramic shooting method and terminal
WO2020196520A1 (en) Method, system and computer readable media for object detection coverage estimation
JP6483661B2 (en) Imaging control apparatus, imaging control method, and program
CN113853559A (en) Control method, device and equipment of movable platform and storage medium
KR101132976B1 (en) Mobile device with a plurality of camera, method for display using the sane
JP2017021430A (en) Panoramic video data processing device, processing method, and program
CN113327228B (en) Image processing method and device, terminal and readable storage medium

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20950660

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 20950660

Country of ref document: EP

Kind code of ref document: A1