WO2023130465A1 - Aircraft, image processing method and device, and movable platform - Google Patents

Aircraft, image processing method and device, and movable platform

Info

Publication number
WO2023130465A1
WO2023130465A1 · PCT/CN2022/071100 · CN2022071100W
Authority
WO
WIPO (PCT)
Prior art keywords
positional relationship
visual sensor
relative positional
moment
movable platform
Prior art date
Application number
PCT/CN2022/071100
Other languages
English (en)
French (fr)
Inventor
杨健
周游
杨振飞
Original Assignee
深圳市大疆创新科技有限公司
Priority date
Filing date
Publication date
Application filed by 深圳市大疆创新科技有限公司 filed Critical 深圳市大疆创新科技有限公司
Priority to PCT/CN2022/071100 priority Critical patent/WO2023130465A1/zh
Priority to CN202280057209.4A priority patent/CN117859104A/zh
Publication of WO2023130465A1 publication Critical patent/WO2023130465A1/zh

Classifications

    • G - PHYSICS
    • G05 - CONTROLLING; REGULATING
    • G05D - SYSTEMS FOR CONTROLLING OR REGULATING NON-ELECTRIC VARIABLES
    • G05D1/00 - Control of position, course, altitude or attitude of land, water, air or space vehicles, e.g. using automatic pilots
    • G05D1/10 - Simultaneous control of position or course in three dimensions
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T3/00 - Geometric image transformations in the plane of the image

Definitions

  • the present disclosure relates to the technical field of computer vision, in particular, to an aircraft, an image processing method and device, and a movable platform.
  • an embodiment of the present disclosure provides an aircraft, the aircraft including a first visual sensor and a second visual sensor; a first visible range of the first visual sensor partially overlaps a second visible range of the second visual sensor, wherein the overlapping visible range includes an environmental area around the aircraft, and the environmental area includes areas facing two opposite parts of the aircraft; the images collected by the first visual sensor and the second visual sensor are used to calculate position information of objects in the environmental area, and the position information of the objects is used to control the movement of the aircraft in space.
  • an embodiment of the present disclosure provides an image processing method applied to a movable platform, the movable platform including a first visual sensor and a second visual sensor, wherein a first visible range of the first visual sensor partially overlaps a second visible range of the second visual sensor. The method includes: acquiring a first partial image of the first visual sensor within the overlapping visible range and a second partial image of the second visual sensor within the overlapping visible range; acquiring an image collected by the first visual sensor at a first moment and an image collected at a second moment, where the pose of the first visual sensor in space at the first moment is different from its pose in space at the second moment; and determining, based on the first partial image, the second partial image, the image collected at the first moment and the image collected at the second moment, the relative positional relationship between objects in the space where the movable platform is located and the movable platform.
  • an embodiment of the present disclosure provides an image processing device including a processor, the device being applied to a movable platform that includes a first visual sensor and a second visual sensor, wherein a first visible range of the first visual sensor partially overlaps a second visible range of the second visual sensor. The processor is configured to perform the following steps: acquiring a first partial image of the first visual sensor within the overlapping visible range and a second partial image of the second visual sensor within the overlapping visible range; acquiring an image collected by the first visual sensor at a first moment and an image collected at a second moment, where the pose of the first visual sensor in space at the first moment is different from its pose in space at the second moment; and determining, based on the first partial image, the second partial image, the image collected at the first moment and the image collected at the second moment, the relative positional relationship between objects in the space where the movable platform is located and the movable platform.
  • an embodiment of the present disclosure provides a movable platform, including: a first visual sensor and a second visual sensor, respectively used to collect images of the environmental area around the movable platform, where the first visible range of the first visual sensor partially overlaps the second visible range of the second visual sensor; and the image processing device according to any one of the embodiments.
  • in the embodiments of the present disclosure, a first visual sensor and a second visual sensor whose visible ranges partially overlap are installed on the aircraft. Because the overlapping visible range includes the environmental area around the aircraft, and the environmental area includes the areas facing two opposite parts of the aircraft, the position information of objects in two opposite areas of the surrounding environment can be obtained; in addition, the position information of objects within the non-overlapping visible range can also be obtained. Therefore, by adopting the solution of the disclosed embodiments, only two vision sensors need to be deployed on the aircraft to obtain a larger perception range; the configuration of the vision sensors is simple and the cost is low, which reduces the weight and cost of the aircraft and improves the safety of its movement in space.
  • the relative positional relationship between objects in the space and the movable platform is determined jointly from the first partial image of the first visual sensor within the overlapping visible range, the second partial image of the second visual sensor within the overlapping visible range, and the image collected by the first visual sensor at the first moment together with the image collected at the second moment. On the one hand, this enlarges the range over which objects in space can be perceived; on the other hand, it improves the accuracy of perception, so that the relative positional relationship of objects over a larger range can be obtained accurately.
  • FIG. 1 is a schematic diagram of the coverage of a binocular vision sensor in the related art.
  • FIG. 2 is a schematic diagram of a layout of a vision sensor.
  • FIG. 3 is a schematic diagram of an aircraft according to an embodiment of the present disclosure.
  • FIG. 4A and FIG. 4B are schematic diagrams of coverage areas of vision sensors according to embodiments of the present disclosure, respectively.
  • FIG. 5 is a schematic diagram of determining a relative positional relationship based on semantics according to an embodiment of the present disclosure.
  • FIG. 6 is a flowchart of an image processing method according to an embodiment of the present disclosure.
  • FIG. 7 is a schematic diagram of a binocular depth estimation process according to an embodiment of the present disclosure.
  • FIG. 8 is a schematic diagram of a monocular depth estimation process according to an embodiment of the present disclosure.
  • FIG. 9 is an overall flowchart of an embodiment of the present disclosure.
  • FIG. 10 is a schematic diagram of a hardware structure of an image processing device according to an embodiment of the present disclosure.
  • FIG. 11 is a schematic diagram of a movable platform according to an embodiment of the present disclosure.
  • although the terms first, second, third, etc. may be used in the present disclosure to describe various information, the information should not be limited to these terms. These terms are only used to distinguish information of the same type from one another. For example, without departing from the scope of the present disclosure, first information may also be called second information, and similarly, second information may also be called first information. Depending on the context, the word "if" as used herein may be interpreted as "when" or "upon" or "in response to a determination".
  • in the information age, computers are used more and more widely in various fields. As an important branch of intelligent computing, computer vision has developed greatly and found wide application. Computer vision relies on imaging systems, most commonly visual sensors, instead of visual organs as its input means. When computer vision is applied to a movable platform, visual sensors are used to perceive objects around the platform, and the perception results are used to control the movement of the platform (including moving speed, moving direction, etc.). For example, a stereo vision system (Stereo Vision System) can be deployed on a movable platform.
  • Stereo Vision System (binocular vision system)
  • the binocular vision system uses a binocular algorithm to calculate depth: two visual sensors capture two images of the scene at the same time from different angles, and a depth map (Depth Map) is computed from the difference (disparity) between the two images and the pose relationship, including the distance, between the two visual sensors.
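  • As a concrete illustration of the binocular relation just described, the sketch below converts image disparity into depth with the standard rectified-stereo formula; the focal length and baseline values are hypothetical, and a real fisheye pair would first need rectification.

```python
import numpy as np

def disparity_to_depth(disparity_px: np.ndarray, focal_px: float, baseline_m: float) -> np.ndarray:
    """Rectified-stereo relation: depth = focal_length * baseline / disparity.
    Points with non-positive disparity are marked as invalid (infinite depth)."""
    disparity_px = np.asarray(disparity_px, dtype=np.float64)
    depth = np.full(disparity_px.shape, np.inf)
    valid = disparity_px > 0
    depth[valid] = focal_px * baseline_m / disparity_px[valid]
    return depth

# Hypothetical numbers: 600 px focal length, 0.10 m baseline, 12 px disparity -> 5 m
print(disparity_to_depth(np.array([12.0]), focal_px=600.0, baseline_m=0.10))
```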
  • the depth can also be calculated by a monocular algorithm: a single visual sensor captures the same object from different angles, and the depth information of the area where the images captured from multiple angles overlap is calculated.
  • the sensing range is generally expanded in the following ways:
  • Method 1: since the non-overlapping area of the binocular system's imaging is relatively large, and the reliability of depth information in the non-overlapping area is not high, when the binocular method is used to obtain depth information, the layout of the binocular vision sensors is generally improved so as to maximize their overlapping visible range.
  • however, because the binocular algorithm can only calculate depth information within the overlapping visible range of the two visual sensors, even if the overlapping visible range is increased, the range over which depth information can be calculated is still limited.
  • S1 and S2 are two visual sensors; the visible range of S1 is the range corresponding to BO1D, and the visible range of S2 is the range corresponding to AO2C, so the overlap of the visible ranges of S1 and S2 is the gray area. In this scene, only the depth information of objects in the gray area can be calculated accurately; the left and right sides are seen only from the single perspective of S1 or S2, and their depth information cannot be calculated accurately. Therefore, this improved method still cannot obtain sufficiently large and accurate depth information. In addition, the baseline length of the binocular system is limited, so it cannot observe objects at long distances.
  • Method 2: deploy multiple sets of vision sensors in multiple directions and at multiple angles.
  • for example, monocular or binocular vision sensors are deployed in each of the front, rear, left, right, upward and downward directions.
  • this layout method is complex in structure, high in cost, requires large computing power, and will increase the weight of the movable platform.
  • Method 3: use monocular calculation to obtain depth information.
  • however, this method requires the visual sensor to move.
  • if the visual sensor is stationary, or its pose changes too much or too little, it is difficult to generate an effective depth map, so the depth information of objects cannot be estimated.
  • moreover, the photographed object must be stationary and cannot move; therefore, the robustness of the depth information acquired by the monocular method is poor.
  • in addition, if the pose relationship of the visual sensor at different moments is inaccurate, the accuracy of the acquired depth information is severely affected.
  • an embodiment of the present disclosure provides an aircraft.
  • the aircraft includes a first visual sensor 301 and a second visual sensor 302.
  • the images collected by the first visual sensor 301 and the second visual sensor 302 are used to calculate the position information of objects in the environment area, and the position information of the objects is used to control the movement of the aircraft in space.
  • the aircraft in the embodiments of the present disclosure may be an unmanned aerial vehicle, for example, a multi-rotor unmanned aerial vehicle.
  • the aircraft can fly autonomously in space, or fly in space in response to control instructions issued by the user through gestures, remote control, voice control, and the like.
  • the first visual sensor 301 and the second visual sensor 302 may be arranged on the aircraft.
  • the visible range of the first visual sensor 301 is called the first visible range, and the visible range of the second visual sensor 302 is called the second visible range.
  • the overlapping visible range of the first visible range and the second visible range may include two oppositely facing areas of the aircraft, for example, a top-facing area and a bottom-facing area of the aircraft, or a left-facing area and a right-facing area of the aircraft, or a front-facing area and a rear-facing area of the aircraft.
  • the overlapping visible range is also called overlapping area or binocular overlapping area
  • the non-overlapping visible range is also called non-overlapping area or binocular non-overlapping area.
  • one of the first visual sensor 301 and the second visual sensor 302 can be set on the top of the aircraft and the other on the bottom of the aircraft, so that the overlapping visible range includes a left-facing area and a right-facing area of the aircraft, or a front-facing area and a rear-facing area of the aircraft.
  • the non-overlapping visual ranges of the first visual sensor 301 and the second visual sensor 302 include the top and bottom environmental areas of the aircraft.
  • alternatively, one of the first visual sensor 301 and the second visual sensor 302 can be set on the left side of the aircraft and the other on the right side of the aircraft, so that the overlapping visible range includes the environmental area on the top and the environmental area on the bottom of the aircraft.
  • the non-overlapping visual ranges of the first visual sensor 301 and the second visual sensor 302 include the left and right environmental areas of the aircraft.
  • alternatively, one of the first visual sensor 301 and the second visual sensor 302 may be arranged on the front side of the aircraft and the other on the rear side of the aircraft, so that the overlapping visible range includes the environmental area on the top and the environmental area on the bottom of the aircraft.
  • the non-overlapping visual ranges of the first visual sensor 301 and the second visual sensor 302 include the environmental area on the front side and the environmental area on the rear side of the aircraft.
  • the positions of the first visual sensor 301 and the second visual sensor 302 can also be set to other positions according to actual needs, as long as the overlapping visible range of the two visual sensors includes the areas facing two opposite parts of the aircraft.
  • the solution of the present disclosure will be described below by taking one of the first visual sensor 301 and the second visual sensor 302 disposed on the top of the aircraft and the other disposed on the bottom of the aircraft as an example.
  • the images collected by the first visual sensor 301 and the second visual sensor 302 both include partial images corresponding to binocular overlapping regions and partial images corresponding to non-overlapping regions.
  • An aircraft may include arms and a fuselage.
  • the first visual sensor 301 and the second visual sensor 302 are both installed on the fuselage.
  • the top is the area above the aircraft arms and fuselage
  • the bottom is the area below the aircraft arms and fuselage
  • the sides are the area beside the aircraft arms and fuselage.
  • the top is the side facing the sky
  • the bottom is the side facing the ground
  • the sides are one or more sides except the top and the bottom.
  • the total visual range of the first visual sensor 301 and the second visual sensor 302 covers almost all directions and angles of the space where the aircraft is located, and the coverage is relatively large.
  • at least one of the first visual sensor 301 and the second visual sensor 302 may be a fisheye camera. Since the observation range of a single fisheye camera is relatively large (greater than 180 degrees), omnidirectional observation can be achieved with only two fisheye cameras, which is suitable for application scenarios with strict requirements on the weight of peripherals.
  • the first visual sensor 301 may also be called an upper fisheye camera
  • the second visual sensor 302 may also be called a lower fisheye camera.
  • the structures of the first visual sensor 301 and the second visual sensor 302 may be the same or different, which is not limited in the present disclosure.
  • the first visual sensor or the second visual sensor includes a photosensitive circuit module and a set of fisheye optical modules mounted in cooperation with the photosensitive circuit.
  • in general, the aircraft flies horizontally in space, or its flight speed includes a horizontal component; therefore, the demand for perception accuracy of the environmental area on the sides of the aircraft is relatively high.
  • in the embodiments of the present disclosure, the overlapping visible range of the first visual sensor 301 and the second visual sensor 302 includes the environmental area on the sides of the aircraft; therefore, objects in the side environmental area can be perceived by the two visual sensors at the same time, which improves the accuracy of sensing objects in that area.
  • in the related art, the overlapping visible range of a binocular vision system on an aircraft generally covers only one side of the aircraft, whereas the overlapping visible range in the embodiments of the present disclosure includes two areas facing opposite parts. As shown in FIG. 4B, the areas facing the two opposite parts include a first side and a second side of the aircraft, and the first side and the second side may respectively be the left and right sides of the aircraft, or its front and rear sides.
  • the overlapping visual range of the first vision sensor 301 and the second vision sensor 302 in the embodiment of the present disclosure is larger.
  • the images collected by the first visual sensor 301 and the second visual sensor 302 are used to calculate the position information of objects in the environment area.
  • the position information may be determined based on a relative position relationship between the object and the aircraft, wherein the relative position relationship may be characterized by depth information, parallax information or position information itself of the object.
  • when the relative positional relationship is represented by the position information of the object itself, it may be directly determined as the position information of the object; when the relative positional relationship is represented by the depth information or disparity information of the object, the depth information or disparity information may be converted into position information.
  • the solution of the present disclosure will be described below by taking the relative position relationship as depth information as an example. Those skilled in the art can understand that the depth information hereinafter may also be replaced by other relative positional relationships.
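  • Since depth, disparity and position are treated above as interchangeable representations of the relative positional relationship, here is a minimal sketch of one such conversion: back-projecting a pixel with known depth into a 3-D point in the sensor frame under an assumed pinhole model (a fisheye sensor would need its own projection function), with hypothetical intrinsics.

```python
import numpy as np

def pixel_depth_to_point(u: float, v: float, depth_m: float, K: np.ndarray) -> np.ndarray:
    """Back-project pixel (u, v) with depth into camera coordinates (pinhole model)."""
    fx, fy = K[0, 0], K[1, 1]
    cx, cy = K[0, 2], K[1, 2]
    x = (u - cx) * depth_m / fx
    y = (v - cy) * depth_m / fy
    return np.array([x, y, depth_m])

# Hypothetical intrinsics for a 1280x960 sensor
K = np.array([[600.0, 0.0, 640.0],
              [0.0, 600.0, 480.0],
              [0.0, 0.0, 1.0]])
print(pixel_depth_to_point(700.0, 500.0, 5.0, K))  # point in the camera frame, metres
```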
  • the relative positional relationship between an object in the space and the aircraft is determined jointly based on the following two items: the first partial image of the first visual sensor 301 within the overlapping visible range and the second partial image of the second visual sensor 302 within the overlapping visible range; and the image collected at the first moment and the image collected at the second moment by the visual sensor whose visible range covers the object, where the pose of that visual sensor in space at the first moment is different from its pose in space at the second moment.
  • the first partial image and the second partial image may be images collected at the same time. The two partial images collected at the same time can be identified based on the time stamps of the first partial image and the second partial image, or the same clock signal can be used to control the first visual sensor and the second visual sensor to collect images, so as to obtain two partial images collected at the same time.
  • both the first partial image and the second partial image may be images collected at the first moment, so that the accuracy of the final output relative positional relationship can be improved.
  • the visual sensor whose visual range covers the object is the first visual sensor 301 .
  • the visual sensor whose visual range covers the object is the second visual sensor 302 .
  • the first moment is different from the second moment.
  • the pose in space of the visual sensor whose visible range covers the object is different at the first moment and at the second moment. This may be because the pose of the aircraft itself changes, causing the pose of the visual sensor to change, or because the pose of the aircraft itself does not change and only the pose of the visual sensor changes.
  • the relative positional relationship between a first object in the environmental area on the top or bottom of the aircraft and the aircraft may be obtained based on the relative positional relationship between the aircraft and an object that is within the overlapping visible range and has the same semantic information as the first object.
  • in the following, the scheme of an embodiment of the present disclosure is described by taking the first object as an object in the environmental area on the top of the aircraft, and the visual sensor whose visible range covers the first object as the first visual sensor 301.
  • when the first object is an object in the environmental area at the bottom of the aircraft and the visual sensor whose visible range covers the first object is the second visual sensor 302, the processing method is similar, and details are not repeated here.
  • the embodiments of the present disclosure predict the depth information of objects within the binocular non-overlapping range based on the semantic information of the objects, so that depth information can be obtained not only within the binocular overlapping range but also outside it, making full use of the images collected by the two visual sensors to expand the range over which depth information can be acquired.
  • as shown in FIG. 5, the overlapping visible range (shown by the oblique lines in the figure) includes a partial area on the cup 501 and a partial area on the bottle 502; based on the depth information of the cup 501 within the overlapping visible range, the depth information of the partial area on the cup 501 that is within the visible range of the first visual sensor 301 but outside the visible range of the second visual sensor 302 can be determined.
  • the semantic information of each point in the image can be obtained by performing semantic recognition on the image. For example, an image can be input into a pre-trained convolutional neural network, and the semantic information of each point on the image can be output through the convolutional neural network.
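  • To make the semantics-based completion concrete, the sketch below assumes a pre-trained segmentation CNN has already produced a per-pixel logit map; propagating the per-label median depth from the overlap area to the non-overlap area is an illustrative rule chosen here, not necessarily the exact prediction used in the disclosure.

```python
import numpy as np

def semantic_labels(logits: np.ndarray) -> np.ndarray:
    """Per-pixel class labels from an (H, W, C) logit map output by a
    pre-trained segmentation CNN (the network itself is assumed)."""
    return np.argmax(logits, axis=-1)

def complete_depth_by_semantics(depth: np.ndarray, labels: np.ndarray,
                                overlap_mask: np.ndarray) -> np.ndarray:
    """For pixels outside the binocular overlap, borrow the median depth of
    pixels with the same semantic label inside the overlap (illustrative rule)."""
    completed = np.where(overlap_mask, depth, np.nan).astype(np.float64)
    for label in np.unique(labels):
        inside = (labels == label) & overlap_mask & np.isfinite(depth)
        outside = (labels == label) & ~overlap_mask
        if inside.any():
            completed[outside] = np.median(depth[inside])
    return completed
```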
  • the relative positional relationship between the aircraft and an object within the overlapping visible range having the same semantic information as the first object (referred to as a target object) may be obtained based on the first partial image of the first visual sensor within the overlapping visible range and the second partial image of the second visual sensor within the overlapping visible range.
  • a binocular algorithm may be used to acquire depth information of the target object relative to the aircraft.
  • the relative positional relationship between the first object and the aircraft may be obtained jointly based on the following information: a relative positional relationship r1 predicted from the relative positional relationship between the target object and the aircraft, where the target object is an object within the overlapping visible range having the same semantic information as the first object, and the relative positional relationship between the target object and the aircraft is determined from the first partial image of the first visual sensor within the overlapping visible range and the second partial image of the second visual sensor within the overlapping visible range; and a relative positional relationship r2 determined from the image collected at the first moment and the image collected at the second moment by the visual sensor whose visible range covers the first object, where the pose of that visual sensor in space at the first moment is different from its pose in space at the second moment.
  • the embodiments of the present disclosure use both the relative positional relationship r1 between the first object and the aircraft estimated from semantic information and the relative positional relationship r2 between the first object and the aircraft determined by the monocular algorithm, which can improve the precision and robustness of the obtained depth information.
  • taking the first object as an object within the visible range of the first visual sensor 301 as an example, the depth information of the first object can be determined based on the pose relationship between the pose of the first visual sensor at the first moment and its pose at the second moment, together with the image collected by the first visual sensor at the first moment and the image collected at the second moment.
  • for example, the first moment may be the current moment, the second moment may be a moment before the first moment (that is, a historical moment), and the first partial image and the second partial image may both be images collected at the first moment.
  • An inertial measurement unit may be used to determine the pose of the first visual sensor at the first moment and the pose at the second moment, so as to determine the pose relationship between the poses at the two moments.
  • the IMU can be directly installed on the first visual sensor, so that the output result of the IMU can be directly determined as the pose of the first visual sensor.
  • the IMU can be installed on the fuselage of the aircraft, so that the pose of the first visual sensor can be determined through the output of the IMU and the pose relationship between the aircraft and the first visual sensor.
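  • The pose chain described above (IMU output for the body, plus a fixed body-to-camera extrinsic) can be sketched with homogeneous transforms as follows; the matrices here are placeholders for whatever the IMU and calibration actually provide.

```python
import numpy as np

def pose_matrix(R: np.ndarray, t: np.ndarray) -> np.ndarray:
    """Build a 4x4 homogeneous pose from a rotation matrix and a translation vector."""
    T = np.eye(4)
    T[:3, :3] = R
    T[:3, 3] = t
    return T

def camera_pose_from_imu(T_world_body: np.ndarray, T_body_camera: np.ndarray) -> np.ndarray:
    """Chain the IMU-derived body pose with the fixed body->camera extrinsic."""
    return T_world_body @ T_body_camera

def relative_pose(T_world_cam_t1: np.ndarray, T_world_cam_t2: np.ndarray) -> np.ndarray:
    """Pose of the camera at the second moment expressed in its frame at the first moment."""
    return np.linalg.inv(T_world_cam_t1) @ T_world_cam_t2
```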
  • the embodiments of the present disclosure may also calculate the depth information of the objects within the binocular overlapping range.
  • when the first visual sensor 301 and the second visual sensor 302 are respectively arranged on the top and the bottom of the aircraft, the environmental area includes the left-facing area and the right-facing area of the aircraft, and/or the front-facing area and the rear-facing area of the aircraft.
  • the areas facing the front side, the rear side, the left side, and the right side are collectively referred to as the environmental area of the side of the aircraft.
  • the calculation process of the relative positional relationship between the aircraft and a first object in the environmental area on the top or bottom of the aircraft differs from the calculation process of the relative positional relationship between the aircraft and a second object in the environmental area on the side of the aircraft.
  • the relative positional relationship between the second object and the aircraft can be obtained jointly based on the following information: a relative positional relationship r3 obtained from the first partial image of the first visual sensor within the overlapping visible range and the second partial image of the second visual sensor within the overlapping visible range, and a relative positional relationship r4 obtained from the image collected at the first moment and the image collected at the second moment by the visual sensor whose visible range covers the second object. The relative positional relationship r3 between the second object and the aircraft can be acquired using a binocular algorithm, and the acquisition of the relative positional relationship r4 can refer to the acquisition of the relative positional relationship r2, which is not repeated here. Through this embodiment, the relative positional relationship between the second object and the aircraft can be obtained with high precision.
  • the sensors have simple configuration, low cost, light weight and large sensing range.
  • an embodiment of the present disclosure also provides an image processing method applied to a movable platform, the movable platform including a first visual sensor and a second visual sensor, where the first visible range of the first visual sensor partially overlaps the second visible range of the second visual sensor. The method includes:
  • Step 601: acquire a first partial image of the first visual sensor within the overlapping visible range, and acquire a second partial image of the second visual sensor within the overlapping visible range;
  • Step 602: acquire the image collected by the first visual sensor at a first moment and the image collected at a second moment, where the pose of the first visual sensor in space at the first moment is different from its pose in space at the second moment;
  • Step 603: based on the first partial image, the second partial image, the image collected at the first moment and the image collected at the second moment, determine the relative positional relationship between objects in the space where the movable platform is located and the movable platform.
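  • A skeleton of steps 601 to 603, with the binocular estimator, monocular estimator and fusion rule left as assumed callables that the later embodiments describe:

```python
def relative_position_step(first_partial, second_partial, img_t1, img_t2,
                           binocular_estimator, monocular_estimator, fuse):
    """Steps 601-603 in outline: the overlap pair feeds a binocular estimate, the two
    images of the first sensor at different poses feed a monocular estimate, and the
    two estimates are fused into the final relative positional relationship."""
    r_first = binocular_estimator(first_partial, second_partial)   # step 601 inputs
    r_second = monocular_estimator(img_t1, img_t2)                 # step 602 inputs
    return fuse(r_first, r_second)                                 # step 603
```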
  • the method of the embodiments of the present disclosure can be used to calculate the relative positional relationship between a movable platform, such as an unmanned aerial vehicle, an unmanned vehicle or a mobile robot, and objects in the surrounding environmental area, and then to calculate the position information of those objects.
  • one of the first visual sensor and the second visual sensor may be arranged on the top of the fuselage of the unmanned aerial vehicle, and the other on the bottom of the UAV's fuselage.
  • alternatively, one of the first visual sensor and the second visual sensor is disposed on a first side of the UAV and the other on a second side of the UAV, the first side and the second side being arranged opposite each other. For example, the first side may be the left side of the UAV and the second side the right side of the UAV; or the first side may be the front side of the UAV and the second side the rear side of the UAV.
  • the unmanned aerial vehicle may be the aerial vehicle in any of the foregoing embodiments.
  • when the movable platform is an unmanned vehicle, the first visual sensor and the second visual sensor can be respectively arranged at the two vehicle lamps of the unmanned vehicle, or respectively arranged on the two sides of the windshield.
  • the first vision sensor and the second vision sensor may be respectively arranged at positions where two eyes of the movable robot are located.
  • the above arrangements of the first visual sensor and the second visual sensor enable the overlapping visible range of the two visual sensors to cover the area in the moving direction of the movable platform as much as possible, thereby improving the perception accuracy for objects in the moving direction of the movable platform.
  • the movable platform can also be another type of device capable of autonomous movement, and the installation positions of the first visual sensor and the second visual sensor can be set according to the type of the movable platform and/or other factors, which are not described here one by one.
  • Both the first vision sensor and the second vision sensor can be used independently as a monocular vision sensor, so as to calculate the depth information of the object based on a monocular algorithm.
  • the first vision sensor and the second vision sensor can form a pair of non-rigid binoculars, so as to calculate the depth information of the object based on the binocular algorithm.
  • the first visual sensor may be any visual sensor on the movable platform. Taking the movable platform as an example of an unmanned aerial vehicle, the first visual sensor may be a visual sensor arranged on the top of the fuselage of the unmanned aerial vehicle, or a visual sensor arranged on the bottom of the fuselage of the unmanned aerial vehicle. sensor.
  • the present disclosure does not limit the resolution of the first visual sensor and the second visual sensor.
  • the first visual sensor and the second visual sensor may use visual sensors with a resolution of about 1280 ⁇ 960. If the resolution of the vision sensor is too low, the resolution of the collected image is too low, and it is difficult to accurately identify the features of the object in the image, thus affecting the accuracy of the processing results. If the resolution of the vision sensor is too high, it will be very sensitive to the interference caused by the non-rigid connection of the two vision sensors. Therefore, using a vision sensor with this resolution can effectively balance image clarity and anti-interference.
  • At least one of the first vision sensor and the second vision sensor may employ a fisheye camera. Since the observation range of a single fisheye camera is relatively large (greater than 180 degrees), omnidirectional observation can be achieved by setting two fisheye cameras, which has the advantages of simple configuration, low cost, and light weight.
  • the area of the overlapping field of view of the first and second vision sensors is smaller than the area of the non-overlapping field of view.
  • in the related art, the area where depth information can be acquired is generally enlarged by increasing the overlapping visible range of the binocular vision sensors; therefore, the binocular overlapping area is generally larger than the non-overlapping area (as shown in FIG. 1).
  • in contrast, the area of the overlapping visible range of the two visual sensors in the embodiments of the present disclosure may be smaller than the area of the non-overlapping visible range, and by adopting a processing method different from that in the related art, depth information over a larger range can still be obtained. The processing manner of the embodiments of the present disclosure is described in detail below.
  • the first partial image and the second partial image may be images collected at the same time.
  • the first partial image can be collected in real time by the first visual sensor on the movable platform, and the second partial image can be collected in real time by the second visual sensor on the movable platform; based on the first partial image and the second partial image collected in real time, the relative positional relationship between objects in the environmental area around the movable platform and the movable platform is determined in real time.
  • the relative positional relationship between objects in the environmental area around the movable platform and the movable platform may also be determined by acquiring a first partial image and a second partial image collected at a historical moment. Since the first partial image and the second partial image are both images within the overlapping visible range, the pixels in the first partial image and the second partial image correspond one to one, and corresponding points in the two partial images correspond to the same object point in physical space.
  • the pose of the first visual sensor in space at the first moment being different from its pose at the second moment may result from a change in the pose of the movable platform itself, which causes the pose of the first visual sensor to change, or the pose of the movable platform itself may remain unchanged while only the pose of the first visual sensor changes.
  • the first moment and the second moment are different moments, for example, the first moment may be the current moment, and the second moment may be a historical moment before the current moment. For another example, the first moment and the second moment are different historical moments before the current moment.
  • the first partial image and the second partial image may both be images collected at the first moment.
  • the IMU may be used to determine the pose of the first visual sensor at the first moment and the pose at the second moment, so as to determine the pose relationship between the poses at the two moments.
  • the pose of the first visual sensor may also be determined based on wheel speed information and positioning information of the vehicle.
  • the IMU can be directly installed on the first visual sensor, so that the output result of the IMU can be directly determined as the pose of the first visual sensor.
  • the IMU can be installed on the fuselage of the aircraft, so that the pose of the first visual sensor can be determined through the output of the IMU and the pose relationship between the aircraft and the first visual sensor.
  • for example, when the movable platform is an excavator and the first visual sensor is installed on its arm, the pose of the excavator body can be obtained through an IMU installed on the excavator body, the pose relationship between the body and the arm can be determined based on the rotation angle and extension of the motors, and the pose of the first visual sensor can then be determined.
  • in some embodiments, a first relative positional relationship between the object and the movable platform may be determined based on the first partial image and the second partial image; a second relative positional relationship between the object and the movable platform may be determined based on the image collected at the first moment and the image collected at the second moment; and the relative positional relationship between the object and the movable platform may then be determined based on the first relative positional relationship and the second relative positional relationship.
  • the process of determining the first relative positional relationship based on the first partial image and the second partial image can be realized with a binocular algorithm, and the process of determining the second relative positional relationship based on the image collected at the first moment and the image collected at the second moment can be realized with a monocular algorithm.
  • the calculation process of the relative positional relationship between the objects within the overlapping visible range and the aircraft is different from the calculation process of the relative positional relationship between the objects within the non-overlapping visible range and the aircraft.
  • the overlapping visual range includes a partial area on the cup
  • the image collected by the first visual sensor may be input into a pre-trained convolutional neural network, and the semantic information of each point on the image is output through the convolutional neural network.
  • the process of determining the relative positional relationship between the target object and the movable platform can be realized based on binocular algorithm.
  • when the second relative positional relationship satisfies the geometric constraint condition corresponding to the first object, the second relative positional relationship may be determined as the relative positional relationship between the first object and the movable platform; when the second relative positional relationship does not satisfy the geometric constraint condition, the first relative positional relationship may be determined as the relative positional relationship between the first object and the movable platform.
  • the geometric constraint may be a geometric positional relationship between points on the first object.
  • the depth information of adjacent points on the same object generally changes smoothly, that is, the difference between the depth information of adjacent points on the same object is generally smaller than the preset depth difference threshold.
  • the depth information of each point on the first object can be calculated in the manner described above; if the difference between the depth information of adjacent points is greater than the depth difference threshold, the geometric constraint corresponding to the first object is considered not satisfied, and only when the difference between the depth information of adjacent points is less than or equal to the depth difference threshold is the geometric constraint condition corresponding to the first object considered satisfied.
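  • A minimal sketch of the smoothness check described above; the depth-difference threshold is the free parameter mentioned in the text, with an arbitrary placeholder value here.

```python
import numpy as np

def satisfies_geometric_constraint(depth_on_object: np.ndarray,
                                   depth_diff_threshold_m: float = 0.2) -> bool:
    """True when the depth of adjacent points on the same object changes smoothly,
    i.e. every neighbouring difference stays within the threshold (0.2 m is a placeholder)."""
    d = np.asarray(depth_on_object, dtype=np.float64)
    diffs = [np.abs(np.diff(d, axis=axis)) for axis in range(d.ndim)]
    return all(step.size == 0 or float(step.max()) <= depth_diff_threshold_m for step in diffs)
```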
  • the depth information acquired by the monocular algorithm for moving objects is less robust.
  • the relative positional relationship determined based on semantics is not constrained by the physical model, and its robustness is poor.
  • through the above method, the embodiments of the present disclosure let the two algorithms complement each other: when the result of one algorithm does not satisfy the constraint conditions, the relative positional relationship acquired by the other algorithm can be used, thereby improving the robustness of the finally determined relative positional relationship.
  • the first relative positional relationship between the second object and the movable platform may be determined directly based on the first partial image and the second partial image. Since the second object is within the overlapping visible range, the depth information of the second object can be obtained directly from the two partial images by a binocular algorithm, with high precision and robustness. At the same time, a second relative positional relationship between the second object and the movable platform may also be acquired based on the image collected at the first moment and the image collected at the second moment.
  • when a preset condition is met, the first relative positional relationship may be determined as the relative positional relationship between the second object and the movable platform; if the preset condition is not met, the second relative positional relationship may be determined as the relative positional relationship between the second object and the movable platform. The preset condition includes: the depth of the second object is less than a preset depth threshold, and the confidence of the first relative positional relationship is greater than a preset confidence threshold.
  • the accuracy of the binocular algorithm is high, but because the baseline length of the binocular system is limited, it cannot observe objects at long distances. Therefore, when the depth of the second object is greater than or equal to the preset depth threshold, the reliability of the first relative positional relationship is considered low, and the second relative positional relationship can be determined as the relative positional relationship between the second object and the movable platform. Likewise, in the case of occlusion and similar situations, the confidence of the first relative positional relationship obtained by the binocular algorithm may not be high, so when the confidence of the first relative positional relationship is low, the second relative positional relationship is determined as the relative positional relationship between the second object and the movable platform. In this way, the accuracy and reliability of the output results can be improved.
  • the first relative positional relationship is obtained by processing the first partial image and the second partial image through a first neural network, and the second relative positional relationship is obtained by processing, through a second neural network, the image collected by the first visual sensor at the first moment and the image collected at the second moment.
  • in this way, the images are processed directly by the neural networks to output the first relative positional relationship and the second relative positional relationship; the processing procedure is simple and of low complexity, and because a neural network can understand and reason about the environment, the receptive field of visual perception is improved.
  • the first neural network and/or the second neural network may be a convolutional neural network, or other types of neural networks.
  • the types of the first neural network and the second neural network may be the same or different.
  • the first neural network is obtained by training on a first partial sample image of the first visual sensor within the overlapping visible range and a second partial sample image of the second visual sensor within the overlapping visible range; the second neural network is obtained by training on a sample image collected by the first visual sensor at a third moment and a sample image collected at a fourth moment, where the pose of the first visual sensor in space at the third moment is different from its pose in space at the fourth moment.
  • the processing procedure of the first neural network includes:
  • the CNN may process the pixel gray values of the first partial image and the second partial image into feature descriptions.
  • Costvolume: a projection cost for projecting the first partial image to the second partial image is obtained based on the feature description of the first partial image and the feature description of the second partial image.
  • the Costvolume can be calculated with the Plane Sweeping algorithm: for the feature descriptions F1 and F2 obtained in the previous step, F2 is shifted by different disparities along the direction of the binocular baseline to obtain F2', and a dot-product-and-sum operation is performed on F1 and F2' to obtain the corresponding Costvolume after each displacement.
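  • The cost-volume construction of this step can be sketched as follows: F1 and F2 are (C, H, W) feature maps, F2 is shifted along the (assumed horizontal) baseline direction by each candidate disparity, and the shifted map is correlated with F1 by a channel-wise dot product, as the text describes.

```python
import numpy as np

def stereo_cost_volume(F1: np.ndarray, F2: np.ndarray, max_disparity: int) -> np.ndarray:
    """Correlation cost volume: for each candidate disparity d, shift F2 by d pixels
    along the baseline direction and take the channel-wise dot product with F1.
    Returns an array of shape (max_disparity, H, W)."""
    C, H, W = F1.shape
    cost = np.zeros((max_disparity, H, W), dtype=np.float32)
    for d in range(max_disparity):
        shifted = np.zeros_like(F2)
        shifted[:, :, d:] = F2[:, :, :W - d]    # shift along the image width by d pixels
        cost[d] = np.sum(F1 * shifted, axis=0)  # dot product over the channel axis
    return cost
```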
  • the Costvolume is fused by convolution. In order to increase the receptive field, the Costvolume needs to be down-sampled and up-sampled during the convolution process.
  • the normalization process can be realized by arg softmax operation.
  • the initial depth map includes probability values for each point in the scene at different depths.
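  • The "arg softmax" normalisation mentioned above is commonly realised as a soft-argmax over the disparity (or depth) axis; a sketch under that assumption, taking the expectation over the per-candidate probabilities to obtain the initial estimate:

```python
import numpy as np

def soft_argmax_disparity(cost: np.ndarray) -> np.ndarray:
    """Turn a (D, H, W) cost volume into a per-pixel disparity map: softmax over the
    disparity axis gives a probability for each candidate, and the expected
    disparity is taken as the regressed value."""
    prob = np.exp(cost - cost.max(axis=0, keepdims=True))
    prob /= prob.sum(axis=0, keepdims=True)
    disparities = np.arange(cost.shape[0], dtype=np.float32).reshape(-1, 1, 1)
    return np.sum(prob * disparities, axis=0)   # shape (H, W)
```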
  • a confidence map output by the first neural network may also be acquired, where the confidence map is used to indicate the confidence degree of the first relative position relationship.
  • the greater the confidence, the more reliable the first relative positional relationship.
  • the pose relationship between the pose of the first visual sensor at the first moment and the pose of the first visual sensor at the second moment is optimized, and the optimized pose relationship is used to determine the second relative positional relationship.
  • the depth information obtained based on semantic completion does not have the physical constraints of binocular disparity.
  • on the one hand, constraints can thus be added to the depth information obtained by semantic completion; on the other hand, the binocular structure has difficulty handling scenes parallel to the baseline, and by using the confidence map such areas can be identified and the depth information obtained by the monocular algorithm used as the depth information for them.
  • FIG. 8 is a schematic diagram of the processing procedure of the second neural network.
  • the processing procedure of the second neural network is similar to that of the first neural network; the differences between the two are explained below, and the parts that are the same are not repeated here.
  • step (3): calculating the feature description F3 of the image collected at the first moment and the feature description F4 of the image collected at the second moment.
  • the calculation method is the same as step (2) of the first neural network, and CNN can be used to calculate the feature description.
  • Costvolume: for the features F3 and F4, the Plane Sweeping algorithm is used to project F4 into the coordinate system of F3 at different depths to obtain the projected feature F4'; F3 and F4' are concatenated to form the Costvolume.
  • the processing method is the same as step (3) of the first neural network.
  • the processing method is the same as step (5) of the first neural network.
  • for each point, the binocular disparity, the monocular disparity and the binocular confidence of the point can be obtained. If the point simultaneously satisfies the following conditions: (1) the point is within the binocular overlap area, (2) the binocular depth of the point (the depth calculated by the binocular method) is less than the preset depth threshold d1 (for example, 20 m), and (3) the binocular depth confidence of the point is greater than the preset confidence threshold c1, then the binocular disparity is used as the actual disparity of the point.
  • otherwise, it is determined whether the monocular depth (the depth calculated by the monocular method) satisfies the geometric constraint. If it does, the monocular disparity is used as the actual disparity of the point; if not, the binocular disparity is still used as the actual disparity of the point.
  • the binocular disparity may be directly determined based on partial images of overlapping regions, or may be inferred based on images of overlapping regions and semantic information, and disparity of non-overlapping regions. Since disparity, depth and position can be converted to each other, the above disparity can also be replaced by depth or position information.
  • the determination order of the above conditions is not limited to what is shown in the figure, for example, the condition (2) may be determined first, then the condition (3), and finally the condition (1) may be determined. Since the accuracy of binocular parallax is generally high, binocular parallax can be used preferentially if the above three conditions are met.
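  • Putting the conditions of this judging strategy together, the sketch below picks, for each point, between the binocular and monocular results; d1 = 20 m follows the example in the text, while the confidence threshold value and the geometric-constraint test (the smoothness check sketched earlier) are placeholders.

```python
def choose_disparity(in_overlap: bool, binocular_disp: float, monocular_disp: float,
                     binocular_depth_m: float, binocular_conf: float,
                     mono_meets_geometric_constraint: bool,
                     d1_m: float = 20.0, c1: float = 0.8) -> float:
    """Per-point fusion: use the binocular disparity when the point is in the overlap
    area, close enough (depth < d1) and confident enough (conf > c1); otherwise use
    the monocular disparity if it satisfies the geometric constraint, else fall back
    to the binocular disparity. c1 = 0.8 is a placeholder value."""
    if in_overlap and binocular_depth_m < d1_m and binocular_conf > c1:
        return binocular_disp
    return monocular_disp if mono_meets_geometric_constraint else binocular_disp
```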
  • the embodiments of the present disclosure fully fuse the monocular depth and the binocular depth according to information such as the extent of the overlapping area, the magnitude of the pose change of the visual sensor, the binocular confidence map, and the observation distance.
  • with the judging strategy of the embodiments of the present disclosure, when one method fails, its result can be quickly supplemented by the result calculated with the other method, realizing a dynamic switching strategy and making the system more stable and robust.
  • the absolute position information of the object in the space may also be determined according to the relative position relationship between the object and the movable platform.
  • the absolute position information may be the absolute position coordinates of the object in a preset coordinate system (eg, the coordinate system of the movable platform or the world coordinate system). For example, when the preset coordinate system is the world coordinate system, the longitude and latitude height information of the object can be obtained.
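  • A minimal sketch of the conversion from the relative position of an object (expressed in the movable-platform frame) to absolute coordinates in a preset world frame, assuming the platform pose is available as a 4x4 homogeneous matrix:

```python
import numpy as np

def to_absolute_position(p_in_platform: np.ndarray, T_world_platform: np.ndarray) -> np.ndarray:
    """Transform a 3-D point from the movable-platform frame into the world frame
    using the platform's 4x4 pose matrix."""
    p_h = np.append(p_in_platform, 1.0)           # homogeneous coordinates
    return (T_world_platform @ p_h)[:3]
```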
  • the moving direction of the movable platform can be determined, and the relative positional relationship between objects in the space where the movable platform is located and the movable platform can be determined based on a first image area related to the moving direction in the first partial image, a second image area related to the moving direction in the second partial image, a third image area related to the moving direction in the image collected at the first moment, and a fourth image area related to the moving direction in the image collected at the second moment.
  • the above-mentioned respective image areas related to the moving direction may be image areas including areas in the moving direction.
  • taking an unmanned vehicle as an example, the visible range of a visual sensor installed on the unmanned vehicle may include the area directly in front of the unmanned vehicle, the area on its left side and the area on its right side. Therefore, the first image area, which includes the area directly in front of the unmanned vehicle, can be segmented from the first partial image. The segmentation of the other image areas is similar and is not repeated here. Determining the relative positional relationship between objects and the movable platform based on the segmented image areas can reduce computing power consumption and improve processing efficiency.
  • in some embodiments, an update frequency of the relative positional relationship is determined based on the moving speed of the movable platform and the relative positional relationship, and the relative positional relationship is updated at that frequency. For example, when the moving speed of the movable platform is relatively slow, and/or when the distance between the movable platform and the position of the object corresponding to the relative positional relationship is relatively long, the relative positional relationship is updated at a lower frequency; when the moving speed of the movable platform is fast, and/or when that distance is relatively short, the relative positional relationship is updated at a higher frequency. By dynamically adjusting the update frequency of the relative positional relationship, the safety of the movable platform can be balanced against the resource consumption of acquiring the relative positional relationship.
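  • One possible realisation of the dynamic update rate described above; the frequency bounds and the time-to-contact heuristic are placeholders rather than values from the disclosure.

```python
def update_frequency_hz(speed_m_s: float, nearest_object_depth_m: float,
                        base_hz: float = 5.0, max_hz: float = 30.0) -> float:
    """Faster updates when the platform moves quickly or obstacles are close;
    slower updates when it moves slowly and obstacles are far away."""
    time_to_contact_s = nearest_object_depth_m / max(speed_m_s, 1e-3)
    if time_to_contact_s < 1.0:
        return max_hz
    # Scale inversely with time-to-contact, clamped between base_hz and max_hz
    return min(max_hz, max(base_hz, max_hz / time_to_contact_s))
```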
  • an embodiment of the present disclosure also provides an image processing device, including a processor, the device being applied to a movable platform that includes a first visual sensor and a second visual sensor, where the first visible range of the first visual sensor partially overlaps the second visible range of the second visual sensor, and the processor is configured to perform the following steps:
  • acquiring a first partial image of the first visual sensor within the overlapping visible range, and acquiring a second partial image of the second visual sensor within the overlapping visible range; acquiring the image collected by the first visual sensor at a first moment and the image collected at a second moment, where the pose of the first visual sensor in space at the first moment is different from its pose in space at the second moment; and determining, based on the first partial image, the second partial image, the image collected at the first moment and the image collected at the second moment, the relative positional relationship between objects in the space where the movable platform is located and the movable platform.
  • the processor is specifically configured to: determine a first relative positional relationship between the object and the movable platform based on the first partial image and the second partial image; determine a second relative positional relationship between the object and the movable platform based on the image collected at the first moment and the image collected at the second moment; and determine the relative positional relationship between the object and the movable platform based on the first relative positional relationship and the second relative positional relationship.
  • the processor is specifically configured to: determine, based on the first partial image and the second partial image, the relative positional relationship between the movable platform and a target object having the same semantic information as the first object within the overlapping visible range; and determine a first relative positional relationship between the first object and the movable platform based on the relative positional relationship between the target object and the movable platform.
  • the processor is specifically configured to: if the second relative positional relationship satisfies the geometric constraint condition corresponding to the first object, determine the second relative positional relationship as the relative positional relationship between the first object and the movable platform; and if the second relative positional relationship does not satisfy the geometric constraint condition, determine the first relative positional relationship as the relative positional relationship between the first object and the movable platform.
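Elsewhere in this disclosure the geometric constraint is illustrated as depth smoothness between adjacent points on the same object, i.e. the depth difference of neighbouring points staying below a preset depth-difference threshold. Under that reading only, a minimal check might look as follows; the 0.5 m threshold and the dense depth-patch layout are assumptions.

```python
import numpy as np

def satisfies_geometric_constraint(depth_patch: np.ndarray,
                                   max_step_m: float = 0.5) -> bool:
    """Adjacent points on the same object should have smoothly varying depth."""
    dz_x = np.abs(np.diff(depth_patch, axis=1))  # horizontal neighbours
    dz_y = np.abs(np.diff(depth_patch, axis=0))  # vertical neighbours
    steps = np.concatenate([dz_x.ravel(), dz_y.ravel()])
    return bool(steps.size == 0 or steps.max() <= max_step_m)
```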
  • the processor is specifically configured to: determine a first relative positional relationship between the second object and the movable platform based on the first partial image and the second partial image.
  • the processor is specifically configured to: determine the first relative positional relationship as the relative positional relationship between the second object and the movable platform when a preset condition is met, and determine the second relative positional relationship as the relative positional relationship between the second object and the movable platform when the preset condition is not met; the preset condition includes: the depth of the second object is less than a preset depth threshold, and the confidence of the first relative positional relationship is greater than a preset confidence threshold.
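As a hedged illustration of this selection rule only: the 20 m depth threshold mirrors the example value mentioned later in this disclosure, while the 0.7 confidence threshold and the scalar per-point inputs are assumptions.

```python
def fuse_depth(stereo_depth: float, stereo_conf: float, mono_depth: float,
               depth_thresh_m: float = 20.0, conf_thresh: float = 0.7) -> float:
    """Prefer the two-sensor (stereo) estimate for close, confident points,
    otherwise fall back to the two-moment (monocular) estimate."""
    if stereo_depth < depth_thresh_m and stereo_conf > conf_thresh:
        return stereo_depth   # first relative positional relationship
    return mono_depth         # second relative positional relationship
```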
  • the first relative positional relationship is obtained by processing the first partial image and the second partial image through a first neural network, and the second relative positional relationship is obtained by processing the image collected at the first moment and the image collected at the second moment through a second neural network.
  • the first neural network obtains the first relative positional relationship as follows: performing feature extraction on the first partial image and the second partial image respectively to obtain a feature description of the first partial image and a feature description of the second partial image; obtaining, based on the two feature descriptions, the projection cost of projecting the first partial image onto the second partial image; aggregating the projection cost to obtain an aggregated projection cost; normalizing the aggregated projection cost to obtain an initial relative positional relationship; and acquiring the first relative positional relationship based on the initial relative positional relationship and the semantic information of the target object.
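The following PyTorch-style sketch is meant only to make the listed steps concrete (shared feature extraction, plane-sweep projection cost, aggregation, normalization, semantic refinement). The channel counts, the number of disparity hypotheses, the wrap-around shift used in the sweep and the way semantic features are concatenated are simplifying assumptions, not details taken from the disclosure.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class StereoDepthSketch(nn.Module):
    """Feature extraction -> plane-sweep projection cost -> aggregation ->
    normalization -> semantic-guided refinement (all heavily simplified)."""

    def __init__(self, num_disp: int = 48, feat_ch: int = 32):
        super().__init__()
        self.num_disp = num_disp
        # Shared (weight-tied) feature extractor applied to both partial images.
        self.features = nn.Sequential(
            nn.Conv2d(3, feat_ch, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(feat_ch, feat_ch, 3, padding=1), nn.ReLU(inplace=True))
        # 3D convolutions aggregate the cost volume over neighbouring positions.
        self.aggregate = nn.Sequential(
            nn.Conv3d(1, 8, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv3d(8, 1, 3, padding=1))
        # Refinement head mixing the initial result with semantic feature maps.
        self.refine = nn.Conv2d(1 + feat_ch, 1, 3, padding=1)

    def forward(self, img1: torch.Tensor, img2: torch.Tensor) -> torch.Tensor:
        f1, f2 = self.features(img1), self.features(img2)       # shared weights
        # Plane sweep: shift f2 along the (assumed horizontal) baseline and
        # correlate with f1; wrap-around at the border is ignored in this sketch.
        costs = []
        for d in range(self.num_disp):
            f2_shift = torch.roll(f2, shifts=d, dims=-1)
            costs.append((f1 * f2_shift).sum(dim=1, keepdim=True))  # dot product
        cost_volume = torch.stack(costs, dim=2)              # B x 1 x D x H x W
        cost_volume = self.aggregate(cost_volume)            # aggregated cost
        prob = F.softmax(cost_volume.squeeze(1), dim=1)      # normalize over D
        disp = torch.arange(self.num_disp, device=prob.device,
                            dtype=prob.dtype).view(1, -1, 1, 1)
        init = (prob * disp).sum(dim=1, keepdim=True)        # soft arg-max
        # Semantic information carried by the feature map is used to complete
        # the estimate outside the overlapping range (loosely modelled here).
        return self.refine(torch.cat([init, f1], dim=1))
```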
  • the first neural network is trained based on a first partial sample image of the first visual sensor within the overlapping visual range and a second partial sample image of the second visual sensor within the overlapping visual range; the second neural network is trained based on a sample image collected by the first visual sensor at a third moment and a sample image collected at a fourth moment, where the pose of the first visual sensor in space at the third moment is different from its pose in space at the fourth moment.
  • the processor is specifically configured to: determine, based on the pose of the first visual sensor in space at the first moment and the pose of the first visual sensor in space at the second moment, the distance between the position of the first visual sensor at the first moment and its position at the second moment; and if the distance is less than a preset value, determine the relative positional relationship between the object in the space where the movable platform is located and the movable platform based on the first partial image, the second partial image, the image collected at the first moment and the image collected at the second moment.
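For example, the keyframe gate might be implemented as below. The 3D Euclidean distance and the 2 m preset value are assumptions for illustration, and the separately stated requirement that the two poses differ is left out of this check.

```python
import math

def within_keyframe_baseline(pos_t1, pos_t2, preset_value_m: float = 2.0) -> bool:
    """Check the stated condition: the distance between the sensor positions at
    the first and second moments must be smaller than a preset value."""
    dx, dy, dz = (a - b for a, b in zip(pos_t1, pos_t2))
    return math.sqrt(dx * dx + dy * dy + dz * dz) < preset_value_m
```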
  • the processor is further configured to: acquire a confidence map output by the first neural network, the confidence map being used to indicate the confidence of the first relative positional relationship; and optimize, based on the first relative positions whose confidence is greater than a preset confidence threshold, the pose relationship between the pose of the first visual sensor at the first moment and the pose of the first visual sensor at the second moment, where the optimized pose relationship is used to determine the second relative positional relationship.
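The sketch below shows only the confidence-based selection of anchor points; the actual pose optimisation is not shown, and the 0.8 threshold and dense array layout are assumptions.

```python
import numpy as np

def select_anchor_points(depth_map: np.ndarray, confidence_map: np.ndarray,
                         conf_thresh: float = 0.8):
    """Keep only the first-relative-position estimates trusted enough to help
    refine the pose relationship between the first and second moments."""
    ys, xs = np.nonzero(confidence_map > conf_thresh)
    return np.stack([xs, ys], axis=1), depth_map[ys, xs]
```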
  • the movable platform is an unmanned aerial vehicle, and one of the first visual sensor and the second visual sensor is arranged on the top of the fuselage of the unmanned aerial vehicle while the other is arranged on the bottom of the fuselage; or the movable platform is an unmanned aerial vehicle, and one of the first visual sensor and the second visual sensor is arranged on a first side of the unmanned aerial vehicle while the other is arranged on a second side of the unmanned aerial vehicle, the first side being opposite to the second side; or the movable platform is an unmanned vehicle, and the first visual sensor and the second visual sensor are respectively arranged on the two vehicle lamps of the unmanned vehicle, or respectively arranged on the two sides of the windshield; or the movable platform is a movable robot, and the first visual sensor and the second visual sensor are respectively arranged at the positions of the two eyes of the movable robot.
  • At least one of the first visual sensor and the second visual sensor is a fisheye camera.
  • the area of the overlapping field of view of the first and second vision sensors is smaller than the area of the non-overlapping field of view.
  • the processor is further configured to: determine the absolute position information of the object in the space according to the relative positional relationship between the object and the movable platform.
  • the processor is specifically configured to: determine the moving direction of the movable platform; and determine the relative positional relationship between the object in the space where the movable platform is located and the movable platform based on a first image area related to the moving direction in the first partial image, a second image area related to the moving direction in the second partial image, a third image area related to the moving direction in the image collected at the first moment, and a fourth image area related to the moving direction in the image collected at the second moment.
  • the processor is further configured to: determine the update frequency of the relative positional relationship based on the moving speed of the movable platform and the relative positional relationship, and update the relative positional relationship based on the update frequency.
  • FIG. 10 shows a schematic diagram of a hardware structure of an image processing device, which may include: a processor 1001 , a memory 1002 , an input/output interface 1003 , a communication interface 1004 and a bus 1005 .
  • the processor 1001 , the memory 1002 , the input/output interface 1003 and the communication interface 1004 are connected to each other within the device through the bus 1005 .
  • the processor 1001 can be implemented by a general-purpose CPU (Central Processing Unit, central processing unit), a microprocessor, an application-specific integrated circuit (Application Specific Integrated Circuit, ASIC), or one or more integrated circuits, and is used to execute related programs to realize the technical solutions provided by the embodiments of this specification.
  • the processor 1001 may also include a graphics card, and the graphics card may be an Nvidia titan X graphics card or a 1080Ti graphics card.
  • the memory 1002 can be implemented in the form of ROM (Read Only Memory, read-only memory), RAM (Random Access Memory, random access memory), static storage device, dynamic storage device, etc.
  • the memory 1002 can store operating systems and other application programs. When implementing the technical solutions provided by the embodiments of this specification through software or firmware, the relevant program codes are stored in the memory 1002 and invoked by the processor 1001 for execution.
  • the input/output interface 1003 is used to connect the input/output module to realize information input and output.
  • the input/output module can be configured in the device as a component (not shown in the figure), or can be externally connected to the device to provide corresponding functions.
  • the input device may include a keyboard, mouse, touch screen, microphone, various sensors, etc.
  • the output device may include a display, a speaker, a vibrator, an indicator light, and the like.
  • the communication interface 1004 is used to connect a communication module (not shown in the figure), so as to realize the communication interaction between the device and other devices.
  • the communication module can realize communication through wired means (such as USB, network cable, etc.), and can also realize communication through wireless means (such as mobile network, WIFI, Bluetooth, etc.).
  • Bus 1005 includes a path for transferring information between various components of the device (eg, processor 1001, memory 1002, input/output interface 1003, and communication interface 1004).
  • although the above device only shows the processor 1001, the memory 1002, the input/output interface 1003, the communication interface 1004 and the bus 1005, in a specific implementation the device may also include other components necessary for normal operation.
  • the above-mentioned device may only include components necessary to implement the solutions of the embodiments of this specification, and does not necessarily include all the components shown in the figure.
  • an embodiment of the present disclosure also provides a movable platform, including:
  • a first visual sensor 1101 and a second visual sensor 1102, respectively used to collect images of the environment area around the movable platform, where the first visual range of the first visual sensor 1101 partially overlaps with the second visual range of the second visual sensor 1102; and an image processing device 1103.
  • the image processing device 1103 may adopt the image processing device described in any one of the above-mentioned embodiments, and the specific details of the image processing device can be found in the above-mentioned embodiments, and will not be repeated here.
  • the solutions of the embodiments of the present disclosure can also be used in VR glasses.
  • in this application scenario, the first visual sensor and the second visual sensor are respectively arranged on the left frame and the right frame of the VR glasses.
  • VR glasses can perceive objects in the real scene, and then render virtual scene objects based on the real objects. For example, there is a table at a certain position in front of the user, and a virtual doll model can be rendered on the table. By sensing the distance between the objects in the user's space and the user, virtual scene objects can be rendered at appropriate positions.
  • an embodiment of the present disclosure further provides an image processing system, including a movable platform on which a first visual sensor and a second visual sensor are installed, where the first visual range of the first visual sensor partially overlaps with the second visual range of the second visual sensor; and a remote control device, where the remote control device includes a processor configured to execute the method described in any embodiment of the present disclosure.
  • the embodiment of this specification also provides a computer-readable storage medium, on which several computer instructions are stored, and when the computer instructions are executed, the steps of the method described in any embodiment are implemented.
  • Embodiments of the present description may take the form of a computer program product embodied on one or more storage media (including but not limited to magnetic disk storage, CD-ROM, optical storage, etc.) having program code embodied therein.
  • Computer usable storage media include permanent and non-permanent, removable and non-removable media, and may implement information storage by any method or technology.
  • Information may be computer readable instructions, data structures, modules of a program, or other data.
  • Examples of storage media for computers include, but are not limited to: phase change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read only memory (ROM), Electrically Erasable Programmable Read-Only Memory (EEPROM), Flash memory or other memory technology, Compact Disc Read-Only Memory (CD-ROM), Digital Versatile Disc (DVD) or other optical storage, Magnetic tape cartridge, tape magnetic disk storage or other magnetic storage device or any other non-transmission medium that can be used to store information that can be accessed by a computing device.

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Aviation & Aerospace Engineering (AREA)
  • Radar, Positioning & Navigation (AREA)
  • Remote Sensing (AREA)
  • Automation & Control Theory (AREA)
  • Theoretical Computer Science (AREA)
  • Image Analysis (AREA)

Abstract

一种飞行器、图像处理方法和装置、可移动平台,飞行器包括第一视觉传感器(301)和第二视觉传感器(302);第一视觉传感器(301)的第一可视范围与第二视觉传感器(302)的第二可视范围部分重叠,其中,重叠的可视范围包括飞行器的侧部的环境区域,侧部包括第一侧和第二侧,第一侧和第二侧相对设置;第一可视范围包括飞行器顶部的环境区域;第二可视范围包括飞行器底部的环境区域;第一视觉传感器和第二视觉传感器采集的图像用于计算环境区域中的物体的位置信息,物体的位置信息用于控制飞行器在空间中运动。

Description

飞行器、图像处理方法和装置、可移动平台 技术领域
本公开涉及计算机视觉技术领域,具体而言,涉及飞行器、图像处理方法和装置、可移动平台。
背景技术
在可移动平台自主行驶的过程中,为了保证可移动平台的安全性,常常借助视觉传感器来对可移动平台周围的物体进行感知。视觉传感器在可移动平台上的部署方式以及基于视觉传感器采集的图像的处理方式对感知结果将会产生较大的影响。因此,有必要对视觉传感器的部署方式和图像处理方式中的至少一者进行改进。
发明内容
第一方面,本公开实施例提供一种飞行器,所述飞行器包括第一视觉传感器和第二视觉传感器;所述第一视觉传感器的第一可视范围与所述第二视觉传感器的第二可视范围部分重叠,其中,重叠的可视范围包括所述飞行器的周围的环境区域,所述环境区域包括所述无人飞行器的两个相背对部位朝向的区域;所述第一视觉传感器和所述第二视觉传感器采集的图像用于计算环境区域中的物体的位置信息,所述物体的位置信息用于控制所述飞行器在空间中运动。
第二方面,本公开实施例提供一种图像处理方法,所述方法应用于可移动平台,所述可移动平台包括第一视觉传感器和第二视觉传感器,所述第一视觉传感器的第一可视范围与所述第二视觉传感器的第二可视范围部分重叠,所述方法包括:获取所述第一视觉传感器在重叠的可视范围内的第一局部图像,获取所述第二视觉传感器在重叠的可视范围内的第二局部图像;获取所述第一视觉传感器在第一时刻采集的图像和在第二时刻采集的图像,所述第一视觉传感器在所述第一时刻时在空间中的位姿不同于在所述第二时刻时在空间中的位姿;基于所述第一局部图像、所述第二局部图像,所述第一时刻采集的图像和所述第二时刻采集的图像,确定所述可移动平台所在空间中的物体与所述可移动平台的相对位置关系。
第三方面,本公开实施例提供一种图像处理装置,包括处理器,所述装置应用于 可移动平台,所述可移动平台包括第一视觉传感器和第二视觉传感器,所述第一视觉传感器的第一可视范围与所述第二视觉传感器的第二可视范围部分重叠,所述处理器用于执行以下步骤:获取所述第一视觉传感器在重叠的可视范围内的第一局部图像,获取所述第二视觉传感器在重叠的可视范围内的第二局部图像;获取所述第一视觉传感器在第一时刻采集的图像和在第二时刻采集的图像,所述第一视觉传感器在所述第一时刻时在空间中的位姿不同于在所述第二时刻时在空间中的位姿;基于所述第一局部图像、所述第二局部图像,所述第一时刻采集的图像和所述第二时刻采集的图像,确定所述可移动平台所在空间中的物体与所述可移动平台的相对位置关系。
第四方面,本公开实施例提供一种可移动平台,包括:第一视觉传感器和第二视觉传感器,分别用于采集所述可移动平台周围的环境区域的图像,所述第一视觉传感器的第一可视范围与所述第二视觉传感器的第二可视范围部分重叠;以及任一实施例所述的图像处理装置。
应用本公开实施例方案,在飞行器上安装可视范围部分重叠的第一视觉传感器和第二视觉传感器,由于重叠的可视范围包括所述飞行器的周围的环境区域,且所述环境区域包括所述飞行器的两个相背对部位朝向的区域,从而能够获取周围的环境区域中朝向相对的两个区域中的物体的位置信息;此外,还能够获取飞行器的非重叠的可视范围内的物体的位置信息。因此,采用本公开实施例的方案,只需要在飞行器上部署两个视觉传感器就能获得较大的感知范围,视觉传感器的构型简单,成本低,从而减轻了飞行器的重量和成本,并提高了飞行器在空间中运动的安全性。
此外,空间中的物体与可移动平台的相对位置关系基于第一视觉传感器在重叠的可视范围内的第一局部图像、第二视觉传感器在重叠的可视范围内的第二局部图像以及第一视觉传感器在第一时刻采集的图像和在第二时刻采集的图像共同确定,一方面,能够提高对空间中物体的感知范围;另一方面,能够提高感知的准确度,从而准确获得空间中较大范围的物体的相对位置关系。
附图说明
为了更清楚地说明本公开实施例中的技术方案,下面将对实施例描述中所需要使用的附图作简单地介绍,显而易见地,下面描述中的附图仅仅是本公开的一些实施例,对于本领域普通技术人员来讲,在不付出创造性劳动性的前提下,还可以根据这些附图获得其他的附图。
图1是相关技术中的双目视觉传感器的覆盖范围的示意图。
图2是视觉传感器的布局的示意图。
图3是本公开实施例的飞行器的示意图。
图4A和图4B分别是本公开实施例的视觉传感器的覆盖范围的示意图。
图5是本公开实施例的基于语义确定相对位置关系的示意图。
图6是本公开实施例的图像处理方法的流程图。
图7是本公开实施例的双目深度估计过程的示意图。
图8是本公开实施例的单目深度估计过程的示意图。
图9是本公开实施例的整体流程图。
图10是本公开实施例的图像处理装置的硬件结构示意图。
图11是本公开实施例的可移动平台的示意图。
具体实施方式
这里将详细地对示例性实施例进行说明,其示例表示在附图中。下面的描述涉及附图时,除非另有表示,不同附图中的相同数字表示相同或相似的要素。以下示例性实施例中所描述的实施方式并不代表与本公开相一致的所有实施方式。相反,它们仅是与如所附权利要求书中所详述的、本公开的一些方面相一致的装置和方法的例子。
在本公开使用的术语是仅仅出于描述特定实施例的目的,而非旨在限制本公开。在本公开说明书和所附权利要求书中所使用的单数形式的“一种”、“所述”和“该”也旨在包括多数形式,除非上下文清楚地表示其他含义。还应当理解,本文中使用的术语“和/或”是指并包含一个或多个相关联的列出项目的任何或所有可能组合。
应当理解,尽管在本公开可能采用术语第一、第二、第三等来描述各种信息,但这些信息不应限于这些术语。这些术语仅用来将同一类型的信息彼此区分开。例如,在不脱离本公开范围的情况下,第一信息也可以被称为第二信息,类似地,第二信息也可以被称为第一信息。取决于语境,如在此所使用的词语“如果”可以被解释成为“在……时”或“当……时”或“响应于确定”。
在信息时代,计算机越来越广泛地应用于各种领域。作为智能计算的重要领域,计算机视觉得到了极大的开发应用。计算机视觉依靠成像系统代替视觉器官作为输入敏感手段,最常用的是视觉传感器。在将计算机视觉应用场于可移动平台时,借助视觉传感器来对可移动平台周围的物体进行感知,并基于感知结果来控制可移动平台的移动方式(包括移动速度、移动方向等)。例如,可以在可移动平台上布局双目视觉系统(Stereo Vision System)。双目视觉系统(Stereo Vision System)采用双目算法计算深度,即,通过两个视觉传感器,拍摄同一时刻、不同角度的两张图像,再通过两张图像的差异,以及两个视觉传感器之间的位姿关系,计算出场景与视觉传感器的距离关系,从而得到深度图(Depth Map)。或者,可以通过单目算法计算深度,即,视觉传感器从不同角度拍摄同一物体,同样也是计算多个角度拍摄图像的重叠区域的深度信息。为了提高可移动平台的安全性,需要尽量扩大对可移动平台的周围环境的感知范围。相关技术中,一般通过以下方式来扩大感知范围:
方式一:由于双目系统成像的非重叠区域较大,且非重叠区域的深度信息的可信度不高,在采用双目方式获取深度信息的情况下,一般会通过改进双目视觉传感器的布局方式,尽量增大双目视觉传感器的重叠的可视范围。然而,由于双目算法只能计算出两个视觉传感器的重叠的可视范围内的深度信息,即便重叠的可视范围增大,能够计算出深度信息的范围依然是有限的。如图1所示,S 1和S 2分别是两个视觉传感器,S 1的可视范围为BO 1D对应的范围,S 2的可视范围为AO 2C对应的范围,则S 1和S 2的重叠的可视范围为灰色区域所示。因此,在这种场景下,只能准确地计算出灰色区域内的场景的深度信息,而左右两侧是S 1/S 2独有的视角,无法准确地计算出深度信息。因此,这种改进方式依然无法获得范围足够大且准确的深度信息。此外,双目系统的基线长度有限,无法观测较远距离。
方式二:在多个方向和角度部署多组视觉传感器。参见图2,为了实现全向覆盖,分别在前、后、左、右、上、下多个方向和角度部局了单目视觉传感器或双目视觉传感器。然而,这种布局方式结构复杂、成本高、算力需求大,且会增加可移动平台的重量。
方式三:采用单目计算方式来获取深度信息。然而,这种方式需要视觉传感器动起来,视觉传感器静止或者姿态变化过大或过小的时候,难以生成有效的深度图,从而无法估算物体的深度信息。并且,被拍摄物体不能移动,必须是静止的物体,因此,单目方式获取的深度信息的鲁棒性较差。此外,当视觉传感器在不同时刻的位姿关系 不准确时,将严重影响获取的深度信息的准确度。
基于此,本公开实施例提供一种飞行器,参见图3,所述飞行器包括:
第一视觉传感器301和第二视觉传感器302;所述第一视觉传感器301的第一可视范围与所述第二视觉传感器302的第二可视范围部分重叠,其中,重叠的可视范围包括所述飞行器的周围的环境区域,所述环境区域包括所述飞行器的两个相背对部位朝向的区域;
所述第一视觉传感器301和所述第二视觉传感器302采集的图像用于计算环境区域中的物体的位置信息,所述物体的位置信息用于控制所述飞行器在空间中运动。
本公开实施例的飞行器可以是无人飞行器,例如,多旋翼无人飞行器。所述飞行器可以在空间中自主飞行,或者响应于于用户通过手势、遥控器、声控等方式下发的控制指令,在空间中飞行。
可以在飞行器上布局第一视觉传感器301和第二视觉传感器302。第一视觉传感器301的可视范围称为第一可视范围,第二视觉传感器302的可视范围称为第二可视范围。第一可视范围与第二可视范围的重叠可视范围可以包括所述飞行器的两个相背对部位朝向的区域,例如,包括所述飞行器的顶部朝向的区域和底部朝向的区域,或者,包括所述飞行器的左侧朝向的区域和右侧朝向的区域,或者包括所述飞行器的前侧朝向的区域和后侧朝向的区域。其中,重叠的可视范围也称为重叠区域或双目重叠区域,非重叠的可视范围也称为无重叠区域或双目非重叠区域。
在一些实施例中,可以将所述第一视觉传感器301和第二视觉传感器302中的一者设置在飞行器的顶部,另一者设置在飞行器的底部,从而使得重叠的可视范围包括所述飞行器的左侧朝向的区域和右侧朝向的区域,或者包括所述飞行器的前侧朝向的区域和后侧朝向的区域。在这种情况下,第一视觉传感器301和第二视觉传感器302的非重叠的可视范围包括所述飞行器的顶部的环境区域和底部的环境区域。
在另一些实施例中,可以将所述第一视觉传感器301和第二视觉传感器302中的一者设置在飞行器的左侧,另一者设置在飞行器的右侧,从而使得重叠的可视范围包括所述飞行器的顶部的环境区域和底部的环境区域。在这种情况下,第一视觉传感器301和第二视觉传感器302的非重叠的可视范围包括所述飞行器的左侧的环境区域和右侧的环境区域。
在另一些实施例中,可以将所述第一视觉传感器301和第二视觉传感器302中的 一者设置在飞行器的前侧,另一者设置在飞行器的后侧,从而使得重叠的可视范围包括所述飞行器的顶部的环境区域和底部的环境区域。在这种情况下,第一视觉传感器301和第二视觉传感器302的非重叠的可视范围包括所述飞行器的前侧的环境区域和后侧的环境区域。
除了上述设置方式以外,还可以根据实际需要将第一视觉传感器301和第二视觉传感器302的位置设置为其他位置,只要能使两个视觉传感器的重叠的可视范围包括所述飞行器的两个相背对部位朝向的区域即可。为了便于理解,下面以第一视觉传感器301和第二视觉传感器302中的一者设置在飞行器的顶部,另一者设置在飞行器的底部为例,对本公开的方案进行说明。在第一视觉传感器301和第二视觉传感器302采集的图像中,均包括双目重叠区域对应的局部图像以及无重叠区域对应的局部图像。
飞行器可以包括机臂和机身。其中,第一视觉传感器301和第二视觉传感器302均安装在机身上。在图4A所示的情况下,顶部为飞行器的机臂和机身上方的区域,底部为飞行器的机臂和机身下方的区域,侧部为机臂和机身旁侧的区域。当飞行器在空间中水平飞行时,所述顶部为朝向天空的一侧,所述底部为朝向地面的一侧,所述侧部为除顶部和底部之外的一侧或多侧。这样,第一视觉传感器301和第二视觉传感器302的总的可视范围几乎覆盖了飞行器所在空间的所有方向和角度,覆盖范围较大。
在一些实施例中,第一视觉传感器301和第二视觉传感器302中的至少一者可采用鱼眼相机。由于单个鱼眼相机的观测范围比较大(大于180度),通过设置2个鱼眼相机就可以实现全向观测,构型简单,成本低,重量轻,能够较好地适用于飞行器等对机载外设重量要求比较严格的应用场景。在飞行器的机身上分别部署两个鱼眼相机的情况下,第一视觉传感器301也可称为上鱼眼相机,第二视觉传感器302也可称为下鱼眼相机。第一视觉传感器301和第二视觉传感器302的结构可以相同,也可以不同,本公开对此不做限制。在一些实施例中,所述第一传感器或者所述第二视觉传感器包括一个感光电路模组和与所述感光电路配合安装的一组鱼眼光学模组。
在大多数情况下,飞行器在空间中是水平飞行的,或者飞行器的飞行速度包括水平方向上的分量,因此,对飞行器的侧部的环境区域的感知精度要求较高。上述第一视觉传感器301和第二视觉传感器302的重叠的可视范围包括所述飞行器的侧部的环境区域,因此,侧部的环境区域内的物体可以同时被两个视觉传感器感知到,从而能够提高侧部的环境区域内的物体的感知精度。
此外,相关技术中,飞行器上的双目视觉系统的重叠的可视范围一般只包括飞行器的一侧,而本公开实施例中重叠的可视范围包括两个相背对部位朝向的区域,如图4B所示,所述两个相背对部位朝向的区域包括飞行器的第一侧和第二侧,所述第一侧和第二侧可以分别是飞行器的左侧和右侧,或者飞行器的前侧和后侧。相比于相关技术,本公开实施例的第一视觉传感器301和第二视觉传感器302的重叠的可视范围更大。
所述第一视觉传感器301和所述第二视觉传感器302采集的图像用于计算环境区域中的物体的位置信息。所述位置信息可以基于所述物体与所述飞行器的相对位置关系确定,其中,所述相对位置关系可以通过物体的深度信息、视差信息或位置信息本身来表征。在所述相对位置关系通过物体的位置信息来表征的情况下,可以直接将所述相对位置关系确定为物体的位置信息。在所述相对位置关系通过物体的深度信息或视差信息表征的情况下,可以将深度信息或视差信息转换为位置信息。为了便于描述,下面以相对位置关系是深度信息为例,对本公开的方案进行说明。本领域技术人员可以理解,下文中的深度信息也可以采用其他的相对位置关系来代替。
在一些实施例中,所述空间中的物体与所述飞行器的相对位置关系基于以下两者共同确定:所述第一视觉传感器301在所述重叠的可视范围内的第一局部图像,所述第二视觉传感器302在所述重叠的可视范围内的第二局部图像;以及,可视范围覆盖所述物体的视觉传感器在第一时刻采集的图像和在第二时刻采集的图像,其中,可视范围覆盖所述物体的视觉传感器在所述第一时刻时在空间中的位姿不同于在所述第二时刻时在空间中的位姿。
所述第一局部图像与所述第二局部图像上的点是一一对应的,两张局部图像上对应的点为空间中的同一物点。所述第一局部图像与所述第二局部图像可以是在同一时刻采集得到的图像。可以基于第一局部图像与第二局部图像的时间戳获取同一时刻采集的两张局部图像,也可以通过同一个时钟信号控制第一视觉传感器和第二视觉传感器采集图像,从而得到同一时刻采集的两张局部图像。应当说明的是,这里的同一时刻并非指时间上严格同步,只要是在一定时间差范围内(例如,几毫秒)采集的两张图像,由于时间差较短,场景变化不大,均可认为是同一时刻采集的。
进一步地,第一局部图像与第二局部图像可以均是在所述第一时刻采集的图像,这样,能够提高最终输出的相对位置关系的精度。在所述物体为所述飞行器的顶部的环境区域内的物体的情况下,可视范围覆盖所述物体的视觉传感器为所述第一视觉传 感器301。在所述物体为所述飞行器的底部的环境区域内的物体的情况下,可视范围覆盖所述物体的视觉传感器为所述第二视觉传感器302。所述第一时刻与所述第二时刻为不同的时刻。可视范围覆盖所述物体的视觉传感器在所述第一时刻时在空间中的位姿不同于在所述第二时刻时在空间中的位姿,可以是飞行器本身的位姿改变,导致视觉传感器的位姿改变,也可以是飞行器本身的位姿不变,仅改变视觉传感器的位姿。
在一些实施例中,所述飞行器的顶部或者底部的环境区域的第一物体与所述飞行器的相对位置关系,可以基于所述重叠的可视范围内与所述第一物体具有相同语义信息的物体与所述飞行器的相对位置关系得到。为了方便描述,下文以所述第一物体为所述飞行器的顶部的环境区域的物体为例,并以可视范围覆盖所述第一物体的视觉传感器为所述第一视觉传感器301为例,对本公开实施例的方案进行说明。本领域技术人员可以理解,第一物体为所述飞行器的底部的环境区域的物体,且可视范围覆盖所述第一物体的视觉传感器为所述第二视觉传感器302的情况下的处理方式是类似的,此处不再赘述。本公开实施例基于物体的语义信息对双目非重叠范围内的物体的深度信息进行预测,从而不仅能获取双目重叠范围内的深度信息,还能获取双目重叠范围以外的深度信息,充分利用了两个视觉传感器采集的图像,扩大了深度信息的获取范围。
参见图5,假设重叠的可视范围(图中带斜线部分所示)内包括杯子501上的部分区域以及瓶子502上的部分区域,则可以基于重叠的可视范围内杯子501的深度信息,确定处于第一视觉传感器302可视范围内、且处于第二视觉传感器302可视范围之外的杯子501上的部分区域的深度信息。在一些实施例中,可以基于重叠的可视范围内杯子501上的一个或多个点的深度信息,获取杯子501上处于第一视觉传感器302可视范围内、且处于第二视觉传感器302可视范围之外的各个点的深度信息。其中,图像中的各个点的语义信息可以通过对图像进行语义识别得到。例如,可以将图像输入预先训练的卷积神经网络,通过该卷积神经网络输出图像上各个点的语义信息。
在一些实施例中,所述重叠的可视范围内与所述第一物体具有相同语义信息的物体(称为目标物体)与所述飞行器的相对位置关系,可以基于所述第一视觉传感器在所述重叠的可视范围内的第一局部图像和所述第二视觉传感器在所述重叠的可视范围内的第二局部图像得到。具体地,可以采用双目算法,获取目标物体相对于飞行器的深度信息。
进一步地,所述第一物体与所述飞行器的相对位置关系,可以基于以下信息共同 得到:基于目标物体与所述飞行器的相对位置关系预测得到的相对位置关系r 1,所述目标物体为所述重叠的可视范围内与所述第一物体具有相同语义信息的物体,所述目标物体与所述飞行器的相对位置关系基于所述第一视觉传感器在重叠的可视范围内的第一局部图像和所述第二视觉传感器在重叠的可视范围内的第二局部图像确定,以及,基于可视范围覆盖所述第一物体的视觉传感器在第一时刻采集的图像和在第二时刻采集的图像确定的相对位置关系r 2,其中,可视范围覆盖所述第一物体的视觉传感器在所述第一时刻时在空间中的位姿不同于在所述第二时刻时在空间中的位姿。本公开实施例同时采用基于语义信息推测得到的第一物体与飞行器的相对位置关系r 1,以及基于单目算法确定的第一物体与飞行器的相对位置关系r 2,能够提高获取的深度信息的精度和鲁棒性。
获取相对位置关系r 1的过程可参见前述实施例,此处不再赘述。以第一物体是第一视觉传感器301可视范围内的物体为例,在获取相对位置关系r 2时,可以基于第一视觉传感器在第一时刻的位姿与在第二时刻的位姿之间的位姿关系,以及所述第一视觉传感器在第一时刻采集的图像和在第二时刻采集的图像,确定第一物体的深度信息。其中,所述第一时刻可以是当前时刻,所述第二时刻可以是所述第一时刻之前的某一时刻(即历史时刻),所述第一局部图像和所述第二局部图像可以是所述第一时刻采集的图像。通过上述方式,可以确定第一物体在第一时刻相对于飞行器的深度信息。
可以采用惯性测量单元(Inertial Measurement Unit,IMU)来确定第一视觉传感器在第一时刻的位姿与在第二时刻的位姿,从而确定这两个时刻的位姿之间的位姿关系。IMU可以直接安装在第一视觉传感器上,从而可以将IMU的输出结果直接确定为第一视觉传感器的位姿。或者,IMU可以安装在飞行器的机身上,从而可以通过IMU的输出结果以及飞行器与第一视觉传感器之间的位姿关系确定第一视觉传感器的位姿。
除了计算双目非重叠范围内的物体的深度信息之外,本公开实施例还可以计算出双目重叠范围内的物体的深度信息。在一些实施例中,所述第一视觉传感器301和第二视觉传感器302分别设置在飞行器的顶部和底部,则所述环境区域包括所述飞行器的左侧朝向的区域和右侧朝向的区域,和/或包括所述飞行器的前侧朝向的区域和后侧朝向的区域。为了便于描述,将前侧、后侧、左侧、右侧朝向的区域均统称为飞行器的侧部的环境区域。在这种情况下,所述飞行器的顶部或者底部的环境区域的第一物体与所述飞行器的相对位置关系的计算过程,与所述飞行器的侧部的环境区域的第二物体与所述飞行器的相对位置关系的计算过程不同。
具体地,所述第二物体与所述飞行器的相对位置关系,可以基于以下信息共同得到:基于所述第一视觉传感器在所述重叠的可视范围内的第一局部图像以及所述第二视觉传感器在所述重叠的可视范围内的第二局部图像得到的相对位置关系r 3,以及基于可视范围覆盖所述第二物体的视觉传感器在第一时刻采集的图像和在第二时刻采集的图像确定的相对位置关系r 4,其中,可视范围覆盖所述第一物体的视觉传感器在所述第一时刻时在空间中的位姿不同于在所述第二时刻时在空间中的位姿。第二物体与所述飞行器的相对位置关系r 3可以采用双目算法获取,第二物体与所述飞行器的相对位置关系r 4的获取方式可以参考相对位置关系r 2的获取方式,此处不再赘述。通过本实施例,能够获得精度较高的第二物体与所述飞行器的相对位置关系。
本公开实施例具有以下优点:
(1)仅采用两个视觉传感器即可覆盖飞行器的顶部、底部和侧部的环境区域,传感器构型简单,成本低,重量轻,感知范围大。
(2)融合了双目算法、单目算法以及基于语义预测的方式来获取物体与飞行器的相对位置关系,提高了输出结果的鲁棒性和精度。
参见图6,本公开实施例还提供一种图像处理方法,所述方法应用于可移动平台,所述可移动平台包括第一视觉传感器和第二视觉传感器,所述第一视觉传感器的第一可视范围与所述第二视觉传感器的第二可视范围部分重叠,所述方法包括:
步骤601:获取所述第一视觉传感器在重叠的可视范围内的第一局部图像,获取所述第二视觉传感器在重叠的可视范围内的第二局部图像;
步骤602:获取所述第一视觉传感器在第一时刻采集的图像和在第二时刻采集的图像,所述第一视觉传感器在所述第一时刻时在空间中的位姿不同于在所述第二时刻时在空间中的位姿;
步骤603:基于所述第一局部图像、所述第二局部图像,所述第一时刻采集的图像和所述第二时刻采集的图像,确定所述可移动平台所在空间中的物体与所述可移动平台的相对位置关系。
本公开实施例的方法可以用于计算无人飞行器、无人车、可移动机器人等可移动平台周围的环境区域中的物体与所述可移动平台的相对位置关系,进而计算物体的位置信息。
在所述可移动平台为无人飞行器的情况下,所述第一视觉传感器和第二视觉传感器中的一者可以设置在所述无人飞行器的机身的顶部,另一者设置在所述无人飞行器的机身的底部。或者,所述第一视觉传感器和第二视觉传感器中的一者设置在所述无人飞行器的第一侧,另一者设置在所述无人飞行器的第二侧,所述第一侧与所述第二侧相对设置。例如,所述第一侧可以是无人飞行器的左侧,所述第二侧可以是无人飞行器的右侧;或者,所述第一侧可以是无人飞行器的前侧,所述第二侧可以是无人飞行器的后侧。所述无人飞行器可以是前述任一实施例中的飞行器。在所述可移动平台为无人车的情况下,所述第一视觉传感器和第二视觉传感器可以分别设置在所述无人车的两个车灯上,或者分别设置在挡风玻璃两侧。在所述可移动平台为可移动机器人的情况下,所述第一视觉传感器和第二视觉传感器可以分别设置在所述可移动机器人的两个眼睛所在的位置。上述设置所述第一视觉传感器和第二视觉传感器的方式,使得两个视觉传感器的重叠的可视范围可以尽可能地覆盖可移动平台的移动方向上的区域,从而提高对可移动平台移动方向上的物体的感知精度。除了以上列举的应用场景之外,可移动平台还可以是其他类型的、能够自主移动的设备,所述第一视觉传感器和第二视觉传感器的安装位置可以基于可移动平台的类型和/或其他因素设置,此处不再一一展开说明。
第一视觉传感器和第二视觉传感器二者均可单独作为单目视觉传感器使用,从而基于单目算法计算出物体的深度信息。此外,第一视觉传感器和第二视觉传感器可以构成一对非刚性双目,从而基于双目算法计算出物体的深度信息。其中,第一视觉传感器可以是可移动平台上的任意一个视觉传感器。以可移动平台是无人飞行器为例,第一视觉传感器可以是设置在所述无人飞行器的机身的顶部的视觉传感器,也可以是设置在所述无人飞行器的机身的底部的视觉传感器。
本公开对第一视觉传感器和第二视觉传感器的分辨率不做限制。可选地,第一视觉传感器和第二视觉传感器可采用分辨率为1280×960左右的视觉传感器。如果视觉传感器的分辨率太低,则采集的图像清晰度太低,难以准确地识别出图像中物体的特征,从而影响处理结果的准确度。如果视觉传感器的分辨率太高,则对两个视觉传感器因非刚性连接导致的干扰就会非常敏感。因此,采用这一分辨率的视觉传感器,能够有效地兼顾图像清晰度以及抗干扰性。
在一些实施例中,第一视觉传感器和第二视觉传感器中的至少一者可采用鱼眼相机。由于单个鱼眼相机的观测范围比较大(大于180度),通过设置2个鱼眼相机就 可以实现全向观测,具有构型简单、成本低、重量轻的优点。
在一些实施例中,所述第一视觉传感器和第二视觉传感器的重叠的可视范围的面积小于非重叠的可视范围的面积。相关技术中一般通过增加双目视觉传感器的重叠的可视范围的面积来扩大能够获取深度信息的范围,因此,双目重叠区域的面积一般大于非重叠区域的面积(如图1所示)。不同于此,本公开实施例的两个视觉传感器的重叠的可视范围的面积可以小于非重叠的可视范围的面积,通过采用与相关技术中不同的处理方式,同样可以准确地获得较大范围内的深度信息。下面对本公开实施例的处理方式进行具体说明。
在步骤601中,所述第一局部图像和所述第二局部图像可以是同一时刻采集的图像。在可移动平台移动过程中,可以实时通过可移动平台上的第一视觉传感器采集所述第一局部图像,通过可移动平台上的第二视觉传感器采集所述第二局部图像,并基于实时采集的第一局部图像和第二局部图像,实时地确定可移动平台周围的环境区域中的物体与所述可移动平台的相对位置关系。当然,也可以通过获取历史某一时刻采集的第一局部图像和第二局部图像,确定在该历史时刻时可移动平台周围的环境区域中的物体与所述可移动平台的相对位置关系。由于所述第一局部图像和所述第二局部图像均为重叠的可视范围内的图像,因此,所述第一局部图像和所述第二局部图像中的像素点是一一对应的,两张局部图像中对应的点均对应于物理空间中的同一个物点。
在步骤602中,第一视觉传感器在所述第一时刻时在空间中的位姿不同于在所述第二时刻时在空间中的位姿,可以是可移动平台本身的位姿改变,导致第一视觉传感器的位姿改变,也可以是可移动平台本身的位姿不变,仅改变第一视觉传感器的位姿。所述第一时刻与所述第二时刻为不同的时刻,例如,所述第一时刻可以是当前时刻,所述第二时刻可以是当前时刻之前的历史时刻。又例如,所述第一时刻和所述第二时刻分别是当前时刻之前的不同的历史时刻。进一步地,第一局部图像与第二局部图像可以均是在所述第一时刻采集的图像。
可以采用IMU确定第一视觉传感器在第一时刻的位姿与在第二时刻的位姿,从而确定这两个时刻的位姿之间的位姿关系。在可移动平台为车辆的情况下,也可以基于车辆的轮速信息、定位信息等确定第一视觉传感器的位姿。IMU可以直接安装在第一视觉传感器上,从而可以将IMU的输出结果直接确定为第一视觉传感器的位姿。或者,IMU可以安装在飞行器的机身上,从而可以通过IMU的输出结果以及飞行器与第一视觉传感器之间的位姿关系确定第一视觉传感器的位姿。例如,在可移动平台为挖掘机 且第一视觉传感器安装在挖掘机的机械臂上的情况下,可以通过安装在挖掘机的机身上的IMU获取挖掘机的位姿,并基于机械臂的电机转动角度和伸缩量确定机械臂与挖掘机的机身与机械臂之间的位姿关系,进而确定第一视觉传感器的位姿。
在步骤603中,可以基于所述第一局部图像和所述第二局部图像,确定所述物体与所述可移动平台的第一相对位置关系;基于所述第一时刻采集的图像和所述第二时刻采集的图像,确定所述物体与所述可移动平台的第二相对位置关系;基于所述第一相对位置关系和所述第二相对位置关系确定所述物体与所述可移动平台的相对位置关系。其中,基于所述第一局部图像和所述第二局部图像确定第一相对位置关系的过程可以基于双目算法实现,基于所述第一时刻采集的图像和所述第二时刻采集的图像确定第二相对位置关系的过程可以基于单目算法实现。通过融合不同方式得到的第一相对位置关系和第二相对位置关系来确定物体与所述可移动平台的相对位置关系,能够扩大感知范围,提高输出的相对位置关系的精度和鲁棒性。
在一些实施例中,重叠的可视范围内的物体与所述飞行器的相对位置关系的计算过程,与非重叠的可视范围内的物体与所述飞行器的相对位置关系的计算过程不同。
针对处于所述第一视觉传感器的可视范围内,且处于所述第二视觉传感器的可视范围以外的第一物体,可以基于所述第一局部图像和所述第二局部图像,确定所述重叠的可视范围内与第一物体具有相同语义信息的目标物体与所述可移动平台的相对位置关系;基于所述目标物体与所述可移动平台的相对位置关系,确定所述第一物体与所述可移动平台的第一相对位置关系。例如,假设重叠的可视范围内包括杯子上的部分区域,则可以基于重叠的可视范围内杯子的深度信息,确定处于第一视觉传感器可视范围内、且处于第二视觉传感器可视范围之外的杯子上的其他区域的深度信息。在一些实施例中,可以将第一视觉传感器采集的图像输入预先训练的卷积神经网络,通过该卷积神经网络输出图像上各个点的语义信息。确定目标物体与所述可移动平台的相对位置关系的过程可以基于双目算法实现。
在一些实施例中,在所述第二相对位置关系满足所述第一物体对应的几何约束条件的情况下,可以将所述第二相对位置关系确定为所述第一物体与所述可移动平台的相对位置关系;在所述第二相对位置关系不满足所述几何约束条件的情况下,可以将所述第一相对位置关系确定为所述第一物体与所述可移动平台的相对位置关系。
所述几何约束可以是第一物体上各个点之间的几何位置关系。例如,在一般情况 下,同一物体上相邻的点的深度信息一般是平滑变化的,即,同一物体上相邻的点的深度信息之差一般小于预设的深度差阈值。可以采用上述实施例中的方式计算第一物体上各个点的深度信息,如果相邻点的深度信息之差大于所述深度差阈值,则认为不满足所述第一物体对应的几何约束条件,只有在相邻点的深度信息之差小于或等于所述深度差阈值的情况下,才认为满足所述第一物体对应的几何约束条件。采用单目算法,对移动的物体获取的深度信息的鲁棒性较差。而基于语义确定的相对位置关系没有物理模型的约束,鲁棒性较差。本公开实施例通过上述方式,利用两种算法进行互补,在一种算法不满足约束条件的情况下,采用另一种算法获取的相对位置关系,从而能够提高最终确定的相对位置关系的鲁棒性。
针对处于所述重叠的可视范围内的第二物体,可以直接基于所述第一局部图像和所述第二局部图像,确定所述第二物体与所述可移动平台的第一相对位置关系。由于第二物体处于重叠的可视范围内,因此,第二物体的深度信息可以基于两张局部图像,直接采用精度和鲁棒性较高的双目算法来获取。同时,还可以基于所述第一时刻采集的图像和所述第二时刻采集的图像获取第二物体与所述可移动平台的第二相对位置关系。
在一些实施例中,在满足预设条件的情况下,可以将所述第一相对位置关系确定为所述第二物体与所述可移动平台的相对位置关系;在不满足预设条件的情况下,可以将所述第二相对位置关系确定为所述第二物体与所述可移动平台的相对位置关系;所述预设条件包括:所述第二物体的深度小于预设的深度阈值;以及所述第一相对位置关系的置信度大于预设的置信度阈值。
双目算法的精度较高,但由于双目系统的基线(baseline)长度有限,无法观测较远距离,因此,在第二物体的深度大于或等于于预设的深度阈值的情况下,认为所述第一相对位置关系可信度较低,从而可以将第二相对位置关系确定为所述第二物体与所述可移动平台的相对位置关系。同时,在遮挡等情况下,通过双目算法获取的第一相对位置关系的置信度可能不高,因此,在第一相对位置关系的置信度较低的情况下,也可以将第二相对位置关系确定为所述第二物体与所述可移动平台的相对位置关系。这样,可以提高输出结果的精度和可靠性。
在一些实施例中,所述第一相对位置关系通过第一神经网络对所述第一局部图像和所述第二局部图像进行处理得到,所述第二相对位置关系通过第二神经网络对所述第一时刻采集的图像和所述第二时刻采集的图像进行处理得到。本实施例通过神经网 络直接对图像进行处理,以输出第一相对位置关系和第二相对位置关系,处理过程简单,复杂度低,且由于神经网络能够对环境进行理解和推测,从而提高视觉感知的感受野。所述第一神经网络和/或所述第二神经网络可以是卷积神经网络,也可以是其他类型的神经网络。所述第一神经网络和所述第二神经网络的类型可以相同,也可以不同。
在一些实施例中,所述第一神经网络基于所述第一视觉传感器在所述重叠的可视范围内的第一局部样本图像和所述第二视觉传感器在所述重叠的可视范围内的第二局部样本图像训练得到;所述第二神经网络基于所述第一视觉传感器在第三时刻采集的样本图像和在第四时刻采集的样本图像训练得到;其中,所述第一视觉传感器在所述第三时刻时在空间中的位姿不同于在所述第四时刻时在空间中的位姿。
下面对两个神经网络的具体处理过程进行说明。参见图7,所述第一神经网络的处理过程包括:
(1)分别对所述第一局部图像和所述第二局部图像进行特征提取,得到所述第一局部图像的特征描述F1和所述第二局部图像的特征描述F2。为了保证特征的一致性,两次计算特征描述时可以共享网络参数。由于不同视觉传感器采集的图像很难保证完全一致,而卷积神经网络(Convolutional Neural Networks,CNN)提取的特征则具备良好的抗旋转、光照变换等特点,因此,使用CNN提取的特征更有于计算costvolume。在一些实施例中,CNN可以将第一局部图像和所述第二局部图像的素灰度值处理成特征描述。
(2)基于所述第一局部图像的特征描述和所述第二局部图像的特征描述,获取将所述第一局部图像投影到所述第二局部图像的投影代价(Costvolume)。本步骤可以通过Plane Sweeping算法计算Costvolume。针对上一步得到的特征描述F1和F2,沿着双目基线的方向,将F2按照不同视差进行移动得到F2’。将F1和F2进行点乘求和操作,得到每次位移后对应的Costvolume。
(3)对所述投影代价进行聚合,得到聚合投影代价。每一个位置的Costvolume的取值会受到周围位置的影响,因此需要对Costvolume进行充分聚合,让周围位置的信息能够传递到当前位置。本实施例采用卷积的方式对Costvolume进行融合,为了增大感受野,在进行卷积的过程中,需要对Costvolume进行降采样和上采样。
(4)对所述聚合投影代价进行归一化处理,得到初始相对位置关系。经过上一步 骤,可以得到充分聚合的Costvolume,其中每一个位置在不同的深度都会有一个概率值。对Costvolume进行归一化处理,即可回归出初始深度图。所述归一化处理可以通过arg softmax操作实现。初始深度图中包括场景中的每个点在不同深度下的概率值。
(5)基于所述初始相对位置关系以及所述目标物体的语义信息,获取所述第一相对位置关系。由于双目图像存在很大的非重叠区域,此区域无法根据双目视差出深度,本文利用CNN特征的语义信息补全该区域的深度图,从而可以从补全后的深度图中获取非重叠区域内物体的深度信息。
进一步地,还可以获取所述第一神经网络输出的置信图,所述置信图用于指示所述第一相对位置关系的置信度。置信度越大,表示所述第一相对位置关系的可信度越高。基于置信度大于预设置信度阈值的第一相对位置,对所述第一视觉传感器在第一时刻的位姿和所述第一视觉传感器在所述第二时刻的位姿之间的位姿关系进行优化;优化的位姿关系用于确定所述第二相对位置关系。一方面,基于语义补全得到的深度信息没有双目视差的物理约束,通过采用本实施例的方案,能够为基于语义补全得到的深度信息增加约束;另一方面,双目结构难以处理平行基线场景,通过采用置信图,能够将该类区域识别出来,进而采用单目算法得到的深度信息作为该类区域的深度信息。
参见图8,是所述第二神经网络的处理过程的示意图。第二神经网络的处理过程与第一神经网络的处理过程类似,下面着重对第一神经网络和第二神经网络处理过程中的不同之处进行说明,相同的处理过程可以参见第一神经网络的处理过程的实施例,此处不再赘述。
(1)挑选关键参考帧。基于所述第一视觉传感器在所述第一时刻时在空间中的位姿以及所述第一视觉传感器在所述第二时刻时在空间中的位姿,确定所述第一视觉传感器在所述第一时刻所处的位置与在所述第二时刻所处的位置之间的距离;若所述距离小于预设值,则将所述第一时刻采集的图像和所述第二时刻采集的图像作为关键参考帧,用于步骤603中以确定所述可移动平台所在空间中的物体与所述可移动平台的相对位置关系。
(2)优化位姿关系。根据上一步骤中得到的深度图和第一神经网络输出的置信图,定义置信度阈值α 1,记录置信图中置信度大于阈值α 1的位置,利用对应位置的深度信息优化两个时刻的位姿关系。
(3)计算所述第一时刻采集的图像的特征描述F3和所述第二时刻采集的图像的特征描述F4。计算方式与第一神经网络的步骤(2)相同,可以使用CNN来计算特征描述。
(4)计算Costvolume。对特征F3和F4,利用Plane Sweeping算法,将F4按照不同深度投影至F3的坐标系下,得到投影后的特征F4’。将F3和F4’通过串联的方式,连接到一起组成Costvolume。
(5)Costvolume聚合,处理方式与第一神经网络的步骤(3)相同。
(6)计算初始深度图。经过上一步骤后,可以到的一个充分聚合的Costvolume,其中每一个位置在不同的深度都会有一个概率值。
(7)深度信息补全,处理方式与第一神经网络的步骤(5)相同。
参见图9,是本公开实施例的整体流程图。对于空间中的物体上的任意一个点,可以获取该点的双目视差、单目视差以及双目置信度。如果该点同时满足以下条件:(1)该点处于双目重叠区域内,(2)该点的双目深度(通过双目方式计算出的深度)小于预设的深度阈值d1(例如,20m),以及(3)该点的双目深度的置信度大于预设的置信度阈值c1,则采用双目视差作为该点的实际视差。如果上述条件中的任意一者不满足,则判定单目深度(通过单目方式计算出的深度)是否满足几何约束条件。如果满足,则采用单目视差作为该点的实际视差,如果不满足,则仍然采用双目视差作为该点的实际视差。
在上述实施例中,双目视差可以是基于重叠区域的局部图像直接确定的视差,也可以是基于重叠区域的图像以及语义信息推断出的、非重叠区域的视差。由于视差、深度以及位置三者可以互相转换,因此,上述视差也可以替换为深度或者位置信息。此外,上述各个条件的判断顺序不限于图中所示,例如,可以先判断条件(2),再判断条件(3),最后判断条件(1)。由于双目视差的准确度一般较高,因此,在满足上述3个条件的情况下,可以优先采用双目视差。在实际应用中,也可以先判断单目视差是否满足几何约束条件,如果满足,则直接采用单目视差;如果不符合,再替换为双目视差。本公开实施例根据重叠区域范围、视觉传感器的姿态大小、双目置信图、观测距离大小等信息,对单目深度和双目深度进行充分融合,在单目或者双目计算失败时,通过采用本公开实施例的判断策略,可以快速地使用另一方法计算出的结果补上,实现动态切换策略,使系统更加稳定、鲁棒。
在一些实施例中,在步骤603之后,还可以根据所述物体与所述可移动平台的相对位置关系,确定所述物体在所述空间中的绝对位置信息。所述绝对位置信息可以是物体在预设坐标系(例如,可移动平台的坐标系或者世界坐标系)下的绝对位置坐标。例如,在预设坐标系为世界坐标系的情况下,可以获取物体的经纬高信息。
在一些实施例中,可以确定所述可移动平台的移动方向;基于所述第一局部图像中与所述移动方向相关的第一图像区域、所述第二局部图像中与所述移动方向相关的第二图像区域、所述第一时刻采集的图像中与所述移动方向相关的第三图像区域,以及所述第二时刻采集的图像中与所述移动方向相关的第四图像区域,确定所述可移动平台所在空间中的物体与所述可移动平台的相对位置关系。上述与所述移动方向相关的各个图像区域可以是包括所述移动方向上的区域的图像区域。
以可移动平台为无人车为例,无人车在向正前方行驶的过程中,安装于无人车上的视觉传感器的可视范围可能包括无人车正前方的区域、无人车左侧的区域以及无人车右侧的区域。因此,可以从第一图像区域中分割出包括无人车正前方的区域的第一图像区域。其他图像区域的分割方式类似,此处不再赘述。基于分割出的图像区域确定物体与所述可移动平台的相对位置关系,能够减少算力消耗,提高处理效率。
在一些实施例中,基于所述可移动平台的移动速度以及所述相对位置关系,确定所述相对位置关系的更新频率;基于所述更新频率对所述相对位置关系进行更新。例如,在可移动平台移动速度较慢的情况下,和/或在所述相对位置关系对应的物体的位置与可移动平台之间的距离较远的情况下,可以以较低的频率对所述相对位置关系进行更新。在可移动平台移动速度较快的情况下,和/或在所述相对位置关系对应的物体的位置与可移动平台之间的距离较近的情况下,可以以较高的频率对所述相对位置关系进行更新。通过动态调整所述相对位置关系的更新频率,能够兼顾可移动平台的安全性与获取相对位置关系时的资源消耗。
本公开实施例还提供一种图像装置,包括处理器,所述装置应用于可移动平台,所述可移动平台包括第一视觉传感器和第二视觉传感器,所述第一视觉传感器的第一可视范围与所述第二视觉传感器的第二可视范围部分重叠,所述处理器用于执行以下步骤:
获取所述第一视觉传感器在重叠的可视范围内的第一局部图像,获取所述第二视觉传感器在重叠的可视范围内的第二局部图像;
获取所述第一视觉传感器在第一时刻采集的图像和在第二时刻采集的图像,所述第一视觉传感器在所述第一时刻时在空间中的位姿不同于在所述第二时刻时在空间中的位姿;
基于所述第一局部图像、所述第二局部图像,所述第一时刻采集的图像和所述第二时刻采集的图像,确定所述可移动平台所在空间中的物体与所述可移动平台的相对位置关系。
在一些实施例中,所述处理器具体用于:基于所述第一局部图像和所述第二局部图像,确定所述物体与所述可移动平台的第一相对位置关系;基于所述第一时刻采集的图像和所述第二时刻采集的图像,确定所述物体与所述可移动平台的第二相对位置关系;基于所述第一相对位置关系和所述第二相对位置关系确定所述物体与所述可移动平台的相对位置关系。
在一些实施例中,所述处理器具体用于:基于所述第一局部图像和所述第二局部图像,确定所述重叠的可视范围内与第一物体具有相同语义信息的目标物体与所述可移动平台的相对位置关系;基于所述目标物体与所述可移动平台的相对位置关系,确定所述第一物体与所述可移动平台的第一相对位置关系。
在一些实施例中,所述处理器具体用于:在所述第二相对位置关系满足所述第一物体对应的几何约束条件的情况下,将所述第二相对位置关系确定为所述第一物体与所述可移动平台的相对位置关系;在所述第二相对位置关系不满足所述几何约束条件的情况下,将所述第一相对位置关系确定为所述第一物体与所述可移动平台的相对位置关系。
在一些实施例中,所述处理器具体用于:基于所述第一局部图像和所述第二局部图像,确定所述第二物体与所述可移动平台的第一相对位置关系。
在一些实施例中,所述处理器具体用于:在满足预设条件的情况下,将所述第一相对位置关系确定为所述第二物体与所述可移动平台的相对位置关系;在不满足预设条件的情况下,将所述第二相对位置关系确定为所述第二物体与所述可移动平台的相对位置关系;所述预设条件包括:所述第二物体的深度小于预设的深度阈值;以及所述第一相对位置关系的置信度大于预设的置信度阈值。
在一些实施例中,所述第一相对位置关系通过第一神经网络对所述第一局部图像和所述第二局部图像进行处理得到,所述第二相对位置关系通过第二神经网络对所述 第一时刻采集的图像和所述第二时刻采集的图像进行处理得到。
在一些实施例中,所述第一神经网络基于以下方式获取所述第一相对位置关系:分别对所述第一局部图像和所述第二局部图像进行特征提取,得到所述第一局部图像的特征描述和所述第二局部图像的特征描述;基于所述第一局部图像的特征描述和所述第二局部图像的特征描述,获取将所述第一局部图像投影到所述第二局部图像的投影代价;对所述投影代价进行聚合,得到聚合投影代价;对所述聚合投影代价进行归一化处理,得到初始相对位置关系;基于所述初始相对位置关系以及所述目标物体的语义信息,获取所述第一相对位置关系。
在一些实施例中,所述第一神经网络基于所述第一视觉传感器在所述重叠的可视范围内的第一局部样本图像和所述第二视觉传感器在所述重叠的可视范围内的第二局部样本图像训练得到;所述第二神经网络基于所述第一视觉传感器在第三时刻采集的样本图像和在第四时刻采集的样本图像训练得到;其中,所述第一视觉传感器在所述第三时刻时在空间中的位姿不同于在所述第四时刻时在空间中的位姿。
在一些实施例中,所述处理器具体用于:基于所述第一视觉传感器在所述第一时刻时在空间中的位姿以及所述第一视觉传感器在所述第二时刻时在空间中的位姿,确定所述第一视觉传感器在所述第一时刻所处的位置与在所述第二时刻所处的位置之间的距离;若所述距离小于预设值,基于所述第一局部图像、所述第二局部图像、所述第一时刻采集的图像和所述第二时刻采集的图像,确定所述可移动平台所在空间中的物体与所述可移动平台的相对位置关系。
在一些实施例中,所述处理器还用于:获取所述第一神经网络输出的置信图,所述置信图用于指示所述第一相对位置关系的置信度;基于置信度大于预设置信度阈值的第一相对位置,对所述第一视觉传感器在第一时刻的位姿和所述第一视觉传感器在所述第二时刻的位姿之间的位姿关系进行优化;优化的位姿关系用于确定所述第二相对位置关系。
在一些实施例中,所述可移动平台为无人飞行器;所述第一视觉传感器和第二视觉传感器中的一者设置在所述无人飞行器的机身的顶部,另一者设置在所述无人飞行器的机身的底部;或者所述可移动平台为无人飞行器;所述第一视觉传感器和第二视觉传感器中的一者设置在所述无人飞行器的第一侧,另一者设置在所述无人飞行器的第二侧,所述第一侧与所述第二侧相对设置;或者所述可移动平台为无人车,所述第 一视觉传感器和第二视觉传感器分别设置在所述无人车的两个车灯上,或者分别设置在挡风玻璃两侧;或者所述可移动平台为可移动机器人,所述第一视觉传感器和第二视觉传感器分别设置在所述可移动机器人的两个眼睛所在的位置。
在一些实施例中,所述第一视觉传感器和第二视觉传感器中的至少一者为鱼眼相机。
在一些实施例中,所述第一视觉传感器和第二视觉传感器的重叠的可视范围的面积小于非重叠的可视范围的面积。
在一些实施例中,所述处理器还用于:根据所述物体与所述可移动平台的相对位置关系,确定所述物体在所述空间中的绝对位置信息。
在一些实施例中,所述处理器具体用于:确定所述可移动平台的移动方向;基于所述第一局部图像中与所述移动方向相关的第一图像区域、所述第二局部图像中与所述移动方向相关的第二图像区域、所述第一时刻采集的图像中与所述移动方向相关的第三图像区域,以及所述第二时刻采集的图像中与所述移动方向相关的第四图像区域,确定所述可移动平台所在空间中的物体与所述可移动平台的相对位置关系。
在一些实施例中,所述处理器还用于:基于所述可移动平台的移动速度以及所述相对位置关系,确定所述相对位置关系的更新频率;基于所述更新频率对所述相对位置关系进行更新。
图10示出了一种图像处理装置的硬件结构示意图,该装置可以包括:处理器1001、存储器1002、输入/输出接口1003、通信接口1004和总线1005。其中处理器1001、存储器1002、输入/输出接口1003和通信接口1004通过总线1005实现彼此之间在设备内部的通信连接。
处理器1001可以采用通用的CPU(Central Processing Unit,中央处理器)、微处理器、应用专用集成电路(Application Specific Integrated Circuit,ASIC)、或者一个或多个集成电路等方式实现,用于执行相关程序,以实现本说明书实施例所提供的技术方案。处理器1001还可以包括显卡,所述显卡可以是Nvidia titan X显卡或者1080Ti显卡等。
存储器1002可以采用ROM(Read Only Memory,只读存储器)、RAM(Random Access Memory,随机存取存储器)、静态存储设备,动态存储设备等形式实现。存储器1002可以存储操作系统和其他应用程序,在通过软件或者固件来实现本说明书实施 例所提供的技术方案时,相关的程序代码保存在存储器1002中,并由处理器1001来调用执行。
输入/输出接口1003用于连接输入/输出模块,以实现信息输入及输出。输入输出/模块可以作为组件配置在设备中(图中未示出),也可以外接于设备以提供相应功能。其中输入设备可以包括键盘、鼠标、触摸屏、麦克风、各类传感器等,输出设备可以包括显示器、扬声器、振动器、指示灯等。
通信接口1004用于连接通信模块(图中未示出),以实现本设备与其他设备的通信交互。其中通信模块可以通过有线方式(例如USB、网线等)实现通信,也可以通过无线方式(例如移动网络、WIFI、蓝牙等)实现通信。
总线1005包括一通路,在设备的各个组件(例如处理器1001、存储器1002、输入/输出接口1003和通信接口1004)之间传输信息。
需要说明的是,尽管上述设备仅示出了处理器1001、存储器1002、输入/输出接口1003、通信接口1004以及总线1005,但是在具体实施过程中,该设备还可以包括实现正常运行所必需的其他组件。此外,本领域的技术人员可以理解的是,上述设备中也可以仅包含实现本说明书实施例方案所必需的组件,而不必包含图中所示的全部组件。
参见图11,本公开实施例还提供一种可移动平台,包括:
第一视觉传感器1101和第二视觉传感器1102,分别用于采集所述可移动平台周围的环境区域的图像,所述第一视觉传感器1101的第一可视范围与所述第二视觉传感器1102的第二可视范围部分重叠;以及图像处理装置1103。其中,所述图像处理装置1103可以采用前述任一实施例所述的图像处理装置,图像处理装置的具体细节详见前述实施例,此处不再赘述。
本公开实施例的方案还可用于VR眼镜,在这种应用场景下,第一视觉传感器和第二视觉传感器分别设置在所述AR眼镜的左镜框和右镜框上。VR眼镜可以感知现实场景中的物体,再基于现实中的物体来渲染出虚拟的场景对象。例如,用户前方某位置处有一张桌子,可以在桌子上渲染出一个虚拟的玩偶模型。通过对用户所在空间中的物体与用户之间的距离进行感知,从而可以在适当的位置渲染出虚拟的场景对象。
本公开实施例的图像处理方法和装置还可以应用于与可移动平台通信连接的遥控装置上。在这种情况下,本公开实施例还提供一种图像处理系统,包括可移动平台, 所述可移动平台上安装有第一视觉传感器和第二视觉传感器,所述第一视觉传感器的第一可视范围与所述第二视觉传感器的第二可视范围部分重叠;以及遥控装置,所述遥控装置包括处理器,所述处理器用于执行本公开任一实施例所述的方法。
本说明书实施例还提供一种计算机可读存储介质,所述可读存储介质上存储有若干计算机指令,所述计算机指令被执行时实任一实施例所述方法的步骤。
以上实施例中的各种技术特征可以任意进行组合,只要特征之间的组合不存在冲突或矛盾,但是限于篇幅,未进行一一描述,因此上述实施方式中的各种技术特征的任意进行组合也属于本说明书公开的范围。
本说明书实施例可采用在一个或多个其中包含有程序代码的存储介质(包括但不限于磁盘存储器、CD-ROM、光学存储器等)上实施的计算机程序产品的形式。计算机可用存储介质包括永久性和非永久性、可移动和非可移动媒体,可以由任何方法或技术来实现信息存储。信息可以是计算机可读指令、数据结构、程序的模块或其他数据。计算机的存储介质的例子包括但不限于:相变内存(PRAM)、静态随机存取存储器(SRAM)、动态随机存取存储器(DRAM)、其他类型的随机存取存储器(RAM)、只读存储器(ROM)、电可擦除可编程只读存储器(EEPROM)、快闪记忆体或其他内存技术、只读光盘只读存储器(CD-ROM)、数字多功能光盘(DVD)或其他光学存储、磁盒式磁带,磁带磁磁盘存储或其他磁性存储设备或任何其他非传输介质,可用于存储可以被计算设备访问的信息。
本领域技术人员在考虑说明书及实践这里公开的说明书后,将容易想到本公开的其它实施方案。本公开旨在涵盖本公开的任何变型、用途或者适应性变化,这些变型、用途或者适应性变化遵循本公开的一般性原理并包括本公开未公开的本技术领域中的公知常识或惯用技术手段。说明书和实施例仅被视为示例性的,本公开的真正范围和精神由下面的权利要求指出。
应当理解的是,本公开并不局限于上面已经描述并在附图中示出的精确结构,并且可以在不脱离其范围进行各种修改和改变。本公开的范围仅由所附的权利要求来限制。
以上所述仅为本公开的较佳实施例而已,并不用以限制本公开,凡在本公开的精神和原则之内,所做的任何修改、等同替换、改进等,均应包含在本公开保护的范围之内。

Claims (45)

  1. 一种飞行器,其特征在于,所述飞行器包括第一视觉传感器和第二视觉传感器;
    所述第一视觉传感器的第一可视范围与所述第二视觉传感器的第二可视范围部分重叠,其中,重叠的可视范围包括所述飞行器的周围的环境区域,所述环境区域包括所述飞行器的两个相背对部位朝向的区域;
    所述第一视觉传感器和所述第二视觉传感器采集的图像用于计算环境区域中的物体的位置信息,所述物体的位置信息用于控制所述飞行器在空间中运动。
  2. 根据权利要求1所述的飞行器,其特征在于,所述空间中的物体与所述飞行器的相对位置关系基于以下两者共同确定:
    所述第一视觉传感器在所述重叠的可视范围内的第一局部图像,所述第二视觉传感器在所述重叠的可视范围内的第二局部图像;以及,
    可视范围覆盖所述物体的视觉传感器在第一时刻采集的图像和在第二时刻采集的图像,其中,可视范围覆盖所述物体的视觉传感器在所述第一时刻时在空间中的位姿不同于在所述第二时刻时在空间中的位姿。
  3. 根据权利要求1所述的飞行器,其特征在于,所述飞行器的顶部或者底部的环境区域的第一物体与所述飞行器的相对位置关系的计算过程,与所述飞行器的侧部的环境区域的第二物体与所述飞行器的相对位置关系的计算过程不同。
  4. 根据权利要求3所述的飞行器,其特征在于,所述飞行器的顶部或者底部的环境区域的第一物体与所述飞行器的相对位置关系,是基于所述重叠的可视范围内与所述第一物体具有相同语义信息的物体与所述飞行器的相对位置关系得到的。
  5. 根据权利要求4所述的飞行器,其特征在于,所述重叠的可视范围内与所述第一物体具有相同语义信息的物体与所述飞行器的相对位置关系,是基于所述第一视觉传感器在所述重叠的可视范围内的第一局部图像和所述第二视觉传感器在所述重叠的可视范围内的第二局部图像得到的。
  6. 根据权利要求4所述的飞行器,其特征在于,所述空间中的物体的语义信息是基于预设的神经网络对所述第一视觉传感器和所述第二视觉传感器分别采集的图像处理得到的。
  7. 根据权利要求3所述的飞行器,其特征在于,所述第一物体与所述飞行器的相对位置关系,是基于以下信息共同得到的:
    基于目标物体与所述飞行器的相对位置关系预测得到的相对位置关系,所述目标 物体为所述重叠的可视范围内与所述第一物体具有相同语义信息的物体,所述目标物体与所述飞行器的相对位置关系基于所述第一视觉传感器在重叠的可视范围内的第一局部图像和所述第二视觉传感器在重叠的可视范围内的第二局部图像确定,以及,
    基于可视范围覆盖所述第一物体的视觉传感器在第一时刻采集的图像和在第二时刻采集的图像确定的相对位置关系,其中,可视范围覆盖所述第一物体的视觉传感器在所述第一时刻时在空间中的位姿不同于在所述第二时刻时在空间中的位姿。
  8. 根据权利要求3所述的飞行器,其特征在于,所述第二物体与所述飞行器的相对位置关系,是基于以下信息共同得到的:
    基于所述第一视觉传感器在所述重叠的可视范围内的第一局部图像以及所述第二视觉传感器在所述重叠的可视范围内的第二局部图像得到的相对位置关系,以及
    基于可视范围覆盖所述第二物体的视觉传感器在第一时刻采集的图像和在第二时刻采集的图像确定的相对位置关系,其中,可视范围覆盖所述第一物体的视觉传感器在所述第一时刻时在空间中的位姿不同于在所述第二时刻时在空间中的位姿。
  9. 根据权利要求1所述的飞行器,其特征在于,所述飞行器包括机身和机臂,所述第一视觉传感器和所述第二视觉传感器均设置在所述机身上;和/或
    所述第一传感器或者所述第二视觉传感器包括一个感光电路模组和与所述感光电路配合安装的一组鱼眼光学模组。
  10. 根据权利要求1所述的飞行器,其特征在于,所述第一视觉传感器和所述第二视觉传感器中的一者设置在所述飞行器的顶部,另一者设置在所述飞行器的底部;或者
    所述第一视觉传感器和所述第二视觉传感器中的一者设置在所述飞行器的第一侧,另一者设置在所述飞行器的第二侧,所述第一侧与所述第二侧相对设置。
  11. 一种图像处理方法,其特征在于,所述方法应用于可移动平台,所述可移动平台包括第一视觉传感器和第二视觉传感器,所述第一视觉传感器的第一可视范围与所述第二视觉传感器的第二可视范围部分重叠,所述方法包括:
    获取所述第一视觉传感器在重叠的可视范围内的第一局部图像,获取所述第二视觉传感器在重叠的可视范围内的第二局部图像;
    获取所述第一视觉传感器在第一时刻采集的图像和在第二时刻采集的图像,所述第一视觉传感器在所述第一时刻时在空间中的位姿不同于在所述第二时刻时在空间中的位姿;
    基于所述第一局部图像、所述第二局部图像,所述第一时刻采集的图像和所述第 二时刻采集的图像,确定所述可移动平台所在空间中的物体与所述可移动平台的相对位置关系。
  12. 根据权利要求11所述的方法,其特征在于,所述基于所述第一局部图像、所述第二局部图像、所述第一时刻采集的图像和所述第二时刻采集的图像,确定所述可移动平台所在空间中的物体与所述可移动平台的相对位置关系,包括:
    基于所述第一局部图像和所述第二局部图像,确定所述物体与所述可移动平台的第一相对位置关系;
    基于所述第一时刻采集的图像和所述第二时刻采集的图像,确定所述物体与所述可移动平台的第二相对位置关系;
    基于所述第一相对位置关系和所述第二相对位置关系确定所述物体与所述可移动平台的相对位置关系。
  13. 根据权利要求12所述的方法,其特征在于,所述物体包括处于所述第一视觉传感器的可视范围内,且处于所述第二视觉传感器的可视范围以外的第一物体;所述基于所述第一局部图像和所述第二局部图像,确定所述物体与所述可移动平台的第一相对位置关系,包括:
    基于所述第一局部图像和所述第二局部图像,确定所述重叠的可视范围内与第一物体具有相同语义信息的目标物体与所述可移动平台的相对位置关系;
    基于所述目标物体与所述可移动平台的相对位置关系,确定所述第一物体与所述可移动平台的第一相对位置关系。
  14. 根据权利要求13所述的方法,其特征在于,所述基于所述第一相对位置关系和所述第二相对位置关系确定所述物体与所述可移动平台的相对位置关系,包括:
    在所述第二相对位置关系满足所述第一物体对应的几何约束条件的情况下,将所述第二相对位置关系确定为所述第一物体与所述可移动平台的相对位置关系;
    在所述第二相对位置关系不满足所述几何约束条件的情况下,将所述第一相对位置关系确定为所述第一物体与所述可移动平台的相对位置关系。
  15. 根据权利要求12所述的方法,其特征在于,所述物体包括处于所述重叠的可视范围内的第二物体;所述基于所述第一局部图像和所述第二局部图像,确定所述物体与所述可移动平台的第一相对位置关系,包括:
    基于所述第一局部图像和所述第二局部图像,确定所述第二物体与所述可移动平台的第一相对位置关系。
  16. 根据权利要求15所述的方法,其特征在于,所述基于所述第一相对位置关系 和所述第二相对位置关系确定所述物体与所述可移动平台的相对位置关系,包括:
    在满足预设条件的情况下,将所述第一相对位置关系确定为所述第二物体与所述可移动平台的相对位置关系;
    在不满足预设条件的情况下,将所述第二相对位置关系确定为所述第二物体与所述可移动平台的相对位置关系;
    所述预设条件包括:所述第二物体的深度小于预设的深度阈值;以及
    所述第一相对位置关系的置信度大于预设的置信度阈值。
  17. 根据权利要求12所述的方法,其特征在于,所述第一相对位置关系通过第一神经网络对所述第一局部图像和所述第二局部图像进行处理得到,所述第二相对位置关系通过第二神经网络对所述第一时刻采集的图像和所述第二时刻采集的图像进行处理得到。
  18. 根据权利要求17所述的方法,其特征在于,所述第一神经网络基于以下方式获取所述第一相对位置关系:
    分别对所述第一局部图像和所述第二局部图像进行特征提取,得到所述第一局部图像的特征描述和所述第二局部图像的特征描述;
    基于所述第一局部图像的特征描述和所述第二局部图像的特征描述,获取将所述第一局部图像投影到所述第二局部图像的投影代价;
    对所述投影代价进行聚合,得到聚合投影代价;
    对所述聚合投影代价进行归一化处理,得到初始相对位置关系;
    基于所述初始相对位置关系以及所述重叠的可视范围内与第一物体具有相同语义信息的目标物体的语义信息,获取所述第一相对位置关系。
  19. 根据权利要求17所述的方法,其特征在于,所述第一神经网络基于所述第一视觉传感器在所述重叠的可视范围内的第一局部样本图像和所述第二视觉传感器在所述重叠的可视范围内的第二局部样本图像训练得到;
    所述第二神经网络基于所述第一视觉传感器在第三时刻采集的样本图像和在第四时刻采集的样本图像训练得到;其中,所述第一视觉传感器在所述第三时刻时在空间中的位姿不同于在所述第四时刻时在空间中的位姿。
  20. 根据权利要求11所述的方法,其特征在于,所述基于所述第一局部图像、所述第二局部图像、所述第一时刻采集的图像和所述第二时刻采集的图像,确定所述可移动平台所在空间中的物体与所述可移动平台的相对位置关系,包括:
    基于所述第一视觉传感器在所述第一时刻时在空间中的位姿以及所述第一视觉传 感器在所述第二时刻时在空间中的位姿,确定所述第一视觉传感器在所述第一时刻所处的位置与在所述第二时刻所处的位置之间的距离;
    若所述距离小于预设值,基于所述第一局部图像、所述第二局部图像、所述第一时刻采集的图像和所述第二时刻采集的图像,确定所述可移动平台所在空间中的物体与所述可移动平台的相对位置关系。
  21. 根据权利要求17所述的方法,其特征在于,所述方法还包括:
    获取所述第一神经网络输出的置信图,所述置信图用于指示所述第一相对位置关系的置信度;
    基于置信度大于预设置信度阈值的第一相对位置,对所述第一视觉传感器在第一时刻的位姿和所述第一视觉传感器在所述第二时刻的位姿之间的位姿关系进行优化;优化的位姿关系用于确定所述第二相对位置关系。
  22. 根据权利要求11所述的方法,其特征在于,所述可移动平台为无人飞行器;所述第一视觉传感器和第二视觉传感器中的一者设置在所述无人飞行器的机身的顶部,另一者设置在所述无人飞行器的机身的底部;或者
    所述可移动平台为无人飞行器;所述第一视觉传感器和第二视觉传感器中的一者设置在所述无人飞行器的第一侧,另一者设置在所述无人飞行器的第二侧,所述第一侧与所述第二侧相对设置;
    所述可移动平台为无人车,所述第一视觉传感器和第二视觉传感器分别设置在所述无人车的两个车灯上,或者分别设置在挡风玻璃两侧;或者
    所述可移动平台为可移动机器人,所述第一视觉传感器和第二视觉传感器分别设置在所述可移动机器人的两个眼睛所在的位置。
  23. 根据权利要求11所述的方法,其特征在于,所述第一视觉传感器和第二视觉传感器中的至少一者为鱼眼相机。
  24. 根据权利要求11所述的方法,其特征在于,所述第一视觉传感器和第二视觉传感器的重叠的可视范围的面积小于非重叠的可视范围的面积。
  25. 根据权利要求11所述的方法,其特征在于,所述方法还包括:
    根据所述物体与所述可移动平台的相对位置关系,确定所述物体在所述空间中的绝对位置信息。
  26. 根据权利要求11所述的方法,其特征在于,所述基于所述第一局部图像、所述第二局部图像、所述第一时刻采集的图像和所述第二时刻采集的图像,确定所述可移动平台所在空间中的物体与所述可移动平台的相对位置关系,包括:
    确定所述可移动平台的移动方向;
    基于所述第一局部图像中与所述移动方向相关的第一图像区域、所述第二局部图像中与所述移动方向相关的第二图像区域、所述第一时刻采集的图像中与所述移动方向相关的第三图像区域,以及所述第二时刻采集的图像中与所述移动方向相关的第四图像区域,确定所述可移动平台所在空间中的物体与所述可移动平台的相对位置关系。
  27. 根据权利要求11所述的方法,其特征在于,所述方法还包括:
    基于所述可移动平台的移动速度以及所述相对位置关系,确定所述相对位置关系的更新频率;
    基于所述更新频率对所述相对位置关系进行更新。
  28. 一种图像处理装置,包括处理器,其特征在于,所述装置应用于可移动平台,所述可移动平台包括第一视觉传感器和第二视觉传感器,所述第一视觉传感器的第一可视范围与所述第二视觉传感器的第二可视范围部分重叠,所述处理器用于执行以下步骤:
    获取所述第一视觉传感器在重叠的可视范围内的第一局部图像,获取所述第二视觉传感器在重叠的可视范围内的第二局部图像;
    获取所述第一视觉传感器在第一时刻采集的图像和在第二时刻采集的图像,所述第一视觉传感器在所述第一时刻时在空间中的位姿不同于在所述第二时刻时在空间中的位姿;
    基于所述第一局部图像、所述第二局部图像,所述第一时刻采集的图像和所述第二时刻采集的图像,确定所述可移动平台所在空间中的物体与所述可移动平台的相对位置关系。
  29. 根据权利要求28所述的装置,其特征在于,所述处理器具体用于:
    基于所述第一局部图像和所述第二局部图像,确定所述物体与所述可移动平台的第一相对位置关系;
    基于所述第一时刻采集的图像和所述第二时刻采集的图像,确定所述物体与所述可移动平台的第二相对位置关系;
    基于所述第一相对位置关系和所述第二相对位置关系确定所述物体与所述可移动平台的相对位置关系。
  30. 根据权利要求29所述的装置,其特征在于,所述处理器具体用于:
    基于所述第一局部图像和所述第二局部图像,确定所述重叠的可视范围内与第一物体具有相同语义信息的目标物体与所述可移动平台的相对位置关系;
    基于所述目标物体与所述可移动平台的相对位置关系,确定所述第一物体与所述可移动平台的第一相对位置关系。
  31. 根据权利要求30所述的装置,其特征在于,所述处理器具体用于:
    在所述第二相对位置关系满足所述第一物体对应的几何约束条件的情况下,将所述第二相对位置关系确定为所述第一物体与所述可移动平台的相对位置关系;
    在所述第二相对位置关系不满足所述几何约束条件的情况下,将所述第一相对位置关系确定为所述第一物体与所述可移动平台的相对位置关系。
  32. 根据权利要求29所述的装置,其特征在于,所述处理器具体用于:
    基于所述第一局部图像和所述第二局部图像,确定所述第二物体与所述可移动平台的第一相对位置关系。
  33. 根据权利要求32所述的装置,其特征在于,所述处理器具体用于:
    在满足预设条件的情况下,将所述第一相对位置关系确定为所述第二物体与所述可移动平台的相对位置关系;
    在不满足预设条件的情况下,将所述第二相对位置关系确定为所述第二物体与所述可移动平台的相对位置关系;
    所述预设条件包括:所述第二物体的深度小于预设的深度阈值;以及
    所述第一相对位置关系的置信度大于预设的置信度阈值。
  34. 根据权利要求29所述的装置,其特征在于,所述第一相对位置关系通过第一神经网络对所述第一局部图像和所述第二局部图像进行处理得到,所述第二相对位置关系通过第二神经网络对所述第一时刻采集的图像和所述第二时刻采集的图像进行处理得到。
  35. 根据权利要求34所述的装置,其特征在于,所述第一神经网络基于以下方式获取所述第一相对位置关系:
    分别对所述第一局部图像和所述第二局部图像进行特征提取,得到所述第一局部图像的特征描述和所述第二局部图像的特征描述;
    基于所述第一局部图像的特征描述和所述第二局部图像的特征描述,获取将所述第一局部图像投影到所述第二局部图像的投影代价;
    对所述投影代价进行聚合,得到聚合投影代价;
    对所述聚合投影代价进行归一化处理,得到初始相对位置关系;
    基于所述初始相对位置关系以及所述重叠的可视范围内与第一物体具有相同语义信息的目标物体的语义信息,获取所述第一相对位置关系。
  36. 根据权利要求34所述的装置,其特征在于,所述第一神经网络基于所述第一视觉传感器在所述重叠的可视范围内的第一局部样本图像和所述第二视觉传感器在所述重叠的可视范围内的第二局部样本图像训练得到;
    所述第二神经网络基于所述第一视觉传感器在第三时刻采集的样本图像和在第四时刻采集的样本图像训练得到;其中,所述第一视觉传感器在所述第三时刻时在空间中的位姿不同于在所述第四时刻时在空间中的位姿。
  37. 根据权利要求28所述的装置,其特征在于,所述处理器具体用于:
    基于所述第一视觉传感器在所述第一时刻时在空间中的位姿以及所述第一视觉传感器在所述第二时刻时在空间中的位姿,确定所述第一视觉传感器在所述第一时刻所处的位置与在所述第二时刻所处的位置之间的距离;
    若所述距离小于预设值,基于所述第一局部图像、所述第二局部图像、所述第一时刻采集的图像和所述第二时刻采集的图像,确定所述可移动平台所在空间中的物体与所述可移动平台的相对位置关系。
  38. 根据权利要求34所述的装置,其特征在于,所述处理器还用于:
    获取所述第一神经网络输出的置信图,所述置信图用于指示所述第一相对位置关系的置信度;
    基于置信度大于预设置信度阈值的第一相对位置,对所述第一视觉传感器在第一时刻的位姿和所述第一视觉传感器在所述第二时刻的位姿之间的位姿关系进行优化;优化的位姿关系用于确定所述第二相对位置关系。
  39. 根据权利要求28所述的装置,其特征在于,所述可移动平台为无人飞行器;所述第一视觉传感器和第二视觉传感器中的一者设置在所述无人飞行器的机身的顶部,另一者设置在所述无人飞行器的机身的底部;或者
    所述可移动平台为无人车,所述第一视觉传感器和第二视觉传感器分别设置在所述无人车的两个车灯上,或者分别设置在挡风玻璃两侧;或者
    所述可移动平台为可移动机器人,所述第一视觉传感器和第二视觉传感器分别设置在所述可移动机器人的两个眼睛所在的位置。
  40. 根据权利要求28所述的装置,其特征在于,所述第一视觉传感器和第二视觉传感器中的至少一者为鱼眼相机。
  41. 根据权利要求28所述的装置,其特征在于,所述第一视觉传感器和第二视觉传感器的重叠的可视范围的面积小于非重叠的可视范围的面积。
  42. 根据权利要求28所述的装置,其特征在于,所述处理器还用于:
    根据所述物体与所述可移动平台的相对位置关系,确定所述物体在所述空间中的绝对位置信息。
  43. 根据权利要求28所述的装置,其特征在于,所述处理器具体用于:
    确定所述可移动平台的移动方向;
    基于所述第一局部图像中与所述移动方向相关的第一图像区域、所述第二局部图像中与所述移动方向相关的第二图像区域、所述第一时刻采集的图像中与所述移动方向相关的第三图像区域,以及所述第二时刻采集的图像中与所述移动方向相关的第四图像区域,确定所述可移动平台所在空间中的物体与所述可移动平台的相对位置关系。
  44. 根据权利要求28所述的装置,其特征在于,所述处理器还用于:
    基于所述可移动平台的移动速度以及所述相对位置关系,确定所述相对位置关系的更新频率;
    基于所述更新频率对所述相对位置关系进行更新。
  45. 一种可移动平台,其特征在于,包括:
    第一视觉传感器和第二视觉传感器,分别用于采集所述可移动平台周围的环境区域的图像,所述第一视觉传感器的第一可视范围与所述第二视觉传感器的第二可视范围部分重叠;以及
    权利要求28至44任意一项所述的图像处理装置。
PCT/CN2022/071100 2022-01-10 2022-01-10 飞行器、图像处理方法和装置、可移动平台 WO2023130465A1 (zh)

Priority Applications (2)

Application Number Priority Date Filing Date Title
PCT/CN2022/071100 WO2023130465A1 (zh) 2022-01-10 2022-01-10 飞行器、图像处理方法和装置、可移动平台
CN202280057209.4A CN117859104A (zh) 2022-01-10 2022-01-10 飞行器、图像处理方法和装置、可移动平台

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2022/071100 WO2023130465A1 (zh) 2022-01-10 2022-01-10 飞行器、图像处理方法和装置、可移动平台

Publications (1)

Publication Number Publication Date
WO2023130465A1 true WO2023130465A1 (zh) 2023-07-13

Family

ID=87072973

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/071100 WO2023130465A1 (zh) 2022-01-10 2022-01-10 飞行器、图像处理方法和装置、可移动平台

Country Status (2)

Country Link
CN (1) CN117859104A (zh)
WO (1) WO2023130465A1 (zh)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106931961A (zh) * 2017-03-20 2017-07-07 成都通甲优博科技有限责任公司 一种自动导航方法及装置
US10165182B1 (en) * 2016-12-29 2018-12-25 Scott Zhihao Chen Panoramic imaging systems based on two laterally-offset and vertically-overlap camera modules
CN210986289U (zh) * 2019-11-25 2020-07-10 影石创新科技股份有限公司 四目鱼眼相机及双目鱼眼相机
CN112837207A (zh) * 2019-11-25 2021-05-25 影石创新科技股份有限公司 全景深度测量方法、四目鱼眼相机及双目鱼眼相机

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10165182B1 (en) * 2016-12-29 2018-12-25 Scott Zhihao Chen Panoramic imaging systems based on two laterally-offset and vertically-overlap camera modules
CN106931961A (zh) * 2017-03-20 2017-07-07 成都通甲优博科技有限责任公司 一种自动导航方法及装置
CN210986289U (zh) * 2019-11-25 2020-07-10 影石创新科技股份有限公司 四目鱼眼相机及双目鱼眼相机
CN112837207A (zh) * 2019-11-25 2021-05-25 影石创新科技股份有限公司 全景深度测量方法、四目鱼眼相机及双目鱼眼相机
WO2021104308A1 (zh) * 2019-11-25 2021-06-03 影石创新科技股份有限公司 全景深度测量方法、四目鱼眼相机及双目鱼眼相机

Also Published As

Publication number Publication date
CN117859104A (zh) 2024-04-09

Similar Documents

Publication Publication Date Title
EP3378033B1 (en) Systems and methods for correcting erroneous depth information
US11064178B2 (en) Deep virtual stereo odometry
CN111344644B (zh) 用于基于运动的自动图像捕获的技术
CN112292711A (zh) 关联lidar数据和图像数据
CN112567201A (zh) 距离测量方法以及设备
CN111566612A (zh) 基于姿势和视线的视觉数据采集系统
JP2018535402A (ja) 異なる分解能を有するセンサーの出力を融合するシステム及び方法
CN111784748B (zh) 目标跟踪方法、装置、电子设备及移动载具
JP2018526641A (ja) レーザ深度マップサンプリングのためのシステム及び方法
US20170017839A1 (en) Object detection apparatus, object detection method, and mobile robot
US11748998B1 (en) Three-dimensional object estimation using two-dimensional annotations
CN113887400B (zh) 障碍物检测方法、模型训练方法、装置及自动驾驶车辆
CN113568435B (zh) 一种基于无人机自主飞行态势感知趋势的分析方法与系统
JP7133927B2 (ja) 情報処理装置及びその制御方法及びプログラム
WO2023056789A1 (zh) 农机自动驾驶障碍物识别方法、系统、设备和存储介质
CN108603933A (zh) 用于融合具有不同分辨率的传感器输出的系统和方法
CN113052907B (zh) 一种动态环境移动机器人的定位方法
US11842440B2 (en) Landmark location reconstruction in autonomous machine applications
JP7103354B2 (ja) 情報処理装置、情報処理方法、及びプログラム
WO2020019175A1 (zh) 图像处理方法和设备、摄像装置以及无人机
WO2020024182A1 (zh) 一种参数处理方法、装置及摄像设备、飞行器
WO2023130465A1 (zh) 飞行器、图像处理方法和装置、可移动平台
US11417063B2 (en) Determining a three-dimensional representation of a scene
JP2021099384A (ja) 情報処理装置、情報処理方法およびプログラム
JP2021099383A (ja) 情報処理装置、情報処理方法およびプログラム

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22917935

Country of ref document: EP

Kind code of ref document: A1

WWE Wipo information: entry into national phase

Ref document number: 202280057209.4

Country of ref document: CN